State of the MySQL forks
It's been some time since I last wrote an overview of the state of the MySQL forks, but the last few weeks have been eventful enough that it is a good time to again see how the competing variants are positioned against each other.
I have written on this topic 1-2 times a year. Here are links to the previous overviews:
Map of MySQL forks and branches (2010)
The state of MySQL forks: co-operating without co-operating (2010)
Observations on Drizzle and PostgreSQL
Percona.tv: State of the MySQL ecosystem (2011)
State of the MySQL forks: via a particular example of authentication plugins
As is the case with this post, many of those previous ones coincide with some events or discussion at the time they were written, and the posts contain links to other bloggers commenting on whatever was current then.
MariaDB is now enough of a fork to use its own version numbering
The MariaDB team explained in a blog post why the next release will be numbered 10.0. The summary of it all is that they have now diverged so much from MySQL that you can no longer directly say whether the next release will be like MySQL 5.5, MySQL 5.6 or something else completely.
We discussed something like this in the corridors of Percona Live UK 2011, and I predicted that MariaDB 5.5 will probably be the last MySQL release that MariaDB is able to merge from its parent. Already that effort took them well over a year. MySQL 5.6 will contain Oracle-developed code for features that MariaDB already has, like sub-query optimization and better GIS support. Unless MariaDB developers want to toss out their own, already released code, this is another reason why they simply can't continue to merge MySQL wholesale as an upstream source.
So do I have an opinion on this? Of course I do...
Monty always said that MariaDB is not a fork, it is a branch. Well, now it is a fork. It is no longer a subset or superset of any specific MySQL version, it is different.
One of the marketing promises of MariaDB is to be fully backwards compatible. That is true for MySQL 5.1. It is now true for 5.5, but for a long time MariaDB did not actually contain MySQL 5.5 features at all. And finally, it won't be true for MySQL 5.6. It will be interesting to see in a few years how much these two forks continue to diverge.
MariaDB was from the beginning a bold effort. The MariaDB team knows they are capable of understanding all of the codebase and developing it further, and they don't need any help from Oracle to do so. At the same time, they don't have the resources that Oracle does. Merging MySQL 5.5 took long enough. From 5.6 onwards we can bet that MySQL will have some features that MariaDB doesn't (yet) offer.
To me, the decision to "fork for real" seems a bit bold. There is no question the MariaDB developers can pull it off technically, and they are well funded to have the long term resources needed. But do they have the momentum needed? According to the 451 Group study, MariaDB is only now starting to compete with Percona for second place in the MySQL forks ecosystem in terms of adoption. When other former Sun projects have forked, be it LibreOffice, Jenkins or IllumOS, they have been able to move a significant majority of the developer mindshare with the fork and most crucially, have had the support from Linux distributions with them. MariaDB does not yet have this. Instead, most people continue to use stock MySQL, in particular as it is shipped with their favorite Linux distribution.
With the installed base still solidly following MySQL, the fact that MariaDB is becoming more different from MySQL might turn into a barrier to migration instead of being a competitive advantage. For instance, I used to cite the lack of semi-synchronous replication as a showstopper for using MariaDB. Now that it comes with MariaDB 5.5, I'm of course pointing to the lack of Galera support (e.g. what Percona XtraDB Cluster does). Forking when you are still catching up with the competition certainly has its risks.
Unless of course Oracle would do something - a mistake, say - that would actively push people away from MySQL...
The big news this week was the reports by Ronald Bradford (consultant), Sergei Golubchik (MariaDB) and Mark Callaghan (Facebook) that Oracle has yet again tightened the screws a little bit. This was also picked up by mainstream tech websites, links to which you can google for yourself, and most recently by Giuseppe Maxia, former MySQL Community team lead.
Following the earlier policy of no longer using a public bug tracker, MySQL management had apparently realized that the community developers (including, but not restricted to, competitors MariaDB and Percona) had no trouble figuring out bug fixes from the source code commits and test cases.
When trying to understand what follows, it is worth pausing for a moment to consider the pressure Oracle faces from its competitors, in particular Percona (mostly in the US) and SkySQL/MariaDB (in Europe). In the past two years, Oracle has lost tens of millions of MySQL revenue to this competition. (Note: The new revenue seen by SkySQL and Percona is of course less, as they compete on price.) There must be a lot of pressure to do something!
The natural fix? Oracle has now stopped using the public launchpad branches completely, only publishing source tarballs together with each new MySQL release. (Update: The launchpad branches have now been updated, see more info and a link further below.)
In addition, they have completely stopped publishing new regression tests. (Which is still true.) Essentially MySQL is now doing the same as was implemented for Solaris from day one: Development is now as closed as it can be without actually abandoning the GPL altogether. Source code is "thrown over the wall" together with each release, but that's all you get. (The return of bzr repositories with commit history improves the situation somewhat.)
Btw, Stewart Smith (Percona) announced that he is retrofitting launchpad trees out of the source tarballs coming out of Oracle. These might still be of some use if you need an "upstream" MySQL bzr repository for something.
Ok, so will missing test cases matter to most end users? There will of course always be those who react adversely to an increased amount of closedness, and who will now be looking towards MariaDB and Percona to support a more open source alternative. But no, most people will not care. If anything, end users will be more driven by actual features being closed source (and only for paying customers), like the plugins we discussed in my last installment of this series that are available as open source implementations in MariaDB and Percona.
But there are people who will care. Linux distributors have a work flow that is very dependent on tracking bugs and commits from their upstream sources and have been more annoyed than most about these developments. When Oracle stopped using the public bug tracker, there was already a relatively serious discussion in Ubuntu (and indirectly affecting Debian) to replace MySQL with MariaDB as the default MySQL variant.
I was always wondering whether Oracle doesn't appreciate the importance of being included in a Linux distribution, but apparently they did and sent 7 people to the last Ubuntu Developer Summit, and for the time being they remain in Ubuntu. To then stop publishing your code commits on launchpad is probably as good a PR move as a kick in the nuts.
Hence it is easy to predict that one of these moves will be the straw that breaks the camel's back, and we will see Linux distributions switching to another MySQL fork in the coming 12 months. And this is what will matter. Most of the end users will continue not to care, and they will continue to use whatever is automatically installed for them - by the Linux distro.
There is a lot more that could be said about Oracle MySQL. They are about to release MySQL 5.6 which will be (close to) on par with MariaDB features like sub-queries, microsecond data types and GIS, plus a slew of new features in replication (global transaction id for the win) and a NoSQL memcache API for 2x to 7x performance increases! When you read about people proclaiming how good a steward Oracle has been for MySQL, it is of course to some extent also diplomacy, but it is also true that measured purely in engineering efficiency, MySQL of today is way more productive than it ever was before Oracle times. Which is not to say this is necessarily thanks to Oracle, but it is still a fact.
Anyway, screwing your community and in particular the Linux distro community is today more newsworthy than the great engineering work, so I will have to write more about those some other time.
Conveniently, Baron has today blogged about MySQL 5.6 features. If you want to know more about those, his post has it all.
Mark Leith otoh informs us that the out of date Launchpad trees are now refreshed again and that Oracle has made no decision to stop publishing code on Launchpad. Based on Stewart's post it seems at least the 5.1 tree was not updated since May. On the other hand, it appears that 5.1.64 was never released, which might explain why there was no update for so many months. In other words, I'm inclined to believe Mark's assertion that this was yet again a case of just general neglect by the Oracle team, rather than active withholding of code commits. Of course, that's exactly what easily happens when development is internal by default, and the publication of code is additional overhead.
So far there has been no comment on the missing test cases though.
In all of the above Percona Server has continued to tightly follow Oracle MySQL as an upstream, adding their own, mostly performance related, enhancements on top. This strategy has allowed them to release a MySQL 5.5 based release only months after Oracle, and again now they already have a 5.6 beta release out before MySQL 5.6 has even been released.
With their current strategy Percona is somewhat dependent on progress in their upstream MySQL. Here they have turned out to be lucky. Since Oracle is doing well in keeping up with MariaDB, it means Percona too is benefiting from this. Expect Percona to have sub-query optimizations and improved GIS later this year in a 5.6 GA release.
While the Percona Server strategy is relatively focused, and I wouldn't expect them to do large refactoring of the optimizer code base like MariaDB, they have actually often been able to be first to market with significant improvements. The XtraDB engine itself of course saved MySQL from a near-death experience when InnoDB didn't scale on multiple cores, and Xtrabackup is now the de facto backup tool for MySQL (regardless of your choice of server fork). Percona was also the first to integrate a NoSQL API via HandlerSocket. This year Percona has released Percona XtraDB Cluster, which is Percona Server integrated with Galera clustering. This is yet again a game-changing introduction of new technology, which I'm certain will wipe out a whole category of also-rans among high-availability solutions.
Interestingly then, with a strategy that seems quite minimalistic at first sight, Percona in fact often comes away as over-achieving. If you would count only the amount of new features, or complexity of them, then certainly MariaDB is a more ambitious effort than Percona Server. But in practice Percona Server offers a combination of keeping tight with the MySQL upstream releases, while also delivering significant enhancements of their own. In practice then I have found that if anyone is to be considered a superset of everything good the MySQL ecosystem has to offer, surprisingly, it often turns out to be Percona Server.
To be continued...
In the end, all of the above are still good choices for a MySQL database. All of them are much more advanced than MySQL 5.1 was in its time. With the introduction of advanced synchronous clustering techniques like Galera, coupled with superior InnoDB performance, MySQL competes well with any of your favorite NoSQL (key-value) solutions.
Still, following the competitive situation between the 3 MySQL compatible forks has perhaps introduced some cognitive overhead into the game. I hope to make some concluding remarks on all 3 of these in a follow-up blog post.