The state of MySQL forks: co-operating without co-operating

Giuseppe "The Data Charmer" Maxia recently posted his take on the MySQL forks. I had been pondering whether to do the same, and seeing that what I planned to write will nicely complement Giuseppe's article, I was inspired to follow him into the same topic. Note that last Spring I created a Map of MySQL forks in preparation for Monty's keynote at the MySQL user conference. So let's see how things have evolved. I'll look into the MySQL ecosystem as a whole and the forks separately.

The post is long, but the key takeaway is that despite the challenges, the combined development seen in the MySQL ecosystem is probably stronger than ever, the current situation is hard for an outsider to grasp but manageable, and if a few more obstacles can be overcome, we are looking at a very bright future indeed. There are more than 100 engineers (how many more?) working full time on the MySQL code base (including developers, QA, build engineers...). This development effort is an order of magnitude higher than other open source databases I'm aware of, in particular PostgreSQL and Drizzle. Often the open source project with most momentum and mass will come out as the winner, no matter what challenges it may seemingly be facing, and this is the case with MySQL too.

Oracle's MySQL

Negatives

  • My personal prediction of Oracle taking over MySQL has been that Oracle would not outright kill it but would reposition it into a nice sandbox where it is no longer a threat to its main database product. These moves can be seen in changes like the slogan: Where MySQL AB used to say "#1 open source database" or "#1 online database", Oracle has downgraded it to World's most popular open source database for the Web, suggesting that MySQL is mainly a web database, and even there leaving open the door to a certain proprietary alternative. The practical consequence of this is that MySQL Account Managers can no longer go to enterprise IT customers to propose an open source database strategy to replace their existing Oracle installations - this was rather expected - but must mostly focus on the web segment where MySQL already is a leader anyway.
  • Just like we've seen happening with most other communities of Sun projects, Oracle did try to have its way with the MySQL community too. The main target of attack was the annual MySQL user conference (and expo). Luckily for us, the conference was formally an O'Reilly conference and O'Reilly Media and Tim O'Reilly personally stepped up to keep the conference going in 2010 - with some help from Monty Program and Percona employees to fill the void left by MySQL AB's role (not to forget all the speakers, exhibitors and people such as Brian Aker organizing his traditional summit...). The 2010 conference was only marginally smaller than the 2009 one with Sun, which can be counted as a victory. In 2011 Oracle is boycotting the conference and Oracle employees have not been able to submit a single talk, but by now we have enough people outside of Oracle that I have no doubts that the conference will again be a great success and fuel the community into another prosperous year of open source database development and adoption.
  • Significant employee turnover. Of the 400 MySQL AB employees acquired into Sun less than 3 years ago, way more than half have now left and still several a month are resigning. This was also quite expected, I personally knew about 50 people wanting to leave Oracle, and the real number seems to be more like 150. (And add to that 50+ resigned or RIFfed during the Sun era.) To emphasize the challenge Oracle is facing here, when last Summer there was a small storm in a teacup with Oracle employees blogging that everything is fine, they love Oracle and nobody is leaving ... today 2 out of the 3 most vocal ones have also resigned. The problem is that when everyone else is leaving, it's not fun to stay as the last one. But the good news is that even in the face of such a challenge, the MySQL engineering team remains quite productive - this topic will be continued in the list of positives. The other positive part is that most (albeit not nearly all) of those resigning have found their place within the MySQL ecosystem - Anders Karlsson, for one, seems to be able to contribute more now than he could in his previously hectic Sales Engineer role. Yoshinori's HandlerSocket is in my opinion the greatest MySQL innovation since the addition of InnoDB - both developed outside of MySQL. Update: HandlerSocket was not developed by Yoshinori Matsunobu, though it was first announced on his blog, the development was done by Akira Higuchi.
  • There are constant rumors from sources at Oracle that developers are being moved to develop closed source features. This is not yet reflected in MySQL 5.5, which is pretty much the fruit of work done at Sun, so it remains to be seen in the next couple of years what the reality behind these rumors is. Anyway, it is fair to note that going increasingly closed source was on MySQL AB's agenda anyway, the only negative here is that Oracle has more balls to go through with such plans than Sun had.

Positives

  • MySQL 5.1 already was a better release than those before it, you didn't have to wait a year after GA before the bugs were really ironed out. In 5.5 also the gap between major releases is decreasing, which is very welcome.
  • The biggest miss of 5.1 was the total lack of focus on performance. There was no performance enhancement over 5.0 and MySQL was lagging behind PostgreSQL in this respect. Since then a lot has changed. In 5.5 scalability is the main enhancement. Despite MySQL's poor management of its community, the "First Aid" in this respect actually came from Google and Percona work, without which MySQL would have been in big trouble. The upcoming 5.5 release also contains original work from the MySQL and InnoDB teams themselves. (It is notable that Percona is yet to publish a comparative benchmark of MySQL 5.5 and Percona Server and MariaDB openly admits that they don't focus on competing on multi-core scalability with the MySQL team, since they can include that work thanks to the GPL anyway.)
  • The MySQL engineering team, despite the high number of resignations, seems to be relatively productive. I personally think Tomas Ulin as the VP of engineering is the best MySQL has ever seen. Processes related to automated QA, reviews, cleaning up of compiler warnings, the "train model" of producing releases mean that paradoxically MySQL engineering seems to be more productive now, despite the hit it has taken in headcount and experience.
  • The improvements into scalability seem to remain GPL in the future too and will thus benefit all MySQL users, regardless of what happens to other features.

Summary: Oracle's MySQL is in pretty good shape, considering the circumstances. Chances are the community will be happier with the 5.5 release than anything we've seen for a very long time - focus is on scalability, finally! The community has been able to patch over the problems that are due to Oracle: keeping the MySQL conference alive, retaining the talent within the ecosystem and addressing enterprise customers with new 3rd party support providers, not the least of which is SkySQL.

Fork 1: Percona Server with XtraDB

In my map of MySQL forks last spring I predicted that XtraDB and the Percona patches would fold into MariaDB. They are in MariaDB, but Percona also productized their own fork of MySQL 5.1. The approach of Percona is to provide "boutique" performance enhancements to a MySQL 5.1 base. They are very focused on customer needs - most enhancements are sponsored by the customers directly.

Positives

  • Percona's work has a track record of a few years now and is running in production at many of their customers. (I wonder if they have any estimate of usage by non-customers, would be really interesting to know.) To continue with such a product is a wise and welcome move.
  • Focused on performance: More performance or at least better monitoring facilities.
  • As the oldest 3rd party MySQL support provider, especially with a proven capability to provide also product maintenance (fix bugs and develop new features), Percona's existence more than any other company's has provided relief to anybody who has had doubts over Oracle acquiring MySQL.
  • XtraBackup is the only open source tool to provide online backups of a large MySQL database.

Negatives

  • I agree with Giuseppe that Percona Server is not a sustainable product on its own, but depends on the work coming from MySQL to evolve. I think this question is relevant, not because Oracle is likely to kill MySQL any time soon, but because it is still a yardstick of credibility and viability whether one is dependent on one's major competitor. On the other hand, if anything were to happen to MySQL, Percona could use MariaDB as a base instead and continue with their model unaffected. Hence, Percona Server is not viable as an independent product alone, but not dependent on Oracle either.
  • Very focused on Percona customers and their demands is a good thing, but also means the rest of the world is served less well. A stray 3rd party patch addressing something else than web scale is unlikely to ever get attention of the Percona engineers.
  • While no comparative benchmark has been published, it seems obvious that MySQL 5.5 is more scalable than Percona Server based on the 5.1 series. This is not a big problem, it will not be many weeks of work for Percona to re-base its product on top of 5.5 instead.

Fork 2: MariaDB

Run by the creator of MySQL and an impressive group of senior MySQL Architects and Engineers, MariaDB 5.2 is based on MySQL 5.1 and pulls together a lot of 3rd party work such as XtraDB, PBXT, Sphinx, OQGRAPH engines and also new features. The core team itself is focused on finishing sub-query optimizations from the dropped MySQL 6.0 release - due in a later release - and some other very relevant enhancements such as fixing the broken group commit and a new pluggable replication API.

Positives:

  • Focus on integrating a lot of exciting work from the MySQL ecosystem that was previously left without attention, both features and a significant MyISAM performance improvement.
  • I disagree with Giuseppe here: I think MariaDB is a viable fork and would survive in the hypothetical scenario that Oracle pulled the plug on MySQL. This takes into account the friendly co-operation from the XtraDB team at Percona, but still. Obviously, MariaDB is not today fixing bugs at the rate of the much larger MySQL team, but could grow into it if needed.
  • When new code from MySQL releases is merged, the MariaDB team at least partially reviews it. On occasion they have been able to exclude patches that introduced a new bug. On the other hand, the same is true the other way too: MySQL never included the pool-of-threads patch that is now in MariaDB 5.2, as it is said to have scalability issues.
  • Lots of new features.
  • My earlier criticism of some decisions aside, this is still the most open and community oriented alternative of these three.

Negatives:

  • Does not yet have a lot of adoption, particularly only a few production / paying customer installations so far.
  • Bad luck: Own release schedule matches poorly with MySQL 5.5, it will probably be a long time before we see a stable MariaDB release based on MySQL 5.5.
  • Very much a single-vendor project, with only a few active outside contributors. Most of the "community contributions" in MariaDB 5.2 were originally contributed to MySQL (who neglected them) and then "pulled" into MariaDB by Monty Program, they do not yet constitute an active developer community around MariaDB itself (but they are proof of the potential inherent in the MySQL community).

Others

Out of the other forks mentioned in my map from last Spring:

  • OurDelta has now folded activity and the developers contribute to MariaDB. In fact they are probably among the most active contributors outside of Monty Program.
  • XAMPP seems to continue as before.
  • Many storage engines found their way into MariaDB, which makes it easier for users to try them. Some still remain, in particular Spider engine is not yet there. MariaDB also did good work in integrating some "Abandoned patches", yet many remain.
  • Drizzle is not a backward compatible fork so it is not discussed here, but I might post something about Drizzle and PostgreSQL separately.

MySQL ecosystem as a whole

The most important message is that the MySQL ecosystem as a whole is surviving, and despite all the challenges at the moment, looks like it is perhaps stronger and more productive than before.

It is a bit worrying, and definitely confusing to many, that the ecosystem is still disintegrating rather than unifying. It is difficult to follow which fork has which features. The situation is a bit like ten years ago, when Linux distributions would come and go, and users would be asking which one to try. The answer is the same here: if you want a specific feature, choose that fork, otherwise try whichever you want. In practice, most MySQL users still remain on MySQL 5.1 and are likely to upgrade to MySQL 5.5. Uptake of the forks will be slow, but their existence still serves as an important safeguard.

Also the "Abandoned patches" practice continues with a Facebook patch, Tivo patch, HandlerSocket, Anders' SQL statement monitor plugin and more turning up in their own source code repositories, often not even using Launchpad to publish them. However, with both MySQL and MariaDB now paying more attention to this work, and Percona to the extent it serves their focus (web customers), the patches are perhaps less abandoned than a year ago.

Positives:

  • The MySQL ecosystem as a whole is today more productive than ever before. (It is just difficult to see it.) It seems plausible that scalability/performance of MySQL 5.5 is better than any other open source alternative (Drizzle, PostgreSQL...). Mark Callaghan recently compared MySQL against MongoDB with MySQL coming out as a winner in all tests. And then we have things like HandlerSocket (NoSQL for MySQL -> 7x better performance) in the queue. All of this just proves that often the open source project with most momentum and mass will come out as the winner, no matter what challenges it may seemingly be facing.
  • To put it another way: I don't have any exact number, but there are in any case way more than 100 engineers working full time on the mysql code base (including both developers, QA, build engineers...). This development effort is an order of magnitude higher than other open source databases I'm aware of, in particular PostgreSQL and Drizzle.
  • While it is unfortunate that different actors have not been able to come together and focus their work on a unified project, in practice code seems to flow rather painlessly (considering the circumstances) between the main forks. This is much thanks to using distributed version control.
  • Same issue can be repeated for things like IRC channels and mailing lists: it would probably be better if everyone could work on one communication channel. On the other hand, in MySQL AB times almost all discussion was company internal, so the current situation is still better than what was before.
  • Despite the strained relations between the official Oracle and many of those on the outside, the developers have good contact with each other, and will help each other out with answering questions and sharing ideas. For instance, while Monty Program developers don't generally assign copyright to Oracle, they have done so on occasion to have bugs fixed in areas of the code they felt responsible for.

Negatives / challenges yet to overcome:

  • The inability of the forks to unify into a single community plays into MySQL's pocket. When we want to talk about the ecosystem as a whole, it will be known as the "MySQL ecosystem", since no alternative community exists or is in sight. As the owner of MySQL (and the mysql.com domain) Oracle is the main beneficiary of this state.
  • Once forks get more usage, they will increasingly suffer from the lack of a free MySQL manual. Writing one from scratch will be a lot of work. I hope to contribute to that effort one day.
  • Duplicate work: MariaDB and Percona publish their original work as GPL, but Oracle wants copyright assignment for anything going into MySQL. For instance, a lot of duplicate work has gone into cleaning up compiler warnings from the code: done first by Drizzle, then MariaDB, then again separately by MySQL.
  • While we seem to be doing ok for now, it cannot be healthy in the long term to just work from separate trees. For instance the new replication API and group commit developed at MariaDB are a significant change, there is no way Oracle will be able to independently reverse engineer a compatible API. Sooner or later, such incompatibilities in key areas of the server will lead to divergence.
  • Despite both MySQL and MariaDB now paying more attention to new patches coming from the wider ecosystem, it seems they are still struggling to keep up and at any given time there are a handful of interesting plugins or patches that one needs to download separately. (This is another topic I intend to come back to at a later date.)
  • The year and all the "restructuring" has taken its toll on the community though. It seems to me that for instance posts on Planet MySQL are down almost 50% from what they used to be. Interestingly this is also the case for the independent consultants that used to top the list of most frequent bloggers, so it's not just that there are fewer bloggers, also the active ones blog less.

Conclusions

The picture of the MySQL ecosystem drawn here is one where vendor interests dominate, and to match that, contributors that are not vendors also publish their new code in isolation from others. Despite this, there is significant progress going on, and eventually new code "diffuses" itself into all the different forks/distributions.

We are well familiar with governance models for open source like the Benevolent Dictator model (Linus and others) or the Apache Model (core team with +1/-1 votes). The MySQL ecosystem is none of these, yet it seems to be quite productive anyway. I've been thinking I will start calling this mode of development the "Federalist model" of open source development, where different actors co-operate without co-operating, or rather, exchange and merge patches in a peer-to-peer fashion, communicating with each other but without seeing any one party as a leader of the others.

Thanks to bzr and the GPL, this seems to work surprisingly well, as MySQL has never seen so much interesting progress as it does today.

I think the two biggest external contributions to MySQL are HandlerSocket and everything Yasufumi Kinoshita has done for InnoDB/XtraDB/XtraBackup.

I doubt that external actors will ever come together unless there is a forum in which we are all equal participants. Such a forum does not exist today. By this I mean there is not the equivalent of an Apache project for external MySQL development.

What are the rules for the community in MariaDB? Everything on http://mariadb.org/about.html links to askmonty.org and montyprogram.com. On http://montyprogram.com/community I am told to see the main MariaDB page (recursion!) and http://kb.askmonty.org/v/contributing-to-the-mariadb-project which makes me think that MariaDB == MPAB.

How is the Facebook patch abandoned? The Google patch has not been updated for more than a year but the Facebook patch is derived from 5.1.47 and eventually will be updated for 5.1.52. It isn't a community project nor is it meant for end users but neither of those are preconditions for not being abandoned.

There was not a total lack of focus on performance in 5.1 as it included the InnoDB plugin. But that was done when InnoDB and MySQL were separate companies.

I don't get this. Which companies will work for free? I have some projects for them. I am not fond of statements like this that imply that Percona is only serving their own interests while others are serving the world. Unless, of course, others really are serving the world. But Percona makes a lot contributions: releases, xtrabackup, information on the blog posts, taking the time to write an amazing book, ...
>>> Very focused on Percona customers and their demands is a good thing, but also means the rest of the world is served less well.

I am not ready to grant MySQL 5.5 the role of best performance. Lets wait for benchmarks and deployments. The runtime and InnoDB changes are awesome. But there are a lot of changes in Percona that focus on reducing variance in response time (removing stalls). Then add the value of better monitoring which means that a Percona-based deployment is more likely to achieve its potential because the performance experts have a better chance of tuning it.

Thanks for comments, they're all good so I have to take them one by one:

I think the two biggest external contributions to MySQL are HandlerSocket and everything Yasufumi Kinoshita has done for InnoDB/XtraDB/XtraBackup.

I was thinking of InnoDB itself, as in Heikki Tuuri. You know, MySQL being ACID compliant and transactions and that. But I agree that the work done by Yasufumi and then continued/complemented by your team at Google is in the same league. Who knows, MySQL might have lost the game without it, internally we were quite unprepared when the multi-core craze came upon us. (And we were betting on Falcon to save us, so InnoDB was neglected.)

I doubt that external actors will ever come together unless there is a forum in which we are all equal participants. Such a forum does not exist today. By this I mean there is not the equivalent of an Apache project for external MySQL development.

And as you know I for one regret this. On the other hand, with this article I'm also trying to show the silver lining: If we cannot work together, then the best thing is to work separately. It seems to work surprisingly well!

I also think that some time in the future when the time is right, some kind of restructuring might again occur. After all, there are still people looking to resign from Oracle, we are not quite out of that phase yet. (And here I am already going for the next round :-)

What are the rules for the community in MariaDB? Everything on http://mariadb.org/about.html links to askmonty.org and montyprogram.com. On http://montyprogram.com/community I am told to see the main MariaDB page (recursion!) and http://kb.askmonty.org/v/contributing-to-the-mariadb-project which makes me think that MariaDB == MPAB.

I agree.

How is the Facebook patch abandoned? The Google patch has not been updated for more than a year but the Facebook patch is derived from 5.1.47 and eventually will be updated for 5.1.52. It isn't a community project nor is it meant for end users but neither of those are preconditions for not being abandoned.

This is possibly a poor choice of an English word by me. I labeled the various patches as "Abandoned patches on the Internet" last winter. I didn't intend to point at the authors of the patches abandoning anything, rather that there were quite a few patches for which MySQL was slow to do anything at all. Percona maintained some of them in their builds (now Percona Server) but many were out of focus also for them. Come to think of it, many authors then actually had abandoned their patches, since nobody cared - but that was not the main point.

Anyway, the point of this article is that now the situation is much better, with all of the forks paying more attention to them. MySQL perhaps still mostly focused on the Google/Facebook patch like semi-synch replication, but it is a great start. MariaDB has done the most work to integrate an array of patches, such as virtual columns that I blogged about earlier. Hence this group is now "less abandoned" - just a continuation of the label I used a year ago, not really appropriate anymore. Even so, now we have a new set of interesting stuff that is not in any of the forks yet, so we have to catch up with those!

There was not a total lack of focus on performance in 5.1 as it included the InnoDB plugin. But that was done when InnoDB and MySQL were separate companies.

I of course don't have secret knowledge as to when the InnoDB plugin was conceived of, but when it was first announced, MySQL 5.1 had already been in RC (meaning feature freeze) for a long time. From the outside it looks like they worked on the plugin after finishing the main development work on the built in MySQL 5.1.

In other words, when MySQL/InnoDB 5.1 development started, there was no Product Manager with PERFORMANCE/SCALABILITY!!! on the top of his list. Now there is.

The plugin apparently patched this problem relatively well though, for instance you yourself seem to think of it as an almost integral part of MySQL 5.1. (Which is not entirely wrong, I guess, if it works it works...)

I don't get this. Which companies will work for free? I have some projects for them. I am not fond of statements like this that imply that Percona is only serving their own interests while others are serving the world. Unless, of course, others really are serving the world. But Percona makes a lot contributions: releases, xtrabackup, information on the blog posts, taking the time to write an amazing book, ...
>>> Very focused on Percona customers and their demands is a good thing, but also means the rest of the world is served less well.

So what Percona produces is amazing. Maintaining an InnoDB fork, all the performance work, monitoring enhancements and XtraBackup are all useful to everyone that uses a version of MySQL.

On the other hand, Percona Server has a clear focus of serving Percona customers. It seems at this point in time, Percona customers do not want to see "new features", anything from virtual columns to complex sub queries and certainly do not want to see PBXT or a graph engine anywhere near the production deployments. So even if it would be relatively simple for them to merge some of these features from MariaDB - the work has already been done - they specifically won't do such a thing. So this means that those users that like to play with new features (and even use them in production) should choose MariaDB instead of Percona Server.

So it is not a generally negative thing - on the contrary what Percona is doing works great for them and their customer base - it is just that for another group of users, it is a reason to perhaps not choose Percona Server.

I am not ready to grant MySQL 5.5 the role of best performance. Lets wait for benchmarks and deployments. The runtime and InnoDB changes are awesome. But there are a lot of changes in Percona that focus on reducing variance in response time (removing stalls). Then add the value of better monitoring which means that a Percona-based deployment is more likely to achieve its potential because the performance experts have a better chance of tuning it.

Interesting comment! Since we haven't seen a shootout, we can only speculate :-) On the other hand, adding your logic to my earlier claim means that when Percona Server migrates to a MySQL 5.5 base, they should again be firmly on top.

Hi,

Percona offers a custom build service. If one of our customers wants to test an official Percona Server build with one of those plugins (OQGRAPH, etc) we would happily build them a release containing it. This is a service which we provide our customers at some levels of our support services.

Any customers that want to build such a plugin and test it themselves can easily do so, or they can purchase consulting time for us to do it.

For complex changes like Virtual Columns, this requires changing the parser, which is too difficult for a lightweight fork like Percona Server to maintain, but is great for MariaDB. If you want to use Virtual Columns, but want better performance, then you get the benefit of XtraDB in MariaDB. I think this is a fair trade-off.

Hi Justin

Thanks for pitching in with some clarifications. What you are saying is precisely what I tried to say too.

In particular, I'm not saying that Percona is making any wrong choices with Percona Server, quite the contrary, I think your strategy is absolutely correct. I'm just trying to highlight why one group of users might choose Percona Server and another might not. (Also, the post is perhaps primarily thinking of the use case of a user downloading what's there for free, not so much a customer willing to spend money.)
