Earlier this year I performed my last act as Drizzle liaison to SPI by requesting that Drizzle be removed from the list of active SPI member projects and that about 6000 USD of donated funds be moved to the SPI general fund.
The Drizzle project started in 2008, when Brian Aker and a few other MySQL employees were moved to Sun Microsystems' CTO labs. In my opinion, the demand for such a project had two causes:
1) Even though MySQL was a leading open source product at the time, with a vibrant and opinionated open source community, it was also the first large single-vendor open source ecosystem. This meant that essentially all MySQL code was developed by engineers employed by MySQL itself. While this is not uncommon today, it was unusual at the time, and often came as a surprise to outside community members who tried and then failed to contribute code to MySQL.
2) MySQL's engineering department had managed to tie itself into a knot, and progress was largely stalled during the 5.0 to 5.1 cycle. When I joined the company, MySQL was in the middle of a 13-month code freeze, trying to stabilize toward a 5.1 release. Crucial improvements, like support for more than 2 CPU cores, were developed outside the company, notably by a team at Google led by Mark Callaghan.
With all the frustrations accumulated both inside and outside MySQL-the-company, Drizzle was received with much enthusiasm. Even Monty Widenius, creator of MySQL, endorsed the project and its more community-oriented approach. As I recall, more than 50 people contributed to Drizzle in its first year - another testament to how much goodwill and potential the project had in the MySQL community.
Drizzle's approach was to work from a clean slate. Breaking with full MySQL compatibility, a lot of the MySQL 5.0 features (stored procedures, triggers...) were deleted to arrive at a simpler code base. A few months later, Monty would create MariaDB, and after that Percona would launch Percona Server. Both remained compatible with the parent MySQL and, crucially, produced regular, stable, production-quality releases. In the end, it was these two MySQL forks that filled the void that existed in the MySQL community at the time.
Let's rewrite everything
Drizzle's "rewrite everything from scratch" approach is a very common lure for engineers around the world. I have heard it so many times at so many companies. Apart from the obvious reason - the "legacy" code base is crap and we will be better off starting from a clean slate - I've heard and observed the following reasons [1]:
- The current code base is hard for you to understand, and writing your own code from scratch will result in a code base you understand better. Hopefully. (Yet other engineers may be perfectly capable of working with the legacy code base. The question is, why is it hard for you?)
- We should use Go for the new project.
- We should migrate all of our stack from AWS to GCP, said an engineer who had read a blog post claiming that GCP had better performance and might therefore solve a specific performance issue he had failed to fix.
- Writing your own tool is apparently easier than reading the documentation of an existing one that you could have just used.
- Writing your own tool from scratch is more fun. (Yes, one honest engineer really said this out loud once. I appreciated the honesty.)
One benefit - to the engineer, not the rest of the world - of such projects is that they give the engineer several months, maybe years, to happily hack on the clean slate code base without any pressure from users (or managers) to fix bugs, or you know, not create them in the first place. If you do decide to rewrite everything in Go, or migrate everything to GCP, then those are huge projects where success can only be evaluated months or years later. By that time the engineer (and their manager?) may have already moved to another company, where they can again start writing something from scratch...
In Drizzle's case it took about 3 years until the first "stable" release. While this is an eternity by today's standards, it's worth pointing out that for MySQL at the time it was also normal - or in any case a fact - to have 3 years between major releases. But even then, these were 3 years in which the Drizzle developers could just enjoy life, hacking on code that did not have a single production user. Later in life I have also seen that engineers (and their managers) can commonly get away with working 3 years on projects that never reach production capability.
How do I know it didn't have a single production user? When I became the Drizzle SPI Liaison, I also wanted to contribute, and I contributed documentation, because it was largely missing. While writing installation instructions, I found a bug: if you installed Drizzle as an RPM or DEB and started it as root via init.d, it would crash (when dropping privileges after opening the port). In other words, Drizzle developers would compile drizzled, run the tests, and the tests would pass, but only when starting it as a regular user. Those who said they were running Drizzle "in production" to power their WordPress blogs must have been running it in a screen terminal. This bug was a great example of the importance of testing what your users are actually doing!
I usually have a strong preference for fixing the production code and rejecting the lure of rewriting everything from scratch. Forcing developers to work on code that is actually running in somebody's production helps keep everyone honest. If you can't work on the production code, why do you think you can create production-quality code from scratch? This preference dates back to my experience with Drizzle.
A Drizzle eulogy would not be complete without some friendly pointing and laughing at the DB-Engines ranking. This website creates a monthly score estimating the "popularity" of hundreds of database products, by scraping the web for social media mentions, open job positions, and news articles. It is taken very seriously by many marketing departments at NoSQL startups, who tend to have a KPI to improve their DB-Engines ranking.
Even though Drizzle development had already stalled, and even though Drizzle never had a single production user, for many years Drizzle was far ahead of both MariaDB and Percona Server on the DB-Engines list. To those of us who noticed this, and understood what it meant, it served as a good reminder of how seriously to take the monthly rankings. [2] And to all the marketing departments who do take DB-Engines seriously: if a handful of Drizzle hackers can create more positive publicity than you, then what is it that you are missing?
I would like to publicly thank Josh Berkus, who at the time served as an SPI board member and also as a PostgreSQL lead developer, for sponsoring Drizzle in our SPI membership application. Your friendship and collegiality toward other open source database projects was always much appreciated.