A year with Drizzle
Today I'm coming out of the closet. Since I'm a professional database expert I try to be like the mainstream and use the commercial MySQL forks (including MySQL itself). But I think those close to me have already known for some time that I like community based open source projects. I cannot deny it any longer, so let me just say it: I'm a Drizzle contributor and I'm very much engaged!
I've been eyeing the Drizzle project since it started in 2008. Already then there were dozens of MySQL hackers for whom this project was a refuge they instantly flocked to. Finally a real open source project based on MySQL code that they could contribute to, and they did. It was like a breath of fresh air in a culture that previously had only accepted one kind of relationship: that between an employer and an employee. Drizzle was more liberal. It also accepted forms of engagement already common in most other open source projects, those based on relationships between 2 or more consenting contributors.
But in 2008 I wasn't yet ready to engage with Drizzle. Like I said, I worked in a role where I would go to database users and help them use MySQL in demanding production settings. So as much as I admired Drizzle already back then, I needed something that could give me good releases, and support me when needed.
So I stuck with my MySQL relationship - I was employed with MySQL. But eventually our paths separated, as they say, and I then became employed with another MySQL fork, MariaDB instead. I tried to give it my best, but also MariaDB couldn't satisfy what I really needed. Eventually I left also MariaDB.
With two failed relationships behind me, I tried to be single and free for a while. Solo hacking. Based on what I learned from the MariaDB team I even published my own MySQL builds. You might remember exactly a year ago I blogged a lot about what I called MepSQL. It was quite interesting and I became quite good at building MySQL packages - you could use the MepSQL build tools to easily package any of the forks: MySQL vanilla, Percona, MariaDB and the Facebook branch. You could perhaps call it a meta-fork: I tried to provide technology to unite the fractured forks in the MySQL community.
My plan was to use this as a platform to showcase more of the community action out there in the MySQL ecosystem: patches and enhancements not yet available to users in any of the commercial forks (MySQL itself, Percona, MariaDB). There are lots of them, but just to mention 2 that go mostly unnoticed but are really great: The sqlstats plugin by Anders Karlsson (this is like pt-query-digest but inside MySQL, without logging overhead) and InnoDB row cache by Wen Tong of Taobao.
I suppose in hindsight it is clear that even if it was a solo project, I was showing clear signs of being at least community-curious, if not more. Indeed, it is my firm belief that the community as a whole will always produce more great code than any single commercial vendor can handle.
And then it happened. Drizzle announced their first stable version. I decided I should have a look at it to see what it's like. I didn't officially abandon MepSQL or any of the other forks, but I lay low with MepSQL and stopped blogging about it. It seemed best to give Drizzle a chance, even if I didn't commit to it in any way.
I went to last year's Drizzle Day that took place after the MySQL user conference. (There is one again this year.) There were many other like minded people, everyone was very friendly and welcoming. They even wanted me to do a talk about trends in the MySQL world. When talking about JSON I decided to tease Stewart a little and said that "Stewart could probably do this in a day" and you know what, that's exactly what he did a week later.
I also pointed out that many community projects choose to make their status legally recognized by joining some non-profit foundation that provides accounting and legal services to them. The Drizzle boys said they hadn't done that because they just wanted to keep a low profile and enjoy hacking together. But once they realized that these umbrella organizations like Software in the Public Interest are really easy to join and not a lot of paperwork, they thought it was a good idea. They asked me if I wanted to make it happen. It felt good - it was my first thing I would do for the Drizzle project.
By making the project's non-profit status official Drizzle immediately got a lot of goodwill from other open source community projects. In fact, the ceremony of applying for SPI membership was officiated by Josh Berkus, well known for his contributions in another open source database community called PostgreSQL, but also a Drizzle contributor.
Being a non-profit community project again paid off. When the site was done, it turned out we couldn't actually point our domain "drizzle.org" to it. This was a feature of the paid tiers. At OSCON I talked to Angela Byron, known as webchick in Drupal circles, and she arranged for us to get an upgrade for free. It again felt good to get this kind of help from another open source community. And I should say, big thanks to Acquia, the company behind Drupal Gardens, for sponsoring Drizzle in this way.
I have to say I'm pretty satisfied with the new website. I'm not a graphic designer, but I have some basic training from a few university courses. Using the Drupal Gardens templates really helped though. I couldn't have done it without them.
It should be emphasized that I never really did anything of even this complexity in C++ before. I mean Symbian code yes, but you wouldn't recognize it as C++ unless someone tells you that's what it is. Even then I didn't actually code much Symbian either. My point is, it is true what they say: Drizzle code has really been refactored into a clean and modular code base. Writing a plugin is dead simple. It's great that even a novice like me can approach a complex project like a relational database server and produce a useful, non-trivial feature in just a couple of weeks! This was great validation for me that it is true what they say about Drizzle: the years of refactoring work have really paid off and it is a pleasant project to start hacking on, both socially and from a code perspective.
If you read Stewart's blog post carefully - or just read the comments where I point this out - you'll notice there is a bug: Every result set has an empty row at the end. This is not a bug in Stewart's json_server, but rather in the Drizzle Execute API. I traced the function calls made when you use this API and fixed the off-by-one error. I had become a committer to Drizzle core code.
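For readers who haven't met this class of bug, here is a minimal sketch of how an off-by-one error can produce a trailing empty row. To be clear, this is not the Drizzle Execute API code and not the actual fix. The function names and the list-of-lists row format are made up for illustration only.

```python
# Hypothetical illustration only: NOT the Drizzle Execute API or its real fix.
# Rows are modeled as plain lists; all names here are invented.

def collect_rows_buggy(rows):
    """Copy rows into a result set, looping one step too far."""
    result = []
    # Off-by-one: range(len(rows) + 1) runs one extra iteration.
    for i in range(len(rows) + 1):
        if i < len(rows):
            result.append(list(rows[i]))
        else:
            # The extra iteration has no data, so an empty row is emitted.
            result.append([])
    return result


def collect_rows_fixed(rows):
    """Same copy, but the loop stops at the last real row."""
    result = []
    for i in range(len(rows)):
        result.append(list(rows[i]))
    return result


if __name__ == "__main__":
    data = [[1, "a"], [2, "b"]]
    print(collect_rows_buggy(data))
    print(collect_rows_fixed(data))
```

The symptom is the same as the one described above: the data is all there, but every result set carries one spurious empty row at the end.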
The Drizzle 7 release of last year has, imho, proven quite stable. I mean, of course it had bugs, but compared to some historical MySQL releases I'd say it is good quality software. Unfortunately it was not a very polished piece of software. You know, it was something a Steve Wozniak would create, but not something Steve Jobs would approve of. In particular, while there was a manual, and it had parts that were quite good (SQL and core functions are actually documented), it wasn't really useful to get Drizzle up and running. Most plugins weren't even mentioned in the documentation.
Another problem was that the people that used to build RPM and DEB packages became busy with new day jobs and the 7.1 series deteriorated into just source releases.
This was the reason I stopped doing PR tasks in the Autumn. It seemed pointless to evangelize a website when people would not find any packages to download on said website if they actually went there. Although it must be said that Drizzle is much easier to compile from source than my experience with MySQL was. Especially after I updated the README file to reflect reality. Anyway, I swore I would not post a single blog about Drizzle until there are binary packages available again. I shouldn't have done that...
The first problem has been vigorously attacked by Daniel Nichter with myself and a few others helping. I'd say the current manual is something we don't have to be ashamed of anymore. Btw, I have really enjoyed working with you Daniel. Your focus on the end user experience is exactly what Drizzle needs just now, and without you being there I might have lost hope along the way. You are the Steve Jobs of Drizzle, except you are a much nicer person than he was :-)
While there has been some personnel turnover for sure, another thing that has continued to work great is the build and continuous integration process. Mark Atwood is there almost every night (well, it's night in Europe) and tirelessly merging patches, and giving feedback on those that don't pass the test suite. Even if some of the core devs have either completely gone or are very passive, it is Mark's work that really keeps the pulse going. And for all the talk of Rackspace pulling out as a sponsor, there's still someone submitting patches because there's a queue building every week. So Drizzle is doing fine, but it is much thanks to Mark who keeps the engine running.
The third thing I think is really great is the Google Summer of Code students. Drizzle has attended and been really successful with this Google program each year, with a high percentage of successfully completed projects. Last Summer 3 students completed their projects successfully, and they are all still active in various ways with the project. VJ Samuel has even become the release manager for the release candidates! I already look forward to next Summer so I can be part of GSoC too.
The thing that wasn't so great was... have you noticed what is still missing? The packages. Nobody was doing them. I should've realized this earlier. In an open source project, if there is something that needs to be done, and you look around and nobody is doing it, it means you have to do it. Even if somebody promises to do something, if you look around and don't see any code actually appearing on Launchpad, it means you have to do it.
A thing in the same category was the version number. Drizzle had initially wanted to adopt the Ubuntu release system. But somewhere along the line it just became a datestamp that was really difficult to understand. You needed secret knowledge to know that 2011.03.13 and 2011.11.29 are not the same major version while 2011.11.29 and 2012.01.30 are the same version. Not something Steve Jobs would approve of.
So I submitted a patch that changed the whole build system to produce versions that look like 7.1.31-rc instead. I was a bit nervous about this one. It felt great when my plugins were swiftly approved and merged, and that's exactly the point of making open source projects modular, so you can be very inclusive of new contributors. But here I was touching the very heart of everything. Much to my delight, this code was approved and after fixing some test failures we finally managed to merge the code and release the first RC with a human readable version number. It even says "-rc" now so you know what it is!
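To show why a version like 7.1.31-rc is easier to work with, here is a small sketch of how a script could read it. This is my own illustration, not code from the Drizzle build system. The names `parse_version` and `same_series` are hypothetical, and I'm assuming the format is always major.minor.patch with an optional suffix such as "-rc".

```python
import re
from typing import NamedTuple, Optional


class Version(NamedTuple):
    major: int
    minor: int
    patch: int
    tag: Optional[str]  # e.g. "rc", or None when there is no suffix


# Assumed format: three dot-separated numbers plus an optional "-tag".
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:-([A-Za-z0-9]+))?$")


def parse_version(text: str) -> Version:
    """Split a version string such as '7.1.31-rc' into its parts."""
    match = _VERSION_RE.match(text)
    if match is None:
        raise ValueError(f"not a major.minor.patch version: {text!r}")
    major, minor, patch, tag = match.groups()
    return Version(int(major), int(minor), int(patch), tag)


def same_series(a: str, b: str) -> bool:
    """True when both versions share the same major.minor series."""
    va, vb = parse_version(a), parse_version(b)
    return (va.major, va.minor) == (vb.major, vb.minor)


if __name__ == "__main__":
    print(parse_version("7.1.31-rc"))
```

With this scheme, whether two releases belong to the same series can be read straight off the first two numbers. No secret knowledge needed.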
Acceptance of my proposal again grew my faith in the relationship I have with Drizzle. I feel like sometimes we don't talk enough in the Drizzle project, because the other guys often are busy at work and don't have time for my emails. It felt good to know they understand what is important to me and approved of it.
Unfortunately the change in version numbers along with some other last minute changes meant that my work on DEB and RPM packaging was obsolete and couldn't be merged into the RC as I had hoped. I have now pushed a branch that does the DEBs correctly again, but need to also update the RPMs.
Then Drizzle will have binary packages for the 7.1 series and I can break my silence and blog about it again. Oops, guess I already did ;-) Yeah, it's been a pretty great year!
If you are one of those brave souls that are not afraid to compile Drizzle from source, the brand new release candidate can be downloaded here.