open life blog

Benchmarking Galera on a disk bound workload

Update 2012-01-09: I have now been able to understand the poor(ish) results in this benchmark. They are very likely due to a bad hardware setup and neither Galera nor InnoDB is to blame. See https://openlife.cc/blogs/2012/january/re-doing-galera-disk-bound-bench…

After getting very good results with Galera on a memory bound workload, I was eager to test a disk bound workload too. This time as well I learned a lot about how Galera behaves, and I will try to share those findings here.

Setup

The setup for these tests is exactly the same as in last week's benchmarks, except that I've now loaded the database with 10 tables containing 40 million rows each. This adds up to a database of about 90 GB:
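As a quick sanity check on those numbers, here is a small sketch of the arithmetic. It assumes "90 GB" means decimal gigabytes, and the resulting figure is only an average over the whole database, so it includes index and storage overhead rather than just the row data.

```python
# Figures taken from the setup description above.
TABLES = 10
ROWS_PER_TABLE = 40_000_000
# Assumption: "about 90 GB" is read as decimal gigabytes (10**9 bytes).
DATABASE_BYTES = 90 * 10**9

total_rows = TABLES * ROWS_PER_TABLE
# Average on-disk footprint per row, indexes and overhead included.
bytes_per_row = DATABASE_BYTES / total_rows

print(total_rows)     # 400000000
print(bytes_per_row)  # 225.0
```

So the data set works out to roughly 400 million rows at a little over 200 bytes each.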

Keynoting at OpenSQLCamp-Froscon next week

Speaking of conferences in general, and OpenSQLCamps in particular, there is one a week from now, and I will be speaking! It is organized as a single-room track at Froscon, Germany, by Felix Schupp (Blackray/Softmethod) and Volker Oboda (Primebase). The content is mostly a collection of database-related talks originally submitted via the main Froscon call for papers. (In other words, unlike many previous camps, the schedule is all set.)

I'm a little excited about this one, because for the first time in my career as a speaker I will be giving the keynote. The title of my talk is

How I learned to use SQL and how I learned not to use it

Running sysbench tests against a Galera cluster

So, vacation is over and I was in luck: already during the first week I had ample time to finally put Galera replication to the test. It was a great experience: I learned a lot, and eventually got the great results I was hoping to see.

Again I started by just running the standard Sysbench oltp read-write test. Since this is a commonly used benchmark, it produces numbers that are comparable with those of others running the same benchmark, including, as it happens, the Galera developers themselves.

These tests were run on an 8-core server with 32 GB of RAM, with the disk on some EMC device with a 2.5 GB write cache.

The MySQL conference is back for 2012, courtesy of Percona

The past few years of MySQL conferences...

Every year since Oracle's acquisition of MySQL in 2009, there's been some uncertainty around the annual MySQL conference, which used to be co-organized by MySQL AB (in charge of content) and O'Reilly (conference logistics). As my career unfolded during those years, I've seen at fairly close range how the conferences of 2010 and 2011 came together. As there's been a lot of restructuring in the community around various forks and new employers, I've felt that the annual conference was the one thing that kept us together, the one common forum where everyone would meet. For this reason I have been personally very engaged (as have many others) in helping O'Reilly get through the conferences of 2010 and 2011, and I'm very grateful to Tim, Gina and the rest of the O'Reilly team for providing us with this forum and point of gravitation for the past two years.

During this year's conference it was openly speculated that it would be the last O'Reilly MySQL conference. EnterpriseDB being the main sponsor at a MySQL conference... kind of gave you a hint. With Oracle constantly boycotting and refusing to sponsor the conference of its own community, the business justification for O'Reilly to keep going just wasn't there anymore.

So once again we were facing uncertainty of what to do next year.

Whenever we discussed alternatives for the next MySQL conference, I always maintained that there are 3 things by which people recognize the MySQL conference: time, location and name. So I was encouraging people to come up with solutions that would keep those 3 variables as constant as possible.

One-liner for condensing sysbench output into a csv file

An important part of benchmarking is to draw graphs. A graph can reveal results you wouldn't have spotted just by looking at raw numbers. By the way, the process of massaging the raw numbers into graphs will often reveal things too.

Sysbench output tends to be quite wordy, especially when you have a script that runs the same test with 1, 2, 4, 8... threads. Manually copying and pasting the numbers into a spreadsheet is tiresome. So I came up with this monster shell one-liner to condense the output into a csv file. I'm posting it here so I will find it the next time I need it:
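The one-liner itself is not included in this excerpt. As a rough illustration of the same idea, and explicitly not the original shell command, here is a minimal Python sketch. It assumes sysbench 0.4-style output, where each run prints a "Number of threads: N" line and later a "transactions: COUNT (X per sec.)" line. The function name sysbench_to_csv and the column names are made up for this example.

```python
import csv
import io
import re

# Assumed sysbench 0.4-style lines, for example:
#   "Number of threads: 8"
#   "    transactions:                        60000  (999.85 per sec.)"
THREADS_RE = re.compile(r"Number of threads:\s*(\d+)")
TPS_RE = re.compile(r"transactions:\s*\d+\s*\(([\d.]+) per sec\.\)")


def sysbench_to_csv(text):
    """Return CSV text with one 'threads,tps' row per run found in text."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["threads", "tps"])
    threads = None
    for line in text.splitlines():
        match = THREADS_RE.search(line)
        if match:
            # Remember the thread count until this run's result line shows up.
            threads = int(match.group(1))
            continue
        match = TPS_RE.search(line)
        if match and threads is not None:
            writer.writerow([threads, match.group(1)])
            threads = None  # wait for the next run's header
    return out.getvalue()
```

Feeding the concatenated output of all runs to sysbench_to_csv() gives a small table that can be pasted straight into a spreadsheet. If your sysbench version prints its summary differently, the two regular expressions are the part to adjust.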

The ultimate MySQL high availability solution

A while ago Baron blogged about his utter dislike for MMM, a framework supposedly used as a MySQL high-availability solution. While I have no personal experience with this framework, reading the comments on that blog post I'm indeed convinced that Baron is right. For one thing, they include the creator of MMM agreeing.

Baron's post still suggests - and having spoken with him I know that's what he has in mind - that a better solution could be built, and that it's just MMM that has a poor design. I'm going to go further than that: personally, I've come to think that this family of so-called clustering suites is just categorically the wrong approach to database high-availability. I will now explain why they fail, and what the right way is instead.

MySQL community counseling: Talking about your feelings

Last week Monty Taylor wrote an interesting blog post, "Oracle do not, in fact, comprise the total set of MySQL Experts", where he protested against the title of Oracle's new podcast series, "Meet The MySQL Experts". Now, when I say "interesting" I'm not really referring to the factual argument he is making...

What was interesting about this was to see Monty burst out like that and express some true human feelings. Through all the controversies we've seen around MySQL, the Drizzle team has made a point of staying out of such discussions and just working on cleaning up their code and adding cool new stuff (added as plugins, of course). And if anything, I would have expected it to be someone like Stewart to finally break and start ranting about something, if it were to happen...

Just to be clear: I do not actually agree with Monty on the factual topic he is raising. We are of course all very geeky and arguing about English grammar is a good way to relax, but as far as I'm concerned it is quite common for titles of podcasts and such to be shortened versions of the full, grammatically correct sentence whose meaning they are conveying. After all, it would be silly to have a podcast called "Meet the MySQL experts who work in Oracle's R&D department, but excluding those experts that do work at Oracle's support or consulting organizations, even if they are great minds too, and also excluding anyone not working for Oracle at all."
