Galera

Video: My Introduction to Galera talk at OUGF Harmony 2012

Sheeri was also in Hämeenlinna for the OUGF Harmony conference a few weeks ago, and of course she had her video camera with her. My talk "Synchronous multi-master clusters with MySQL: an introduction to Galera" is now posted on YouTube. (You may also want to follow along with the slides on SlideShare while listening.)

I think this is a really good talk if you are interested in Galera. First I cover how to get a Galera cluster running, including the most important options and status variables to pay attention to. I then cover both the internal Galera architecture and how to use load balancers and clustering frameworks, and explain how Galera handles node crashes and network partitioning.

HowTo use MySQL JDBC loadbalancer with Galera multi-master clusters

Some time ago I finally had the chance to test the built-in load balancing feature in MySQL's JDBC driver together with a 3-node Galera cluster. I had used this feature at a MySQL Cluster customer many years ago, so I knew it worked and I knew it was great, but I didn't know if it would work with Galera. Galera sometimes returns error states that are different from those MySQL Cluster returns, and the main point of the test was to see how the load balancing in the JDBC driver reacts to that.
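For readers who have not seen the feature, here is a minimal sketch of what a load-balancing connection URL for the JDBC driver looks like. It assumes the driver's `jdbc:mysql:loadbalance://` URL scheme with a comma-separated host list; the node names, port and database name are placeholders of mine, not taken from the test setup, and the helper class is purely illustrative.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative helper (hypothetical, not part of the JDBC driver): builds a
// load-balancing URL from a list of Galera nodes.
final class GaleraLoadBalanceUrl {

    // Joins "host:port" entries into jdbc:mysql:loadbalance://h1,h2,h3/database
    static String build(List<String> hostPorts, String database) {
        if (hostPorts.isEmpty()) {
            throw new IllegalArgumentException("at least one node is required");
        }
        return "jdbc:mysql:loadbalance://" + String.join(",", hostPorts) + "/" + database;
    }

    public static void main(String[] args) {
        // Placeholder node names for a 3-node cluster.
        List<String> nodes = Arrays.asList("node1:3306", "node2:3306", "node3:3306");
        String url = build(nodes, "test");
        System.out.println(url);
        // With the MySQL JDBC driver on the classpath, the URL would be passed
        // to java.sql.DriverManager.getConnection(url, user, password).
    }
}
```

The application only ever sees one logical connection; the driver picks a node from the list behind the scenes, which is why no external load balancer is needed in this architecture.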

Slides from Introduction to Galera talk

On Thursday I went to the Harmony conference arranged by Oracle User Group Finland to speak about Galera clustering. (They chose the topic based on my suggestions.) The slides are now available on SlideShare. I'm pretty satisfied with this talk myself. The slides contain the most important steps you need to know to get started, but also the internal architecture of Galera, how it works, and what kind of replication topologies and load balancing you would want to use with it. And benchmarks, of course.

Speaking at OUGF Harmony Finland about Galera

Heli from Oracle User Group Finland invited me to speak at this year's OUGF Harmony conference, which starts tomorrow. Last year I had some proposals accepted but had to decline due to work travel.

This year they wanted to learn more about Galera and I was of course more than happy to go and speak. My talk is titled "Synchronous Multi-Master Clusters with MySQL: an Introduction to Galera." It contains some parts of what we presented at the MySQL Conference, but is more of an introduction and less about benchmarking.

How Galera does Rolling Schema Upgrade, really

This post is about a fairly technical detail of how Galera works. I'm writing it down in preparation for testing this feature so that I can agree with Alex whether to file a bug or not. I'm sharing it on my blog just in case someone else might benefit from learning this.

Galera 2.0 introduces rolling schema upgrades. This is a new way to do non-blocking schema changes in MySQL.

As the name suggests, it is done as a rolling upgrade. Having seen clusters doing rolling upgrades before, I assumed this is what happens:

  • Execute alter table on Node 1.
  • Node 1 is removed from the cluster and stops processing transactions.
  • Node 1 completes alter table.
  • Node 1 re-joins cluster and catches up so that it is in sync.

Comments on the Codership Galera vs NDB cloud shootout

Alex Yurchenko finally posted the results of a benchmark he had been planning to do for a long time: Galera vs NDB cloud shootout.

Their blog requires registration to comment, so I'll post my comment here instead:

***

Sysbench can do the load balancing itself, so there is no need for an external load balancer. Just pass a comma-separated list of master MySQL nodes to --mysql-host. This is similar to what the JDBC and PHP drivers can do too, and it is my favorite architecture. Why introduce extra layers of stuff that you don't need and that don't bring any additional value?

MySQL progress in a year

Usually people do this around New Year; I will do it in February. Actually, I was inspired to do this after reviewing all the talks for this year's MySQL Conference - what a snapshot of where we are! It made me realize we've made important progress in the past year, and it is worth taking a moment to celebrate it. So here we go...

Diversification

In the past few years there was a lot of fear and doubt about MySQL due to Oracle taking over the ownership. But if you ask me, I was more worried for MySQL because of MySQL itself. I've often said that if MySQL had been a healthy open source project - like the other 3 components in the LAMP stack - then most of the NoSQL technologies we've seen come about would never have been started as their own projects, because it would have been more natural to build those needs on top of MySQL. You could have had a key-value store (HandlerSocket, Memcache API...), sharding (Spider) and clustering (Galera). You could have had a graph storage engine (OQGraph engine isn't quite it, I understand it is internally an InnoDB table still?). There could even have been MapReduce functionality, although I do think the Hadoop ecosystem targets problems that actually are better solved without MySQL.

Re-doing Galera disk bound benchmark

I've been promising to revisit the disk bound sysbench tests I ran on Galera. In December I finally had some lab time to do that. If you remember, what troubled me then was that in all my other Galera benchmarks, performance with Galera was equal to or much better than performance on a single MySQL node. (This is very unusual for high availability solutions, which usually come with a performance penalty. This is why Galera is so great.) However, in the tests with a disk bound workload there was performance degradation, and what was even more troubling was that performance seemed to decrease further when adding more write masters.

In these tests I was able to understand the performance decrease, and it had nothing to do with Galera, nor even with InnoDB. It was a defect in my lab setup: all nodes kept their data on a partition mounted from an EMC SAN device - the same device for all nodes. Hence, when work is directed to more nodes and the workload is bottlenecked by disk access, performance naturally decreases rather than increases. Unfortunately I don't currently have servers available (but will have sometime during this year) where I could re-run this same test with local disks.

As part of this lab session I also investigated the effect of varying the number of Galera slave applier threads, which I will report on in the remainder of this post. Of course, the results are a bit obscured now by the problematic SAN lab setup, but I'll make some observations nevertheless. While the previous tests were run on MySQL 5.1, this test was run on MySQL 5.5, and I will make some observations there too.
