2 MySQL lessons for real life

Between following (from a distance) the talks at Fosdem and anticipating the ones at MySQL User Conference in April, I was reminded of 2 interesting MySQL talks that have had a deeper meaning to me than their original speakers probably intended. I thought today could be a good time to share these 2 stories that for me personally are filed in the "things I learned from MySQL AB and Sun" folder...

"If you can't solve the problem, try solving some other problem"

These genial words were uttered by MySQL consultant and instructor Kai Voigt at the MySQL Customer Conference in London, Oct 23, 2008. (The slides seem to be gone from mysql.com following the Oracle takeover and revamp of the site. The talk was titled something like "Top 5 performance tuning tips for MySQL".)

The problem Kai was solving at a customer was to select all data points (x, y) that are within a given radius r from a point (a, b). Quite a common problem given how maps and spatial databases are becoming increasingly popular. So something like:


SELECT userid, x, y FROM locations WHERE
r >= sqrt((a-x)*(a-x)+(b-y)*(b-y));

(r, a and b are parameters and would be substituted by "?" or a number in an actual SQL query.)

The problem is that to calculate the radius from something, you end up using the formula from Pythagoras theorem. While a simple concept, the squares and square roots are too complex for MySQL to make any useful optimization on which index to use, so it ends up calculating the radius for all records in the entire table!

Failing to modify the query to something that MySQL could handle more optimally, Kai eventually had to go back to the customer to say: "Would you mind if instead of looking up all records that are within the circle with (a,b) as the center point, we could return all records that are within a square with (a,b) as a center point"

The customer was ok with that, so Kai modified the query to this simpler form:


SELECT userid, x, y FROM locations WHERE
ABS(x-a)

To really make it easy for MySQL to use an index for this query, we still need to "solve" x and y from the formula:


SELECT userid, x, y FROM locations WHERE
(x a-r) AND (y b-r);

With the above query, MySQL can quickly return you all userid's that are within a certain square that has point (a,b) at its center.

However, the really funny thing is that once you have found the records that are within the square, it becomes possible to go back to the original problem of finding records that are also within the original circle. This is because the circle is fully inside the square, so you can now calculate the Pythagoras theorem only for the records you found with the previous query. Essentially you filter out points that are in the "corners" of the square, and keep the ones that are inside the circle. This is feasible to do efficiently since you do it for the small resultset, not for the entire table.

So by adding the "complex" Pythagoras theorem back into the SQL query, Kai eventually got the answer he was originally looking for:


SELECT userid, x, y FROM locations WHERE
(x a-r) AND (y b-r);
AND r >= sqrt((a-x)*(a-x)+(b-y)*(b-y));

In addition to learning a useful SQL trick, this particular one has stuck with me for a year now, because I really like the general principle: "If you can't solve the problem, try solving some other problem". If you're stuck on a problem, maybe you're trying to solve the wrong thing? Or maybe it's the right thing but you need to make it simpler first, before you can solve it?

Very useful!

Outrun the Lions

The other "deep thought" talk I never heard myself, but read about in the blog post by Baron Schwartz from Percona, called Outrun the Lions. This is based on the story from the African savanna: "Every day, a gazelle wakes and knows it must outrun the fastest lion; every day a lion awakes and knows it must outrun the slowest gazelle.

I've heard the same insight stated in another way: If you wake up in a cave, next to one dragon and one halfling... remember that you don't need to outrun the dragon, you only need to outrun the halfling! I do like the analogy of lions and gazelles better though, since in that story it doesn't matter if you are the gazelle or the lion, you face exactly the same problem.

I'm actually not sure how exactly the lion's and gazelles map into Baron's advice on MySQL performance tuning. Whether you see yourself as the gazelle or the lion, the moral of the story is that you don't need to be the fastest runner on the planet, you only need to be fast enough to survive.

Taken the right way, that can often be comforting advice and also reminds you not to over optimize everything.

ahmad (not verified)

Mon, 2010-02-08 10:13

It's a great advice. It's make me think. I've stuck with a problem for a while, may be I looking the wrong way or may be I should make it simple to understand. Thanks any way. nice blog :)

Add new comment

The content of this field is kept private and will not be shown publicly. Cookie & Privacy Policy
  • No HTML tags allowed.
  • External and mailto links in content links have an icon.
  • Lines and paragraphs break automatically.
  • Web page addresses and email addresses turn into links automatically.
  • Use [fn]...[/fn] (or <fn>...</fn>) to insert automatically numbered footnotes.
  • Each email address will be obfuscated in a human readable fashion or, if JavaScript is enabled, replaced with a spam resistent clickable link. Email addresses will get the default web form unless specified. If replacement text (a persons name) is required a webform is also required. Separate each part with the "|" pipe symbol. Replace spaces in names with "_".
About the bookAbout this siteAcademicAccordAmazonBeginnersBooksBuildBotBusiness modelsbzrCassandraCloudcloud computingclsCommunitycommunityleadershipsummitConsistencycoodiaryCopyrightCreative CommonscssDatabasesdataminingDatastaxDevOpsDistributed ConsensusDrizzleDrupalEconomyelectronEthicsEurovisionFacebookFrosconFunnyGaleraGISgithubGnomeGovernanceHandlerSocketHigh AvailabilityimpressionistimpressjsInkscapeInternetJavaScriptjsonKDEKubuntuLicensingLinuxMaidanMaker cultureMariaDBmarkdownMEAN stackMepSQLMicrosoftMobileMongoDBMontyProgramMusicMySQLMySQL ClusterNerdsNodeNoSQLodbaOpen ContentOpen SourceOpenSQLCampOracleOSConPAMPPatentsPerconaperformancePersonalPhilosophyPHPPiratesPlanetDrupalPoliticsPostgreSQLPresalespresentationsPress releasesProgrammingRed HatReplicationSeveralninesSillySkySQLSolonStartupsSunSybaseSymbiansysbenchtalksTechnicalTechnologyThe making ofTransactionsTungstenTwitterUbuntuvolcanoWeb2.0WikipediaWork from HomexmlYouTube

Search

Recent blog posts

Recent comments