Slides from my Percona Live talk evaluating the new spatial features in MySQL 5.6 and MariaDB 5.5 are now online. This is new material I have never presented before. It is based on work I have done in my job at Nokia HERE.com location services. So even if at this conference it draws less attention than my HA talks, it is actually what I'm most proud of to present.
TL;DR summary is that PostgreSQL has lots of features but MySQL has much better ease of use and performance. (I copy paste this standard sentence into any PostgreSQL vs MySQL evaluation I do :-) The MongoDB info is basically outdated, as the new 2.4 release introduces completely new implementation based on GeoJSON, new indexing, neither of which I tested.
With MySQL it was for a long time the case that a lot of sub queries would actually perform poorly, because of poor execution plans. (This is no longer the case in MariaDB 5.5 or the upcoming MySQL 5.6.) Because of this, any MySQL DBA knows the rule of thumb that sub-queries should basically be avoided and you can usually get the same result by using JOINs instead.
I've now learned why PostgreSQL DBAs like sub queries so much. PostgreSQL - being the most advanced open source database - apparently does the exact opposite optimizations as MySQL: it requires you to rewrite simple queries into complex subqueries to get what you want. (Update: Mark Callaghan points out that MySQL - while it does create indexes automatically for foreign keys - actually has the same problems with the query plan as Postgres has in this post. See comments for details.)
My three previous blog posts I already wrote from Froscon. In this post I still want to go back and mention some people I met and discussions I had.
The MySQL side
There were of course many MySQL people, with both SkySQL and Oracle sponsoring. It was great to meet Carsten from Oracle, who has joined the MySQL Sales Engineer team in Europe (he moved from an OpenOffice position). That's my former team, so it was great to see a new face!
Going there the person I was most looking forward to meet was Hana Hütter, formerly a MySQL account manager for Central Europe, and now doing the same at SkySQL. My first ever MySQL sales gig was with Hana, and Ralf Gebhart who is also now with SkySQL but was not at Froscon. While Ralf was there only that first time to teach me how to be a Sales Engineer, with Hana we then continued to sell MySQL into telecom companies in many European cities. I had not met Hana since I left Sun. It was really great to see again. We had lunch, I told about my experiments with Galera and she told about her customers being interested in it too. It was great to get in touch with the "frontlines" again!
She also told me a great story about MySQL adoption in Europe...
In my quest to understand spatial GIS functionality, I have come to the ultimate goal: evaluation the actual database products themselves.
PostgreSQL / PostGIS
PostGIS is a variant of PostgreSQL with spatial extensions. The main reason for maintaining the GIS feature set outside of PostgreSQL proper seems to be licensing: the spatial extensions are
LGPL GPL licensed.
PostGIS is widely recognized as the most mature and feature-rich GIS implementation for SQL databases (and perhaps any database), matched only by the costly Spatial extension for the Oracle Enterprise database. (See comparisons in the links below.)
While the underlying index should be opaque to the user of a DBMS with spatial features, the API used to define spatial types and operate on them is of course more visible. The relevant standard in this space is often referred to as "OpenGIS", however the Open Geospatial Consortium in fact defines a long list of standards. The standard relevant to SQL databases is known more precisely as "OpenGIS Implementation Specification for Geographic information - Simple feature access - Part 2: SQL option" aka "Simple feature access".
It is not meaningful to recite the standard at length in my blog, my focus is instead on actual implementations that I will blog about later. The following points are however worth noting:
I work for a company that is the leading supplier of automotive maps, and wants to be the leading supplier of online maps. So it was only a matter of time that I needed to learn more about how spatial extensions work in different open source databases. Let's start from the beginning, understanding various spatial data structures that are used in implementations...
Links are provided to Wikipedia articles - which are both comprehensive, yet easy to understand - for those who want to get a deeper understanding of each structure. All Wikipedia articles on spatial indexes are listed here: http://en.wikipedia.org/wiki/Spatial_index#Spatial_index
My recent account of The State of MySQL forks seems to have gotten quite a lot of attention. I promised to follow up with a separate piece about Drizzle and also PostgreSQL, as the other major open source database, so I'd better keep that promise now.
During this autumn I've had the pleasure of working closely with Georg Greve, Founder and former President of the Free Software Foundation Europe. Seeing that he had just left his previous post, we realized that his experience would be invaluable to do some of the heavy lifting involved in setting up processes for this fresh association. And so it has been!
As a result, we now have mundane things like a post box and accountant in Zurich where the non-profit association has its legal home. We are finally able to accept membership applications through the new website.