Databases

Node failures in Serializeable Snapshot Isolation

Previously in this series: Reading about Serializeable Snapshot Isolation.

Last week I took a deep dive into articles on Serializeable Snapshot Isolation. It ended on a sad note, as I learned that to extended SSI to a sharded database using 2PC for distributed transactions1 , there is a need to persist - which means replicate - all read sets in addition to all writes.

This conclusion has been bothering me, so before diving into other papers on distributed serializeable transactions, I wanted to understand better what exactly happens in SSI when a node (shard) fails. This blog doesn't introduce any new papers, just more details. And more speculation.

  • 1for example, MongoDB

Reading about Serializeable Snapshot Isolation

Previously in this series: Toward strict consistency in distributed databases.

Apparently, when Snapshot Isolation was invented, it was received with great enthusiasm, since it avoids all the anomalies familiar from the SQL standard. It was only as time went by that people discovered that SI had some completely new anomalies and therefore wasn't equivalent to serializeable isolation. (This I learned from the PostgreSQL paper below.)

Toward strict consistency in distributed databases

July is vacation time in Finland. Which means it is a great time to read blogs and papers on consistency in distributed databases! This and following blogs are my reading notes and thoughts. Mostly for my own benefit but if you benefit too, that's great. Note also that my ambition is not to appear like an authority on consistency. I just write down questions I had and things I learned. If you have comments, feel free to write some below. That's part of the point of blogging this!

20 years later, what's left of the CAP theorem?

The CAP theorem was published in (party like it's...) 1999: Fox Armando, Brewer Eric A: Harvest, Yield, and Scalable Tolerant Systems.

Since its publication it has provided a beacon and rallying cry around which web scale distributed databases could be built and debated. It(s interpretation) has also evolved. Quite quickly the original 1999 formulation was abandoned, and from there it has further eroded as real world database implementations have provided ever more finer grained trade offs for navigating the space that - after all - was correctly mapped out by the CAP theorem.

Pick ANY two? Really?

Presentation: Databases and the Cloud (and why it is more difficult for databases)

A week ago I again had the pleasure to give a guest lecture at Tampere University of Technology. I've visited them the first time when I worked as MySQL pre-sales in Sun.

To be trendy, I of course had to talk about the cloud. It turns out every section has the subtitle "...and why it is more difficult for databases". I also rightfully claim to have invented the NoSQL key-value development model in 2005.

The different ways of doing HA in MySQL

A week ago Baron wrote a blog post which can only be described as the final nail in the coffin for MMM. At MySQL AB we never used or recommended MMM as a High Availability solution. I never really asked about details about that, but surely one reason was that it is based on using the MySQL replication. At MySQL/Sun we recommended against asynchronous replication as a HA solution so that was the end of it as far as MMM was concerned. Instead we recommended DRBD, shared disk or MySQL Cluster based solutions. Of course, to replicate across continents (geographical redundancy) you will mostly just use asynchronous replication, also MySQL Cluster used the standard MySQL replication for that purpose.

A database for everyone (comments on Sybase acquisition)

One thing I haven't seen anybody commenting on is the fact that with SAP acquiring Sybase, it will be the last major independent database company to be merged into a larger SW company. (To say this, you can qualify MySQL AB as a major database company, but disqualify, say, EnterpriseDB or InterBase, which imho is entirely reasonable.)

Analyst reports on database market share in SMB segment? Anyone?

Has anyone read the report Microsoft, IBM and Oracle Lead the SMB and Mid-Market Database Segment - Yankee Group - 7/31/2006 - 5 Pages - ID: YANL1327246?

Paying 800USD just to know the percentages of each database is a bit expensive for my small startup budget, especially since it is not me who needs the information.

About the bookAbout this siteAcademicAccordAmazonBeginnersBooksBuildBotBusiness modelsbzrCassandraCloudcloud computingclsCommunitycommunityleadershipsummitConsistencycoodiaryCopyrightCreative CommonscssDatabasesdataminingDatastaxDevOpsDistributed ConsensusDrizzleDrupalEconomyelectronEthicsEurovisionFacebookFrosconFunnyGaleraGISgithubGnomeGovernanceHandlerSocketHigh AvailabilityimpressionistimpressjsInkscapeInternetJavaScriptjsonKDEKubuntuLicensingLinuxMaidanMaker cultureMariaDBmarkdownMEAN stackMepSQLMicrosoftMobileMongoDBMontyProgramMusicMySQLMySQL ClusterNerdsNodeNoSQLodbaOpen ContentOpen SourceOpenSQLCampOracleOSConPAMPPatentsPerconaperformancePersonalPhilosophyPHPPiratesPlanetDrupalPoliticsPostgreSQLPresalespresentationsPress releasesProgrammingRed HatReplicationSeveralninesSillySkySQLSolonStartupsSunSybaseSymbiansysbenchtalksTechnicalTechnologyThe making ofTransactionsTungstenTwitterUbuntuvolcanoWeb2.0WikipediaWork from HomexmlYouTube

Search

Recent blog posts

Recent comments