20 years later, what's left of the CAP theorem?

The CAP theorem was published in (party like it's...) 1999: Fox Armando, Brewer Eric A: Harvest, Yield, and Scalable Tolerant Systems.

Since its publication it has provided a beacon and rallying cry around which web scale distributed databases could be built and debated. It(s interpretation) has also evolved. Quite quickly the original 1999 formulation was abandoned, and from there it has further eroded as real world database implementations have provided ever more finer grained trade offs for navigating the space that - after all - was correctly mapped out by the CAP theorem.

Pick ANY two? Really?

In honor of the birthday, I had to go back to the original 1999 paper and reread: Did it really say what we thought it said? And yes it did! I may be 20 years older, but my memory is still reliable... The original phrasing of the CAP theorem was formulated as the simple and easy to remember "pick 2 out of 3" menmonic: "Strong CAP Principle. Strong Consistency, High Availability, Partition-resilience: Pick at most 2."

As database experts more senior than me (and more cooler too because they worked on NoSQL systems when I was still a MySQL expert) would religiously recite this truth, I always wondered silently to myself: So wait, I can choose a database that's both Consistent and Highly Available? Except when there's a network failure / partition... So what does High Availability mean, if not resistance to network failures?

To be fair, there are other kinds of failures too, not just network failures. Nevertheless, enticing as it may be, people rarely spoke of choosing the CA option. Maybe also because no database tried to implement such a confusing and useless concept? Eventually we also started explicitly re-stating the theorem in a form where you can't choose a world without network partitions, rather the choice to be made is between CP and AP. (Also Brewer himself wrote an article years ago explaining this point, as well as some of the below.)

20 years later, what's left of Consistency? Quite a lot, it turns out...

While explicitly stating that the trade off really is just between Consistency and Availability helps, it turns out that this isn't really true either! I mean, it kind of is, but it's still a gross over-simplification, and by the end of this blog post you will have learned that in practice no database system actually provides either of those anyway.

Let's talk about consistency first, shall we? As distributed database systems evolved, they reached a point where their high availability architectures started to be rather resilient, and they would no longer fall over and lose data at the smallest network failure. (Even in the relational MySQL land clustering solutions like Galera and NDB achieved the same, replacing Pacemaker and MMM, which had failed to. And soon after Postgresql has introduced better replication too.) No longer having to lose sleep worrying about split brain and lost data, a NoSQL user may have started asking the question: So what kind of Consistency am I getting, actually?

I know the SQL standard defines 4 Isolation Levels: READ UNCOMMITTED, READ COMITTED, REPEATABLE READ and SERIALIZEABLE. Maybe it's one of those? Nope.

Turns out, even some modern relational databases are nowadays built on a Multi Version Concurrency Control architecture, which results in something called Snapshot Isolation. It's a great isolation level, even if it isn't even given a mention in good old SQL-92. PostgreSQL has solved this problem by giving you snapshot isolation when you asked for repeatable read. Confuses the heck out of the best of us!

But the list does not end there. There's no standards body for this, but Kyle Kingsbury has done an awesome job of cataloguing 16 different Consistency Levels. Your favorite NoSQL database probably implements one or more of those. In fact, it's possible that you get two of them at the same time! For example, MongoDB transactions provide Snapshot Isolation AND Causal Consistency.

In general, the isolation levels from the relational database world (in the left branch) are concerned with how well the database isolates concurrent transactions from each other, while the consistency levels (in the right branch) are concerned with causality between sequential transactions, but typically only for single record operations. The former is primarily provided by the (single node) database engine, while the latter is a property of the clustering machinery.

And, just like you know is the case for the 4 SQL isolation levels, in most cases also NoSQL databases let the client choose which consistency level is needed for the application in question. So Consistency is not this magic thing you either have or have not - it's one of 16 (or more) levels, which you can fluidly choose from, and change, for every single transaction.

Back to the CAP theorem: It says "Strong Consistency". If you looked at Kingsbury's map, you may have noticed that's not actually there! So which level is this "Strong Consistency"? The general consensus is that the CAP paper describes what is actually Linearizeable consistency. It could also be Strict Serializeable (the highest in the hierarchy), but I'm not sure any existing database actually implements that?

FWIW, in Kingsbury's map, all the consistency levels with pink color behave like a CP system.

20 years later, what's left of Availability?

As we learned to build more sophisticated distributed databases, also the AP option in CAP was questioned. The basic argument is that network partitions are an exceptional event, so wiring your entire architecture and consistency semantics to support this mythical AP property is counter productive, when it burdens your application developers with being reponsible for ensuring the data consistency that the database is not providing.

The FoundationDB documentation takes this argumentation even further: The AP kind of Availability isn't really availability at all! Consider a distributed database with (at least) one node on each continent. So let's imagine Australia is cut off from the internet for a day. Because the database is of type AP, application nodes in Australia can continue to read AND WRITE to the node that is still operational in Australia, even if disconnected from the rest of the cluster. The other continents of course remain unaware of what goes on in Australia, but from an application point of view, the writes to the database succeeded and the application code is relieved from error handling duties.

Does this make any sense? It may or may not, depends on the application. Maybe users in Australia are satisfied they can continue to operate on their own data without problems? For many IoT and other high write volume use cases (really, think: waterhose) this semantic is an excellent choice. As long as the application nodes can reach one database node, they can dump the incoming wave of data there and be done with it. And these applications often don't have that high demands on consistency anyway.

Yet in many cases the criticism is right: it creates a lot of complications for (supposedly) solving an exceptional situation.

And, taking the example even further: Imagine that a kangaroo breaks into the Australian data center, breaks the air conditioning and the heat wave melts the database server and all data is lost. Were those Australian database writes now successfully committed to the database or not?

But... Then there's the real spoiler. And thanks to Daniel Abadi for pointing this out and largely inspiring this post as well.

In all the NoSQL excitement, we forgot that no system can be 100% available anyway!

So just like we can only choose different levels of C, we can only choose different levels of A too, but no system will ever be "always" Available.1

In conclusion

So, 20 years later, what's left of the CAP theorem? We quickly discarded the CA option, the whole point is that you have to choose between C and A. We have 16 different levels of Consistency, none of which is actually called "Strong Consistency". And thanks to Daniel Abadi for reminding us that no system provides A either.

I partly wanted to write this blog post as consolation and reassurance. If you ever felt dumb when others talked about AP and CP systems, if you ever were lectured by someone, in a pissing contest, even, it might be nice to learn that the reason the CAP theorem didn't quite make sense is because it doesn't! Jokes on that guy, because no system provides any of C, A or P - strictly speaking.

What fascinates me about the CAP theorem is how it could at the same time be so wrong and so confusing, yet still so tremendously helpful, even fundamental, in re-shaping an entire industry! Simplifications can be powerful, I guess? Happy Birthday, CAP theorem

  • 1To be fair, the original CAP paper only said "Highly Available", not "Available", as many practitioners interpret it. But otoh it's a valid argument that CP systems in the end aren't much less available, and neither CP nor AP are always available, so the point is that the difference and real world value of choosing AP is actually rather incremental.

Mark Callaghan (not verified)

Wed, 2019-02-20 21:45

Abadi has made many great contributions to distributed DBMS. PACELC is my favorite. But most people with production DBMS experience could tell you that 100% A doesn’t happen.

Add new comment

The content of this field is kept private and will not be shown publicly. Cookie & Privacy Policy
  • No HTML tags allowed.
  • External and mailto links in content links have an icon.
  • Lines and paragraphs break automatically.
  • Web page addresses and email addresses turn into links automatically.
  • Use [fn]...[/fn] (or <fn>...</fn>) to insert automatically numbered footnotes.
  • Each email address will be obfuscated in a human readable fashion or, if JavaScript is enabled, replaced with a spam resistent clickable link. Email addresses will get the default web form unless specified. If replacement text (a persons name) is required a webform is also required. Separate each part with the "|" pipe symbol. Replace spaces in names with "_".
About the bookAbout this siteAcademicAccordAmazonBeginnersBooksBuildBotBusiness modelsbzrCassandraCloudcloud computingclsCommunitycommunityleadershipsummitConsistencycoodiaryCopyrightCreative CommonscssDatabasesdataminingDatastaxDevOpsDistributed ConsensusDrizzleDrupalEconomyelectronEthicsEurovisionFacebookFrosconFunnyGaleraGISgithubGnomeGovernanceHandlerSocketHigh AvailabilityimpressionistimpressjsInkscapeInternetJavaScriptjsonKDEKubuntuLicensingLinuxMaidanMaker cultureMariaDBmarkdownMEAN stackMepSQLMicrosoftMobileMongoDBMontyProgramMusicMySQLMySQL ClusterNerdsNodeNoSQLodbaOpen ContentOpen SourceOpenSQLCampOracleOSConPAMPPatentsPerconaperformancePersonalPhilosophyPHPPiratesPlanetDrupalPoliticsPostgreSQLPresalespresentationsPress releasesProgrammingRed HatReplicationSeveralninesSillySkySQLSolonStartupsSunSybaseSymbiansysbenchtalksTechnicalTechnologyThe making ofTransactionsTungstenTwitterUbuntuvolcanoWeb2.0WikipediaWork from HomexmlYouTube