Data massage: how databases have been scaled from one to one million nodes

2013/12/03 by admin

Despite the CAP theorem, databases have been scaled from one to one million nodes. Globally distributed, highly available and fast. Databases supporting ACID transactions. How? A story and slides from a three-hour workshop at the PHP Summit 2013 (Berlin)…

Data massage: How databases have been scaled from one to one million nodes from Ulf Wendel

Please note: this is a long deep dive with quite some theory. It is neither funny NoSQL bashing nor a practical guide to MySQL scaling. Only 10 MySQL slides or so…

As we all know, traditional relational databases have come under massive pressure from NoSQL in recent years. They have been considered inflexible, slow and not designed for the Cloud. Or just too expensive if massive scalability was needed. It was argued that the CAP theorem basically means picking “two out of three” of consistency, availability and partition tolerance. Not long ago, Brewer himself said a word or two on the abuse of CAP for fuzz.

The talk starts in the 1960s and 1970s with relational databases before it explores CAP and early NoSQL systems. Two early systems at opposite ends of the design space CAP allows are Amazon Dynamo and Bigtable. They became blueprints for many distributed NoSQL systems: Riak, HBase, BigCouch, … Then came Google Spanner (ACID transactions), which is highly available, replicates synchronously between data centers and is fast – how?

Some distributed NoSQL systems innovate at the borderline between distributed systems theory and the theory of parallel and distributed databases. Unless you work with a system at one of the extremes of the CAP design space, such as Riak, little of this is ever visible to you. Time to look at their algorithms and explore what their findings may mean for relational systems that support ACID transactions.

It turns out that it could be fruitful to learn from each other. Some NoSQL folks seem to dig into distributed SQL query execution and on-disk storage, topics from the theory of (parallel) databases. RDBMS and NoSQL can intersect in topics such as co-location in the presence of partitioning to avoid distributed queries/transactions. RDBMS can try to move towards microkernels and loosely coupled modules – something no classic RDBMS developer would ever do as it may mean longer paths (= slower)… – and so forth.

Whatever. You can look at NoSQL vs. RDBMS from a “relation vs. JSON”, “joins are evil” perspective. Or you dig deeper to see how they scale from one to one million nodes. Unfortunately, you may come across topics that may be new to you as a (PHP) application developer:

Overlay networks, distributed hash tables
Vector clocks, quorums, CRDTs
FLP impossibility theorem, Paxos protocol family
Virtual Synchrony, Atomic Broadcast

If you don’t mind that, you may answer the question whether Spanner proves CAP wrong yourself. The talk will also help you to judge whether the CAP theorem matters at all in practice!
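To give a taste of one item from the topic list, here is a minimal sketch of vector clocks, the bookkeeping that Dynamo-style systems use to tell whether one version of a record descends from another or whether two versions were written concurrently. It is not taken from the slides or from any particular product. The function names (increment, merge, compare) and the replica names "A" and "B" are made up for illustration, and Python is used only to keep the example short.

```python
# Vector clock sketch. A clock is a dict mapping a replica name to a counter.
# All names here are hypothetical and chosen for illustration only.

def increment(clock, node):
    """Return a copy of the clock with the counter of `node` advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def merge(a, b):
    """Element-wise maximum: the clock of a replica that has seen both a and b."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in set(a) | set(b)}


def compare(a, b):
    """Classify a relative to b as 'equal', 'before', 'after' or 'concurrent'."""
    nodes = set(a) | set(b)
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"


# Replica A writes a value, then two replicas update it independently.
v0 = increment({}, "A")   # first write, handled by A
v1 = increment(v0, "A")   # A updates its copy
v2 = increment(v0, "B")   # B updates the same starting version

print(compare(v0, v1))    # v0 is an ancestor of v1
print(compare(v1, v2))    # neither descends from the other: a conflict
print(compare(v1, merge(v1, v2)))  # the merged clock supersedes v1
```

The "concurrent" case is the interesting one: the system cannot pick a winner from the clocks alone, so it either keeps both versions for the application to reconcile or falls back on a rule of its own.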

Happy hacking!

@Ulf_Wendel