CAP Theorem & Distributed Systems

The CAP theorem is a foundational framework for reasoning about the trade-offs inherent in distributed systems. This chapter covers CAP, its practical extension PACELC, consistency models, and the consensus algorithms that underpin distributed coordination.

The CAP Theorem

Formulated by Eric Brewer in 2000 and later proved by Seth Gilbert and Nancy Lynch in 2002, the CAP theorem states that a distributed data store can provide at most two of the following three guarantees simultaneously:

[Figure: CAP theorem triangle. Vertices: Consistency (C), Availability (A), Partition Tolerance (P). Edge pairings: CA systems (single-node RDBMS), AP systems (Cassandra, DynamoDB), CP systems (HBase, MongoDB).]

Definitions

  • Consistency (C): Every read receives the most recent write or an error. All nodes see the same data at the same time. This is linearizability, the strongest consistency model.
  • Availability (A): Every request receives a non-error response, even if it may not reflect the most recent write. The system always responds.
  • Partition Tolerance (P): The system continues to operate despite arbitrary message loss or delay between nodes (network partitions).
The Real Choice Is Between C and A
In any real distributed system, network partitions will happen. You cannot opt out of partition tolerance. So the practical choice is: during a partition, do you sacrifice consistency (return potentially stale data) or availability (return an error)?

CP vs. AP Systems

CP (Consistency + Partition Tolerance)

  • During a partition, some requests may be rejected to ensure consistency.
  • Examples: HBase, MongoDB (default), ZooKeeper, etcd.
  • Use when: data correctness is more important than availability (financial systems, inventory management).

AP (Availability + Partition Tolerance)

  • During a partition, all nodes respond, but some may return stale data.
  • Examples: Cassandra, DynamoDB, CouchDB.
  • Use when: the system must always respond (social media feeds, shopping carts, DNS).
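Dynamo-style stores such as Cassandra let you tune where a deployment sits on this CP–AP spectrum via quorum sizes: with N replicas, a read quorum R and write quorum W overlap in at least one replica whenever R + W > N, so every read sees the latest acknowledged write. A minimal sketch of that condition (the function name is illustrative, not any real client API):

```python
def is_strongly_consistent(n: int, r: int, w: int) -> bool:
    """With n replicas, reads touching r replicas and writes touching w,
    the read and write sets must intersect when r + w > n, so a read
    always observes the most recent acknowledged write."""
    return r + w > n

# QUORUM reads and writes on 3 replicas: the sets overlap (strong).
print(is_strongly_consistent(n=3, r=2, w=2))  # True
# ONE/ONE favors latency and availability: reads may be stale.
print(is_strongly_consistent(n=3, r=1, w=1))  # False
```

Lowering R and W moves the system toward AP behavior (faster, possibly stale); raising them moves it toward CP behavior (consistent, but unavailable if a quorum cannot be reached during a partition).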

PACELC: Beyond CAP

The CAP theorem only describes behavior during a partition. PACELC extends it:

If there is a Partition (P), choose between Availability (A) and Consistency (C). Else (E), when the system is running normally, choose between Latency (L) and Consistency (C).

System     | Partition: A or C? | Normal: L or C? | Classification
-----------|--------------------|-----------------|---------------
DynamoDB   | A                  | L               | PA/EL
Cassandra  | A                  | L               | PA/EL
MongoDB    | C                  | C               | PC/EC
HBase      | C                  | C               | PC/EC
Cosmos DB  | Configurable       | Configurable    | Tunable

Consistency Models

Beyond the CAP binary of "consistent or not," there is a spectrum of consistency models:

Strong Consistency (Linearizability)

Every read returns the most recent write. The system behaves as if there is a single copy of the data. This is the strongest guarantee but the most expensive in terms of latency and coordination.

Sequential Consistency

All clients observe a single total order of operations, and that order respects each client's program order, but there is no real-time guarantee: Client A's writes appear in the order they were issued, yet Client B may observe them with a delay.

Causal Consistency

If event A causes event B, then everyone sees A before B. Unrelated events may be seen in different orders by different nodes. A practical middle ground used by many distributed systems.
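Causal ordering is commonly tracked with vector clocks: each node keeps a counter per participant, and one event causally precedes another only if its clock is component-wise less than or equal, with at least one component strictly smaller. A minimal sketch (the scenario names are illustrative):

```python
from typing import Dict

Clock = Dict[str, int]

def happened_before(a: Clock, b: Clock) -> bool:
    """True if vector clock `a` causally precedes `b`: every component
    of a is <= the corresponding component of b, and at least one is
    strictly smaller. Missing entries count as 0."""
    keys = set(a) | set(b)
    dominated = all(a.get(k, 0) <= b.get(k, 0) for k in keys)
    strict = any(a.get(k, 0) < b.get(k, 0) for k in keys)
    return dominated and strict

def concurrent(a: Clock, b: Clock) -> bool:
    """Neither event precedes the other: they are causally unrelated,
    so different replicas may deliver them in different orders."""
    return not happened_before(a, b) and not happened_before(b, a)

post  = {"alice": 1}               # Alice writes a post
reply = {"alice": 1, "bob": 1}     # Bob replies after seeing the post
other = {"carol": 1}               # Carol acts independently

print(happened_before(post, reply))  # True: every node shows post first
print(concurrent(reply, other))      # True: delivery order may differ
```

A causally consistent store must deliver `post` before `reply` everywhere, but is free to interleave `other` anywhere.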

Eventual Consistency

If no new writes occur, all replicas will eventually converge to the same state. The weakest guarantee, but the one that allows the highest availability and lowest latency. Used by DNS, email, and most AP systems.
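Convergence under eventual consistency requires a merge function that gives the same result regardless of the order replicas exchange state. A last-writer-wins merge is the simplest such function; this sketch (timestamps and keys are illustrative) shows that merging in either direction yields the same state:

```python
def merge(r1: dict, r2: dict) -> dict:
    """Last-writer-wins merge: for each key, keep the (timestamp, value)
    pair with the higher timestamp. The operation is commutative and
    idempotent, so any gossip order converges to the same state."""
    merged = dict(r1)
    for key, (ts, val) in r2.items():
        if key not in merged or ts > merged[key][0]:
            merged[key] = (ts, val)
    return merged

# Two replicas that diverged during a partition; values carry timestamps.
a = {"cart": (1, ["book"]), "theme": (3, "dark")}
b = {"cart": (2, ["book", "pen"])}

# Merging in either order yields the same converged state.
print(merge(a, b) == merge(b, a))  # True
```

Note that last-writer-wins silently discards the losing write; systems that cannot tolerate that (e.g., shopping carts) keep both versions and resolve conflicts at read time or with CRDTs.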

Consensus Algorithms

Consensus algorithms allow distributed nodes to agree on a single value (e.g., "who is the leader?" or "what is the committed value?") despite failures.

Raft

Raft is designed to be understandable. It elects a leader, and the leader manages all writes. A write is committed when a majority of nodes acknowledge it.

  • Leader election: Nodes start as followers. If a follower does not hear from a leader within a timeout, it becomes a candidate and requests votes. The candidate with a majority of votes becomes the leader.
  • Log replication: The leader receives writes, appends them to its log, and replicates them to followers. When a majority have the entry, it is committed.
  • Safety: A new leader always has all committed entries. No committed data is ever lost.

Used by: etcd, Consul, CockroachDB, TiKV.
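The vote-granting rule at the heart of leader election can be sketched in a few lines: a node grants at most one vote per term and steps down to the highest term it has seen. This is a simplified illustration, not a real Raft implementation (real Raft also compares log freshness before granting a vote):

```python
def request_vote(state: dict, candidate_term: int, candidate_id: str) -> bool:
    """Follower-side vote handler, simplified: reject stale terms,
    adopt newer terms (resetting our vote), and grant at most one
    vote per term."""
    if candidate_term < state["term"]:
        return False                      # stale candidate: reject
    if candidate_term > state["term"]:
        state["term"] = candidate_term    # newer term seen: reset vote
        state["voted_for"] = None
    if state["voted_for"] in (None, candidate_id):
        state["voted_for"] = candidate_id
        return True
    return False

# Candidate "n1" times out in term 1 and campaigns in term 2.
nodes = [{"term": 1, "voted_for": None} for _ in range(5)]
votes = sum(request_vote(n, candidate_term=2, candidate_id="n1") for n in nodes)
print(votes >= len(nodes) // 2 + 1)  # True: n1 wins with a majority
```

Because each node votes at most once per term, two candidates can never both assemble a majority in the same term, which is what makes the elected leader unique.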

Paxos

Paxos is the original consensus algorithm, proven correct but notoriously difficult to understand and implement. It inspired Raft, which achieves the same guarantees with a simpler protocol.

Used by: Google Chubby, Google Spanner (Multi-Paxos).

Distributed System Fallacies

L. Peter Deutsch and colleagues at Sun Microsystems enumerated eight assumptions that developers new to distributed systems wrongly make. Internalizing these prevents naive architectural mistakes:

  1. The network is reliable.
  2. Latency is zero.
  3. Bandwidth is infinite.
  4. The network is secure.
  5. Topology does not change.
  6. There is one administrator.
  7. Transport cost is zero.
  8. The network is homogeneous.
Practical Implication
Every network call can fail, be slow, or return a partial result. Design every inter-service interaction with timeouts, retries (with exponential backoff), circuit breakers, and fallback behaviors.
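The retry-with-backoff pattern described above can be sketched as follows; the helper name and delay parameters are illustrative, not from any particular library:

```python
import random
import time

def call_with_retries(fn, attempts=4, base_delay=0.1, max_delay=2.0):
    """Retry a flaky network call with exponential backoff and full
    jitter. `fn` must raise on failure; the final error propagates."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise                      # out of attempts: give up
            # Exponential backoff: 0.1s, 0.2s, 0.4s... capped at max_delay.
            # Full jitter spreads retries out to avoid thundering herds.
            delay = min(max_delay, base_delay * 2 ** attempt)
            time.sleep(random.uniform(0, delay))

# Simulate a call that fails twice, then succeeds.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("simulated network failure")
    return "ok"

print(call_with_retries(flaky))  # ok
```

In production this sits behind a circuit breaker, which stops issuing calls entirely after repeated failures so a struggling downstream service is not hammered by retries.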

Key Takeaways

  • The CAP theorem states you cannot have consistency, availability, and partition tolerance simultaneously. Since partitions are unavoidable, the real choice is C vs. A during failures.
  • PACELC extends CAP to normal operation: even without partitions, you trade latency for consistency.
  • Strong consistency is expensive. Eventual consistency is cheap and fast. Choose based on business requirements.
  • Raft is the go-to consensus algorithm for leader election and replicated state machines. Understand how it works.
  • Never assume the network is reliable. Design every distributed interaction to handle failure gracefully.
