Why Cache?
- Speed: Reading from memory is ~100,000x faster than reading from disk. A cache hit avoids the slow path entirely.
- Reduced Load: Fewer queries hit the database, freeing it to handle writes and complex queries.
- Cost Efficiency: Serving responses from cache requires fewer servers and less database capacity.
- Resilience: A cache can serve stale data temporarily if the database goes down.
Cache Layers
Caching can happen at every layer of the stack, from the browser and CDN at the edge, through reverse proxies and the application tier, down to the database itself.
Caching Strategies
Cache-Aside (Lazy Loading)
The most common pattern. The application checks the cache first. On a cache miss, it reads from the database, stores the result in the cache, and returns it to the client.
```python
result = cache.get(key)
if result is None:               # cache miss
    result = db.query(key)       # read from the database
    cache.set(key, result, ttl)  # populate the cache for next time
return result
```

Pros: Cache only contains data that is actually requested. Simple to implement.
Cons: First request always hits the database. Cache and database can become inconsistent if data is updated in the database without invalidating the cache.
Write-Through
Every write goes to both the cache and the database simultaneously. The cache always has the latest data.
Pros: Cache is always up to date. No stale data.
Cons: Higher write latency (two writes per operation). Cache may contain data that is never read (wasted memory).
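A minimal write-through sketch, using plain dicts as stand-ins for a real cache and database (an assumption for illustration):

```python
class WriteThroughCache:
    """Write-through sketch: every write updates the database and the
    cache together, so reads can always be served from the cache."""

    def __init__(self):
        self.cache = {}  # stand-in for an in-memory cache
        self.db = {}     # stand-in for the database

    def write(self, key, value):
        # Write the database first, then the cache, so a DB failure
        # never leaves the cache ahead of the source of truth.
        self.db[key] = value
        self.cache[key] = value

    def read(self, key):
        # The cache is always current, so reads never touch the DB.
        return self.cache.get(key)
```

Ordering the database write first is a deliberate choice: if it fails, the cache is never updated with data the database does not have.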
Write-Behind (Write-Back)
Writes go to the cache first, and the cache asynchronously flushes changes to the database later. This significantly reduces write latency as perceived by the client.
Pros: Very fast writes. Batching writes to the database is efficient.
Cons: Risk of data loss if the cache crashes before flushing. More complex to implement.
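A write-behind sketch using a background thread and a queue; the dicts standing in for the cache and database, and the single worker thread, are simplifying assumptions:

```python
import queue
import threading

class WriteBehindCache:
    """Write-behind sketch: writes land in the cache immediately and are
    flushed to the database asynchronously by a background worker."""

    def __init__(self, db):
        self.cache = {}
        self.db = db                       # stand-in for the database
        self.pending = queue.Queue()       # buffered writes awaiting flush
        threading.Thread(target=self._flush_loop, daemon=True).start()

    def write(self, key, value):
        self.cache[key] = value            # fast path: in-memory only
        self.pending.put((key, value))     # enqueue for later persistence

    def _flush_loop(self):
        while True:
            key, value = self.pending.get()  # blocks until a write arrives
            self.db[key] = value             # slow path, off the client's critical path
            self.pending.task_done()
```

The data-loss risk is visible here: anything still sitting in `pending` when the process dies never reaches the database.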
Read-Through
Similar to cache-aside, but the cache itself is responsible for loading data from the database on a miss. The application only talks to the cache.
Pros: Simpler application code. Cache handles all data loading logic.
Cons: Tightly couples cache to database. Not all cache solutions support this natively.
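The difference from cache-aside is who owns the loading logic. A read-through sketch, where `loader` is a hypothetical callable that fetches from the database:

```python
class ReadThroughCache:
    """Read-through sketch: the cache owns the miss-handling logic;
    the application only ever calls get()."""

    def __init__(self, loader):
        self.loader = loader  # hypothetical DB-fetch callable
        self.store = {}

    def get(self, key):
        if key not in self.store:
            # On a miss, the cache (not the application) loads the data.
            self.store[key] = self.loader(key)
        return self.store[key]
```

The application code shrinks to a single `get()` call, at the cost of wiring the database access into the cache itself.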
Eviction Policies
When a cache is full, it must decide which entries to remove to make room for new ones:
| Policy | Behavior | Best For |
|---|---|---|
| LRU | Least Recently Used: evicts the entry that has not been accessed for the longest time. | General purpose. Most popular choice. |
| LFU | Least Frequently Used: evicts the entry accessed the fewest times. | Workloads with distinct popularity tiers. |
| FIFO | First In, First Out: evicts the oldest entry. | Simple, predictable behavior. |
| Random | Evicts a random entry. | When access patterns are unpredictable. |
| TTL-based | Entries expire after a fixed time-to-live. | Data with known staleness tolerance. |
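LRU, the most popular policy in the table above, can be sketched with Python's `OrderedDict`, which tracks insertion order:

```python
from collections import OrderedDict

class LRUCache:
    """LRU eviction sketch: the OrderedDict's order doubles as the
    recency list, oldest entry first."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def get(self, key):
        if key not in self.entries:
            return None
        self.entries.move_to_end(key)  # mark as most recently used
        return self.entries[key]

    def set(self, key, value):
        if key in self.entries:
            self.entries.move_to_end(key)
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used
```

Production caches (Redis, for example) use approximations of LRU to avoid the bookkeeping cost, but the behavior is the same in spirit.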
Cache Invalidation
The hardest problem in caching. When data in the database changes, the corresponding cache entry becomes stale. Strategies include:
- TTL (Time-to-Live): Set an expiration time. After TTL expires, the next read triggers a cache miss and fresh load. Simple but allows staleness up to the TTL duration.
- Event-Driven Invalidation: When data is written to the database, publish an event that triggers cache deletion. Tighter consistency but more complex.
- Versioned Keys: Append a version number to cache keys. When data changes, increment the version. Old keys naturally become unreferenced and expire via TTL.
- Active Invalidation: The write path explicitly deletes or updates the cache entry in the same operation that modifies the database.
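The versioned-keys strategy can be sketched as follows, with a dict standing in for a store like Redis and the version counters kept alongside the data (an assumption; they could live anywhere shared):

```python
class VersionedCache:
    """Versioned-key sketch: invalidation bumps a version counter
    instead of deleting entries; stale keys simply stop being read."""

    def __init__(self):
        self.cache = {}     # stand-in for Redis
        self.versions = {}  # logical name -> current version

    def _key(self, name):
        v = self.versions.get(name, 0)
        return f"{name}:v{v}"

    def get(self, name):
        return self.cache.get(self._key(name))

    def set(self, name, value):
        self.cache[self._key(name)] = value

    def invalidate(self, name):
        # Bump the version: old entries become unreachable and, in a
        # real store, would expire via their TTL.
        self.versions[name] = self.versions.get(name, 0) + 1
```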
Cache Stampede (Thundering Herd)
When a popular cache entry expires, many concurrent requests miss the cache simultaneously and all hit the database at once, potentially overwhelming it. Mitigations:
- Locking: Only one request fetches from the database; others wait for the cache to be repopulated.
- Probabilistic Early Expiration: Randomly refresh cache entries before they actually expire, spreading the load.
- Background Refresh: A background process refreshes popular entries before their TTL expires.
- Stale-While-Revalidate: Serve the stale cached value immediately while fetching a fresh copy in the background.
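The locking mitigation can be sketched with one lock per key; `loader` is a hypothetical callable for the expensive database fetch:

```python
import threading

class StampedeGuard:
    """Per-key locking sketch: on a miss, only one thread recomputes the
    entry; concurrent callers block, then read the freshly cached value."""

    def __init__(self, loader):
        self.loader = loader  # hypothetical expensive DB fetch
        self.cache = {}
        self.locks = {}
        self.meta = threading.Lock()  # guards the locks dict itself

    def _lock_for(self, key):
        with self.meta:
            return self.locks.setdefault(key, threading.Lock())

    def get(self, key):
        value = self.cache.get(key)
        if value is not None:
            return value
        with self._lock_for(key):      # only one thread proceeds per key
            if key not in self.cache:  # re-check: a peer may have filled it
                self.cache[key] = self.loader(key)
            return self.cache[key]
```

The re-check inside the lock is the essential detail: threads that waited must not repeat the load once the first thread has populated the cache.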
Distributed Caching
In a multi-server environment, a centralized cache (like Redis or Memcached) ensures all application servers share the same cached data. Key considerations:
- Partitioning: Distribute cache keys across multiple cache nodes using consistent hashing (Chapter 11).
- Replication: Replicate data across cache nodes for availability. Redis Cluster and Redis Sentinel support this.
- Serialization: Data must be serialized to store in cache and deserialized on read. Choose efficient formats (MessagePack, Protobuf) for large objects.
- Memory Management: Monitor memory usage. An out-of-memory cache starts evicting entries aggressively, reducing hit rate.
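The partitioning step can be sketched with a small consistent-hash ring; the node names and replica count below are illustrative assumptions, not a production configuration:

```python
import hashlib
from bisect import bisect

class HashRing:
    """Consistent-hashing sketch for routing cache keys to nodes.
    Each node is hashed many times ("virtual nodes") to smooth the
    distribution of keys across the ring."""

    def __init__(self, nodes, replicas=100):
        self.ring = []  # sorted list of (hash, node) points
        for node in nodes:
            for i in range(replicas):
                h = self._hash(f"{node}#{i}")
                self.ring.append((h, node))
        self.ring.sort()
        self._hashes = [h for h, _ in self.ring]

    def _hash(self, s):
        return int(hashlib.md5(s.encode()).hexdigest(), 16)

    def node_for(self, key):
        # Walk clockwise to the first ring point at or after the key's hash.
        idx = bisect(self._hashes, self._hash(key)) % len(self.ring)
        return self.ring[idx][1]
```

The same key always maps to the same node, and adding or removing a node remaps only the keys nearest to it on the ring, which is exactly the property Chapter 11's consistent hashing discussion relies on.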
Redis vs. Memcached
Redis
- Rich data structures (strings, hashes, lists, sets, sorted sets, streams)
- Persistence options (RDB snapshots, AOF log)
- Replication and clustering built-in
- Pub/sub messaging
- Lua scripting
- Mostly single-threaded command execution (Redis 6.0+ adds optional I/O threading)
Memcached
- Simple key-value only
- No persistence
- Multi-threaded; scales well on multi-core
- Lower memory overhead per key
- Simpler operational model
- Better for pure caching (no data structure needs)
Key Takeaways
- Caching can reduce latency from hundreds of milliseconds (database) to sub-millisecond (in-memory cache).
- Cache-aside is the most common and flexible strategy. Use write-through or write-behind when consistency or write performance is critical.
- Always set a TTL. Unbounded cache growth will eventually exhaust memory.
- LRU is the default eviction policy for most use cases.
- Plan for cache invalidation from the start. Retrofitting it is painful.
- Guard against cache stampedes with locking or probabilistic early expiration.
- Prefer Redis for rich data structure needs; Memcached for pure, simple caching at scale.
Chapter Check-Up
Quick quiz to reinforce what you just learned.