Database Scaling

When a single database server can no longer handle the load, whether in read throughput, write throughput, or storage capacity, you must scale. This chapter covers vertical and horizontal scaling strategies, replication topologies, partitioning, and sharding.

Vertical vs. Horizontal Scaling

Vertical Scaling (Scale Up)

  • Add more CPU, RAM, or storage to the existing server.
  • No application changes needed.
  • Has a hard ceiling: machines have maximum specs.
  • Single point of failure remains.
  • Can get very expensive at the high end.

Horizontal Scaling (Scale Out)

  • Add more servers and distribute data across them.
  • Application must be aware of multiple nodes.
  • Theoretically unlimited scaling.
  • Provides redundancy and fault tolerance.
  • Adds complexity: data distribution, consistency, coordination.

Rule of Thumb
Scale vertically first: it is simpler and cheaper for moderate loads. Switch to horizontal scaling when you hit vertical limits, need geographic distribution, or require high availability.

Replication

Replication copies data from one database server (the leader/primary) to one or more replicas (followers/secondaries). It improves read throughput and provides redundancy.

Leader-Follower (Master-Slave)

Leader-Follower Replication
[Diagram: the application sends writes (and optionally reads) to the leader (primary); the leader streams changes over the replication stream to followers 1, 2, and 3, which serve reads.]
  • All writes go to the leader. The leader streams changes to followers.
  • Reads can go to any follower, distributing read load.
  • Synchronous replication: The leader waits for the follower to acknowledge before confirming the write. Strong consistency but higher latency.
  • Asynchronous replication: The leader confirms immediately; followers catch up later. Lower latency but followers may be slightly behind (replication lag).
  • Failover: If the leader dies, a follower is promoted to leader (manually or via automated tooling like Patroni for PostgreSQL).
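The read/write split described above can be sketched as a small query router. This is a minimal sketch, not a production router: the connection strings are hypothetical placeholders, and the read/write classification is deliberately naive (a real router must also pin a session's reads to the leader right after its own writes, to hide replication lag).

```python
import random

class Router:
    """Route writes to the leader and spread reads across followers."""

    def __init__(self, leader, followers):
        self.leader = leader
        self.followers = followers

    def for_query(self, sql):
        # Naive classification: anything starting with SELECT is a read.
        # Everything else (INSERT/UPDATE/DELETE/DDL) goes to the leader.
        if sql.lstrip().upper().startswith("SELECT"):
            return random.choice(self.followers)
        return self.leader

# Hypothetical node addresses for illustration only.
router = Router(
    leader="primary:5432",
    followers=["replica1:5432", "replica2:5432", "replica3:5432"],
)
```

With asynchronous replication, a `SELECT` routed to a follower may return slightly stale data; that is the replication lag trade-off in code form.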

Leader-Leader (Multi-Master)

Multiple nodes accept writes. Changes replicate between all leaders. This enables writes in multiple regions but introduces conflict resolution complexity: what happens when two leaders modify the same row simultaneously?

Conflict resolution strategies: last-write-wins (LWW), application-level resolution, CRDTs (conflict-free replicated data types).
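Last-write-wins is the simplest of these strategies to express in code. The sketch below assumes each version of a row carries a timestamp; note that LWW silently discards the losing write, which is exactly why it is often unacceptable for application data.

```python
def lww_merge(a, b):
    """Last-write-wins conflict resolution.

    Each version is a (value, timestamp) pair; the version with the
    newer timestamp wins. Ties are broken arbitrarily in favor of `a`.
    """
    return a if a[1] >= b[1] else b

# Two leaders modified the same row concurrently:
v1 = ("alice@old.example", 1700000000)
v2 = ("alice@new.example", 1700000123)
merged = lww_merge(v1, v2)  # the update from v1 is silently lost
```

Application-level resolution would instead surface both versions to the application (or the user) to merge; CRDTs restructure the data itself so that concurrent updates commute and no conflict arises.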

Partitioning (Sharding)

Partitioning splits data across multiple database nodes (shards). Each shard holds a subset of the data. This is the primary mechanism for scaling writes and storage beyond what a single machine can handle.

Horizontal Partitioning (Sharding)

Different rows go to different shards based on a sharding key (also called partition key). The sharding key determines which shard holds a given record.

Sharding Strategies

  • Range-Based: shard by ranges of the key (e.g., A-M on shard 1, N-Z on shard 2). Pro: range queries are efficient within a shard. Con: hotspots if certain ranges are more popular.
  • Hash-Based: hash the key and mod by the number of shards. Pro: even distribution of data. Con: range queries require scatter-gather across all shards.
  • Directory-Based: a lookup table maps keys to shards. Pro: flexible; can rebalance without rehashing. Con: the lookup table becomes a bottleneck and a single point of failure.
  • Geo-Based: shard by geographic region. Pro: data locality; lower latency for regional users. Con: cross-region queries are expensive.
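Hash-based sharding can be sketched in a few lines. One subtlety worth showing: the hash must be stable across processes, so this sketch uses `hashlib` rather than Python's built-in `hash()`, which is randomized per process.

```python
import hashlib

def shard_for(key: str, num_shards: int) -> int:
    """Map a sharding key to a shard index via a stable hash.

    Uses MD5 (any stable hash works) so the same key always maps to
    the same shard, in every process and on every machine.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

shard = shard_for("user:42", 4)  # always the same shard for this key
```

Note that the shard index depends on `num_shards`: growing the cluster from 4 to 5 shards remaps most keys, which is the rebalancing problem discussed later in this chapter.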

Choosing a Sharding Key

The sharding key is the most important decision in a sharded system. A good key:

  • Distributes data evenly across shards (avoids hotspots).
  • Aligns with common query patterns (queries should hit one shard, not all).
  • Has high cardinality (many unique values).

Common Mistake
Choosing a sharding key with low cardinality (e.g., country_code with only 200 values) limits the number of shards you can have. Similarly, using a timestamp as a sharding key sends all recent data to the same shard, creating a hotspot.

Problems with Sharding

  • Cross-shard queries: JOINs across shards are expensive or impossible. Denormalization becomes necessary.
  • Rebalancing: Adding or removing shards requires redistributing data. This is operationally complex.
  • Distributed transactions: ACID transactions across shards require two-phase commit or similar protocols, which are slow and complex.
  • Referential integrity: Foreign key constraints cannot span shards.
  • Operational complexity: More nodes means more things that can break: backups, monitoring, schema migrations.
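The cross-shard query problem above can be made concrete with a toy scatter-gather. When a query does not include the sharding key, every shard must be asked and the results merged, so cost grows with the number of shards instead of staying constant. The in-memory "shards" here are stand-ins for real database connections.

```python
def scatter_gather(shards, query_fn):
    """Run the same query against every shard and merge the results.

    Cost is O(number of shards) per query, versus O(1) when the
    sharding key is known and the query can be routed to one shard.
    """
    results = []
    for shard in shards:
        results.extend(query_fn(shard))
    return results

# Toy shards: order rows partitioned by user id.
shards = [
    [{"user": 1, "total": 10}, {"user": 5, "total": 7}],
    [{"user": 2, "total": 3}],
]

# "Find all orders over 5" has no user id, so it must hit every shard.
big_orders = scatter_gather(
    shards, lambda rows: [r for r in rows if r["total"] > 5]
)
```

Real systems must also handle per-shard failures and re-sort or re-aggregate the merged results, which is where much of the complexity hides.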

Vertical Partitioning

Instead of splitting rows, split columns. Store frequently accessed columns in one table/database and rarely accessed columns in another. Example: store user profile data in one database and user activity logs in another. This is also called functional partitioning.

Read Replicas as a Scaling Strategy

Before sharding, try read replicas. If your workload is 90% reads and 10% writes, each replica you add absorbs roughly another server's worth of read traffic, multiplying read capacity without the complexity of sharding. Sharding is only necessary when writes or storage exceed what a single primary can handle.

Key Takeaways

  • Scale vertically first. Add read replicas next. Shard only when necessary.
  • Leader-follower replication offloads reads; accept some replication lag or pay for synchronous replication.
  • The sharding key determines everything: data distribution, query patterns, and operational complexity.
  • Hash-based sharding gives even distribution; range-based sharding enables efficient range queries.
  • Sharding introduces significant complexity. Cross-shard queries, rebalancing, and distributed transactions are real challenges. Do not shard prematurely.
