Vertical vs. Horizontal Scaling
Vertical Scaling (Scale Up)
- Add more CPU, RAM, or storage to the existing server.
- No application changes needed.
- Has a hard ceiling: machines have maximum specs.
- Single point of failure remains.
- Can get very expensive at the high end.
Horizontal Scaling (Scale Out)
- Add more servers and distribute data across them.
- The application (or a routing layer in front of it) must be aware of multiple nodes.
- Theoretically unlimited scaling.
- Provides redundancy and fault tolerance.
- Adds complexity: data distribution, consistency, coordination.
Replication
Replication copies data from one database server (the leader/primary) to one or more replicas (followers/secondaries). It improves read throughput and provides redundancy.
Leader-Follower (Master-Slave)
- All writes go to the leader. The leader streams changes to followers.
- Reads can go to any follower (or the leader), distributing read load.
- Synchronous replication: The leader waits for the follower to acknowledge before confirming the write. Strong consistency but higher latency.
- Asynchronous replication: The leader confirms immediately; followers catch up later. Lower latency but followers may be slightly behind (replication lag).
- Failover: If the leader dies, a follower is promoted to leader (manually or via automated tooling like Patroni for PostgreSQL).
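The synchronous/asynchronous distinction above can be made concrete with a minimal in-memory sketch. The `Leader` and `Follower` classes here are hypothetical toy models, not a real database API; the point is to show why asynchronous followers serve stale reads until they catch up.

```python
# Toy model of leader-follower replication (hypothetical classes, not a real
# database API). Demonstrates replication lag under asynchronous replication.

class Follower:
    def __init__(self):
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value

class Leader:
    def __init__(self, followers):
        self.data = {}
        self.followers = followers
        self.log = []  # changes not yet shipped to followers

    def write(self, key, value, synchronous=False):
        self.data[key] = value
        if synchronous:
            # Synchronous: replicate before acknowledging the write.
            for f in self.followers:
                f.apply(key, value)
        else:
            # Asynchronous: acknowledge immediately, ship the change later.
            self.log.append((key, value))

    def flush(self):
        # Followers catch up; until this runs, follower reads are stale.
        for key, value in self.log:
            for f in self.followers:
                f.apply(key, value)
        self.log.clear()

follower = Follower()
leader = Leader([follower])
leader.write("user:1", "alice")              # async write, acknowledged at once
print(follower.data.get("user:1"))           # None -> this is replication lag
leader.flush()
print(follower.data.get("user:1"))           # alice -> follower has caught up
```

Passing `synchronous=True` trades that lag for higher write latency, exactly the trade-off described above.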
Leader-Leader (Multi-Master)
Multiple nodes accept writes. Changes replicate between all leaders. This enables writes in multiple regions but introduces conflict resolution complexity: what happens when two leaders modify the same row simultaneously?
Conflict resolution strategies: last-write-wins (LWW), application-level resolution, CRDTs (conflict-free replicated data types).
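Last-write-wins is the simplest of these strategies to sketch. The timestamps below are hypothetical Unix timestamps; real systems often use hybrid logical clocks instead, because wall clocks skew between nodes.

```python
# Sketch of last-write-wins (LWW) conflict resolution between two leaders
# that modified the same row concurrently. Timestamps are hypothetical.

def lww_merge(version_a, version_b):
    """Each version is a (timestamp, value) pair; the newer timestamp wins."""
    return version_a if version_a[0] >= version_b[0] else version_b

leader_us = (1700000005, "email=a@example.com")  # later write
leader_eu = (1700000003, "email=b@example.com")  # earlier write

print(lww_merge(leader_us, leader_eu))  # the US write wins
```

Note the cost: the EU write is silently discarded. That data loss is why application-level resolution and CRDTs exist as alternatives.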
Partitioning (Sharding)
Partitioning splits data across multiple database nodes (shards). Each shard holds a subset of the data. This is the primary mechanism for scaling writes and storage beyond what a single machine can handle.
Horizontal Partitioning (Sharding)
Different rows go to different shards based on a sharding key (also called partition key). The sharding key determines which shard holds a given record.
Sharding Strategies
| Strategy | How It Works | Pros | Cons |
|---|---|---|---|
| Range-Based | Shard by ranges of the key (e.g., A-M on shard 1, N-Z on shard 2). | Range queries are efficient within a shard. | Hotspots if certain ranges are more popular. |
| Hash-Based | Hash the key and mod by the number of shards. | Even distribution of data. | Range queries require scatter-gather across all shards. |
| Directory-Based | A lookup table maps keys to shards. | Flexible; can rebalance without rehashing. | Lookup table becomes a bottleneck and SPOF. |
| Geo-Based | Shard by geographic region. | Data locality; lower latency for regional users. | Cross-region queries are expensive. |
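The hash-based row in the table can be sketched in a few lines. This uses MD5 rather than Python's built-in `hash()`, which is randomized per process and therefore unusable for stable shard routing; the key format is a hypothetical example.

```python
# Hash-based shard routing: hash the key, then mod by the shard count.
import hashlib

NUM_SHARDS = 4

def shard_for(key: str) -> int:
    # A stable hash is required so every node routes the same key identically.
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS

print(shard_for("user:42"))  # deterministic: same key, same shard, every time
```

The scatter-gather downside follows directly: a range query like "all users A through M" has no relationship to the hash values, so it must be sent to all `NUM_SHARDS` shards.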
Choosing a Sharding Key
The sharding key is the most important decision in a sharded system. A good key:
- Distributes data evenly across shards (avoids hotspots).
- Aligns with common query patterns (queries should hit one shard, not all).
- Has high cardinality (many unique values).
Problems with Sharding
- Cross-shard queries: JOINs across shards are expensive or impossible. Denormalization becomes necessary.
- Rebalancing: Adding or removing shards requires redistributing data. This is operationally complex.
- Distributed transactions: ACID transactions across shards require two-phase commit or similar protocols, which are slow and complex.
- Referential integrity: Foreign key constraints cannot span shards.
- Operational complexity: More nodes means more things that can break: backups, monitoring, schema migrations.
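The rebalancing problem is easy to demonstrate with the naive hash-mod scheme from earlier. In this sketch over hypothetical keys, growing from 4 to 5 shards reassigns most keys, each of which would have to be physically moved between nodes.

```python
# Why rebalancing hurts with naive hash-mod sharding: changing the shard
# count remaps most keys. Keys here are hypothetical.
import hashlib

def shard_for(key: str, num_shards: int) -> int:
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % num_shards

keys = [f"user:{i}" for i in range(10_000)]
moved = sum(shard_for(k, 4) != shard_for(k, 5) for k in keys)
print(f"{moved / len(keys):.0%} of keys move")  # roughly 80% with mod-N
```

Directory-based sharding sidesteps this by remapping only the entries it chooses to move, at the cost of the lookup-table bottleneck noted in the table above.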
Vertical Partitioning
Instead of splitting rows, split columns. Store frequently accessed columns in one table/database and rarely accessed columns in another. Example: store user profile data in one database and user activity logs in another. This is also called functional partitioning.
Read Replicas as a Scaling Strategy
Before sharding, try read replicas. If your workload is 90% reads and 10% writes, offloading reads to replicas leaves the primary handling only a tenth of the traffic, so overall throughput can grow roughly tenfold without the complexity of sharding. Sharding is only necessary when writes or storage exceed what a single primary can handle.
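Using read replicas requires splitting traffic by operation type. A minimal sketch of that routing, with placeholder strings standing in for real connection objects:

```python
# Read/write splitting with round-robin over replicas. The "primary" and
# "replica-N" strings are stand-ins for real database connections.
import itertools

class ReadWriteRouter:
    def __init__(self, primary, replicas):
        self.primary = primary
        self._replicas = itertools.cycle(replicas)  # endless round-robin

    def route(self, is_write: bool):
        # All writes go to the primary; reads rotate across replicas.
        return self.primary if is_write else next(self._replicas)

router = ReadWriteRouter("primary", ["replica-1", "replica-2", "replica-3"])
print(router.route(is_write=True))   # primary
print(router.route(is_write=False))  # replica-1
print(router.route(is_write=False))  # replica-2
```

One caveat tying back to the replication section: with asynchronous replication, a read routed to a replica may not yet see the client's own last write, so read-your-own-writes traffic is sometimes pinned to the primary.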
Key Takeaways
- Scale vertically first. Add read replicas next. Shard only when necessary.
- Leader-follower replication offloads reads; accept some replication lag or pay for synchronous replication.
- The sharding key determines everything: data distribution, query patterns, and operational complexity.
- Hash-based sharding gives even distribution; range-based sharding enables efficient range queries.
- Sharding introduces significant complexity. Cross-shard queries, rebalancing, and distributed transactions are real challenges. Do not shard prematurely.