Vertical vs. Horizontal Scaling
Vertical Scaling (Scale Up)
- Add more CPU, RAM, or storage to the existing server.
- No application changes needed.
- Has a hard ceiling: machines have maximum specs.
- Single point of failure remains.
- Can get very expensive at the high end.
Horizontal Scaling (Scale Out)
- Add more servers and distribute data across them.
- The application (or a routing layer in front of it) must be aware of multiple nodes.
- Theoretically unlimited scaling.
- Provides redundancy and fault tolerance.
- Adds complexity: data distribution, consistency, coordination.
Replication
Replication copies data from one database server (the leader/primary) to one or more replicas (followers/secondaries). It improves read throughput and provides redundancy.
Leader-Follower (Master-Slave)
- All writes go to the leader. The leader streams changes to followers.
- Reads can go to any follower (or the leader), distributing read load.
- Synchronous replication: The leader waits for the follower to acknowledge before confirming the write. Strong consistency but higher latency.
- Asynchronous replication: The leader confirms immediately; followers catch up later. Lower latency but followers may be slightly behind (replication lag).
- Failover: If the leader dies, a follower is promoted to leader (manually or via automated tooling like Patroni for PostgreSQL).
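The synchronous/asynchronous distinction above can be made concrete with a minimal in-memory sketch. The `Leader` and `Follower` classes here are hypothetical toy models, not a real database API; the point is to show why asynchronous followers serve stale reads until they catch up.

```python
# Toy model of leader-follower replication (hypothetical classes, not a real
# database API). Demonstrates replication lag under asynchronous replication.

class Follower:
    def __init__(self):
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value

class Leader:
    def __init__(self, followers):
        self.data = {}
        self.followers = followers
        self.log = []  # changes not yet shipped to followers

    def write(self, key, value, synchronous=False):
        self.data[key] = value
        if synchronous:
            # Synchronous: replicate before acknowledging the write.
            for f in self.followers:
                f.apply(key, value)
        else:
            # Asynchronous: acknowledge immediately, ship the change later.
            self.log.append((key, value))

    def flush(self):
        # Followers catch up; until this runs, follower reads are stale.
        for key, value in self.log:
            for f in self.followers:
                f.apply(key, value)
        self.log.clear()

follower = Follower()
leader = Leader([follower])
leader.write("user:1", "alice")              # async write, acknowledged at once
print(follower.data.get("user:1"))           # None -> this is replication lag
leader.flush()
print(follower.data.get("user:1"))           # alice -> follower has caught up
```

Passing `synchronous=True` trades that lag for higher write latency, exactly the trade-off described above.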
Leader-Leader (Multi-Master)
Multiple nodes accept writes. Changes replicate between all leaders. This enables writes in multiple regions but introduces conflict resolution complexity: what happens when two leaders modify the same row simultaneously?
Conflict resolution strategies: last-write-wins (LWW), application-level resolution, CRDTs (conflict-free replicated data types).
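Last-write-wins is the simplest of these strategies to sketch. The timestamps below are hypothetical Unix timestamps; real systems often use hybrid logical clocks instead, because wall clocks skew between nodes.

```python
# Sketch of last-write-wins (LWW) conflict resolution between two leaders
# that modified the same row concurrently. Timestamps are hypothetical.

def lww_merge(version_a, version_b):
    """Each version is a (timestamp, value) pair; the newer timestamp wins."""
    return version_a if version_a[0] >= version_b[0] else version_b

leader_us = (1700000005, "email=a@example.com")  # later write
leader_eu = (1700000003, "email=b@example.com")  # earlier write

print(lww_merge(leader_us, leader_eu))  # the US write wins
```

Note the cost: the EU write is silently discarded. That data loss is why application-level resolution and CRDTs exist as alternatives.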
Partitioning (Sharding)
Partitioning splits data across multiple database nodes (shards). Each shard holds a subset of the data. This is the primary mechanism for scaling writes and storage beyond what a single machine can handle.
Horizontal Partitioning (Sharding)
Different rows go to different shards based on a sharding key (also called partition key). The sharding key determines which shard holds a given record.
Sharding Strategies
| Strategy | How It Works | Pros | Cons |
|---|---|---|---|
| Range-Based | Shard by ranges of the key (e.g., A-M on shard 1, N-Z on shard 2). | Range queries are efficient within a shard. | Hotspots if certain ranges are more popular. |
| Hash-Based | Hash the key and mod by the number of shards. | Even distribution of data. | Range queries require scatter-gather across all shards. |
| Directory-Based | A lookup table maps keys to shards. | Flexible; can rebalance without rehashing. | Lookup table becomes a bottleneck and SPOF. |
| Geo-Based | Shard by geographic region. | Data locality; lower latency for regional users. | Cross-region queries are expensive. |
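The hash-based row in the table can be sketched in a few lines. This uses MD5 rather than Python's built-in `hash()`, which is randomized per process and therefore unusable for stable shard routing; the key format is a hypothetical example.

```python
# Hash-based shard routing: hash the key, then mod by the shard count.
import hashlib

NUM_SHARDS = 4

def shard_for(key: str) -> int:
    # A stable hash is required so every node routes the same key identically.
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS

print(shard_for("user:42"))  # deterministic: same key, same shard, every time
```

The scatter-gather downside follows directly: a range query like "all users A through M" has no relationship to the hash values, so it must be sent to all `NUM_SHARDS` shards.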
Choosing a Sharding Key
The sharding key is the most important decision in a sharded system. A good key:
- Distributes data evenly across shards (avoids hotspots).
- Aligns with common query patterns (queries should hit one shard, not all).
- Has high cardinality (many unique values).
Problems with Sharding
- Cross-shard queries: JOINs across shards are expensive or impossible. Denormalization becomes necessary.
- Rebalancing: Adding or removing shards requires redistributing data. This is operationally complex.
- Distributed transactions: ACID transactions across shards require two-phase commit or similar protocols, which are slow and complex.
- Referential integrity: Foreign key constraints cannot span shards.
- Operational complexity: More nodes means more things that can break: backups, monitoring, schema migrations.
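The rebalancing problem is easy to demonstrate with the naive hash-mod scheme from earlier. In this sketch over hypothetical keys, growing from 4 to 5 shards reassigns most keys, each of which would have to be physically moved between nodes.

```python
# Why rebalancing hurts with naive hash-mod sharding: changing the shard
# count remaps most keys. Keys here are hypothetical.
import hashlib

def shard_for(key: str, num_shards: int) -> int:
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % num_shards

keys = [f"user:{i}" for i in range(10_000)]
moved = sum(shard_for(k, 4) != shard_for(k, 5) for k in keys)
print(f"{moved / len(keys):.0%} of keys move")  # roughly 80% with mod-N
```

Directory-based sharding sidesteps this by remapping only the entries it chooses to move, at the cost of the lookup-table bottleneck noted in the table above.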
Vertical Partitioning
Instead of splitting rows, split columns. Store frequently accessed columns in one table/database and rarely accessed columns in another. Example: store user profile data in one database and user activity logs in another. This is also called functional partitioning.
Read Replicas as a Scaling Strategy
Before sharding, try read replicas. If your workload is 90% reads and 10% writes, offloading reads to replicas leaves the primary handling only a tenth of the traffic, so overall throughput can grow roughly tenfold without the complexity of sharding. Sharding is only necessary when writes or storage exceed what a single primary can handle.
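Using read replicas requires splitting traffic by operation type. A minimal sketch of that routing, with placeholder strings standing in for real connection objects:

```python
# Read/write splitting with round-robin over replicas. The "primary" and
# "replica-N" strings are stand-ins for real database connections.
import itertools

class ReadWriteRouter:
    def __init__(self, primary, replicas):
        self.primary = primary
        self._replicas = itertools.cycle(replicas)  # endless round-robin

    def route(self, is_write: bool):
        # All writes go to the primary; reads rotate across replicas.
        return self.primary if is_write else next(self._replicas)

router = ReadWriteRouter("primary", ["replica-1", "replica-2", "replica-3"])
print(router.route(is_write=True))   # primary
print(router.route(is_write=False))  # replica-1
print(router.route(is_write=False))  # replica-2
```

One caveat tying back to the replication section: with asynchronous replication, a read routed to a replica may not yet see the client's own last write, so read-your-own-writes traffic is sometimes pinned to the primary.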
Key Takeaways
- Scale vertically first. Add read replicas next. Shard only when necessary.
- Leader-follower replication offloads reads; accept some replication lag or pay for synchronous replication.
- The sharding key determines everything: data distribution, query patterns, and operational complexity.
- Hash-based sharding gives even distribution; range-based sharding enables efficient range queries.
- Sharding introduces significant complexity. Cross-shard queries, rebalancing, and distributed transactions are real challenges. Do not shard prematurely.