The Schema Design Workflow
Schema design is not guesswork. It follows a systematic process that starts with understanding the application's data and access patterns.
Gather Requirements
Identify the entities in the system, the relationships between them, and the access patterns (what queries will the application run?). Talk to stakeholders, read product specs, and examine existing APIs or user stories.
Define Entities & Relationships
Create an Entity-Relationship (ER) diagram. Identify each entity (user, post, order), its attributes (name, email, created_at), and relationships (a user has many posts, an order belongs to a user). Determine cardinality: one-to-one, one-to-many, many-to-many.
Normalize the Schema
Apply normalization rules (1NF through BCNF) to eliminate data redundancy and ensure data integrity. Start fully normalized; you can always denormalize later with clear justification.
Design Indexes
Analyze the query patterns and create indexes to support them. Every index speeds up reads but slows down writes. Choose indexes strategically based on actual access patterns, not speculation.
Denormalize Where Justified
If specific read queries are too slow even with indexes, consider strategic denormalization: duplicating data to avoid expensive joins. Document every denormalization decision and the trade-off it introduces.
Plan for Evolution
Schemas change over time. Design with migration in mind: use nullable columns for new fields, avoid breaking changes, and establish a migration workflow that supports zero-downtime deployments.
Normalization
Normalization is the process of organizing data to reduce redundancy and prevent update anomalies. Each normal form builds on the previous one.
| Normal Form | Rule | Eliminates | Example Violation |
|---|---|---|---|
| 1NF | All columns contain atomic (indivisible) values. No repeating groups or arrays. | Repeating groups | A phone_numbers column storing "555-1234, 555-5678" |
| 2NF | Every non-key column depends on the entire primary key (no partial dependencies). Relevant when the primary key is composite. | Partial dependencies | In (student_id, course_id) -> student_name: student_name depends only on student_id |
| 3NF | No non-key column depends on another non-key column (no transitive dependencies). | Transitive dependencies | In a table with city and zip_code: city depends on zip_code, not on the primary key |
| BCNF | Every determinant is a candidate key. Handles edge cases that 3NF misses when there are overlapping candidate keys. | Remaining anomalies | A functional dependency whose determinant is not a candidate key, e.g. a non-key attribute that determines part of a candidate key |
1NF: Atomic Values
First Normal Form requires that every column contains a single, atomic value. No arrays, no comma-separated lists, no nested structures.
-- BAD: tags column contains multiple values
CREATE TABLE posts (
id SERIAL PRIMARY KEY,
title TEXT NOT NULL,
tags TEXT -- stores "react, typescript, nextjs"
);
-- GOOD: separate junction table for tags
CREATE TABLE posts (
id SERIAL PRIMARY KEY,
title TEXT NOT NULL
);
CREATE TABLE tags (
id SERIAL PRIMARY KEY,
name TEXT UNIQUE NOT NULL
);
CREATE TABLE post_tags (
post_id INT REFERENCES posts(id),
tag_id INT REFERENCES tags(id),
PRIMARY KEY (post_id, tag_id)
);
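With tags in a junction table, "find all posts with a given tag" becomes an ordinary join instead of string parsing. A minimal sketch using Python's built-in sqlite3, with the table layout from the example above (the sample rows are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
CREATE TABLE post_tags (
    post_id INT REFERENCES posts(id),
    tag_id  INT REFERENCES tags(id),
    PRIMARY KEY (post_id, tag_id)
);
INSERT INTO posts VALUES (1, 'Intro to Hooks'), (2, 'Typed APIs');
INSERT INTO tags VALUES (1, 'react'), (2, 'typescript');
INSERT INTO post_tags VALUES (1, 1), (2, 1), (2, 2);
""")

def posts_with_tag(tag_name):
    # One join through the junction table -- no comma-splitting of a
    # denormalized tags string, and tags can be renamed in one place.
    rows = conn.execute("""
        SELECT p.title
        FROM posts p
        JOIN post_tags pt ON pt.post_id = p.id
        JOIN tags t       ON t.id = pt.tag_id
        WHERE t.name = ?
        ORDER BY p.title
    """, (tag_name,)).fetchall()
    return [title for (title,) in rows]
```

The same schema also answers the reverse question (all tags for a post) with the join running in the other direction.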
2NF: No Partial Dependencies
Second Normal Form applies to tables with composite primary keys. Every non-key column must depend on the entire composite key, not just part of it.
-- BAD: student_name depends only on student_id, not on (student_id, course_id)
CREATE TABLE enrollments (
student_id INT,
course_id INT,
student_name TEXT, -- partial dependency!
grade CHAR(1),
PRIMARY KEY (student_id, course_id)
);
-- GOOD: separate students table
CREATE TABLE students (
id INT PRIMARY KEY,
name TEXT NOT NULL
);
CREATE TABLE enrollments (
student_id INT REFERENCES students(id),
course_id INT REFERENCES courses(id),
grade CHAR(1),
PRIMARY KEY (student_id, course_id)
);
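The payoff of removing the partial dependency is that a student's name lives in exactly one row. A small sqlite3 sketch of the decomposed schema (the `courses` table is omitted for brevity, and the sample data is invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE enrollments (
    student_id INT REFERENCES students(id),
    course_id  INT,
    grade CHAR(1),
    PRIMARY KEY (student_id, course_id)
);
INSERT INTO students VALUES (1, 'Dana');
INSERT INTO enrollments VALUES (1, 101, 'A'), (1, 102, 'B');
""")

# A name change is a single UPDATE, however many courses the student
# takes -- no risk of the update anomaly the 2NF violation invites.
conn.execute("UPDATE students SET name = 'Dana Q.' WHERE id = 1")

rows = conn.execute("""
    SELECT s.name, e.course_id, e.grade
    FROM enrollments e JOIN students s ON s.id = e.student_id
    ORDER BY e.course_id
""").fetchall()
```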
3NF: No Transitive Dependencies
Third Normal Form eliminates columns that depend on other non-key columns rather than depending directly on the primary key.
-- BAD: city depends on zip_code, not on customer_id
CREATE TABLE customers (
id INT PRIMARY KEY,
name TEXT NOT NULL,
zip_code TEXT,
city TEXT -- transitive dependency: id -> zip_code -> city
);
-- GOOD: separate zip_codes table
CREATE TABLE zip_codes (
code TEXT PRIMARY KEY,
city TEXT NOT NULL
);
CREATE TABLE customers (
id INT PRIMARY KEY,
name TEXT NOT NULL,
zip_code TEXT REFERENCES zip_codes(code)
);
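With the transitive dependency removed, correcting a city name touches one row in zip_codes and every customer sees the fix through the join. A sqlite3 sketch of the decomposed schema (sample data invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE zip_codes (code TEXT PRIMARY KEY, city TEXT NOT NULL);
CREATE TABLE customers (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    zip_code TEXT REFERENCES zip_codes(code)
);
INSERT INTO zip_codes VALUES ('98101', 'Seattle');
INSERT INTO customers VALUES (1, 'Ada', '98101'), (2, 'Grace', '98101');
""")

def city_of(customer_id):
    # The city is recovered with a join instead of being copied per row.
    (city,) = conn.execute("""
        SELECT z.city FROM customers c
        JOIN zip_codes z ON z.code = c.zip_code
        WHERE c.id = ?
    """, (customer_id,)).fetchone()
    return city

# One UPDATE fixes the city for every customer in that zip code; no
# customer row can be left holding a stale spelling.
conn.execute("UPDATE zip_codes SET city = 'Seattle, WA' WHERE code = '98101'")
```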
Strategic Denormalization
Normalization optimizes for write correctness (no update anomalies). Denormalization optimizes for read performance (fewer joins). In production systems, you often need both.
When to Denormalize
- Read-heavy workloads: If a query joining 5 tables runs thousands of times per second, pre-computing the join into a denormalized table can dramatically reduce latency.
- Computed aggregates: Storing a follower_count column on the users table avoids counting rows in the followers table on every profile view.
- Caching hot paths: Denormalized data is essentially an application-level cache. Treat it with the same rigor: define invalidation strategies and acceptable staleness.
Common Denormalization Patterns
- Materialized columns: Store a computed or duplicated value directly on the row. Example: posts.comment_count updated via a trigger or application logic when comments are added or removed.
- Materialized views: A database-maintained denormalized table, rebuilt on demand. Supported by PostgreSQL (CREATE MATERIALIZED VIEW, refreshed with REFRESH MATERIALIZED VIEW). Good for dashboard queries, but each refresh has a cost.
- Precomputed join tables: A separate table storing the result of a frequently run join. Updated asynchronously via CDC, triggers, or background jobs.
- Embedded documents (NoSQL): In document databases like MongoDB, embed related data within the parent document to avoid "joins" (which MongoDB does not natively support well).
Index Design Strategy
Indexes are the single most important tool for query performance. A well-chosen index turns a full table scan (O(n)) into a B-tree lookup (O(log n)).
How B-Tree Indexes Work
The most common index type is the B-tree (or B+ tree). It maintains a sorted, balanced tree structure where:
- Leaf nodes contain pointers to actual table rows, sorted by the indexed column(s).
- Lookups, range scans, and ORDER BY queries are efficient.
- Each insert/update/delete must also update every index on the table.
Index Types
| Index Type | Best For | Not Suitable For | Example |
|---|---|---|---|
| B-Tree | Equality, range, sorting, prefix matching | Full-text search, high-cardinality arrays | CREATE INDEX idx_email ON users(email) |
| Hash | Exact equality lookups only | Range queries, sorting | CREATE INDEX idx_token ON sessions USING hash(token) |
| GIN | Full-text search, JSONB containment, array membership | Range queries on scalar values | CREATE INDEX idx_tags ON posts USING gin(tags) |
| GiST | Geometric data, range types, nearest-neighbor | Simple equality/range on scalars | CREATE INDEX idx_location ON places USING gist(point) |
| Partial | Queries that filter on a constant condition | Queries without the filter condition | CREATE INDEX idx_active ON users(email) WHERE active = true |
| Covering | Queries that can be answered entirely from the index | Queries needing many columns | CREATE INDEX idx_cover ON orders(user_id) INCLUDE (total, status) |
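Partial indexes are not Postgres-only; SQLite supports them too, which makes the idea easy to demonstrate from Python. A sketch (table and index names follow the partial-index row above; the data is invented) that uses EXPLAIN QUERY PLAN to confirm when the index is and is not usable:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, active INT);
-- Partial index: only rows with active = 1 are indexed, so the index
-- stays small even when most accounts are inactive.
CREATE INDEX idx_active ON users(email) WHERE active = 1;
INSERT INTO users VALUES (1, 'a@ex.com', 1), (2, 'b@ex.com', 0);
""")

def plan(sql):
    # EXPLAIN QUERY PLAN reports which index (if any) a query would use.
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " ".join(r[-1] for r in rows)

# The filter implies the index predicate (active = 1): index is usable.
indexed = plan("SELECT id FROM users WHERE active = 1 AND email = 'a@ex.com'")
# No active = 1 filter: the partial index cannot be used.
unindexed = plan("SELECT id FROM users WHERE email = 'b@ex.com'")
```

The same technique, checking the query plan rather than guessing, is how index choices should be validated in any engine.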
Composite Index Ordering
The column order in a composite index matters enormously. The index can only be used for queries that filter on a left prefix of the indexed columns.
-- Index on (country, city, zip_code)
CREATE INDEX idx_location ON addresses(country, city, zip_code);
-- CAN use this index (left prefixes):
SELECT * FROM addresses WHERE country = 'US';
SELECT * FROM addresses WHERE country = 'US' AND city = 'Seattle';
SELECT * FROM addresses WHERE country = 'US' AND city = 'Seattle' AND zip_code = '98101';
-- CANNOT use this index efficiently:
SELECT * FROM addresses WHERE city = 'Seattle'; -- skips country
SELECT * FROM addresses WHERE zip_code = '98101'; -- skips country and city
SELECT * FROM addresses WHERE country = 'US' AND zip_code = '98101'; -- skips city
Rule of thumb for column ordering: Place equality filters first, then range filters, then sort columns. This maximizes the portion of the index that the database can use.
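The left-prefix rule above can be verified directly with EXPLAIN QUERY PLAN. A sqlite3 sketch (an extra street column is added, an assumption of this example, so SELECT * cannot be satisfied from the index alone):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE addresses (country TEXT, city TEXT, zip_code TEXT, street TEXT);
CREATE INDEX idx_location ON addresses(country, city, zip_code);
""")

def plan(sql):
    # Join the plan rows into one string for easy inspection.
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " ".join(r[-1] for r in rows)

# Filters on a left prefix (country): the composite index is usable.
prefix_plan = plan("SELECT * FROM addresses WHERE country = 'US'")
# Skips the leading column: the planner falls back to a full table scan.
skip_plan = plan("SELECT * FROM addresses WHERE city = 'Seattle'")
```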
Schema Design for NoSQL
NoSQL databases require a fundamentally different approach to schema design. Instead of normalizing and then querying with flexible joins, you design the schema around your access patterns.
DynamoDB: Single-Table Design
DynamoDB is a key-value and document store that supports only primary key lookups and secondary index queries. There are no joins. The single-table design pattern stores multiple entity types in the same table:
// All entities in one table, distinguished by key structure
// PK = Partition Key, SK = Sort Key
// User entity
{ PK: "USER#alice", SK: "PROFILE", name: "Alice", email: "alice@ex.com" }
// User's posts (sorted by timestamp)
{ PK: "USER#alice", SK: "POST#2024-01-15", title: "Hello World", body: "..." }
{ PK: "USER#alice", SK: "POST#2024-01-20", title: "Second Post", body: "..." }
// User's followers
{ PK: "USER#alice", SK: "FOLLOWER#bob", followed_at: "2024-01-10" }
{ PK: "USER#alice", SK: "FOLLOWER#carol", followed_at: "2024-01-12" }
// Access patterns supported by primary key:
// - Get user profile: PK = "USER#alice", SK = "PROFILE"
// - List user's posts: PK = "USER#alice", SK begins_with "POST#"
// - List user's followers: PK = "USER#alice", SK begins_with "FOLLOWER#"
// - Get specific follower: PK = "USER#alice", SK = "FOLLOWER#bob"
Key principles of DynamoDB schema design:
- Start with access patterns: List every query the application needs before designing the schema. The schema is shaped by queries, not by entity relationships.
- Overload keys: Use prefixed strings (USER#, POST#, FOLLOWER#) to store multiple entity types with the same partition key.
- Use GSIs for alternate access patterns: Global Secondary Indexes project data into a different key structure. Example: a GSI with PK = email lets you look up users by email.
- Accept duplication: Since there are no joins, data that is needed together must be stored together, even if it means duplicating it across items.
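The key-overloading and begins_with patterns above are mechanical enough to model in a few lines. A toy in-memory sketch, not the AWS SDK (real code would use boto3's Table.query with key condition expressions); it only illustrates why sorted sort keys make prefix queries cheap within one partition:

```python
# Toy model of a DynamoDB-style single table: items live under a
# partition key (PK) and are kept sorted by sort key (SK), so a
# begins_with query is a prefix filter over one partition.
from collections import defaultdict

table = defaultdict(dict)  # PK -> {SK: item}

def put_item(item):
    table[item["PK"]][item["SK"]] = item

def query(pk, sk_begins_with=""):
    # Return the partition's items in sort-key order, filtered by prefix.
    partition = table.get(pk, {})
    return [partition[sk] for sk in sorted(partition) if sk.startswith(sk_begins_with)]

put_item({"PK": "USER#alice", "SK": "PROFILE", "name": "Alice"})
put_item({"PK": "USER#alice", "SK": "POST#2024-01-15", "title": "Hello World"})
put_item({"PK": "USER#alice", "SK": "POST#2024-01-20", "title": "Second Post"})
put_item({"PK": "USER#alice", "SK": "FOLLOWER#bob", "followed_at": "2024-01-10"})

posts = query("USER#alice", "POST#")          # alice's posts, oldest first
followers = query("USER#alice", "FOLLOWER#")  # alice's followers
```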
MongoDB: Embedding vs. Referencing
MongoDB stores data as JSON-like documents. The key design decision is whether to embed related data within a document or reference it in a separate collection.
Embedding (Denormalized)
- Store related data directly within the parent document.
- Single read retrieves all related data (no joins).
- Best for: one-to-few relationships, data accessed together, data that does not change independently.
- Risk: documents can grow large (16MB limit). Updates to embedded data require updating the parent document.
Referencing (Normalized)
- Store a reference (ID) to a document in another collection.
- Requires a second query or $lookup (an aggregation-pipeline join) to retrieve related data.
- Best for: one-to-many or many-to-many relationships, data that changes independently, large or unbounded arrays.
- Risk: multiple queries or expensive $lookup operations. No foreign key enforcement.
// EMBEDDED: comments stored inside the post document
{
_id: ObjectId("..."),
title: "Schema Design Guide",
author: "alice",
comments: [
{ user: "bob", text: "Great post!", date: "2024-01-15" },
{ user: "carol", text: "Very helpful", date: "2024-01-16" }
]
}
// REFERENCED: comments in a separate collection
// posts collection
{ _id: ObjectId("post1"), title: "Schema Design Guide", author: "alice" }
// comments collection
{ _id: ObjectId("c1"), post_id: ObjectId("post1"), user: "bob", text: "Great post!" }
{ _id: ObjectId("c2"), post_id: ObjectId("post1"), user: "carol", text: "Very helpful" }
Schema Migrations & Zero-Downtime Deployment
Production schemas must evolve as the application changes. The challenge is making schema changes without taking the application offline.
Safe Migration Patterns
- Add Column (safe): Adding a nullable column or a column with a default value is almost always safe. The existing application code ignores the new column until it is deployed to use it.
- Rename Column (unsafe): A rename breaks all existing queries referencing the old name. Instead, use the expand-contract pattern: add the new column, deploy code that writes to both, backfill old data, deploy code that reads from the new column, then drop the old column.
- Drop Column (unsafe): Dropping a column breaks any code still referencing it. First deploy code that no longer reads the column. Then drop the column in a later migration after confirming no queries reference it.
- Add NOT NULL (unsafe): Adding a NOT NULL constraint to an existing column fails if any rows have NULL values. First backfill all NULLs with a default value, then add the constraint.
- Add Index (mostly safe): In PostgreSQL, use CREATE INDEX CONCURRENTLY to build the index without locking the table. Standard CREATE INDEX takes a lock that blocks writes to the table and can stall production traffic.
The Expand-Contract Pattern
This is the safest approach for breaking schema changes. It separates the migration into phases:
Expand
Add the new column/table alongside the old one. Deploy application code that writes to both old and new locations. This is backward-compatible: old code still works.
Migrate Data
Backfill existing data from the old column to the new one. Run this as a background job, not a blocking migration.
Switch Reads
Deploy application code that reads from the new column instead of the old one. The old column is still being written to as a safety net.
Contract
Once you have confirmed the new column is correct and all reads use it, stop writing to the old column. After a grace period, drop the old column.
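The four phases can be rehearsed end to end against a scratch database before touching production. A sqlite3 sketch that "renames" a hypothetical full_name column to display_name (the column names and data are invented for this example):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, full_name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'Alice'), (2, 'Bob')")

# Phase 1 -- Expand: add the new column. Old code keeps working.
conn.execute("ALTER TABLE users ADD COLUMN display_name TEXT")

def create_user(uid, name):
    # Dual-write during the transition: both columns receive the value.
    conn.execute("INSERT INTO users (id, full_name, display_name) VALUES (?, ?, ?)",
                 (uid, name, name))

create_user(3, "Carol")

# Phase 2 -- Migrate: backfill rows written before the dual-write deploy.
conn.execute("UPDATE users SET display_name = full_name WHERE display_name IS NULL")

# Phase 3 -- Switch reads: new code reads only the new column.
def get_name(uid):
    (name,) = conn.execute("SELECT display_name FROM users WHERE id = ?",
                           (uid,)).fetchone()
    return name

# Phase 4 -- Contract: after a grace period, stop the dual-write and drop
# the old column (ALTER TABLE users DROP COLUMN full_name; SQLite 3.35+).
```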
Worked Example: Social Media Schema
Let us design the schema for a social media platform step by step, applying everything we have learned.
Requirements
- Users can create profiles, write posts, and comment on posts.
- Users can follow other users.
- The home feed shows posts from followed users, newest first.
- Profile pages show a user's posts and follower/following counts.
Normalized Schema (3NF)
CREATE TABLE users (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
username TEXT UNIQUE NOT NULL,
email TEXT UNIQUE NOT NULL,
display_name TEXT NOT NULL,
bio TEXT,
avatar_url TEXT,
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE TABLE posts (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
author_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
body TEXT NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE TABLE comments (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
post_id UUID NOT NULL REFERENCES posts(id) ON DELETE CASCADE,
author_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
body TEXT NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE TABLE follows (
follower_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
following_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
PRIMARY KEY (follower_id, following_id),
CHECK (follower_id != following_id) -- prevent self-follows
);
-- Indexes driven by access patterns:
-- Feed query: posts from followed users, newest first
CREATE INDEX idx_posts_author_created ON posts(author_id, created_at DESC);
-- Comments for a post, newest first
CREATE INDEX idx_comments_post_created ON comments(post_id, created_at DESC);
-- Who does a user follow? Who follows a user?
CREATE INDEX idx_follows_following ON follows(following_id);
-- (follower_id is already covered by the composite primary key)
Feed Query
-- Get the 20 most recent posts from users that :current_user follows
SELECT p.id, p.body, p.created_at, u.username, u.avatar_url
FROM posts p
JOIN follows f ON f.following_id = p.author_id
JOIN users u ON u.id = p.author_id
WHERE f.follower_id = :current_user_id
ORDER BY p.created_at DESC
LIMIT 20;
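The query can be exercised on a toy dataset to confirm it returns only followed authors' posts, newest first. A sqlite3 sketch with a trimmed version of the schema (the sample users and posts are invented; timestamps are ISO strings so text ordering matches time ordering):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users   (id TEXT PRIMARY KEY, username TEXT, avatar_url TEXT);
CREATE TABLE posts   (id TEXT PRIMARY KEY, author_id TEXT, body TEXT, created_at TEXT);
CREATE TABLE follows (follower_id TEXT, following_id TEXT,
                      PRIMARY KEY (follower_id, following_id));
INSERT INTO users VALUES ('u1','alice',NULL), ('u2','bob',NULL), ('u3','carol',NULL);
INSERT INTO follows VALUES ('u1','u2'), ('u1','u3');
INSERT INTO posts VALUES
  ('p1','u2','hi from bob','2024-01-10'),
  ('p2','u3','hi from carol','2024-01-12'),
  ('p3','u1','note to self','2024-01-13');
""")

def feed(current_user_id, limit=20):
    # Same shape as the SQL above: follows filters to followed authors,
    # users supplies display fields, newest posts come first.
    return conn.execute("""
        SELECT u.username, p.body
        FROM posts p
        JOIN follows f ON f.following_id = p.author_id
        JOIN users u   ON u.id = p.author_id
        WHERE f.follower_id = ?
        ORDER BY p.created_at DESC
        LIMIT ?
    """, (current_user_id, limit)).fetchall()
```

Note that alice's own post does not appear in her feed; including one's own posts would need an extra UNION or OR branch.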
Strategic Denormalization
The feed query above joins three tables on every feed load, and the profile page would need two COUNT(*) queries against follows to display follower/following counts. For a high-traffic application, we might denormalize the counts:
-- Add cached counts to the users table
ALTER TABLE users ADD COLUMN follower_count INT NOT NULL DEFAULT 0;
ALTER TABLE users ADD COLUMN following_count INT NOT NULL DEFAULT 0;
-- Update counts when follows change (via trigger or application code)
CREATE OR REPLACE FUNCTION update_follow_counts() RETURNS TRIGGER AS $$
BEGIN
IF TG_OP = 'INSERT' THEN
UPDATE users SET following_count = following_count + 1 WHERE id = NEW.follower_id;
UPDATE users SET follower_count = follower_count + 1 WHERE id = NEW.following_id;
ELSIF TG_OP = 'DELETE' THEN
UPDATE users SET following_count = following_count - 1 WHERE id = OLD.follower_id;
UPDATE users SET follower_count = follower_count - 1 WHERE id = OLD.following_id;
END IF;
RETURN NULL;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER trg_follow_counts
AFTER INSERT OR DELETE ON follows
FOR EACH ROW EXECUTE FUNCTION update_follow_counts();
This denormalization means profile pages can display follower/following counts with a single row lookup instead of a COUNT(*) query on the follows table. The trade-off: the counts may be momentarily inconsistent during concurrent follow/unfollow operations, but for a social media application, this is acceptable.
Feed at Scale: Fan-Out Strategies
For users with millions of followers (celebrities), the feed query above becomes impractical. Two strategies exist:
- Fan-out on write (push model): When a user publishes a post, write a copy of the post ID to every follower's feed inbox. Reading the feed is a simple lookup. This works well for users with few followers but is expensive for celebrities.
- Fan-out on read (pull model): When a user opens their feed, query posts from all followed users in real time. This works well for celebrities but is expensive for users who follow many accounts.
- Hybrid: Use fan-out on write for normal users and fan-out on read for celebrities (users with more than N followers). Twitter/X uses this hybrid approach.
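The hybrid decision logic is small enough to sketch. A toy Python model (the threshold, store shapes, and names are assumptions of this sketch; a real system would use a cache and a queue, not in-process dicts):

```python
# Hybrid fan-out: push posts to follower inboxes for normal users,
# pull celebrity posts at read time, and merge the two at the end.
from collections import defaultdict

CELEB_THRESHOLD = 2              # "celebrity" = more than N followers (N assumed)
followers = defaultdict(set)     # author -> set of follower ids
inboxes = defaultdict(list)      # user -> post ids pushed at write time
celeb_posts = defaultdict(list)  # celebrity author -> their own post ids

def follow(follower, author):
    followers[author].add(follower)

def publish(author, post_id):
    if len(followers[author]) > CELEB_THRESHOLD:
        # Pull model: too many followers to push to; store the post once.
        celeb_posts[author].append(post_id)
    else:
        # Push model: fan out a copy to every follower's inbox.
        for f in followers[author]:
            inboxes[f].append(post_id)

def read_feed(user, followed_authors):
    # Merge the precomputed inbox with celebrity posts fetched at read time.
    pulled = [p for a in followed_authors
              if len(followers[a]) > CELEB_THRESHOLD
              for p in celeb_posts[a]]
    return inboxes[user] + pulled
```

A production version would also merge by timestamp and paginate; the sketch only shows where the write-time and read-time work is spent.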
Key Takeaways
- Follow a systematic workflow: requirements, entities, normalization, indexes, strategic denormalization, migration planning.
- Normalize to 3NF by default. It eliminates redundancy and prevents update anomalies. BCNF is rarely necessary for web applications.
- Denormalize only with clear justification. Document what is duplicated, how it stays in sync, and what the measured performance gain is.
- Index design follows access patterns. Use composite indexes with equality columns first, then range columns, then sort columns. Use partial and covering indexes for specialized queries.
- NoSQL schema design is access-pattern-driven. DynamoDB uses single-table design with overloaded keys. MongoDB uses embedding for tightly coupled data and referencing for loosely coupled data.
- Use the expand-contract pattern for zero-downtime migrations. Never rename or drop a column in a single step in production.
- Schema design is iterative. Start simple, measure real query performance, and optimize based on data, not assumptions.