Hard · Enterprise

Design Spotify

CDN · Databases · Caching · Microservices · Storage · Analytics

Problem Statement

Design the architecture for Spotify - the world's largest audio streaming platform with 615 million monthly active users and a catalog of 100 million tracks. Your design must cover:

- Audio streaming - serve audio files in multiple quality levels (low 24 kbps → very high 320 kbps → lossless 1,411 kbps). The player pre-buffers the next 30 seconds and transitions between songs seamlessly with gapless playback.
- Content ingestion - artists/labels upload tracks via Spotify for Artists. Each track is transcoded into multiple formats (OGG Vorbis, AAC, FLAC), tagged with metadata, and distributed to CDN edge nodes.
- Personalization engine - powers Discover Weekly, Release Radar, Daily Mixes, and the home feed. Uses collaborative filtering, content-based analysis (audio features via ML), and contextual signals (time of day, mood, listening history).
- Search - search across 100 M tracks, artists, albums, podcasts, and playlists with type-ahead autocomplete in < 50 ms.
- Offline mode - premium users can download playlists for offline listening. DRM (Widevine/FairPlay) protects content.
- Social features - shared playlists, friend activity feed, collaborative queue ("Group Session").
- Royalty calculation - for every stream, calculate the fractional royalty owed to artists/labels based on the user's subscription revenue and total streams. This is a massive batch compute job.

The key challenge is low-latency audio streaming with personalization at massive scale, combined with a complex royalty/payment system.
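The royalty requirement above is often glossed over in interviews, so it helps to make the arithmetic concrete. A minimal sketch of a pro-rata payout model follows; the 70% payout share and the revenue-pool definition are assumptions for illustration, not Spotify's actual terms.

```python
def prorata_royalty(revenue_pool: float, artist_streams: int,
                    total_streams: int, payout_share: float = 0.7) -> float:
    """Fractional royalty owed under a simple pro-rata model.

    payout_share (70%) and the pool definition are illustrative
    assumptions; real label deals vary per contract.
    """
    if total_streams == 0:
        return 0.0
    return revenue_pool * payout_share * (artist_streams / total_streams)

# Example: $100M monthly pool, an artist with 2M of 1.5B total streams
owed = prorata_royalty(100_000_000, 2_000_000, 1_500_000_000)  # ≈ $93,333
```

At ~1.5 B streams/day this calculation runs per (rights-holder, period) pair, which is why the problem statement frames it as a batch compute job rather than a per-stream online write.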

What You'll Learn

Design Spotify's music streaming platform - audio delivery, personalized playlists, offline mode, and social features for 600 M users. Build this architecture under realistic production constraints, then validate tradeoffs in the design lab simulation.


Constraints

  • Monthly active users: 615,000,000
  • Catalog size: 100,000,000 tracks
  • Concurrent streams (peak): ~30,000,000
  • Audio start time: < 200 ms
  • Search / autocomplete: < 50 ms
  • Storage (all formats): ~100 PB
  • Daily streams: ~1,500,000,000
  • Availability target: 99.99%
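A quick back-of-envelope pass turns these constraints into the numbers the rest of the design leans on. The peak multiplier and per-track footprint below are assumptions chosen to be consistent with the stated ~80k RPS target and ~100 PB total.

```python
SECONDS_PER_DAY = 86_400

daily_streams = 1_500_000_000
avg_start_rps = daily_streams / SECONDS_PER_DAY   # ≈ 17,400 stream starts/s
peak_start_rps = avg_start_rps * 4.6              # assumed diurnal peak factor ≈ 80k RPS

tracks = 100_000_000
bytes_per_track = 1_000_000_000                   # ~1 GB across all formats (assumption)
total_storage_pb = tracks * bytes_per_track / 1e15  # ≈ 100 PB, matching the constraint
```

Note this counts stream *starts* only; metadata, search, and social traffic layer on top, which is part of why the 2-3x safety margin below matters.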

Interview-Ready Approach

1) Clarify Scope and SLOs

  • Problem statement: Design Spotify's music streaming platform - audio delivery, personalized playlists, offline mode, and social features for 600 M users.
  • Design for a peak load target around 80,000 RPS (including burst headroom).
  • Monthly active users: 615,000,000
  • Catalog size: 100,000,000 tracks
  • Concurrent streams (peak): ~30,000,000
  • Audio start time: < 200 ms
  • Search / autocomplete: < 50 ms

2) Capacity Planning Method

  • Convert traffic and growth constraints into request rate, storage growth, and concurrency budgets.
  • Keep at least 2-3x safety margin per tier (ingress, compute, storage, async workers).
  • Reserve explicit latency budgets per hop so p95 can be defended in review.
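"Reserve explicit latency budgets per hop" can be shown concretely for the < 200 ms audio-start SLO. The per-hop allocations below are illustrative assumptions, not measured numbers; the point is that they must sum to the SLO with nothing left implicit.

```python
# Per-hop p95 budget for the < 200 ms audio-start SLO.
# Every allocation here is an assumption to be tuned against real traces.
audio_start_budget_ms = {
    "dns_and_tls": 30,
    "cdn_edge_lookup": 20,
    "playback_auth": 40,       # token check at the API tier
    "manifest_fetch": 30,
    "first_audio_chunk": 60,   # edge-to-client transfer of the opening segment
    "client_decode_start": 20,
}

assert sum(audio_start_budget_ms.values()) <= 200  # budget must fit the SLO
```

In review, each hop owner defends their slice; any hop that exceeds its slice at p95 is the bottleneck to attack first.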

3) Architecture Decisions

  • CDN: Serve static and cacheable content from edge and keep origin strictly for misses and dynamic requests.
  • Databases: Define a clear system-of-record and design read/write paths separately before adding optimizations.
  • Caching: Put cache on hot read paths first and pick cache-aside or write-through explicitly.
  • Microservices: Split services by business boundary, not by technical layer, and enforce ownership per domain.
  • Storage: Use object storage for large blobs and keep metadata/authorization separate in the API tier.
  • Analytics: Maintain separate OLTP and analytics paths; stream events into a warehouse/time-series layer.
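The "pick cache-aside or write-through explicitly" decision above can be sketched in a few lines. This is a minimal in-process cache-aside shape for track metadata, not production Redis; the loader stands in for the system-of-record read.

```python
import time

class CacheAside:
    """Minimal cache-aside read path with TTL (sketch, not production code)."""

    def __init__(self, loader, ttl_s: float = 300):
        self.loader = loader      # falls through to the system-of-record
        self.ttl_s = ttl_s
        self._store = {}          # key -> (value, expires_at)

    def get(self, key):
        hit = self._store.get(key)
        if hit and hit[1] > time.monotonic():
            return hit[0]         # cache hit: DB untouched
        value = self.loader(key)  # miss: read from the DB, then populate
        self._store[key] = (value, time.monotonic() + self.ttl_s)
        return value

    def invalidate(self, key):
        self._store.pop(key, None)  # called from the write path to bound staleness
```

The explicit `invalidate` hook is what the reliability section below calls "TTL + invalidation hooks": TTL bounds worst-case staleness, invalidation keeps the common case fresh.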

4) Reliability and Failure Strategy

  • Define cache keys and purge workflows before launch to avoid stale-content incidents and accidental global purges.
  • Use strong write constraints (transactions or conditional writes) and explicit backup/restore strategy.
  • Bound staleness with TTL + invalidation hooks for critical entities.
  • Add service-level timeout/retry budgets and contract tests.
  • Enforce lifecycle policies, retention tiers, and checksum validation.
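The "timeout/retry budgets" item deserves a concrete shape, since unbounded retries are a classic cascading-failure source. A sketch under assumed numbers (budget, base delay, attempt cap are all illustrative) follows; a real service would pair this with circuit breaking.

```python
import random
import time

def call_with_budget(fn, total_budget_s: float = 0.5,
                     base_delay_s: float = 0.05, max_attempts: int = 3):
    """Retry with exponential backoff + jitter under a hard latency budget.

    All parameters are illustrative assumptions; the key property is that
    retries never push total latency past the caller's budget.
    """
    deadline = time.monotonic() + total_budget_s
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise                              # attempts exhausted
            # exponential backoff with jitter to avoid synchronized retries
            delay = base_delay_s * (2 ** attempt) * (0.5 + random.random() / 2)
            if time.monotonic() + delay > deadline:
                raise TimeoutError("retry budget exhausted")
            time.sleep(delay)
```

Because the budget is enforced locally, a degraded dependency sheds load instead of accumulating queued retries, which is what the dependency-degradation test in the validation plan should confirm.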

5) Validation Plan

  • Run one peak-load test, one dependency-degradation test, and one failover test.
  • Verify idempotency for all retried writes and async consumers.
  • Track user-facing SLOs first: p95 latency, error rate, and successful throughput.
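"Verify idempotency for all retried writes and async consumers" matters most on the stream-count path, since duplicates there corrupt royalty math. A minimal sketch of an idempotent consumer keyed by event id follows; a real system would persist the seen-set durably (e.g. a unique constraint in the royalty ledger), not hold it in memory.

```python
class IdempotentConsumer:
    """Dedupes redelivered 'stream completed' events by event id (sketch).

    In production the dedupe state must be durable and co-committed with
    the count update; the in-memory set here only illustrates the contract.
    """

    def __init__(self):
        self._seen = set()
        self.stream_counts = {}   # track_id -> completed streams

    def handle(self, event_id: str, track_id: str) -> bool:
        if event_id in self._seen:
            return False          # duplicate delivery: no double-count
        self._seen.add(event_id)
        self.stream_counts[track_id] = self.stream_counts.get(track_id, 0) + 1
        return True
```

The validation test is then simple: replay the same event batch twice and assert counts are unchanged on the second pass.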

6) Trade-offs to Call Out in Interviews

  • CDN: Long TTL improves latency/cost; short TTL improves freshness.
  • Databases: SQL gives stronger transactional guarantees; NoSQL often gives better write scaling and flexibility.
  • Caching: Higher hit rate cuts latency/cost, but stale data and invalidation bugs become primary risks.
  • Microservices: Independent deployability improves scale but increases operational/debug complexity.
  • Storage: Object storage is cheap and durable, but random low-latency reads are weaker than databases/caches.

Practical Notes

  • Audio files are relatively small (~3-10 MB each) - CDN edge caching is extremely effective since popular tracks follow a power-law distribution.
  • Pre-buffer strategy: when a user is 80% through a song, start fetching the next track in the queue from CDN.
  • Personalization: train models offline (Spark/Hadoop), serve recommendations from a pre-computed cache (Redis) refreshed daily.
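The 80% pre-buffer rule above reduces to a one-line client-side check. A sketch (the function name and threshold default are ours, matching the note):

```python
def should_prefetch_next(position_s: float, duration_s: float,
                         threshold: float = 0.8) -> bool:
    """True once playback passes the prefetch threshold (80% per the note above)."""
    return duration_s > 0 and position_s / duration_s >= threshold
```

The client polls this on playback-progress ticks and, on the first True, starts fetching the next queued track's opening segments from the CDN, which is what makes gapless transitions possible within the < 200 ms start budget.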


Reference Solution

Why This Solution Works

Request path: The solution keeps ingress, service logic, and stateful dependencies separated so each layer can scale independently.

Reference flow: Web Clients -> DNS -> CDN Edge -> Load Balancer -> API Gateway -> Core Service -> Redis Cache -> Primary SQL DB (on cache miss)

Design strengths

  • Cache sits on the read path to absorb repeated queries and keep DB pressure stable.
  • Analytics pipeline is separated from OLTP path to avoid reporting workloads impacting transactions.

Interview defense

  • This design makes bottlenecks explicit (ingress, core compute, persistence, async workers).
  • It supports progressive scaling without re-architecting the core request path.
  • It keeps correctness-sensitive state changes in durable systems while offloading background work asynchronously.