Video Streaming · Part 1 (Hard)

Video Streaming - On-Demand Platform

CDN · Media Processing · Databases · Caching · Storage · Microservices

Problem Statement

StreamVault is building a subscription-based video streaming platform. Think Netflix: a massive library of movies and TV shows, personalized recommendations, and smooth playback worldwide. Key components:

- Video ingestion pipeline - content partners upload raw video files (up to 100 GB each). The system must transcode each video into 10+ resolution/bitrate variants (360p to 4K), generate thumbnails, extract subtitles, and store everything durably.
- Adaptive streaming - the player dynamically switches between quality levels based on the user's bandwidth (HLS / DASH).
- Content delivery - serve video segments from edge servers close to the user. Buffer ratio target: < 0.5% (fewer than 1 in 200 play-minutes should experience buffering).
- Recommendation engine - personalized "For You" row powered by viewing history, ratings, and collaborative filtering.
- Concurrent viewers - handle massive spikes when a popular show drops (e.g., 10 million concurrent streams for a hit series premiere).
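The adaptive-streaming behavior above can be sketched as a simple bitrate-selection rule. This is a minimal illustration with a hypothetical bitrate ladder; real HLS/DASH players use smarter throughput estimators and buffer-based heuristics.

```python
# Minimal sketch of an adaptive-bitrate (ABR) selection rule.
# The (height, bitrate_kbps) ladder values below are illustrative assumptions.
LADDER = [
    (360, 800),
    (480, 1400),
    (720, 2800),
    (1080, 5000),
    (2160, 16000),  # 4K
]

def pick_variant(measured_kbps: float, safety: float = 0.8):
    """Pick the highest variant whose bitrate fits within a safety
    fraction of the measured bandwidth; fall back to the lowest rung."""
    budget = measured_kbps * safety
    best = LADDER[0]
    for height, kbps in LADDER:
        if kbps <= budget:
            best = (height, kbps)
    return best

print(pick_variant(4000))   # 4000 * 0.8 = 3200 kbps budget -> (720, 2800)
```

The safety fraction leaves headroom so the player is not constantly oscillating between rungs when bandwidth fluctuates.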

This is one of the most infrastructure-intensive system design problems, combining massive storage, heavy compute (transcoding), global distribution, and real-time streaming.

What You'll Learn

Design a Netflix-like video streaming platform serving 100 M users worldwide. Build this architecture under realistic production constraints, then validate tradeoffs in the design lab simulation.


Constraints

  • Registered users: 100,000,000
  • Peak concurrent streams: ~10,000,000
  • Content library: ~50,000 titles
  • New uploads/day: ~200 videos
  • Video sizes (raw): Up to 100 GB
  • Transcoding variants: 10+ per video
  • Buffer ratio: < 0.5%
  • Global regions: 10+
  • Availability target: 99.99%
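These constraints translate directly into back-of-envelope numbers. The average stream bitrate (5 Mbps) below is an assumption, not part of the problem statement:

```python
# Back-of-envelope capacity math from the constraints above.
uploads_per_day = 200
raw_size_gb = 100                      # worst case per title
concurrent_streams = 10_000_000
avg_stream_mbps = 5                    # assumed average across the bitrate ladder

raw_ingest_tb_day = uploads_per_day * raw_size_gb / 1000
peak_egress_tbps = concurrent_streams * avg_stream_mbps / 1_000_000

print(f"raw ingest:  {raw_ingest_tb_day:.0f} TB/day")    # 20 TB/day
print(f"peak egress: {peak_egress_tbps:.0f} Tbps")       # 50 Tbps
```

The 50 Tbps peak egress is the headline number: it is far beyond what any origin can serve, which is why virtually all segment traffic must terminate at CDN edge caches.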

Interview-Ready Approach

1) Clarify Scope and SLOs

  • Problem statement: Design a Netflix-like video streaming platform serving 100 M users worldwide.
  • Design for a peak load target around 80,000 RPS (including burst headroom).
  • Registered users: 100,000,000
  • Peak concurrent streams: ~10,000,000
  • Content library: ~50,000 titles
  • New uploads/day: ~200 videos
  • Video sizes (raw): Up to 100 GB

2) Capacity Planning Method

  • Convert traffic and growth constraints into request rate, storage growth, and concurrency budgets.
  • Keep at least 2-3x safety margin per tier (ingress, compute, storage, async workers).
  • Reserve explicit latency budgets per hop so p95 can be defended in review.
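The method above can be made concrete as a small provisioning calculation. The per-instance capacities here are illustrative assumptions; the point is the shape of the math (peak × margin ÷ per-instance capacity):

```python
# Sketch: turn the 80,000 RPS peak estimate into per-tier instance budgets
# with the 3x safety margin. Per-instance capacities are assumed values.
import math

peak_rps = 80_000
margin = 3

tiers = {                       # sustainable RPS per instance (assumed)
    "api_gateway": 5_000,
    "core_service": 2_000,
    "db_replicas": 10_000,      # read QPS per replica
}

budget = {name: math.ceil(peak_rps * margin / cap) for name, cap in tiers.items()}
print(budget)   # {'api_gateway': 48, 'core_service': 120, 'db_replicas': 24}
```

Repeating this per tier exposes the bottleneck tier early, before load testing does.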

3) Architecture Decisions

  • CDN: Serve static and cacheable content from edge and keep origin strictly for misses and dynamic requests.
  • Media Processing: Split ingest, transform, and delivery into independent stages with async orchestration.
  • Databases: Define a clear system-of-record and design read/write paths separately before adding optimizations.
  • Caching: Put cache on hot read paths first and pick cache-aside or write-through explicitly.
  • Storage: Use object storage for large blobs and keep metadata/authorization separate in the API tier.
  • Microservices: Split services by business boundary, not by technical layer, and enforce ownership per domain.
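The ingest/transform split with async orchestration can be sketched with an in-process queue standing in for a real broker (SQS, Kafka); the variant names and job shape are illustrative:

```python
# Minimal sketch of splitting ingest and transform into independent stages.
# An in-process queue stands in for a durable message broker.
import queue

jobs: "queue.Queue[dict]" = queue.Queue()

def ingest(video_id: str, variants: list[str]) -> None:
    """Ingest acknowledges the upload immediately and fans out one
    transcode job per variant instead of transcoding inline."""
    for v in variants:
        jobs.put({"video_id": video_id, "variant": v})

def drain() -> list[str]:
    """Worker-loop stand-in: process every queued job."""
    done = []
    while not jobs.empty():
        job = jobs.get()
        done.append(f'{job["video_id"]}@{job["variant"]}')
    return done

ingest("tt-001", ["360p", "720p", "1080p"])
print(drain())   # ['tt-001@360p', 'tt-001@720p', 'tt-001@1080p']
```

Because the stages communicate only through the queue, transcoding workers can be scaled (or fail and retry) without touching the upload path.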

4) Reliability and Failure Strategy

  • Define cache keys and purge workflows before launch to avoid stale/global outages.
  • Store original media durably and make transforms replayable.
  • Use strong write constraints (transactions or conditional writes) and explicit backup/restore strategy.
  • Bound staleness with TTL + invalidation hooks for critical entities.
  • Enforce lifecycle policies, retention tiers, and checksum validation.
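The "TTL + invalidation hooks" point can be sketched with a dict standing in for Redis; timestamps are injected so the behavior is deterministic:

```python
# Sketch of bounding staleness with TTL plus an explicit invalidation hook.
# A dict stands in for Redis; 'now' is injected for deterministic behavior.
import time

class TTLCache:
    def __init__(self, ttl_s: float):
        self.ttl = ttl_s
        self.store: dict[str, tuple[float, object]] = {}

    def get(self, key, now=None):
        now = time.monotonic() if now is None else now
        hit = self.store.get(key)
        if hit and now - hit[0] < self.ttl:
            return hit[1]
        return None                    # expired or missing -> caller reloads

    def put(self, key, value, now=None):
        now = time.monotonic() if now is None else now
        self.store[key] = (now, value)

    def invalidate(self, key):
        """Hook called on writes to the entity, cutting staleness to ~0."""
        self.store.pop(key, None)

cache = TTLCache(ttl_s=60)
cache.put("title:42", {"name": "Pilot"}, now=0)
assert cache.get("title:42", now=30) is not None   # still fresh
assert cache.get("title:42", now=90) is None       # TTL expired
```

TTL alone bounds worst-case staleness; the invalidation hook handles the common case where the write path knows exactly which entity changed.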

5) Validation Plan

  • Run one peak-load test, one dependency-degradation test, and one failover test.
  • Verify idempotency for all retried writes and async consumers.
  • Track user-facing SLOs first: p95 latency, error rate, and successful throughput.
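The idempotency check above is the one most often hand-waved in interviews, so it is worth sketching. The dedupe table here is an in-memory dict standing in for a durable store with a unique-key constraint:

```python
# Sketch of idempotent write handling for retried requests.
# 'processed' stands in for a durable table keyed by idempotency key.
processed: dict[str, str] = {}   # idempotency_key -> result

def charge(idempotency_key: str, amount_cents: int) -> str:
    """Apply the write once; replays with the same key return the
    original result instead of repeating the side effect."""
    if idempotency_key in processed:
        return processed[idempotency_key]
    result = f"charged:{amount_cents}"   # the real side effect happens here
    processed[idempotency_key] = result
    return result

first = charge("req-123", 999)
retry = charge("req-123", 999)           # network retry of the same request
assert first == retry == "charged:999"
assert len(processed) == 1               # side effect applied exactly once
```

In production the lookup-and-insert must be atomic (a conditional write or unique index), otherwise two concurrent retries can both pass the check.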

6) Trade-offs to Call Out in Interviews

  • CDN: Long TTL improves latency/cost; short TTL improves freshness.
  • Media Processing: Pre-processing improves playback UX, but requires substantial compute/storage budget.
  • Databases: SQL gives stronger transactional guarantees; NoSQL often gives better write scaling and flexibility.
  • Caching: Higher hit rate cuts latency/cost, but stale data and invalidation bugs become primary risks.
  • Storage: Object storage is cheap and durable, but random low-latency reads are weaker than databases/caches.

Practical Notes

  • Separate hot storage (popular content on SSDs at edge) from cold storage (long-tail content on S3).
  • Predictive pre-positioning: push a new season's files to edge caches hours before the premiere.
  • The transcoding pipeline is embarrassingly parallel - spin up GPU workers per video segment.
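The segment-level parallelism can be sketched with a worker pool; `transcode_segment` is a placeholder for the real GPU-backed encoder invocation:

```python
# Sketch of segment-level parallel transcoding with a worker pool.
# transcode_segment is a placeholder for a real ffmpeg/NVENC invocation.
from concurrent.futures import ThreadPoolExecutor

def transcode_segment(seg_id: int) -> str:
    # Placeholder: a real worker would encode one segment on a GPU.
    return f"seg-{seg_id}.ts"

def transcode_video(num_segments: int, workers: int = 8) -> list[str]:
    """Each segment is independent, so fan out across workers;
    pool.map preserves order for playlist assembly."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(transcode_segment, range(num_segments)))

print(transcode_video(4))   # ['seg-0.ts', 'seg-1.ts', 'seg-2.ts', 'seg-3.ts']
```

Because segments are independent, a 100 GB upload does not transcode any slower per-segment than a short clip; wall-clock time scales down with worker count.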


Reference Solution

Why This Solution Works

Request path: The solution keeps ingress, service logic, and stateful dependencies separated so each layer can scale independently.

Reference flow: Web Clients -> DNS -> CDN Edge -> Load Balancer -> API Gateway -> Core Service -> Primary SQL DB -> Redis Cache

Design strengths

  • Cache sits on the read path to absorb repeated queries and keep DB pressure stable.
  • Media processing is handled by background workers so user-facing latency stays low.
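The read-path behavior described above is classic cache-aside; a minimal sketch, with dicts standing in for Redis and the SQL primary:

```python
# Sketch of the cache-aside read path: check cache, fall back to the
# primary on a miss, populate the cache for subsequent readers.
db = {"title:1": {"name": "Pilot", "runtime_min": 52}}   # stand-in for SQL
cache: dict[str, dict] = {}                              # stand-in for Redis
db_reads = 0

def get_title(key: str):
    global db_reads
    if key in cache:                 # hit: the DB is never touched
        return cache[key]
    db_reads += 1                    # miss: one read against the primary
    row = db.get(key)
    if row is not None:
        cache[key] = row             # populate for subsequent readers
    return row

get_title("title:1"); get_title("title:1"); get_title("title:1")
assert db_reads == 1                 # repeated queries absorbed by the cache
```

This is what "keeps DB pressure stable": hot keys cost the primary one read regardless of request volume, so traffic spikes land on the cache tier.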

Interview defense

  • This design makes bottlenecks explicit (ingress, core compute, persistence, async workers).
  • It supports progressive scaling without re-architecting the core request path.
  • It keeps correctness-sensitive state changes in durable systems while offloading background work asynchronously.