
Video Streaming - On-Demand Platform

This Video Streaming - On-Demand Platform solution gives a production-minded baseline for the prompt. It includes a concise requirements recap, a component-by-component architecture breakdown, explicit tradeoffs for latency, availability, cost, and complexity, and failure mitigations with scoring rationale, so you can benchmark your own design quickly.

Hard | CDN | Media Processing | Databases | Caching

Requirements Recap

Requirement | Target
Registered users | 100,000,000
Peak concurrent streams | ~10,000,000
Content library | ~50,000 titles
New uploads/day | ~200 videos
Video sizes (raw) | Up to 100 GB
Transcoding variants | 10+ per video
Buffer ratio | < 0.5%
Global regions | 10+
Availability target | 99.99%
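
The targets above translate into rough capacity numbers with a few lines of arithmetic. The sketch below is a back-of-envelope estimate, not a sizing recommendation: the average bitrate of 5 Mbit/s is an assumption that does not appear in the requirements, "10+" variants is taken at its lower bound, and every upload is treated as the 100 GB maximum to get an upper bound on daily ingest.

```python
# Back-of-envelope sizing derived from the requirements recap.
PEAK_CONCURRENT_STREAMS = 10_000_000
UPLOADS_PER_DAY = 200
VARIANTS_PER_VIDEO = 10        # "10+" in the recap; lower bound used here
MAX_RAW_SIZE_GB = 100
AVAILABILITY = 0.9999
ASSUMED_AVG_BITRATE_MBPS = 5   # ASSUMPTION: not stated in the requirements


def peak_egress_tbps(streams: int, avg_mbps: float) -> float:
    """Aggregate delivery bandwidth at peak, in Tbit/s."""
    return streams * avg_mbps / 1_000_000


def transcode_jobs_per_day(uploads: int, variants: int) -> int:
    """One transcode job per (upload, variant) pair."""
    return uploads * variants


def max_raw_ingest_tb_per_day(uploads: int, max_size_gb: float) -> float:
    """Upper bound on raw ingest if every upload hits the size limit."""
    return uploads * max_size_gb / 1000


def allowed_downtime_minutes_per_year(availability: float) -> float:
    """Downtime budget implied by an availability target."""
    return (1 - availability) * 365 * 24 * 60


if __name__ == "__main__":
    print("peak egress (Tbit/s):",
          peak_egress_tbps(PEAK_CONCURRENT_STREAMS, ASSUMED_AVG_BITRATE_MBPS))
    print("transcode jobs/day:",
          transcode_jobs_per_day(UPLOADS_PER_DAY, VARIANTS_PER_VIDEO))
    print("max raw ingest (TB/day):",
          max_raw_ingest_tb_per_day(UPLOADS_PER_DAY, MAX_RAW_SIZE_GB))
    print("downtime budget (min/year):",
          round(allowed_downtime_minutes_per_year(AVAILABILITY), 2))
```

Under these assumptions the delivery path dominates: tens of Tbit/s of egress at peak against only a few thousand transcode jobs per day, which is why most of that traffic has to be served from the CDN edge rather than from origin.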

Architecture Breakdown (Component-by-Component)

  1. Web Clients

    Generates user traffic and receives responses.

    Acts as the entry point of the architecture flow: every request originates here.

  2. DNS

    Resolves domain names to reachable service endpoints.

    Bridges 1 incoming flow to 1 downstream dependency.

  3. CDN Edge

    Serves cacheable and static content from edge locations.

    Bridges 1 incoming flow to 1 downstream dependency.

  4. Load Balancer

    Distributes requests across healthy backend instances.

    Bridges 1 incoming flow to 1 downstream dependency.

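  5. API Gateway

    Provides a single entry point for API requests and forwards them to the backend service; gateways typically also handle cross-cutting concerns such as authentication and rate limiting.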

    Bridges 1 incoming flow to 1 downstream dependency.

  6. Core Service

    Runs the platform's core business logic as a backend microservice.

    Bridges 1 incoming flow to 3 downstream dependencies.

  7. Redis Cache

    Stores hot data to reduce origin read latency.

    Bridges 1 incoming flow to 1 downstream dependency.

  8. Message Queue

    Buffers asynchronous work to smooth traffic spikes.

    Bridges 1 incoming flow to 1 downstream dependency.

  9. Primary SQL DB

    Persists relational data with transactional guarantees.

    Acts as a sink or system-of-record endpoint in the architecture flow.

  10. Media Workers

    Processes asynchronous jobs outside the request path.

    Bridges 1 incoming flow to 1 downstream dependency.

  11. Object Storage

    Stores large files and media objects durably.

    Acts as a sink or system-of-record endpoint in the architecture flow.
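
The component list can be read as a directed graph. The sketch below encodes one plausible wiring as an adjacency list and checks it against the fan-out counts given for each component. The specific edges are assumptions inferred from the listing order and those counts (for example, that the Core Service's three dependencies are the cache, the queue, and the SQL database, and that the cache's one dependency is the SQL database); the source does not spell them out.

```python
from collections import deque

# ASSUMED wiring: edges are inferred, not stated in the breakdown.
FLOW = {
    "Web Clients": ["DNS"],
    "DNS": ["CDN Edge"],
    "CDN Edge": ["Load Balancer"],
    "Load Balancer": ["API Gateway"],
    "API Gateway": ["Core Service"],
    "Core Service": ["Redis Cache", "Message Queue", "Primary SQL DB"],
    "Redis Cache": ["Primary SQL DB"],
    "Message Queue": ["Media Workers"],
    "Media Workers": ["Object Storage"],
    "Primary SQL DB": [],
    "Object Storage": [],
}


def fan_out(flow):
    """Number of downstream dependencies per component."""
    return {node: len(targets) for node, targets in flow.items()}


def sinks(flow):
    """Components with no downstream dependency (systems of record)."""
    return [node for node, targets in flow.items() if not targets]


def find_path(flow, start, goal):
    """Shortest hop path from start to goal via breadth-first search."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in flow[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # goal is unreachable from start


if __name__ == "__main__":
    print(sinks(FLOW))
    print(" -> ".join(find_path(FLOW, "Web Clients", "Object Storage")))
```

Writing the flow down this way makes it easy to spot which components sit on every request path and which are reached only through the asynchronous queue.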

Tradeoffs (Latency / Availability / Cost / Complexity)

  • Decision: Keep the request path focused on core business operations

    Latency: Shorter synchronous path keeps average response time stable
    Availability: Fewer inline dependencies reduce immediate failure blast radius
    Cost: Avoids unnecessary infrastructure in the first rollout
    Complexity: Lower coordination overhead for small teams

  • Decision: Push cacheable responses to edge locations

    Latency: Faster global response time for static and hot assets
    Availability: Edge cache can mask origin incidents temporarily
    Cost: Lower origin egress and compute, with CDN transfer fees
    Complexity: Cache key and purge strategy must be explicit

  • Decision: Keep a clear system of record for transactional writes

    Latency: Predictable read/write behavior with indexed access
    Availability: Strong correctness with managed backup and recovery
    Cost: Storage and IOPS spend grows with write volume
    Complexity: Schema evolution and query tuning required

  • Decision: Cache hot reads in front of the primary data store

    Latency: Lower median and tail latency on repeated reads
    Availability: Absorbs origin pressure during read spikes
    Cost: Adds cache infra spend but reduces database scaling pressure
    Complexity: Requires TTL and invalidation discipline
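
The "cache hot reads in front of the primary data store" decision, along with its TTL and invalidation cost, can be illustrated with a cache-aside read path. This is a minimal sketch, not the solution's implementation: `TTLCache` is a hypothetical in-memory stand-in for Redis, `TitleStore` and its dict-backed `db` are illustrative names, and the 60-second TTL is an arbitrary assumption.

```python
import time


class TTLCache:
    """Hypothetical in-memory stand-in for a Redis-style cache with TTLs."""

    def __init__(self, clock=time.monotonic):
        self._data = {}
        self._clock = clock

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]  # entry expired: treat as a miss
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._data[key] = (value, self._clock() + ttl_seconds)

    def delete(self, key):
        self._data.pop(key, None)


class TitleStore:
    """Cache-aside access to title metadata; `db` is a dict standing in for SQL."""

    def __init__(self, cache, db, ttl_seconds=60):
        self.cache = cache
        self.db = db
        self.ttl_seconds = ttl_seconds
        self.db_reads = 0  # counts how often the system of record is hit

    def get_title(self, title_id):
        cached = self.cache.get(title_id)
        if cached is not None:
            return cached              # hit: the database is not touched
        self.db_reads += 1
        row = self.db[title_id]        # miss: read the system of record
        self.cache.set(title_id, row, self.ttl_seconds)
        return row

    def update_title(self, title_id, row):
        self.db[title_id] = row        # write to the system of record first
        self.cache.delete(title_id)    # then invalidate the stale cache entry
```

Repeated reads of the same title inside the TTL window reach the database once. The write path deletes the key rather than updating it, which keeps the database as the only source of truth at the price of one extra miss after each write.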

Failure Modes and Mitigations

  • Failure mode: Primary datastore saturation increases latency and write timeouts

    Mitigation: Tune indexes, add read offload where valid, and cap expensive query classes.

  • Failure mode: Cache stampede after hot-key expiry overloads the database

    Mitigation: Use request coalescing, jittered TTLs, and stale-while-revalidate for hot keys.

  • Failure mode: One degraded dependency causes cascading failures across services

    Mitigation: Apply timeouts, retries with budgets, and circuit breakers on every service boundary.
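
Two of the cache-stampede mitigations listed above, jittered TTLs and request coalescing, fit in a short sketch. The names `jittered_ttl` and `SingleFlight` are hypothetical, the 10% jitter fraction is an assumption, and this version coalesces only within one process; coalescing across a fleet would need a shared lock or a similar mechanism that is not shown here.

```python
import random
import threading


def jittered_ttl(base_seconds, jitter_fraction=0.1, rng=random):
    """Return a TTL spread uniformly around base_seconds.

    Randomizing expiry keeps keys written together from expiring together.
    """
    spread = base_seconds * jitter_fraction
    return base_seconds + rng.uniform(-spread, spread)


class SingleFlight:
    """Coalesce concurrent loads of the same key into one loader call."""

    def __init__(self):
        self._lock = threading.Lock()
        self._inflight = {}  # key -> shared state of the in-progress load

    def do(self, key, loader):
        with self._lock:
            call = self._inflight.get(key)
            leader = call is None
            if leader:
                call = {"done": threading.Event(), "value": None, "error": None}
                self._inflight[key] = call
        if leader:
            try:
                call["value"] = loader()      # only the leader hits the origin
            except Exception as exc:
                call["error"] = exc           # followers see the same failure
            finally:
                with self._lock:
                    del self._inflight[key]
                call["done"].set()            # wake every waiting follower
        else:
            call["done"].wait()               # followers wait for the leader
        if call["error"] is not None:
            raise call["error"]
        return call["value"]
```

A cache miss handler would call `flight.do(key, load_from_db)` and then store the result with `jittered_ttl(base)`, so a hot key that expires under load triggers one database read instead of one per waiting request.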

Why This Scores Well

  • Availability (35%): Redundant routing and data paths reduce single points of failure under burst traffic.
  • Latency (20%): The design keeps hot reads close to users and reduces expensive origin round-trips.
  • Resilience (25%): Asynchronous buffering, observability, and service boundaries isolate faults and improve recovery.
  • Cost Efficiency (10%) + Simplicity (10%): Higher complexity is scoped to requirements that actually demand scale or stronger fault tolerance.
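
The percentages above sum to 100%, so one natural reading is a linear weighted sum of per-category sub-scores. The sketch below assumes that reading, which the source does not confirm; the sub-score values in the usage lines are hypothetical and are not results from any simulator run.

```python
# Weights taken from the scoring rationale; the linear combination is an ASSUMPTION.
WEIGHTS = {
    "availability": 0.35,
    "latency": 0.20,
    "resilience": 0.25,
    "cost_efficiency": 0.10,
    "simplicity": 0.10,
}


def weighted_score(subscores, weights=WEIGHTS):
    """Combine per-category sub-scores (0-100) into one overall score."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    missing = set(weights) - set(subscores)
    if missing:
        raise ValueError(f"missing sub-scores: {sorted(missing)}")
    return sum(weights[name] * subscores[name] for name in weights)


if __name__ == "__main__":
    # Hypothetical sub-scores, for illustration only.
    example = {
        "availability": 90,
        "latency": 80,
        "resilience": 70,
        "cost_efficiency": 60,
        "simplicity": 50,
    }
    print(round(weighted_score(example), 2))
```

Under this reading, a point of availability is worth 3.5 times a point of cost efficiency or simplicity, which matches the design's willingness to pay for redundancy.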

Next Step

Validate this architecture by solving the prompt yourself, then practice the highest-leverage component in a guided lab and topic hub.

FAQ

  • What should I change first if traffic doubles?

    Profile the bottleneck first, then scale the hot path component (usually compute, cache, or read path) before adding new system layers.

  • Why is CDN emphasized in this solution?

    It is the highest-leverage topic for this challenge's constraints and directly improves score-impacting metrics such as latency, availability, and resilience.

  • How do I validate this architecture quickly?

    Run the same challenge in the simulator, compare score breakdown metrics, and then test one tradeoff change at a time.
