Hard · Enterprise

Design Uber

Databases · WebSockets · Message Queues · Caching · Geo Distribution · Analytics

Problem Statement

Design the architecture for Uber - the world's largest ride-hailing platform operating in 10,000+ cities across 70+ countries with 130 million monthly active users. Your design must cover:

- Real-time driver matching - when a rider requests a ride, find the optimal driver within 15 seconds. Matching considers distance, ETA, driver rating, vehicle type, and predicted demand, using geospatial indexing over millions of active drivers.
- Live location tracking - every active driver sends a GPS update every 4 seconds. The system ingests millions of location updates per second, updating a real-time geospatial index.
- Surge pricing - dynamically adjust fares based on supply (available drivers) and demand (ride requests) per geographic zone. Recalculated every 30 seconds.
- ETA prediction - predict arrival time using real-time traffic data, historical patterns, and ML models. ETA must account for road conditions, events, and weather.
- Trip lifecycle - request → match → driver en route → pickup → in-trip → dropoff → payment → rating. Each state transition is an event in an event-sourced system.
- Payments - process payments across 70+ countries with different payment methods, currencies, and tax rules.
- Driver/rider safety - real-time trip monitoring, anomaly detection (unexpected route changes), and emergency SOS.
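The trip lifecycle above maps naturally onto an explicit state machine where every transition is appended to an event log. A minimal sketch (illustrative only; the state names and `Trip` class are assumptions for this example, and a real event-sourced system would persist transitions durably rather than in a Python list):

```python
# Valid transitions for the trip lifecycle:
# requested -> matched -> en_route -> picked_up -> in_trip -> dropped_off -> paid -> rated
VALID_TRANSITIONS = {
    "requested": {"matched", "cancelled"},
    "matched": {"en_route", "cancelled"},
    "en_route": {"picked_up", "cancelled"},
    "picked_up": {"in_trip"},
    "in_trip": {"dropped_off"},
    "dropped_off": {"paid"},
    "paid": {"rated"},
    "rated": set(),
    "cancelled": set(),
}

class Trip:
    def __init__(self, trip_id: str):
        self.trip_id = trip_id
        self.state = "requested"
        self.events = [("requested", trip_id)]  # append-only event log

    def transition(self, new_state: str) -> None:
        if new_state not in VALID_TRANSITIONS[self.state]:
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        self.state = new_state
        self.events.append((new_state, self.trip_id))  # each transition is an event

trip = Trip("t-1")
for s in ["matched", "en_route", "picked_up", "in_trip", "dropped_off", "paid", "rated"]:
    trip.transition(s)
```

Rejecting illegal transitions at the state machine (e.g. `requested -> paid`) is what keeps the event log replayable as a consistent history.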

The core challenge is the real-time geospatial system operating at global scale with sub-second latency requirements.

What You'll Learn

Design Uber's ride-hailing platform - real-time matching, surge pricing, ETA prediction, and global dispatch for 100M+ users. Build this architecture under realistic production constraints, then validate tradeoffs in the design lab simulation.


Constraints

Monthly active users: 130,000,000
Daily trips: ~25,000,000
Active drivers (peak): ~5,000,000
Location updates/second: ~1,250,000
Match time: < 15 seconds
Surge recalculation: every 30 seconds
Countries: 70+
Availability target: 99.99%

Interview-Ready Approach

1) Clarify Scope and SLOs

  • Problem statement: Design Uber's ride-hailing platform - real-time matching, surge pricing, ETA prediction, and global dispatch for 100M+ users.
  • Design for a peak load target around 9,028 RPS (including burst headroom).
  • Monthly active users: 130,000,000
  • Daily trips: ~25,000,000
  • Active drivers (peak): ~5,000,000
  • Location updates/second: ~1,250,000
  • Match time: < 15 seconds

2) Capacity Planning Method

  • Convert traffic and growth constraints into request rate, storage growth, and concurrency budgets.
  • Keep at least 2-3x safety margin per tier (ingress, compute, storage, async workers).
  • Reserve explicit latency budgets per hop so p95 can be defended in review.
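The headline numbers above can be sanity-checked with back-of-envelope arithmetic: the ~1,250,000 location updates/second follows directly from 5M peak drivers each pinging every 4 seconds, and the trip-request rate from 25M daily trips. A sketch (the 3x peak factor is an assumed safety margin, per the guidance above):

```python
# Back-of-envelope capacity check for the stated constraints.
peak_drivers = 5_000_000          # active drivers at peak
update_interval_s = 4             # one GPS ping per driver every 4 seconds
location_updates_per_s = peak_drivers / update_interval_s

daily_trips = 25_000_000
avg_trip_requests_per_s = daily_trips / 86_400   # seconds per day

# Apply an assumed 3x peak-to-average factor as safety margin:
peak_trip_requests_per_s = avg_trip_requests_per_s * 3

print(f"location updates/s: {location_updates_per_s:,.0f}")      # 1,250,000
print(f"avg trip requests/s: {avg_trip_requests_per_s:,.0f}")    # 289
print(f"peak (3x) trip requests/s: {peak_trip_requests_per_s:,.0f}")  # 868
```

Note the asymmetry this exposes: the location-ingest path runs four orders of magnitude hotter than the trip-request path, which is why the two are designed as separate systems.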

3) Architecture Decisions

  • Databases: Define a clear system-of-record and design read/write paths separately before adding optimizations.
  • WebSockets: Use persistent connection gateways and decouple fanout via pub/sub or queues.
  • Message Queues: Move non-blocking and retry-heavy work to async consumers with explicit retry and DLQ policies.
  • Caching: Put cache on hot read paths first and pick cache-aside or write-through explicitly.
  • Geo Distribution: Route users to nearest region/edge while keeping write-consistency boundaries explicit.
  • Analytics: Maintain separate OLTP and analytics paths; stream events into a warehouse/time-series layer.
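The caching decision above (cache-aside with explicit TTL) can be sketched in a few lines; here `db` is a hypothetical in-memory stand-in for the system of record, and the TTL bounds staleness exactly as Section 4 recommends:

```python
import time

# Minimal cache-aside sketch with TTL-bounded staleness.
db = {"driver:42:profile": {"rating": 4.9}}       # stand-in for the system of record
cache: dict[str, tuple[float, object]] = {}       # key -> (expires_at, value)
TTL_SECONDS = 60

def get(key: str):
    entry = cache.get(key)
    if entry and entry[0] > time.monotonic():
        return entry[1]                           # cache hit: serve without touching DB
    value = db.get(key)                           # cache miss: read through to the DB
    cache[key] = (time.monotonic() + TTL_SECONDS, value)
    return value

def invalidate(key: str) -> None:
    cache.pop(key, None)                          # invalidation hook, called on writes
```

Choosing cache-aside (vs write-through) keeps the write path simple, at the cost of serving up to `TTL_SECONDS` of staleness unless the invalidation hook fires, which is the tradeoff to name explicitly in review.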

4) Reliability and Failure Strategy

  • Use strong write constraints (transactions or conditional writes) and explicit backup/restore strategy.
  • Track connection churn, backpressure, and session resumption behavior.
  • Guarantee idempotent consumers and trace every message with correlation IDs.
  • Bound staleness with TTL + invalidation hooks for critical entities.
  • Design region failover and data residency controls as first-class requirements.
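The idempotent-consumer guarantee above can be sketched with a dedupe store keyed by message ID; the in-memory set here is a deliberate simplification (production systems would persist it, e.g. in Redis or alongside the write in the DB), and the payment handler is a hypothetical example consumer:

```python
processed_ids: set[str] = set()   # dedupe store; must be durable in production
charges: list[float] = []         # stand-in for the side effect (charging a card)

def handle_payment(message: dict) -> None:
    msg_id = message["id"]        # idempotency/correlation key set by the producer
    if msg_id in processed_ids:
        return                    # duplicate delivery (queue retry): skip side effect
    charges.append(message["amount"])
    processed_ids.add(msg_id)

msg = {"id": "pay-123", "amount": 18.50}
handle_payment(msg)
handle_payment(msg)               # redelivered by the queue after a timeout: no-op
```

With at-least-once delivery, the consumer, not the queue, is responsible for making retries safe, which is why the check-then-record pattern (ideally in one transaction with the side effect) is non-negotiable on the payments path.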

5) Validation Plan

  • Run one peak-load test, one dependency-degradation test, and one failover test.
  • Verify idempotency for all retried writes and async consumers.
  • Track user-facing SLOs first: p95 latency, error rate, and successful throughput.

6) Trade-offs to Call Out in Interviews

  • Databases: SQL gives stronger transactional guarantees; NoSQL often gives better write scaling and flexibility.
  • WebSockets: WebSockets reduce interaction latency but complicate scaling and state management.
  • Message Queues: Async pipelines absorb spikes well, but increase eventual-consistency complexity.
  • Caching: Higher hit rate cuts latency/cost, but stale data and invalidation bugs become primary risks.
  • Geo Distribution: Global latency improves, but cross-region consistency and operations become harder.

Practical Notes

  • Use a geospatial index (S2 geometry cells or geohash) to partition the world into cells. Each cell tracks active drivers. 'Find nearby drivers' = query the cell + neighboring cells.
  • Driver locations live in-memory (Redis with geospatial commands or a custom spatial index) - not in a traditional database.
  • Surge pricing: a stream processor aggregates supply/demand per S2 cell every 30 seconds. Price = f(demand/supply ratio).
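The cell-based lookup described above can be illustrated with a toy fixed-size lat/lon grid standing in for S2 cells or geohashes (the cell size and all names here are assumptions for the sketch; a production system would use the S2 library and handle the antimeridian and poles):

```python
from collections import defaultdict

CELL_DEG = 0.01   # ~1.1 km cells at the equator; S2/geohash replaces this in production

def cell_of(lat: float, lon: float) -> tuple[int, int]:
    return (int(lat // CELL_DEG), int(lon // CELL_DEG))

cells: dict[tuple[int, int], set[str]] = defaultdict(set)
positions: dict[str, tuple[float, float]] = {}

def update_location(driver_id: str, lat: float, lon: float) -> None:
    old = positions.get(driver_id)
    if old:
        cells[cell_of(*old)].discard(driver_id)    # move driver out of previous cell
    positions[driver_id] = (lat, lon)
    cells[cell_of(lat, lon)].add(driver_id)

def nearby_drivers(lat: float, lon: float) -> set[str]:
    ci, cj = cell_of(lat, lon)
    found: set[str] = set()
    for di in (-1, 0, 1):                          # query the cell + its 8 neighbors
        for dj in (-1, 0, 1):
            found |= cells.get((ci + di, cj + dj), set())
    return found

update_location("d1", 37.7749, -122.4194)   # San Francisco
update_location("d2", 37.7755, -122.4190)   # a few hundred meters away
update_location("d3", 40.7128, -74.0060)    # New York: lands in a far-away cell
```

Because `update_location` is pure in-memory dictionary work, it can absorb the ~1.25M updates/second ingest rate when sharded across hosts, which is the reason driver locations skip the traditional database entirely.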

Reference Solution

Why This Solution Works

Request path: The solution keeps ingress, service logic, and stateful dependencies separated so each layer can scale independently.

Reference flow: Mobile Clients -> DNS -> Load Balancer -> API Gateway -> Core Service -> Primary NoSQL DB (system of record) -> Read Replicas -> Redis Cache

Design strengths

  • Cache sits on the read path to absorb repeated queries and keep DB pressure stable.
  • Async queue/event bus isolates bursty workloads and supports retries without blocking synchronous requests.
  • Analytics pipeline is separated from OLTP path to avoid reporting workloads impacting transactions.

Interview defense

  • This design makes bottlenecks explicit (ingress, core compute, persistence, async workers).
  • It supports progressive scaling without re-architecting the core request path.
  • It keeps correctness-sensitive state changes in durable systems while offloading background work asynchronously.