Ticket Booking · Part 1 (Medium)

Ticket Booking - Concert & Event Reservations

Topics: Databases · Caching · Rate Limiting · Message Queues · Consistency

Problem Statement

SeatGrab is building an event ticketing platform for concerts, sports, and theater. The hardest scenario: a major artist announces a stadium tour and 500,000 fans try to buy tickets for a 50,000-seat venue the moment they go on sale.

Core requirements:

- Seat map - the venue has assigned seating. Users browse an interactive seat map, select specific seats, and purchase. Two users must never successfully purchase the same seat.
- Hold & release - when a user selects seats, they're temporarily held for 5 minutes. If the user doesn't complete checkout, the seats are released back to the pool.
- Waiting room / queue - to prevent the site from crashing, users are placed in a virtual queue when demand exceeds capacity. The system admits users in FIFO order at a controlled rate.
- Dynamic pricing - ticket prices adjust based on demand and section popularity, recalculated every 30 seconds.
- Anti-bot measures - detect and block automated bots that try to snatch tickets (CAPTCHA, device fingerprinting, rate limiting).
- Post-sale features - digital tickets with QR codes, ticket transfers, and refunds.

This challenge is about concurrency control, fairness, and handling extreme traffic spikes.

What You'll Learn

Design a Ticketmaster-like platform handling 500,000 users competing for seats in a 50,000-seat venue. Build this architecture under realistic production constraints, then validate the tradeoffs in the design lab simulation.


Constraints

Concurrent users at sale: ~500,000
Venue capacity: 50,000 seats
Seat hold timeout: 5 minutes
Checkout confirmation: < 3 seconds
Double-booking tolerance: zero
Queue admission rate: ~5,000 users/minute
Price recalculation: every 30 seconds
Availability target: 99.9%

Interview-Ready Approach

1) Clarify Scope and SLOs

  • Problem statement: Design a Ticketmaster-like platform handling 500,000 users competing for seats in a 50,000-seat venue.
  • Design for a peak load target around 75,000 RPS (including burst headroom).
  • Concurrent users at sale: ~500,000
  • Venue capacity: 50,000 seats
  • Seat hold timeout: 5 minutes
  • Checkout confirmation: < 3 seconds
  • Double-booking tolerance: Zero

2) Capacity Planning Method

  • Convert traffic and growth constraints into request rate, storage growth, and concurrency budgets.
  • Keep at least 2-3x safety margin per tier (ingress, compute, storage, async workers).
  • Reserve explicit latency budgets per hop so p95 can be defended in review.
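As a worked example of this method: the ~75,000 RPS peak-load target is consistent with all 500,000 concurrent users polling the seat map roughly nine times a minute. That per-user rate is an assumption chosen to illustrate the arithmetic, not a figure from the prompt:

```python
# Back-of-envelope capacity math for the sale peak.
concurrent_users = 500_000
requests_per_user_per_min = 9          # assumed polling rate during the rush

steady_rps = concurrent_users * requests_per_user_per_min / 60
print(f"steady-state: {steady_rps:,.0f} RPS")   # 75,000 RPS

# Provision each tier with the 2-3x safety margin the method calls for.
for margin in (2, 3):
    print(f"{margin}x provisioned capacity: {steady_rps * margin:,.0f} RPS")
```

Defending the number this way (users × rate ÷ 60, then margin) is usually more persuasive in review than quoting the target alone.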

3) Architecture Decisions

  • Databases: Define a clear system-of-record and design read/write paths separately before adding optimizations.
  • Caching: Put cache on hot read paths first and pick cache-aside or write-through explicitly.
  • Rate Limiting: Enforce token/sliding-window limits at ingress and for sensitive internal APIs.
  • Message Queues: Move deferrable and retry-heavy work to async consumers with explicit retry and DLQ policies.
  • Consistency: Classify operations by consistency requirement: strong for money/inventory, eventual for feeds/analytics.
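A minimal in-process sketch of the token-bucket limiting named above. The rates are illustrative, and a production gateway would keep bucket state in shared storage such as Redis rather than local memory:

```python
import time

class TokenBucket:
    """Token-bucket limiter: refill at `rate` tokens/sec, capped at `capacity`."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity          # start full: allows an initial burst
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False                    # caller would return HTTP 429 + Retry-After

bucket = TokenBucket(rate=5, capacity=10)      # illustrative numbers
results = [bucket.allow() for _ in range(15)]
print(results.count(True))                     # the initial burst of 10 passes
```

The same shape works per-user, per-IP, or per-API-key; only the key used to look up the bucket changes.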

4) Reliability and Failure Strategy

  • Use strong write constraints (transactions or conditional writes) and explicit backup/restore strategy.
  • Bound staleness with TTL + invalidation hooks for critical entities.
  • Return deterministic 429 behavior with clear retry headers.
  • Guarantee idempotent consumers and trace every message with correlation IDs.
  • Use idempotency keys and conflict-resolution rules on retried/distributed writes.
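A sketch of the idempotency-key deduplication mentioned above, applied to retried writes. A plain dict stands in for what would be a Redis or database table with a TTL, and the `PaymentProcessor` name is illustrative:

```python
class PaymentProcessor:
    """Replays the stored result for a repeated idempotency key
    instead of performing the side effect twice."""

    def __init__(self):
        self._results = {}   # idempotency_key -> prior result

    def charge(self, idempotency_key: str, amount_cents: int) -> dict:
        if idempotency_key in self._results:
            return self._results[idempotency_key]   # replay: no double charge
        result = {"status": "charged", "amount_cents": amount_cents}
        self._results[idempotency_key] = result     # record before acknowledging
        return result

p = PaymentProcessor()
first = p.charge("order-123-attempt-1", 5000)
retry = p.charge("order-123-attempt-1", 5000)   # network retry of same request
print(first is retry)   # True: second call returned the stored result
```

The key must be generated by the client per logical operation (not per HTTP attempt), so a retry of the same checkout carries the same key.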

5) Validation Plan

  • Run one peak-load test, one dependency-degradation test, and one failover test.
  • Verify idempotency for all retried writes and async consumers.
  • Track user-facing SLOs first: p95 latency, error rate, and successful throughput.

6) Trade-offs to Call Out in Interviews

  • Databases: SQL gives stronger transactional guarantees; NoSQL often gives better write scaling and flexibility.
  • Caching: Higher hit rate cuts latency/cost, but stale data and invalidation bugs become primary risks.
  • Rate Limiting: Aggressive limits protect the system but can hurt legitimate burst traffic.
  • Message Queues: Async pipelines absorb spikes well, but increase eventual-consistency complexity.
  • Consistency: Stronger consistency improves correctness, but often increases latency and coordination costs.

Practical Notes

  • Use pessimistic locking (SELECT FOR UPDATE) or Redis atomic operations to prevent double-booking of seats.
  • A virtual waiting room (queue) with token-based admission mitigates the thundering-herd problem.
  • Seat holds can use Redis keys with TTL - automatic release after 5 minutes if not purchased.


Reference Solution

Why This Solution Works

Request path: The solution keeps ingress, service logic, and stateful dependencies separated so each layer can scale independently.

Reference flow: Web Clients -> Load Balancer -> API Gateway -> Rate Limiter -> API Service -> Primary SQL DB -> Read Model DB -> Redis Cache

Design strengths

  • Cache sits on the read path to absorb repeated queries and keep DB pressure stable.
  • Async queue/event bus isolates bursty workloads and supports retries without blocking synchronous requests.
  • Security controls are enforced at ingress to protect downstream capacity.
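The cache-on-the-read-path behavior described above follows the cache-aside pattern: check the cache, fall back to the system of record on a miss, and populate the cache for later readers. In-memory dicts stand in for Redis and the read-model DB, and the key name is illustrative:

```python
cache = {}
db = {"event:42:seatmap": {"A1": "available", "A2": "held"}}   # stand-in read model
db_reads = 0

def get_seatmap(event_key: str) -> dict:
    """Cache-aside read: hit serves from cache, miss reads the DB and populates."""
    global db_reads
    if event_key in cache:
        return cache[event_key]          # cache hit: DB untouched
    db_reads += 1
    value = db[event_key]                # cache miss: read the system of record
    cache[event_key] = value             # populate for subsequent readers
    return value

get_seatmap("event:42:seatmap")
get_seatmap("event:42:seatmap")
print(db_reads)   # 1 — the repeated query was absorbed by the cache
```

The tradeoff called out earlier applies here: a production version needs a TTL and invalidation hook on the cached entry, or the seat map will show stale availability after holds and purchases.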

Interview defense

  • This design makes bottlenecks explicit (ingress, core compute, persistence, async workers).
  • It supports progressive scaling without re-architecting the core request path.
  • It keeps correctness-sensitive state changes in durable systems while offloading background work asynchronously.