Hard | Enterprise

Design Airbnb

Databases, Search, Caching, Geo Distribution, API Design, Message Queues

Problem Statement

Design the architecture for Airbnb - the world's largest hospitality marketplace with 150 million users, 7 million listings, and 2 million bookings per night at peak. Your design must cover:

- Listing management - hosts create listings with photos, descriptions, amenities, house rules, and availability calendars. Listings are searchable and filterable by 50+ attributes.
- Search & discovery - search by location, dates, guests, price range, amenities, and property type. Results are ranked by relevance, price, host quality, and location desirability. Support map-based browsing with listings rendered on a map as the user pans/zooms.
- Availability & booking - the calendar availability system must prevent double-bookings. When a guest requests dates, the system must atomically check availability and reserve. Hosts can set different prices for different dates (seasonal pricing).
- Payments - split payments under an escrow model: the guest pays at booking, and the host receives the payout after check-in. Support 70+ currencies, refunds, cancellation policies, and service fees.
- Reviews - two-sided reviews (guest reviews host AND host reviews guest), both revealed simultaneously after a blind period to prevent retaliation bias.
- Dynamic pricing - a "Smart Pricing" feature that suggests optimal nightly prices to hosts based on demand, local events, seasonality, comparable listings, and day of week.
- Trust & safety - identity verification, background checks, fraud detection (fake listings, payment fraud), and a resolution center for disputes.
- Messaging - real-time chat between guests and hosts for pre-booking questions, booking confirmations, and check-in instructions.

The core challenge is a two-sided marketplace with complex availability/pricing logic, geospatial search, and trust systems.

What You'll Learn

Design Airbnb's marketplace - listing search, booking, payments, reviews, pricing, and trust at 150M+ users. Build this architecture under realistic production constraints, then validate trade-offs in the design lab simulation.

Constraints

Total users: 150,000,000+
Active listings: 7,000,000
Peak bookings/night: ~2,000,000
Search QPS (peak): ~20,000
Search latency: < 500 ms
Currencies supported: 70+
Calendar availability check: < 200 ms
Availability target: 99.99%
Approach

Interview-Ready Approach

1) Clarify Scope and SLOs

  • Problem statement: Design Airbnb's marketplace - listing search, booking, payments, reviews, pricing, and trust at 150M+ users.
  • Design the general request path for a load target of around 10,417 RPS (including burst headroom); size the search tier separately for its ~20,000 QPS peak.
  • Total users: 150,000,000+
  • Active listings: 7,000,000
  • Peak bookings/night: ~2,000,000
  • Search QPS (peak): ~20,000
  • Search latency: < 500 ms

2) Capacity Planning Method

  • Convert traffic and growth constraints into request rate, storage growth, and concurrency budgets.
  • Keep a 2-3x safety margin per tier (ingress, compute, storage, async workers).
  • Reserve explicit latency budgets per hop so p95 can be defended in review.
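
The method above can be made concrete with a little arithmetic. The following is a minimal sketch using the stated constraints (2,000,000 peak bookings per night, ~20,000 peak search QPS, < 500 ms search latency). The 3x margin is the upper end of the guidance above, and the per-hop latency split in `HOP_BUDGET_MS` is purely an assumption for illustration, not a figure from the problem.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

def average_rate(events_per_day: int) -> float:
    """Average events per second if the daily volume were spread evenly."""
    return events_per_day / SECONDS_PER_DAY

def provisioned(peak_rate: float, safety_margin: float) -> float:
    """Capacity to provision for a tier after applying the safety margin."""
    return peak_rate * safety_margin

# Booking writes: 2,000,000 per peak night is only about 23 per second on average.
avg_booking_rate = average_rate(2_000_000)   # about 23.15 per second

# Search reads: the stated 20,000 QPS peak with a 3x margin.
search_capacity = provisioned(20_000, 3)     # 60,000 QPS

# Hypothetical per-hop latency budget (ms) for one search request.
HOP_BUDGET_MS = {
    "edge_and_gateway": 30,
    "search_service": 70,
    "search_index": 300,
    "cache_and_hydration": 50,
}
SEARCH_SLO_MS = 500
slack_ms = SEARCH_SLO_MS - sum(HOP_BUDGET_MS.values())  # 50 ms left for jitter
```

The numbers show where the design pressure sits: bookings are low-volume but correctness-critical, while search is the high-volume read path.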

3) Architecture Decisions

  • Databases: Define a clear system-of-record and design read/write paths separately before adding optimizations.
  • Search: Write to the primary store and update the search index asynchronously, so the index serves relevance-ranked queries at scale without sitting on the write path.
  • Caching: Put cache on hot read paths first and pick cache-aside or write-through explicitly.
  • Geo Distribution: Route users to nearest region/edge while keeping write-consistency boundaries explicit.
  • API Design: Standardize API boundaries, idempotency keys, pagination, and error contracts first.
  • Message Queues: Move non-blocking and retry-heavy work to async consumers with explicit retry and DLQ policies.
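
To illustrate the caching decision, here is a minimal cache-aside sketch with a TTL and an explicit invalidation hook. It assumes an in-process dictionary standing in for a real cache such as Redis. The `CacheAside` class, its `loader` callback, and the injectable `clock` are hypothetical names introduced only for this example.

```python
import time
from typing import Any, Callable, Dict, Tuple

class CacheAside:
    """Cache-aside: read the cache first, fall back to the loader on a miss."""

    def __init__(self, loader: Callable[[str], Any], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._loader = loader          # reads from the system of record
        self._ttl = ttl_seconds        # upper bound on staleness
        self._clock = clock            # injectable so tests can control time
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]            # fresh hit
        value = self._loader(key)      # miss or expired: go to the source
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key: str) -> None:
        """Call from the write path so readers do not wait for the TTL."""
        self._entries.pop(key, None)
```

With cache-aside the application owns both the read-through and the invalidation; a write-through design would instead update the cache as part of every write.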

4) Reliability and Failure Strategy

  • Use strong write constraints (transactions or conditional writes) and explicit backup/restore strategy.
  • Track indexing lag and support reindex from source of truth.
  • Bound staleness with TTL + invalidation hooks for critical entities.
  • Design region failover and data residency controls as first-class requirements.
  • Apply strict input validation and backward-compatible versioning.
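
As one way to picture a strong write constraint for bookings, the sketch below stores one row per booked night and lets a primary key on (listing, night) reject overlapping reservations inside a single transaction. This is a swapped-in technique for illustration: it uses SQLite from the Python standard library, which has no SELECT ... FOR UPDATE, and the `booked_nights` table and `reserve` function are hypothetical names.

```python
import sqlite3
from datetime import date, timedelta
from typing import Iterator

def open_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE booked_nights (
               listing_id INTEGER NOT NULL,
               night      TEXT    NOT NULL,
               booking_id TEXT    NOT NULL,
               PRIMARY KEY (listing_id, night)
           )"""
    )
    return conn

def nights(check_in: date, check_out: date) -> Iterator[date]:
    """Yield each night of the stay; the check-out day itself is not booked."""
    day = check_in
    while day < check_out:
        yield day
        day += timedelta(days=1)

def reserve(conn: sqlite3.Connection, listing_id: int, booking_id: str,
            check_in: date, check_out: date) -> bool:
    """Reserve all nights atomically; return False if any night is taken."""
    rows = [(listing_id, n.isoformat(), booking_id)
            for n in nights(check_in, check_out)]
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.executemany(
                "INSERT INTO booked_nights (listing_id, night, booking_id) "
                "VALUES (?, ?, ?)",
                rows,
            )
        return True
    except sqlite3.IntegrityError:
        return False  # primary key conflict: at least one night already booked
```

Because the constraint lives in the database, two concurrent requests for the same night cannot both commit, and a failed attempt leaves no partial rows behind.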

5) Validation Plan

  • Run one peak-load test, one dependency-degradation test, and one failover test.
  • Verify idempotency for all retried writes and async consumers.
  • Track user-facing SLOs first: p95 latency, error rate, and successful throughput.
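
The idempotency check in this plan can be exercised with a small harness. The sketch below is a hypothetical in-memory consumer (`PayoutLedger` is an invented name) that records each idempotency key alongside its result, so a redelivered message returns the original result instead of applying the change twice. A real consumer would need to persist the key and the state change in the same transaction.

```python
from dataclasses import dataclass, field
from typing import Dict

@dataclass
class PayoutLedger:
    """Hypothetical async consumer that applies each message at most once."""
    balances: Dict[str, int] = field(default_factory=dict)
    processed: Dict[str, int] = field(default_factory=dict)  # key -> result

    def apply(self, idempotency_key: str, host_id: str, amount_cents: int) -> int:
        # A retried or redelivered message replays the stored result.
        if idempotency_key in self.processed:
            return self.processed[idempotency_key]
        new_balance = self.balances.get(host_id, 0) + amount_cents
        self.balances[host_id] = new_balance
        self.processed[idempotency_key] = new_balance
        return new_balance

def verify_idempotent(ledger: PayoutLedger) -> bool:
    """Deliver the same message twice and confirm state changes only once."""
    first = ledger.apply("payout-001", "host-42", 12_500)
    second = ledger.apply("payout-001", "host-42", 12_500)  # simulated retry
    return first == second and ledger.balances["host-42"] == first
```

Running the same check against every retried write path is a cheap way to catch double-charges and double-payouts before a load test does.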

6) Trade-offs to Call Out in Interviews

  • Databases: SQL gives stronger transactional guarantees; NoSQL often gives better write scaling and flexibility.
  • Search: Search index gives rich querying but introduces eventual consistency and index ops overhead.
  • Caching: Higher hit rate cuts latency/cost, but stale data and invalidation bugs become primary risks.
  • Geo Distribution: Global latency improves, but cross-region consistency and operations become harder.
  • API Design: Rich APIs improve developer speed but can create long-term compatibility burden.

Practical Notes

  • Search: Elasticsearch with geospatial queries (geo_bounding_box for map view). Denormalize listing + host + review data into a search document.
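  • Availability: store booked date ranges per listing in a relational DB. Inside the booking transaction, use SELECT ... FOR UPDATE to lock the listing's calendar rows so the availability check and the reservation happen atomically, preventing double-booking.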
  • Map-based search: use geohash-based clustering to group nearby listings into markers when zoomed out. Switch to individual pins when zoomed in.
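
The map clustering note can be sketched as follows: encode each listing's coordinates as a geohash, group listings by a prefix whose length depends on zoom level, and emit one marker per cell. The encoder follows the standard geohash bit-interleaving scheme; the `cluster` function and its marker shape are assumptions made for this example, and how zoom level maps to precision is left to the caller.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet

def geohash(lat: float, lon: float, precision: int) -> str:
    """Encode a coordinate as a geohash string of the given length."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars: List[str] = []
    bits, bit_count, use_lon = 0, 0, True  # bits alternate, longitude first
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            bit = lon >= mid
            lon_lo, lon_hi = (mid, lon_hi) if bit else (lon_lo, mid)
        else:
            mid = (lat_lo + lat_hi) / 2
            bit = lat >= mid
            lat_lo, lat_hi = (mid, lat_hi) if bit else (lat_lo, mid)
        bits = (bits << 1) | int(bit)
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:                 # every 5 bits become one character
            chars.append(BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)

def cluster(listings: Iterable[Tuple[str, float, float]],
            precision: int) -> Dict[str, dict]:
    """Group (listing_id, lat, lon) tuples into one marker per geohash cell."""
    cells: Dict[str, List[Tuple[float, float]]] = defaultdict(list)
    for _listing_id, lat, lon in listings:
        cells[geohash(lat, lon, precision)].append((lat, lon))
    return {
        cell: {
            "count": len(points),
            "lat": sum(p[0] for p in points) / len(points),  # centroid
            "lon": sum(p[1] for p in points) / len(points),
        }
        for cell, points in cells.items()
    }
```

A short prefix (low precision) yields a few large cells for zoomed-out views; a longer prefix splits them until each cell holds few enough listings to draw as individual pins.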

Reference Solution

Why This Solution Works

Request path: The solution keeps ingress, service logic, and stateful dependencies separated so each layer can scale independently.

Reference flow: Web Clients -> DNS -> Load Balancer -> API Gateway -> Core Service -> Primary SQL DB -> Read Model DB -> Redis Cache

Design strengths

  • Cache sits on the read path to absorb repeated queries and keep DB pressure stable.
  • Async queue/event bus isolates bursty workloads and supports retries without blocking synchronous requests.

Interview defense

  • This design makes bottlenecks explicit (ingress, core compute, persistence, async workers).
  • It supports progressive scaling without re-architecting the core request path.
  • It keeps correctness-sensitive state changes in durable systems while offloading background work asynchronously.