Medium · Intermediate

Appointment Booking System

Databases · API Design · Caching · Notifications

Problem Statement

BookNow is building an appointment scheduling platform (like Calendly meets salon booking). Service businesses (doctors, salons, consultants) use it to manage appointments. Features:

- Availability calendar - providers set their available time slots (e.g., Mon-Fri 9 AM-5 PM, 30-min slots). Block off holidays, lunch, and personal time.
- Online booking - customers browse available slots and book appointments. The system must prevent double-bookings with concurrent access.
- Buffer time - configurable gaps between appointments (e.g., 15 minutes between haircuts for cleanup).
- Multiple service types - a salon offers "Haircut (30 min)", "Color (90 min)", "Blowout (45 min)" - slot length varies by service.
- Reminders - send email and SMS reminders 24 hours and 1 hour before the appointment.
- Cancellation & rescheduling - customers can cancel/reschedule up to 24 hours before. Released slots become available again.
- Waitlist - when a popular provider is fully booked, customers join a waitlist and get notified if a slot opens.
- Multi-location - a business with 3 locations, each with its own providers and calendars.

Targeting 10,000 businesses with 500,000 bookings per day.

What You'll Learn

Design an appointment scheduling platform with availability management, booking, reminders, and waitlists for service businesses. Build this architecture under realistic production constraints, then validate tradeoffs in the design lab simulation.

Constraints

Businesses: ~10,000
Providers per business: ~10 avg
Bookings per day: ~500,000
Double-booking tolerance: Zero
Booking confirmation latency: < 1 second
Reminder delivery accuracy: > 99%
Availability target: 99.9%

Approach

Interview-Ready Approach

1) Clarify Scope and SLOs

  • Problem statement: Design an appointment scheduling platform with availability management, booking, reminders, and waitlists for service businesses.
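  • Design for a peak load of roughly 100 RPS, including burst headroom. 500,000 bookings per day averages only about 6 booking writes per second, so the target leaves room for availability reads and bursts.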
  • Businesses: ~10,000
  • Providers per business: ~10 avg
  • Bookings per day: ~500,000
  • Double-booking tolerance: Zero
  • Booking confirmation latency: < 1 second

2) Capacity Planning Method

  • Convert traffic and growth constraints into request rate, storage growth, and concurrency budgets.
  • Keep at least 2-3x safety margin per tier (ingress, compute, storage, async workers).
  • Reserve explicit latency budgets per hop so p95 can be defended in review.
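
As a rough worked example of the first bullet, the sketch below converts the stated 500,000 bookings per day into request rates. Only the daily booking count comes from the constraints; the peak factor and the reads-per-booking ratio are illustrative assumptions that would need to be replaced with measured values.

```python
# Back-of-envelope capacity sketch.
# Only BOOKINGS_PER_DAY is given; PEAK_FACTOR and READS_PER_BOOKING are assumed.
BOOKINGS_PER_DAY = 500_000
SECONDS_PER_DAY = 24 * 60 * 60
PEAK_FACTOR = 3          # assumption: bookings cluster in business hours
READS_PER_BOOKING = 5    # assumption: availability lookups per completed booking


def capacity_estimate(bookings_per_day: int = BOOKINGS_PER_DAY) -> dict:
    """Convert a daily booking count into average and peak request rates."""
    avg_write_rps = bookings_per_day / SECONDS_PER_DAY
    peak_write_rps = avg_write_rps * PEAK_FACTOR
    peak_read_rps = peak_write_rps * READS_PER_BOOKING
    return {
        "avg_write_rps": round(avg_write_rps, 1),
        "peak_write_rps": round(peak_write_rps, 1),
        "peak_total_rps": round(peak_write_rps + peak_read_rps, 1),
    }


if __name__ == "__main__":
    print(capacity_estimate())
```

Under these assumed ratios the peak total comes out slightly above 100 RPS, which is consistent with the peak load target in section 1. Different assumptions would move the number, so the ratios are worth stating explicitly in an interview.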

3) Architecture Decisions

  • Databases: Define a clear system-of-record and design read/write paths separately before adding optimizations.
  • API Design: Standardize API boundaries, idempotency keys, pagination, and error contracts first.
  • Caching: Put cache on hot read paths first and pick cache-aside or write-through explicitly.
  • Notifications: Model notifications as event-driven fanout with per-channel workers (email and SMS, matching the reminder requirements).
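
To make the caching bullet concrete, here is a minimal cache-aside sketch for availability reads. An in-process dict stands in for Redis, and the class name, the `loader` callback, and the 30-second TTL are all hypothetical choices for illustration.

```python
import time
from typing import Callable, Dict, List, Tuple


class AvailabilityCache:
    """Cache-aside with a TTL; an in-process dict stands in for Redis."""

    def __init__(
        self,
        loader: Callable[[str, str], List[str]],
        ttl_seconds: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader      # reads slots from the system of record
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[Tuple[str, str], Tuple[float, List[str]]] = {}

    def get_slots(self, provider_id: str, day: str) -> List[str]:
        key = (provider_id, day)
        entry = self._entries.get(key)
        if entry is not None and entry[0] > self._clock():
            return entry[1]                      # fresh cache hit
        slots = self._loader(provider_id, day)   # miss or expired: go to the DB
        self._entries[key] = (self._clock() + self._ttl, slots)
        return slots

    def invalidate(self, provider_id: str, day: str) -> None:
        """Call after a booking or cancellation changes this provider's day."""
        self._entries.pop((provider_id, day), None)
```

In a design like this the cache only serves browsing. The booking write would still need to check the slot in the system of record, because a cached view can be stale by up to the TTL.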

4) Reliability and Failure Strategy

  • Use strong write constraints (transactions or conditional writes) and explicit backup/restore strategy.
  • Apply strict input validation and backward-compatible versioning.
  • Bound staleness with TTL + invalidation hooks for critical entities.
  • Track delivery state machine and dead-letter undeliverable events.
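
The "conditional writes" option in the first bullet can be sketched as a single guarded UPDATE. This example uses Python's built-in sqlite3 module purely as a stand-in database; the `slots` table and its columns are hypothetical, and a production store would have its own conditional-write syntax.

```python
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    # Hypothetical schema: one row per bookable slot, NULL means free.
    conn.execute(
        "CREATE TABLE slots (slot_id TEXT PRIMARY KEY, booked_by TEXT)"
    )


def book_slot(conn: sqlite3.Connection, slot_id: str, customer_id: str) -> bool:
    """Return True if this call won the slot, False if it was already taken."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            # The WHERE clause is the condition: the row only changes
            # if nobody has booked it yet.
            "UPDATE slots SET booked_by = ? "
            "WHERE slot_id = ? AND booked_by IS NULL",
            (customer_id, slot_id),
        )
        return cur.rowcount == 1


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_schema(conn)
    conn.execute("INSERT INTO slots (slot_id) VALUES ('p1-0900')")
    print(book_slot(conn, "p1-0900", "alice"))
    print(book_slot(conn, "p1-0900", "bob"))
```

Because the check and the write happen in one statement, two concurrent requests cannot both see the slot as free and both succeed; the loser sees zero affected rows and can be shown an "already booked" error.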

5) Validation Plan

  • Run one peak-load test, one dependency-degradation test, and one failover test.
  • Verify idempotency for all retried writes and async consumers.
  • Track user-facing SLOs first: p95 latency, error rate, and successful throughput.
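
The idempotency check in the second bullet could be exercised against a handler shaped like the sketch below. The class, the `create_booking` callback, and the in-memory result store are hypothetical; the sketch is single-threaded, and a real service would need to record the key and the booking atomically in durable storage.

```python
from typing import Callable, Dict


class IdempotentBookingHandler:
    """Replays the stored result when the same idempotency key is retried."""

    def __init__(self, create_booking: Callable[[dict], dict]) -> None:
        self._create_booking = create_booking
        self._results: Dict[str, dict] = {}  # stand-in for a durable key store

    def handle(self, idempotency_key: str, request: dict) -> dict:
        if idempotency_key in self._results:
            # Retry of a request we already processed: do not book again.
            return self._results[idempotency_key]
        result = self._create_booking(request)
        self._results[idempotency_key] = result
        return result
```

A validation test would then send the same key twice and assert that only one booking was created and both responses are identical.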

6) Trade-offs to Call Out in Interviews

  • Databases: SQL gives stronger transactional guarantees; NoSQL often gives better write scaling and flexibility.
  • API Design: Rich APIs improve developer speed but can create long-term compatibility burden.
  • Caching: Higher hit rate cuts latency/cost, but stale data and invalidation bugs become primary risks.
  • Notifications: Multi-channel coverage increases reach but adds per-channel failure modes and policy complexity.

Practical Notes

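  • On a relational store, use pessimistic locking (SELECT FOR UPDATE on the slot row, inside the booking transaction) to prevent concurrent users from booking the same slot. On a store without row locks, such as the NoSQL primary in the reference flow, a conditional write serves the same purpose.
  • Store availability as a list of time ranges per provider per day. When a booking is made, subtract the booked interval, plus any buffer time, from the available ranges.
  • Reminders: a scheduled job scans for appointments due for reminders in the next few minutes and enqueues messages to an SMS/email queue.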
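
The time-range approach to availability can be sketched as follows. Ranges are half-open `(start, end)` pairs in minutes since midnight; that representation, the function name, and the choice to clip the buffer at the end of the free range are assumptions made for this example.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # half-open [start, end), minutes since midnight


def subtract_booking(
    free: List[Range], start: int, duration: int, buffer: int = 0
) -> List[Range]:
    """Return the free ranges left after booking `duration` minutes at `start`.

    The appointment itself must fit inside one free range. The trailing
    buffer is also removed, but is clipped at the end of that range.
    Raises ValueError if the appointment does not fit.
    """
    end = start + duration
    for i, (lo, hi) in enumerate(free):
        if lo <= start and end <= hi:
            blocked_end = min(end + buffer, hi)
            pieces: List[Range] = []
            if lo < start:
                pieces.append((lo, start))        # time left before the booking
            if blocked_end < hi:
                pieces.append((blocked_end, hi))  # time left after the buffer
            return free[:i] + pieces + free[i + 1:]
    raise ValueError("requested time is not available")


if __name__ == "__main__":
    day = [(9 * 60, 17 * 60)]  # 9 AM to 5 PM
    # 30-minute haircut at 10:00 with a 15-minute cleanup buffer
    print(subtract_booking(day, 10 * 60, 30, buffer=15))
```

Cancellation would be the inverse operation: insert the released range back and merge it with its neighbors. This in-memory function says nothing about concurrency, so the update still needs to run under the locking or conditional-write scheme chosen for the store.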

Reference Solution

Why This Solution Works

Request path: The solution keeps ingress, service logic, and stateful dependencies separated so each layer can scale independently.

Reference flow: Web Clients -> Load Balancer -> API Gateway -> API Service -> Primary NoSQL DB -> Redis Cache -> Message Queue -> Background Workers

Design strengths

  • Cache sits on the read path to absorb repeated queries and keep DB pressure stable.
  • Async queue/event bus isolates bursty workloads and supports retries without blocking synchronous requests.
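
As one illustration of the async path, the reminder scan described under Practical Notes might look like the sketch below. The `Appointment` dataclass, the five-minute window, and the in-memory `sent` set are hypothetical; the 24-hour and 1-hour offsets come from the problem statement.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Iterable, List, Set, Tuple

# Reminder lead times from the problem statement.
REMINDER_OFFSETS = (timedelta(hours=24), timedelta(hours=1))


@dataclass
class Appointment:
    appointment_id: str
    starts_at: datetime
    sent: Set[timedelta] = field(default_factory=set)  # offsets already enqueued


def scan_due_reminders(
    appointments: Iterable[Appointment],
    now: datetime,
    window: timedelta = timedelta(minutes=5),
) -> List[Tuple[str, timedelta]]:
    """Return (appointment_id, offset) pairs due in [now, now + window).

    Each returned pair is marked as sent so an overlapping later scan
    does not enqueue the same reminder twice.
    """
    due: List[Tuple[str, timedelta]] = []
    for appt in appointments:
        for offset in REMINDER_OFFSETS:
            remind_at = appt.starts_at - offset
            if offset not in appt.sent and now <= remind_at < now + window:
                appt.sent.add(offset)
                due.append((appt.appointment_id, offset))
    return due
```

The returned pairs would be pushed to the message queue for the email and SMS workers. A scan that runs late would skip reminders whose time has already passed, so a real job would likely need a catch-up query and a durable "sent" marker to reach the stated delivery accuracy.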

Interview defense

  • This design makes bottlenecks explicit (ingress, core compute, persistence, async workers).
  • It supports progressive scaling without re-architecting the core request path.
  • It keeps correctness-sensitive state changes in durable systems while offloading background work asynchronously.