Difficulty: Medium / Intermediate

Content Moderation Pipeline

Topics: Message Queues, Databases, API Design, Storage, Monitoring

Problem Statement

SafeGuard is building a content moderation pipeline for a social media platform. Every piece of user-generated content must be screened before appearing publicly. The system handles:

- Text moderation: detect hate speech, harassment, spam, and misinformation in post text and comments using NLP models. Flag with confidence scores.
- Image moderation: detect nudity, violence, graphic content, and banned symbols in uploaded images using a vision model.
- Video moderation: sample frames from uploaded videos and run image classification. Also analyze audio transcripts for harmful speech.
- Confidence-based routing: high-confidence violations (> 95%) are auto-removed. Low-confidence flags (50-95%) go to a human review queue. Content scoring < 50% passes automatically.
- Human review: a moderation dashboard where human reviewers see flagged content, make a judgment (approve/remove/escalate), and provide a reason.
- Appeals: users can appeal a removal. Appeals go to a senior reviewer.
- Policy updates: when moderation policies change, the system can retroactively re-evaluate recently published content.
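The confidence-based routing rules above can be sketched as a single decision function. Thresholds come from the problem statement; the function and label names are illustrative:

```python
def route(confidence: float) -> str:
    """Route a moderated item by model confidence that it violates policy.

    Thresholds from the problem statement:
      > 0.95       -> auto-remove
      0.50 - 0.95  -> human review queue
      < 0.50       -> publish automatically
    """
    if confidence > 0.95:
        return "auto_remove"
    if confidence >= 0.50:
        return "human_review"
    return "auto_approve"
```

In practice each content type (text, image, video) would produce its own score, and the router would act on the maximum violation confidence across classifiers.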

Process 10 million content items per day with a median review time of 5 minutes for queued items.

What You'll Learn

Design a content moderation system that screens text, images, and video using AI + human review for a social platform. Build this architecture under realistic production constraints, then validate tradeoffs in the design lab simulation.


Constraints

Content items/day: ~10,000,000
Auto-moderation latency: < 30 seconds
Human review queue time: < 30 minutes
False positive rate: < 1%
False negative rate: < 0.1%
Human reviewers: ~500
Availability target: 99.95%

Interview-Ready Approach

1) Clarify Scope and SLOs

  • Problem statement: Design a content moderation system that screens text, images, and video using AI + human review for a social platform.
  • Design for a peak-load target of ~579 RPS: 10,000,000 items/day ≈ 116 RPS on average, with ~5x burst headroom.
  • Content items/day: ~10,000,000
  • Auto-moderation latency: < 30 seconds
  • Human review queue time: < 30 minutes
  • False positive rate: < 1%
  • False negative rate: < 0.1%

2) Capacity Planning Method

  • Convert traffic and growth constraints into request rate, storage growth, and concurrency budgets.
  • Keep at least 2-3x safety margin per tier (ingress, compute, storage, async workers).
  • Reserve explicit latency budgets per hop so p95 can be defended in review.
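The capacity math behind the peak RPS target is a few lines of arithmetic. The 5% human-review share below is an assumption for illustration, not a figure from the problem statement:

```python
ITEMS_PER_DAY = 10_000_000
SECONDS_PER_DAY = 86_400
REVIEWERS = 500

avg_rps = ITEMS_PER_DAY / SECONDS_PER_DAY   # ~116 items/sec on average
peak_rps = avg_rps * 5                      # ~579 RPS with 5x burst headroom

# Review-queue sizing (assumed: ~5% of items land in the 50-95% band):
review_share = 0.05
queued_per_day = ITEMS_PER_DAY * review_share     # 500,000 items/day
items_per_reviewer = queued_per_day / REVIEWERS   # 1,000 items per reviewer/day
```

A figure like 1,000 items per reviewer per day (~2 per minute over an 8-hour shift) is a useful sanity check on whether 500 reviewers can hold the < 30 minute queue-time SLO.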

3) Architecture Decisions

  • Message Queues: Move non-blocking and retry-heavy work to async consumers with explicit retry and DLQ policies.
  • Databases: Define a clear system-of-record and design read/write paths separately before adding optimizations.
  • API Design: Standardize API boundaries, idempotency keys, pagination, and error contracts first.
  • Storage: Use object storage for large blobs and keep metadata/authorization separate in the API tier.
  • Monitoring: Instrument golden signals (latency, traffic, errors, saturation) per tier and per tenant/domain.

4) Reliability and Failure Strategy

  • Guarantee idempotent consumers and trace every message with correlation IDs.
  • Use strong write constraints (transactions or conditional writes) and explicit backup/restore strategy.
  • Apply strict input validation and backward-compatible versioning.
  • Enforce lifecycle policies, retention tiers, and checksum validation.
  • Alert on user-impact SLOs, not only infrastructure metrics.
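The idempotent-consumer requirement above can be sketched as dedup on a correlation ID. This is a minimal in-memory sketch; a production consumer would back the seen-set with a durable store (e.g. a database unique key or a conditional write), and the `correlation_id` field name is an assumption:

```python
processed: set[str] = set()  # sketch only; use a durable store in production

def handle_message(message: dict) -> bool:
    """Process a moderation event at most once per correlation ID.

    Returns True if the message was processed, False if it was a
    duplicate redelivery (side effects are skipped).
    """
    cid = message["correlation_id"]
    if cid in processed:
        return False  # duplicate: safe to ack without reprocessing
    # ... run classifiers / write the verdict here ...
    processed.add(cid)
    return True
```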

5) Validation Plan

  • Run one peak-load test, one dependency-degradation test, and one failover test.
  • Verify idempotency for all retried writes and async consumers.
  • Track user-facing SLOs first: p95 latency, error rate, and successful throughput.

6) Trade-offs to Call Out in Interviews

  • Message Queues: Async pipelines absorb spikes well, but increase eventual-consistency complexity.
  • Databases: SQL gives stronger transactional guarantees; NoSQL often gives better write scaling and flexibility.
  • API Design: Rich APIs improve developer speed but can create long-term compatibility burden.
  • Storage: Object storage is cheap and durable, but random low-latency reads are weaker than databases/caches.
  • Monitoring: Deep observability speeds incident response but raises ingestion and tooling costs.

Practical Notes

  • Design as a pipeline: upload → text classifier → image classifier → decision engine → publish or queue for human review.
  • Use a message queue between stages so each classifier can scale independently.
  • Human review: distribute items round-robin with a maximum queue depth per reviewer. Track inter-rater reliability.
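The round-robin-with-depth-cap assignment described above can be sketched as follows (class name and the 20-item cap are illustrative assumptions):

```python
from collections import deque

MAX_QUEUE_DEPTH = 20  # illustrative per-reviewer cap

class ReviewDispatcher:
    """Round-robin assignment of flagged items with a per-reviewer depth cap."""

    def __init__(self, reviewer_ids):
        self.queues = {r: deque() for r in reviewer_ids}
        self.order = list(reviewer_ids)
        self.next_idx = 0

    def assign(self, item):
        """Assign item to the next reviewer with capacity; None if all are full."""
        for _ in range(len(self.order)):
            reviewer = self.order[self.next_idx]
            self.next_idx = (self.next_idx + 1) % len(self.order)
            if len(self.queues[reviewer]) < MAX_QUEUE_DEPTH:
                self.queues[reviewer].append(item)
                return reviewer
        return None  # backpressure: leave the item in the shared queue
```

Returning None when every reviewer is at capacity gives the shared queue explicit backpressure instead of silently growing per-reviewer backlogs, which protects the < 30 minute queue-time SLO.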


Reference Solution

Why This Solution Works

Request path: The solution keeps ingress, service logic, and stateful dependencies separated so each layer can scale independently.

Reference flow: Web Clients -> Load Balancer -> API Gateway -> API Service -> Primary SQL DB -> Message Queue -> Background Workers -> Object Storage

Design strengths

  • Async queue/event bus isolates bursty workloads and supports retries without blocking synchronous requests.
  • Monitoring and logs are wired in from day one for rapid incident triage.

Interview defense

  • This design makes bottlenecks explicit (ingress, core compute, persistence, async workers).
  • It supports progressive scaling without re-architecting the core request path.
  • It keeps correctness-sensitive state changes in durable systems while offloading background work asynchronously.