SafeGuard is building a content moderation pipeline for a social media platform. Every piece of user-generated content must be screened before appearing publicly. The system handles:
- Text moderation - detect hate speech, harassment, spam, and misinformation in post text and comments using NLP models; flag with confidence scores.
- Image moderation - detect nudity, violence, graphic content, and banned symbols in uploaded images using a vision model.
- Video moderation - sample frames from uploaded videos and run image classification; also analyze audio transcripts for harmful speech.
- Confidence-based routing - high-confidence violations (> 95%) are auto-removed; low-confidence flags (50-95%) go to a human review queue; content scoring below 50% passes automatically.
- Human review - a moderation dashboard where human reviewers see flagged content, make a judgment (approve/remove/escalate), and provide a reason.
- Appeals - users can appeal a removal; appeals go to a senior reviewer.
- Policy updates - when moderation policies change, the system can retroactively re-evaluate recently published content.
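The confidence-based routing rule above can be sketched directly from the stated thresholds. This is a minimal illustration, not a prescribed implementation; the `Verdict` names and the decision that an exact 95% score goes to review (the spec's "50-95%" band) are assumptions.

```python
from enum import Enum

class Verdict(Enum):
    AUTO_REMOVE = "auto_remove"    # > 95% confidence
    HUMAN_REVIEW = "human_review"  # 50-95% confidence
    PASS = "pass"                  # < 50% confidence

AUTO_REMOVE_THRESHOLD = 0.95
REVIEW_THRESHOLD = 0.50

def route(confidence: float) -> Verdict:
    """Route a moderation score in [0.0, 1.0] per the thresholds above."""
    if confidence > AUTO_REMOVE_THRESHOLD:
        return Verdict.AUTO_REMOVE
    if confidence >= REVIEW_THRESHOLD:
        return Verdict.HUMAN_REVIEW
    return Verdict.PASS
```

Keeping the thresholds as named constants matters here: policy updates may retune them, and retroactive re-evaluation needs to know which threshold version scored each item.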
Process 10 million content items per day with a median review time of 5 minutes for queued items.
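A quick back-of-envelope check puts the 10-million-items/day target in perspective. The peak multiplier is an assumption for capacity planning, not part of the spec.

```python
ITEMS_PER_DAY = 10_000_000
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

avg_rate = ITEMS_PER_DAY / SECONDS_PER_DAY
print(f"sustained average: {avg_rate:.0f} items/sec")
# Social traffic is rarely uniform; assuming a 2-3x peak-to-average ratio,
# the pipeline should be sized for roughly 250-350 items/sec at peak.
```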
Design a content moderation system that screens text, images, and video using AI + human review for a social platform. Build this architecture under realistic production constraints, then validate tradeoffs in the design lab simulation.
Request path: The solution keeps ingress, service logic, and stateful dependencies separated so each layer can scale independently.
Reference flow: Web Clients -> Load Balancer -> API Gateway -> API Service -> Primary SQL DB -> Message Queue -> Background Workers -> Object Storage
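The hand-off between the API service and the background workers in the reference flow can be sketched as below. All names (`api_service_publish`, `score_content`, `moderation_queue`) are illustrative assumptions, and an in-process `queue.Queue` stands in for the real message queue; the DB write and object-storage steps are elided.

```python
import queue

# Stand-in for the message queue between the API service and workers.
moderation_queue: "queue.Queue[dict]" = queue.Queue()

def api_service_publish(post: dict) -> None:
    # In the real flow the post is first persisted to the primary SQL DB,
    # then a moderation job is enqueued so screening happens asynchronously.
    moderation_queue.put({"post_id": post["id"], "text": post["text"]})

def score_content(text: str) -> float:
    # Stub: a production worker would call the text/image/video models here.
    return 0.1

def background_worker() -> list:
    """Drain the queue and return (post_id, confidence) pairs."""
    results = []
    while not moderation_queue.empty():
        job = moderation_queue.get()
        results.append((job["post_id"], score_content(job["text"])))
    return results
```

Decoupling publish from scoring this way is what lets the queued-review path absorb model latency and bursts without blocking the user-facing request path.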