This challenge builds on Search Engine 1 - Web Crawler & Index. Complete it first for the best experience.
FindIt has expanded from developer docs to a full-scale web search engine indexing 1 billion pages. The system must now handle:
- Link-based ranking (PageRank) - compute a global authority score for every page based on the web link graph. This is a massive offline computation that runs periodically over billions of nodes and edges.
- Autocomplete / query suggestions - as the user types, suggest completions from a trie of popular queries (updated hourly). Autocomplete must respond in < 50 ms.
- Personalized results - use search history and click-through data to re-rank results per user.
- Spell correction - handle misspelled queries ("javscript tutorial" → "javascript tutorial").
- Multi-region deployment - serve search results from the nearest data center. Index replicas in each region.
- Query throughput - handle 100,000 search queries per second at peak.
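To make the link-based ranking requirement concrete, here is a minimal sketch of PageRank by power iteration on a toy in-memory graph. The function name `pagerank`, the damping factor of 0.85, the fixed iteration count, and the page names are all illustrative assumptions; a job over billions of nodes and edges would need a distributed batch framework rather than a single-process dictionary.

```python
from typing import Dict, List


def pagerank(links: Dict[str, List[str]], damping: float = 0.85,
             iterations: int = 50) -> Dict[str, float]:
    """Power-iteration PageRank over an in-memory adjacency list."""
    # Collect every page that appears as a link source or a link target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Rank held by dangling pages (no outlinks) is spread evenly.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {p: base for p in pages}
        # Each page splits its damped rank equally among its outlinks.
        for src, targets in links.items():
            if targets:
                share = damping * rank[src] / len(targets)
                for dst in targets:
                    new_rank[dst] += share
        rank = new_rank
    return rank


# Hypothetical four-page graph: page -> pages it links to.
toy_graph = {
    "home": ["docs", "blog"],
    "docs": ["home"],
    "blog": ["home", "docs"],
    "orphan": ["home"],
}
scores = pagerank(toy_graph)
```

In this toy graph, `home` ends up with the highest score because every other page links to it, while `orphan` has no inbound links and keeps only the baseline share. The scores sum to 1, so they can be read as a probability distribution over pages.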
This is a capstone-level challenge combining information retrieval, graph algorithms, ML ranking, and planetary-scale infrastructure.
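On the information-retrieval side, the spell-correction requirement can be sketched as a nearest-term lookup by edit distance. This is one simple baseline under stated assumptions, not the method the challenge prescribes: the helper names (`edit_distance`, `correct_term`, `correct_query`), the tiny `VOCAB` with made-up frequencies, and the distance limit of 2 are all hypothetical.

```python
from typing import Dict


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance using a rolling one-row DP table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or match at cost 0
            ))
        prev = cur
    return prev[-1]


def correct_term(term: str, vocab: Dict[str, int], max_distance: int = 2) -> str:
    """Return the closest vocabulary term, preferring more frequent terms on ties."""
    if term in vocab:
        return term
    best, best_key = term, None
    for word, freq in vocab.items():
        d = edit_distance(term, word)
        if d <= max_distance:
            key = (d, -freq)
            if best_key is None or key < best_key:
                best, best_key = word, key
    return best  # unchanged when nothing is close enough


def correct_query(query: str, vocab: Dict[str, int]) -> str:
    return " ".join(correct_term(t, vocab) for t in query.split())


# Hypothetical term frequencies, e.g. derived from the index or query logs.
VOCAB = {"javascript": 9000, "java": 7000, "tutorial": 5000, "tutorials": 1200}
corrected = correct_query("javscript tutorial", VOCAB)
```

Scanning the whole vocabulary per term is linear in vocabulary size, so at 100,000 queries per second a real system would likely narrow candidates first (for example with an n-gram index) before computing distances.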
Scale to 1 billion pages with PageRank, autocomplete, personalized results, and multi-region serving. Build this architecture under realistic production constraints, then validate tradeoffs in the design lab simulation.
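The autocomplete requirement (completions served from a trie of popular queries) can be sketched as follows. The class names `TrieNode` and `QueryTrie`, the sample queries, and their counts are assumptions for illustration only. This version walks the whole subtree under the prefix on every lookup; to stay within a 50 ms budget at scale, one common option is to cache the top-k completions at each node when the trie is rebuilt.

```python
import heapq
from typing import Dict, List, Tuple


class TrieNode:
    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.count = 0  # > 0 only where a complete query ends


class QueryTrie:
    def __init__(self) -> None:
        self.root = TrieNode()

    def add(self, query: str, count: int) -> None:
        """Insert a query with its popularity count (e.g. from hourly logs)."""
        node = self.root
        for ch in query:
            node = node.children.setdefault(ch, TrieNode())
        node.count += count

    def suggest(self, prefix: str, k: int = 3) -> List[str]:
        """Return up to k completions of prefix, most popular first."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        # Depth-first walk of the subtree below the prefix node.
        found: List[Tuple[int, str]] = []
        stack = [(node, prefix)]
        while stack:
            cur, text = stack.pop()
            if cur.count > 0:
                found.append((cur.count, text))
            for ch, child in cur.children.items():
                stack.append((child, text + ch))
        # Highest count first; ties broken alphabetically.
        top = heapq.nsmallest(k, found, key=lambda item: (-item[0], item[1]))
        return [text for _, text in top]


# Hypothetical popular queries with made-up counts.
trie = QueryTrie()
for q, c in [("javascript tutorial", 900), ("java stream api", 400),
             ("javascript array map", 650), ("python tutorial", 800)]:
    trie.add(q, c)

suggestions = trie.suggest("java", k=2)
```

With these sample counts, the prefix `java` yields `javascript tutorial` followed by `javascript array map`, and an unknown prefix yields an empty list.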
Request path: The solution keeps ingress, service logic, and stateful dependencies separated so each layer can scale independently.
Reference flow: Web Clients -> DNS -> Load Balancer -> Core Service -> Primary NoSQL DB -> Replica SQL DB -> Redis Cache -> Event Bus