Loading...

Real-Time News Event Detection Pipeline from Social Media Firehose

A hard Pipeline Design interview practice problem on DataDriven. Write and execute real pipeline design code with instant grading.

Domain
Pipeline Design
Difficulty
hard
Seniority
senior

Problem

We detect breaking news and real-world events from the full Twitter firehose and 1 million other data sources. When an earthquake happens or a building catches fire, we need to identify it from thousands of simultaneous posts and send a single validated alert to our clients - hedge funds, newsrooms, and government agencies - within 60 seconds. Right now our pipeline can detect events but the deduplication logic is brittle and we miss multi-source signals. Design the event detection and deduplication pipeline.

Practice This Problem

Solve this Pipeline Design problem with real code execution. DataDriven runs your solution and grades it instantly.