One Earthquake, Ten Thousand Tweets
A hard Pipeline Design interview practice problem on DataDriven. Write and execute real pipeline design code with instant grading.
- Domain: Pipeline Design
- Difficulty: hard
- Seniority: L5
Problem
We detect breaking news and real-world events from the full Twitter firehose and 1 million other data sources. When an earthquake happens or a building catches fire, we need to identify it from thousands of simultaneous posts and send a single validated alert to our clients (hedge funds, newsrooms, and government agencies) within 60 seconds. Right now our pipeline can detect events, but the deduplication logic is brittle and we miss multi-source signals. Design the event detection and deduplication pipeline.
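One possible starting point for the deduplication step is sketched below. It is not a reference solution. It assumes an upstream classifier has already tagged each post with an event type and a location, and it groups posts by event type and a coarse latitude/longitude grid cell within a fixed time window. An alert is emitted once per group, and only after the group has been corroborated by more than one source. The names `Post`, `Cluster`, `Deduplicator`, the source labels, and every threshold are hypothetical and chosen for illustration only.

```python
import math
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Post:
    source: str      # hypothetical source label, e.g. "twitter" or "seismic_feed"
    event_type: str  # assumed output of an upstream classifier
    lat: float
    lon: float
    ts: float        # event time, in seconds


@dataclass
class Cluster:
    first_ts: float
    sources: set = field(default_factory=set)
    count: int = 0
    alerted: bool = False


class Deduplicator:
    """Groups posts by (event type, grid cell) and alerts once per group."""

    def __init__(self, cell_deg=0.5, window_s=300.0, min_sources=2, min_posts=3):
        # All thresholds are illustrative assumptions, not tuned values.
        self.cell_deg = cell_deg
        self.window_s = window_s
        self.min_sources = min_sources
        self.min_posts = min_posts
        self.clusters = {}

    def _key(self, post):
        # Snap coordinates to a coarse grid so nearby posts share a key.
        cell = (math.floor(post.lat / self.cell_deg),
                math.floor(post.lon / self.cell_deg))
        return (post.event_type, cell)

    def ingest(self, post):
        """Returns an alert dict the first time a cluster is validated, else None."""
        key = self._key(post)
        cluster = self.clusters.get(key)
        # Start a fresh cluster if none exists or the old one has aged out.
        if cluster is None or post.ts - cluster.first_ts > self.window_s:
            cluster = Cluster(first_ts=post.ts)
            self.clusters[key] = cluster
        cluster.sources.add(post.source)
        cluster.count += 1
        validated = (len(cluster.sources) >= self.min_sources
                     and cluster.count >= self.min_posts)
        if validated and not cluster.alerted:
            cluster.alerted = True  # suppress duplicates for this cluster
            return {
                "event_type": post.event_type,
                "cell": key[1],
                "sources": sorted(cluster.sources),
                "posts": cluster.count,
                "latency_s": post.ts - cluster.first_ts,
            }
        return None


dedup = Deduplicator()
posts = [
    Post("twitter", "earthquake", 37.77, -122.42, 0.0),
    Post("twitter", "earthquake", 37.80, -122.40, 2.0),
    Post("seismic_feed", "earthquake", 37.75, -122.45, 5.0),
    Post("twitter", "earthquake", 37.78, -122.41, 6.0),
]
alerts = [a for a in map(dedup.ingest, posts) if a is not None]
print(alerts)
```

In this example the four posts fall into the same grid cell, so a single alert fires on the third post, when a second source corroborates the first two. The sketch has known gaps that a real design would likely need to address: events that straddle a grid-cell boundary split into two clusters, state lives in one process's memory, and late or out-of-order posts are not handled.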
Summary
The firehose is on. Separate signal from noise.