A clickstream pipeline at a streaming company suffered a six-hour outage
A medium Pipeline Design interview practice problem on DataDriven. Write and execute real pipeline design code with instant grading.
- Domain: Pipeline Design
- Difficulty: medium
Problem
A clickstream pipeline at a streaming company suffered a six-hour outage. A Geo-IP service deployed a config change that increased its p99 latency from 80 ms to 4 seconds. The pipeline had no retry policy on the Geo-IP call, no circuit breaker, no dead-letter queue (DLQ), and no backpressure on Kafka beyond retention. Three hours of data were permanently lost when Kafka retention dropped messages before they could be enriched. Work through the postmortem by adding the four missing patterns named above (retry policy, circuit breaker, DLQ, and backpressure) so that the impact of the next latency spike is bounded.
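As a starting point, the four patterns could be sketched roughly as follows. This is a minimal in-memory illustration, not the pipeline's actual code: `CircuitBreaker`, `with_retry`, `process`, `try_enqueue`, and every threshold value are hypothetical names and numbers chosen for the example. It also assumes the Geo-IP client (`lookup`) enforces its own deadline and raises an exception such as `TimeoutError` when a call is too slow, since a latency spike only triggers retries or the breaker if slow calls are turned into failures.

```python
import queue
import time
from dataclasses import dataclass
from typing import Optional


class CircuitOpenError(Exception):
    """Raised when the breaker is open and the call is skipped."""


@dataclass
class CircuitBreaker:
    failure_threshold: int = 3        # consecutive failures before opening (assumed value)
    reset_after_s: float = 30.0       # time to stay open before a trial call (assumed value)
    failures: int = 0
    opened_at: Optional[float] = None

    def call(self, fn, *args):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after_s:
                raise CircuitOpenError("geo-ip circuit open")
            self.opened_at = None     # half-open: let one trial call through
        try:
            result = fn(*args)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = time.monotonic()
            raise
        self.failures = 0             # a success closes the circuit again
        return result


def with_retry(fn, event, attempts=3, base_delay_s=0.1, sleep=time.sleep):
    """Retry with exponential backoff; never retry against an open circuit."""
    for attempt in range(attempts):
        try:
            return fn(event)
        except CircuitOpenError:
            raise
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay_s * 2 ** attempt)


def process(event, lookup, breaker, dlq, sleep=time.sleep):
    """Enrich one event; on failure, park it in the DLQ instead of losing it."""
    try:
        geo = with_retry(lambda e: breaker.call(lookup, e["ip"]), event, sleep=sleep)
        return {**event, "geo": geo}
    except Exception as exc:
        dlq.append({"event": event, "error": repr(exc)})
        return None


def try_enqueue(buffer, event):
    """Backpressure: return False when the bounded buffer is full."""
    try:
        buffer.put_nowait(event)
        return True
    except queue.Full:
        return False
```

In this sketch the retry budget bounds the time spent on each event, the breaker stops further calls to a degraded Geo-IP service, and the DLQ (a plain list here, standing in for something like a separate Kafka topic) keeps unenriched events available for replay. A `False` from `try_enqueue` on a `queue.Queue(maxsize=...)` is the signal for the consumer to stop polling until the buffer drains, rather than letting lag grow until retention deletes the data.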