A consumer-app metrics streaming aggregation must serve a real-time dashboard with sub-minute freshness
A medium Pipeline Design interview practice problem on DataDriven. Write and execute real pipeline design code with instant grading.
- Domain: Pipeline Design
- Difficulty: medium
Problem
A consumer-app metrics streaming aggregation must serve a real-time dashboard with sub-minute freshness. The team profiled event lateness for a full week: the 50th percentile is 2 seconds, the 95th is 30 seconds, the 99th is 8 minutes, and the 99.9th is 4 hours. Engine state cost grows linearly with allowed lateness, so a 4-hour budget is unaffordable on the current cluster.
The framing for this problem: pick a percentile target (the 95th for most consumer dashboards, the 99th for important reports), compute the state cost at that percentile, and plan a reconciliation pass for everything that arrives past the budget.
To solve it, set the aggregation's allowed lateness, stated in the node's name, to a value that matches the 95th or 99th percentile for this workload, and add a reconciliation pass node downstream that catches events arriving past the budget.
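The trade-off above can be sketched in a few lines of Python. This is a minimal illustration, not the platform's grading format or any particular engine's API. The names `pick_budget`, `route`, `streaming_aggregation`, and `reconciliation_pass` are hypothetical. It also assumes that lateness is measured as arrival time minus event time, and that state cost is expressed relative to the 4-hour (99.9th percentile) budget.

```python
from dataclasses import dataclass

# Lateness profile from the problem statement: percentile -> lateness in seconds.
LATENESS_PROFILE_S = {
    50.0: 2,
    95.0: 30,
    99.0: 8 * 60,
    99.9: 4 * 60 * 60,
}


@dataclass(frozen=True)
class LatenessBudget:
    percentile: float
    allowed_lateness_s: int
    # Assumption: cost is linear in allowed lateness, so it is reported
    # as a fraction of the cost of the 99.9th-percentile (4-hour) budget.
    relative_state_cost: float
    # Share of events that arrive past the budget and need reconciliation.
    reconciled_fraction: float


def pick_budget(percentile: float) -> LatenessBudget:
    """Turn a percentile target into an allowed-lateness budget."""
    allowed = LATENESS_PROFILE_S[percentile]
    worst = LATENESS_PROFILE_S[99.9]
    return LatenessBudget(
        percentile=percentile,
        allowed_lateness_s=allowed,
        relative_state_cost=allowed / worst,
        reconciled_fraction=round((100.0 - percentile) / 100.0, 6),
    )


def route(event_time_s: float, arrival_time_s: float, budget: LatenessBudget) -> str:
    """Name the (hypothetical) node that should handle an event."""
    lateness_s = arrival_time_s - event_time_s
    if lateness_s <= budget.allowed_lateness_s:
        return "streaming_aggregation"
    return "reconciliation_pass"


if __name__ == "__main__":
    for target in (95.0, 99.0):
        print(pick_budget(target))
```

Under these assumptions, a 95th-percentile budget of 30 seconds holds roughly 0.2% of the state a 4-hour budget would need and leaves about 5% of events to the reconciliation pass. A 99th-percentile budget of 8 minutes holds roughly 3.3% of that state and leaves about 1% of events to reconciliation.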