Five Times the Traffic, Five Times the Bill
A hard Pipeline Design mock interview question on DataDriven. Practice with AI-powered feedback, real code execution, and a hire/no-hire decision.
- Domain: Pipeline Design
- Difficulty: hard
- Seniority: L7
Interview Prompt
Our platform's data volumes are unpredictable: we see 5x swings between our quietest and busiest hours, with sudden spikes during product launches. We've been running a fixed-size Spark cluster that's over-provisioned 80% of the time and still falls behind during spikes. Operations needs to act on issues within a couple of minutes; analytics dashboards can tolerate up to 15 minutes of delay. A small fraction of incoming events arrive malformed and end up polluting the reports analysts read. The CFO wants the bill to stop swinging with traffic and to come down from where it is today. Design a pipeline that handles variable volume, serves both consumers on the right cadence, keeps bad events out of analytics, and keeps costs predictable.
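One requirement in the prompt, keeping malformed events out of analytics, can be illustrated with a minimal quarantine sketch. This is only one possible starting point, not a reference answer: the schema in `REQUIRED_FIELDS` and the function names `is_valid` and `route` are hypothetical, since the prompt does not say what an event looks like or what counts as malformed.

```python
from typing import Any, Iterable

# Hypothetical schema: the prompt does not specify event fields.
REQUIRED_FIELDS = {
    "event_id": str,
    "event_type": str,
    "timestamp": (int, float),
}


def is_valid(event: Any) -> bool:
    """Return True if the event is a dict with every required field of the expected type."""
    if not isinstance(event, dict):
        return False
    for field, expected in REQUIRED_FIELDS.items():
        value = event.get(field)
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            return False
    return True


def route(events: Iterable[Any]) -> tuple[list, list]:
    """Split events into (clean, quarantined) so bad records never reach analytics."""
    clean, quarantined = [], []
    for event in events:
        (clean if is_valid(event) else quarantined).append(event)
    return clean, quarantined
```

Keeping the quarantined records rather than dropping them leaves room to inspect and replay them later, which is a design choice a candidate might raise with the interviewer.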
Summary
Scale up when needed. Do not bankrupt the team.
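The summary can be read as "autoscale, but with a hard ceiling". The sketch below shows that idea under stated assumptions: the function `target_workers` and all of its parameters are hypothetical, and the per-worker throughput would have to be measured rather than guessed. It sizes the cluster to drain the current backlog within a deadline, then clamps the result between a floor and a cap so spend stays bounded.

```python
import math


def target_workers(
    backlog_events: int,
    events_per_worker_per_sec: float,
    drain_seconds: float,
    min_workers: int,
    max_workers: int,
) -> int:
    """Workers needed to drain the backlog within drain_seconds, clamped to [min, max].

    All parameters are illustrative assumptions, not values from the prompt.
    """
    if events_per_worker_per_sec <= 0 or drain_seconds <= 0:
        raise ValueError("throughput and drain window must be positive")
    if min_workers > max_workers:
        raise ValueError("min_workers must not exceed max_workers")
    needed = math.ceil(backlog_events / (events_per_worker_per_sec * drain_seconds))
    # The cap bounds the bill; the floor keeps baseline capacity warm.
    return max(min_workers, min(max_workers, needed))
```

For example, with a 120-second drain window (matching the couple of minutes operations can wait), an assumed 1,000 events per second per worker, and a backlog of 600,000 events, the function asks for 5 workers. A backlog ten times larger would hit the cap instead of scaling without limit, which makes the cap an explicit trade-off between latency during extreme spikes and a predictable bill.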
How This Interview Works
- Read the vague prompt (just like a real interview)
- Ask clarifying questions to the AI interviewer
- Write your pipeline design solution with real code execution
- Get instant feedback and a hire/no-hire decision