Five Times the Traffic, Five Times the Bill

A hard Pipeline Design interview practice problem on DataDriven. Write and execute real pipeline design code with instant grading.

Domain: Pipeline Design
Difficulty: hard
Seniority: L7

Problem

Our platform's data volumes are unpredictable: traffic swings 5x between our quietest and busiest hours, with sudden spikes during product launches. We've been running a fixed-size Spark cluster that is over-provisioned 80% of the time and still falls behind during spikes. Operations needs to act on issues within a couple of minutes; analytics dashboards tolerate up to 15 minutes of delay. A small fraction of incoming events arrive malformed and end up polluting the reports analysts read. The CFO wants the bill to stop swinging with traffic and to come down from where it is today. Design a pipeline that handles variable volume, serves both consumers on the right cadence, keeps bad events out of analytics, and keeps costs predictable.
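One way to read the "keeps bad events out of analytics" requirement is as a validation step that routes malformed events to a quarantine area instead of dropping them or letting them through. The sketch below is only an illustration of that idea in plain Python. The field names in `REQUIRED_FIELDS` and the helpers `is_well_formed` and `split_events` are hypothetical, since the problem does not define an event schema.

```python
from typing import Any

# Hypothetical schema: the problem statement does not say what an event looks like.
REQUIRED_FIELDS = {
    "event_id": str,
    "timestamp": (int, float),
    "event_type": str,
}


def is_well_formed(event: Any) -> bool:
    """Return True if the event is a dict with every required field of the expected type."""
    if not isinstance(event, dict):
        return False
    for field, expected in REQUIRED_FIELDS.items():
        value = event.get(field)
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            return False
    return True


def split_events(events: list) -> tuple[list, list]:
    """Split a batch into (valid, quarantined) without discarding anything."""
    valid, quarantined = [], []
    for event in events:
        (valid if is_well_formed(event) else quarantined).append(event)
    return valid, quarantined
```

Keeping the quarantined events, rather than deleting them, leaves room to inspect and replay them later. Only the valid list would feed the analytics tables.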

Summary

Scale up when needed. Do not bankrupt the team.
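The summary's two halves can be sketched as a sizing rule with a hard ceiling: scale with load, but never past a budgeted maximum. This is a minimal illustration, not a prescribed solution. The function `target_workers` and all of its parameters are assumptions, and the numbers in the usage lines are made up to mirror a 5x swing.

```python
import math


def target_workers(
    events_per_sec: float,
    per_worker_capacity: float,
    min_workers: int,
    max_workers: int,
) -> int:
    """Pick a worker count that follows load but stays inside a fixed budget band."""
    if per_worker_capacity <= 0:
        raise ValueError("per_worker_capacity must be positive")
    if min_workers > max_workers:
        raise ValueError("min_workers must not exceed max_workers")
    # Workers needed to keep up with the current arrival rate.
    needed = math.ceil(events_per_sec / per_worker_capacity)
    # The floor keeps the low-latency path warm; the ceiling bounds the bill.
    return max(min_workers, min(needed, max_workers))


# Hypothetical numbers: quiet hour at 1,000 events/s, launch spike at 5,000 events/s.
quiet = target_workers(1_000, 500, min_workers=2, max_workers=8)
spike = target_workers(5_000, 500, min_workers=2, max_workers=8)
```

When the ceiling binds, as it does for the spike above, the pipeline falls behind by design, so the queue in front of it has to absorb the backlog and the latency-sensitive consumer needs priority over the 15-minute one.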

