AWS Pipeline with Auto-Scaling and Cost Governance

A hard Pipeline Design interview practice problem on DataDriven. Write and execute real pipeline design code with instant grading.

Domain: Pipeline Design
Difficulty: hard
Seniority: staff

Problem

Our platform's data volumes are unpredictable: we see 5x swings between our quietest and busiest hours, with sudden spikes during product launches. We've been running a fixed-size Spark cluster that's over-provisioned 80% of the time and still falls behind during spikes. Design a data pipeline on AWS that handles variable volume efficiently, auto-scales without intervention, and keeps costs predictable.
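
To make the trade-off concrete, here is a minimal sizing sketch, not a reference solution. It compares the node-hours of a fixed cluster sized for peak against a cluster that follows load within a floor and a ceiling. The hourly load profile, the per-node capacity, and the function names are all hypothetical values chosen for illustration; only the 5x swing comes from the problem statement.

```python
import math

def nodes_needed(load, per_node_capacity, min_nodes, max_nodes):
    """Smallest node count that covers `load`, clamped to [min_nodes, max_nodes]."""
    wanted = math.ceil(load / per_node_capacity)
    return max(min_nodes, min(max_nodes, wanted))

def scaled_node_hours(hourly_load, per_node_capacity, min_nodes, max_nodes):
    """Total node-hours when the cluster is resized once per hour to match load."""
    return sum(
        nodes_needed(load, per_node_capacity, min_nodes, max_nodes)
        for load in hourly_load
    )

# Hypothetical 24-hour profile in arbitrary load units: 5x swing from 100 to 500.
hourly_load = [100] * 8 + [200] * 8 + [300] * 4 + [500] * 4
PER_NODE_CAPACITY = 50  # assumed load units one node can process per hour

# Fixed cluster: sized for the busiest hour and left running all day.
peak_nodes = math.ceil(max(hourly_load) / PER_NODE_CAPACITY)
fixed_hours = peak_nodes * len(hourly_load)

# Auto-scaled cluster: follows load between a floor and a ceiling.
scaled_hours = scaled_node_hours(hourly_load, PER_NODE_CAPACITY, min_nodes=2, max_nodes=10)

# A lower ceiling bounds the worst-case bill, at the price of falling behind at peak.
capped_hours = scaled_node_hours(hourly_load, PER_NODE_CAPACITY, min_nodes=2, max_nodes=8)

print(f"fixed:  {fixed_hours} node-hours")
print(f"scaled: {scaled_hours} node-hours")
print(f"capped: {capped_hours} node-hours")
```

The ceiling is the knob that ties scaling to cost governance: `max_nodes` times the number of hours is the most the cluster can ever cost, whatever the load does. A full design would also need to decide what happens to the backlog when load exceeds that ceiling.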
