DataDriven

A retailer's revenue pipeline runs Flink streaming with a $4,000/month bill, feeding a CFO dashboard

A medium Pipeline Design interview practice problem on DataDriven. Write and execute real pipeline design code with instant grading.

Domain: Pipeline Design
Difficulty: medium

Problem

A retailer's revenue pipeline runs Flink streaming at $4,000/month and feeds a dashboard that the CFO reads once each morning at 7am Pacific. The streaming pipeline was built without a cost conversation, and the low latency it provides has zero dollar value because the consumer reads only once a day. Apply the cost-story framing this section just taught: the latency value is zero and streaming costs 5-50x more than equivalent batch, so the right answer is to downgrade. Make four changes:

  • Replace the Flink streaming transform with a nightly batch transform (plain Spark, PySpark, or dbt are all batch tools that satisfy this).
  • Remove the local RocksDB state store, since batch maintains no inter-run state.
  • Tag the warehouse mart with slaFreshness < 24h to match the consumer's actual freshness need.
  • Remove the real-time slaFreshness from the streaming nodes.

The CFO dashboard's freshness need is unchanged; only the cost changes.
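The changes the problem asks for can be sketched in Python. This is a minimal illustration under stated assumptions, not DataDriven's grading format: the dictionary-based pipeline spec, the node ids, the `kind`, `engine`, `mode`, and `schedule` fields, and the functions `batch_cost_range` and `downgrade_to_batch` are all hypothetical. Only `slaFreshness`, Flink, RocksDB, the $4,000 bill, and the 5-50x multiplier come from the problem statement.

```python
from copy import deepcopy

STREAMING_MONTHLY_COST = 4000    # USD per month, from the problem statement
COST_MULTIPLIER_RANGE = (5, 50)  # streaming vs. equivalent batch, from the problem


def batch_cost_range(streaming_cost, multipliers=COST_MULTIPLIER_RANGE):
    """Return the (low, high) monthly batch cost implied by the multiplier range."""
    low_mult, high_mult = multipliers
    return streaming_cost / high_mult, streaming_cost / low_mult


# Hypothetical spec of the current pipeline (field names are assumptions).
pipeline = {
    "nodes": [
        {"id": "orders_source", "kind": "source"},
        {"id": "revenue_transform", "kind": "transform",
         "engine": "flink", "mode": "streaming", "slaFreshness": "real-time"},
        {"id": "transform_state", "kind": "state_store",
         "engine": "rocksdb", "slaFreshness": "real-time"},
        {"id": "revenue_mart", "kind": "warehouse_mart"},
        {"id": "cfo_dashboard", "kind": "dashboard"},
    ],
    "edges": [
        ("orders_source", "revenue_transform"),
        ("revenue_transform", "transform_state"),
        ("revenue_transform", "revenue_mart"),
        ("revenue_mart", "cfo_dashboard"),
    ],
}


def downgrade_to_batch(spec, batch_engine="dbt"):
    """Return a copy of the spec with the four changes applied."""
    new_spec = deepcopy(spec)
    # Batch keeps no inter-run state, so every state store is dropped.
    removed = {n["id"] for n in new_spec["nodes"] if n["kind"] == "state_store"}

    kept = []
    for node in new_spec["nodes"]:
        if node["id"] in removed:
            continue
        # Swap the streaming transform for a nightly batch transform.
        if node.get("mode") == "streaming":
            node["engine"] = batch_engine
            node["mode"] = "batch"
            node["schedule"] = "nightly"
        # Drop the real-time freshness tag wherever it appears.
        if node.get("slaFreshness") == "real-time":
            del node["slaFreshness"]
        # Tag the mart with the freshness the consumer actually needs.
        if node["kind"] == "warehouse_mart":
            node["slaFreshness"] = "< 24h"
        kept.append(node)

    new_spec["nodes"] = kept
    # Remove edges that pointed at a deleted state store.
    new_spec["edges"] = [
        (src, dst) for src, dst in new_spec["edges"]
        if src not in removed and dst not in removed
    ]
    return new_spec


if __name__ == "__main__":
    low, high = batch_cost_range(STREAMING_MONTHLY_COST)
    print(f"Implied batch cost: ${low:.0f} to ${high:.0f} per month")
    for node in downgrade_to_batch(pipeline)["nodes"]:
        print(node)
```

Taking the 5-50x multiplier at face value, a $4,000/month streaming bill implies an equivalent batch cost of roughly $80 to $800 per month. The dashboard node is untouched in the sketch, which mirrors the point that the consumer's freshness need does not change.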
