DataDriven

A retailer's revenue pipeline runs Flink streaming with a $4,000/month bill, feeding a CFO dashboard

A medium Pipeline Design mock interview question on DataDriven. Practice with AI-powered feedback, real code execution, and a hire/no-hire decision.

Domain: Pipeline Design
Difficulty: Medium

Interview Prompt

A retailer's revenue pipeline runs on Flink streaming at $4,000/month and feeds a dashboard that the CFO reads once each morning at 7am Pacific. The pipeline was built without a cost conversation, and the latency it provides has zero dollar value because its only consumer reads once a day. Apply the cost-story framing this section just taught: the latency value is zero, streaming costs 5-50x more than equivalent batch, so the right answer is to downgrade.

  1. Replace the Flink streaming transform with a nightly batch job (plain Spark, dbt, or PySpark are all batch tools that satisfy this).
  2. Remove the local RocksDB state store; a batch job maintains no state between runs.
  3. Tag the warehouse mart with slaFreshness < 24h to match the consumer's actual freshness need.
  4. Remove the real-time slaFreshness tag from the streaming nodes.

The CFO dashboard's freshness need is unchanged; only the cost changes.
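The reasoning in the prompt can be sketched in a few lines of plain Python. This is an illustrative sketch only, not the platform's reference solution: the function names, the order record fields (`store_id`, `order_date`, `amount_usd`), and the 2am load time are all hypothetical. The 5-50x ratio is the prompt's own figure rather than a measurement, and a real nightly job would run in Spark, PySpark, or dbt instead of a Python loop.

```python
from collections import defaultdict
from datetime import date, datetime, timedelta

STREAMING_MONTHLY_USD = 4000         # from the prompt
COST_RATIO_RANGE = (5, 50)           # the prompt's "5-50x" claim, not a measurement
SLA_FRESHNESS = timedelta(hours=24)  # mirrors the mart tag slaFreshness < 24h


def implied_batch_cost_range(streaming_usd, ratio_range=COST_RATIO_RANGE):
    """Return the (low, high) monthly batch cost implied by the ratio range."""
    low_ratio, high_ratio = ratio_range
    return streaming_usd / high_ratio, streaming_usd / low_ratio


def nightly_revenue(orders, run_date):
    """Recompute one day's revenue per store from the raw orders.

    Nothing carries over between runs, which is why the batch version
    needs no RocksDB-style state store.
    """
    totals = defaultdict(float)
    for order in orders:
        if order["order_date"] == run_date:
            totals[order["store_id"]] += order["amount_usd"]
    return dict(totals)


def meets_sla(last_loaded_at, read_at, sla=SLA_FRESHNESS):
    """True if the mart is fresher than the SLA at the moment it is read."""
    return read_at - last_loaded_at < sla


# Hypothetical sample data, for illustration only.
orders = [
    {"store_id": "s1", "order_date": date(2024, 1, 1), "amount_usd": 120.0},
    {"store_id": "s1", "order_date": date(2024, 1, 1), "amount_usd": 30.0},
    {"store_id": "s2", "order_date": date(2024, 1, 1), "amount_usd": 50.0},
    {"store_id": "s2", "order_date": date(2024, 1, 2), "amount_usd": 75.0},
]

print(implied_batch_cost_range(STREAMING_MONTHLY_USD))  # (80.0, 800.0)
print(nightly_revenue(orders, date(2024, 1, 1)))        # {'s1': 150.0, 's2': 50.0}

# Assumed schedule: load finishes at 2am, CFO reads at 7am (both Pacific).
print(meets_sla(datetime(2024, 1, 2, 2, 0), datetime(2024, 1, 2, 7, 0)))  # True
```

Under the prompt's ratio, the equivalent batch bill would land somewhere between $80 and $800 a month, and a job that finishes any time before the 7am read still satisfies the < 24h tag.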

How This Interview Works

  1. Read the vague prompt (just like a real interview)
  2. Ask clarifying questions to the AI interviewer
  3. Write your pipeline design solution with real code execution
  4. Get instant feedback and a hire/no-hire decision
