DataDriven

A medium Pipeline Design mock interview question on DataDriven. Practice with AI-powered feedback, real code execution, and a hire/no-hire decision.

Domain: Pipeline Design
Difficulty: Medium

Interview Prompt

A retailer's orders pipeline processes 1 billion events per day at peak volume, and an executive dashboard reads the result at 7am Pacific each morning. The canvas has the four roles in place but no rhythm decision: the transform is labeled plain Spark (which the canvas grader treats as batch), the warehouse mart has no slaFreshness, and the throughput-vs-latency tradeoff has not been named.

Apply the latency-vs-throughput framing this section just taught and pick which dimension constrains this pipeline. The 7am dashboard read is a Tier 4 freshness ask (< 24h end-to-end), and 1 billion events per day is a high-throughput requirement that batch handles at 10-50x lower cost than streaming.

Pick batch and tag the warehouse mart with slaFreshness < 24h. Do not introduce a streaming engine: the latency target does not require it, and the cost of handling this throughput would jump 10-50x for no consumer-visible benefit.
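The reasoning in the prompt can be sketched as a small decision helper. This is only an illustration, not the canvas grader's logic: the `PipelineRequirements` class, the `choose_rhythm` function, and the one-hour streaming cutoff are all hypothetical names and values chosen for the example. Only the 1 billion events per day, the < 24h freshness target, and the `slaFreshness` tag come from the prompt.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 24 * 60 * 60


@dataclass(frozen=True)
class PipelineRequirements:
    """Hypothetical container for the two numbers the prompt gives us."""
    events_per_day: int
    freshness_sla_hours: float  # maximum acceptable end-to-end staleness


def average_events_per_second(events_per_day: int) -> float:
    """Average rate implied by a daily volume (peaks will be higher)."""
    return events_per_day / SECONDS_PER_DAY


def choose_rhythm(req: PipelineRequirements,
                  streaming_cutoff_hours: float = 1.0) -> str:
    """Pick streaming only when the freshness target is tighter than the cutoff.

    The 1-hour cutoff is an assumption made for this sketch; the prompt only
    says that a < 24h target does not require a streaming engine.
    """
    if req.freshness_sla_hours < streaming_cutoff_hours:
        return "streaming"
    return "batch"


# Numbers taken from the prompt: 1 billion events/day, < 24h end-to-end.
orders = PipelineRequirements(events_per_day=1_000_000_000,
                              freshness_sla_hours=24)

rhythm = choose_rhythm(orders)
avg_rate = round(average_events_per_second(orders.events_per_day))

# Tag for the warehouse mart, mirroring the field named in the prompt.
mart_tags = {"slaFreshness": "< 24h"} if rhythm == "batch" else {}

print(rhythm)     # batch
print(avg_rate)   # 11574
print(mart_tags)  # {'slaFreshness': '< 24h'}
```

The average rate of roughly 11,574 events per second shows why throughput, not latency, is the binding dimension here: the volume is large, while the freshness target leaves a full day of slack.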

How This Interview Works

  1. Read the vague prompt (just like a real interview)
  2. Ask clarifying questions to the AI interviewer
  3. Write your pipeline design solution with real code execution
  4. Get instant feedback and a hire/no-hire decision
