DataDriven

A medium Pipeline Design mock interview question on DataDriven. Practice with AI-powered feedback, real code execution, and a hire/no-hire decision.

Domain: Pipeline Design
Difficulty: Medium

Interview Prompt

A layered pipeline on the canvas serves three consumers from one Kafka source through a raw zone and a curated layer, with three serving destinations. Most nodes have no slaFreshness label, and one serving node claims < 1min freshness while its only upstream is the < 24h curated layer. This is the tier-mismatch failure mode the section just named.

Apply the tier-per-node analysis the section taught: walk backward from each consumer and set slaFreshness on every node to a value consistent with both what its downstream consumer actually needs and what its upstream can deliver. Specifically:

  • Tag the raw zone with < 15min (tier 2 ingestion lag).
  • Tag the ML features serving node with < 2h (tier 3, to match the ML inference cadence).
  • Fix the live 5-min mart's mismatch by either upgrading its upstream path so it can honestly carry < 15min, or downgrading its label to < 24h to match its actual upstream.

No node may carry a faster slaFreshness than its upstream.
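The rule that no node may carry a faster slaFreshness than its upstream can be checked mechanically. The sketch below is one possible way to do that, not the canvas's own validation. The node names (`kafka_source`, `raw_zone`, `curated_layer`, `live_mart`), the edge list, and the function `find_tier_mismatches` are hypothetical. Only the freshness labels come from the prompt, and the ML features node is left out because the prompt does not say what feeds it.

```python
# Freshness labels from the prompt, converted to minutes so they can be compared.
FRESHNESS_MINUTES = {"< 1min": 1, "< 15min": 15, "< 2h": 120, "< 24h": 1440}

# Hypothetical topology: node -> list of its direct upstream nodes.
UPSTREAMS = {
    "kafka_source": [],
    "raw_zone": ["kafka_source"],
    "curated_layer": ["raw_zone"],
    "live_mart": ["curated_layer"],
}


def find_tier_mismatches(upstreams, sla):
    """Return (node, upstream) pairs where the node claims fresher data
    than its upstream is labelled to deliver. Unlabelled nodes are skipped."""
    mismatches = []
    for node, parents in upstreams.items():
        if node not in sla:
            continue
        for parent in parents:
            if parent not in sla:
                continue
            if FRESHNESS_MINUTES[sla[node]] < FRESHNESS_MINUTES[sla[parent]]:
                mismatches.append((node, parent))
    return mismatches


# Starting state described in the prompt: the mart claims < 1min on a < 24h upstream.
BEFORE = {"curated_layer": "< 24h", "live_mart": "< 1min"}

# One of the two allowed fixes: tag the raw zone and downgrade the mart's label.
AFTER = {"raw_zone": "< 15min", "curated_layer": "< 24h", "live_mart": "< 24h"}

print(find_tier_mismatches(UPSTREAMS, BEFORE))
print(find_tier_mismatches(UPSTREAMS, AFTER))
```

With `BEFORE`, the check flags the mart against the curated layer; with `AFTER`, it flags nothing. The other fix, upgrading the mart's upstream path so it can carry < 15min, would instead change the edges or the upstream label rather than the mart's own label.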

How This Interview Works

  1. Read the vague prompt (just like a real interview)
  2. Ask clarifying questions to the AI interviewer
  3. Write your pipeline design solution with real code execution
  4. Get instant feedback and a hire/no-hire decision
