
A layered pipeline on the canvas serves three consumers from one Kafka source through a raw zone and

A medium Pipeline Design mock interview question on DataDriven. Practice with AI-powered feedback, real code execution, and a hire/no-hire decision.

Domain: Pipeline Design
Difficulty: medium

Interview Prompt

A layered pipeline on the canvas serves three consumers from one Kafka source through a raw zone and a curated layer, with three serving destinations. Most nodes have no slaFreshness label, and one serving node claims < 1min freshness while its only upstream is the < 24h curated layer; this is the tier-mismatch failure mode the section just named.

Apply the tier-per-node analysis this section just taught: walk backward from each consumer and set the slaFreshness on every node to a value consistent with what its downstream consumer actually needs and what its upstream can deliver.

Specifically:

Tag the raw zone with < 15min (tier 2 ingestion lag).
Tag the ML features serving node with < 2h (tier 3, to match the ML inference cadence).
Fix the live 5-min mart's mismatch by either upgrading its upstream path so it can honestly carry < 15min, or downgrading its label to < 24h to match its actual upstream.

No downstream node may carry a faster slaFreshness than its upstream.
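The rule that no downstream node may be faster than its upstream can be checked mechanically. The following is a minimal sketch, not the platform's own validator: the node names (`kafka_source`, `raw_zone`, `curated_layer`, `live_mart`), the edge list, and the choice to express slaFreshness in minutes are all assumptions made for illustration, and only the path to the mismatched mart is modeled.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical node names; freshness is expressed in minutes for comparison.
# None means the node has no slaFreshness label yet.
sla_minutes: Dict[str, Optional[int]] = {
    "kafka_source": None,
    "raw_zone": 15,             # < 15min
    "curated_layer": 24 * 60,   # < 24h
    "live_mart": 1,             # claims < 1min (the mismatch)
}

# Hypothetical (upstream, downstream) edges for this one path.
edges: List[Tuple[str, str]] = [
    ("kafka_source", "raw_zone"),
    ("raw_zone", "curated_layer"),
    ("curated_layer", "live_mart"),
]


def find_tier_mismatches(
    sla: Dict[str, Optional[int]],
    edges: List[Tuple[str, str]],
) -> List[Tuple[str, str]]:
    """Return edges whose downstream label is faster than its upstream label."""
    mismatches = []
    for upstream, downstream in edges:
        up = sla.get(upstream)
        down = sla.get(downstream)
        if up is None or down is None:
            continue  # unlabeled nodes cannot be compared yet
        if down < up:
            mismatches.append((upstream, downstream))
    return mismatches


mismatches = find_tier_mismatches(sla_minutes, edges)

# One of the two fixes from the prompt: downgrade the mart's label to < 24h.
fixed = dict(sla_minutes, live_mart=24 * 60)
fixed_mismatches = find_tier_mismatches(fixed, edges)
```

This sketch compares only directly connected, labeled nodes, so an unlabeled node in the middle of a path hides a mismatch across it. Labeling every node, as the prompt asks, is what makes the edge-by-edge check sufficient.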

How This Interview Works

Read the vague prompt (just like a real interview)
Ask clarifying questions to the AI interviewer
Write your pipeline design solution with real code execution
Get instant feedback and a hire/no-hire decision
