A layered pipeline on the canvas serves three consumers from one Kafka source through a raw zone and
A medium Pipeline Design mock interview question on DataDriven. Practice with AI-powered feedback, real code execution, and a hire/no-hire decision.
- Domain: Pipeline Design
- Difficulty: medium
Interview Prompt
A layered pipeline on the canvas serves three consumers from one Kafka source through a raw zone and a curated layer, with three serving destinations. Most nodes have no slaFreshness label, and one serving node claims < 1min freshness while its only upstream is the < 24h curated layer. This is the tier-mismatch failure mode the section just named.
Apply the tier-per-node analysis from this section: walk backward from each consumer and set the slaFreshness on every node to a value consistent with both what its downstream consumer actually needs and what its upstream can deliver. Specifically:
- Tag the raw zone with < 15min (tier 2 ingestion lag).
- Tag the ML features serving node with < 2h (tier 3, to match the ML inference cadence).
- Fix the live 5-min mart's mismatch, either by upgrading its upstream path so it can honestly carry < 15min or by downgrading its label to < 24h to match its actual upstream.
No downstream node may carry a faster slaFreshness than its upstream.
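The rule that no downstream node may carry a faster slaFreshness than its upstream can be checked mechanically. The following is a minimal sketch, not the canvas's own validation logic: the node ids (`raw_zone`, `curated_layer`, `live_mart`), the edge list, and the `tier_mismatches` helper are hypothetical names chosen for illustration, and only the single mismatch described in the prompt is modeled. The label strings are the ones the prompt uses.

```python
# Freshness labels from the prompt, converted to minutes so they can be compared.
SLA_MINUTES = {"< 1min": 1, "< 15min": 15, "< 2h": 120, "< 24h": 24 * 60}


def tier_mismatches(edges, sla):
    """Return (upstream, downstream) edges where the downstream node
    claims a faster slaFreshness than its upstream can deliver."""
    bad = []
    for up, down in edges:
        up_label, down_label = sla.get(up), sla.get(down)
        if up_label is None or down_label is None:
            continue  # unlabeled nodes cannot be compared; tag them first
        if SLA_MINUTES[down_label] < SLA_MINUTES[up_label]:
            bad.append((up, down))
    return bad


# Hypothetical node ids; assumes the live mart's only upstream is the curated layer.
edges = [("raw_zone", "curated_layer"), ("curated_layer", "live_mart")]
sla = {"raw_zone": "< 15min", "curated_layer": "< 24h", "live_mart": "< 1min"}

print(tier_mismatches(edges, sla))  # [('curated_layer', 'live_mart')]

# One of the two fixes named in the prompt: downgrade the label to match the upstream.
sla["live_mart"] = "< 24h"
print(tier_mismatches(edges, sla))  # []
```

The check compares each edge independently, so it flags a mismatch at the exact edge where the promise breaks. The alternative fix, upgrading the upstream path, would change `edges` rather than the label.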
How This Interview Works
- Read the vague prompt (just like a real interview)
- Ask clarifying questions to the AI interviewer
- Write your pipeline design solution with real code execution
- Get instant feedback and a hire/no-hire decision