A layered pipeline on the canvas serves three consumers from one Kafka source through a raw zone and
A medium Pipeline Design interview practice problem on DataDriven. Write and execute real pipeline design code with instant grading.
- Domain: Pipeline Design
- Difficulty: medium
Problem
A layered pipeline on the canvas serves three consumers from one Kafka source through a raw zone and a curated layer, with three serving destinations. Most nodes have no slaFreshness label, and one serving node claims < 1min freshness while its only upstream is the < 24h curated layer. This is the tier-mismatch failure mode the section just named.
Apply the tier-per-node analysis from this section: walk backward from each consumer and set the slaFreshness on every node to a value consistent with both what its downstream consumer actually needs and what its upstream can deliver. Specifically:
- Tag the raw zone with < 15min (tier 2 ingestion lag).
- Tag the ML features serving node with < 2h (tier 3, to match the ML inference cadence).
- Fix the mismatch on the live 5-min mart, either by upgrading its upstream path so that it can honestly carry < 15min or by downgrading its label to < 24h to match its actual upstream.
No downstream node may carry a faster slaFreshness than its upstream.
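The closing rule can be checked mechanically. The following is a minimal Python sketch, not the grader's actual logic. It covers only the path whose topology the problem states (Kafka source, raw zone, curated layer, live mart). The node identifiers, the `to_minutes` and `check` helpers, and the `< 1min` label on the Kafka source in the fixed version are all assumptions made for illustration.

```python
import re

# Minutes per unit for labels such as "< 15min", "< 2h", "< 24h".
UNIT_MINUTES = {"min": 1, "h": 60, "d": 1440}


def to_minutes(label):
    """Convert an slaFreshness label like '< 15min' into minutes."""
    match = re.fullmatch(r"<\s*(\d+)\s*(min|h|d)", label.strip())
    if match is None:
        raise ValueError(f"unrecognised slaFreshness label: {label!r}")
    return int(match.group(1)) * UNIT_MINUTES[match.group(2)]


def check(edges, sla):
    """Return (unlabeled nodes, edges where downstream claims to be fresher).

    An edge is (upstream, downstream). Edges with an unlabeled endpoint are
    skipped in the mismatch check and reported through the first list instead.
    """
    unlabeled = sorted(node for node, label in sla.items() if label is None)
    mismatches = [
        (up, down)
        for up, down in edges
        if sla[up] is not None
        and sla[down] is not None
        and to_minutes(sla[down]) < to_minutes(sla[up])
    ]
    return unlabeled, mismatches


# Hypothetical node names for the one path the problem describes explicitly.
EDGES = [
    ("kafka_source", "raw_zone"),
    ("raw_zone", "curated_layer"),
    ("curated_layer", "live_mart"),
]

# Starting state: missing labels, and a mart claiming < 1min under < 24h.
BEFORE = {
    "kafka_source": None,
    "raw_zone": None,
    "curated_layer": "< 24h",
    "live_mart": "< 1min",
}

# One valid fix: label the raw zone and downgrade the mart to its upstream.
AFTER = {
    "kafka_source": "< 1min",  # assumed value, not given in the problem
    "raw_zone": "< 15min",
    "curated_layer": "< 24h",
    "live_mart": "< 24h",
}

print(check(EDGES, BEFORE))  # (['kafka_source', 'raw_zone'], [('curated_layer', 'live_mart')])
print(check(EDGES, AFTER))   # ([], [])
```

The same check applies to the other two serving nodes once their upstream edges are read off the canvas. Any node labeled faster than the node feeding it will appear in the second list.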