DataDriven

A layered pipeline on the canvas serves three consumers from one Kafka source through a raw zone and

A medium Pipeline Design interview practice problem on DataDriven. Write and execute real pipeline design code with instant grading.

Domain: Pipeline Design
Difficulty: medium

Problem

A layered pipeline on the canvas serves three consumers from one Kafka source through a raw zone and a curated layer, with three serving destinations. Most nodes have no slaFreshness label, and one serving node claims < 1min freshness while its only upstream is the < 24h curated layer. This is the tier-mismatch failure mode the section just named.

Apply the tier-per-node analysis this section just taught: walk backward from each consumer and set slaFreshness on every node to a value consistent with both what its downstream consumer actually needs and what its upstream can deliver. Specifically:

  • Tag the raw zone with < 15min (tier 2 ingestion lag).
  • Tag the ML features serving node with < 2h (tier 3, to match the ML inference cadence).
  • Fix the live 5-min mart's mismatch, either by upgrading its upstream path so that it can honestly carry < 15min, or by downgrading its label to < 24h to match its actual upstream.

No downstream node may carry a faster slaFreshness than its upstream.
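The closing rule, that no node may claim a faster slaFreshness than its upstream, can be checked mechanically. The following is a minimal sketch of such a check, not the grader's actual logic. The node names (`kafka_source`, `raw_zone`, `curated_layer`, `live_mart`), the edge list, and both helper functions are hypothetical, and the label format is assumed to be `< N` followed by `min` or `h`, as in the problem text.

```python
import re

# Minutes per unit for the two units that appear in the problem's labels.
UNIT_MINUTES = {"min": 1, "h": 60}


def parse_sla(label):
    """Convert a label such as '< 15min' or '< 24h' to a number of minutes."""
    match = re.fullmatch(r"<\s*(\d+)\s*(min|h)", label.strip())
    if match is None:
        raise ValueError(f"unrecognized slaFreshness label: {label!r}")
    return int(match.group(1)) * UNIT_MINUTES[match.group(2)]


def find_tier_mismatches(sla, edges):
    """Return the (upstream, downstream) edges where the downstream node
    claims a faster freshness than its direct upstream.

    Edges with an unlabeled endpoint are skipped, so this only compares
    directly connected, labeled nodes.
    """
    mismatches = []
    for upstream, downstream in edges:
        up_label = sla.get(upstream)
        down_label = sla.get(downstream)
        if up_label is None or down_label is None:
            continue
        if parse_sla(down_label) < parse_sla(up_label):
            mismatches.append((upstream, downstream))
    return mismatches


# Hypothetical slice of the canvas: only the path that feeds the live mart.
edges = [
    ("kafka_source", "raw_zone"),
    ("raw_zone", "curated_layer"),
    ("curated_layer", "live_mart"),
]

# Labels as described in the problem: the mart claims < 1min over a < 24h layer.
sla = {"raw_zone": "< 15min", "curated_layer": "< 24h", "live_mart": "< 1min"}
before = find_tier_mismatches(sla, edges)

# One of the two allowed fixes: downgrade the mart's label to match its upstream.
sla_fixed = dict(sla, live_mart="< 24h")
after = find_tier_mismatches(sla_fixed, edges)

print(before)  # [('curated_layer', 'live_mart')]
print(after)   # []
```

The other allowed fix, upgrading the mart's upstream path, would change the edge list instead of the label. In this sketch that would mean feeding `live_mart` from a node that can honestly carry < 15min and labeling the mart no faster than that node.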

