DataDriven

A 200-store pizza chain has three reporting needs from one Postgres orders database and one Kafka to

A medium Pipeline Design interview practice problem on DataDriven. Write and execute real pipeline design code with instant grading.

Domain: Pipeline Design
Difficulty: medium

Problem

A 200-store pizza chain has three reporting needs, all served from one Postgres orders database and one Kafka topic of in-store kitchen events. The canvas has all five endpoints. Apply the entire beginner tier:

(1) Pick the rhythm per consumer using the three-question test (s0, s4): the CFO revenue chart is daily (Tier 4, batch), the store-ops kitchen dashboard is sub-15-min (Tier 1 or 2, streaming), and the marketing retention chart is weekly (Tier 5, batch).

(2) Build the canonical batch shape (s1) for the daily and weekly paths with a shared raw zone in object storage, a curated transform, a warehouse, an orchestrator, and slaFreshness < 24h on the daily warehouse table.

(3) Build the canonical streaming shape (s2) for the kitchen-ticket path with a streaming engine (Flink, Spark Structured Streaming, Kafka Streams, or Beam; plain Spark and dbt do not qualify), a serving store, and slaFreshness of real-time or < 1min on the streaming side.

(4) Keep the slowest-hop rule (s3) honest: no consumer may declare a tighter slaFreshness than its upstream can deliver.

The two sources must share a single raw zone in object storage (S3, GCS, or ADLS) so that a schema change in either source breaks at most one extract. Three distinct freshness tiers must be visible across the canvas.
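The structural rules above (slowest hop, single shared raw zone, three distinct freshness tiers) can be sanity-checked with a small script. What follows is a minimal sketch, not DataDriven's grader: the `Node` type, the node names, and every SLA value are hypothetical, chosen only to satisfy the stated limits. It also assumes one reading of the slowest-hop rule, namely that a node's slaFreshness may not be tighter than that of any node directly upstream of it. The orchestrator is left out because it schedules work rather than carrying data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    kind: str            # e.g. "source", "raw_zone", "warehouse", "consumer"
    sla_minutes: float   # declared slaFreshness in minutes (hypothetical values)
    upstream: tuple = ()  # names of the nodes this one reads from


DAY = 24 * 60

# Hypothetical canvas: two sources, one shared raw zone, a batch path and a streaming path.
CANVAS = {
    "postgres_orders": Node("source", 0),
    "kafka_kitchen_events": Node("source", 0),
    "raw_zone": Node("raw_zone", 60, ("postgres_orders", "kafka_kitchen_events")),
    "curated_transform": Node("transform", 6 * 60, ("raw_zone",)),
    "warehouse_daily": Node("warehouse", 20 * 60, ("curated_transform",)),  # < 24h
    "cfo_revenue_chart": Node("consumer", DAY, ("warehouse_daily",)),
    "marketing_retention_chart": Node("consumer", 7 * DAY, ("warehouse_daily",)),
    "stream_engine": Node("stream_engine", 0.5, ("kafka_kitchen_events",)),  # < 1min
    "serving_store": Node("serving_store", 0.5, ("stream_engine",)),
    "kitchen_dashboard": Node("consumer", 10, ("serving_store",)),  # sub-15-min
}


def slowest_hop_violations(canvas):
    """Return (node, upstream) pairs where the node claims to be fresher than its upstream."""
    return [
        (name, up)
        for name, node in canvas.items()
        for up in node.upstream
        if node.sla_minutes < canvas[up].sla_minutes
    ]


def consumer_tiers(canvas):
    """Return the set of distinct slaFreshness values declared by consumers."""
    return {n.sla_minutes for n in canvas.values() if n.kind == "consumer"}


def shares_single_raw_zone(canvas):
    """True if there is exactly one raw zone and every source lands in it."""
    raw_zones = [n for n in canvas.values() if n.kind == "raw_zone"]
    sources = {name for name, n in canvas.items() if n.kind == "source"}
    return len(raw_zones) == 1 and sources <= set(raw_zones[0].upstream)


if __name__ == "__main__":
    print("slowest-hop violations:", slowest_hop_violations(CANVAS))
    print("distinct consumer tiers:", len(consumer_tiers(CANVAS)))
    print("single shared raw zone:", shares_single_raw_zone(CANVAS))
```

In this sketch the Kafka topic feeds both the raw zone and the streaming engine, so the kitchen dashboard does not inherit the raw zone's batch latency. Tightening a consumer below its upstream, for example declaring the CFO chart at 60 minutes while the warehouse table is at 20 hours, would surface as a violation.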

