A 200-store pizza chain has three reporting needs from one Postgres orders database and one Kafka to
A medium Pipeline Design mock interview question on DataDriven. Practice with AI-powered feedback, real code execution, and a hire/no-hire decision.
- Domain: Pipeline Design
- Difficulty: medium
Interview Prompt
A 200-store pizza chain has three reporting needs from one Postgres orders database and one Kafka topic of in-store kitchen events. The canvas has all five endpoints. Apply the entire beginner tier:
- (1) Pick the rhythm per consumer using the three-question test (s0, s4): the CFO revenue chart is daily (Tier 4, batch), the store-ops kitchen dashboard is sub-15-min (Tier 1 or 2, streaming), and the marketing retention chart is weekly (Tier 5, batch).
- (2) Build the canonical batch shape (s1) for the daily and weekly paths with a shared raw zone in object storage, a curated transform, a warehouse, an orchestrator, and slaFreshness < 24h on the daily warehouse table.
- (3) Build the canonical streaming shape (s2) for the kitchen-ticket path with a streaming engine (Flink, Spark Structured Streaming, Kafka Streams, or Beam; not plain Spark or dbt), a serving store, and slaFreshness of real-time or < 1min on the streaming side.
- (4) Keep the slowest-hop rule (s3) honest, so that no consumer's slaFreshness promises fresher data than its upstream can deliver.
The two sources must share a single raw zone in object storage (S3, GCS, or ADLS) so that a schema change in either source breaks at most one extract. Three distinct freshness tiers must be visible across the canvas.
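The slowest-hop rule (s3) and the "three distinct freshness tiers" requirement can be checked mechanically. The following is a minimal sketch, not the grader's actual logic: the `Node` class, the node names, and the SLA values on the intermediate hops (raw zone, curated transform, streaming engine, serving store) are all hypothetical choices made for illustration. Only the consumer-facing numbers (daily, sub-15-min, weekly, and the 24h bound on the daily warehouse table) come from the prompt. It assumes a node violates the rule when it promises fresher data than any of its direct upstreams.

```python
from dataclasses import dataclass, field
from datetime import timedelta


@dataclass
class Node:
    """One box on the canvas. Names and intermediate SLA values are hypothetical."""
    name: str
    sla_freshness: timedelta  # maximum staleness this node promises
    upstreams: list[str] = field(default_factory=list)


def slowest_hop_violations(nodes: dict[str, Node]) -> list[tuple[str, str]]:
    """Return (node, upstream) pairs where the node promises fresher data
    than that upstream does. If every edge passes, every path passes."""
    return [
        (node.name, up)
        for node in nodes.values()
        for up in node.upstreams
        if node.sla_freshness < nodes[up].sla_freshness
    ]


def make_canvas() -> dict[str, Node]:
    minute, hour, day = timedelta(minutes=1), timedelta(hours=1), timedelta(days=1)
    nodes = [
        # Two sources, treated here as always current.
        Node("postgres_orders", timedelta(0)),
        Node("kafka_kitchen_events", timedelta(0)),
        # Batch shape: shared raw zone -> curated transform -> warehouse.
        Node("raw_zone", hour, ["postgres_orders", "kafka_kitchen_events"]),
        Node("curated_transform", 12 * hour, ["raw_zone"]),
        Node("warehouse_daily", day, ["curated_transform"]),  # 24h bound from the prompt
        Node("cfo_revenue_chart", day, ["warehouse_daily"]),
        Node("marketing_retention_chart", 7 * day, ["warehouse_daily"]),
        # Streaming shape: Kafka -> streaming engine -> serving store.
        Node("stream_engine", minute, ["kafka_kitchen_events"]),
        Node("serving_store", minute, ["stream_engine"]),
        Node("kitchen_dashboard", 15 * minute, ["serving_store"]),
    ]
    return {n.name: n for n in nodes}


CONSUMERS = ["cfo_revenue_chart", "marketing_retention_chart", "kitchen_dashboard"]

canvas = make_canvas()
print(slowest_hop_violations(canvas))  # → []
print(len({canvas[c].sla_freshness for c in CONSUMERS}))  # → 3
```

Rewiring the kitchen dashboard to read from `warehouse_daily` would make the check return `[("kitchen_dashboard", "warehouse_daily")]`, since a 15-minute promise cannot be met from a table that is only guaranteed fresh within 24 hours.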
How This Interview Works
- Read the vague prompt (just like a real interview)
- Ask clarifying questions to the AI interviewer
- Write your pipeline design solution with real code execution
- Get instant feedback and a hire/no-hire decision