A retail company wants a daily summary of orders by region
A medium Pipeline Design mock interview question on DataDriven. Practice with AI-powered feedback, real code execution, and a hire/no-hire decision.
- Domain: Pipeline Design
- Difficulty: medium
Interview Prompt
A retail company wants a daily summary of orders by region. The Postgres production.orders source, an Airflow orchestrator placeholder, the Snowflake mart.orders_by_region destination, and the Morning regional dashboard are already on the canvas. The three tasks that do the actual work, the dependency chain, the schedule, and the retry policy still need to be built.

Apply the first-DAG framing this section just taught and add the section's three named task transforms:
- extract_orders (reads Postgres production.orders, writes raw.orders)
- clean_orders (reads raw.orders, writes stg.orders)
- aggregate_orders (reads stg.orders, writes mart.orders_by_region)

Encode each task's reads and writes in its node name so the chain is readable from the diagram. Connect them in order: Postgres production.orders -> extract_orders -> clean_orders -> aggregate_orders -> Snowflake mart.orders_by_region; the dashboard reads from the mart.

Rename the Airflow orchestrator node so its name encodes both the section's schedule (daily 2am) and the section's uniform retry policy (3 retries, exponential backoff, 2-minute initial delay) since the canvas has no separate fields for these and the orchestrator name is the only grader-visible place to record them. Tag each of the three task transforms with tech_label Python (matching the section's PythonOperator example).
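The chain and policy described in the prompt can be sketched in plain Python before drawing it on the canvas. This is only an illustrative model, not Airflow code and not the grader's expected format: the `Task` class, the `node_name` convention, the `CRON` string, the helper functions, and the doubling factor for the backoff are all assumptions introduced here. Only the task names, table names, schedule, and retry numbers come from the prompt.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class Task:
    """Hypothetical stand-in for one task transform on the canvas."""
    name: str    # task name from the prompt
    reads: str   # table the task reads
    writes: str  # table the task writes

    def node_name(self) -> str:
        # Assumed naming convention: the prompt only asks that reads and
        # writes be encoded in the node name, not how.
        return f"{self.name} ({self.reads} -> {self.writes})"


# The three task transforms, in dependency order, as listed in the prompt.
TASKS = [
    Task("extract_orders", "production.orders", "raw.orders"),
    Task("clean_orders", "raw.orders", "stg.orders"),
    Task("aggregate_orders", "stg.orders", "mart.orders_by_region"),
]

CRON = "0 2 * * *"                    # assumed cron spelling of "daily 2am"
RETRIES = 3                           # from the prompt
INITIAL_DELAY = timedelta(minutes=2)  # from the prompt


def chain_is_connected(tasks):
    """True when every task reads exactly what the previous task wrote."""
    return all(a.writes == b.reads for a, b in zip(tasks, tasks[1:]))


def retry_delays(retries, initial, factor=2):
    """Nominal wait before each retry, assuming the delay doubles each time."""
    return [initial * factor**attempt for attempt in range(retries)]


def orchestrator_name():
    """One possible orchestrator label carrying the schedule and retry policy."""
    return (
        f"Airflow (daily 2am, {RETRIES} retries, "
        "exponential backoff, 2-minute initial delay)"
    )


if __name__ == "__main__":
    for task in TASKS:
        print(task.node_name())
    print(orchestrator_name())
    print(chain_is_connected(TASKS), retry_delays(RETRIES, INITIAL_DELAY))
```

In a real Airflow DAG the same policy would typically be expressed through the `retries`, `retry_delay`, and `retry_exponential_backoff` task arguments. The actual waits Airflow computes may differ from the simple doubling assumed here.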
How This Interview Works
- Read the vague prompt (just like a real interview)
- Ask clarifying questions to the AI interviewer
- Write your pipeline design solution with real code execution
- Get instant feedback and a hire/no-hire decision