DataDriven


A medium Pipeline Design interview practice problem on DataDriven. Write and execute real pipeline design code with instant grading.

Domain: Pipeline Design
Difficulty: medium

Problem

A producer team renamed customer_id to user_id last week, dropped a required column, and started sending strings where numbers were expected. The pipeline silently absorbed all three changes, and downstream joins and aggregates are now wrong in ways nobody has noticed. The pattern this section calls for is schema validation: assert that each expected column exists, that its type matches, that its nullability is respected, and that its values fall within the declared range. Tools that can author and run these assertions include Great Expectations, dbt tests, and Soda, all of which the team can adopt incrementally. To solve the problem, add a schema-validation check between the source and the curated table, and give the check a name that lists what it asserts (existence, types, nullability, ranges).
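The four assertions can be illustrated with a minimal, tool-agnostic sketch in plain Python. It is not the API of Great Expectations, dbt, or Soda. The ColumnRule class, the column names other than customer_id, and the sample batches are all hypothetical, chosen only to reproduce the three failures described in the problem.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class ColumnRule:
    """Declared contract for one column (hypothetical structure for this sketch)."""
    name: str
    dtype: type
    nullable: bool = False
    min_value: Optional[float] = None  # only meaningful for numeric columns
    max_value: Optional[float] = None


def validate_existence_types_nullability_ranges(
    rows: List[Dict[str, Any]], rules: List[ColumnRule]
) -> List[str]:
    """Return human-readable violations; an empty list means the batch passes."""
    violations = []
    for i, row in enumerate(rows):
        for rule in rules:
            # Existence: a renamed or dropped column shows up as a missing key.
            if rule.name not in row:
                violations.append(f"row {i}: missing column '{rule.name}'")
                continue
            value = row[rule.name]
            # Nullability: None is allowed only where the rule says so.
            if value is None:
                if not rule.nullable:
                    violations.append(f"row {i}: '{rule.name}' is null")
                continue
            # Type: strict match. bool is a subclass of int in Python,
            # so it is rejected explicitly for non-bool columns.
            wrong_type = not isinstance(value, rule.dtype) or (
                isinstance(value, bool) and rule.dtype is not bool
            )
            if wrong_type:
                violations.append(
                    f"row {i}: '{rule.name}' is {type(value).__name__}, "
                    f"expected {rule.dtype.__name__}"
                )
                continue
            # Range: checked only after the type is known to be correct.
            if rule.min_value is not None and value < rule.min_value:
                violations.append(f"row {i}: '{rule.name}' below {rule.min_value}")
            if rule.max_value is not None and value > rule.max_value:
                violations.append(f"row {i}: '{rule.name}' above {rule.max_value}")
    return violations


# Hypothetical contract for the source feed.
ORDER_RULES = [
    ColumnRule("customer_id", int),
    ColumnRule("amount", float, min_value=0.0),
    ColumnRule("currency", str),
    ColumnRule("coupon_code", str, nullable=True),
]

good_batch = [
    {"customer_id": 1, "amount": 19.99, "currency": "USD", "coupon_code": None}
]
# Renamed key, string where a number was expected, and a dropped required column.
bad_batch = [{"user_id": 1, "amount": "19.99", "coupon_code": None}]

for message in validate_existence_types_nullability_ranges(bad_batch, ORDER_RULES):
    print(message)
```

In a pipeline, a check like this would sit between the source read and the write to the curated table, and a non-empty result would fail or quarantine the batch rather than let it through. Whether to fail the whole batch or only reject the offending rows is a design choice the problem statement leaves open.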

Practice This Problem

Solve this Pipeline Design problem with real code execution. DataDriven runs your solution and grades it automatically.
