DataDriven
LearnPracticeInterviewDiscussDailyJobs

A single Postgres operational database has two tables of interest

A medium Pipeline Design interview practice problem on DataDriven. Write and execute real pipeline design code with instant grading.

Domain
Pipeline Design
Difficulty
medium

Problem

A single Postgres operational database has two tables of interest. customers (~2M rows, slowly changing) needs hourly freshness; orders (~500M rows, +10M/day) needs five-minute freshness. The destination is Snowflake. The section's three legitimate strategies: full load + incremental pull (cheapest; no deletes on the incremental side), trigger-based CDC (captures deletes; doubles writes), and log-based CDC with Debezium (low overhead; needs WAL access and slot management). Pick the CDC strategy by adding a strategy-marker node downstream of the database whose name states the chosen strategy per table and the trade-off that drove it (scale, deletes, latency, or team capacity).

Practice This Problem

Solve this Pipeline Design problem with real code execution. DataDriven runs your solution and grades it automatically.

Related

  • All Practice Problems
  • Mock Interview Mode
  • System Design Interview Questions
  • Data Engineering Interview Prep Guide
  • Daily Challenge
  • Data Engineering Lessons