# A batch ETL processes 10,000 rows

Canonical URL: <https://datadriven.io/problems/a-batch-etl-processes-10000-rows-one-row-fails-two-extrem-30782ca8>

Domain: Pipeline Design · Difficulty: medium

## Problem

A batch ETL processes 10,000 rows, and one row fails. Two extreme answers are both wrong: failing the entire batch discards the progress made on every good row, and silently dropping the bad row hides what may be a symptom of a larger issue. The section names three strategies: all-or-nothing (relies on the job being idempotent, so the batch can be rerun safely), skip-and-quarantine (a percentage threshold caps how many rows may fail before the batch itself fails), and partial commit with checkpoint (resume from the last committed unit). Pick the strategy that fits this analytics workload by renaming the ingest transform so that its name states the chosen strategy and its bounding parameter. If you choose skip-and-quarantine, also add a quarantine destination.
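To make the skip-and-quarantine option concrete, here is a minimal Python sketch. It is not the platform's reference solution. The function name `skip_and_quarantine`, the `max_failure_pct` parameter, the 1% default, and the in-memory lists standing in for the warehouse table and the quarantine destination are all illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Iterable


class QuarantineThresholdExceeded(RuntimeError):
    """Raised when the share of failed rows exceeds the allowed threshold."""


@dataclass
class BatchResult:
    # Hypothetical stand-ins: in a real pipeline these would be a target
    # table and a quarantine (dead-letter) table or bucket.
    loaded: list = field(default_factory=list)
    quarantined: list = field(default_factory=list)  # (row, error) pairs


def skip_and_quarantine(
    rows: Iterable[Any],
    transform: Callable[[Any], Any],
    max_failure_pct: float = 1.0,  # assumed threshold, not from the source
) -> BatchResult:
    """Transform every row, diverting failures to quarantine.

    If more than max_failure_pct percent of rows fail, the whole batch is
    rejected, because that many failures suggests a systemic problem.
    """
    rows = list(rows)
    result = BatchResult()
    for row in rows:
        try:
            result.loaded.append(transform(row))
        except Exception as exc:
            # Keep the bad row and the reason so it can be inspected later.
            result.quarantined.append((row, repr(exc)))

    failure_pct = 100.0 * len(result.quarantined) / len(rows) if rows else 0.0
    if failure_pct > max_failure_pct:
        raise QuarantineThresholdExceeded(
            f"{failure_pct:.2f}% of rows failed (limit {max_failure_pct}%)"
        )
    return result


def parse_amount(row: dict) -> dict:
    # float() raises ValueError on non-numeric text such as "N/A".
    return {"id": row["id"], "amount": float(row["amount"])}


# 10,000 rows, exactly one of which cannot be parsed.
batch = [{"id": i, "amount": "1.5"} for i in range(10_000)]
batch[4242]["amount"] = "N/A"

result = skip_and_quarantine(batch, parse_amount, max_failure_pct=1.0)
print(len(result.loaded), len(result.quarantined))
```

In this sketch the threshold is checked before anything is returned to the caller, so a batch that exceeds it fails as a whole, while a batch under it delivers its good rows and keeps the bad ones visible in quarantine.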

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/a-batch-etl-processes-10000-rows-one-row-fails-two-extrem-30782ca8)
- [System Design Interview Questions](https://datadriven.io/data-engineering-system-design)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.