# A streaming pipeline writes a Parquet file every 30 seconds per partition into an Iceberg table

Canonical URL: <https://datadriven.io/problems/a-streaming-pipeline-writes-a-parquet-file-every-30-seconds-26d21079>

Domain: Pipeline Design · Difficulty: medium

## Problem

A streaming pipeline writes a Parquet file every 30 seconds per partition into an Iceberg table. After four months, the table holds 11 million sub-megabyte files, and query latency has drifted from 12 seconds to 8 minutes. Apply the section's small-files framing and add the named scheduled job that runs alongside the streaming writer to keep file sizes within the section's target range.
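
To get a feel for the scale, the sketch below works through the arithmetic behind the scenario and builds the kind of statement a scheduled compaction job might issue. Several values are assumptions rather than facts from the problem: "four months" is taken as 120 days, the average file size as 0.5 MB, and the target file size as 512 MB (Iceberg's default for `write.target-file-size-bytes`, which may differ from the section's target range). The catalog and table names are hypothetical. Reading the "named scheduled job" as a compaction job built on Iceberg's `rewrite_data_files` Spark procedure is one plausible interpretation, not a confirmed answer.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60


def files_per_partition(days: int, write_interval_s: int) -> int:
    """Files one continuously written partition accumulates over `days`."""
    return days * SECONDS_PER_DAY // write_interval_s


def compacted_file_count(total_files: int, avg_file_mb: float, target_mb: int) -> int:
    """Lower bound on the file count if all data were repacked to `target_mb` files."""
    total_mb = total_files * avg_file_mb
    return math.ceil(total_mb / target_mb)


def rewrite_data_files_sql(catalog: str, table: str, target_bytes: int) -> str:
    """Build a Spark SQL call to Iceberg's rewrite_data_files procedure."""
    return (
        f"CALL {catalog}.system.rewrite_data_files("
        f"table => '{table}', "
        f"options => map('target-file-size-bytes', '{target_bytes}'))"
    )


# Assumed inputs: 120 days, 0.5 MB average file, 512 MB target.
per_partition = files_per_partition(days=120, write_interval_s=30)
after = compacted_file_count(total_files=11_000_000, avg_file_mb=0.5, target_mb=512)

# Hypothetical catalog and table names, for illustration only.
sql = rewrite_data_files_sql("my_catalog", "db.events", 512 * 1024 * 1024)

print(per_partition)  # 345600 files per continuously written partition
print(after)          # 10743 files at the assumed sizes
print(sql)
```

Under these assumptions, each continuously written partition accumulates 345,600 files in four months, and repacking the same data at the target size would leave on the order of ten thousand files instead of 11 million. The real count would be somewhat higher, because compaction does not merge files across partitions. A scheduler (cron, Airflow, or similar) would run the generated statement through a Spark session at a regular interval while the streaming writer keeps appending.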

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/a-streaming-pipeline-writes-a-parquet-file-every-30-seconds-26d21079)
- [System Design Interview Questions](https://datadriven.io/data-engineering-system-design)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io).