# A streaming pipeline ingests three sources with different lateness profiles

Canonical URL: <https://datadriven.io/problems/a-streaming-pipeline-ingests-three-sources-with-different-la-46b3eb83>

Domain: Pipeline Design · Difficulty: medium

## Problem

A streaming pipeline ingests three sources with different lateness profiles. Source A is a high-volume mobile event stream with a long retry tail (99.9th-percentile lateness: 4 hours). Source B is an IoT sensor stream whose individual partitions go idle for hours at a time. Source C is a financial market-data feed whose producer emits explicit end-of-session marker events. The section names four watermark strategies (ascending timestamps, bounded out-of-orderness, punctuated, per-key) and one operational fix (idle-partition detection). Choose a watermark strategy for each source by adding three watermark generator transforms, one per source. Name each transform so that it states the strategy and the parameters that match its source's profile, and add an idle-detection annotation wherever a source needs it.
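One plausible mapping of strategies to sources can be sketched in plain Python without any streaming framework. This is an illustration under stated assumptions, not a reference answer. All class and variable names are hypothetical, as are the 5-minute bound and 10-minute idle timeout used for Source B and the `is_end_of_session` flag used for Source C. The problem statement fixes only the 4-hour lateness figure for Source A.

```python
from typing import Dict, Optional

MINUTE_MS = 60 * 1000
HOUR_MS = 60 * MINUTE_MS


class BoundedOutOfOrderness:
    """Watermark trails the largest event time seen so far by a fixed bound."""

    def __init__(self, bound_ms: int) -> None:
        self.bound_ms = bound_ms
        self.max_ts_ms: Optional[int] = None

    def on_event(self, event_ts_ms: int) -> int:
        # A late event never lowers max_ts_ms, so the watermark cannot regress.
        if self.max_ts_ms is None or event_ts_ms > self.max_ts_ms:
            self.max_ts_ms = event_ts_ms
        return self.max_ts_ms - self.bound_ms


class IdleAwarePartitions:
    """Per-partition bounded watermarks, combined as the minimum over
    partitions that are not idle (idleness is judged in processing time)."""

    def __init__(self, bound_ms: int, idle_timeout_ms: int) -> None:
        self.bound_ms = bound_ms
        self.idle_timeout_ms = idle_timeout_ms
        self.generators: Dict[str, BoundedOutOfOrderness] = {}
        self.watermarks: Dict[str, int] = {}
        self.last_seen_ms: Dict[str, int] = {}

    def on_event(self, partition: str, event_ts_ms: int, now_ms: int) -> None:
        gen = self.generators.setdefault(
            partition, BoundedOutOfOrderness(self.bound_ms)
        )
        self.watermarks[partition] = gen.on_event(event_ts_ms)
        self.last_seen_ms[partition] = now_ms

    def current_watermark(self, now_ms: int) -> Optional[int]:
        # Idle partitions are skipped so they cannot hold the watermark back.
        active = [
            wm
            for partition, wm in self.watermarks.items()
            if now_ms - self.last_seen_ms[partition] < self.idle_timeout_ms
        ]
        return min(active) if active else None


class PunctuatedEndOfSession:
    """Emits a watermark only when the producer's marker event arrives."""

    def on_event(self, event_ts_ms: int, is_end_of_session: bool) -> Optional[int]:
        return event_ts_ms if is_end_of_session else None


# One generator per source, named after its strategy and parameters.
# The Source B numbers are assumptions; the problem does not give them.
source_a_bounded_out_of_orderness_4h = BoundedOutOfOrderness(4 * HOUR_MS)
source_b_bounded_5m_idle_after_10m = IdleAwarePartitions(5 * MINUTE_MS, 10 * MINUTE_MS)
source_c_punctuated_end_of_session = PunctuatedEndOfSession()
```

The 4-hour bound for Source A covers the stated 99.9th-percentile retry tail, at the cost of holding results back by the same amount. For Source B the idle timeout, rather than the out-of-orderness bound, is what keeps a silent partition from stalling downstream windows. A production version would also need to stop the combined watermark from moving backwards when an idle partition resumes, which this sketch does not do.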

---

Source: DataDriven (https://datadriven.io).