# A 2024-era streaming system on the canvas runs a single Flink pipeline producing one materialized vi

Canonical URL: <https://datadriven.io/problems/a-2024-era-streaming-system-on-the-canvas-runs-a-single-flin-2afd03d4>

Domain: Pipeline Design · Difficulty: medium

## Problem

A 2024-era streaming system on the canvas runs a single Flink pipeline producing one materialized view. The Kafka log retains only a few days of events in-cluster, so replaying more than a week of history is impossible: a bug fix that needs a month of orders replayed cannot be applied, because most of those events have already aged out of the log.

Apply the Kappa architecture this section just taught. Make the system replay-capable: (1) add a long-retention tiered-storage backing for the Kafka log in object storage (S3, GCS, or ADLS) so the event log can hold 12-24 months of events affordably, and (2) add a parallel materialized view (Snowflake mart_orders_v2 or a separate warehouse table) so the Flink pipeline can replay the log into a new view during a bug fix or schema migration without disturbing the live v1 view.

Once v2 has caught up to live and been validated, the dashboard cuts over to it. Do not add a batch layer: Kappa is stream-only, and batch-style reprocessing is just a replay of the log through the same code path.
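The replay-and-cutover flow can be sketched in plain Python as a rough illustration, not a Flink or Snowflake implementation. Everything named here is hypothetical: the `OrderEvent` shape, the two `transform_*` functions, the refund bug, and the in-memory dictionaries standing in for `mart_orders_v1` and `mart_orders_v2`. The topic settings assume the cluster supports Kafka tiered storage (KIP-405), and the retention durations are illustrative values chosen to fit the 12-24 month target.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

DAY_MS = 24 * 60 * 60 * 1000

# Assumed topic-level settings for Kafka tiered storage (KIP-405).
# The durations are illustrative, not taken from the problem statement.
ORDERS_TOPIC_CONFIG = {
    "remote.storage.enable": "true",
    "retention.ms": str(730 * DAY_MS),      # about 24 months in total
    "local.retention.ms": str(3 * DAY_MS),  # a few days kept on broker disks
}


@dataclass(frozen=True)
class OrderEvent:
    offset: int
    order_id: str
    amount_cents: int


View = Dict[str, int]  # order_id -> net amount in cents


def transform_v1(event: OrderEvent) -> int:
    # Hypothetical bug in the live pipeline: refunds (negative amounts) are dropped.
    return max(event.amount_cents, 0)


def transform_v2(event: OrderEvent) -> int:
    # Hypothetical fix: refunds are applied.
    return event.amount_cents


def run_pipeline(
    log: List[OrderEvent], transform: Callable[[OrderEvent], int]
) -> Tuple[View, int]:
    """One code path for live processing and replay: fold the log into a view."""
    view: View = {}
    last_offset = -1
    for event in log:
        view[event.order_id] = view.get(event.order_id, 0) + transform(event)
        last_offset = event.offset
    return view, last_offset


def caught_up(last_offset: int, log: List[OrderEvent]) -> bool:
    """The replay has caught up once it has consumed the newest offset in the log."""
    return bool(log) and last_offset == log[-1].offset


log = [
    OrderEvent(0, "o-1", 5000),
    OrderEvent(1, "o-2", 1200),
    OrderEvent(2, "o-1", -5000),  # refund
]

mart_orders_v1, _ = run_pipeline(log, transform_v1)          # live view, untouched
mart_orders_v2, v2_offset = run_pipeline(log, transform_v2)  # replay into a new view

# Cut over only after v2 has caught up and an order unaffected by the fix still matches.
live_view = mart_orders_v1
if caught_up(v2_offset, log) and mart_orders_v2["o-2"] == mart_orders_v1["o-2"]:
    live_view = mart_orders_v2
```

The design point is that `run_pipeline` is the only processing path. The fix ships as a new transform, the replay writes to a separate view, and the dashboard's pointer (`live_view` here) moves only after validation, so v1 keeps serving reads throughout.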

---

Source: DataDriven (https://datadriven.io).