# A 2018-era system on the canvas runs the same aggregation logic in two engines: a plain Spark nightl

Canonical URL: <https://datadriven.io/problems/a-2018-era-system-on-the-canvas-runs-the-same-aggregation-lo-300f16bd>

Domain: Pipeline Design · Difficulty: medium

## Problem

A 2018-era system on the canvas runs the same aggregation logic in two engines: a plain Spark nightly batch transform feeding the daily executive dashboard, and a Flink streaming transform feeding the live ops dashboard. That means two codebases, two failure profiles, two on-call rotations, and silent drift between the two views whenever one team changes the logic and the other forgets to.

Apply the unified-engine framing this section just taught: collapse the two transforms into a single unified-engine transform (Spark Structured Streaming or Beam) that writes to both views. The daily view runs with `trigger=once`; the live view runs with `trigger=processingTime='1 minute'`. The application code is identical for both rhythms; only the trigger configuration differs.

Remove (or replace) the plain Spark and Flink transforms so that both downstream views read from the unified pipeline. The rhythm choice becomes a config decision, not a separate codebase.
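One way to picture "same code, different trigger" is the minimal PySpark Structured Streaming sketch below. It is an illustration under stated assumptions, not the reference solution: the rhythm names (`daily`, `live`), the `event_time` and `key` columns, the 10-minute watermark, the Parquet sink, and the helper names `trigger_for`, `aggregate`, and `start_view` are all hypothetical. Only the two trigger settings come from the problem statement.

```python
from typing import Any, Dict

# Rhythm -> keyword arguments for DataStreamWriter.trigger(...).
# The rhythm names are hypothetical; the trigger values are the ones
# named in the problem statement.
TRIGGERS: Dict[str, Dict[str, Any]] = {
    "daily": {"once": True},                  # process what is available, then stop
    "live": {"processingTime": "1 minute"},   # micro-batch every minute
}


def trigger_for(rhythm: str) -> Dict[str, Any]:
    """Return a copy of the trigger kwargs for a rhythm, or raise on unknown names."""
    try:
        return dict(TRIGGERS[rhythm])
    except KeyError:
        raise ValueError(f"unknown rhythm: {rhythm!r}") from None


def aggregate(events):
    """Shared aggregation logic (placeholder): event counts per key per minute.

    `events` is assumed to be a streaming DataFrame with hypothetical
    `event_time` (timestamp) and `key` columns.
    """
    # Imported lazily so the config helpers above work without Spark installed.
    from pyspark.sql import functions as F

    return (
        events
        .withWatermark("event_time", "10 minutes")  # assumed lateness bound
        .groupBy(F.window("event_time", "1 minute"), "key")
        .count()
    )


def start_view(events, rhythm: str, sink_path: str, checkpoint_path: str):
    """Start one view of the unified pipeline; only the trigger varies by rhythm."""
    return (
        aggregate(events)
        .writeStream
        .outputMode("append")
        .format("parquet")  # assumed sink; each view needs its own checkpoint
        .option("checkpointLocation", checkpoint_path)
        .trigger(**trigger_for(rhythm))
        .start(sink_path)
    )
```

Under these assumptions, the daily view would be launched by a scheduler calling `start_view(events, "daily", ...)` and the live view by a long-running job calling `start_view(events, "live", ...)`. Both go through the same `aggregate` function, so a logic change lands in both views at once.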

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.