# Thirty Million Unique Jobs a Year

> One press run, many orders. Group them right.

Canonical URL: <https://datadriven.io/problems/thirty_million_unique_jobs_a_year>

Domain: Pipeline Design · Difficulty: hard · Seniority: L6

## Problem

We produce 30 million custom print products a year for small businesses - business cards, flyers, and banners - each one unique. Our profitability depends on ganging: combining multiple customer orders onto a single press sheet to maximize utilization. The ganging algorithm needs a real-time view of pending orders, and our operations team needs to know before a job starts whether it will meet its delivery promise. Our analytics are also fragmented across 10 acquired companies running different ERP systems. Design a pipeline that supports all three needs.

## Worked solution and explanation

### Why this problem exists in real interviews

Four needs pull on the same order data: ganging needs a candidate pool that is less than a minute stale, ten brands need to converge into one canonical shape, customer service needs in-flight promise scoring, and committed gang sheets have to be immutable. The prompt names the first three; immutability is the one the design has to surface on its own. The trap is letting the ganging optimizer hit the order database directly and treating the ten-brand canonicalization as ETL rather than as a contract that everything reads through.

The default design has the ganging optimizer query each brand's order database directly every few minutes, with a nightly cross-brand ETL for analytics. It fails in four ways. The optimizer puts read pressure on operational systems until the operations team makes the data team back off. A confirmed order in brand A's database doesn't show up to ganging until the next ETL run, so the print cycle moves on without it. Customer-promise probability is computed from yesterday's snapshot. And a cancellation after commit mutates the original sheet because the design never drew the immutability boundary.

> **Trick to Solving**
>
> CDC each brand's orders onto one canonical bus; ganging reads from a streaming pool; press telemetry rescores promises; committed sheets are immutable.
>
> 1. Each brand's orders feed CDC into a streaming canonicalizer; the result is one canonical order shape on a single bus that both ganging and analytics read.
> 2. The ganging optimizer reads a sub-minute candidate pool fed by the bus; it never queries the order databases directly.
> 3. Press-telemetry events update each in-flight order's probability of meeting its promise; an alert fires to customer service when the probability crosses the threshold.
> 4. Committed gang sheets land in an immutable store; cancellations after commit route to a replacement-reprint flow, not a sheet edit.

---

### Walk the requirements

#### Step 1: Newly confirmed orders join the ganging pool within tens of seconds

Confirmed orders flow from each brand's order database via CDC into a streaming canonicalizer, then into a candidate-pool store that the ganging optimizer reads. End-to-end latency is tens of seconds. The optimizer doesn't query the order databases; the pool is the contract. A 'ganging queries the orders' design is what has been putting read pressure on operations and missing print cycles when ETL hasn't caught up.
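
A minimal sketch of the pool side of that contract, assuming canonical order events arrive with a per-order sequence number and may be delivered more than once. `OrderEvent`, `CandidatePool`, the `seq` field, and the status values are hypothetical names for illustration; a real pool would live in a shared store rather than in process memory.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderEvent:
    """One canonical order change read off the bus (hypothetical shape)."""
    order_id: str
    status: str   # assumed values: "confirmed", "cancelled", "committed"
    product: str
    seq: int      # assumed per-order sequence number from the CDC stream


class CandidatePool:
    """In-memory stand-in for the candidate-pool store the optimizer reads."""

    def __init__(self) -> None:
        self._orders: dict[str, OrderEvent] = {}
        self._last_seq: dict[str, int] = {}

    def apply(self, event: OrderEvent) -> None:
        # Drop stale or duplicate deliveries (assumes at-least-once delivery).
        if event.seq <= self._last_seq.get(event.order_id, -1):
            return
        self._last_seq[event.order_id] = event.seq
        if event.status == "confirmed":
            self._orders[event.order_id] = event
        else:
            # Any other status means the order is no longer a ganging candidate.
            self._orders.pop(event.order_id, None)

    def candidates(self, product: str) -> list[OrderEvent]:
        """Return pending orders for one product type, in a stable order."""
        return sorted(
            (o for o in self._orders.values() if o.product == product),
            key=lambda o: o.order_id,
        )
```

The optimizer only ever calls `candidates`; nothing in this path touches a brand's order database.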

#### Step 2: Ten brands, one canonical shape

Each brand's order schema is its own; cross-brand work needs one shape. The canonicalizer maps each brand's orders into a canonical record (order_id, brand, customer, product, dimensions, quantity, deadline, status) before anything downstream reads them. Ganging reads canonical orders; analytics reads canonical orders; the cross-brand warehouse holds canonical orders. A 'we'll harmonize at query time' approach pushes the brand-shape problem onto every consumer; canonical-up-front is what keeps each consumer simple.
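
One way to picture the canonicalizer is a registry of per-brand mapping functions that all return the same record type. The sketch below uses the canonical fields listed above; the two source schemas (`brand_a`, `brand_b`), their column names, units, and status codes are invented for illustration and do not describe any real ERP.

```python
from dataclasses import dataclass
from datetime import date
from typing import Any, Callable


@dataclass(frozen=True)
class CanonicalOrder:
    order_id: str
    brand: str
    customer: str
    product: str
    dimensions: tuple[float, float]  # (width, height) in mm; the unit is an assumption
    quantity: int
    deadline: date
    status: str


def from_brand_a(row: dict[str, Any]) -> CanonicalOrder:
    # Hypothetical brand A: flat columns, millimetres, ISO dates.
    return CanonicalOrder(
        order_id=f"brand_a:{row['id']}",
        brand="brand_a",
        customer=row["customer_email"],
        product=row["sku_type"],
        dimensions=(float(row["w_mm"]), float(row["h_mm"])),
        quantity=int(row["qty"]),
        deadline=date.fromisoformat(row["due"]),
        status=row["state"].lower(),
    )


def from_brand_b(row: dict[str, Any]) -> CanonicalOrder:
    # Hypothetical brand B: nested size in inches and its own status codes.
    status_map = {"OK": "confirmed", "X": "cancelled"}
    size = row["Size"]
    return CanonicalOrder(
        order_id=f"brand_b:{row['OrderNo']}",
        brand="brand_b",
        customer=row["Buyer"],
        product=row["Item"].lower(),
        dimensions=(size["W_in"] * 25.4, size["H_in"] * 25.4),
        quantity=row["Count"],
        deadline=date(*row["DueYMD"]),
        status=status_map[row["Status"]],
    )


# Adding a brand means adding one mapper here; consumers are untouched.
MAPPERS: dict[str, Callable[[dict[str, Any]], CanonicalOrder]] = {
    "brand_a": from_brand_a,
    "brand_b": from_brand_b,
}


def canonicalize(brand: str, row: dict[str, Any]) -> CanonicalOrder:
    return MAPPERS[brand](row)
```

Prefixing `order_id` with the brand is one simple way to keep ids unique across brands; it is a design choice of this sketch, not something the problem statement prescribes.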

#### Step 3: Press telemetry refreshes each in-flight order's promise

As production progresses, each press emits telemetry; a streaming consumer updates each in-flight order's probability of meeting its delivery promise based on current production state. When the probability crosses the alert threshold, an alert fires to customer service with the order id and the predicted slip. Promise scoring from yesterday's snapshot is the version where customer service finds out about a missed promise from the customer; press-telemetry-driven scoring is what makes the alert proactive.
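
The stateful part of that consumer can be sketched as follows: keep the last probability per in-flight order and alert only on the downward crossing, so customer service gets one alert per slip rather than one per telemetry event. The `score` function is a deliberately crude placeholder, not a real model, and `PromiseMonitor`, its parameters, and the alert callback signature are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Callable


def score(hours_to_deadline: float, est_hours_remaining: float) -> float:
    """Placeholder score: slack relative to remaining work, clamped to [0, 1]."""
    if est_hours_remaining <= 0:
        return 1.0
    slack = hours_to_deadline - est_hours_remaining
    return max(0.0, min(1.0, 0.5 + slack / (2 * est_hours_remaining)))


@dataclass
class PromiseMonitor:
    threshold: float
    # Hypothetical hook into customer service: (order_id, probability, slip_hours).
    alert: Callable[[str, float, float], None]
    _last: dict[str, float] = field(default_factory=dict)

    def on_telemetry(self, order_id: str, hours_to_deadline: float,
                     est_hours_remaining: float) -> float:
        """Rescore one in-flight order from the latest press telemetry."""
        p = score(hours_to_deadline, est_hours_remaining)
        previous = self._last.get(order_id, 1.0)
        self._last[order_id] = p
        # Fire only when the probability crosses from at-or-above to below.
        if previous >= self.threshold > p:
            predicted_slip = max(0.0, est_hours_remaining - hours_to_deadline)
            self.alert(order_id, p, predicted_slip)
        return p
```

In a real deployment the per-order state would sit in the streaming framework's keyed state rather than a Python dict, but the crossing check stays the same.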

#### Step 4: Committed gang sheets are immutable; cancellations are reprints

Once a gang sheet is committed to print, its contents can't change. The committed sheet writes to an immutable store; downstream production reads from there. A post-commit cancellation routes to a replacement-reprint flow that produces the remaining items on a fresh sheet, leaving the original intact. Mutating the committed sheet is the version where the press's view and the system's view diverge mid-run; immutability is the boundary that keeps them aligned.
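
The boundary can be sketched as a write-once store plus a cancellation handler that only ever reads the committed sheet. `CommittedSheet`, `SheetStore`, `handle_cancellation`, and the list used as a reprint queue are hypothetical names for illustration; a production version would use an append-only or object-locked store instead of a dict.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommittedSheet:
    sheet_id: str
    order_ids: tuple[str, ...]  # tuple, so the contents cannot be edited in place


class SheetStore:
    """Write-once store: each sheet id can be committed exactly once."""

    def __init__(self) -> None:
        self._sheets: dict[str, CommittedSheet] = {}

    def commit(self, sheet: CommittedSheet) -> None:
        if sheet.sheet_id in self._sheets:
            raise ValueError(f"sheet {sheet.sheet_id} is already committed")
        self._sheets[sheet.sheet_id] = sheet

    def get(self, sheet_id: str) -> CommittedSheet:
        return self._sheets[sheet_id]


def handle_cancellation(store: SheetStore, sheet_id: str,
                        cancelled_order_id: str,
                        reprint_queue: list[str]) -> None:
    """Route a post-commit cancellation to the reprint flow; never edit the sheet."""
    sheet = store.get(sheet_id)
    # The remaining items are queued for a fresh sheet; the original is untouched.
    reprint_queue.extend(o for o in sheet.order_ids if o != cancelled_order_id)
```

Because there is no update method at all, the press and the system cannot disagree about what a committed sheet contains.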

---

### The shape that fits

> **What this design gives up**
>
> CDC across ten brands is ten connectors to operate; the canonical shape has to absorb each brand's edge cases; press-telemetry promise scoring is a streaming consumer with state per in-flight order; an immutable committed-sheet store doubles storage for sheets that also exist in the warehouse. Implementation cost is the price; the win is ganging that doesn't pressure operations, ten brands that look like one to every consumer, customer service that gets ahead of slips, and committed sheets that don't change under the press.

> **What reviewers check**
>
> A reviewer looks at the canvas for these properties:
> - A streaming path canonicalizes orders from each brand and lands them in the ganging optimizer's candidate pool within tens of seconds.
> - All ten brands map into one canonical order shape that analytics and ganging read.
> - Press-telemetry events refresh each in-flight order's delivery probability and alert customer service when it crosses the threshold.
> - Committed gang sheets are immutable; post-commit cancellations route to a replacement reprint.

> **The mistake that ships**
>
> What gets shipped lets the ganging optimizer query each brand's order database every few minutes, runs nightly cross-brand ETL, scores delivery promises off yesterday's snapshot, and mutates committed sheets when cancellations come in. Operations notices the read load on order systems and the data team backs off; the optimizer misses orders that should have ganged. A customer service alert fires after the customer has already noticed the missed promise. A post-commit cancellation rewrites a sheet mid-press and the press output diverges from the system. The eventual rebuild is canonical CDC, a streaming pool, press-telemetry scoring, and immutable committed sheets.

---

## Common follow-up questions

- An eleventh brand is acquired with its own order schema. What in this design extends, and what doesn't? _(Tests whether the candidate sees the canonicalizer as the extension point: a new mapping into the canonical shape, a new CDC connector, and the new brand's orders flow into the same bus. The ganging pool, the press-telemetry path, and the cross-brand warehouse don't change.)_
- Press telemetry signals a slip on an in-flight order, but the gang sheet has already committed. What does this design do, and where does customer service look? _(Tests whether the candidate sees that the committed sheet stays as-is (it's immutable), the slip alert routes through customer service for the affected order, and recovery options are recorded against the order rather than against the sheet. Mutating the sheet to fix the slip would break the committed-sheet boundary.)_

---

Source: DataDriven (https://datadriven.io).