# Which Promotion Is Actually Working

> Was the promotion worth it? The data knows.

Canonical URL: <https://datadriven.io/problems/which_promotion_is_actually_working>

Domain: Pipeline Design · Difficulty: hard · Seniority: L6

## Problem

We are a consumer goods company running dozens of trade promotions simultaneously across hundreds of retail partners, and our commercial analytics team needs to measure promotion ROI in near-real time to see which promotions are working and which are wasting money. Right now the data is fragmented across retailer portals, our own ERP, and third-party syndicated data providers. Design the ingestion pipeline and the BigQuery analytics architecture.

## Worked solution and explanation

### Why this problem exists in real interviews

ROI on trade promotions is a join: sales attributed to whichever promotion was running at the time, reconciled to the ERP, derived from sources that arrive on completely different clocks. The trap is treating the four sources as one pipeline and the join as 'pick today's promotion.' Both shortcuts produce numbers that look right and aren't.

The natural shape is one nightly load that pulls every source, joins sales to whatever promotion is currently active, drops anything that fails validation, and lands the result in the ROI mart. Two days later, finance asks why the warehouse total doesn't match the ERP and discovers the validation step has been silently dropping rows. The commercial team asks why a sale from last week is credited to this week's promotion. The syndicated weekly file holds up the entire load every Monday morning.

> **Trick to Solving**
>
> Per-source cadence into the warehouse, point-in-time promotion lookup at join time, reconciliation as a gate before publish.
> 
> 1. Each source runs on its own cadence: ERP near real-time, retailer portals on their available schedule, syndicated data weekly. They land in the warehouse independently and the join happens against a shared promotion calendar.
> 2. The promotion calendar is a slowly-changing dimension. A sale joins on `sale_date BETWEEN promo_start AND promo_end`, not on whichever promotion is active now.
> 3. Reconciliation against ERP shipments is a quality gate, not a manual report. The ROI mart either reconciles or the run halts and pages.
> 4. Bad rows go to a quarantine table for review, not to the floor. The good rows continue; the commercial team triages the bad rows on their own time.

---

### Walk the requirements

#### Step 1: Each source on its own cadence into the warehouse

ERP shipments need near-real-time visibility; retailer portals arrive on their own schedules; syndicated market data lands weekly. Each source has its own ingest path and writes to its own staged table in the warehouse. The unified sales fact reads from all of them. One shared schedule wastes compute on the weekly source and starves the fast source; per-source cadence sizes the work to the consumer that actually needs it. Without a warehouse tier there's nowhere for the unified fact to live.

#### Step 2: Attribute sales to the promotion that was running then

Promotions are a slowly-changing dimension keyed on (promotion_id, valid_from, valid_to). A sale joins on `sale_date BETWEEN valid_from AND valid_to`, picking the promotion active when the sale happened. A late-arriving sale from last week still gets last week's promotion, not this week's. The version that joins on 'current promotion' is the version that gets every sale wrong as soon as a new promotion kicks off.

#### Step 3: Reconcile to the ERP as a gate, not a comment

Before the ROI mart is published, a reconciliation check compares pipeline sales totals against ERP shipment totals for the same window. If the gap is larger than a tolerance, the run halts and pages, no half-published mart, no email-after-the-fact. Finance reads the published numbers; if those numbers don't match the ERP, finance has stopped trusting the warehouse. Reconciliation in the run is what keeps that trust.

#### Step 4: Quarantine bad partner rows; keep the good rows moving

Retailer files routinely contain duplicate rows and bad product codes. Validation routes those rows to a quarantine table with the rejection reason; the rest of the file continues into the unified sales fact. The commercial team queries the quarantine table on their own schedule, fixes upstream issues, and replays. Silently dropping the bad rows is the move that loses count of how often this is happening, and finance ends up reconciling against numbers that quietly excluded a chunk of the file.

---

### The shape that fits

> **What this design gives up**
>
> Per-source cadence means ingest infrastructure for each source instead of one nightly load, with monitoring per source. The point-in-time join scans a date range and is more expensive than an equi-join. Reconciliation as a gate halts the pipeline when ERP and warehouse disagree, which is louder than a quiet mismatch but stops everything until it's resolved. Pipeline simplicity and steady-flow availability are what get sacrificed; in return, the ROI numbers reconcile and the commercial team trusts them.

> **What reviewers check**
>
> A reviewer looks at the canvas for these properties:
> - Each source ingests on its own cadence and converges in a unified warehouse fact.
> - Sales join the promotion calendar as a slowly-changing dimension at sale-time.
> - A reconciliation gate compares pipeline totals to the ERP before publishing the ROI mart.
> - Invalid retailer rows route to a quarantine without halting the rest of the file.

> **The mistake that ships**
>
> What gets built first does one nightly load over all three sources, joins sales to the promotion currently active, silently filters bad retailer rows, and writes to the ROI mart without reconciliation. The commercial team reports a promotion is performing well based on attribution to the wrong promotion. Finance signs off on a quarter, then has to walk it back when a manual ERP comparison turns up a gap. The team rebuilds with per-source cadence, an as-of join, and a reconciliation gate. Three properties the first cut treated as afterthoughts and the rebuild treats as load-bearing.

---

## Common follow-up questions

- A retailer's syndicated file lands twice in the same week with overlapping rows. What in this design protects you, and what doesn't? _(Tests whether the candidate sees that the upsert sink dedups exact-id duplicates, but rows with the same business key and slightly different values are a real merge problem the dedup key alone doesn't solve. The fix is either a deterministic source-of-truth ranking or a quarantine of conflicting rows for review.)_
- Commercial wants intra-day ROI for a single high-value promotion, while everything else stays on the current cadence. What changes in the design? _(Tests whether the candidate scales freshness without rebuilding the whole pipeline. The fast path becomes a partial materialization on the streaming side for that one promotion's sales, joined against the same promotion calendar; the rest of the warehouse stays as-is.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/which_promotion_is_actually_working)
- [System Design Interview Questions](https://datadriven.io/data-engineering-system-design)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.