# Mark to Market

Canonical URL: <https://datadriven.io/problems/mark-to-market-trade-reconciliation>

Domain: Pipeline Design · Difficulty: medium · Seniority: mid

## Problem

A brokerage platform processes about 2 billion trade execution events a day, and two teams read them: risk wants intraday position dashboards that stay within a few seconds of the market, while regulatory reporting needs every trade counted exactly once in the end-of-day books. Trades arrive out of order and some are corrected hours after execution, so the design has to reconcile late and amended fills without double-counting or dropping them.

## Worked solution and explanation

### Why this problem exists in real interviews

Two teams read the same trade stream with opposite correctness budgets: risk wants positions that feel live and will forgive a count that is off by a few fills, while regulatory reporting needs every trade counted exactly once, because a miscount is a reportable filing error. The trap is one pipeline that tries to serve both, fast enough to feel live and exact enough to file. You get neither: making the intraday aggregator exact makes it too slow to feel live, and its numbers are still not safe to submit.

The obvious answer is a stream that aggregates fills into a positions table both teams read, with the end-of-day report taken as a snapshot of that table at the close. It falls apart on the two facts the prompt buries: fills arrive out of order, and 1 to 2 percent are amended or busted (cancelled after execution) hours later. A correction that lands after the close never makes it into the snapshot, and a producer retry double-counts a fill that the regulators then see twice.

> **Trick to Solving**
>
> Fan one durable, replayable log out to two paths sized for two budgets, and make exactly-once a property of the books, not a hope of the stream.
> 
> 1. Land every fill on a replayable queue first; nothing downstream is the source of truth.
> 2. Intraday: a streaming aggregation feeds a low-latency serving store. Approximate is fine; live is the budget.
> 3. End-of-day: a batch job replays the full day from the log, anchors on execution time so late fills land in the right day, dedups on the execution id, and upserts the settled books into the warehouse.

---

### Break down the requirements

#### Step 1: Land fills on a replayable log before anything aggregates

Out-of-order arrival and hours-late corrections mean no downstream store can be the source of truth, because it can only ever reflect what arrived by the time it ran. A durable message queue keyed by execution id lets the batch replay the entire trading day on demand and lets the stream consume the same events independently. Skip this and a correction that lands after the streaming window closes is simply lost.
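
To make the "replay on demand, consume independently" idea concrete, here is a minimal in-memory sketch. It is not a real queue client: the `ReplayableLog` class, the `Fill` fields, and the consumer name are all hypothetical and stand in for whatever durable queue the platform actually runs.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Fill:
    execution_id: str   # stable id assigned at execution (assumed field name)
    symbol: str
    quantity: int       # signed: positive buy, negative sell
    executed_at: str    # ISO-8601 execution timestamp


class ReplayableLog:
    """Append-only log; each named consumer tracks its own read offset."""

    def __init__(self) -> None:
        self._events: list[Fill] = []
        self._offsets: dict[str, int] = defaultdict(int)

    def append(self, fill: Fill) -> None:
        self._events.append(fill)

    def poll(self, consumer: str) -> list[Fill]:
        """Return events this consumer has not seen yet and advance its offset."""
        start = self._offsets[consumer]
        self._offsets[consumer] = len(self._events)
        return self._events[start:]

    def replay(self) -> list[Fill]:
        """Return every retained event from the beginning, ignoring offsets."""
        return list(self._events)


log = ReplayableLog()
log.append(Fill("e1", "ACME", 100, "2024-03-01T14:30:00+00:00"))
stream_batch = log.poll("intraday-stream")      # the stream sees e1
log.append(Fill("e2", "ACME", -40, "2024-03-01T14:31:00+00:00"))
stream_batch_2 = log.poll("intraday-stream")    # the stream sees only e2
full_day = log.replay()                         # the batch sees e1 and e2
```

The point of the sketch is that `replay` never depends on how far the stream has read, so the end-of-day job can always start from the beginning of the day.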

#### Step 2: Stream the intraday positions; approximate is acceptable

Risk needs the dashboard within seconds, so a stream processor aggregates fills into a low-latency serving store the dashboard reads. A fill counted a few seconds late or a position briefly off by one trade does not hurt anyone on the risk desk. Trying to make this path exact is what makes it too slow to feel live, and it still would not be safe to file.
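
A rough sketch of the intraday path, assuming positions are tracked per account and symbol (the source does not specify the grouping) and using a plain dictionary in place of a real serving store. The class and method names are illustrative only.

```python
from collections import defaultdict


class IntradayPositions:
    """Running net position per (account, symbol), updated in arrival order."""

    def __init__(self) -> None:
        self._net: dict[tuple[str, str], int] = defaultdict(int)

    def apply(self, account: str, symbol: str, signed_qty: int) -> None:
        # Deliberately no dedup and no reordering: this path trades
        # exactness for latency and leaves corrections to the books.
        self._net[(account, symbol)] += signed_qty

    def position(self, account: str, symbol: str) -> int:
        return self._net.get((account, symbol), 0)


positions = IntradayPositions()
positions.apply("acct-1", "ACME", 100)
positions.apply("acct-1", "ACME", -40)
positions.apply("acct-2", "ACME", 25)
```

Each update is a single keyed increment, which is what lets this path keep up with the dashboard's latency budget.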

#### Step 3: Reconcile the books in a batch that replays the whole day

The end-of-day job reads the full day off the log, anchors every aggregation on the trade execution timestamp rather than when the event arrived, dedups on the stable execution id, and applies amendments and busts by referencing the original id. The result upserts idempotently into the warehouse the reporting team queries. Because it replays from the log, a fill or correction that arrived late is still placed in the correct trading day.
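
The reconciliation rules above can be sketched in a few lines. This is one possible reading, not the definitive job: the `Event` shape, the three event kinds, the `sequence` field, and the assumption that an amendment or bust carries the original trade's execution time are all illustrative. Timestamps are assumed to be already normalized to the exchange's time zone.

```python
from dataclasses import dataclass
from datetime import date, datetime


@dataclass(frozen=True)
class Event:
    kind: str               # "fill", "amend", or "bust" (assumed event types)
    execution_id: str       # for amend/bust, the id of the original fill
    symbol: str
    quantity: int           # signed; ignored for "bust"
    executed_at: datetime   # execution time of the original trade
    sequence: int           # position in the log, used to order corrections


def reconcile(events: list[Event], trading_day: date) -> dict[str, int]:
    """Return net quantity per symbol for trades executed on trading_day."""
    settled: dict[str, Event] = {}   # execution_id -> current state of the trade
    busted: set[str] = set()
    for ev in sorted(events, key=lambda e: e.sequence):
        if ev.executed_at.date() != trading_day:
            continue                             # anchor on execution time
        if ev.kind == "bust":
            busted.add(ev.execution_id)
        elif ev.kind == "amend":
            settled[ev.execution_id] = ev        # amendment replaces the original
        elif ev.kind == "fill" and ev.execution_id not in settled:
            settled[ev.execution_id] = ev        # first fill wins; retries dropped
    books: dict[str, int] = {}
    for exec_id, ev in settled.items():
        qty = 0 if exec_id in busted else ev.quantity
        books[ev.symbol] = books.get(ev.symbol, 0) + qty
    return books


def upsert(warehouse: dict, trading_day: date, books: dict[str, int]) -> None:
    """Idempotent write: each (day, symbol) key is overwritten, never added to."""
    for symbol, net in books.items():
        warehouse[(trading_day, symbol)] = net


day = date(2024, 3, 1)
t = datetime(2024, 3, 1, 14, 30)
events = [
    Event("fill", "e1", "ACME", 100, t, 1),
    Event("fill", "e1", "ACME", 100, t, 2),     # producer retry of e1
    Event("fill", "e2", "ACME", -40, t, 3),
    Event("amend", "e2", "ACME", -30, t, 4),    # late correction to e2
    Event("fill", "e3", "INIT", 50, t, 5),
    Event("bust", "e3", "INIT", 0, t, 6),       # e3 cancelled after execution
    Event("fill", "e4", "ACME", 10, datetime(2024, 3, 2, 9, 0), 7),  # next day
]
books = reconcile(events, day)
warehouse: dict = {}
upsert(warehouse, day, books)
upsert(warehouse, day, books)                   # re-running changes nothing
```

Running the job twice leaves the warehouse unchanged, which is the property that makes a re-run after a late correction safe.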

---

### The shape that fits

> **Scale and Cost**
>
> The 2 billion fills a day do not arrive evenly: volume concentrates around the open and the close. The streaming path is sized for that peak and is the steady-cost component. The batch reconciliation is one heavy daily pass over the full log; replaying the entire day rather than an incremental window is the deliberate cost you pay so corrections are never missed. The dedup index on execution id is what keeps the exactly-once guarantee cheap: each write is a keyed upsert, not a full re-scan.
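
A back-of-the-envelope calculation shows why the streaming path is sized for the peak rather than the average. Only the 2 billion figure comes from the problem; the 40 percent share and the two 30-minute windows are hypothetical numbers chosen for illustration.

```python
EVENTS_PER_DAY = 2_000_000_000            # from the problem statement
SECONDS_PER_DAY = 24 * 60 * 60

# Average if fills were spread evenly over 24 hours.
average_rate = EVENTS_PER_DAY / SECONDS_PER_DAY           # about 23,148 events/s

# Hypothetical shape: 40% of the day's volume lands in the first and last
# 30 minutes of the session combined (3,600 seconds in total).
PEAK_SHARE = 0.40
PEAK_SECONDS = 2 * 30 * 60
peak_rate = EVENTS_PER_DAY * PEAK_SHARE / PEAK_SECONDS    # about 222,222 events/s

peak_to_average = peak_rate / average_rate                # about 9.6x
```

Under these assumed numbers the peak is roughly ten times the daily average, so capacity planned from the average alone would lag exactly when risk most needs the dashboard.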

> **Interviewers Watch For**
>
> The strong signal is naming which path owns correctness. A senior candidate says the intraday numbers are approximate by design and the books are the exact, filed record, and never lets reporting read the streaming aggregate. They also anchor aggregations on execution time, not arrival time, and dedup on a stable id so retries and amendments are handled, not assumed away.

> **Common Pitfall**
>
> Reusing the intraday streaming aggregate as the end-of-day report. It looks elegant and ships fast, then a producer retry double-counts a fill the regulators see twice, and a correction that arrives after the close never lands. The fix is a separate batch that replays the durable log and dedups, which is exactly the work the single-path design skipped.
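
The double-count is easy to reproduce. In this small sketch the tuples and function names are made up for illustration; the only thing taken from the text is that a producer retry resends a fill under the same execution id.

```python
# (execution_id, symbol, signed quantity); "e1" appears twice because the
# producer retried the send.
arrivals = [("e1", "ACME", 100), ("e1", "ACME", 100), ("e2", "ACME", -40)]


def naive_total(events) -> int:
    """What a single-path aggregate reports: every arrival is counted."""
    return sum(qty for _, _, qty in events)


def deduped_total(events) -> int:
    """What the books report: one quantity per execution id."""
    seen: dict[str, int] = {}
    for exec_id, _, qty in events:
        seen.setdefault(exec_id, qty)   # keep the first arrival, drop retries
    return sum(seen.values())


streamed = naive_total(arrivals)      # 160: the retry is counted twice
settled = deduped_total(arrivals)     # 60: the retry is ignored
```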

---

## Common follow-up questions

- A correction for a trade arrives three hours after the close, after the books were already submitted. What does the design do? _(Tests whether the candidate treats the books as restateable: a re-run of the reconciliation off the log produces an amended filing with an audit trail, rather than an in-place edit nobody can reconstruct.)_
- Volume doubles after an acquisition and the intraday path starts lagging during the open. What do you change, and does it touch the books? _(Tests whether the candidate scales the streaming path independently (more partitions, more parallelism) while recognizing the batch reconciliation is decoupled and unaffected.)_
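
For the first follow-up, one way to keep the books restateable is to store each filing as a new version instead of editing the previous one. The sketch below assumes the reconciliation has already produced a symbol-to-quantity mapping; the `Filing` record, the `restate` and `diff` helpers, and the reason strings are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Filing:
    version: int
    books: dict     # symbol -> net quantity
    reason: str     # why this version exists, kept for the audit trail


def restate(history: list[Filing], new_books: dict, reason: str) -> list[Filing]:
    """Append a new version if the books changed; never edit an earlier one."""
    if history and history[-1].books == new_books:
        return history                  # re-run found nothing new
    version = history[-1].version + 1 if history else 1
    return history + [Filing(version, dict(new_books), reason)]


def diff(old: Filing, new: Filing) -> dict:
    """Symbols whose net quantity changed, as (old, new) pairs."""
    symbols = old.books.keys() | new.books.keys()
    return {
        s: (old.books.get(s, 0), new.books.get(s, 0))
        for s in symbols
        if old.books.get(s, 0) != new.books.get(s, 0)
    }


history = restate([], {"ACME": 70}, "initial end-of-day run")
history = restate(history, {"ACME": 60}, "late correction after close")
history = restate(history, {"ACME": 60}, "scheduled re-run")   # no change
changes = diff(history[0], history[1])
```

Because earlier versions are never modified, anyone can reconstruct what was submitted and what the amendment changed.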

---

Source: DataDriven (https://datadriven.io).