# Held to Account

Canonical URL: <https://datadriven.io/problems/fidelity-brokerage-book-of-record>

Domain: Data Modeling · Difficulty: medium · Seniority: junior

## Problem

Fidelity's brokerage platform needs a book of record where compliance can reconstruct what any account held on any past trading day, and where one customer may own several accounts while an account may be jointly held by several customers. Design the model so a trade's executed price stays fixed for all time even as the security's quoted price keeps moving.

## Worked solution and explanation

### Why this problem exists in real interviews

This is a point-in-time reconstruction problem wearing a brokerage costume. The real question is whether you can answer 'what did this account hold on a Tuesday three years ago' without replaying every trade, while keeping joint ownership and a frozen executed price honest. Anyone can draw a customers table and a trades table. The trap is reaching for the trade ledger to rebuild holdings and reaching for the security's current quote to value them: do both and your compliance report silently restates history every time a price ticks or a co-owner is added.

> **Trick to solving**
>
> Three signals point at three different structures, and a strong candidate names all three before drawing a single box. 'Jointly held by several customers' is a many-to-many bridge. 'As of any past trading day, fast' is a periodic snapshot fact, not a sum over the ledger. 'Executed price stays fixed as quotes move' is a measure snapshotted onto the trade fact, divorced from the security dimension.

---

### Break down the requirements

#### Step 1: Model joint ownership as a bridge

A customer owns many accounts and an account is owned by many customers. A single owner_id on dim_accounts cannot express the joint case, so ownership becomes account_owners, one row per (customer, account). The account loses its owner column and gains a relationship.

#### Step 2: Declare the trade grain and freeze the price

fact_trades is one row per executed fill: one account, one security, one moment, with quantity and executed_price captured at that moment. That executed_price is a snapshot, not a lookup. The instant you compute trade value by joining to today's quote, every historical statement becomes wrong.

#### Step 3: Add a snapshot fact for point-in-time holdings

Compliance needs end-of-day holdings for any past date in milliseconds. Summing the entire trade history per query does not scale to tens of millions of trades a day. A daily periodic snapshot, one row per account per security per trading day, makes the as-of-date read a single indexed lookup.

#### Step 4: Separate the drifting quote from the facts

The security's quoted price changes all day. It belongs to dim_securities (current) or a price-history table (intraday), never copied as a mutable column the trade or position facts read at query time. Position market value gets snapshotted into fact_position_snapshots using the close-of-day quote for that date.

---

### The reference model

Below is one defensible design. The anchor is two facts at two grains sharing conformed dimensions: fact_trades is the event ledger with a frozen executed price, and fact_position_snapshots is the as-of-date book of record. Joint ownership lives in the account_owners bridge.

> **How it scales**
>
> At 30M accounts the snapshot fact is the heavy table: even at a few positions per account it is hundreds of millions of rows per day, so partition it by date_key and it prunes to a single day per point-in-time query. fact_trades grows by tens of millions of rows daily but is only scanned for trade-history reads, not for holdings. The bridge stays tiny (a handful of owners per account) and joins cheaply.

> **Interviewers watch for**
>
> The candidate who says 'positions are semi-additive: I can sum across securities on one date but never across dates' has done this before. So has the one who refuses to put owner_id on dim_accounts the moment joint accounts are mentioned, and who freezes executed_price on the fact instead of joining to the quote.

> **Common pitfall**
>
> Deriving holdings by summing fact_trades on every query. It is correct on paper and a disaster in practice: each compliance read scans years of ledger, and the moment a back-dated correction lands, every prior statement silently changes. The snapshot fact exists precisely to make the historical book immutable and the read O(1).

---

### Trade-offs and alternatives

**Snapshot fact for holdings**

fact_position_snapshots holds one row per account/security/day.

* As-of-date read is one indexed, partition-pruned lookup
* History is immutable; corrections create new dated rows
* Costs storage: positions written every trading day

**Derive holdings from the trade ledger**

Sum fact_trades up to the as-of date on demand.

* No extra storage, single source of truth
* Every read scans years of trades and slows with volume
* Back-dated corrections silently restate every prior statement

---

## Common follow-up questions

- A trade is busted (cancelled) two days after execution. How does your model reflect that without rewriting the original snapshot? _(Tests whether corrections are append-only reversing events plus a new snapshot rather than destructive updates.)_
- How would you support intraday position views, not just end-of-day? _(Tests whether the candidate adds a finer snapshot grain or a price-history table rather than abusing the daily fact.)_
- Ownership of an account changes when a co-owner is added mid-year. How do point-in-time ownership queries stay correct? _(Tests effective-dating the account_owners bridge versus a simple current-state bridge.)_
- At this scale, how do you partition and cluster fact_position_snapshots for the seven-year compliance window? _(Tests partitioning by date_key and pruning strategy for point-in-time reads.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/fidelity-brokerage-book-of-record)
- [Data Modeling Interview Questions](https://datadriven.io/data-modeling-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.