# Insurance Claims Lifecycle

> A claim gets filed. Then it gets complicated. Then it gets reassigned. Then it loops back.

Canonical URL: <https://datadriven.io/problems/insurance_claims_lifecycle>

Domain: Data Modeling · Difficulty: hard · Seniority: L5

## Problem

We need a data model for our insurance claims system. A claim goes through many stages: filed, assigned, investigated, approved or denied, then paid out. We need to track the full lifecycle with version history for audit. Design it.

## Worked solution and explanation

### Why this problem exists in real interviews

This probes whether a candidate understands **event sourcing as an audit pattern** and can combine it with a Type 2 dimension for point-in-time reconstruction. Regulated domains demand immutable history and reversible state, which is a stricter bar than ordinary OLTP modeling.

> **Trick to Solving**
>
> When a prompt says “track the full lifecycle with versioning,” the trick is to recognize you have two shapes of state: the current-state dimension and the immutable event log. Before drawing tables, a strong candidate asks: who audits this, what counts as tamper-evident, and do we need point-in-time reconstruction?
> 
> 1. Build claim_events as the append-only source of truth
> 2. Project claims as a Type 2 dimension for fast point-in-time reads
> 3. Split claim_payments as its own 1:M fact
> 4. Treat adjuster assignment as temporal, not a current-state FK

---

### Break down the requirements

#### Step 1: Declare the event grain

One row in `claim_events` equals one state transition on one claim at one point in time. No updates, no deletes. This is the audit substrate.

#### Step 2: Model claims as a Type 2 dimension

`claims` carries mutable attributes (status, amount, current adjuster) versioned by `valid_from` and `valid_to`. Point-in-time queries use BETWEEN on those columns.

#### Step 3: Separate payments from claims

A single claim can pay out in installments. `claim_payments` is a child fact with its own grain (one row per disbursement), not a column on claims.

#### Step 4: Track adjuster assignment temporally

If an adjuster changes hands three times, the event log captures it and the Type 2 claims row reflects the current holder. Reassignment is never a silent UPDATE.

---

### The solution

Below is one conceptually sound design. The anchor is that claim_events is the system of record and the claims table is a derived projection, which is how you reconcile fast reads with strict auditability.

> **Why this works**
>
> Event sourcing gives you legal-grade auditability and replay. The Type 2 projection gives analysts a fast point-in-time read without reconstructing state from scratch. Payments as a separate entity cleanly handles installment disbursements.

> **Interviewers watch for**
>
> Strong candidates name “event sourcing,” “point-in-time reconstruction,” and “immutable log” without being prompted. They reject any design that mutates `claim_events` and they explain how the Type 2 projection is rebuilt from the log on demand.

> **Common pitfall**
>
> Using UPDATE on a `claims` row to change status and amount in place. The resulting history is unrecoverable and fails even a casual audit. A second pitfall is bolting payments onto claims as a `paid_amount` column, which collapses the 1:M.

---

### The analysis pattern

**Claim state as of an arbitrary date**

```sql
SELECT
    c.claim_id,
    c.status,
    c.claim_amount,
    a.full_name AS adjuster
FROM claims c
JOIN dim_adjusters a ON a.adjuster_id = c.current_adjuster_id
WHERE c.claim_id = 90125
  AND TIMESTAMP '2026-01-15 00:00:00+00' >= c.valid_from
  AND (c.valid_to IS NULL OR TIMESTAMP '2026-01-15 00:00:00+00' < c.valid_to)
```

---

### Trade-offs and alternatives

**Event log + Type 2 projection**

claim_events is source of truth; claims is a versioned projection.

* Fast point-in-time reads without replay
* Full audit trail preserved
* Two things to keep consistent

**Append-only event log only**

Everything derives from claim_events at query time.

* One source of truth, always
* Reconstructions are expensive on hot paths
* Analyst queries must window over the event stream

---

## Common follow-up questions

- How would you support legal hold where some claims cannot be touched for seven years? _(Tests whether the candidate separates retention policy from schema and understands tombstoned projections.)_
- A claim gets a retroactive amount correction after payment. How is that modeled? _(Tests whether corrections are append events with reverse/forward pairs, not destructive updates.)_
- How do you compute the average time from filed to paid without replaying events? _(Tests whether milestone timestamps are projected onto claims or derived from a materialized timeline view.)_
- Two regions must not share adjuster data due to regulation. Where is that enforced? _(Tests whether isolation lives in the schema (region column and row-level security) or downstream.)_
- How would the schema support an external auditor replaying state as of any minute in the last five years? _(Tests the boundary between replayable event log and pre-projected snapshots.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/insurance_claims_lifecycle)
- [Data Modeling Interview Questions](https://datadriven.io/data-modeling-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.