# Thirty Countries, One Solvency Number

> Premiums collected globally. Losses happen locally.

Canonical URL: <https://datadriven.io/problems/thirty_countries_one_solvency_number>

Domain: Pipeline Design · Difficulty: hard · Seniority: L6

## Problem

We are a global insurer with business units across 30 countries, each running its own policy and claims systems. The group actuarial team needs a consolidated view of global premium writings and loss events to calculate solvency capital requirements and set reserve targets. Design a data ingestion platform that collects this data from all business units and makes it available for actuarial analysis within hours of being written.

## Worked solution and explanation

### Why this problem exists in real interviews

Thirty business units, thirty source systems, one quarter-close. The trap is treating it as 'pull from each system, dump in a warehouse, group actuarial sorts it out.' Group actuarial can't sort out thirty shapes; the canonical schema has to be the contract every BU's data conforms to, the FX semantics have to be locked at transaction time, and the volume-drop alert has to fire before the quarter-close, not at it.

The whiteboard answer is one big extraction job that pulls all 30 BUs and lands raw data in the warehouse for actuaries to query. The slowest BU pushes the close past its window. Each BU's data has different field names, currencies, and granularity; actuaries spend a week stitching the canonical shape by hand. A historical premium gets revalued at today's FX every time the warehouse rebuilds. A BU's volume drops sharply one Tuesday and the CRO finds out at quarter close, six weeks later.

> **Trick to Solving**
>
> Per-BU ingest with sensors, canonical schema enforced before the warehouse, FX at transaction time on the row, volume-drop alerts that fire in hours.
>
> 1. An orchestrator schedules each BU's ingest as its own task with a per-BU sensor; group actuarial sees per-BU progress on a dashboard, not just a final pass/fail.
> 2. Every BU's data is mapped to a canonical premium and claims schema before it lands in the warehouse. Group actuarial reads the canonical view; the source-shape problem is upstream.
> 3. Each transaction stores the original local amount and the converted amount using the FX rate for that transaction's date. A historical query never re-converts.
> 4. Per-BU volume baselines are tracked daily; a sharp deviation triggers an alert in hours, not quarterly.

---

### Walk the requirements

#### Step 1: Per-BU ingest with sensors so the close has visibility

An orchestrator runs one ingest task per BU on its own schedule, with a sensor that raises a warning ahead of the BU's expected landing time when the data is at risk of arriving late. Group actuarial's dashboard shows per-BU status: which BUs have landed, which are late, and how late. The close still depends on the slowest BU, but the orchestrator names which BU it is. Without an orchestration layer there's nothing watching per-BU SLAs; without a warehouse tier the canonical view has nowhere to live.
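
The per-BU status logic can be sketched in plain Python, independent of any particular orchestrator. Everything named here (`BuIngest`, `bu_status`, `close_dashboard`, the one-hour `warn_window`) is a hypothetical illustration, not part of the problem statement; a real deployment would express the same checks as sensors in whichever orchestrator the team runs.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class BuIngest:
    bu: str                        # business unit code, e.g. "DE" (illustrative)
    expected_by: datetime          # assumed landing deadline for this BU
    landed_at: Optional[datetime]  # None until the BU's data has landed


def bu_status(
    ingest: BuIngest,
    now: datetime,
    warn_window: timedelta = timedelta(hours=1),  # assumed "at risk" lead time
) -> str:
    """Classify one BU so the dashboard can name the slow one."""
    if ingest.landed_at is not None:
        on_time = ingest.landed_at <= ingest.expected_by
        return "landed" if on_time else "landed_late"
    if now > ingest.expected_by:
        return "late"
    if ingest.expected_by - now <= warn_window:
        return "at_risk"  # fires before the deadline, not after it
    return "pending"


def close_dashboard(ingests: Iterable[BuIngest], now: datetime) -> Dict[str, str]:
    """Per-BU status map instead of a single pass/fail for the whole close."""
    return {i.bu: bu_status(i, now) for i in ingests}
```

The point of the sketch is the return type: a status per BU, so a late close comes with a name attached.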

#### Step 2: Canonical schema enforced before the warehouse

Each BU's data maps to one canonical premium and claims schema before it lands in the warehouse. The mapping lives in a per-BU adapter; the canonical contract is the same for every BU. Group actuarial reads the canonical fact, not thirty source shapes. A 'we'll harmonize in actuarial's notebook' approach is the version where the close takes a week of glue work; canonical-up-front is what makes group actuarial's life possible.
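
As a minimal sketch of the adapter idea, assume each BU hands over rows as dictionaries with its own field names. The source field names (`vertrag_nr`, `praemie`, `policy_no`, `premium_yen`), the two example BUs, and the `CanonicalPremium` fields are all invented for illustration; the real canonical contract would be agreed with group actuarial.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal
from typing import Any, Callable, Dict, Mapping


@dataclass(frozen=True)
class CanonicalPremium:
    """The one shape every BU's premium rows must conform to (assumed fields)."""
    bu: str
    policy_id: str
    written_date: date
    currency: str
    local_amount: Decimal


Row = Mapping[str, Any]


def adapt_de(row: Row) -> CanonicalPremium:
    # Hypothetical German source system: its own field names, always EUR.
    return CanonicalPremium(
        bu="DE",
        policy_id=str(row["vertrag_nr"]),
        written_date=date.fromisoformat(row["datum"]),
        currency="EUR",
        local_amount=Decimal(row["praemie"]),
    )


def adapt_jp(row: Row) -> CanonicalPremium:
    # Hypothetical Japanese source system: different names, always JPY.
    return CanonicalPremium(
        bu="JP",
        policy_id=str(row["policy_no"]),
        written_date=date.fromisoformat(row["issued"]),
        currency="JPY",
        local_amount=Decimal(row["premium_yen"]),
    )


# One adapter per BU; the canonical contract is identical for all of them.
ADAPTERS: Dict[str, Callable[[Row], CanonicalPremium]] = {
    "DE": adapt_de,
    "JP": adapt_jp,
}


def to_canonical(bu: str, row: Row) -> CanonicalPremium:
    """Map a source row to the canonical shape, or fail before the warehouse."""
    if bu not in ADAPTERS:
        raise KeyError(f"no adapter registered for BU {bu!r}")
    return ADAPTERS[bu](row)
```

A row that is missing a required source field raises inside its adapter, so malformed data is rejected before it lands rather than discovered in an actuary's notebook.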

#### Step 3: FX at transaction time, stored alongside the local amount

A historical premium has to be valued at the FX rate in effect when it was written. Each row stores the original local amount, the FX rate for the transaction date, and the converted amount; the warehouse never re-derives at query time. A 'compute USD on the fly using current FX' approach silently rewrites every historical premium every time the rate changes; locking the rate on the row is the version that keeps history stable.
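
A rough sketch of locking the rate on the row, assuming USD as the group currency and a rate table keyed by currency and date. The `PremiumFact` fields, the `lock_fx` helper, and the rate value are hypothetical; the two-decimal rounding rule is also an assumption.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal, ROUND_HALF_UP
from typing import Mapping, Tuple


@dataclass(frozen=True)
class PremiumFact:
    policy_id: str
    written_date: date
    currency: str
    local_amount: Decimal    # original amount, never overwritten
    fx_rate_to_usd: Decimal  # rate for written_date, frozen on the row
    usd_amount: Decimal      # converted once at ingest, never re-derived


def lock_fx(
    policy_id: str,
    written_date: date,
    currency: str,
    local_amount: Decimal,
    rates: Mapping[Tuple[str, date], Decimal],
) -> PremiumFact:
    """Convert using the rate for the transaction date and store all three values."""
    # A missing rate raises KeyError on purpose: failing the load is safer
    # than silently falling back to today's rate.
    rate = rates[(currency, written_date)]
    usd = (local_amount * rate).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    return PremiumFact(policy_id, written_date, currency, local_amount, rate, usd)
```

Because the rate and the converted amount are stored values rather than query-time expressions, a warehouse rebuild reproduces the same historical numbers regardless of where the rate sits today.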

#### Step 4: Per-BU volume baseline with hours-not-quarters alerting

Each BU has a daily volume baseline; a sharp deviation triggers an alert within hours, not at quarter close. The alert is per-BU so the CRO can name which BU dropped, not 'something is wrong somewhere.' The requirement exists because the volume drop used to surface as a quarterly reconciliation finding; the alert has to fire in hours because that's when the business can still act.
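
One simple way to express the baseline check is a median over recent daily counts with a ratio threshold. This is a sketch under stated assumptions: the `volume_alert` name, the median as the baseline statistic, and the 50% `drop_threshold` are illustrative choices, and it only covers drops, not spikes. A production baseline would likely account for weekdays, holidays, and seasonality.

```python
from statistics import median
from typing import Optional, Sequence


def volume_alert(
    bu: str,
    history: Sequence[int],       # recent daily row counts for this BU
    today: int,                   # today's row count for this BU
    drop_threshold: float = 0.5,  # assumed: alert below 50% of baseline
) -> Optional[str]:
    """Return an alert message naming the BU, or None if volume looks normal."""
    if not history:
        return None  # no baseline yet, e.g. a newly onboarded BU
    baseline = median(history)
    if baseline <= 0:
        return None  # nothing meaningful to compare against
    ratio = today / baseline
    if ratio < drop_threshold:
        return f"{bu}: volume {today} is {ratio:.0%} of baseline {baseline:.0f}"
    return None
```

Running this check once per BU after each daily load is what turns a quarter-close surprise into a same-day message that names the BU.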

---

### The shape that fits

> **What this design gives up**
>
> Per-BU ingest with adapters is more orchestration than one extraction job. The canonical schema is upfront work that pays off slowly. Storing the local amount, the FX rate, and the converted amount on every row widens the financial columns. Per-BU volume baselines have to be maintained and tuned. Speed-to-first-warehouse is the cost; what arrives is a close that lands on time, a canonical view actuaries can reason about, history that doesn't drift on FX changes, and volume-drop alerts that catch issues in hours.

> **What reviewers check**
>
> A reviewer looks at the canvas for these properties:
> - An orchestrator schedules per-BU ingest with sensors and alerts before the quarter-close window.
> - Premium and claims facts hold the original local amount alongside the FX-converted amount at transaction time.

> **The mistake that ships**
>
> The build that ships pulls all 30 BUs into one nightly job, dumps raw data into the warehouse, and lets actuarial harmonize in notebooks. The close slips because of the slowest BU and nobody can name which one. A historical premium gets re-converted at today's FX every rebuild, so last year's quarterly numbers shift with the dollar. A BU's volume drops sharply on a Tuesday and the CRO finds out at quarter close. The team rebuilds with per-BU ingest, a canonical schema, FX-on-the-row, and volume baselines. The next quarter close runs through the rebuild; the close after that is when the design starts paying off.

---

## Common follow-up questions

- A new BU is acquired in a country we haven't operated in before, with its own policy system. What in this design lets you onboard it without rewriting the warehouse? _(Tests whether the candidate sees the per-BU adapter as the extension point: a new adapter mapping the new system to the canonical schema, a new sensor in the orchestrator, a new FX line in the rate table. The canonical fact, the warehouse, and the consumers don't change.)_
- Group actuarial wants the close to use end-of-quarter FX for everything, not transaction-time FX. What changes, and what doesn't? _(Tests whether the candidate sees that the row keeps both the local and converted amounts; the close-time FX is a separate calculation against the local amounts using the quarter-end rate. Both views are queryable from the same table because the local amount and the FX rate are both retained.)_
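
For the quarter-end FX follow-up, a small sketch shows why both views come from the same rows. It assumes each fact row exposes `currency`, `local_amount`, and `usd_amount`, and that quarter-end rates are supplied per currency; the function names and rate values are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP
from typing import Iterable, Mapping

FactRow = Mapping[str, object]  # assumed keys: currency, local_amount, usd_amount


def transaction_time_total_usd(rows: Iterable[FactRow]) -> Decimal:
    """Sum the converted amounts that were locked on each row at ingest."""
    return sum((row["usd_amount"] for row in rows), Decimal("0"))


def quarter_end_total_usd(
    rows: Iterable[FactRow],
    quarter_end_rates: Mapping[str, Decimal],  # currency -> rate at quarter end
) -> Decimal:
    """Revalue the retained local amounts at one quarter-end rate per currency."""
    total = Decimal("0")
    for row in rows:
        total += row["local_amount"] * quarter_end_rates[row["currency"]]
    return total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```

Neither function modifies the stored rows: the quarter-end figure is a separate calculation over the local amounts, and the transaction-time figure stays exactly as it was written.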

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/thirty_countries_one_solvency_number)
- [System Design Interview Questions](https://datadriven.io/data-engineering-system-design)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.