# Employee Transfer Tracking System

> People switch teams. HR loses track.

Canonical URL: <https://datadriven.io/problems/employee_transfer_tracking_system>

Domain: Data Modeling · Difficulty: medium · Seniority: L5

## Problem

We're a large tech company with 80,000 employees across 30 offices. People transfer between departments, change managers, and relocate to different offices. HR currently stores everything in a single employee table and loses history every time someone moves. Can you design a schema that tracks the full movement history?

## Worked solution and explanation

### Why this problem exists in real interviews

This probes whether a candidate recognizes the classic **SCD Type 2 on the employee dimension** pattern and can model **atomic transfers that change multiple attributes at once**. HR data is the textbook use case, and a single-row-per-employee design fails the moment anyone asks 'who managed this team last quarter?'

> **Trick to Solving**
>
> Before drawing tables, a strong candidate asks: do we need to answer questions as-of a historical date, and is a transfer a single event or multiple separate attribute changes? The signal is 'losing history on every move,' which maps straight to Type 2 employee assignments.
> 
> 1. Keep `employees` as the identity table
> 2. Put mutable attributes (department, manager, office) on `employee_assignments` with effective dates
> 3. Record each transfer as one atomic row in `transfers`
> 4. Self-reference departments for the org hierarchy

---

### Break down the requirements

#### Step 1: Separate identity from assignment

`employees` holds stable attributes (employee_id, name, hire_date). Everything that changes (department, manager, location) lives on `employee_assignments` with effective dates.

#### Step 2: Make assignments Type 2

Each row carries `effective_from` and `effective_to`, with NULL on the current row. A partial unique index on `(employee_id) WHERE effective_to IS NULL` guarantees exactly one current assignment per person.

#### Step 3: Record transfers atomically

A transfer can change department, manager, and location together. `transfers` is one row per transfer event with the old and new assignment keys. That guarantees the three attribute changes are correlated rather than appearing as three unrelated history rows.

#### Step 4: Self-reference departments

An org chart is a hierarchy. `departments` carries `parent_department_id` pointing back at itself, which supports rollups like 'all engineering headcount' without a bridge table.

#### Step 5: Keep locations as a separate dimension

Office location changes independently of department (remote reorgs, office closures). A dedicated `locations` dimension lets location Type 2 evolve on its own cadence.

---

### The solution

Below is one conceptually sound approach: identity on `employees`, mutable context on `employee_assignments` (Type 2), and a transfer event log that ties the changes together.

> **Why this works**
>
> Identity is stable; context is temporal. The assignment table answers 'what was true as of a date,' and the transfer table answers 'why did it change.' The trade-off is extra joins on every report in exchange for a reproducible response to every historical question.

> **Interviewers watch for**
>
> Strong candidates separate identity from assignment without being told. They also propose the partial unique index on current rows. Weak candidates bolt a `history` JSON column onto `employees` and then cannot explain how to query for the state on a specific date.

> **Common pitfall**
>
> Using Type 2 on every attribute independently, so a department change creates a row, a manager change creates a different row, and the two look uncorrelated. A transfer event ties them together as one correlated change.

---

### The analysis pattern

**Headcount by department as of a historical date**

```sql
SELECT
    d.department_name,
    COUNT(*) AS headcount
FROM employee_assignments ea
JOIN departments d ON d.department_id = ea.department_id
WHERE ea.effective_from <= '2024-01-01'
  AND (ea.effective_to IS NULL OR ea.effective_to > '2024-01-01')
GROUP BY d.department_name
ORDER BY headcount DESC
```

---

### Trade-offs and alternatives

**Type 2 assignments plus transfer log**

Familiar Kimball pattern, easy point-in-time queries, BI tools handle it natively. Cost: two writes per transfer (close old, insert new) and join cost on every report.

**Append-only HR event stream**

Raw event log of every HR change, projections built downstream. Cost: more infrastructure and analysts need projection tables to answer simple questions; buys replay and audit.

---

## Common follow-up questions

- How do you represent a promotion that changes title but nothing else? _(Tests whether title lives on employee_assignments or on its own history table.)_
- How would you handle a retroactive correction to a transfer date? _(Tests whether corrections rewrite the assignment rows or append a correction event.)_
- What if two employees switch positions simultaneously? _(Tests whether the transfer event model handles atomic multi-employee swaps.)_
- How do you compute tenure in current role without scanning the full history? _(Tests whether the candidate queries `effective_from` on the current assignment.)_
- At 80k employees with monthly reorgs, how do you keep the department hierarchy query cheap? _(Tests materialized path or closure table for the self-referential departments tree.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/employee_transfer_tracking_system)
- [Data Modeling Interview Questions](https://datadriven.io/data-modeling-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.