# Model Accuracy Drift

> Accuracy used to be higher.

Canonical URL: <https://datadriven.io/problems/model_accuracy_drift>

Domain: SQL · Difficulty: hard · Seniority: L5

## Problem

We track model versions and their accuracy over time. For each model, calculate the difference between the accuracy of its most recent version and the average accuracy across all previous versions, rounded to 2 decimal places. If a model has only one version, return 0 for the difference. Show model name, average lifetime accuracy, latest accuracy, and the difference.

## Worked solution and explanation

### Why this problem exists in real interviews

Extracting insights from ml_models.mdl_name grouped by version via row numbering and grouping is the central task. It is used in senior-level rounds to test whether you pick the right aggregation function and partition boundary on the first attempt.

> **Trick to Solving**
>
> Mode requires finding the most frequent value. The trick is combining `COUNT` with `ORDER BY DESC LIMIT 1`.
> 
> 1. Group by the target column and count occurrences
> 2. Order by count descending
> 3. Handle ties if the prompt requires it

---

### Break down the requirements

#### Step 1: Structure the query as multi-step CTEs

This solution uses 2 CTEs to break the logic into readable stages. Each CTE produces an intermediate result that feeds the next.

#### Step 2: Self-join the table

Join `ml_models` to itself to compare or pair rows within the same table. Use an inequality condition to avoid duplicate pairs.

#### Step 3: Filter to the target rows

Apply the `WHERE` filter to restrict the working set before aggregation. Filtering early reduces the number of rows that downstream operations process.

#### Step 4: Assign row numbers for deduplication

`ROW_NUMBER() OVER (PARTITION BY ... ORDER BY ... DESC)` tags each row within its group. The outer query filters to `rn = 1` to keep only the target row.

#### Step 5: Aggregate with AVG

Group by the output grain and apply `AVG()` to compute the metric. The `GROUP BY` must match exactly what the output needs: one row per group key.

#### Step 6: Handle NULLs with COALESCE

`COALESCE(expr, 0)` replaces NULL with a default value. This prevents NULL propagation from silently corrupting sums, counts, or display values.

#### Step 7: Order the final output

Apply `ORDER BY` as specified to produce the expected row sequence. When tied values exist, add a secondary sort column for determinism.

---

### The solution

**Latest-vs-historical comparison with LEFT JOIN fallback**

```sql
WITH latest AS (
    SELECT mdl_name, accuracy,
        ROW_NUMBER() OVER (PARTITION BY mdl_name ORDER BY train_at DESC) AS rn
    FROM ml_models
),
prior_avg AS (
    SELECT mdl_name, AVG(accuracy) AS avg_prior
    FROM ml_models m
    WHERE EXISTS (
        SELECT 1 FROM latest l
        WHERE l.mdl_name = m.mdl_name AND l.rn = 1
            AND m.train_at < (SELECT train_at FROM ml_models WHERE model_id = l.model_id)
    )
    GROUP BY mdl_name
)
SELECT l.mdl_name,
    COALESCE(ROUND(l.accuracy - p.avg_prior, 2), 0) AS accuracy_drift
FROM latest l
LEFT JOIN prior_avg p ON l.mdl_name = p.mdl_name
WHERE l.rn = 1
ORDER BY l.mdl_name
```

> **Cost Analysis**
>
> The query scans 5K rows from `ml_models`. CTEs in most engines are optimization fences. For production workloads, consider inlining or materializing the intermediate results.

> **Interviewers Watch For**
>
> Naming the output grain ("one row per X") before writing the GROUP BY shows you think about data shape, not just syntax. Explaining why `ROW_NUMBER` is preferred over `DISTINCT` for deduplication shows you understand the difference between collapsing and selecting.

> **Common Pitfall**
>
> Comparing dates stored as TEXT without casting can produce lexicographic instead of chronological ordering. Always confirm the column type.

---

## Common follow-up questions

- What happens to your result if ml_models.accuracy contains NULLs for some rows? _(Tests whether the candidate accounts for NULL behavior in aggregates and comparisons on accuracy.)_
- If two rows in ml_models have identical values in the ORDER BY columns, how does your ranking handle the tie? _(Tests understanding of RANK vs DENSE_RANK vs ROW_NUMBER tie-breaking behavior.)_
- If ml_models grows to hundreds of millions of rows, how would you partition or index on train_at to maintain performance? _(Tests partitioning strategy for time-series data in train_at.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/model_accuracy_drift)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.