# 90th Pctl Model Accuracy Gap

> Most models are fine. The bottom 10% are not.

Canonical URL: <https://datadriven.io/problems/90th_pctl_model_accuracy_gap>

Domain: SQL · Difficulty: medium · Seniority: L4

## Problem

The ML platform team is auditing model quality for the first half of 2026. They set a target accuracy of 0.95 and want to identify outlier models whose accuracy deviates most from that target. Compute the absolute gap between each model's accuracy and 0.95, then surface only the models in the 90th percentile of that gap. Return the model name, its accuracy, and the gap as a decimal value.

## Worked solution and explanation

### Why this problem exists in real interviews

ML platform teams use this to see if you pin down what 90th percentile means before writing. `PERCENTILE_CONT` interpolates, `PERCENTILE_DISC` returns a real row, `NTILE(10) = 9` returns a bucket. They differ on small samples. The probe is whether you'll ask which one the team actually wants instead of picking whichever you remember the syntax for.

---

### Break down the requirements

#### Step 1: Scope the audit window

Filter `ml_models` with `train_at BETWEEN '2026-01-01' AND '2026-06-30'`. H1 is inclusive on both ends. If `train_at` is a timestamp the upper bound silently drops June 30 evening rows, flag that out loud.

#### Step 2: Compute the gap

Use `ABS(accuracy - 0.95)` so models both above and below 0.95 contribute symmetrically. The prompt says deviates most from target, signed difference would hide a model at 0.99 that overshoots.

#### Step 3: Bucket into deciles

`NTILE(10) OVER (ORDER BY ABS(accuracy - 0.95))` assigns each model a 1 through 10 bucket. Decile 9 is the second-highest bucket, which on 1500 rows is roughly rank 1201 through 1350, the 80th to 90th percentile band.

#### Step 4: Filter and project

Wrap in a subquery, then `WHERE decile = 9`. Project `mdl_name`, `accuracy`, and `CAST(ABS(accuracy - 0.95) AS REAL) AS accuracy_gap`. Cast is defensive: integer division on the raw subtraction is not an issue here, but the contract requires a decimal.

---

### The solution

**NTILE DECILE GAP**

```sql
SELECT mdl_name, accuracy, CAST(ABS(accuracy - 0.95) AS REAL) AS accuracy_gap
FROM (
  SELECT mdl_name, accuracy, ABS(accuracy - 0.95) AS accuracy_diff,
    NTILE(10) OVER (ORDER BY ABS(accuracy - 0.95)) AS decile
  FROM ml_models
  WHERE train_at BETWEEN '2026-01-01' AND '2026-06-30'
)
WHERE decile = 9
```

> **Cost Analysis**
>
> 1500 rows is trivial. `NTILE` forces a full sort over the filtered set, no index helps the window. An index on `train_at` prunes the H1 scan. At 1.5M rows you would push the gap computation into a CTE so the sort runs on a thinner projection.

> **Interviewers Watch For**
>
> Ask which 90th percentile semantics the team wants. `PERCENTILE_CONT(0.9)` interpolates between two values and returns a synthetic number, `PERCENTILE_DISC(0.9)` returns an actual model's gap, `NTILE(10) = 9` returns a bucket of models. Each gives a different answer on 1500 rows.

> **Common Pitfall**
>
> Writing `WHERE decile >= 9` and returning the top 20 percent, or `WHERE decile = 10` thinking decile 10 means 90th percentile. `NTILE(10)` numbers buckets 1 through 10 in ascending order; bucket 9 is the 80 to 90 band, bucket 10 is 90 to 100.

> **The Elegant Move**
>
> Compute `ABS(accuracy - 0.95)` once in the subquery and reuse it via `accuracy_diff`. Recomputing in the outer SELECT works but invites a typo where the inner ORDER BY and outer projection drift.

---

### COMMON FOLLOW-UP QUESTIONS

## Common follow-up questions

- Rewrite this using `PERCENTILE_CONT` and explain when the answers diverge. _(Probes whether you understand the difference between bucket-based and interpolated percentile semantics.)_
- How would you compute the 90th percentile gap per framework instead of globally? _(Tests `PARTITION BY framework` on the NTILE window and the gotcha that small frameworks get coarse buckets.)_
- What if `accuracy` can be NULL for failed training runs? _(Probes whether you filter nulls before the window or risk NULL gaps sorting first and skewing the buckets.)_
- The team wants top 10 percent, not top 10 to 20 percent. What changes? _(Tests whether you switch to `decile = 10` or move to `PERCENT_RANK >= 0.9` for cleaner semantics.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/90th_pctl_model_accuracy_gap)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.