# Extremely Late Resolutions

> Twenty minutes past the SLA. Still unresolved.

Canonical URL: <https://datadriven.io/problems/extremely_late_resolutions>

Domain: SQL · Difficulty: medium · Seniority: L4

## Problem

Our alerting system logs resolution times. For each month, calculate the percentage of resolved alerts where resolution took more than 20 minutes beyond the predicted time. Format months as 'YYYY-MM'. Return the month and late percentage.

## Worked solution and explanation

### Why this problem exists in real interviews

This appears on interview slates because row numbering within partitions combined with nested subqueries on `alert_events` exposes whether a candidate thinks about edge cases in `svc_name`, `severity`, `status` or just writes mechanical queries.

---

### Break down the requirements

#### Step 1: Partition by `svc_name`

`PARTITION BY svc_name` creates groups. Within each group, `ORDER BY fired_at ASC` determines the ranking.

#### Step 2: Filter to rank 1

`WHERE rnk = 1` in the outer query selects the target row per group.

---

### The solution

**Row-number for extremely late resolutions**

```sql
SELECT *
FROM (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY svc_name ORDER BY fired_at ASC) AS rnk
    FROM alert_events
) ranked
WHERE rnk = 1
ORDER BY svc_name
```

> **Cost Analysis**
>
> Window function sorts within each `svc_name` partition. An index on `(svc_name, fired_at)` avoids a full sort.

> **Interviewers Watch For**
>
> The interviewer checks whether you use ROW_NUMBER (one row) vs. RANK/DENSE_RANK (ties) based on the prompt requirements.

> **Common Pitfall**
>
> Using GROUP BY with MIN(fired_at) gives the value but not the other columns. ROW_NUMBER gives the full row.

---

## Common follow-up questions

- The `severity` column in `alert_events` has roughly 3% NULLs. How does your query handle those rows, and would the result change if NULLs were replaced with zeros? _(Tests whether the candidate understands how NULLs propagate through aggregation functions and whether their WHERE/JOIN conditions implicitly filter them out.)_
- Your window function uses a default frame. What is the implicit frame, and would switching to ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW change anything? _(Tests knowledge of default window frames (RANGE vs ROWS) and when the distinction matters.)_
- `alert_id` in `alert_events` has ~25M distinct values. What index strategy keeps your query from doing a full table scan? _(Tests whether the candidate can design indexes for high-cardinality columns and understands selectivity.)_
- If `alert_id` in `alert_events` contained negative values, would your query still produce correct results? _(Tests whether the candidate validated assumptions about the domain of numeric columns.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/extremely_late_resolutions)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.