# Oldest Alert per Service

> The oldest unresolved alert per service.

Canonical URL: <https://datadriven.io/problems/oldest_alert_per_service>

Domain: SQL · Difficulty: hard · Seniority: L4

## Problem

During an incident review, the on-call team needs the oldest unresolved alert for each service. Show the service name, alert ID, severity, status, and the time the alert was fired, by service name.

## Worked solution and explanation

### Why this problem exists in real interviews

The interviewer wants to see you apply row numbering to alert_events.svc_name while accounting for the distribution of severity. This surfaces in senior-level rounds because small logic errors produce results that look correct at a glance.

---

### Break down the requirements

#### Step 1: Filter to unresolved alerts

`WHERE resolved IS NULL` keeps only alerts that have not been resolved.

#### Step 2: Rank by fired_at within each service

`ROW_NUMBER() OVER (PARTITION BY svc_name ORDER BY fired_at ASC)` assigns 1 to the oldest unresolved alert per service.

#### Step 3: Filter to rank 1

Wrap in a subquery and `WHERE rn = 1` to get one alert per service.

#### Step 4: Order by service name

`ORDER BY svc_name` for alphabetical output.

---

### The solution

**Window function for per-group oldest**

```sql
SELECT svc_name, alert_id, severity, status, fired_at
FROM (
    SELECT
        svc_name, alert_id, severity, status, fired_at,
        ROW_NUMBER() OVER (PARTITION BY svc_name ORDER BY fired_at ASC) AS rn
    FROM alert_events
    WHERE resolved IS NULL
) sub
WHERE rn = 1
ORDER BY svc_name
```

> **Cost Analysis**
>
> With `alert_events` (30,000,000 rows), the full scan reads significant data. A composite index on the filter columns pushes the predicate into the index layer. Pre-aggregation in a materialized view is worth considering at this scale.

> **Interviewers Watch For**
>
> Interviewers evaluate whether you translate the English requirements into the correct SQL clauses on the first attempt. They watch for clean syntax, correct column references, and whether you verify edge cases before declaring the query complete.

> **Common Pitfall**
>
> The most common mistake is misreading the prompt's filtering or grouping requirements. Double-check which columns to group by, which to aggregate, and whether the output should be filtered with `WHERE` (before grouping) or `HAVING` (after grouping).

---

## Common follow-up questions

- What happens to your result if alert_events.severity contains NULLs for some rows? _(Tests whether the candidate accounts for NULL behavior in aggregates and comparisons on severity.)_
- If two rows in alert_events have identical values in the ORDER BY columns, how does your ranking handle the tie? _(Tests understanding of RANK vs DENSE_RANK vs ROW_NUMBER tie-breaking behavior.)_
- With millions of distinct values in alert_events.alert_id, what index strategy would you use to keep this query performant? _(Tests indexing knowledge specific to high-cardinality columns like alert_id.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/oldest_alert_per_service)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.