# Top Regions by Critical Alerts

> Which regions have the highest volume of critical alerts

Canonical URL: <https://datadriven.io/problems/top_regions_by_critical_alerts>

Domain: SQL · Difficulty: medium · Seniority: L4

## Problem

During an incident postmortem, the on-call engineer needs to know which regions produce the most critical alerts. Use service health checks to map each service to its region, count critical-severity alerts per region, and return the top 5 regions by alert volume.

## Worked solution and explanation

### Why this problem exists in real interviews

On-call ranking questions look trivial until you write them. The interviewer is watching whether you pin down three things out loud before typing: is `severity` case-sensitive, what does 'top 5' mean when counts tie, and do `svc_health` rows with NULL `region` get a bucket or get dropped. Skip those and your answer is wrong before the first `SELECT`.

---

### Break down the requirements

#### Step 1: Resolve region via svc_health

Join `alert_events.svc_name` to `svc_health.svc_name`. One service can have many health checks, so the join fans out; you are counting alert rows, not distinct alerts, so dedupe `svc_health` on (`svc_name`, `region`) if a service spans regions.

#### Step 2: Filter critical severity defensively

The source has `'critical'` and `'Critical'` both observed. Use `severity IN ('critical', 'Critical')` or `LOWER(severity) = 'critical'`. Don't trust a single-case equality.

#### Step 3: Aggregate by region

`GROUP BY sh.region`, `COUNT(*) AS critical_count`. State out loud whether NULL region is its own bucket. Default here: keep it (matches `GROUP BY` behavior), surface it, let the interviewer veto.

#### Step 4: Top 5 with tie semantics

`LIMIT 5` returns exactly 5 rows and silently breaks ties. `RANK() OVER (ORDER BY COUNT(*) DESC) <= 5` keeps every region tied at rank 5. Pick `RANK` unless the interviewer says exactly five.

---

### The solution

**TOP REGIONS BY CRITICAL ALERTS**

```sql
SELECT region, critical_count
FROM (
  SELECT
    sh.region,
    COUNT(*) AS critical_count,
    RANK() OVER (ORDER BY COUNT(*) DESC) AS rnk
  FROM alert_events ae
  INNER JOIN svc_health sh
    ON ae.svc_name = sh.svc_name
  WHERE ae.severity IN ('critical', 'Critical')
  GROUP BY sh.region
) ranked
WHERE rnk <= 5
ORDER BY critical_count DESC;
```

> **Cost Analysis**
>
> `alert_events` is 10M rows, `svc_health` is 50M. Push the severity filter before the join so the 10M side shrinks to whatever fraction is critical, then hash-join on `svc_name`. Without an index on `alert_events.severity`, expect a full scan; that is still cheaper than joining first and filtering after.

> **Interviewers Watch For**
>
> Before writing, ask: 'Is severity case-normalized upstream, or do I defend against both casings?' and 'For top 5, do you want exactly 5 rows or every region tied at rank 5?' These two questions are the scored signal. Writing without asking marks you as someone who ships ambiguous metrics.

> **Common Pitfall**
>
> Joining first and counting `alert_events.alert_id` looks safe but double-counts: if `svc_health` has 7 rows for `payments` across 1 region, every payments alert counts 7 times. Either dedupe `svc_health` to one row per (`svc_name`, `region`) before the join, or use `COUNT(DISTINCT ae.alert_id)`.

---

### COMMON FOLLOW-UP QUESTIONS

## Common follow-up questions

- How would you change the query if a service can appear in multiple regions in svc_health? _(Probes whether you understand join fan-out and can articulate an explicit grain decision (alert per region vs alert per service).)_
- Rewrite this so ties at rank 5 still return only 5 rows, broken by region name. _(Tests `ROW_NUMBER` vs `RANK` vs `DENSE_RANK` and deterministic tiebreakers.)_
- If `region` is NULL for 8% of services, how should that appear in the output? _(Checks whether you treat NULL as a bucket, an error, or a filter, and whether you flag the choice.)_
- Same question but only alerts that fired in the last 24 hours: where does the predicate go? _(Probes filter pushdown awareness and whether `fired_at` lives in `WHERE` or a windowed subquery.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/top_regions_by_critical_alerts)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.