# Alert Hotspots by Service and Severity

> Some services and severities light up more than others.

Canonical URL: <https://datadriven.io/problems/alert_hotspots_by_service_and_severity>

Domain: SQL · Difficulty: easy · Seniority: L3

## Problem

The on-call team wants to identify which service and severity combinations generate the most noise. For each pairing, show the alert count, sorted from most alerts to fewest.

## Worked solution and explanation

### Why this problem exists in real interviews

Working with `alert_events`, this problem isolates conditional branching with CASE combined with division-safe NULLIF guards. The interviewer expects candidates to articulate why `svc_name`, `severity`, `status` matter for correctness before touching the keyboard.

> **Trick to Solving**
>
> Any rate or ratio problem requires **null-safe division**. If the denominator can be zero, the query crashes or returns NULL silently.
> 
> 1. Identify the numerator and denominator conditions
> 2. Use `SUM(CASE WHEN ... THEN 1 ELSE 0 END)` for the numerator
> 3. Wrap the denominator in `NULLIF(..., 0)` to prevent division by zero

---

### Break down the requirements

#### Step 1: Group by `severity`

`GROUP BY severity` produces one output row per distinct value of `severity`.

#### Step 2: Compute the ratio with CASE and NULLIF

The numerator uses `SUM(CASE WHEN condition THEN 1 ELSE 0 END)`. Wrapping the denominator in `NULLIF(COUNT(*), 0)` prevents division by zero.

#### Step 3: Round and order

Use `ROUND(..., 4)` for clean decimal output and sort by rate descending.

---

### The solution

**Case-branch for alert hotspots service severity**

```sql
SELECT
    severity,
    ROUND(
        1.0 * COUNT(CASE WHEN status = 'success' THEN 1 END)
        / NULLIF(COUNT(*), 0),
        4
    ) AS rate
FROM alert_events
GROUP BY severity
ORDER BY rate DESC
```

> **Cost Analysis**
>
> The main table has 500K rows. The GROUP BY reduces the row count early, keeping downstream operations cheap.

> **Interviewers Watch For**
>
> Strong candidates state the correct `GROUP BY` grain before writing any SQL, showing they think about the output shape first. Division-by-zero handling is a silent correctness bug; interviewers watch for `NULLIF` or equivalent protection.

> **Common Pitfall**
>
> Selecting a non-aggregated column without including it in `GROUP BY` is the most common error. Some engines reject it; others silently return arbitrary values.

---

## Common follow-up questions

- The `severity` column in `alert_events` has roughly 3% NULLs. How does your query handle those rows, and would the result change if NULLs were replaced with zeros? _(Tests whether the candidate understands how NULLs propagate through aggregation functions and whether their WHERE/JOIN conditions implicitly filter them out.)_
- Your CASE expression branches on `svc_name`. What happens if a new category value appears that none of your WHEN clauses match? _(Tests whether the candidate uses a meaningful ELSE branch or lets unmatched rows silently become NULL.)_
- The `svc_name` column in `alert_events` has a zipf distribution, meaning a few values dominate. How does that skew affect your query plan and parallelism? _(Tests understanding of data skew: the optimizer may choose a bad plan when histogram statistics are stale.)_
- Could you express this same logic as a single query without CTEs or subqueries? What readability trade-off does that introduce? _(Tests whether the candidate can flatten nested logic and understands when decomposition aids maintainability.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/alert_hotspots_by_service_and_severity)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.