# The Noise Floor

> Not every alert deserves a 3 a.m. page. Find the services where most of them claim they do.

Canonical URL: <https://datadriven.io/problems/the-noise-floor>

Domain: SQL · Difficulty: medium · Seniority: mid

## Problem

For each service, we want to know what percentage of its alerts are high or critical severity. Surface only the services where more than half of the alerts qualify, report that percentage rounded to one decimal, worst first with ties broken by service name.

## Worked solution and explanation

### What this is really asking

Strip the urgency-audit costume off and this is a per-group ratio with a casing trap baked into the numerator. The skill being probed: can you classify rows into high-urgency vs not, count both buckets in one pass, and filter groups on the resulting fraction. The thing that quietly sinks people is `severity`: the column holds 'critical', 'Critical', 'high', and 'HIGH' as separate strings. Compare against 'critical' literally and you drop every capitalized page, the share collapses, and a service that is genuinely on fire reads as quiet.

---

### Break down the requirements

#### Step 1: Normalize before you classify

Wrap severity in LOWER() and test membership in ('critical', 'high'). This is the one decision that determines whether your numerator is correct. Do it once and reuse the exact same expression everywhere the classification appears.

#### Step 2: Build the share in a single scan

SUM(CASE WHEN LOWER(severity) IN ('critical','high') THEN 1 ELSE 0 END) counts the urgent alerts; COUNT(*) counts all of them. The ratio of those two is the per-service share. No self-join, no subquery, one GROUP BY svc_name.

#### Step 3: Filter on the ratio, not the count

The threshold is 'more than half', so HAVING compares the urgent-over-total fraction to 0.5. Multiply by 1.0 (or 100.0) so the division is floating point; integer division would floor every share to 0 and silently return nothing.

#### Step 4: Report and order

Project the share as a readable percentage rounded to one decimal, then ORDER BY it descending so the worst offenders sit on top. Add svc_name as a tiebreak so two services with identical shares come back in a stable order, exactly as the expected output shows.

---

### The solution

**HIGH-URGENCY SHARE PER SERVICE**

```sql
SELECT
  svc_name,
  ROUND(100.0 * SUM(CASE WHEN LOWER(severity) IN ('critical', 'high') THEN 1 ELSE 0 END) / COUNT(*), 1) AS high_urgency_pct
FROM alert_events
GROUP BY svc_name
HAVING SUM(CASE WHEN LOWER(severity) IN ('critical', 'high') THEN 1 ELSE 0 END) * 1.0 / COUNT(*) > 0.5
ORDER BY high_urgency_pct DESC, svc_name
```

> **Cost Analysis**
>
> At interview scale this is a few hundred rows, but the shape matters at 2B alert rows: it is a single sequential scan that hash-aggregates by svc_name (a few dozen distinct services), so memory stays tiny and there is no join to blow up. LOWER(severity) in the CASE is a per-row function call, cheap here; if you partitioned the table by fired_at you would add a date predicate to prune, but with no window asked for, the full scan is the honest plan.

> **Interviewers Watch For**
>
> Whether you reach for LOWER() unprompted after glancing at the data, and whether the HAVING predicate operates on the fraction rather than a raw count. Saying out loud 'I'm normalizing case because severity is dirty' is the tell that you have actually looked at production alert data before.

> **Common Pitfall**
>
> Writing HAVING COUNT(*) > something, or HAVING on the urgent count alone. That answers 'which services have many urgent alerts', not 'which services are mostly urgent'. A chatty service with 400 alerts and 60 urgent ones (15 percent) would wrongly pass a count threshold while a quiet service that is 80 percent critical gets dropped. The ratio is the whole point.

---

## Common follow-up questions

- Only count alerts from the last 30 days. Where does the predicate go and why does it not belong in HAVING? _(Tests understanding that a row-level time filter belongs in WHERE (before grouping), and awareness of partition pruning on fired_at.)_
- Add a minimum-volume guard so a service with 2 alerts, both critical, does not show up as 100 percent. _(Pushes a second HAVING condition (COUNT(*) >= N) and a conversation about statistical noise in small samples.)_
- Now break the share out by severity level instead of a single high-urgency bucket. _(Probes conditional aggregation across multiple CASE buckets, or a pivot, while keeping one scan.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/the-noise-floor)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.