# Noisy Endpoints

> The routes generating the most noise.

Canonical URL: <https://datadriven.io/problems/noisy_endpoints>

Domain: SQL · Difficulty: medium · Seniority: L3

## Problem

The SRE team is triaging an incident and needs a quick view of which API endpoints are generating the most error traffic. A response code at or above 400 counts as a failure. For each endpoint that has accumulated more than two such failures, show the endpoint, the total number of failures, and the typical response time across those failed calls. List from the most failure-prone endpoint to the least.

## Worked solution and explanation

### Why this problem exists in real interviews

Against the api_calls table, HAVING filter and grouping on endpoint values is the key operation. Interviewers favor this in mid-level screens because it exposes whether candidates handle ties, NULLs, and ordering edge cases correctly.

---

### Break down the requirements

#### Step 1: Filter to error responses

`WHERE status >= 400` keeps only failed API calls.

#### Step 2: Group by endpoint and aggregate

`GROUP BY endpoint` with `COUNT(*)` for failure count and `AVG(latency)` for typical response time.

#### Step 3: Apply HAVING threshold

`HAVING COUNT(*) > 2` excludes endpoints with only sporadic failures.

#### Step 4: Order by failure count descending

Surface the noisiest endpoints first with `ORDER BY COUNT(*) DESC`.

---

### The solution

**Filter on status, group, HAVING, order**

```sql
SELECT
    endpoint,
    COUNT(*) AS failure_count,
    AVG(latency) AS avg_latency
FROM api_calls
WHERE status >= 400
GROUP BY endpoint
HAVING COUNT(*) > 2
ORDER BY failure_count DESC
```

> **Cost Analysis**
>
> At `api_calls` (5,000,000,000 rows), a full table scan is expensive. Partition pruning (if the table is partitioned on the filter column) is the first optimization. A covering index on the `GROUP BY` + filter columns eliminates random I/O. Consider a materialized view for repeated dashboard queries.

> **Interviewers Watch For**
>
> Interviewers evaluate whether you translate the English requirements into the correct SQL clauses on the first attempt. They watch for clean syntax, correct column references, and whether you verify edge cases before declaring the query complete.

> **Common Pitfall**
>
> The most common mistake is misreading the prompt's filtering or grouping requirements. Double-check which columns to group by, which to aggregate, and whether the output should be filtered with `WHERE` (before grouping) or `HAVING` (after grouping).

---

## Common follow-up questions

- What happens to your result if api_calls.latency contains NULLs for some rows? _(Tests whether the candidate accounts for NULL behavior in aggregates and comparisons on latency.)_
- What is the difference between filtering in WHERE versus HAVING for this query against api_calls? _(Tests whether the candidate understands pre-aggregation vs post-aggregation filtering.)_
- With millions of distinct values in api_calls.call_id, what index strategy would you use to keep this query performant? _(Tests indexing knowledge specific to high-cardinality columns like call_id.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/noisy_endpoints)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.