# Users With API Errors

> Count unique users who have triggered an API error response

Canonical URL: <https://datadriven.io/problems/users_with_api_errors>

Domain: SQL · Difficulty: medium · Seniority: L3

## Problem

How many unique users have triggered at least one API error? An API error is any call that returned a status code of 400 or higher.

## Worked solution and explanation

### Why this problem exists in real interviews

This tests filtering with a numeric comparison and counting distinct entities. Interviewers check whether you correctly define an error condition and produce a single aggregate count.

---

### Break down the requirements

#### Step 1: Define error condition

`WHERE status >= 400` identifies API calls that returned error codes (4xx and 5xx).

#### Step 2: Count unique users with errors

`COUNT(DISTINCT user_id)` returns the number of unique users who triggered at least one error.

---

### The solution

**Distinct count with numeric threshold filter**

```sql
SELECT COUNT(DISTINCT user_id) AS users_with_errors
FROM api_calls
WHERE status >= 400
```

> **Cost Analysis**
>
> Scanning 2B rows for the status filter is expensive. With ~15 distinct status codes, maybe 20% are errors (~400M rows). An index on `(status, user_id)` with a range condition enables an efficient scan of just the error portion.

> **Interviewers Watch For**
>
> Whether you use `>=` 400 (correct) vs `> 400` (misses 400 Bad Request). The prompt says "400 or higher."

> **Common Pitfall**
>
> Including NULL user_ids in the count. With 10% null fraction on user_id, `COUNT(DISTINCT user_id)` already excludes NULLs, but be aware that anonymous error calls are not counted.

---

## Common follow-up questions

- How would you break this down by error code? _(Add GROUP BY status for a per-code user count.)_
- What if you needed the error rate per user? _(Count errors vs total calls per user_id.)_
- How would you find the most common error message for these users? _(Join back to get err_msg and use COUNT with GROUP BY for frequency analysis.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/users_with_api_errors)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.