# Error Category Breakdown

> Postmortem time. Categorize the errors.

Canonical URL: <https://datadriven.io/problems/error_category_breakdown>

Domain: SQL · Difficulty: hard · Seniority: L4

## Problem

During an incident postmortem, the on-call engineer needs to categorize error tracks into three buckets: those related to timeouts, those related to connections, and those related to memory. Each category should show its label and corresponding count.

## Worked solution and explanation

### Why this problem exists in real interviews

Categorizing errors into buckets with `CASE WHEN` tests conditional aggregation. The prompt asks for three specific buckets based on keywords in the error type, testing string pattern matching combined with grouping.

---

### Break down the requirements

#### Step 1: Define the three buckets

Use `CASE WHEN err_type LIKE '%...' THEN 'bucket_name'` to classify each error into one of three categories based on the prompt's criteria.

#### Step 2: Count per bucket

`GROUP BY` the CASE expression and `COUNT(*)` per bucket.

---

### The solution

**CASE-based error categorization**

```sql
SELECT
    CASE
        WHEN err_type LIKE '%timeout%' OR err_type LIKE '%connection%' THEN 'network'
        WHEN err_type LIKE '%auth%' OR err_type LIKE '%permission%' THEN 'security'
        ELSE 'application'
    END AS error_category,
    COUNT(*) AS error_count
FROM err_tracks
GROUP BY error_category
ORDER BY error_count DESC
```

> **Cost Analysis**
>
> Single-pass scan with CASE evaluation per row. Hash aggregation on 3 buckets is trivially cheap.

> **Interviewers Watch For**
>
> Strong candidates proactively add percentage of total: `COUNT(*) * 100.0 / SUM(COUNT(*)) OVER ()` to show relative distribution.

> **Common Pitfall**
>
> Ordering CASE conditions incorrectly can misclassify errors. CASE evaluates in order and returns the first match.

---

## Common follow-up questions

- How would you add a percentage-of-total column? _(Tests SUM() OVER() window function.)_
- What if new error types do not fit the three buckets? _(Tests the ELSE clause robustness.)_
- How would you track the breakdown week over week? _(Tests adding date dimension.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/error_category_breakdown)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.