# Failure Rate

> Build failures happen. Which repos break the most?

Canonical URL: <https://datadriven.io/problems/failure_rate>

Domain: SQL · Difficulty: medium · Seniority: L4

## Problem

The DevOps lead is presenting build reliability at the all-hands and needs a failure percentage per repository. For each repo, what share of its CI builds ended in a 'failed' status?

## Worked solution and explanation

### Why this problem exists in real interviews

Computing failure rate as a percentage tests conditional aggregation. The interviewer wants to see `SUM(CASE WHEN ...)` divided by `COUNT(*)`, with a `NULLIF` guard for empty datasets.

---

### Break down the requirements

#### Step 1: Count failures and total per repo

Use `SUM(CASE WHEN status = 'failed' THEN 1 ELSE 0 END)` for failures and `COUNT(*)` for total builds, grouped by `repo_name`.

#### Step 2: Compute the percentage

Divide failures by total, multiply by 100, and guard with `NULLIF`.

---

### The solution

**Conditional aggregation for failure rate per repo**

```sql
SELECT repo_name,
       COUNT(*) AS total_builds,
       SUM(CASE WHEN status = 'failed' THEN 1 ELSE 0 END) AS failed_builds,
       ROUND(
           SUM(CASE WHEN status = 'failed' THEN 1.0 ELSE 0.0 END) * 100.0 /
           NULLIF(COUNT(*), 0), 2
       ) AS failure_rate_pct
FROM ci_builds
GROUP BY repo_name
ORDER BY failure_rate_pct DESC
```

> **Cost Analysis**
>
> Single-pass aggregation with conditional counters. Hash aggregate on `repo_name` is efficient.

> **Interviewers Watch For**
>
> The `NULLIF` guard and correct CASE syntax are the key signals. Candidates who use `COUNT(CASE ...)` should understand it differs from `SUM(CASE ...)`.

> **Common Pitfall**
>
> Using `COUNT(CASE WHEN status = 'failed' THEN 1 ELSE 0 END)` counts all rows (zeros included). Use `SUM` instead, or `COUNT` without an ELSE.

---

## Common follow-up questions

- How would you compute the global failure rate? _(Tests removing the GROUP BY.)_
- How would you track failure rate trends over time? _(Tests date truncation with GROUP BY.)_
- What if you needed a confidence interval? _(Tests awareness of statistical methods.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/failure_rate)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.