# Build Health

> Rank every repo by how often CI stays green.

Canonical URL: <https://datadriven.io/problems/build_health>

Domain: SQL · Difficulty: medium · Seniority: L4

## Problem

The platform team wants a CI health leaderboard for the all-hands. Rank repositories in `ci_builds` by success rate, where rate is measured against every build the repo ran (not just builds that resolved to success or failed). Ignore repos that haven't built enough to be meaningful.

## Worked solution and explanation

### What this is really asking

`status` is an enum with more than two values. `'success'` and `'failed'` are the ones you report; the rate is over `COUNT(*)`, so canceled and in-flight builds count against the repo.

---

### Break down the requirements

#### Step 1: Pivot status into columns

Two `SUM(CASE WHEN status = ? THEN 1 END)` expressions for the resolved counts plus a `COUNT(*)` for `total_builds`, all in one `GROUP BY repo_name` pass.

#### Step 2: Rate is over all builds

Divide successes by `COUNT(*)`, not by `success + failed`. Canceled, running, and queued builds belong in the denominator. Wrap the result in `ROUND(..., 3)` so the report column has three decimals.

#### Step 3: Gate, then sort

`HAVING COUNT(*) >= 3` drops low-volume repos. `ORDER BY success_rate DESC` puts the healthiest on top.

---

### The solution

**CI SUCCESS RATE BY REPO**

```sql
SELECT
  repo_name,
  SUM(CASE WHEN status = 'success' THEN 1 ELSE 0 END) AS success_count,
  SUM(CASE WHEN status = 'failed'  THEN 1 ELSE 0 END) AS failed_count,
  COUNT(*) AS total_builds,
  ROUND(100.0 * SUM(CASE WHEN status = 'success' THEN 1 ELSE 0 END) / COUNT(*), 3) AS success_rate
FROM ci_builds
GROUP BY repo_name
HAVING COUNT(*) >= 3
ORDER BY success_rate DESC
```

> **Cost Analysis**
>
> Single hash aggregate keyed on `repo_name`. Lifetime scope means no partition pruning on `built_at`. Per-repo cardinality (~5k) keeps the hash table in memory.

> **Interviewers Watch For**
>
> The denominator is `COUNT(*)`, and the numerator is scaled by `100.0` before the divide. Integer `100` truncates to zero in engines that follow SQL division semantics.

> **Common Pitfall**
>
> Filtering `WHERE status IN ('success','failed')` simplifies the rate but hides canceled and stuck builds. Flaky repos look healthier than they are.

> **The False Start**
>
> `success_count * 100.0 / (success_count + failed_count)` reads natural and is wrong here. The prompt anchors the rate to every build the repo ran, not just the resolved ones.

---

### COMMON FOLLOW-UP QUESTIONS

## Common follow-up questions

- How would you scope this to the last 30 days? _(Add `WHERE built_at >= date('now','-30 days')`. Partition pruning on `built_at` shrinks the scan from lifetime to a single month.)_
- What if a new `status` value gets introduced upstream? _(A new value silently lowers every repo's rate without warning. Audit `DISTINCT status` on each run, or normalize through a staging view that maps unknowns to a known bucket.)_
- How would you break ties between repos with the same rate? _(Add `COUNT(*) DESC` as a secondary key so the busier repo wins. Volume is the natural tiebreaker for health.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/build_health)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.