# Build Success Rate by Trigger

> Which triggers produce green builds?

Canonical URL: <https://datadriven.io/problems/build_success_rate_by_trigger>

Domain: SQL · Difficulty: medium · Seniority: L3

## Problem

The CI/CD team suspects that manual triggers fail more often than automated ones. For each trigger type, show the total number of builds and the fraction with a status of 'success', expressed as a decimal.

## Worked solution and explanation

### Why this problem exists in real interviews

This is the canonical "success rate by group" pattern. The interviewer is checking that you GROUP BY the dimension that defines a row in the output (`trigger`), and that you compute the rate with a single pass using conditional aggregation, not a join or subquery.

> **Trick to Solving**
>
> A success rate is just **conditional sum over total**.
> 
> 1. The denominator is `COUNT(*)` per group
> 2. The numerator is `SUM(CASE WHEN status = 'success' THEN 1 ELSE 0 END)` per group
> 3. Cast one side to a real number so integer division doesn't truncate the rate to zero

---

### Break down the requirements

#### Step 1: Group by trigger type

`GROUP BY trigger` produces one output row per distinct trigger type (manual, schedule, push, etc.).

#### Step 2: Count total builds per trigger

`COUNT(*)` per group gives the total number of builds for that trigger. This is the second column of the output.

#### Step 3: Compute the success fraction

`SUM(CASE WHEN status = 'success' THEN 1 ELSE 0 END)` counts only the successful builds in each group, divided by `COUNT(*)`. Cast the numerator to REAL (or multiply by 1.0) so the engine returns a fraction instead of integer-truncating to 0.

---

### The solution

**Conditional sum over total per trigger**

```sql
SELECT
    trigger,
    COUNT(*) AS total_builds,
    CAST(SUM(CASE WHEN status = 'success' THEN 1 ELSE 0 END) AS REAL)
      / COUNT(*) AS success_rate
FROM ci_builds
GROUP BY trigger
```

> **Time and Space Complexity**
>
> **Time:** O(n) single hash-aggregate scan over `ci_builds`. There are only 5 distinct triggers, so the hash table is tiny and the result is read straight off the aggregate.
> 
> **Space:** O(g) where g is the number of trigger types (5).

> **Interviewers Watch For**
>
> Strong candidates name the output grain ("one row per trigger") before writing SQL, and explicitly cast or multiply by 1.0 so integer division doesn't silently zero out the rate. They also avoid `WHERE status = 'success'` which would make every group's rate look like 100%.

> **Common Pitfall**
>
> Adding `WHERE status = 'success'` to filter rows. That removes the failures from the denominator too, so every trigger ends up with a rate of 1.0 and the question becomes meaningless.

---

## Common follow-up questions

- How would you rewrite the rate using `AVG(...)` over a boolean expression instead of SUM/COUNT? _(Tests AVG of a boolean expression as a more concise alternative to SUM(CASE)/COUNT.)_
- What if the team only wanted triggers with at least 1000 builds? Where does that filter go? _(Tests filtering aggregates with HAVING.)_
- How would you produce both success_rate and failure_rate side by side per trigger? _(Tests whether the candidate can extend conditional aggregation to multiple status values in a single query.)_
- If one trigger only has 3 builds, its rate is 0%, 33%, 67%, or 100%. How would you smooth small-sample rates so they don't dominate the result? _(Tests awareness that small samples produce unstable rates and statistical correction techniques.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/build_success_rate_by_trigger)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.