# Daily Error Count Change

> Errors, trending up or down?

Canonical URL: <https://datadriven.io/problems/daily_error_count_change>

Domain: SQL · Difficulty: medium · Seniority: L4

## Problem

SRE wants a daily trend chart for error volume. For each day errors were recorded, show the number of errors that landed that day, how many landed the day before, and how much it moved. Order from earliest day to latest. The first day on record has no prior day to compare against.

## Worked solution and explanation

### Why this problem exists in real interviews

Two stages: roll up to a daily grain, then compare each day to the prior day. The interviewer is checking that you reach for `LAG` over a CTE rather than self-joining the daily aggregate to itself, and that you remember `LAG` returns NULL on the first row instead of a sentinel like 0.

> **Trick to Solving**
>
> Aggregate first, window function second.
> 
> 1. CTE: one row per calendar day with the error row count
> 2. Outer: `LAG(error_count) OVER (ORDER BY error_date)` for the prior day, then subtract for the change

---

### Break down the requirements

#### Step 1: Roll up to one row per day

Group `err_tracks` by the calendar date of `first_at` and use `COUNT(*)` to get the number of error rows per day. The schema has a `count` column too, but the prompt asks for the row count, not the sum of that column.

#### Step 2: Add the previous-day column

Wrap the daily aggregate in a CTE so `LAG` runs over the small daily set instead of the 30M-row source. `LAG(error_count) OVER (ORDER BY error_date)` returns the previous day's count, NULL on day one.

#### Step 3: Compute the day-over-day delta

`error_count - prev_count` produces the day-over-day change. Subtraction with NULL stays NULL, so day one's change is also NULL, which matches the contract.

#### Step 4: Project and order

`ORDER BY error_date` ascending so the chart starts with the earliest day. Project four columns: `error_date`, `error_count`, `prev_count`, `day_over_day_change`.

---

### The solution

**Daily aggregate, then LAG for the prior-day comparison**

```sql
WITH daily_errors AS (
    SELECT
        DATE(first_at) AS error_date,
        COUNT(*) AS error_count
    FROM err_tracks
    GROUP BY DATE(first_at)
),
with_lag AS (
    SELECT
        error_date,
        error_count,
        LAG(error_count) OVER (ORDER BY error_date) AS prev_count
    FROM daily_errors
)
SELECT
    error_date,
    error_count,
    prev_count,
    error_count - prev_count AS day_over_day_change
FROM with_lag
ORDER BY error_date
```

> **Time and Space Complexity**
>
> **Time:** O(n) hash aggregate over 30M rows reduces to one row per day (~365). The LAG over ~365 rows is essentially free.
> 
> **Space:** O(d) for the daily CTE where d is the number of distinct days.

> **Interviewers Watch For**
>
> Strong candidates put the LAG inside a second CTE so the day-over-day subtraction reads cleanly in the outer SELECT, instead of repeating the LAG expression twice. They also leave day one's `prev_count` and `day_over_day_change` as NULL rather than coalescing to 0.

> **Common Pitfall**
>
> Applying LAG before the daily aggregation. LAG over the raw 30M rows compares individual error records to each other, not daily totals. Always aggregate to the right grain first, then use a window function.

---

## Common follow-up questions

- How would you also flag days where the count more than doubled, using a percentage-change threshold? _(Tests percentage change with division-by-zero guards.)_
- If a day had zero errors, the current query has no row for it. How would you fill the gap so that day appears with `error_count = 0`? _(Tests calendar join + LEFT JOIN to surface zero days.)_
- How would you replace the day-over-day change with a 7-day rolling average instead? _(Tests AVG() OVER (ROWS BETWEEN 6 PRECEDING AND CURRENT ROW).)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/daily_error_count_change)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.