# Service Uptime Minutes

> Status changed. How long was it actually up?

Canonical URL: <https://datadriven.io/problems/service_uptime_minutes>

Domain: SQL · Difficulty: medium · Seniority: L5

## Problem

Each alert has a fired_at timestamp and a resolved timestamp (NULL if still open). For each service, calculate the total minutes spent in an active alert state. Only consider alerts that were resolved within the first year after the service's earliest health check.

## Worked solution and explanation

### Why this problem exists in real interviews

The core skill being tested is grouped SUM aggregation, applied to the `svc_health` and `alert_events` tables in a alerting context. Getting the `resolved` column right is where most candidates slip.

> **Trick to Solving**
>
> Finding the 'second highest' or 'runner-up' is a classic pattern. Avoid ORDER BY/LIMIT when precision matters.
> 
> 1. Use a scalar subquery to find the maximum
> 2. Filter the outer query to values strictly less than that maximum
> 3. Take the MAX of the filtered set

---

### Break down the requirements

#### Step 1: Isolate the intermediate result in a CTE

The `svc_start` CTE computes the intermediate aggregation that the outer query builds on. This separation keeps each layer focused on a single task.

#### Step 2: Join `svc_health` to `alert_events`

The join connects the two tables on their shared key. This brings the columns needed for filtering and aggregation into a single row set.

#### Step 3: Filter out null values

Exclude rows where `resolved` is NULL. This prevents nulls from polluting aggregations or creating phantom groups.

---

### The solution

**Isolate the intermediate result in a cte to find service uptime min...**

```sql
WITH svc_start AS (
    SELECT svc_name, MIN(checked) AS first_check
    FROM svc_health
    GROUP BY svc_name
)
SELECT a.svc_name, SUM(CAST((julianday(a.resolved) - julianday(a.fired_at)) * 1440 AS INTEGER)) AS total_downtime_minutes
FROM alert_events a
JOIN svc_start s ON a.svc_name = s.svc_name
WHERE a.resolved IS NOT NULL AND a.fired_at <= datetime(s.first_check, '+365 days')
GROUP BY a.svc_name
```

> **Cost Analysis**
>
> With ~30M rows, the GROUP BY reduces the working set before any downstream operations; the join cost depends on the smaller table's cardinality; CTEs materialize intermediate results, which can be beneficial or costly depending on the engine. An index on the filter/join columns would reduce the scan to a seek.

> **Interviewers Watch For**
>
> Interviewers watch for whether you decompose the problem into named, testable stages rather than nesting everything; how you handle NULL values and whether you account for them in filters and aggregations; how you handle date arithmetic and whether you account for edge cases like month boundaries.

> **Common Pitfall**
>
> Forgetting to filter NULLs creates phantom groups or inflated counts. Always check `null_fraction` in the schema before assuming columns are clean.

---

## Common follow-up questions

- If `resolved` in `alert_events` is NULL for some rows, how would your aggregation or join logic be affected? _(Probes understanding of NULL propagation through joins and aggregate functions on `alert_events.resolved`.)_
- If `svc_health` grew to contain billions of rows, which part of your query would become the bottleneck given the cardinality of `check_id`? _(Tests ability to identify performance hotspots related to `svc_health.check_id` at scale.)_
- Would materializing the CTE as a temp table improve performance when joining `svc_health` and `alert_events`? _(Tests understanding of CTE materialization behavior across different database engines.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/service_uptime_minutes)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.