# First and Last Timeout Per Service

> First timeout. Last timeout. Each service.

Canonical URL: <https://datadriven.io/problems/first_and_last_timeout_per_service>

Domain: SQL · Difficulty: medium · Seniority: L3

## Problem

Surface the first and last times each service experienced a connection timeout. Only look at error messages that mention 'timed out', and results should be ordered by service name.

## Worked solution and explanation

### What this is really asking

`first_at` is the column you reduce, `message LIKE '%timed out%'` is the filter, `svc_name` is the group key. One row per service, earliest and latest timeout, across 40M partitioned rows.

---

### Break down the requirements

#### Step 1: Filter to timeout messages

`message LIKE '%timed out%'` in WHERE so the aggregate only sees timeouts, not every err_type.

#### Step 2: Reduce to MIN and MAX

Group by `svc_name`, pull `MIN(first_at)` and `MAX(first_at)`. One pass over the filtered set.

#### Step 3: Order by service name

Sort by `svc_name` ascending so the grader sees deterministic output.

---

### The solution

**FIRST AND LAST TIMEOUT PER SERVICE**

```sql
SELECT
  svc_name,
  MIN(first_at) AS earliest_timeout,
  MAX(first_at) AS latest_timeout
FROM err_tracks
WHERE message LIKE '%timed out%'
GROUP BY svc_name
ORDER BY svc_name;
```

> **Cost Analysis**
>
> Leading `%` blocks any btree on message, so the filter is a scan. No time bound means partitions do not prune. One hash aggregate, O(N).

> **Interviewers Watch For**
>
> Reaching for ROW_NUMBER when a flat MIN and MAX answers it, or filtering on `err_type` instead of the `message` text the prompt names.

> **Common Pitfall**
>
> Putting LIKE in HAVING. It still returns the right rows but aggregates non-timeouts first and throws them away. Filter in WHERE.

> **The False Start**
>
> First instinct is two self-joins, one for earliest, one for latest, joined on svc_name. That reads the table three times and breaks on ties. Pivot to one GROUP BY with MIN and MAX.

---

### COMMON FOLLOW-UP QUESTIONS

## Common follow-up questions

- How would you also return the count of timeouts per service? _(Add `COUNT(*)` to the SELECT; the GROUP BY already supports it.)_
- What if you needed the err_id of the earliest timeout? _(Switch to ROW_NUMBER OVER (PARTITION BY svc_name ORDER BY first_at) and keep rn = 1; MIN cannot carry sibling columns.)_
- How would partition pruning on first_at help here? _(A date window in WHERE on first_at lets the planner skip partitions and cut scan size.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/first_and_last_timeout_per_service)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.