# Timeout Warning Logs

> Timeout warnings. The postmortem trail.

Canonical URL: <https://datadriven.io/problems/timeout_warning_logs>

Domain: SQL · Difficulty: easy · Seniority: L3

## Problem

During an incident postmortem, the on-call engineer needs all available fields for warning-level log entries where the message mentions a timeout.

## Worked solution and explanation

### Why this problem exists in real interviews

On a 50M row `server_logs` table, the interviewer is probing whether you pin down ambiguous text matching before you write. They want to hear you ask: am I matching the literal substring `timeout`, or any variant (`timed out`, `timing out`, `Timeout`)? And is `log_level` stored as `WARN`, `WARNING`, or mixed case? Skipping that conversation is the signal they care about.

---

### Break down the requirements

#### Step 1: Confirm the literal

Ask whether `message` contains the substring `timeout` or also variants. The brief says mentions a timeout, so a `LIKE '%timeout%'` substring match on `message` is the agreed interpretation.

#### Step 2: Pin the severity literal

Confirm `log_level` is stored as `'WARN'` (not `'WARNING'` or `'warn'`). String equality in WHERE is case sensitive in most engines; one wrong literal returns zero rows on 50M scans.

#### Step 3: Select all fields

Postmortem needs every column, so `SELECT *` is correct here. Do not project a subset; the on-call wants `response_time_ms` and `log_timestamp` alongside `message`.

#### Step 4: Combine with AND

Two predicates joined by `AND` on `log_level` and `message`. No need for `IN` or `OR` chains since you have one severity and one substring.

---

### The solution

**TIMEOUT WARNING LOGS**

```sql
SELECT *
FROM server_logs
WHERE log_level = 'WARN'
  AND message LIKE '%timeout%'
```

> **Cost Analysis**
>
> 50M rows, partitioned by `log_timestamp`, but this query has no time predicate so partition pruning does not fire. Leading-wildcard `LIKE '%timeout%'` cannot use a B-tree on `message`. Expect a full scan. If the postmortem has a window, push a `log_timestamp BETWEEN` predicate to prune partitions before the substring filter.

> **Interviewers Watch For**
>
> Say out loud: should I match only `timeout` or also `timed out` and `timing out`, and is severity stored as `WARN` or `WARNING`? Writing without asking is the miss. If they say variants, switch to `LOWER(message) LIKE '%timeout%' OR LOWER(message) LIKE '%timed out%'`.

> **Common Pitfall**
>
> Writing `message LIKE 'timeout%'` (no leading wildcard) misses every log where timeout is mid-sentence, which is most of them. The other classic: `log_level = 'warning'` lowercase when the column stores `'WARN'`, returning zero rows and looking like the data is missing.

---

### COMMON FOLLOW-UP QUESTIONS

## Common follow-up questions

- How would you handle case-insensitive matching on `log_level` and `message` without killing performance? _(Probes awareness that wrapping the column in `LOWER()` defeats indexes; the fix is a functional index or storing a normalized column.)_
- If the postmortem covers a 6-hour window, how do you rewrite this to exploit the `log_timestamp` partition? _(Tests partition pruning intuition: add a `log_timestamp BETWEEN` predicate so the scan touches one or two partitions instead of fifty.)_
- What index would you add if this query runs hourly? _(Tests whether you know leading-wildcard `LIKE` cannot use B-tree; the answer is a trigram index (Postgres `pg_trgm`) or a full-text index on `message`.)_
- How would you extend this to match `timeout`, `timed out`, and `timing out` in one pass? _(Probes regex versus OR-chain trade-offs: `message ~* 'tim(e|ing|ed)\s?out'` in Postgres versus three OR clauses.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/timeout_warning_logs)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.