# Morning Warning Logs

> Warnings before noon.

Canonical URL: <https://datadriven.io/problems/morning_warning_logs>

Domain: SQL · Difficulty: easy · Seniority: L3

## Problem

After noticing pre-business-hours warning spikes, the SRE team wants to review every WARN-level server log entry that fired before noon. Return all fields.

## Worked solution and explanation

### Why this problem exists in real interviews

This problem combines an equality filter on `log_level` with a time-of-day filter derived from `log_timestamp` in `server_logs`. It tests whether you can translate a business requirement into the right column references and filter conditions, and it typically shows up as a fundamentals check of practical fluency.

---

### Break down the requirements

#### Step 1: Read from `server_logs`

The query reads from `server_logs`, which has 6 columns. The prompt asks for all fields, so every column belongs in the output.

#### Step 2: Filter to the target rows

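Keep only rows where `log_level = 'WARN'`, then apply the time-of-day filter: use `STRFTIME` to extract the hour from `log_timestamp`, cast it to an integer, and keep hours below 12. Both conditions belong in the `WHERE` clause; no aggregation is involved.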
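
As a quick illustration of the hour extraction, the sketch below runs the same expression on literal timestamps. It assumes a SQLite-style `STRFTIME` and ISO-8601 text timestamps; the sample values are made up for the example.

```sql
-- Hypothetical sample timestamps; assumes SQLite-style STRFTIME.
SELECT
    ts,
    CAST(STRFTIME('%H', ts) AS INTEGER) AS hour_of_day,
    CAST(STRFTIME('%H', ts) AS INTEGER) < 12 AS is_before_noon
FROM (
    SELECT '2024-03-05 08:15:00' AS ts
    UNION ALL SELECT '2024-03-05 11:59:59'
    UNION ALL SELECT '2024-03-05 12:00:00'
);
```

The `11:59:59` row has hour 11 and passes the filter, while `12:00:00` has hour 12 and is excluded, so "before noon" here means strictly before 12:00.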

#### Step 3: Return the result set

Select all six columns by name. No aliasing or formatting is needed.

---

### The solution

**Time-of-day filter with level match**

```sql
SELECT log_id, server_name, log_level, message, response_time_ms, log_timestamp
FROM server_logs
WHERE log_level = 'WARN'
    AND CAST(STRFTIME('%H', log_timestamp) AS INTEGER) < 12
```
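
To check the query by hand, a small fixture can help. The sketch below is an assumption-laden illustration: the column types and sample rows are invented, and only the column names come from the solution itself. It targets a SQLite-style engine, since that is where `STRFTIME` is available.

```sql
-- Hypothetical fixture: column types and sample rows are assumptions.
CREATE TABLE server_logs (
    log_id           INTEGER PRIMARY KEY,
    server_name      TEXT,
    log_level        TEXT,
    message          TEXT,
    response_time_ms INTEGER,
    log_timestamp    TEXT  -- ISO-8601 text, which SQLite's STRFTIME can parse
);

INSERT INTO server_logs VALUES
    (1, 'web-01', 'WARN',  'High memory usage', 120,  '2024-03-05 08:15:00'),
    (2, 'web-01', 'ERROR', 'Request failed',    950,  '2024-03-05 09:30:00'),
    (3, 'web-02', 'WARN',  'Slow response',     NULL, '2024-03-05 11:59:59'),
    (4, 'web-02', 'WARN',  'Disk nearly full',  200,  '2024-03-05 12:00:00'),
    (5, 'db-01',  'INFO',  'Backup complete',   80,   '2024-03-05 07:00:00');

-- Same query as the solution; only log_id 1 and 3 satisfy both conditions.
SELECT log_id, server_name, log_level, message, response_time_ms, log_timestamp
FROM server_logs
WHERE log_level = 'WARN'
    AND CAST(STRFTIME('%H', log_timestamp) AS INTEGER) < 12;
```

Row 3 has a NULL `response_time_ms` and is still returned, because neither predicate touches that column. On an engine without `STRFTIME`, such as Postgres, `EXTRACT(HOUR FROM log_timestamp) < 12` plays the same role, assuming `log_timestamp` is stored as a timestamp type.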

> **Cost Analysis**
>
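> The query scans 40M rows from `server_logs`. Because `log_timestamp` is wrapped in `STRFTIME`, a plain index on that column cannot satisfy the hour predicate.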
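>
> As a rough sketch of one possible mitigation, an index on `log_level` may let the engine narrow to WARN rows before evaluating the hour expression. The index name below is hypothetical, and whether the planner actually uses the index depends on how selective `'WARN'` is in the data.
>
> ```sql
> -- Hypothetical index name; helps only if WARN rows are a small share of the table.
> CREATE INDEX idx_server_logs_log_level ON server_logs (log_level);
> ```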

> **Interviewers Watch For**
>
> Candidates who verbalize their approach before typing, naming the output columns and expected row count, consistently perform better.

> **Common Pitfall**
>
> Returning a different set of columns than the prompt asks for can trigger a "wrong schema" failure in automated grading. Here the prompt asks for all fields, so the output must contain exactly the six columns of `server_logs`.

---

## Common follow-up questions

- What happens to your result if server_logs.response_time_ms contains NULLs for some rows? _(Tests whether the candidate accounts for NULL behavior in aggregates and comparisons on response_time_ms.)_
- How would you verify that duplicate rows in server_logs are not inflating your result, for example by checking that server_logs.log_id is unique? _(Tests data quality awareness and deduplication strategies.)_
- With millions of distinct values in server_logs.log_id, what index strategy would you use to keep this query performant? _(Tests indexing knowledge specific to high-cardinality columns like log_id.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/morning_warning_logs)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.