# Log Priority

> Which servers are on fire before coffee?

Canonical URL: <https://datadriven.io/problems/log_priority>

Domain: SQL · Difficulty: easy · Seniority: L3

## Problem

The on-call rotation wants a triage view of recent server activity. For each log entry that has a recorded response time, show the server name, the original log level, the response time in milliseconds, and a triage label: logs at the CRITICAL or ERROR level should be marked urgent, everything else routine. Surface the urgent entries first, and within the same triage label, arrange by response time from slowest to fastest.

## Worked solution and explanation

### Why this problem exists in real interviews

Querying server_logs for server_name data using query construction tests whether you can translate a business requirement into the right column references and filter sequence. It shows up as a fundamentals check to verify practical fluency.

---

### Break down the requirements

#### Step 1: Filter to the target rows

Apply the `WHERE` filter to restrict the working set before aggregation. Filtering early reduces the number of rows that downstream operations process.

#### Step 2: Order the final output

Apply `ORDER BY` as specified to produce the expected row sequence. When tied values exist, add a secondary sort column for determinism.

---

### The solution

**Filtered retrieval of high-priority entries**

```sql
SELECT log_id, server_name, log_level, message, response_time_ms, log_timestamp
FROM server_logs
WHERE log_level IN ('ERROR', 'CRITICAL')
ORDER BY log_timestamp DESC
```

> **Cost Analysis**
>
> The query scans 500M rows from `server_logs`. CTEs in most engines are optimization fences. For production workloads, consider inlining or materializing the intermediate results.

> **Interviewers Watch For**
>
> Breaking complex logic into named CTEs shows the interviewer you prioritize readability and debuggability.

> **Common Pitfall**
>
> Comparing dates stored as TEXT without casting can produce lexicographic instead of chronological ordering. Always confirm the column type.

---

## Common follow-up questions

- What happens to your result if server_logs.response_time_ms contains NULLs for some rows? _(Tests whether the candidate accounts for NULL behavior in aggregates and comparisons on response_time_ms.)_
- How would you verify that your aggregation on server_logs.log_id is not double-counting due to duplicate rows? _(Tests data quality awareness and deduplication strategies.)_
- With millions of distinct values in server_logs.log_id, what index strategy would you use to keep this query performant? _(Tests indexing knowledge specific to high-cardinality columns like log_id.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/log_priority)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.