# Last Server Activity

> Each server's last heartbeat.

Canonical URL: <https://datadriven.io/problems/last_server_activity>

Domain: SQL · Difficulty: easy · Seniority: L3

## Problem

The SRE team wants to spot servers that may have gone silent. Show each server and its most recent log timestamp, with the most recently active server first.

## Worked solution and explanation

### Why this problem exists in real interviews

This problem checks whether you can translate a plain-language request (each server's most recent log timestamp) into the right grouping column and aggregate: group `server_logs` by `server_name` and take `MAX(log_timestamp)`. It shows up as a fundamentals check to verify practical fluency.

> **Trick to Solving**
>
> Only the latest timestamp per server is needed here, so `MAX()` with `GROUP BY` is enough. When you need the entire most recent row per group, the classic approach is the `ROW_NUMBER` pattern:
>
> 1. `ROW_NUMBER() OVER (PARTITION BY group_col ORDER BY ts DESC)` assigns 1 to the latest row
> 2. Wrap in a subquery or CTE
> 3. Filter to `rn = 1`
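
As a sketch of the `ROW_NUMBER` steps applied to this table, the query below uses only the columns that appear in the solution (`server_name`, `log_timestamp`). The CTE name `ranked` and the alias `rn` are illustrative choices, not part of the problem.

```sql
-- Rank each server's log rows from newest to oldest.
WITH ranked AS (
    SELECT
        server_name,
        log_timestamp,
        ROW_NUMBER() OVER (
            PARTITION BY server_name
            ORDER BY log_timestamp DESC
        ) AS rn
    FROM server_logs
)
-- Keep only the newest row per server.
SELECT server_name, log_timestamp AS last_activity
FROM ranked
WHERE rn = 1
ORDER BY last_activity DESC;
```

This returns the same timestamps as the `MAX()` approach. It is worth the extra ceremony mainly when other columns from the latest row are needed as well.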

---

### Break down the requirements

#### Step 1: Aggregate with MAX

Group by `server_name`, the output grain, and apply `MAX(log_timestamp)` to get each server's latest log time. The `GROUP BY` must match exactly what the output needs: one row per server.
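
To see the grouping on a small scale, here is a minimal sketch that runs against inline data instead of the real table. The server names and timestamps in `sample_logs` are made up for illustration.

```sql
-- Hypothetical sample rows standing in for server_logs.
WITH sample_logs (server_name, log_timestamp) AS (
    VALUES
        ('web-01', TIMESTAMP '2024-01-01 10:00:00'),
        ('web-01', TIMESTAMP '2024-01-01 12:30:00'),
        ('db-01',  TIMESTAMP '2024-01-01 11:15:00')
)
-- Three input rows collapse to one row per server.
SELECT server_name, MAX(log_timestamp) AS last_activity
FROM sample_logs
GROUP BY server_name;
```

The two `web-01` rows collapse into one row carrying the 12:30 timestamp, and `db-01` keeps its single 11:15 timestamp.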

#### Step 2: Order the final output

Sort by the aggregated timestamp in descending order so the most recently active server comes first. If two servers can share the same latest timestamp, add a secondary sort column such as `server_name` for determinism.
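
A deterministic version of the sort might look like the sketch below. The tie-breaker on `server_name` and the `NULLS LAST` clause are optional additions, not requirements of the problem. `NULLS LAST` only matters if a server has no non-NULL timestamps at all, because PostgreSQL places NULLs first in a descending sort by default.

```sql
SELECT server_name, MAX(log_timestamp) AS last_activity
FROM server_logs
GROUP BY server_name
-- Newest first; servers with identical timestamps are ordered by name.
ORDER BY last_activity DESC NULLS LAST, server_name ASC;
```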

---

### The solution

**MAX aggregate per group for latest activity**

```sql
SELECT server_name, MAX(log_timestamp) AS last_activity
FROM server_logs
GROUP BY server_name
ORDER BY last_activity DESC
```

> **Cost Analysis**
>
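> The query scans 60M rows from `server_logs`. It uses no CTE, so optimization fences are not a concern here; for reference, PostgreSQL 12 and later inline a non-recursive, side-effect-free CTE that is referenced only once, unless it is marked `MATERIALIZED`. For production workloads, an index covering `server_name` and `log_timestamp` may reduce the work, depending on what the planner chooses.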
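
One way to explore the cost is sketched below, assuming you are allowed to create indexes on the table. The index name `idx_server_logs_server_ts` is hypothetical, and whether the planner uses the index depends on the engine version and table statistics, so check the plan before relying on it.

```sql
-- Hypothetical index: grouping key first, then the aggregated column.
CREATE INDEX idx_server_logs_server_ts
    ON server_logs (server_name, log_timestamp DESC);

-- Inspect the actual plan and buffer usage for the solution query.
EXPLAIN (ANALYZE, BUFFERS)
SELECT server_name, MAX(log_timestamp) AS last_activity
FROM server_logs
GROUP BY server_name
ORDER BY last_activity DESC;
```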

> **Interviewers Watch For**
>
> Explaining why `MAX` with `GROUP BY` is enough here, and why `ROW_NUMBER` is preferred over `DISTINCT` when you must keep one full row per group, shows you understand the difference between collapsing and selecting. If the logic grows more complex, breaking it into named CTEs shows the interviewer you prioritize readability and debuggability.

> **Common Pitfall**
>
> Comparing dates stored as TEXT without casting can produce lexicographic instead of chronological ordering. Always confirm the column type.
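
The pitfall can be reproduced with a minimal sketch on inline data. The two values in `sample_logs` are made up, and the text column `ts_text` is a hypothetical stand-in for a timestamp stored as TEXT without zero-padded hours.

```sql
-- Hypothetical sample: timestamps stored as text, hour not zero-padded.
WITH sample_logs (ts_text) AS (
    VALUES ('2024-01-09 9:00:00'), ('2024-01-09 10:00:00')
)
SELECT
    MAX(ts_text)            AS max_as_text,      -- character-by-character comparison
    MAX(ts_text::timestamp) AS max_as_timestamp  -- chronological comparison
FROM sample_logs;
```

As text, `9:00:00` sorts after `10:00:00` because the character `9` is greater than `1`, so the text maximum is the earlier time. After casting, the maximum is the 10:00 value.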

---

## Common follow-up questions

- What happens to your result if `server_logs.response_time_ms` contains NULLs for some rows? _(Tests whether the candidate accounts for NULL behavior in aggregates and comparisons on `response_time_ms`.)_
- How would you verify that your aggregation on `server_logs.log_id` is not double-counting due to duplicate rows? _(Tests data quality awareness and deduplication strategies.)_
- With millions of distinct values in `server_logs.log_id`, what index strategy would you use to keep this query performant? _(Tests indexing knowledge specific to high-cardinality columns like `log_id`.)_
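
For the duplicate-row question, one possible check is sketched below, assuming `log_id` is meant to be unique per log entry. The alias `copies` is illustrative.

```sql
-- List any log_id that appears more than once.
SELECT log_id, COUNT(*) AS copies
FROM server_logs
GROUP BY log_id
HAVING COUNT(*) > 1;
```

An empty result means no duplicated `log_id` values. Note that `MAX(log_timestamp)` itself is unaffected by duplicate rows; duplicates matter for counts and sums.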

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/last_server_activity)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.