# Second Highest Latency by Method

> Almost the slowest. By method.

Canonical URL: <https://datadriven.io/problems/second_highest_latency_by_method>

Domain: SQL · Difficulty: medium · Seniority: L4

## Problem

Find the second-highest latency API endpoint in each HTTP method group. If multiple endpoints share the highest latency, the second-highest is the next unique latency value below that. Show method, endpoint, and latency.

## Worked solution and explanation

### Why this problem exists in real interviews

Interviewers use this latency monitoring scenario to test DENSE_RANK for gapless ranking against the `api_calls` table. The focus is on how you handle columns like `endpoint`, `method`, and `latency` when building the result.

---

### Break down the requirements

#### Step 1: Isolate the intermediate result in a CTE

The `ranked` CTE computes the intermediate aggregation that the outer query builds on. This separation keeps each layer focused on a single task.

#### Step 2: Select the target columns

The SELECT clause picks exactly the columns the prompt asks for. Returning extra columns or missing a required alias would fail the grading check.

---

### The solution

**Dense-rank for second highest latency by method**

```sql
WITH ranked AS (
    SELECT method, endpoint, latency, DENSE_RANK() OVER (PARTITION BY method ORDER BY latency DESC) AS rnk
    FROM api_calls
)
SELECT method, endpoint, latency
FROM ranked
WHERE rnk = 2
```

> **Cost Analysis**
>
> With ~200M rows, the window function runs on the reduced set after filtering and grouping; CTEs materialize intermediate results, which can be beneficial or costly depending on the engine. An index on the filter/join columns would reduce the scan to a seek.

> **Interviewers Watch For**
>
> Interviewers watch for which ranking function you choose and whether you can explain the difference between `ROW_NUMBER`, `RANK`, and `DENSE_RANK`; whether you decompose the problem into named, testable stages rather than nesting everything; whether you use a subquery or self-join, and can explain the tradeoffs.

> **Common Pitfall**
>
> Using `ROW_NUMBER()` when the prompt requires tie inclusion silently drops tied rows. Read the prompt for language about ties or 'include all'.

---

## Common follow-up questions

- If `err_msg` in `api_calls` is NULL for some rows, how would your aggregation or join logic be affected? _(Probes understanding of NULL propagation through joins and aggregate functions on `api_calls.err_msg`.)_
- If `api_calls` grew to contain billions of rows, which part of your query would become the bottleneck given the cardinality of `latency`? _(Tests ability to identify performance hotspots related to `api_calls.latency` at scale.)_
- What is the default window frame for your window function, and would explicitly setting ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW change anything? _(Tests knowledge of implicit vs explicit window frame specifications.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/second_highest_latency_by_method)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.