# Endpoint Ranking

> The slowest endpoints. Called to the principal's office.

Canonical URL: <https://datadriven.io/problems/endpoint_ranking>

Domain: SQL · Difficulty: hard · Seniority: L4

## Problem

The SRE team wants the slowest endpoints surfaced for a performance review. For each endpoint in api_calls, count its calls and compute its average latency, skipping rows where latency is NULL. Then rank endpoints by average latency, slowest first, with ties sharing a position. Return the endpoint, call count, average latency, and position.

## Worked solution and explanation

### Why this problem exists in real interviews

Ranking queries are a staple of SQL interviews. This one tests two things about window functions: whether you aggregate before ranking, and whether you pick the function that matches the tie-handling requirement.

> **Trick to Solving**
>
> Before ranking, aggregate to the right grain. If you rank raw rows, you rank individual calls instead of endpoints. Always: aggregate first, rank second.
> 
> 1. Determine the grain (one row per endpoint)
> 2. Compute the per-endpoint metrics (call count, plus average latency as the ranking metric)
> 3. Apply the window function, and filter in an outer query if a cutoff is needed

---

### Break down the requirements

#### Step 1: Aggregate per endpoint

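`GROUP BY endpoint` with `COUNT(*)` and `AVG(latency)` gives the per-endpoint metrics. Filter with `WHERE latency IS NOT NULL` first: `AVG` already ignores NULLs, but `COUNT(*)` would still count those rows.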
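
To make the NULL handling concrete, here is a minimal sketch on made-up rows. The inline `VALUES` list is a hypothetical stand-in for `api_calls`, and the integer latencies are an assumption about the column type.

```sql
-- Hypothetical sample rows standing in for api_calls.
WITH api_calls (endpoint, latency) AS (
    VALUES ('/search', 120),
           ('/search', NULL),
           ('/search', 80),
           ('/login',  40),
           ('/login',  60)
)
SELECT endpoint,
       COUNT(*)       AS all_rows,        -- includes the NULL-latency row
       COUNT(latency) AS measured_calls,  -- skips NULLs
       AVG(latency)   AS avg_latency      -- AVG ignores NULLs on its own
FROM api_calls
GROUP BY endpoint
ORDER BY endpoint;
-- /login:  all_rows 2, measured_calls 2, avg_latency 50
-- /search: all_rows 3, measured_calls 2, avg_latency 100
```

With a `WHERE latency IS NOT NULL` filter in place, `COUNT(*)` and `COUNT(latency)` agree, which is why the filter is the simpler way to satisfy "skipping rows where latency is NULL".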

#### Step 2: Rank with a window function

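`RANK() OVER (ORDER BY AVG(latency) DESC)` assigns positions, slowest first, with tied endpoints sharing a position. The ranking metric is average latency, not call count. Compute the rank over the aggregated rows, either in the same grouped `SELECT` or by wrapping the aggregation in a subquery or CTE.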
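
As a rough illustration of ranking the aggregated rows, the sketch below uses a CTE named `per_endpoint` (a name chosen here for illustration) over hypothetical sample data that includes a tie.

```sql
-- Hypothetical sample rows standing in for api_calls.
WITH api_calls (endpoint, latency) AS (
    VALUES ('/search', 100), ('/search', 200),  -- average 150
           ('/report', 150),                    -- average 150, ties with /search
           ('/login',  50),  ('/login', NULL)   -- average 50, NULL row skipped
),
per_endpoint AS (
    -- Aggregate first: one row per endpoint.
    SELECT endpoint,
           COUNT(*)     AS total_calls,
           AVG(latency) AS avg_latency
    FROM api_calls
    WHERE latency IS NOT NULL
    GROUP BY endpoint
)
-- Rank second: the window function sees one row per endpoint.
SELECT endpoint, total_calls, avg_latency,
       RANK() OVER (ORDER BY avg_latency DESC) AS rnk
FROM per_endpoint
ORDER BY rnk, endpoint;
-- /report and /search both get rnk 1; /login gets rnk 3.
```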

#### Step 3: Filter to top N

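`WHERE rnk <= 10` in the outer query selects the top endpoints. The filter has to sit in an outer query because a window function's result cannot be referenced in the `WHERE` clause of the query that computes it. The problem statement does not set a cutoff, so treat 10 as a placeholder and drop the filter to return every endpoint.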

---

### The solution

**Aggregate then rank with window function**

```sql
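SELECT endpoint, total_calls, avg_latency, rnk
FROM (
    SELECT endpoint,
           COUNT(*) AS total_calls,
           ROUND(AVG(latency), 2) AS avg_latency,
           -- Rank on the unrounded average, slowest first; ties share a position
           RANK() OVER (ORDER BY AVG(latency) DESC) AS rnk
    FROM api_calls
    WHERE latency IS NOT NULL  -- skip calls with no recorded latency
    GROUP BY endpoint
) ranked
-- Top-N cutoff; the problem statement sets none, so drop this line to return every endpoint
WHERE rnk <= 10
ORDER BY rnk;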
```

> **Cost Analysis**
>
> The `GROUP BY` reduces the table to one row per endpoint, so the window function only sorts that small aggregated set, which is cheap. The scan and aggregation of `api_calls` dominate the cost.

> **Interviewers Watch For**
>
> The interviewer checks two things: (1) aggregate before rank, and (2) correct choice of `RANK` vs. `ROW_NUMBER` vs. `DENSE_RANK`. Ties must share a position here, which rules out `ROW_NUMBER`.
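
A side-by-side comparison can help when explaining the choice. This is a small sketch on hypothetical, already-aggregated rows (the `per_endpoint` name and values are made up for illustration).

```sql
-- Hypothetical aggregated rows: two endpoints tied at 150.
WITH per_endpoint (endpoint, avg_latency) AS (
    VALUES ('/search', 150),
           ('/report', 150),
           ('/login',  50)
)
SELECT endpoint,
       avg_latency,
       ROW_NUMBER() OVER w AS row_num,     -- 1, 2, 3: ties broken arbitrarily
       RANK()       OVER w AS rnk,         -- 1, 1, 3: ties share, next position skipped
       DENSE_RANK() OVER w AS dense_rnk    -- 1, 1, 2: ties share, no gap
FROM per_endpoint
WINDOW w AS (ORDER BY avg_latency DESC)
ORDER BY rnk, endpoint;
```

Which of the two tied endpoints receives `row_num` 1 is not guaranteed, which is exactly why `ROW_NUMBER` does not fit a "ties share a position" requirement.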

> **Common Pitfall**
>
> Using `LIMIT 10` instead of a window function silently drops ties. If two endpoints are tied at position 10, LIMIT returns one arbitrarily.
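
The difference is easier to see with a tie sitting on the cutoff. The sketch below uses hypothetical aggregated rows and a cutoff of 2 in place of 10 to keep the sample small.

```sql
-- Hypothetical aggregated rows: /search and /report tie for second place.
WITH per_endpoint (endpoint, avg_latency) AS (
    VALUES ('/export', 300),
           ('/search', 150),
           ('/report', 150),
           ('/login',  50)
),
ranked AS (
    SELECT endpoint, avg_latency,
           RANK() OVER (ORDER BY avg_latency DESC) AS rnk
    FROM per_endpoint
)
SELECT endpoint, avg_latency, rnk
FROM ranked
WHERE rnk <= 2            -- keeps /export plus both tied endpoints (3 rows)
ORDER BY rnk, endpoint;

-- By contrast, "ORDER BY avg_latency DESC LIMIT 2" on per_endpoint
-- would return /export and only one of /search or /report.
```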

---

## Common follow-up questions

- When would you use DENSE_RANK instead of RANK? _(Tests understanding: DENSE_RANK never skips rank numbers.)_
- How would you rank within each HTTP method? _(Tests PARTITION BY usage in the window function.)_
- What if the ranking metric was p99 latency? _(Tests changing the aggregate while keeping the ranking structure.)_
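
For the last two follow-ups, one possible shape is sketched below. It assumes a `method` column, which the problem statement does not mention, and uses PostgreSQL's `percentile_cont` ordered-set aggregate for p99; the sample rows are made up.

```sql
-- Hypothetical sample rows; the `method` column is an assumption.
WITH api_calls (endpoint, method, latency) AS (
    VALUES ('/search', 'GET',  100),
           ('/search', 'GET',  200),
           ('/items',  'GET',  50),
           ('/items',  'POST', 300),
           ('/login',  'POST', 80)
),
per_endpoint AS (
    -- Same aggregate-first structure; only the metric changes.
    SELECT method,
           endpoint,
           percentile_cont(0.99) WITHIN GROUP (ORDER BY latency) AS p99_latency
    FROM api_calls
    WHERE latency IS NOT NULL
    GROUP BY method, endpoint
)
SELECT method, endpoint, p99_latency,
       -- PARTITION BY restarts the ranking for each HTTP method.
       RANK() OVER (PARTITION BY method ORDER BY p99_latency DESC) AS rnk
FROM per_endpoint
ORDER BY method, rnk;
```

Swapping `RANK` for `DENSE_RANK` in the final `SELECT` covers the first follow-up without touching the aggregation.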

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/endpoint_ranking)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.