# Average Update Call Latency

> Follow-up calls. How fast?

Canonical URL: <https://datadriven.io/problems/average_update_call_latency>

Domain: SQL · Difficulty: medium · Seniority: L3

## Problem

The platform team is investigating whether API latency improves after a user's initial handshake. An 'update call' is any API call after a user's earliest one. What is the average latency across all update calls platform-wide?

## Worked solution and explanation

### What this is really asking

`ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY call_time)` picks exactly one row as rn=1 per user, even when two calls share a call_time. Filtering rn > 1 removes the handshake row, then AVG(latency) runs over the rest of api_calls.

---

### Break down the requirements

#### Step 1: Tag each user's earliest call

Partition by user_id, order by call_time ascending, assign ROW_NUMBER. The user's first call gets rn=1; everything else is an update call.

#### Step 2: Drop the firsts, average the rest

Wrap the windowed query in a subquery and filter rn > 1 in the outer WHERE. AVG(latency) then runs platform-wide, not per user.

#### Step 3: Cast for a clean numeric

AVG on an INTEGER column can return an integer in some engines. CAST AS REAL forces a decimal so 47.3 does not collapse to 47.

---

### The solution

**AVERAGE UPDATE CALL LATENCY**

```sql
SELECT CAST(AVG(latency) AS REAL) AS avg_update_call_latency
FROM (
  SELECT *,
         ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY call_time) AS rn
  FROM api_calls
) ranked
WHERE rn > 1
```

> **Cost Analysis**
>
> ROW_NUMBER over 600M rows is one partitioned sort by (user_id, call_time). The call_time partition on api_calls does not help the sort, but the window is single-pass and streams into AVG without an extra join.

> **Interviewers Watch For**
>
> Why ROW_NUMBER and not RANK or MIN(call_time)? RANK ties both first calls at 1, MIN-with-NOT-IN drops both. ROW_NUMBER guarantees exactly one row per user is excluded.

> **Common Pitfall**
>
> Filtering rn > 1 inside the window query (next to the SELECT) does not work. WHERE runs before window functions are materialized, so the predicate must live in an outer query.

---

### COMMON FOLLOW-UP QUESTIONS

## Common follow-up questions

- How would you compute average update latency per user instead of platform-wide? _(Same subquery, then GROUP BY user_id in the outer query, AVG(latency) per group.)_
- What if a user has only one call ever? _(They contribute zero rows after the rn > 1 filter, which is correct: they have no update calls.)_
- Two calls share the same call_time for one user. Which is the handshake? _(ROW_NUMBER breaks the tie nondeterministically. Add a tiebreaker like call_id to ORDER BY for reproducibility.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/average_update_call_latency)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.