# Smooth Latency

> Noisy latency readings, smoothed into a trend you can trust.

Canonical URL: <https://datadriven.io/problems/smooth_latency>

Domain: SQL · Difficulty: medium · Seniority: L5

## Problem

Your team is investigating whether certain API endpoints are getting slower over time. For every row in api_calls where latency is not NULL, compute a running average of latency partitioned by endpoint and ordered by call_time, covering all calls up to and including the current one. Return endpoint, latency, and running_avg.

## Worked solution and explanation

### Why this problem exists in real interviews

Running aggregates per group are a staple of SRE dashboards and anomaly detection pipelines. Interviewers use this pattern to distinguish candidates who know window functions from those who try to self-join or use correlated subqueries, which fall apart at billion-row scale.

---

### Break down the requirements

#### Step 1: Filter nulls before windowing

The WHERE latency IS NOT NULL filter removes the ~0.1% of rows with missing latency before the window function runs. Filtering first is cheaper than letting the window absorb nulls.

#### Step 2: Partition by endpoint

PARTITION BY endpoint resets the running average for each endpoint independently. Without this, AVG accumulates across all endpoints and produces a meaningless global trend.

#### Step 3: Order and frame

ORDER BY call_time inside the window determines the sequence of calls. The default frame ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW covers every call up to the current one, which is exactly the running average definition.

---

### The solution

**Running average per endpoint**

```sql
SELECT
    endpoint,
    latency,
    AVG(latency) OVER (
        PARTITION BY endpoint
        ORDER BY call_time
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS running_avg
FROM api_calls
WHERE latency IS NOT NULL
```

> **Cost Analysis**
>
> At 2 billion rows partitioned by 300 endpoints, the engine sorts each partition independently. A composite index on (endpoint, call_time) converts the sort to an index scan and can cut wall time by 10x on warm cache.

> **Interviewers Watch For**
>
> Interviewers watch for whether you use PARTITION BY rather than GROUP BY (which would collapse the rows), whether you correctly frame the window, and whether you handle the NULL filter before or inside the window.

> **Common Pitfall**
>
> Omitting the frame clause relies on the database default, which varies. In most databases ORDER BY alone implies RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, not ROWS. For non-unique call_time values, RANGE ties all rows with the same timestamp together, so the running average jumps in steps rather than updating row by row. Explicit ROWS is safer.

---

## Common follow-up questions

- Two calls to the same endpoint share an identical call_time. How does your window frame handle ties, and would RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW give a different result than ROWS? _(Tests whether the candidate understands RANGE vs ROWS frame semantics when the ORDER BY key is non-unique.)_
- If you changed the WHERE to allow NULLs and used AVG(latency) over the full table, how would NULLs affect the running average compared to a query that replaces NULLs with zero? _(Tests knowledge of how NULL propagates through AVG vs COUNT.)_
- How would you change this query to compute a 5-call rolling average instead of the running average from the start of history? _(Tests ability to convert a running average into a bounded rolling window.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/smooth_latency)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.