# Intra-Region Latency Diff

> Same region. Different latency.

Canonical URL: <https://datadriven.io/problems/intra_region_latency_diff>

Domain: SQL · Difficulty: hard · Seniority: L5

## Problem

The infra team tracks node-level latency within regions. For each service node (svc_name), compute the average absolute latency difference between it and every other node in the same region, but only when both nodes have a status of 'healthy' and their uptime values are within 5 percentage points of each other.

## Worked solution and explanation

### What this is really asking

`svc_health` self-joins on `region` with `check_id != check_id`, so every healthy check pairs with every other healthy check in its region. The `ABS(uptime diff) <= 5` predicate is an inequality join, not a row filter.

---

### Break down the requirements

#### Step 1: Self-join the fact

Alias `svc_health` as `s1` and `s2`. Join on same `region` and `s1.check_id != s2.check_id` so a check never pairs with itself. Pairs appear twice (s1,s2) and (s2,s1), which is fine for an average.

#### Step 2: Apply pairwise predicates

Both sides need `status = 'healthy'`. The uptime band is `ABS(s1.uptime - s2.uptime) <= 5`. That predicate compares two rows, so it belongs in the join/WHERE, not a per-row filter.

#### Step 3: Aggregate from the left side

Group by `s1.svc_name` and take `AVG(ABS(s1.latency - s2.latency))`. The average is over every valid neighbor pair, weighted by how many neighbors each check has.

---

### The solution

**AVG PAIRWISE LATENCY DIFF PER SERVICE**

```sql
SELECT s1.svc_name,
       AVG(ABS(s1.latency - s2.latency)) AS avg_latency_diff
FROM svc_health s1
JOIN svc_health s2
  ON s1.region = s2.region
 AND s1.check_id != s2.check_id
WHERE s1.status = 'healthy'
  AND s2.status = 'healthy'
  AND ABS(s1.uptime - s2.uptime) <= 5
GROUP BY s1.svc_name;
```

> **Cost Analysis**
>
> A self-join on a 70M-row table is quadratic worst case. The `region` equality is the hash key; `ABS(uptime diff) <= 5` and `check_id != s2.check_id` are residual filters that the planner can't accelerate. Sub-region the join key in production.

> **Interviewers Watch For**
>
> Whether you put `status = 'healthy'` on both aliases. Filtering only `s1` lets unhealthy `s2` rows leak into the average.

> **Common Pitfall**
>
> Writing `s1.check_id < s2.check_id` to dedupe pairs changes the average if neighbor counts differ across checks. The expected query keeps both directions, so each check's neighborhood is fully represented.

---

### COMMON FOLLOW-UP QUESTIONS

## Common follow-up questions

- How would you cut self-join cost on 70M rows? _(Pre-aggregate the latest healthy check per `(svc_name, region)` in a CTE, then self-join the reduced set. You lose intra-day pairs but kill the cardinality.)_
- What if `uptime` is NULL for some checks? _(`ABS(NULL - x) <= 5` is unknown, so those pairs drop. If the policy is 'treat NULL uptime as healthy', wrap in `COALESCE(uptime, 100)` or guard with `IS NOT NULL`.)_
- How would you also return the neighbor count behind each average? _(Add `COUNT(*) AS pair_count` to the SELECT. It exposes whether a small-neighborhood node's average is statistically thin.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/intra_region_latency_diff)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.