# The Gap Reporter

> The missing IDs in the log - somebody has to notice.

Canonical URL: <https://datadriven.io/problems/the_gap_reporter>

Domain: Python · Difficulty: medium · Seniority: L4

## Problem

Given a sorted list of distinct integers and inclusive bounds lower and upper, return a list of [start, end] pairs covering all integers in [lower, upper] that are NOT in the list. Each pair is a maximal contiguous gap (single integers use [x, x]).

## Worked solution and explanation

### Why this problem exists in real interviews

This tests **range gap detection in a sorted sequence**, a pattern used in data quality audits, partition analysis, and monitoring alerting. It probes boundary handling and range arithmetic.

---

### Break down the requirements

#### Step 1: Start scanning from the lower bound

Track the next expected value, starting at `lower`.

#### Step 2: For each number in the sorted list, check for a gap

If the current number is greater than the expected value, the range `[expected, current - 1]` is missing.

#### Step 3: Handle the trailing gap

After processing all numbers, if the expected value is still within bounds, report the final gap `[expected, upper]`.

---

### The solution

**Linear scan with expected-value tracking**

```python
def find_gaps(nums: list, lower: int, upper: int) -> list:
    gaps = []
    expected = lower
    for num in nums:
        if num > expected:
            gaps.append([expected, num - 1])
        expected = num + 1
    if expected <= upper:
        gaps.append([expected, upper])
    return gaps
```

> **Time and Space Complexity**
>
> **Time:** O(n) where n is the length of the sorted input.
> 
> **Space:** O(g) where g is the number of gaps found.

> **Interviewers Watch For**
>
> Whether you handle the trailing gap after the loop. Many candidates forget to check `expected <= upper` after the last element.

> **Common Pitfall**
>
> Off-by-one errors on gap boundaries. The gap ends at `num - 1`, not `num`, because `num` itself is present.

---

## Common follow-up questions

- What if single missing values should be reported as a scalar, not a range? _(Tests formatting `[x, x]` as just `x`.)_
- What if the input is unsorted? _(Tests sorting first, adding O(n log n).)_
- How would you report this in SQL? _(Tests `LAG()` window function to compare consecutive rows.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/the_gap_reporter)
- [Python Interview Questions](https://datadriven.io/python-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.