# Longest Gap Between Token Events

> The longest gap between token events.

Canonical URL: <https://datadriven.io/problems/longest_gap_between_token_events>

Domain: SQL · Difficulty: medium · Seniority: L5

## Problem

The security team is reviewing API key rotation policy. Calculate the longest period in days between consecutive token issuances and the longest period between consecutive token expirations in api_tokens. Ignore null expiration dates. Return both maximum gap values.

## Worked solution and explanation

### Why this problem exists in real interviews

Time-series analysis over event streams often asks "what is the longest period where no activity happened?" Platform teams use it to spot outages, lulls, or batch-issued cohorts. The interview tests whether the candidate reaches for `LAG()` over an ordered stream instead of self-joining the table to itself on `id = id - 1` (which does not work when rows are not densely numbered).

> **Trick to Solving**
>
> "Longest gap between consecutive events across the whole table" = `MAX(t - LAG(t) OVER (ORDER BY t))`, with no PARTITION clause. In SQLite, express the time difference in days by subtracting `JULIANDAY()` values. The subtlety: issue and expire are two independent orderings of the same table, so each needs its own ordered scan in its own CTE.

---

### Break down the requirements

#### Step 1: Order by issued globally and take the LAG gap

Inside one CTE, sort every row in `api_tokens` by `issued` with no partitioning, and compute `JULIANDAY(issued) - JULIANDAY(LAG(issued) OVER (ORDER BY issued))`. Each row now carries the days since the previous issue across the whole table.

#### Step 2: Order by expires and take the LAG gap, excluding NULLs

Inside a second CTE, filter to `expires IS NOT NULL` first, then sort globally by `expires` and apply the same `LAG` trick. Tokens with no expiration drop out of this timeline as the prompt requires.

#### Step 3: Take the max of each stream

`MAX(gap)` over each CTE is the longest gap. The first row in each stream has a NULL gap (no prior row); `MAX` ignores NULLs.

#### Step 4: Assemble one row with two columns

Emit a single row using two scalar subqueries: one for the issue max, one for the expire max.

---

### The solution

**Two independent LAG streams over api_tokens**

```sql
WITH issue_gaps AS (
    SELECT JULIANDAY(issued)
           - JULIANDAY(LAG(issued) OVER (ORDER BY issued)) AS gap
    FROM api_tokens
),
expire_gaps AS (
    SELECT JULIANDAY(expires)
           - JULIANDAY(LAG(expires) OVER (ORDER BY expires)) AS gap
    FROM api_tokens
    WHERE expires IS NOT NULL
)
SELECT (SELECT MAX(gap) FROM issue_gaps)  AS max_issue_gap_days,
       (SELECT MAX(gap) FROM expire_gaps) AS max_expire_gap_days
```

> **Cost Analysis**
>
> `api_tokens` has 400,000 rows. Each CTE does one scan plus one sort for the window. Indexes on `issued` and `expires` turn both sorts into ordered range scans. The scalar subqueries each reduce to a single MAX.

> **Interviewers Watch For**
>
> The biggest signal is recognizing that the two gaps are independent global sorts of the same table, not a single query that tries to order by both columns. A second signal: filtering `expires IS NOT NULL` before the expire window so tokens that never expire do not participate in the expire timeline, as the prompt requires.

> **Common Pitfall**
>
> Adding `PARTITION BY token_id` turns this into a per-token gap query, which is a different question and gives a much smaller answer. Re-read the prompt: the gap is across the whole table, not per token. The other common trap is including tokens with NULL expirations in the expire stream; the prompt says they should not contribute.

---

## Common follow-up questions

- What if you wanted the longest gap per owner rather than globally? _(Add `PARTITION BY owner_id` to each window and group the final MAX by `owner_id`.)_
- How would you return the actual dates that bookended the longest gap? _(Keep `issued` and `LAG(issued)` in the CTE, then order by `gap DESC LIMIT 1` to grab the bookend timestamps alongside the max.)_
- With millions of tokens, what index strategy keeps the window ordered without an explicit sort? _(A plain btree index on `issued` and another on `expires` lets the planner read rows in window order and skip the sort entirely.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/longest_gap_between_token_events)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.