# Tenure Spread for Active Tokens

> Tenure extremes among active tokens.

Canonical URL: <https://datadriven.io/problems/tenure_spread_for_active_tokens>

Domain: SQL · Difficulty: hard · Seniority: L4

## Problem

Security wants the tenure profile of every API token still in service. A token is in service if it has no expiration date or its expiration is today or later. Across that population, report how many days separate the oldest and newest tokens, how many tokens were issued on that earliest day, and how many were issued on that latest day.

## Worked solution and explanation

### Why this problem exists in real interviews

Three small skills stacked into one query: a NULL-aware filter, a min/max bound, and tie counts at each end of that range. The interviewer is checking that the active filter handles `expires IS NULL` (never-expiring tokens) and that the boundary counts are computed in one pass instead of three round trips.

> **Trick to Solving**
>
> Two CTE layers keep the logic clean.
> 
> 1. `active_tokens`: filter to tokens with `expires IS NULL OR expires >= today`
> 2. `bounds`: take `MIN(issued)` and `MAX(issued)` across active_tokens
> 3. Outer: scalar subqueries against `active_tokens` count how many share the min and max issued dates

---

### Break down the requirements

#### Step 1: Define the active population

Build `active_tokens` with `WHERE expires IS NULL OR expires >= DATE('now')`. The `IS NULL` branch keeps tokens that never expire; the `>= today` branch keeps tokens whose expiration is today or in the future.

#### Step 2: Find the issue-date extremes

Compute `MIN(issued)` and `MAX(issued)` over the active set in a `bounds` CTE. Two scalar values: the earliest issue date and the latest.

#### Step 3: Day spread between the extremes

Cast to days: `CAST(JULIANDAY(max_issued) - JULIANDAY(min_issued) AS INTEGER)` gives the integer day gap on SQLite. On Postgres, `(max_issued - min_issued)` returns an interval that can be cast to integer days.

#### Step 4: Count ties at each boundary

Two scalar subqueries against `active_tokens` count rows where `issued = min_issued` and `issued = max_issued` to surface ties at each end. Project three columns: `day_spread`, `tokens_at_earliest`, `tokens_at_latest`.

---

### The solution

**NULL-aware filter, bounds CTE, scalar tie counts**

```sql
WITH active_tokens AS (
    SELECT token_id, issued, expires
    FROM api_tokens
    WHERE expires IS NULL OR expires >= DATE('now')
),
bounds AS (
    SELECT MIN(issued) AS min_issued, MAX(issued) AS max_issued
    FROM active_tokens
)
SELECT
    CAST(JULIANDAY(b.max_issued) - JULIANDAY(b.min_issued) AS INTEGER) AS day_spread,
    (SELECT COUNT(*) FROM active_tokens WHERE issued = b.min_issued) AS tokens_at_earliest,
    (SELECT COUNT(*) FROM active_tokens WHERE issued = b.max_issued) AS tokens_at_latest
FROM bounds b
```

> **Time and Space Complexity**
>
> **Time:** O(n) scan over `api_tokens` (500K rows) for the active filter and the bounds. The two scalar tie-count subqueries each touch the active set once more; on a partial index keyed by `issued` they collapse to small range probes.
> 
> **Space:** O(active) for the materialized active set; the bounds row is a single tuple.

> **Interviewers Watch For**
>
> Strong candidates flag the NULL trap immediately: `expires > today` silently drops never-expiring tokens because `NULL > anything` is NULL. They write `expires IS NULL OR expires >= today` from the start.

> **Common Pitfall**
>
> `WHERE expires >= DATE('now')` without the IS NULL branch. Tokens with `expires IS NULL` are silently excluded, so the spread and tie counts shrink. The `IS NULL OR ...` pattern is the only correct way to express "in service" when NULL means "no expiration".

---

## Common follow-up questions

- If `issued` were a TIMESTAMP instead of a DATE, what changes about the day spread calculation? _(Tests handling timestamps and timezones in date arithmetic.)_
- How would the query change if security wanted the spread and tie counts per `scope` instead of globally? _(Tests adapting the same shape to a per-group rollup.)_
- How would you produce the same three columns using `MIN()`/`MAX()` and `COUNT() FILTER` in a single SELECT, without CTEs? _(Tests collapsing CTEs into one window-function pass.)_
- If `api_tokens` grew to 100M rows, what index would let the active filter avoid a full scan? _(Tests partial-index design for selective filters.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/tenure_spread_for_active_tokens)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.