# Most Recent Token Usage

> Each user's latest token activity.

Canonical URL: <https://datadriven.io/problems/most_recent_token_usage>

Domain: SQL · Difficulty: easy · Seniority: L4

## Problem

The security team is auditing API token activity. For each token owner, pull only the single most recently used record. Include all token fields (ID, owner, scope, status, issued date, expiration, last used date, request count) and a row indicator.

## Worked solution and explanation

### Why this problem exists in real interviews

Per-entity latest-row queries are the bread and butter of operational reporting. Interviewers use this prompt to see whether you reach for ROW_NUMBER with a deterministic tie-break instead of GROUP BY + MAX, which silently loses non-aggregated columns or breaks on ties.

---

### Break down the requirements

#### Step 1: Pick one row per owner_id, latest first

Use ROW_NUMBER() OVER (PARTITION BY owner_id ORDER BY last_used DESC, token_id DESC). PARTITION BY owner_id makes each owner an independent ranking, and ORDER BY last_used DESC puts the freshest row at rank 1.

#### Step 2: Tie-break on largest token_id

If two tokens share the same last_used, the prompt says keep the larger token_id. Adding token_id DESC as a secondary sort key in the window guarantees a single rn = 1 row per owner with no nondeterminism.

#### Step 3: Filter rn = 1 in the outer query and project all columns

Wrap the windowed query in a subquery and filter WHERE rn = 1. Project every api_tokens column explicitly so the result schema matches the spec, and order the final output by owner_id ascending.

---

### The solution

**ROW_NUMBER per owner with explicit tie-break**

```sql
SELECT token_id, owner_id, scope, status, issued, expires, last_used, requests FROM (
  SELECT *, ROW_NUMBER() OVER (PARTITION BY owner_id ORDER BY last_used DESC, token_id DESC) AS rn FROM api_tokens
) WHERE rn = 1
ORDER BY owner_id
```

> **Cost Analysis**
>
> api_tokens holds 200,000 rows. The window function does one full scan plus an in-memory sort partitioned by owner_id. With an index on (owner_id, last_used DESC, token_id DESC) SQLite can stream the partitions in order and stop at the first row of each partition, but at 200k rows the unindexed plan is fine. The cost is dominated by the partitioned sort, not the final filter.

> **Interviewers Watch For**
>
> They want ROW_NUMBER with both sort keys spelled out. GROUP BY owner_id + MAX(last_used) is wrong because it cannot return token_id, scope, status, etc. for the chosen row without a self-join. They also check that you order the final output by owner_id ascending rather than relying on the engine.

> **Common Pitfall**
>
> Omitting token_id DESC from the window's ORDER BY produces nondeterministic results when two rows share last_used. The query passes some test runs and fails others, which is the worst kind of bug. The other trap is using RANK or DENSE_RANK instead of ROW_NUMBER, which keeps both tied rows and breaks the one-row-per-owner contract.

---

## Common follow-up questions

- How would you write this without a window function in plain SQLite? _(Forces a self-join: api_tokens t JOIN (SELECT owner_id, MAX(last_used) AS mx FROM api_tokens GROUP BY owner_id) m ON ... and a follow-up tie-break by token_id. The candidate should mention this is more code and slower than ROW_NUMBER.)_
- What changes if last_used can be NULL and you still want one row per owner? _(Tests whether the candidate orders NULLs correctly. SQLite sorts NULL first in ascending and last in descending, so DESC pushes NULL last, which usually matches intent but should be stated.)_
- How would you adapt this to return the latest two tokens per owner instead of one? _(Trivial change to WHERE rn <= 2, but the candidate should note that the tie-break suddenly matters more because two rows are kept.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/most_recent_token_usage)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.