# The Spender Leaderboard

> Spending speaks. The leaderboard does the listening.

Canonical URL: <https://datadriven.io/problems/top_spenders_dense_rank>

Domain: SQL · Difficulty: easy · Seniority: L3

## Problem

Show the top 5 users by total transaction value. Tied users share the same rank with no gaps. Include all tied users at each rank.

## Worked solution and explanation

### Why this problem exists in real interviews

This is a foundational window function problem. Interviewers check whether you know the difference between `ROW_NUMBER`, `RANK`, and `DENSE_RANK`, and whether you aggregate before ranking.

> **Trick to Solving**
>
> "Tied users share the same rank with no gaps" and "include all tied users" are the two signals for `DENSE_RANK`. `ROW_NUMBER` would break ties arbitrarily; `RANK` would skip numbers after ties.
> 
> 1. Aggregate total spend per user
> 2. Apply `DENSE_RANK()` over the aggregate
> 3. Filter to rank `<= 5`

---

### Break down the requirements

#### Step 1: Aggregate total per user

`GROUP BY user_id` with `SUM(total_amount)` collapses 80M transactions to 4M user-level totals.

#### Step 2: Rank with DENSE_RANK

`DENSE_RANK() OVER (ORDER BY total_spend DESC)` assigns the same rank to tied values without gaps.

#### Step 3: Filter to top 5 ranks

Wrap in a subquery and filter `WHERE rnk <= 5` to include all users at ranks 1 through 5.

---

### The solution

**Aggregate then dense rank for tie-inclusive top-N**

```sql
SELECT user_id, total_spend
FROM (
    SELECT
        user_id,
        SUM(total_amount) AS total_spend,
        DENSE_RANK() OVER (ORDER BY SUM(total_amount) DESC) AS rnk
    FROM transactions
    GROUP BY user_id
) ranked
WHERE rnk <= 5
```

> **Cost Analysis**
>
> The `GROUP BY` reduces 80M rows to 4M. The window sort on 4M rows is the bottleneck. If performance is critical, a materialized view of user-level totals would eliminate the repeated aggregation.

> **Interviewers Watch For**
>
> Using `LIMIT 5` instead of `DENSE_RANK` is the most common error. `LIMIT` silently drops tied users and produces non-deterministic output.

> **Common Pitfall**
>
> Ranking individual transactions instead of user-level aggregates. Always aggregate to the output grain before applying window functions.

---

## Common follow-up questions

- What if you needed exactly 5 rows, breaking ties by user_id? _(Now `ROW_NUMBER` with a compound ORDER BY is the correct choice.)_
- How would the result change if some users had negative transaction amounts? _(Tests awareness that SUM includes negatives, potentially reordering rankings.)_
- Could you solve this without a subquery? _(Some engines support `QUALIFY` for inline window-function filtering.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/top_spenders_dense_rank)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.