# User With Most Transactions

> The most active buyer.

Canonical URL: <https://datadriven.io/problems/user_with_most_transactions>

Domain: SQL · Difficulty: medium · Seniority: L4

## Problem

The VIP program is honoring the user with the most transactions. Pull their username. If multiple users tie for the top spot, include all of them.

## Worked solution and explanation

### Why this problem exists in real interviews

This tests top-1 with tie inclusion. Interviewers check whether you handle the "if ties, include all" requirement with a subquery or window function instead of `LIMIT 1`.

---

### Break down the requirements

#### Step 1: Count transactions per user

`GROUP BY user_id` with `COUNT(*)` produces the transaction count for each user.

#### Step 2: Find the maximum count

A subquery computes `MAX(txn_count)` to identify the top count.

#### Step 3: Return all users at the top count

Filter to users whose count equals the maximum, then join to `users` for the username.

---

### The solution

**Tie-inclusive top-1 with subquery**

```sql
SELECT u.username
FROM users u
JOIN (
    SELECT user_id, COUNT(*) AS txn_count
    FROM transactions
    GROUP BY user_id
) t ON u.user_id = t.user_id
WHERE t.txn_count = (
    SELECT MAX(txn_count)
    FROM (
        SELECT COUNT(*) AS txn_count
        FROM transactions
        GROUP BY user_id
    )
)
```

> **Cost Analysis**
>
> Two GROUP BY passes on 80M transactions (one for the main query, one for the MAX subquery). The optimizer may cache the result. The join to users is on the filtered set (1 or a few users), so it is trivial.

> **Interviewers Watch For**
>
> Using `LIMIT 1` instead of the MAX-based approach. LIMIT silently drops tied users and is non-deterministic. The prompt says "include all of them."

> **Common Pitfall**
>
> Computing the GROUP BY twice in separate subqueries instead of using a CTE to share the result. While both are correct, a CTE avoids redundant computation.

---

## Common follow-up questions

- How would you solve this with DENSE_RANK instead? _(DENSE_RANK() OVER (ORDER BY COUNT(*) DESC) = 1 is an alternative approach.)_
- What if you also needed the transaction count alongside the username? _(Add txn_count to the SELECT list.)_
- How would you find the top 3 users by transaction count with tie inclusion? _(Change the MAX-based filter to DENSE_RANK <= 3.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/user_with_most_transactions)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.