# Cart Sizes

> Power buyers. Big carts.

Canonical URL: <https://datadriven.io/problems/cart_sizes>

Domain: SQL · Difficulty: easy · Seniority: L3

## Problem

The product team is designing a bulk-checkout experience and wants to identify the most active buyers. For each user, show their username and how many transactions they have completed. Only surface users with more than two purchases, listed from most active to least.

## Worked solution and explanation

### Why this problem exists in real interviews

This tests a basic JOIN with GROUP BY and HAVING. Interviewers use it to verify you can combine data from two tables, aggregate correctly, apply a post-aggregation filter, and sort the output.

---

### Break down the requirements

#### Step 1: Join users to transactions

Join `users` to `transactions` on `user_id` to associate each transaction with a username.

#### Step 2: Count transactions per user

`GROUP BY u.user_id, u.username` with `COUNT(*)` gives the transaction count per user.

#### Step 3: Filter and sort

`HAVING COUNT(*) > 2` removes low-activity users. `ORDER BY COUNT(*) DESC` ranks most active first.

---

### The solution

**Join, aggregate, and filter with HAVING**

```sql
SELECT u.username, COUNT(*) AS transaction_count
FROM users u
JOIN transactions t ON u.user_id = t.user_id
GROUP BY u.user_id, u.username
HAVING COUNT(*) > 2
ORDER BY transaction_count DESC
```

> **Cost Analysis**
>
> The join between 1M users and 5M transactions is driven by the smaller table. An index on `transactions.user_id` enables a nested-loop join. The HAVING filter reduces output but does not reduce join cost.

> **Common Pitfall**
>
> Grouping only by `username` without `user_id` risks merging distinct users who share a name. Always group by the primary key when including a non-key display column.

---

## Common follow-up questions

- What if you also needed to show users with zero transactions? _(Tests LEFT JOIN and COALESCE to surface users with no purchases.)_
- How would you modify this to show the top 10 buyers only? _(Tests LIMIT usage and whether HAVING is still needed.)_
- What happens if a user has exactly 2 transactions? _(Verifies understanding of strict inequality in HAVING.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/cart_sizes)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.