# Top Selling Items

> Revenue crowns the winners. Who sold the most?

Canonical URL: <https://datadriven.io/problems/top_selling_items>

Domain: SQL · Difficulty: easy · Seniority: L5

## Problem

Pull the top 5 revenue-generating products across the entire catalog. Show the product name and total revenue.

## Worked solution and explanation

### Why this problem exists in real interviews

This is a top-N-after-aggregation problem built on the `products` table. It tests whether you can aggregate to one row per product first (in a CTE, a subquery, or a single grouped query), then rank the totals and filter to the desired slice.

---

### Break down the requirements

#### Step 1: Join the tables

Join `products` to `transactions` on the shared key (most likely `product_id`) so that every transaction row carries the name of the product it belongs to.

#### Step 2: Aggregate per product_id

`GROUP BY product_id` with `SUM` over the transaction revenue column collapses the joined rows to one summary row per product.

#### Step 3: Rank the results

`ORDER BY` the aggregate descending, then `LIMIT 5` to keep only the top five products.
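The three steps can be sketched end to end on a handful of inline rows. The sample values and the `amount` column name are assumptions made for illustration; the real schema may name the revenue column differently.

```sql
-- Illustrative data only: six products, so LIMIT 5 has something to cut.
WITH products (product_id, product_name) AS (
    VALUES (1, 'Laptop'), (2, 'Phone'), (3, 'Tablet'),
           (4, 'Monitor'), (5, 'Keyboard'), (6, 'Mouse')
),
transactions (product_id, amount) AS (  -- amount is an assumed column name
    VALUES (1, 1200), (1, 1100),
           (2, 800), (2, 750), (2, 700),
           (3, 400),
           (4, 300), (4, 250),
           (5, 90),
           (6, 25), (6, 30)
)
SELECT
    p.product_name,
    SUM(t.amount) AS total_revenue          -- Step 2: one total per product
FROM products AS p
JOIN transactions AS t                      -- Step 1: join on the shared key
    ON t.product_id = p.product_id
GROUP BY p.product_id, p.product_name
ORDER BY total_revenue DESC                 -- Step 3: rank, then cut to five
LIMIT 5;

-- product_name | total_revenue
-- Laptop       | 2300
-- Phone        | 2250
-- Monitor      | 550
-- Tablet       | 400
-- Keyboard     | 90
```

Mouse totals 55 and is the one product dropped by the limit.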

---

### The solution

**Join products to transactions, sum revenue per product, top 5 by total**

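```sql
-- Assumes transactions has product_id and a per-row revenue column,
-- called amount here; substitute the real column name from the schema.
SELECT
    p.product_name,
    SUM(t.amount) AS total_revenue
FROM products AS p
JOIN transactions AS t
    ON t.product_id = p.product_id
GROUP BY p.product_id, p.product_name
ORDER BY total_revenue DESC
LIMIT 5;
```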

> **Cost Analysis**
>
> The GROUP BY collapses the joined rows to one row per distinct `product_id`, which is at most 15K rows given the size of `products`. A covering index on `transactions` that leads with `product_id` and includes the revenue column can let the aggregate run as an index-only scan.

> **Interviewers Watch For**
>
> Interviewers verify you aggregate before sorting. Sorting raw rows gives per-row values, not group totals. The correct grain is one row per `product_id`.

> **Common Pitfall**
>
> Using the wrong aggregate function. `SUM` gives totals, `COUNT` gives volume, `AVG` gives the mean value per row. Read the prompt to determine which metric is needed; here it asks for total revenue, so `SUM` is the right choice.
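A small sketch of why the choice of aggregate matters: on the made-up rows below, the three functions disagree about which product is on top. The `amount` column name and all values are assumptions for illustration.

```sql
-- Illustrative data only: one expensive low-volume product,
-- one cheap high-volume product.
WITH products (product_id, product_name) AS (
    VALUES (1, 'Laptop'), (2, 'Mouse')
),
transactions (product_id, amount) AS (  -- amount is an assumed column name
    VALUES (1, 1000), (1, 1400),
           (2, 20), (2, 30), (2, 25), (2, 25)
)
SELECT
    p.product_name,
    SUM(t.amount)           AS total_revenue,      -- what the prompt asks for
    COUNT(*)                AS transaction_count,  -- volume, not revenue
    ROUND(AVG(t.amount), 2) AS avg_amount          -- mean per transaction
FROM products AS p
JOIN transactions AS t
    ON t.product_id = p.product_id
GROUP BY p.product_id, p.product_name
ORDER BY total_revenue DESC;

-- product_name | total_revenue | transaction_count | avg_amount
-- Laptop       | 2400          | 2                 | 1200.00
-- Mouse        | 100           | 4                 | 25.00
```

Ordering by `transaction_count` instead would put Mouse first, which answers a different question from the one the prompt poses.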

---

## Common follow-up questions

- If two products share the same product_name but different product_ids, does your GROUP BY merge them? _(Tests grouping granularity; GROUP BY product_name merges them, which may or may not be desired.)_
- Should LIMIT 5 include ties at position 5, or is truncation acceptable here? _(Tests whether the prompt's 'top 5' implies strict limit or tie inclusion.)_
- If a product has no transactions, should it appear with zero revenue? _(Tests LEFT JOIN vs INNER JOIN; the prompt says 'top 5 revenue-generating', implying zero-revenue products are irrelevant.)_
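The follow-ups above can be explored with one variant of the query. This is a sketch under assumed data and an assumed `amount` column: it groups by `product_id` so same-named products stay separate, uses a `LEFT JOIN` so products without transactions get zero revenue, and uses `RANK()` so ties at position 5 are kept rather than truncated.

```sql
-- Illustrative data only: Keyboard and Mouse tie for fifth place,
-- and Webcam has no transactions at all.
WITH products (product_id, product_name) AS (
    VALUES (1, 'Laptop'), (2, 'Phone'), (3, 'Monitor'), (4, 'Tablet'),
           (5, 'Keyboard'), (6, 'Mouse'), (7, 'Webcam')
),
transactions (product_id, amount) AS (  -- amount is an assumed column name
    VALUES (1, 2300), (2, 2250), (3, 550), (4, 400),
           (5, 45), (5, 45),
           (6, 90)
),
revenue AS (
    -- Aggregate first: one row per product_id, zero for unsold products.
    SELECT
        p.product_id,
        p.product_name,
        COALESCE(SUM(t.amount), 0) AS total_revenue
    FROM products AS p
    LEFT JOIN transactions AS t
        ON t.product_id = p.product_id
    GROUP BY p.product_id, p.product_name
),
ranked AS (
    -- Rank the totals; RANK() gives tied totals the same position.
    SELECT
        product_name,
        total_revenue,
        RANK() OVER (ORDER BY total_revenue DESC) AS revenue_rank
    FROM revenue
)
SELECT product_name, total_revenue, revenue_rank
FROM ranked
WHERE revenue_rank <= 5
ORDER BY revenue_rank, product_name;

-- product_name | total_revenue | revenue_rank
-- Laptop       | 2300          | 1
-- Phone        | 2250          | 2
-- Monitor      | 550           | 3
-- Tablet       | 400           | 4
-- Keyboard     | 90            | 5
-- Mouse        | 90            | 5
```

Six rows come back because of the tie; Webcam ranks seventh with zero revenue and is filtered out. In Postgres 13 and later, `ORDER BY total_revenue DESC FETCH FIRST 5 ROWS WITH TIES` is a shorter way to keep ties when the rank column itself is not needed.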

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/top_selling_items)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.