# Top Shelf

> Buyers need to know ceiling prices before negotiating with vendors.

Canonical URL: <https://datadriven.io/problems/top_shelf>

Domain: SQL · Difficulty: easy · Seniority: L3

## Problem

The procurement team negotiates vendor contracts based on price ceilings in each product category. For every category, they need the highest price and the name of the product at that price point. If a product has no price on record, skip it. Show only the top three most expensive categories by their ceiling price.

## Worked solution and explanation

### Why this problem exists in real interviews

This tests combining aggregation with NULL handling and a `LIMIT` clause. Interviewers check whether you filter NULLs before or after aggregation and whether you can retrieve both the aggregate and a correlated column.

---

### Break down the requirements

#### Step 1: Exclude NULL prices

`WHERE price IS NOT NULL` removes products with no price on record. With 8% null fraction, this filters out ~480 of 6,000 products.

#### Step 2: Find the max price per category with the product name

Use a window function or self-join to pair `MAX(price)` per category with the corresponding `product_name`. A simple `GROUP BY` alone cannot retrieve the product name.

#### Step 3: Limit to top 3 categories

`ORDER BY max_price DESC LIMIT 3` surfaces the three most expensive categories.

---

### The solution

**Window function to pair max price with product name**

```sql
SELECT category, product_name, price AS max_price
FROM (
    SELECT
        category,
        product_name,
        price,
        ROW_NUMBER() OVER (
            PARTITION BY category
            ORDER BY price DESC
        ) AS rn
    FROM products
    WHERE price IS NOT NULL
) ranked
WHERE rn = 1
ORDER BY max_price DESC
LIMIT 3
```

> **Cost Analysis**
>
> With only 6,000 rows and 15 categories, the window function sorts a few hundred rows per partition. This is trivially fast. An index on `(category, price DESC)` would optimize the partition sort.

> **Interviewers Watch For**
>
> Candidates who try `GROUP BY category` with `MAX(price)` but cannot retrieve `product_name` without a subquery or join. This is a common stumbling point for junior candidates.

> **Common Pitfall**
>
> Selecting `product_name` alongside `GROUP BY category, MAX(price)` without a window function or self-join produces undefined behavior in some SQL engines. Always use a deterministic approach.

---

## Common follow-up questions

- What if two products tie for the highest price in a category? _(Tests awareness that `ROW_NUMBER` picks one arbitrarily; `RANK` or `DENSE_RANK` would include ties.)_
- How would you solve this with a self-join instead of a window function? _(Tests alternative approaches: join products to a subquery of MAX(price) per category.)_
- What if you needed the top product per category sorted by rating as a tiebreaker? _(Tests compound ORDER BY inside the window function.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/top_shelf)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.