# Price Rank

> In every category, someone charges the most. Who's on top?

Canonical URL: <https://datadriven.io/problems/price_rank>

Domain: SQL · Difficulty: medium · Seniority: L4

## Problem

The catalog team wants competitive price positioning inside each category. Within each category, rank products by price with the most expensive at position 1, and let ties share a position. Skip rows where price is NULL. Return the product_name, category, price, and its position.

## Worked solution and explanation

### Why this problem exists in real interviews

The core skill being tested is DENSE_RANK for gapless ranking, applied to the `products` table in a product analytics context. Getting columns like `price`, `rating`, and `category` right is where most candidates slip.

---

### Break down the requirements

#### Step 1: Filter out NULL prices

`WHERE price IS NOT NULL` ensures NULL ratings do not receive a rank.

#### Step 2: Rank within each category

`DENSE_RANK() OVER (PARTITION BY category ORDER BY price DESC)` assigns positions with ties sharing a rank and no gaps.

#### Step 3: Return the required columns

Select `product_name`, `category`, `price`, and the rank.

---

### The solution

**Dense-rank for price rank**

```sql
SELECT product_name, category, price, rnk AS position
FROM (
    SELECT
        product_name,
        category,
        price,
        DENSE_RANK() OVER (PARTITION BY category ORDER BY price DESC) AS rnk
    FROM products
    WHERE price IS NOT NULL
) ranked
```

> **Cost Analysis**
>
> With `products` (18,000 rows), this query scans a small dataset. No indexing is needed for this volume. At production scale, an index on the primary filter column would improve performance.

> **Interviewers Watch For**
>
> Interviewers evaluate whether you translate the English requirements into the correct SQL clauses on the first attempt. They watch for clean syntax, correct column references, and whether you verify edge cases before declaring the query complete.

> **Common Pitfall**
>
> The most common mistake is misreading the prompt's filtering or grouping requirements. Double-check which columns to group by, which to aggregate, and whether the output should be filtered with `WHERE` (before grouping) or `HAVING` (after grouping).

---

## Common follow-up questions

- The `rating` column in `products` has a 5% null rate. How does your query handle rows where `rating` is NULL, and could that silently change the result count? _(Tests whether the candidate accounts for NULLs in `products.rating` and understands how aggregates skip NULL values.)_
- The `category` column in `products` is heavily skewed toward a few popular values. How would data skew affect parallel execution of your query? _(Tests understanding of skew in `products.category` and its impact on distributed query performance.)_
- What is the default window frame for your window function, and would explicitly setting ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW change anything? _(Tests knowledge of implicit vs explicit window frame specifications.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/price_rank)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.