# Top 10 Rated Products

> The ten highest-rated items.

Canonical URL: <https://datadriven.io/problems/top_10_rated_products>

Domain: SQL · Difficulty: medium · Seniority: L3

## Problem

Each product may have multiple reviews, but the rating column already reflects the product's average. Find the top 10 products by rating, from highest to lowest.

## Worked solution and explanation

### Why this problem exists in real interviews

Top-N leaderboards are the most common SQL ask in screening rounds because they bundle three skills into one short query: NULL handling, multi-key ORDER BY, and LIMIT. Interviewers also check whether you read the schema carefully. The `products.rating` column is already a per-product rating, so candidates who reach for AVG and GROUP BY signal they did not look at the table before writing.

---

### Break down the requirements

#### Step 1: Filter out unrated products

`products.rating` is DOUBLE with a 5% NULL fraction. 'Ignore products with no rating' means a `WHERE rating IS NOT NULL` filter, which both removes those rows from contention and makes sure they cannot land in the top 10.

#### Step 2: Sort with an explicit tie-breaker

Highest rating first means `ORDER BY rating DESC`. The alphabetical tie-break is `product_name ASC`. Listing both keys explicitly is what makes the output deterministic when many products share the same top rating value (only 50 distinct rating values across 15K products, so ties are guaranteed).

#### Step 3: Limit and project

`LIMIT 10` cuts the sorted result to the requested 10 rows. Project just `product_name, rating` (the two columns the prompt asks for).

---

### The solution

**Filter, sort with tie-break, limit**

```sql
SELECT product_name, rating FROM products WHERE rating IS NOT NULL ORDER BY rating DESC, product_name ASC LIMIT 10
```

> **Cost Analysis**
>
> Across 15,000 rows the planner does a single sequential scan, applies the NULL filter (drops about 750 rows), then a top-K sort that only has to keep the running 10 best in memory. An index on `(rating DESC, product_name ASC)` would let the planner walk the index and stop after 10 non-NULL rows, but at this row count the sequential scan is already sub-millisecond.

> **Interviewers Watch For**
>
> The make-or-break tells are: did you over-engineer with AVG/GROUP BY when `rating` is already an aggregate, did you add the alphabetical tie-break, and did you handle NULL with `IS NOT NULL` rather than `rating != NULL` (which always returns UNKNOWN and silently drops every row).

> **Common Pitfall**
>
> Skipping the tie-break. With only 50 distinct rating values across 15K products, the top rating likely has many ties. Without `product_name ASC` the database picks an arbitrary 10 from the tied bucket and the result fails the deterministic comparison the grader expects.

---

## Common follow-up questions

- How would you return the top 10 rated products within each `category` instead of overall? _(Tests whether the candidate reaches for `ROW_NUMBER() OVER (PARTITION BY category ORDER BY rating DESC, product_name)` with an outer filter `WHERE rn <= 10`.)_
- How would you change the query so only `in_stock = 1` products can appear in the top 10? _(Tests filter composition. Adding `AND in_stock = 1` to the WHERE clause before LIMIT keeps only stocked products in contention. The `in_stock` column is INTEGER 0/1 with single_hot skew.)_
- What would change if you wanted to discount products with very few reviews from the leaderboard? _(Tests Bayesian ranking intuition. With `rating` already aggregated, the schema does not expose review counts, so the candidate should call out that this requires either a separate reviews table or a stored review_count column to weight low-volume ratings down.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/top_10_rated_products)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.