# Price Check

> Priced to sell or priced to sit?

Canonical URL: <https://datadriven.io/problems/price_check>

Domain: SQL · Difficulty: easy · Seniority: L3

## Problem

The finance team is reviewing pricing strategy across the product catalog. For each category, they need to see the average price and the number of products that actually have a price on file. Exclude any product whose price is missing, and list categories from most expensive on average to least.

## Worked solution and explanation

### Why this problem exists in real interviews

Built around the `products` table, this challenge probes your ability to apply grouped COUNT aggregation in a product analytics setting. Correctly referencing the `price` and `category` columns is essential to a working solution.

---

### Break down the requirements

#### Step 1: Filter out NULL prices

`WHERE price IS NOT NULL` ensures only products with a price are counted.

#### Step 2: Group by category

`GROUP BY category` aggregates per category.

#### Step 3: Compute average and count

`AVG(price)` and `COUNT(*)` produce the two requested metrics.

#### Step 4: Order by average price descending

`ORDER BY AVG(price) DESC` surfaces the most expensive categories first.

---

### The solution

**Filter out null prices to find price check**

```sql
SELECT
    category,
    AVG(price) AS avg_price,
    COUNT(*) AS product_count
FROM products
WHERE price IS NOT NULL
GROUP BY category
ORDER BY avg_price DESC
```

> **Cost Analysis**
>
> With `products` (6,000 rows), this query scans a small dataset. No indexing is needed for this volume. At production scale, an index on the primary filter column would improve performance.

> **Interviewers Watch For**
>
> Interviewers evaluate whether you translate the English requirements into the correct SQL clauses on the first attempt. They watch for clean syntax, correct column references, and whether you verify edge cases before declaring the query complete.

> **Common Pitfall**
>
> The most common mistake is misreading the prompt's filtering or grouping requirements. Double-check which columns to group by, which to aggregate, and whether the output should be filtered with `WHERE` (before grouping) or `HAVING` (after grouping).

---

## Common follow-up questions

- What result would you get if every value in `products.rating` were NULL? Would your query return an empty set or something unexpected? _(Tests extreme NULL scenarios and whether the candidate guards against edge cases in `rating`.)_
- The `in_stock` column in `products` is dominated by a single hot value. How would data skew affect parallel execution of your query? _(Tests understanding of skew in `products.in_stock` and its impact on distributed query performance.)_
- `products.in_stock` only has 2 distinct values. If a new category were added, would your query automatically include it? _(Tests whether the query hard-codes values or dynamically adapts to `in_stock` changes.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/price_check)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.