# Categories With Mixed Price Tiers

> Categories that span both high-value and lower-value transactions.

Canonical URL: <https://datadriven.io/problems/categories_with_mixed_price_tiers>

Domain: SQL · Difficulty: medium · Seniority: L4

## Problem

A recommendation model needs product categories that span both high-value (over $500) and lower-value ($500 or less) transactions, since mixing price tiers improves recommendation diversity. Which categories have seen both kinds of transactions? Return only the category name.

## Worked solution and explanation

### Why this problem exists in real interviews

This tests conditional aggregation with HAVING to detect categories that contain rows satisfying two distinct conditions. Interviewers probe whether you can express set-based existence checks using aggregates rather than correlated subqueries.

> **Trick to Solving**
>
> The phrase "both high-value and lower-value" is the signal for a HAVING clause with two conditions. Whenever a prompt asks "which groups contain at least one X and at least one Y," think conditional counts.
>
> 1. Join `transactions` to `products` to get each transaction's category and amount
> 2. Group by category
> 3. Use `HAVING SUM(CASE WHEN total_amount > 500 ...) > 0 AND SUM(CASE WHEN total_amount <= 500 ...) > 0`

---

### Break down the requirements

#### Step 1: Join tables for category context

Join `transactions` to `products` on `product_id` to get the `category` column alongside `total_amount`.

#### Step 2: Group by category

`GROUP BY p.category` collapses rows into one per category.

#### Step 3: Apply dual HAVING conditions

Use conditional aggregation to confirm each category has at least one transaction over $500 and at least one at or below $500.
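
Before adding the HAVING filter, it can help to look at the two conditional counts side by side. The sketch below uses a handful of made-up rows (the product IDs, categories, and amounts are hypothetical, and the `transaction_id` column is an assumption) supplied through inline `VALUES` lists so it runs without any existing tables.

```sql
-- Hypothetical sample data; column names follow the tables described above.
WITH products (product_id, category) AS (
    VALUES (1, 'Electronics'),
           (2, 'Electronics'),
           (3, 'Books'),
           (4, 'Furniture')
),
transactions (transaction_id, product_id, total_amount) AS (
    VALUES (101, 1, 899.00),
           (102, 2, 45.00),
           (103, 3, 20.00),
           (104, 3, 35.00),
           (105, 4, 1200.00)
)
SELECT p.category,
       -- Counts rows strictly above the $500 threshold
       SUM(CASE WHEN t.total_amount > 500 THEN 1 ELSE 0 END)  AS high_value_count,
       -- Counts rows at or below the $500 threshold
       SUM(CASE WHEN t.total_amount <= 500 THEN 1 ELSE 0 END) AS low_value_count
FROM transactions t
JOIN products p ON t.product_id = p.product_id
GROUP BY p.category
ORDER BY p.category;
```

With this sample data, only Electronics has both counts above zero, so it is the only category the dual HAVING conditions would keep.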

---

### The solution

**Conditional aggregation with dual HAVING**

```sql
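SELECT p.category
FROM transactions t
JOIN products p ON t.product_id = p.product_id
GROUP BY p.category
-- Keep a category only if it has at least one transaction over $500...
HAVING SUM(CASE WHEN t.total_amount > 500 THEN 1 ELSE 0 END) > 0
   -- ...and at least one transaction at or below $500
   AND SUM(CASE WHEN t.total_amount <= 500 THEN 1 ELSE 0 END) > 0;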
```
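
The same check can also be written with the aggregate `FILTER` clause, which PostgreSQL supports but some other engines do not. This is an alternative sketch, not the reference solution, and the inline sample rows are hypothetical.

```sql
-- Hypothetical sample data so the query runs on its own.
WITH products (product_id, category) AS (
    VALUES (1, 'Electronics'),
           (2, 'Electronics'),
           (3, 'Books'),
           (4, 'Furniture')
),
transactions (transaction_id, product_id, total_amount) AS (
    VALUES (101, 1, 899.00),
           (102, 2, 45.00),
           (103, 3, 20.00),
           (104, 3, 35.00),
           (105, 4, 1200.00)
)
SELECT p.category
FROM transactions t
JOIN products p ON t.product_id = p.product_id
GROUP BY p.category
-- FILTER restricts which rows each COUNT(*) sees
HAVING COUNT(*) FILTER (WHERE t.total_amount > 500) > 0
   AND COUNT(*) FILTER (WHERE t.total_amount <= 500) > 0;
```

Both forms compute the two counts in one aggregation pass. `FILTER` tends to read more directly, while the `CASE` form is more portable.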

> **Cost Analysis**
>
> The query joins 150M transactions to 40K products, most likely with a hash join on `product_id`. The GROUP BY reduces the result to one row per distinct category (typically under 100). The conditional sums add negligible overhead to the aggregation pass.

> **Interviewers Watch For**
>
> The clean approach uses conditional aggregation in HAVING. Candidates who write two separate EXISTS subqueries produce correct SQL, but it typically scans the data more than once. Interviewers generally prefer the single-pass approach.
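
For comparison, the two-subquery approach mentioned above might look like the sketch below. The inline sample rows are hypothetical, and the exact shape a candidate writes will vary.

```sql
-- Hypothetical sample data so the query runs on its own.
WITH products (product_id, category) AS (
    VALUES (1, 'Electronics'),
           (2, 'Electronics'),
           (3, 'Books'),
           (4, 'Furniture')
),
transactions (transaction_id, product_id, total_amount) AS (
    VALUES (101, 1, 899.00),
           (102, 2, 45.00),
           (103, 3, 20.00),
           (104, 3, 35.00),
           (105, 4, 1200.00)
)
SELECT DISTINCT p.category
FROM products p
-- First probe: is there any high-value transaction in this category?
WHERE EXISTS (
          SELECT 1
          FROM transactions t
          JOIN products p2 ON t.product_id = p2.product_id
          WHERE p2.category = p.category
            AND t.total_amount > 500
      )
  -- Second probe: is there any lower-value transaction in this category?
  AND EXISTS (
          SELECT 1
          FROM transactions t
          JOIN products p2 ON t.product_id = p2.product_id
          WHERE p2.category = p.category
            AND t.total_amount <= 500
      );
```

It returns the same categories as the HAVING version, but each `EXISTS` repeats the join, which is why the single-pass form is usually easier to defend.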

> **Common Pitfall**
>
> Writing `total_amount >= 500` instead of `> 500` counts $500 transactions as high-value, which contradicts "over $500". If the low-value check stays at `<= 500`, a $500 transaction then satisfies both conditions, so a category containing only $500 transactions would wrongly qualify.
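
The boundary behavior is easy to check on a category whose transactions are all exactly $500. In this sketch the `sample_rows` name, the 'Gift Cards' category, and the flattened layout (category stored next to the amount) are assumptions made to keep the example short.

```sql
-- Hypothetical rows: every transaction in the category is exactly $500.
WITH sample_rows (category, total_amount) AS (
    VALUES ('Gift Cards', 500.00),
           ('Gift Cards', 500.00)
)
SELECT category,
       -- Strict comparison: $500 is not "over $500"
       SUM(CASE WHEN total_amount > 500  THEN 1 ELSE 0 END) AS high_strict,
       -- Inclusive comparison: $500 is counted as high-value
       SUM(CASE WHEN total_amount >= 500 THEN 1 ELSE 0 END) AS high_inclusive,
       -- Low-value check as used in the solution
       SUM(CASE WHEN total_amount <= 500 THEN 1 ELSE 0 END) AS low_value
FROM sample_rows
GROUP BY category;
```

Here `high_strict` is 0 while `high_inclusive` and `low_value` are both 2, so pairing `>= 500` with `<= 500` would let this category pass both conditions.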

---

## Common follow-up questions

- What if the threshold were dynamic, say the median transaction amount? _(Tests ability to compute a threshold in a CTE and reference it in HAVING.)_
- How would you also show the count of high and low transactions per category? _(Simply SELECT the conditional sums alongside the category.)_
- What if a category only had exactly $500 transactions? _(Tests boundary condition: such a category would NOT qualify since 500 is not greater than 500.)_
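
The first two follow-ups can be combined in one sketch: compute the median in a CTE, then both display and filter on the conditional counts. The `threshold` CTE name, the `median_amount` alias, and the inline sample rows are hypothetical, and `PERCENTILE_CONT` is assumed to be available (it is in PostgreSQL).

```sql
-- Hypothetical sample data so the query runs on its own.
WITH products (product_id, category) AS (
    VALUES (1, 'Electronics'),
           (2, 'Electronics'),
           (3, 'Books'),
           (4, 'Furniture')
),
transactions (transaction_id, product_id, total_amount) AS (
    VALUES (101, 1, 899.00),
           (102, 2, 45.00),
           (103, 3, 20.00),
           (104, 3, 35.00),
           (105, 4, 1200.00)
),
-- Single-row CTE holding the dynamic threshold
threshold AS (
    SELECT PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY total_amount) AS median_amount
    FROM transactions
)
SELECT p.category,
       SUM(CASE WHEN t.total_amount > th.median_amount THEN 1 ELSE 0 END)  AS high_value_count,
       SUM(CASE WHEN t.total_amount <= th.median_amount THEN 1 ELSE 0 END) AS low_value_count
FROM transactions t
JOIN products p ON t.product_id = p.product_id
-- Attach the one-row threshold to every transaction
CROSS JOIN threshold th
GROUP BY p.category
HAVING SUM(CASE WHEN t.total_amount > th.median_amount THEN 1 ELSE 0 END) > 0
   AND SUM(CASE WHEN t.total_amount <= th.median_amount THEN 1 ELSE 0 END) > 0;
```

Because `th.median_amount` is referenced only inside aggregate expressions, it does not need to appear in the GROUP BY. With these sample rows the median is 45, and Electronics is the only category with transactions on both sides of it.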

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.