# Recent Price Drops

> The price just dropped. Who noticed?

Canonical URL: <https://datadriven.io/problems/recent_price_drops>

Domain: SQL · Difficulty: medium · Seniority: L5

## Problem

A downstream consumer flagged stale product data. Pull all products that either had a transaction recorded within the last day, or are currently marked as in stock. Return unique product IDs and names, no duplicates.

## Worked solution and explanation

### Why this problem exists in real interviews

Built around the `products` and `transactions` tables, this challenge probes your ability to apply self-join in a product analytics setting. Correctly referencing columns like `product_id`, `transaction_date`, and `in_stock` is essential to a working solution.

---

### Break down the requirements

#### Step 1: Find products with recent transactions

`SELECT product_id FROM transactions WHERE transaction_date >= DATE('now', '-1 day')` identifies recently transacted products.

#### Step 2: Find in-stock products

`SELECT product_id FROM products WHERE in_stock = 1` identifies currently available products.

#### Step 3: Combine with UNION

`UNION` the two product_id sets to merge and deduplicate, then join back to products for names.

---

### The solution

**Union-merge for recent price drops**

```sql
SELECT DISTINCT p.product_id, p.product_name
FROM products p
JOIN transactions t ON p.product_id = t.product_id
WHERE t.transaction_date >= DATE('now', '-1 day')
UNION
SELECT product_id, product_name
FROM products
WHERE in_stock = 1
```

> **Cost Analysis**
>
> With `products` (15,000 rows), `transactions` (60,000,000 rows), the full scan reads significant data. A composite index on the filter columns pushes the predicate into the index layer. Pre-aggregation in a materialized view is worth considering at this scale.

> **Interviewers Watch For**
>
> Interviewers evaluate whether you translate the English requirements into the correct SQL clauses on the first attempt. They watch for clean syntax, correct column references, and whether you verify edge cases before declaring the query complete.

> **Common Pitfall**
>
> The most common mistake is misreading the prompt's filtering or grouping requirements. Double-check which columns to group by, which to aggregate, and whether the output should be filtered with `WHERE` (before grouping) or `HAVING` (after grouping).

---

## Common follow-up questions

- What would happen to your result if `transactions.transaction_date` contained duplicate values that you did not expect? _(Tests whether the candidate considers data quality issues in `transaction_date` and uses DISTINCT or deduplication where needed.)_
- `transactions.transaction_id` has roughly 60,000,000 distinct values. What index strategy would you use to avoid a full scan on `transactions`? _(Tests indexing knowledge specific to the high-cardinality `transaction_id` column in `transactions`.)_
- You used UNION to combine results. If one branch returns zero rows from `products`, does the overall query still behave correctly? _(Tests understanding of UNION behavior with empty result sets.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/recent_price_drops)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.