# Users Without Purchases

> How many registered users have never made a single purchase

Canonical URL: <https://datadriven.io/problems/users_without_purchases>

Domain: SQL · Difficulty: medium · Seniority: L3

## Problem

The growth team is sizing the conversion gap: how many registered users have never completed a purchase?

## Worked solution and explanation

### Why this problem exists in real interviews

This is a classic anti-join problem. Interviewers check whether you can efficiently find records in one table with no match in another, and whether you choose `LEFT JOIN ... IS NULL`, `NOT EXISTS`, or `NOT IN`.

---

### Break down the requirements

#### Step 1: Start from users

The count should reflect registered users, so users is the base table.

#### Step 2: Anti-join to transactions

`LEFT JOIN transactions ON users.user_id = transactions.user_id WHERE transaction_id IS NULL` finds users with no purchase records.

#### Step 3: Count the result

`COUNT(*)` returns the total number of non-purchasing users.

---

### The solution

**Anti-join for conversion gap sizing**

```sql
SELECT COUNT(*) AS users_without_purchases
FROM users u
LEFT JOIN transactions t ON u.user_id = t.user_id
WHERE t.transaction_id IS NULL
```

> **Cost Analysis**
>
> The LEFT JOIN between 10M users and 80M transactions is the bottleneck. The optimizer can convert `LEFT JOIN ... IS NULL` into an anti-join, which avoids materializing all matched rows. An index on `transactions(user_id)` is essential.

> **Interviewers Watch For**
>
> Whether you choose `LEFT JOIN ... IS NULL`, `NOT EXISTS`, or `NOT IN`. All three are valid for non-nullable keys, but `NOT EXISTS` is often preferred for clarity and optimizer friendliness.

> **Common Pitfall**
>
> Using `NOT IN` when the subquery column could contain NULLs. In this case user_id is non-nullable, so it is safe, but `NOT EXISTS` is the safer default habit.

---

## Common follow-up questions

- How would you find users who signed up more than 30 days ago but never purchased? _(Add a signup_date filter to target long-dormant users.)_
- What is the difference between NOT EXISTS and LEFT JOIN ... IS NULL? _(Semantically equivalent for non-nullable keys; optimizer may handle them differently.)_
- How would you find the conversion rate (purchasers / total users)? _(Tests combining this count with the total user count in a single query.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/users_without_purchases)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.