# Subscribers Without Premium

> Subscribed. But never upgraded.

Canonical URL: <https://datadriven.io/problems/subscribers_without_premium>

Domain: SQL · Difficulty: medium · Seniority: L5

## Problem

Pull basic-plan subscribers who never upgraded to premium from the subscriptions data. The retention team wants to run a winback campaign targeting this group.

## Worked solution and explanation

### Why this problem exists in real interviews

This tests whether a candidate can express "has one thing but never another" as an exclusion and still deduplicate the result correctly. The pattern appears frequently in mid-level SQL rounds where interviewers want to see structured thinking.

---

### Break down the requirements

#### Step 1: Filter to the target set

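The `WHERE` clause keeps only rows whose `platform` is `'basic'`, and the `NOT IN` subquery then drops every user who also has a `'premium'` row. Together these two conditions define the target set: basic users with no premium history.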

#### Step 2: Deduplicate the result with DISTINCT

`SELECT DISTINCT` removes duplicate rows from the output. It is necessary here because a user can have several `'basic'` rows, and each of them would otherwise appear as a separate row in the result.
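
The effect of the two steps can be seen on a tiny table. This is a minimal sketch: only the `push_notifs` table name and the `user_id` and `platform` columns come from the solution query, while the column types and the sample rows are assumptions made up for illustration. It is meant to be run in a scratch Postgres session.

```sql
-- Hypothetical sample data; column types are assumed, not taken from the real schema.
CREATE TEMP TABLE push_notifs (
    user_id  integer,
    platform text
);

INSERT INTO push_notifs (user_id, platform) VALUES
    (1, 'basic'),
    (1, 'basic'),    -- same user, second basic row
    (2, 'basic'),
    (2, 'premium'),  -- user 2 has a premium row, so they must be excluded
    (3, 'premium');

-- Filter and exclusion only: user_id 1 comes back twice, once per basic row.
SELECT user_id
FROM push_notifs
WHERE platform = 'basic'
  AND user_id NOT IN (
    SELECT user_id FROM push_notifs WHERE platform = 'premium'
  );

-- Adding DISTINCT collapses the repeats into a single row for user_id 1.
SELECT DISTINCT user_id
FROM push_notifs
WHERE platform = 'basic'
  AND user_id NOT IN (
    SELECT user_id FROM push_notifs WHERE platform = 'premium'
  );
```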

---

### The solution

**Exclude users whose push_notifs ever reference a premium plan**

```sql
SELECT DISTINCT user_id
FROM push_notifs
WHERE platform = 'basic'
  AND user_id NOT IN (
    SELECT user_id FROM push_notifs WHERE platform = 'premium'
  );
```

> **Cost Analysis**
>
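> With ~100M rows and no supporting index, the query reads `push_notifs` at least twice: once for the outer `platform = 'basic'` filter and once to build the set of premium `user_id` values for the `NOT IN` check. An index on `(platform, user_id)` could let both reads use index scans instead of sequential scans, though the planner may still choose a sequential scan when a filter matches a large share of the table.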

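One way to check the indexing claim in the cost analysis is to create an index and inspect the plan. The sketch below is an assumption-laden illustration: the index name is hypothetical, the column types are guessed, and the plan Postgres actually picks depends on table size and statistics, so no particular plan is shown. It is meant to be run in a scratch Postgres session.

```sql
-- Hypothetical stand-in for the real table; column types are assumed.
CREATE TEMP TABLE push_notifs (
    user_id  integer,
    platform text
);

-- Hypothetical index name. The equality-filtered column goes first,
-- followed by the column the query returns and compares.
CREATE INDEX idx_push_notifs_platform_user
    ON push_notifs (platform, user_id);

-- EXPLAIN prints the chosen plan without executing the query.
EXPLAIN
SELECT DISTINCT user_id
FROM push_notifs
WHERE platform = 'basic'
  AND user_id NOT IN (
    SELECT user_id FROM push_notifs WHERE platform = 'premium'
  );
```

On an empty or very small table the planner will often ignore the index, so the comparison is only meaningful against realistically sized data.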
> **Interviewers Watch For**
>
> Interviewers check two things: whether the query returns exactly the columns and ordering the prompt specifies, and how quickly you identify the core operation and write clean, minimal code.

> **Common Pitfall**
>
> Returning extra columns that the prompt did not ask for, or using the wrong column alias, causes a grading mismatch even when the logic is correct.

---

## Common follow-up questions

- If a user has both 'basic' and 'premium' entries in push_notifs but the premium row has status = 'cancelled', should they appear in the results? _(Tests whether the candidate reads the schema carefully; the presence of a premium row might need a status check.)_
- Would you use NOT EXISTS, LEFT JOIN with IS NULL, or NOT IN here, and why? _(Tests understanding of query plan differences and NULL-safety pitfalls of NOT IN.)_
- How does the query behave if a user_id in push_notifs has only NULL values in the `platform` column? _(Tests edge-case awareness around NULLs in exclusion logic.)_

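The NULL-safety question about `NOT IN` can be made concrete with a small example. This is a sketch under assumptions: the column types and sample rows are invented for illustration, and the `NOT EXISTS` form is one possible rewrite, not necessarily the grader's reference answer. It is meant to be run in a scratch Postgres session.

```sql
-- Hypothetical sample data; column types are assumed.
CREATE TEMP TABLE push_notifs (
    user_id  integer,
    platform text
);

INSERT INTO push_notifs (user_id, platform) VALUES
    (1, 'basic'),
    (2, 'basic'),
    (2, 'premium'),
    (NULL, 'premium');  -- premium row with a missing user_id

-- NOT IN: the subquery yields 2 and NULL. For user 1, the comparison with
-- NULL makes the whole predicate NULL rather than TRUE, so no rows come back.
SELECT DISTINCT user_id
FROM push_notifs
WHERE platform = 'basic'
  AND user_id NOT IN (
    SELECT user_id FROM push_notifs WHERE platform = 'premium'
  );

-- NOT EXISTS: the NULL row never matches the correlation, so user_id 1
-- is returned and user_id 2 is still excluded.
SELECT DISTINCT b.user_id
FROM push_notifs AS b
WHERE b.platform = 'basic'
  AND NOT EXISTS (
    SELECT 1
    FROM push_notifs AS p
    WHERE p.platform = 'premium'
      AND p.user_id = b.user_id
  );
```

If `user_id` is declared `NOT NULL` in the real schema, both forms return the same rows and the choice comes down to readability and the plan the database produces.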
## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/subscribers_without_premium)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.