# Users Who Churned in February

> Gone in February.

Canonical URL: <https://datadriven.io/problems/users_who_churned_in_february>

Domain: SQL · Difficulty: hard · Seniority: L4

## Problem

Find all users who had sessions in January 2026 but none in February 2026.

## Worked solution and explanation

### Why this problem exists in real interviews

This tests set difference on time-windowed data. Interviewers check whether you can identify users present in one time period but absent in another, which is the core churn detection pattern.

> **Trick to Solving**
>
> Churn detection is a set difference: users in January minus users in February. The cleanest approach is EXCEPT or NOT EXISTS.
> 
> 1. Get distinct users with January sessions
> 2. Get distinct users with February sessions
> 3. Return January users NOT IN the February set

---

### Break down the requirements

#### Step 1: Identify January users

`SELECT DISTINCT user_id FROM user_sessions WHERE session_start >= '2026-01-01' AND session_start < '2026-02-01'` captures users active in January.

#### Step 2: Exclude February users

Remove any user who had at least one session in February using NOT EXISTS or EXCEPT.

---

### The solution

**Set difference for monthly churn detection**

```sql
SELECT DISTINCT user_id
FROM user_sessions
WHERE session_start >= '2026-01-01'
  AND session_start < '2026-02-01'
  AND user_id NOT IN (
      SELECT DISTINCT user_id
      FROM user_sessions
      WHERE session_start >= '2026-02-01'
        AND session_start < '2026-03-01'
  )
```

> **Cost Analysis**
>
> Two range scans on 60M rows, each filtering to ~5M rows per month. The NOT IN subquery builds a hash set of ~3M February user IDs. An index on `(session_start, user_id)` enables efficient range scans.

> **Interviewers Watch For**
>
> Half-open date intervals (`>=` and `<`) instead of `BETWEEN`. Also, whether the candidate considers that NOT IN is safe here because user_id has no NULLs (it is a non-nullable INTEGER).

> **Common Pitfall**
>
> Using `BETWEEN '2026-01-01' AND '2026-01-31'` which misses sessions on January 31 if timestamps include time components after midnight.

---

## Common follow-up questions

- How would you compute the churn rate (churned / January active)? _(Count the churned set and divide by the January active set.)_
- What if you needed monthly churn for every month, not just January to February? _(Generalize with a self-join or LAG on monthly active user counts.)_
- How would you use EXCEPT instead of NOT IN? _(EXCEPT deduplicates automatically and avoids the NULL pitfall of NOT IN.)_
- What if a user was active in January, absent in February, but returned in March? _(Tests the definition of churn: is it permanent or temporary? The query only checks two months.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/users_who_churned_in_february)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.