# Inactive Unverified Users

> Signed up. Never verified. Never came back.

Canonical URL: <https://datadriven.io/problems/inactive_unverified_users>

Domain: SQL · Difficulty: easy · Seniority: L3

## Problem

We have a verification queue that's been growing. Pull all users still in 'pending_verification' status who had zero sessions during March 2026. Return user_id, username, email, signup_date, account_status, and age_bucket.

## Worked solution and explanation

### What this is really asking

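The task is an anti-join: find pending accounts with no matching session rows in March 2026. The catch is that the March date filter belongs in the LEFT JOIN's ON clause, not in WHERE. Moved to WHERE, it breaks the anti-join: unmatched rows carry a NULL session_start, the date comparison evaluates to NULL, and the filter discards exactly the rows the query is supposed to keep.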

---

### Break down the requirements

#### Step 1: Anti-join on user_sessions

LEFT JOIN to user_sessions and keep rows where us.session_id IS NULL. That isolates accounts with zero matched March sessions.

#### Step 2: Scope the join to March

Push year and month into ON so an account active in February still qualifies if March was empty.

#### Step 3: Filter on status

The account_status = 'pending_verification' predicate is safe in WHERE because it touches only the left side of the join, so it cannot discard the NULL-extended rows.

---

### The solution

**PENDING USERS WITH NO MARCH SESSIONS**

```sql
SELECT u.user_id, u.username, u.email, u.signup_date,
       u.account_status, u.age_bucket
FROM users u
LEFT JOIN user_sessions us
  ON u.user_id = us.user_id
 AND strftime('%Y', us.session_start) = '2026'
 AND strftime('%m', us.session_start) = '03'
WHERE u.account_status = 'pending_verification'
  AND us.session_id IS NULL
```
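
To sanity-check the query, it can help to run it against a tiny fixture. The sketch below assumes SQLite (which the `strftime('%Y', ...)` argument order suggests) and a hypothetical minimal schema: the column types, usernames, and emails are invented for illustration and are not the real tables.

```sql
-- Hypothetical minimal schema; the real tables likely have more columns.
CREATE TABLE users (
  user_id        INTEGER PRIMARY KEY,
  username       TEXT,
  email          TEXT,
  signup_date    TEXT,
  account_status TEXT,
  age_bucket     TEXT
);

CREATE TABLE user_sessions (
  session_id    INTEGER PRIMARY KEY,
  user_id       INTEGER,
  session_start TEXT  -- assumed ISO-8601 text, e.g. '2026-03-05 18:30:00'
);

INSERT INTO users VALUES
  (1, 'ana',  'ana@example.com',  '2026-01-10', 'pending_verification', '18-24'),
  (2, 'ben',  'ben@example.com',  '2026-02-02', 'pending_verification', '25-34'),
  (3, 'cara', 'cara@example.com', '2026-02-20', 'active',               '25-34'),
  (4, 'dev',  'dev@example.com',  '2026-03-01', 'pending_verification', '35-44');

INSERT INTO user_sessions VALUES
  (10, 1, '2026-02-14 09:00:00'),  -- ana: February only, so March is empty
  (11, 2, '2026-03-05 18:30:00'),  -- ben: active in March, must be excluded
  (12, 3, '2026-03-09 12:00:00');  -- cara: not pending, excluded by status
-- dev has no sessions at all

-- The solution query, with ORDER BY added only to make the output stable.
SELECT u.user_id, u.username, u.email, u.signup_date,
       u.account_status, u.age_bucket
FROM users u
LEFT JOIN user_sessions us
  ON u.user_id = us.user_id
 AND strftime('%Y', us.session_start) = '2026'
 AND strftime('%m', us.session_start) = '03'
WHERE u.account_status = 'pending_verification'
  AND us.session_id IS NULL
ORDER BY u.user_id;
```

With these rows the query should return only `ana` (user 1, active in February but not in March) and `dev` (user 4, no sessions at all).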

> **Cost Analysis**
>
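> 8M users LEFT JOIN 40M sessions. Partitioning by session_start can prune the scan to one month of files, and a composite index on (user_id, session_start) keeps lookups cheap. Both usually depend on a range predicate against the bare column; wrapping session_start in strftime typically defeats partition pruning and index range scans. The IS NULL check adds negligible cost once the join is done.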
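
One way to keep the date filter index-friendly is a half-open range on the bare column. This is a sketch, assuming `session_start` is stored as a timestamp or as ISO-8601 text (`YYYY-MM-DD HH:MM:SS`) so that comparison order matches chronological order; the index name is hypothetical.

```sql
-- Hypothetical index name; the column order matches the composite index
-- described in the cost analysis.
CREATE INDEX IF NOT EXISTS idx_user_sessions_user_start
  ON user_sessions (user_id, session_start);

SELECT u.user_id, u.username, u.email, u.signup_date,
       u.account_status, u.age_bucket
FROM users u
LEFT JOIN user_sessions us
  ON u.user_id = us.user_id
 AND us.session_start >= '2026-03-01'  -- inclusive start of March
 AND us.session_start <  '2026-04-01'  -- exclusive start of April
WHERE u.account_status = 'pending_verification'
  AND us.session_id IS NULL;
```

The exclusive upper bound covers every timestamp on March 31 without needing to know the column's precision.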

> **Interviewers Watch For**
>
> Whether you can articulate why the date filter belongs in ON. Strong candidates also name the anti-join pattern and mention NOT EXISTS as an equivalent rewrite that is often faster.
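
For comparison, the NOT EXISTS form might look like the sketch below. It assumes the same `users` and `user_sessions` columns as the solution; whether it actually runs faster depends on the engine and its planner.

```sql
SELECT u.user_id, u.username, u.email, u.signup_date,
       u.account_status, u.age_bucket
FROM users u
WHERE u.account_status = 'pending_verification'
  AND NOT EXISTS (
    -- Correlated subquery: true as soon as one March 2026 session is found.
    SELECT 1
    FROM user_sessions us
    WHERE us.user_id = u.user_id
      AND strftime('%Y', us.session_start) = '2026'
      AND strftime('%m', us.session_start) = '03'
  );
```

Here the date filter sits in the subquery's WHERE without harm, because the subquery only decides whether a matching session exists; there are no NULL-extended rows to lose.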

> **Common Pitfall**
>
> strftime returns text, so compare to '03', not 3. The integer comparison is false for every session row, so no session ever matches in ON. Every pending account then looks inactive, and the query silently returns all of them regardless of activity.
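
A quick check of the comparison behaviour, assuming SQLite, where `strftime` returns TEXT and a text value is never equal to an integer literal:

```sql
SELECT strftime('%m', '2026-03-15')                        AS month_text,       -- '03'
       strftime('%m', '2026-03-15') = '03'                 AS text_compare,     -- 1 (true)
       strftime('%m', '2026-03-15') = 3                    AS integer_compare,  -- 0 (false)
       CAST(strftime('%m', '2026-03-15') AS INTEGER) = 3   AS cast_compare;     -- 1 (true)
```

If an integer comparison is preferred, an explicit CAST as in the last column makes the intent visible instead of relying on implicit conversion.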

> **The False Start**
>
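> First instinct is to put strftime('%m', us.session_start) = '03' in WHERE. That turns the LEFT JOIN into an INNER JOIN: unmatched rows have NULL session_start, so the comparison evaluates to NULL and they are filtered out. Combined with us.session_id IS NULL, the query then returns no rows at all. The fix is to keep the month and year conditions inside ON.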
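
The broken variant, written out as a counter-example (same assumed tables as the solution; do not use this form):

```sql
-- Anti-pattern: date filters in WHERE instead of ON.
SELECT u.user_id, u.username
FROM users u
LEFT JOIN user_sessions us
  ON u.user_id = us.user_id
WHERE u.account_status = 'pending_verification'
  AND strftime('%Y', us.session_start) = '2026'  -- NULL for unmatched rows
  AND strftime('%m', us.session_start) = '03'    -- NULL for unmatched rows
  AND us.session_id IS NULL;                     -- false for matched rows
```

Every row fails at least one predicate: unmatched rows fail the date comparisons, and matched rows fail the IS NULL check, so the result is always empty.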

---

## Common follow-up questions

- Rewrite this with NOT EXISTS instead of LEFT JOIN. _(Tests the anti-join alternative and when each plan wins on 40M sessions.)_
- Extend this to flag accounts inactive for any calendar month, not just March. _(Forces a per-account MAX(session_start) or calendar-driven join, exposing parameterization.)_
- What if user_sessions is append-only and lags by an hour? _(Opens freshness: compute on a cutoff timestamp or accept the lag in the queue.)_
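
As a starting point for the second follow-up, a calendar-driven join could look like the sketch below. The `months` CTE and its hard-coded boundaries are hypothetical; a real version would likely generate them or read them from a calendar table, and it assumes `session_start` compares chronologically against ISO-8601 date strings.

```sql
-- Hypothetical month list: one row per calendar month, half-open bounds.
WITH months(month_start, month_end) AS (
  VALUES ('2026-01-01', '2026-02-01'),
         ('2026-02-01', '2026-03-01'),
         ('2026-03-01', '2026-04-01')
)
SELECT u.user_id, m.month_start AS inactive_month
FROM users u
CROSS JOIN months m              -- every pending account paired with every month
LEFT JOIN user_sessions us
  ON us.user_id = u.user_id
 AND us.session_start >= m.month_start
 AND us.session_start <  m.month_end
WHERE u.account_status = 'pending_verification'
  AND us.session_id IS NULL      -- keep (account, month) pairs with no session
ORDER BY u.user_id, m.month_start;
```

A fuller answer would probably skip months before each account's signup_date; that refinement is left out here.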

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/inactive_unverified_users)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.