# Users With and Without Ad Clicks

> Clicked an ad versus never clicked. The split.

Canonical URL: <https://datadriven.io/problems/users_with_and_without_ad_clicks>

Domain: SQL · Difficulty: hard · Seniority: L4

## Problem

Count users who received at least one ad click versus those who did not. A user counts as clicked if they have any clicked ad impression. Show each group (has click or not) and its user count.

## Worked solution and explanation

### Why this problem exists in real interviews

This tests conditional aggregation across a LEFT JOIN. Interviewers check whether you can partition a user base into two groups based on the existence (or absence) of a condition in a related table.

> **Trick to Solving**
>
> The key insight is to compute this as a single query with conditional counting rather than two separate queries.
> 
> 1. Left join users to ad impressions with a click filter
> 2. Use CASE to classify each user as 'has_click' or 'no_click'
> 3. Count users per group

---

### Break down the requirements

#### Step 1: Identify users with at least one click

A subquery finds distinct user_ids from `ad_impressions` where `clicked = 1`.

#### Step 2: Classify all users

LEFT JOIN users to the click set. Users with a match have clicks; users with NULL do not.

#### Step 3: Count per group

GROUP BY the classification (has_click vs. no_click) and COUNT each group.

---

### The solution

**Conditional grouping with left join classification**

```sql
SELECT
    CASE WHEN c.user_id IS NOT NULL THEN 'has_click' ELSE 'no_click' END AS click_group,
    COUNT(*) AS user_count
FROM users u
LEFT JOIN (
    SELECT DISTINCT user_id
    FROM ad_impressions
    WHERE clicked = 1
) c ON u.user_id = c.user_id
GROUP BY click_group
```

> **Cost Analysis**
>
> The subquery scans 250M impressions to find ~10M distinct clicking users. The LEFT JOIN matches 12M users against this 10M set. Both steps benefit from indexes on `user_id`.

> **Interviewers Watch For**
>
> Pre-deduplicating the impressions in the subquery. Without DISTINCT, the LEFT JOIN would fan out users with multiple clicks, inflating the 'has_click' count.

> **Common Pitfall**
>
> Joining users directly to the full `ad_impressions` table. A user with 100 clicked impressions would be counted 100 times, giving an incorrect user count.

---

## Common follow-up questions

- How would you break this down further by ad campaign? _(Add campaign to the subquery and classification for a per-campaign breakdown.)_
- What if you also needed users who received impressions but never clicked? _(Add a third group: users with impressions but no clicks, requiring a more complex classification.)_
- How would you compute the click-through rate? _(Divide 'has_click' count by total users count.)_
- What if the users table has 100 million rows? _(The LEFT JOIN scales linearly; pre-aggregating the click set keeps it manageable.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/users_with_and_without_ad_clicks)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.