# Push Opens by Platform and Campaign

> Opens by platform and campaign.

Canonical URL: <https://datadriven.io/problems/push_opens_by_platform_and_campaign>

Domain: SQL · Difficulty: medium · Seniority: L3

## Problem

The engagement team suspects iOS users are more responsive to push notifications than Android users. Show how many unique users opened a notification on each platform, sorted from most opens to fewest.

## Worked solution and explanation

### Why this problem exists in real interviews

Built around the `push_notifs` table, this challenge probes your ability to apply grouped COUNT aggregation in a notification setting. Correctly referencing columns like `opened`, `platform`, and `user_id` is essential to a working solution.

---

### Break down the requirements

#### Step 1: Filter to opened notifications

`WHERE opened = 1` keeps only notifications that were opened.

#### Step 2: Group by platform

`GROUP BY platform` aggregates opens per platform.

#### Step 3: Count distinct users

`COUNT(DISTINCT user_id)` counts unique openers per platform.

#### Step 4: Order descending

`ORDER BY COUNT(DISTINCT user_id) DESC` surfaces the most responsive platform first.

---

### The solution

**Filter to opened notifications to find push opens by platform and**

```sql
SELECT platform, COUNT(DISTINCT user_id) AS unique_openers
FROM push_notifs
WHERE opened = 1
GROUP BY platform
ORDER BY unique_openers DESC
```

> **Cost Analysis**
>
> With `push_notifs` (80,000,000 rows), the full scan reads significant data. A composite index on the filter columns pushes the predicate into the index layer. Pre-aggregation in a materialized view is worth considering at this scale.

> **Interviewers Watch For**
>
> Interviewers evaluate whether you translate the English requirements into the correct SQL clauses on the first attempt. They watch for clean syntax, correct column references, and whether you verify edge cases before declaring the query complete.

> **Common Pitfall**
>
> The most common mistake is misreading the prompt's filtering or grouping requirements. Double-check which columns to group by, which to aggregate, and whether the output should be filtered with `WHERE` (before grouping) or `HAVING` (after grouping).

---

## Common follow-up questions

- What result would you get if every value in `push_notifs.campaign` were NULL? Would your query return an empty set or something unexpected? _(Tests extreme NULL scenarios and whether the candidate guards against edge cases in `campaign`.)_
- If `push_notifs` grew to contain billions of rows, which part of your query would become the bottleneck given the cardinality of `sent_at`? _(Tests ability to identify performance hotspots related to `push_notifs.sent_at` at scale.)_
- `push_notifs.opened` only has 2 distinct values. If a new category were added, would your query automatically include it? _(Tests whether the query hard-codes values or dynamically adapts to `opened` changes.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/push_opens_by_platform_and_campaign)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.