# Top Campaign by Opens

> One campaign got all the opens.

Canonical URL: <https://datadriven.io/problems/top_campaign_by_opens>

Domain: SQL · Difficulty: medium · Seniority: L4

## Problem

Our push notification system tracks opens per campaign. Find which campaign drove the most opens. An 'open' counts when the opened flag is true. If campaigns are tied at the top, show all of them.

## Worked solution and explanation

### Why this problem exists in real interviews

This is a "filter to the top rows after aggregation" problem built on the `push_notifs` table. It tests whether you can compose a CTE or subquery that aggregates before ranking, then filter to the desired slice without dropping ties.

---

### Break down the requirements

#### Step 1: Aggregate per campaign

Filter `push_notifs` to rows where the opened flag is true, then `GROUP BY campaign` with `COUNT(*)`. This produces one summary row per campaign holding its number of opens.
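
The aggregation step can be sketched on a handful of rows. The inline `VALUES` list is hypothetical sample data standing in for the real table, and the boolean column name `opened` is an assumption based on the problem wording.

```sql
-- Hypothetical sample rows standing in for the real push_notifs table.
WITH push_notifs (campaign, opened) AS (
    VALUES
        ('spring_sale', TRUE),
        ('spring_sale', TRUE),
        ('spring_sale', FALSE),
        ('win_back',    TRUE),
        ('win_back',    TRUE),
        ('onboarding',  TRUE),
        ('onboarding',  FALSE)
)
SELECT
    campaign,
    COUNT(*) AS total_opens   -- every row that survives the WHERE is an open
FROM push_notifs
WHERE opened                  -- keep only rows where the flag is true
GROUP BY campaign
ORDER BY campaign;
```

On this sample, `spring_sale` and `win_back` each have 2 opens and `onboarding` has 1, so the top spot is tied.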

#### Step 2: Rank the results

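Rank the aggregated rows by open count, descending, and keep every campaign at the top rank. `ORDER BY ... LIMIT 1` returns a single row and silently drops tied campaigns, so compare each total to the maximum or use a ranking function such as `RANK()`.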
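
One way to keep ties is a window function, sketched here under the same assumptions (hypothetical sample rows, boolean `opened` column). `RANK()` assigns rank 1 to every campaign that shares the highest count.

```sql
-- Hypothetical sample rows standing in for the real push_notifs table.
WITH push_notifs (campaign, opened) AS (
    VALUES
        ('spring_sale', TRUE),
        ('spring_sale', TRUE),
        ('win_back',    TRUE),
        ('win_back',    TRUE),
        ('onboarding',  TRUE),
        ('onboarding',  FALSE)
),
campaign_opens AS (
    -- One row per campaign with its open count.
    SELECT campaign, COUNT(*) AS total_opens
    FROM push_notifs
    WHERE opened
    GROUP BY campaign
),
ranked AS (
    -- Tied totals receive the same rank.
    SELECT
        campaign,
        total_opens,
        RANK() OVER (ORDER BY total_opens DESC) AS open_rank
    FROM campaign_opens
)
SELECT campaign, total_opens
FROM ranked
WHERE open_rank = 1
ORDER BY campaign;
```

On this sample the query returns both `spring_sale` and `win_back`, each with 2 opens.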

---

### The solution

**Count opened push_notifs per campaign and return the top with ties**

```sql
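-- Count opens per campaign (only rows where the opened flag is true).
WITH campaign_opens AS (
    SELECT
        campaign,
        COUNT(*) AS total_opens
    FROM push_notifs
    WHERE opened
    GROUP BY campaign
)
-- Keep every campaign tied for the highest open count.
SELECT campaign, total_opens
FROM campaign_opens
WHERE total_opens = (SELECT MAX(total_opens) FROM campaign_opens)
ORDER BY campaign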
```
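
As an alternative sketch, PostgreSQL 13 and later accept `FETCH FIRST ... WITH TIES`, which returns the first row plus any rows that tie with it on the `ORDER BY` expression. The sample rows are hypothetical, and this syntax is not available in every SQL dialect.

```sql
-- Hypothetical sample rows standing in for the real push_notifs table.
WITH push_notifs (campaign, opened) AS (
    VALUES
        ('spring_sale', TRUE),
        ('spring_sale', TRUE),
        ('win_back',    TRUE),
        ('win_back',    TRUE),
        ('onboarding',  TRUE)
)
SELECT
    campaign,
    COUNT(*) AS total_opens
FROM push_notifs
WHERE opened
GROUP BY campaign
ORDER BY total_opens DESC
FETCH FIRST 1 ROW WITH TIES;  -- keeps all campaigns sharing the top count
```

This is shorter than a CTE plus a `MAX` comparison, at the cost of portability.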

> **Cost Analysis**
>
> The GROUP BY reduces the 100M-row `push_notifs` table to one row per distinct `campaign` value. A covering index on `(campaign, opened)` can enable an index-only aggregate scan.
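
A possible variation is a partial index that stores only opened rows. This is a sketch: the index name is hypothetical, `opened` is assumed to be a boolean column, and whether the planner chooses an index-only scan depends on table statistics and vacuum state.

```sql
-- Hypothetical index name; assumes opened is a boolean column.
CREATE INDEX idx_push_notifs_campaign_opened
    ON push_notifs (campaign)
    WHERE opened;

-- Inspect the plan to see whether the index is actually used.
EXPLAIN
SELECT campaign, COUNT(*) AS total_opens
FROM push_notifs
WHERE opened
GROUP BY campaign;
```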

> **Interviewers Watch For**
>
> Interviewers verify that you aggregate before ranking and that you keep ties. Sorting raw rows gives per-row values, not group totals, and `LIMIT 1` returns only one of several tied campaigns. The correct grain is one row per `campaign`.

> **Common Pitfall**
>
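> Using the wrong aggregate function. `SUM` gives totals, `COUNT` gives volume, `AVG` gives rates. Read the prompt to determine which metric is needed. Here the metric is the number of rows where the opened flag is true, so summing or counting a column without filtering on that flag gives the wrong answer.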
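
A small illustration of how the aggregates differ, using hypothetical sample rows and an assumed boolean `opened` column. Note that `COUNT(opened)` counts every non-null flag, including `FALSE`.

```sql
-- Hypothetical sample: three sends for one campaign, two of them opened.
WITH push_notifs (campaign, opened) AS (
    VALUES
        ('spring_sale', TRUE),
        ('spring_sale', FALSE),
        ('spring_sale', TRUE)
)
SELECT
    campaign,
    COUNT(*)                        AS sends,           -- 3: every row
    COUNT(opened)                   AS non_null_flags,  -- 3: FALSE is still non-null
    COUNT(*) FILTER (WHERE opened)  AS opens            -- 2: only true flags
FROM push_notifs
GROUP BY campaign;
```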

---

## Common follow-up questions

- Would you use `SUM(CASE WHEN opened THEN 1 END)` or `COUNT(*)` with a `WHERE opened = true`? What is the difference? _(Tests conditional aggregation vs pre-filtering; both work but `SUM(CASE)` allows adding non-opened counts in the same query.)_
- If `opened` is stored as an integer (0/1) rather than a boolean, how does your predicate change? _(Tests schema awareness; `WHERE opened = 1` vs `WHERE opened = true` depends on the column type.)_
- How would you also show each campaign's open rate (opens / total sends) alongside the count? _(Tests adding a ratio computation: `COUNT(CASE WHEN opened THEN 1 END)::float / COUNT(*)`.)_
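
The first and third follow-ups can be sketched in one query. It assumes each row in `push_notifs` represents one send and that `opened` is a boolean; the sample rows are hypothetical. With an integer flag, the `CASE` condition would become `opened = 1`.

```sql
-- Hypothetical sample rows standing in for the real push_notifs table.
WITH push_notifs (campaign, opened) AS (
    VALUES
        ('spring_sale', TRUE),
        ('spring_sale', TRUE),
        ('spring_sale', FALSE),
        ('win_back',    TRUE),
        ('win_back',    TRUE),
        ('onboarding',  TRUE),
        ('onboarding',  FALSE)
)
SELECT
    campaign,
    -- Conditional aggregation: no WHERE, so unopened rows stay available.
    SUM(CASE WHEN opened THEN 1 ELSE 0 END) AS opens,
    COUNT(*)                                AS sends,
    -- Cast before dividing to avoid integer division.
    ROUND(
        SUM(CASE WHEN opened THEN 1 ELSE 0 END)::numeric / COUNT(*),
        2
    )                                       AS open_rate
FROM push_notifs
GROUP BY campaign
ORDER BY opens DESC, campaign;
```

Because nothing is filtered out before grouping, a campaign with zero opens still appears with `opens = 0`, which a `WHERE opened` pre-filter would hide.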

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/top_campaign_by_opens)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.