# Top 2 Ad Campaigns by Spend

> Two campaigns. Most of the budget.

Canonical URL: <https://datadriven.io/problems/top_2_ad_campaigns_by_spend>

Domain: SQL · Difficulty: medium · Seniority: L4

## Problem

Show the top two ad campaigns by total spend, excluding any campaign whose name contains 'test'. If campaigns are tied at the cutoff, include all of them.

## Worked solution and explanation

### Why this problem exists in real interviews

This tests ranking after aggregation on `ad_impressions`: filter rows, sum per campaign, then keep the top two totals with ties included. The key signal is whether the candidate ranks at the same grain as the output, one row per `ad_campaign`, rather than ranking raw impression rows.

---

### Break down the requirements

#### Step 1: Apply filters

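Use a `WHERE` clause to drop campaigns whose name contains 'test' before aggregating, for example `LOWER(ad_campaign) NOT LIKE '%test%'`. Filtering first keeps test traffic out of the totals and out of the ranking.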
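
A minimal sketch of the exclusion filter, assuming the campaign name lives in `ad_campaign` (as the solution query suggests) and that "contains 'test'" should match regardless of letter case. The inline `VALUES` rows are made-up sample names so the snippet runs without the real table.

```sql
-- Hypothetical sample names; stands in for ad_impressions.ad_campaign
WITH sample_campaigns (ad_campaign) AS (
    VALUES ('spring_sale'), ('Test_Banner'), ('retest_q3'), ('brand_launch')
)
SELECT ad_campaign
FROM sample_campaigns
-- LOWER() normalizes case so 'Test_Banner' is excluded as well
WHERE LOWER(ad_campaign) NOT LIKE '%test%'
ORDER BY ad_campaign;
-- keeps: brand_launch, spring_sale
```

Rows with a NULL `ad_campaign` are also dropped by this predicate, because `NOT LIKE` evaluates to NULL for them. Add `OR ad_campaign IS NULL` if unnamed campaigns should stay in the result.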

#### Step 2: Aggregate per ad_campaign

`GROUP BY ad_campaign` with `SUM(revenue)` produces one total per campaign from the `ad_impressions` table. The solution treats `revenue` as the spend column; confirm that against the schema before relying on it.

#### Step 3: Rank the results

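Rank the totals in descending order with `DENSE_RANK()` and keep ranks 1 and 2. `ORDER BY ... LIMIT 2` alone is not enough here: it returns exactly two rows, so a campaign tied at the cutoff is dropped arbitrarily.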
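
The prompt asks for ties at the cutoff to be kept, which a plain `LIMIT 2` cannot guarantee. The sketch below shows one way to keep them with `DENSE_RANK()`. The campaign names and totals are invented sample values standing in for the output of the aggregation step.

```sql
-- Hypothetical per-campaign totals; in the real query these come from GROUP BY
WITH campaign_totals (ad_campaign, total_revenue) AS (
    VALUES ('alpha', 900), ('bravo', 700), ('charlie', 700), ('delta', 100)
),
ranked AS (
    SELECT
        ad_campaign,
        total_revenue,
        -- Equal totals share a rank, and ranks have no gaps
        DENSE_RANK() OVER (ORDER BY total_revenue DESC) AS spend_rank
    FROM campaign_totals
)
SELECT ad_campaign, total_revenue, spend_rank
FROM ranked
WHERE spend_rank <= 2
ORDER BY spend_rank, ad_campaign;
-- returns alpha (rank 1), bravo (rank 2), charlie (rank 2)
-- ORDER BY total_revenue DESC LIMIT 2 would return alpha plus only one of bravo/charlie
```

The window function needs its own CTE or subquery because `WHERE` is evaluated before window functions, so `spend_rank` cannot be filtered in the same `SELECT` that computes it.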

---

### The solution

**Dense-rank campaigns by total spend after excluding test names**

```sql
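WITH ranked AS (
    SELECT
        ad_campaign,
        SUM(revenue) AS total_revenue,
        -- Rank campaign totals; tied totals share the same rank
        DENSE_RANK() OVER (ORDER BY SUM(revenue) DESC) AS spend_rank
    FROM ad_impressions
    -- Exclude test campaigns regardless of letter case
    WHERE LOWER(ad_campaign) NOT LIKE '%test%'
    GROUP BY ad_campaign
)
SELECT ad_campaign, total_revenue
FROM ranked
WHERE spend_rank <= 2
ORDER BY total_revenue DESC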
```

> **Cost Analysis**
>
> The `GROUP BY` reduces the 300M-row `ad_impressions` table to one row per distinct `ad_campaign` value, so the ranking step is cheap. A covering index on `(ad_campaign, revenue)` can enable an index-only scan for the aggregate, provided the planner chooses it.
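
As a rough illustration of the covering index mentioned above, the statements below assume `ad_impressions` already exists with `ad_campaign` and `revenue` columns. The index name is hypothetical, and whether the planner picks an index-only scan depends on table statistics and, in Postgres, on how much of the table is marked all-visible by vacuum.

```sql
-- Hypothetical index name; the real schema and existing indexes are not shown in the source
CREATE INDEX idx_ad_impressions_campaign_revenue
    ON ad_impressions (ad_campaign, revenue);

-- Inspect the chosen plan; note that ANALYZE actually executes the query
EXPLAIN (ANALYZE, BUFFERS)
SELECT ad_campaign, SUM(revenue) AS total_revenue
FROM ad_impressions
GROUP BY ad_campaign;
```

If the plan shows a sequential scan instead, the index may still be worth keeping for other queries, but it is not helping this one.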

> **Interviewers Watch For**
>
> Interviewers verify you aggregate before ranking. Sorting raw rows gives per-row values, not group totals. The correct grain is one row per `ad_campaign`. They also check that both the 'test' exclusion and the tie rule from the prompt are handled.

> **Common Pitfall**
>
> Using the wrong aggregate function. `SUM` gives totals, `COUNT` gives row counts, `AVG` gives per-row averages. The prompt asks for total spend, so `SUM` is the metric needed here.
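
To make the difference between the three aggregates concrete, here is a small sketch on invented rows. The column names mirror the solution query; the values are sample data only.

```sql
-- Hypothetical impression rows for two campaigns
WITH sample_impressions (ad_campaign, revenue) AS (
    VALUES ('alpha', 10.0), ('alpha', 30.0), ('bravo', 25.0)
)
SELECT
    ad_campaign,
    SUM(revenue) AS total_revenue,        -- what the prompt asks for
    COUNT(*)     AS impression_rows,      -- volume, not money
    AVG(revenue) AS avg_revenue_per_row   -- per-row average
FROM sample_impressions
GROUP BY ad_campaign
ORDER BY ad_campaign;
-- alpha: total 40.0, 2 rows, average 20.0
-- bravo: total 25.0, 1 row, average 25.0
```

Ranking by `AVG` would put bravo first, while ranking by `SUM` puts alpha first, so the choice of aggregate changes the answer.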

---

## Common follow-up questions

- If you use RANK instead of DENSE_RANK and three campaigns tie for first, what rank does the next campaign receive? _(Tests understanding of rank gaps; RANK would assign 4, skipping 2 and 3.)_
- How do you handle the 'contains test' exclusion if campaign names use mixed case or abbreviations like 'TST'? _(Tests robustness of the LIKE pattern and whether LOWER normalization is applied.)_
- What does the revenue column represent here, and could it differ from 'spend'? How would you confirm the correct column? _(Tests schema reading discipline; revenue may be advertiser earnings, not spend.)_
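
The first follow-up can be checked directly. The sketch below uses invented totals with a three-way tie for first place and computes both ranking functions side by side.

```sql
-- Hypothetical per-campaign totals with a three-way tie at the top
WITH campaign_totals (ad_campaign, total_revenue) AS (
    VALUES ('alpha', 500), ('bravo', 500), ('charlie', 500), ('delta', 200)
)
SELECT
    ad_campaign,
    total_revenue,
    RANK()       OVER (ORDER BY total_revenue DESC) AS rank_with_gaps,
    DENSE_RANK() OVER (ORDER BY total_revenue DESC) AS rank_without_gaps
FROM campaign_totals
ORDER BY total_revenue DESC, ad_campaign;
-- alpha, bravo, charlie: rank_with_gaps = 1, rank_without_gaps = 1
-- delta:                 rank_with_gaps = 4, rank_without_gaps = 2
```

With a cutoff of `<= 2`, filtering on `RANK()` would return only the three tied campaigns, while filtering on `DENSE_RANK()` would also return delta. Which behavior is wanted is worth confirming with the interviewer.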

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/top_2_ad_campaigns_by_spend)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.