# Campaign Revenue by Click Channel

> Which ad format drives the most revenue?

Canonical URL: <https://datadriven.io/problems/campaign_revenue_by_click_channel>

Domain: SQL · Difficulty: medium · Seniority: L4

## Problem

The ad team wants a revenue breakdown by campaign showing how much came from clicked impressions versus non-clicked ones. For each campaign, show the total revenue and the percentage split between clicked and non-clicked, listed alphabetically by campaign.

## Worked solution and explanation

### Why this problem exists in real interviews

This probes conditional aggregation, a core skill for building pivot-style reports in a single pass. Interviewers want to see that you can split a metric by a binary flag using `CASE WHEN` inside aggregate functions rather than writing multiple queries or subqueries.

---

### Break down the requirements

#### Step 1: Identify the relevant table

Only `ad_impressions` is needed. The `push_notifs` table is a distractor; the prompt asks for revenue by click channel within impressions, not cross-channel joins.

#### Step 2: Group by campaign

`GROUP BY ad_campaign` produces one row per campaign. `SUM(revenue)` gives total revenue.

#### Step 3: Compute clicked vs non-clicked splits

Use `SUM(CASE WHEN clicked = 1 THEN revenue ELSE 0 END)` for clicked revenue and the complement for non-clicked. Divide each by total revenue and multiply by 100 for percentages.

#### Step 4: Order alphabetically

`ORDER BY ad_campaign` satisfies the alphabetical sort requirement.

---

### The solution

**Conditional aggregation with percentage split**

```sql
SELECT
    ad_campaign,
    SUM(revenue) AS total_revenue,
    ROUND(100.0 * SUM(CASE WHEN clicked = 1 THEN revenue ELSE 0 END) / SUM(revenue), 2) AS clicked_pct,
    ROUND(100.0 * SUM(CASE WHEN clicked = 0 THEN revenue ELSE 0 END) / SUM(revenue), 2) AS non_clicked_pct
FROM ad_impressions
GROUP BY ad_campaign
ORDER BY ad_campaign
```

> **Cost Analysis**
>
> A single sequential scan of 250M rows with no joins. The `GROUP BY` reduces output to ~180 rows. This is as efficient as possible; the bottleneck is I/O on the full table scan. An index on `ad_campaign` would not help since we need every row.

> **Interviewers Watch For**
>
> Candidates who recognize the `push_notifs` table as a distractor show strong prompt-reading skills. The other key signal is using conditional aggregation in one pass rather than self-joining the table twice.

> **Common Pitfall**
>
> Dividing by `SUM(revenue)` without guarding against zero-revenue campaigns causes division-by-zero errors. Wrapping in `NULLIF(SUM(revenue), 0)` is the safe pattern.

---

## Common follow-up questions

- How would you handle campaigns with zero total revenue? _(Tests NULLIF or CASE guards around division to prevent runtime errors.)_
- What if the prompt also asked for the push notification open rate per campaign? _(Now the second table is relevant; tests JOIN strategy and grain alignment.)_
- Could you produce this as a pivoted result with separate rows for clicked and non-clicked? _(Tests UNION ALL vs conditional aggregation trade-offs.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/campaign_revenue_by_click_channel)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.