# Click vs Non-Click Rates

> Some searches lead to clicks. Most do not.

Canonical URL: <https://datadriven.io/problems/click_vs_non_click_rates>

Domain: SQL · Difficulty: medium · Seniority: L4

## Problem

The search quality team wants to measure how often users click a result when the result set is small. In a single row, compute two percentages relative to all search records: the share of queries where a result was clicked and the results count was 3 or fewer, and the share where no result was clicked despite the results count being 3 or fewer.

## Worked solution and explanation

### Why this problem exists in real interviews

This tests ungrouped aggregation: collapsing the whole table into a single-row summary with two derived percentages. Interviewers probe whether you can express compound conditions inside aggregate functions and compute each percentage against the total row count.

---

### Break down the requirements

#### Step 1: Define the two metrics

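Clicked with small result set: `clicked_result = 1 AND results_count <= 3`. Not clicked with small result set: `clicked_result = 0 AND results_count <= 3`. Both are expressed as percentages of all rows, so searches with more than 3 results stay in the denominator and the two figures need not sum to 100.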

#### Step 2: Compute in a single query

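Use conditional aggregation: `100.0 * SUM(CASE WHEN ... THEN 1 ELSE 0 END) / COUNT(*)` for each metric. The `100.0` literal keeps the division from truncating to an integer, and `COUNT(*)` with no `WHERE` or `GROUP BY` counts every row in the table.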
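
To see why the decimal literal matters, here is a minimal sketch that uses constants only, so no table is needed. It assumes Postgres-style arithmetic, where dividing one integer by another truncates; the column aliases are illustrative and not part of the problem.

```sql
-- 1 matching row out of 3 total, written three ways.
SELECT
    100 * 1 / 3             AS integer_pct,  -- integer division truncates to 33
    100.0 * 1 / 3           AS numeric_pct,  -- numeric division keeps the fraction
    ROUND(100.0 * 1 / 3, 2) AS rounded_pct;  -- 33.33
```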

---

### The solution

**Dual conditional percentage in one row**

```sql
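-- No WHERE or GROUP BY: every search row is counted in the denominator.
SELECT
    -- Clicked, and at most 3 results were returned.
    ROUND(100.0 * SUM(CASE WHEN clicked_result = 1 AND results_count <= 3 THEN 1 ELSE 0 END) / COUNT(*), 2) AS clicked_small_pct,
    -- Not clicked, and at most 3 results were returned.
    ROUND(100.0 * SUM(CASE WHEN clicked_result = 0 AND results_count <= 3 THEN 1 ELSE 0 END) / COUNT(*), 2) AS not_clicked_small_pct
FROM search_queries;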
```
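
A quick way to sanity-check the query is to run it against a handful of hand-counted rows. The sketch below shadows `search_queries` with a CTE of hypothetical sample data (the real table's contents and any extra columns are not given in the problem), so it runs without touching the real table.

```sql
-- Hypothetical sample rows; only the two columns the solution reads are modeled.
WITH search_queries (clicked_result, results_count) AS (
    VALUES
        (1, 2),   -- clicked, small
        (1, 3),   -- clicked, small (boundary: exactly 3)
        (0, 1),   -- not clicked, small
        (0, 2),   -- not clicked, small
        (0, 3),   -- not clicked, small (boundary: exactly 3)
        (1, 5),   -- large result set: denominator only
        (1, 10),  -- large result set: denominator only
        (0, 7)    -- large result set: denominator only
)
SELECT
    -- 2 of 8 rows -> 25.00
    ROUND(100.0 * SUM(CASE WHEN clicked_result = 1 AND results_count <= 3 THEN 1 ELSE 0 END) / COUNT(*), 2) AS clicked_small_pct,
    -- 3 of 8 rows -> 37.50
    ROUND(100.0 * SUM(CASE WHEN clicked_result = 0 AND results_count <= 3 THEN 1 ELSE 0 END) / COUNT(*), 2) AS not_clicked_small_pct
FROM search_queries;
```

The two percentages add up to 62.50 rather than 100 because the three large-result rows count toward the total but toward neither metric.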

> **Cost Analysis**
>
> One scan of the 50M-row table, with no GROUP BY and a single-row output. The two conditional sums add negligible per-row overhead. Because the denominator counts every row, a full pass over the data is needed, and this query makes only one.

> **Interviewers Watch For**
>
> Whether the candidate returns one row with two columns rather than two rows. The prompt says "in a single row," so the answer calls for conditional aggregation, not a `UNION` of two separate queries.
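
The single-row shape can also be produced with the aggregate `FILTER` clause, which Postgres supports. This is an alternative spelling of the same conditional aggregation, not the reference solution above. The sample CTE is hypothetical, and the `NULLIF` guard against an empty table is an addition the problem does not ask for.

```sql
-- Hypothetical sample rows standing in for the real table.
WITH search_queries (clicked_result, results_count) AS (
    VALUES (1, 2), (0, 3), (1, 9), (0, 1)
)
SELECT
    -- COUNT(*) FILTER counts only the rows that satisfy the WHERE condition.
    ROUND(100.0 * COUNT(*) FILTER (WHERE clicked_result = 1 AND results_count <= 3)
          / NULLIF(COUNT(*), 0), 2) AS clicked_small_pct,      -- 1 of 4 -> 25.00
    ROUND(100.0 * COUNT(*) FILTER (WHERE clicked_result = 0 AND results_count <= 3)
          / NULLIF(COUNT(*), 0), 2) AS not_clicked_small_pct   -- 2 of 4 -> 50.00
FROM search_queries;
```

Either form returns exactly one row; `CASE` inside `SUM` is the more portable choice if the query has to run on engines without `FILTER`.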

> **Common Pitfall**
>
> Using `results_count < 3` instead of `<= 3` excludes searches that returned exactly 3 results. Precision in inequality operators matters. Always re-read the prompt for exact boundary conditions.
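
The boundary difference is easy to demonstrate on a few values. This sketch uses a hypothetical five-row sample with two rows sitting exactly on the boundary.

```sql
-- Hypothetical result counts; two of them are exactly 3.
WITH sample (results_count) AS (
    VALUES (1), (2), (3), (3), (4)
)
SELECT
    SUM(CASE WHEN results_count <  3 THEN 1 ELSE 0 END) AS strictly_less_than_3,  -- 2
    SUM(CASE WHEN results_count <= 3 THEN 1 ELSE 0 END) AS three_or_fewer         -- 4
FROM sample;
```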

---

## Common follow-up questions

- What if you also needed the percentage for large result sets? _(Add a third CASE condition for results_count > 3.)_
- How would you make this a time-series, computing percentages per day? _(Add GROUP BY DATE(query_time) to produce one row per day.)_
- What if `clicked_result` could be NULL? _(Tests awareness that `NULL = 0` evaluates to NULL rather than true, so the CASE falls through to ELSE: those rows land in neither bucket yet still count toward `COUNT(*)`.)_
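
The three follow-ups can be sketched as below. Everything here is an assumption for illustration: the temp table `search_queries_sample`, its sample rows, the `timestamp` type of `query_time`, and the extra column aliases are hypothetical, and reporting NULL clicks in their own column is one possible handling, not a requirement of the problem.

```sql
-- Hypothetical sample table; the real schema of search_queries is not given.
CREATE TEMP TABLE search_queries_sample (
    query_time     timestamp,
    clicked_result integer,   -- 1, 0, or NULL
    results_count  integer
);

INSERT INTO search_queries_sample VALUES
    ('2024-01-01 09:00', 1,    2),
    ('2024-01-01 10:00', 0,    3),
    ('2024-01-01 11:00', 1,    8),
    ('2024-01-02 09:00', NULL, 1),
    ('2024-01-02 10:00', 0,    12),
    ('2024-01-02 11:00', 1,    3);

-- Follow-up 1: add a third metric for large result sets.
SELECT
    ROUND(100.0 * SUM(CASE WHEN clicked_result = 1 AND results_count <= 3 THEN 1 ELSE 0 END) / COUNT(*), 2) AS clicked_small_pct,
    ROUND(100.0 * SUM(CASE WHEN clicked_result = 0 AND results_count <= 3 THEN 1 ELSE 0 END) / COUNT(*), 2) AS not_clicked_small_pct,
    ROUND(100.0 * SUM(CASE WHEN results_count > 3 THEN 1 ELSE 0 END) / COUNT(*), 2) AS large_pct
FROM search_queries_sample;

-- Follow-up 2: one row per day; COUNT(*) is now the per-day total.
SELECT
    CAST(query_time AS date) AS query_date,
    ROUND(100.0 * SUM(CASE WHEN clicked_result = 1 AND results_count <= 3 THEN 1 ELSE 0 END) / COUNT(*), 2) AS clicked_small_pct,
    ROUND(100.0 * SUM(CASE WHEN clicked_result = 0 AND results_count <= 3 THEN 1 ELSE 0 END) / COUNT(*), 2) AS not_clicked_small_pct
FROM search_queries_sample
GROUP BY CAST(query_time AS date)
ORDER BY query_date;

-- Follow-up 3: surface NULL clicks explicitly instead of letting them vanish.
SELECT
    ROUND(100.0 * SUM(CASE WHEN clicked_result = 1 AND results_count <= 3 THEN 1 ELSE 0 END) / COUNT(*), 2) AS clicked_small_pct,
    ROUND(100.0 * SUM(CASE WHEN clicked_result = 0 AND results_count <= 3 THEN 1 ELSE 0 END) / COUNT(*), 2) AS not_clicked_small_pct,
    ROUND(100.0 * SUM(CASE WHEN clicked_result IS NULL AND results_count <= 3 THEN 1 ELSE 0 END) / COUNT(*), 2) AS unknown_click_small_pct
FROM search_queries_sample;
```

Whether NULL clicks should be reported separately, folded into "not clicked" with `COALESCE`, or dropped from the denominator is a definition to confirm with the interviewer before writing the query.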

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.