# Least Viewed Content

> Nobody is watching. Should it still exist?

Canonical URL: <https://datadriven.io/problems/least_viewed_content>

Domain: SQL · Difficulty: medium · Seniority: L4

## Problem

Which content has the fewest unique viewers? A user viewing the same page multiple times counts only once. If there's a tie, include all tied content.

## Worked solution and explanation

### Why this problem exists in real interviews

On a 500M row `page_views` table, the interviewer is checking two reflexes: filter aggregates with HAVING, not WHERE, and dedupe viewers with `COUNT(DISTINCT user_id)` instead of `COUNT(*)`. The tie-inclusive twist forces a scalar subquery returning the global MIN viewer count, not a `LIMIT 1`.

---

### Break down the requirements

#### Step 1: Group on the content key

`GROUP BY page_url`. Treat `page_url` as `content_id` in the SELECT list so downstream consumers get the field name the prompt asked for.

#### Step 2: Dedupe per viewer

Use `COUNT(DISTINCT user_id)`, not `COUNT(*)`. A user reloading the same page ten times must contribute one viewer.

#### Step 3: Find the global minimum

Inner scalar subquery: group by `page_url`, then `SELECT MIN(viewer_count)` over those groups. That single number is the threshold every tied page must match.

#### Step 4: Filter on the aggregate

`HAVING COUNT(DISTINCT user_id) = (subquery)`. WHERE cannot see the aggregate. ORDER BY at the end is cosmetic since all rows tie, but keep it for stable output.

---

### The solution

**TIE-INCLUSIVE MIN-VIEWER PAGES**

```sql
SELECT page_url AS content_id,
       COUNT(DISTINCT user_id) AS unique_viewers
FROM page_views
GROUP BY page_url
HAVING COUNT(DISTINCT user_id) = (
  SELECT MIN(viewer_count)
  FROM (
    SELECT COUNT(DISTINCT user_id) AS viewer_count
    FROM page_views
    GROUP BY page_url
  )
)
ORDER BY unique_viewers ASC
```

> **Cost Analysis**
>
> Two passes over 500M rows: one for the outer group, one for the inner. A real planner often shares the scan via CTE materialization, but assume two hash aggs on `page_url`. Distinct counting needs a per-group hash set on `user_id`. Approximate with `APPROX_COUNT_DISTINCT` in production if exactness is not required.

> **Interviewers Watch For**
>
> Ask before writing: are anonymous viewers in scope? `COUNT(DISTINCT user_id)` silently skips NULL `user_id`, which makes a page with only logged-out traffic look like zero viewers and win the minimum. If anonymous counts, switch to `COUNT(DISTINCT COALESCE(user_id, view_id))` or session id.

> **Common Pitfall**
>
> Reaching for `ORDER BY unique_viewers ASC LIMIT 1`. That returns one row and silently drops ties, which the prompt explicitly forbids. The scalar subquery against `MIN(viewer_count)` is the only way to keep every tied page.

---

### COMMON FOLLOW-UP QUESTIONS

## Common follow-up questions

- Restrict the answer to the last 30 days of views. _(Tests partition pruning on `viewed_at` via a WHERE clause inside both the outer query and the inner subquery.)_
- Rewrite using a window function instead of a scalar subquery. _(Probes `RANK() OVER (ORDER BY COUNT(DISTINCT user_id))` on a grouped CTE and filtering rank = 1.)_
- What if `page_url` has tracking parameters that vary per visit? _(Forces normalization with `SPLIT_PART` or regex before grouping, otherwise the same article fragments into many `content_id` values.)_
- Return the least viewed page per `device` instead of overall. _(Adds `device` to GROUP BY and switches the scalar subquery to a per-device window MIN.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/least_viewed_content)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.