# Most Popular Content Type

> The content type everyone prefers.

Canonical URL: <https://datadriven.io/problems/most_popular_content_type>

Domain: SQL · Difficulty: medium · Seniority: L3

## Problem

Before allocating next quarter's production budget, the editorial team wants to know which content type already dominates the catalog by volume, along with its item count.

## Worked solution and explanation

### What this is really asking

4M content_items collapse to a short content_type list, then ORDER BY cnt DESC LIMIT 1 picks the leader. The trap is what LIMIT 1 hides on ties.

---

### Break down the requirements

#### Step 1: Aggregate by content_type

GROUP BY content_type with COUNT(*). content_id is the primary key so COUNT(content_id) matches, but COUNT(*) reads as the right intent.

#### Step 2: Rank and slice

ORDER BY cnt DESC LIMIT 1 returns the leader. Return both columns so editorial sees the gap.

---

### The solution

**TOP CONTENT TYPE BY VOLUME**

```sql
SELECT content_type, COUNT(*) AS cnt
FROM content_items
GROUP BY content_type
ORDER BY cnt DESC
LIMIT 1
```

> **Cost Analysis**
>
> A 4M row scan is unavoidable without a precomputed rollup. content_type is low cardinality, so the hash aggregate stays tiny. A covering index on content_type lets the planner skip the wider row.

> **Interviewers Watch For**
>
> Whether you mention ties unprompted. A 12 minute budget says they want the edge case (multiple types tied for first) and a plan to surface it, not a one-line query.

> **Common Pitfall**
>
> Selecting non-aggregated columns alongside content_type. Lax engines accept SELECT title, content_type, COUNT(*) and return an arbitrary title; strict mode rejects it. Stick to the grouped column and the aggregate.

> **The False Start**
>
> First instinct is ORDER BY cnt DESC LIMIT 1 and ship it. That silently picks one winner when two types tie. Pivot to RANK() OVER (ORDER BY COUNT(*) DESC) in a CTE and filter WHERE rnk = 1 so every tied leader surfaces.

---

### COMMON FOLLOW-UP QUESTIONS

## Common follow-up questions

- Return every content_type tied for the top spot, not just one. _(Forces the RANK or DENSE_RANK rewrite when LIMIT is unsafe.)_
- Restrict to items published in the last 12 months. _(Adds a publish_date filter and opens whether recent volume diverges from catalog volume.)_
- How would you serve this answer to a dashboard refreshing every 30 seconds? _(Pushes toward a materialized rollup or counter table; scanning 4M rows per load is wasteful.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/most_popular_content_type)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.