# Creator Favorite Content Type

> Every creator has a go-to format.

Canonical URL: <https://datadriven.io/problems/creator_favorite_content_type>

Domain: SQL · Difficulty: hard · Seniority: L4

## Problem

For each content creator, which content type do they publish most often? If a creator is tied across multiple types, include all of the tied types. Show each creator and their most-published content type.

## Worked solution and explanation

### Why this problem exists in real interviews

This is a mode-per-group problem (most frequent value per partition). The tie-inclusion requirement makes it harder: you must use RANK, not ROW_NUMBER, and return all tied types.

> **Trick to Solving**
>
> "If tied, include all" signals RANK or DENSE_RANK, not ROW_NUMBER. The pattern is: count per (creator, type), rank within creator by count, filter to rank 1.
> 
> 1. COUNT(*) per (creator_id, content_type)
> 2. RANK() OVER (PARTITION BY creator_id ORDER BY count DESC)
> 3. Filter WHERE rank = 1

---

### Break down the requirements

#### Step 1: Count items per creator-type pair

`GROUP BY creator_id, content_type` with `COUNT(*)`.

#### Step 2: Rank within each creator

`RANK() OVER (PARTITION BY creator_id ORDER BY cnt DESC)` assigns rank 1 to the most-published type(s).

#### Step 3: Filter to rank 1

`WHERE rnk = 1` keeps the top type(s) per creator, including ties.

---

### The solution

**Mode-per-group with tie inclusion**

```sql
WITH type_counts AS (
    SELECT
        creator_id,
        content_type,
        COUNT(*) AS cnt,
        RANK() OVER (PARTITION BY creator_id ORDER BY COUNT(*) DESC) AS rnk
    FROM content_items
    GROUP BY creator_id, content_type
)
SELECT creator_id, content_type
FROM type_counts
WHERE rnk = 1
```

> **Cost Analysis**
>
> Scan of 6M rows, aggregated to (creators x types). The RANK window function sorts per creator partition. With thousands of creators and a handful of types each, this is efficient.

> **Interviewers Watch For**
>
> Whether the candidate chooses RANK (correct for tie inclusion) vs ROW_NUMBER (incorrect, as it would arbitrarily pick one tied type). This is the core test of the question.

> **Common Pitfall**
>
> Using ROW_NUMBER instead of RANK silently drops tied content types. If a creator publishes 50 articles and 50 videos, ROW_NUMBER would pick one arbitrarily.

---

## Common follow-up questions

- What if you needed the top 2 content types per creator? _(Change the filter to rnk <= 2.)_
- How would you break ties by content_type alphabetically? _(Add content_type ASC to the ORDER BY in the window function.)_
- What is the difference between RANK and DENSE_RANK here? _(Both assign rank 1 to ties, but they differ at rank 2+. Since we only filter rank = 1, the choice does not matter.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/creator_favorite_content_type)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.