# Top Active Senders per Channel

> Top three messages per channel by replies.

Canonical URL: <https://datadriven.io/problems/top_active_senders_per_channel>

Domain: SQL · Difficulty: medium · Seniority: L4

## Problem

Show the top 3 most active senders by message count in each chat channel. Ties share the same rank with no gaps, which may produce more than 3 rows per channel. Return channel, sender ID, and message count.

## Worked solution and explanation

### Why this problem exists in real interviews

This uses `chat_msgs` to probe per-group ranking via `ROW_NUMBER()` or `DENSE_RANK()` partitioned by a grouping key. It reveals whether a candidate understands the difference between `ROW_NUMBER`, `RANK`, and `DENSE_RANK` and picks the right one for the output contract.

---

### Break down the requirements

#### Step 1: Apply filters

Use a `WHERE` clause to narrow the data to the relevant subset before aggregation.

#### Step 2: Aggregate per channel

`GROUP BY channel` with the appropriate aggregate function produces one summary row per group from the `chat_msgs` table.

#### Step 3: Rank within each channel

Use `ROW_NUMBER() OVER (PARTITION BY channel ORDER BY aggregate DESC)` to rank entries within each partition.

#### Step 4: Filter to top entries

Wrap in a subquery and filter `WHERE rn <= N` to keep only the top entries per group.

---

### The solution

**Dense-rank senders by message count within each chat_msgs channel**

```sql
SELECT channel, channel, total_reply_to
FROM (
    SELECT
        channel,
        channel,
        SUM(reply_to) AS total_reply_to,
        ROW_NUMBER() OVER (
            PARTITION BY channel
            ORDER BY SUM(reply_to) DESC
        ) AS rn
    FROM chat_msgs
    GROUP BY channel, channel
) ranked
WHERE rn <= 10
ORDER BY channel, total_reply_to DESC
```

> **Cost Analysis**
>
> The GROUP BY reduces the 40M-row `chat_msgs` table to the number of distinct `channel` values. The window function sorts within each partition. A covering index on `(channel, reply_to)` enables an index-only aggregate scan.

> **Interviewers Watch For**
>
> Interviewers specifically test whether you use `PARTITION BY` in the window function. Omitting it gives a global ranking instead of per-group, which is at its core different.

> **Common Pitfall**
>
> Using `ORDER BY ... LIMIT` instead of a window function for per-group ranking. LIMIT gives N rows globally, not per group. Per-group top-N always requires a window function.

---

## Common follow-up questions

- If a channel has 5 senders all with the same message count, how many rows does your query return for that channel? _(Tests understanding that DENSE_RANK assigns rank 1 to all five, and all pass the <= 3 filter.)_
- Should you aggregate first in a CTE and then apply DENSE_RANK, or can you combine them in a single SELECT? _(Tests query layering; window functions operate after GROUP BY, so aggregation and ranking can coexist in one layer.)_
- How would you exclude bots or system senders if they have a known sender_id prefix like 'sys_'? _(Tests adding a WHERE filter on sender_id before aggregation.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/top_active_senders_per_channel)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.