# Top 2 Active Push Days

> Two days stood out from the rest. Which ones?

Canonical URL: <https://datadriven.io/problems/top_2_active_push_days>

Domain: SQL · Difficulty: medium · Seniority: L3

## Problem

During a push notification window from August 1 to 7, which two days saw the most unique users receiving a notification? Show the day of week number (0=Sunday through 6=Saturday), date, and unique user count.

## Worked solution and explanation

### What this is really asking

`push_notifs` has 80M rows partitioned by `sent_at`, so wrapping `sent_at` in `DATE()` inside WHERE silently breaks partition pruning. Then unique `user_id` per day, top two.

---

### Break down the requirements

#### Step 1: Window the 7 days

Filter `DATE(sent_at) BETWEEN '...-08-01' AND '...-08-07'`. In production, prefer a half-open `sent_at` range for pruning.

#### Step 2: Group by day, count distinct users

`GROUP BY DATE(sent_at)` with `COUNT(DISTINCT user_id)`. Three Tuesday pushes to one user count once.

#### Step 3: Day-of-week and rank

`strftime('%w', sent_at)` returns '0' through '6'. `ORDER BY unique_users DESC LIMIT 2` picks the top pair.

---

### The solution

**TOP TWO DAYS BY UNIQUE RECIPIENTS**

```sql
SELECT TRIM(strftime('%w', sent_at)) AS day_name,
       DATE(sent_at)                  AS send_date,
       COUNT(DISTINCT user_id)        AS unique_users
FROM push_notifs
WHERE DATE(sent_at) BETWEEN '2026-08-01' AND '2026-08-07'
GROUP BY send_date
ORDER BY unique_users DESC
LIMIT 2
```

> **Cost Analysis**
>
> `COUNT(DISTINCT user_id)` per day is the heavy step on 80M rows; the engine holds a hash set per group. A half-open `sent_at` range beats `DATE()` for pruning.

> **Interviewers Watch For**
>
> `COUNT(user_id)` vs `COUNT(DISTINCT user_id)`. Without DISTINCT, retried sends inflate the count for the day retries fired.

> **Common Pitfall**
>
> Grouping by `strftime('%w', sent_at)` instead of by date. Two different Sundays collapse into one row and the output loses the date column.

---

### COMMON FOLLOW-UP QUESTIONS

## Common follow-up questions

- How would you handle ties at rank two? _(Swap `LIMIT 2` for `DENSE_RANK() OVER (ORDER BY unique_users DESC) <= 2` so tied days surface.)_
- What if the partition column is sent_at_hour, not date? _(Same shape; keep WHERE on the raw partition column so pruning still works.)_
- How would you also report open rate per day? _(Add `SUM(opened) * 1.0 / COUNT(*)` and state whether the denominator is rows or distinct users.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/top_2_active_push_days)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.