# Peak Hour Power Callers

> One hour. The phone lines exploded.

Canonical URL: <https://datadriven.io/problems/peak_hour_power_callers>

Domain: SQL · Difficulty: medium · Seniority: L3

## Problem

How many unique users made 3 or more API calls during the afternoon rush (15:00 to 17:59 inclusive)? Return a single count.

## Worked solution and explanation

### Why this problem exists in real interviews

Extracting insights from api_calls.endpoint grouped by method via HAVING filter and grouping is the central task. It is used in mid-level screens to test whether you pick the right aggregation function and partition boundary on the first attempt.

---

### Break down the requirements

#### Step 1: Filter to afternoon rush hours

`WHERE CAST(strftime('%H', call_time) AS INTEGER) BETWEEN 15 AND 17` restricts to 15:00 through 17:59.

#### Step 2: Count calls per user in that window

`GROUP BY user_id HAVING COUNT(*) >= 3` identifies power callers.

#### Step 3: Count the qualifying users

Wrap in a subquery and `SELECT COUNT(*)` to get the final number.

---

### The solution

**Time filter, per-user HAVING, outer COUNT**

```sql
SELECT COUNT(*) AS power_caller_count
FROM (
    SELECT user_id
    FROM api_calls
    WHERE CAST(strftime('%H', call_time) AS INTEGER) BETWEEN 15 AND 17
    GROUP BY user_id
    HAVING COUNT(*) >= 3
) sub
```

> **Cost Analysis**
>
> At `api_calls` (200,000,000 rows), a full table scan is expensive. Partition pruning (if the table is partitioned on the filter column) is the first optimization. A covering index on the `GROUP BY` + filter columns eliminates random I/O. Consider a materialized view for repeated dashboard queries.

> **Interviewers Watch For**
>
> Interviewers evaluate whether you translate the English requirements into the correct SQL clauses on the first attempt. They watch for clean syntax, correct column references, and whether you verify edge cases before declaring the query complete.

> **Common Pitfall**
>
> The most common mistake is misreading the prompt's filtering or grouping requirements. Double-check which columns to group by, which to aggregate, and whether the output should be filtered with `WHERE` (before grouping) or `HAVING` (after grouping).

---

## Common follow-up questions

- What happens to your result if api_calls.err_msg contains NULLs for some rows? _(Tests whether the candidate accounts for NULL behavior in aggregates and comparisons on err_msg.)_
- What is the difference between filtering in WHERE versus HAVING for this query against api_calls? _(Tests whether the candidate understands pre-aggregation vs post-aggregation filtering.)_
- With millions of distinct values in api_calls.call_id, what index strategy would you use to keep this query performant? _(Tests indexing knowledge specific to high-cardinality columns like call_id.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/peak_hour_power_callers)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.