# After Hours API Calls

> The office is dark. The API is not.

Canonical URL: <https://datadriven.io/problems/after_hours_api_calls>

Domain: SQL · Difficulty: medium · Seniority: L3

## Problem

The compliance team is auditing API activity outside business hours for December 2026. Business hours are Monday through Friday, 09:00 to 16:00. All weekend calls and calls on December 25th and 26th count as out-of-hours regardless of time. Return the total count of out-of-hours calls as a decimal value.

## Worked solution and explanation

### Why this problem exists in real interviews

Interviewers use the `api_calls` table here to probe conditional filtering on timestamps. Only `call_time` decides whether a call is out-of-hours; columns such as `endpoint`, `method`, and `status` do not affect the answer, and the expected output is a single row holding one count.

---

### Break down the requirements

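#### Step 1: Restrict to December 2026

Filter `call_time` to the half-open range from `2026-12-01` up to, but not including, `2027-01-01`. A plain range predicate on the bare column keeps every December timestamp and can take advantage of the `call_time` partitioning.

#### Step 2: Flag out-of-hours calls

A call is out-of-hours if any of these holds: it falls on a Saturday or Sunday (`STRFTIME('%w', ...)` returns `'6'` or `'0'`), it falls on December 25th or 26th, or its hour is before 09 or at or after 16. `STRFTIME` returns text, so compare against zero-padded strings. The problem does not say whether 16:00 itself counts as business hours; this solution treats the window as 09:00 inclusive to 16:00 exclusive.

#### Step 3: Count and cast

`COUNT(*)` over the filtered rows gives the total. Cast it to `REAL` because the problem asks for a decimal value.

---

### The solution

**Filtered count for after-hours API calls**

```sql
SELECT
    CAST(COUNT(*) AS REAL) AS out_of_hours_calls
FROM api_calls
WHERE call_time >= '2026-12-01' AND call_time < '2027-01-01'
  AND (
      STRFTIME('%w', call_time) IN ('0', '6')        -- Sunday, Saturday
      OR STRFTIME('%d', call_time) IN ('25', '26')   -- December 25th and 26th
      OR STRFTIME('%H', call_time) < '09'            -- before opening
      OR STRFTIME('%H', call_time) >= '16'           -- at or after closing
  )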
```
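
One way to sanity-check the out-of-hours rules from the problem statement is to classify a handful of hand-picked timestamps row by row before trusting any total. The sketch below assumes SQLite, a `call_time` stored as ISO-8601 text in office-local time, and a business window of 09:00 inclusive to 16:00 exclusive. The fixture rows and the `bucket` labels are hypothetical and exist only for illustration.

```sql
-- Hypothetical fixture: only the columns needed for the check are created.
CREATE TABLE api_calls (
    call_id   INTEGER PRIMARY KEY,
    call_time TEXT NOT NULL            -- ISO-8601 text, assumed office-local time
);

INSERT INTO api_calls (call_id, call_time) VALUES
    (1, '2026-12-01 10:30:00'),  -- Tuesday, inside business hours
    (2, '2026-12-01 08:59:59'),  -- Tuesday, before 09:00
    (3, '2026-12-01 16:00:00'),  -- Tuesday, exactly at closing time
    (4, '2026-12-05 11:00:00'),  -- Saturday
    (5, '2026-12-25 11:00:00'),  -- Friday, December 25th
    (6, '2026-12-26 11:00:00'),  -- Saturday, December 26th
    (7, '2026-11-30 23:00:00'),  -- November, outside the audit window
    (8, '2027-01-01 02:00:00');  -- January, outside the audit window

-- Label each December row; the first matching rule wins.
SELECT
    call_id,
    call_time,
    CASE
        WHEN STRFTIME('%w', call_time) IN ('0', '6')  THEN 'weekend'
        WHEN STRFTIME('%d', call_time) IN ('25', '26') THEN 'holiday'
        WHEN STRFTIME('%H', call_time) < '09'
          OR STRFTIME('%H', call_time) >= '16'        THEN 'off-hours'
        ELSE 'business hours'
    END AS bucket
FROM api_calls
WHERE call_time >= '2026-12-01' AND call_time < '2027-01-01'
ORDER BY call_id;
```

Under these assumptions only row 1 lands in `business hours`, rows 7 and 8 are filtered out, and the remaining five December rows are out-of-hours. If the interviewer says 16:00 is still inside business hours, the `>= '16'` comparison needs to change accordingly.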

> **Cost Analysis**
>
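> The main table has 300M rows (77 GB) and is partitioned on `call_time`, so a plain range predicate for December 2026 lets the engine skip every other partition. Predicates that wrap `call_time` in a function, such as the weekday and hour checks, generally cannot prune partitions on their own; they are cheap only because they run on the single month that survives the range filter.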
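
The difference between a prunable and a non-prunable date filter can be inspected with a query plan. This is a minimal sketch assuming SQLite, which has no partitions, so a hypothetical index named `idx_api_calls_call_time` stands in for the partition key; the exact plan text varies by engine and version.

```sql
-- Hypothetical minimal table and index for illustration.
CREATE TABLE IF NOT EXISTS api_calls (
    call_id   INTEGER PRIMARY KEY,
    call_time TEXT NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_api_calls_call_time ON api_calls (call_time);

-- Range predicate on the bare column: the planner can seek straight to December 2026.
EXPLAIN QUERY PLAN
SELECT COUNT(*)
FROM api_calls
WHERE call_time >= '2026-12-01' AND call_time < '2027-01-01';

-- Logically the same month, but the column is wrapped in a function,
-- so the bounds on call_time are hidden from the planner and every entry is examined.
EXPLAIN QUERY PLAN
SELECT COUNT(*)
FROM api_calls
WHERE STRFTIME('%Y-%m', call_time) = '2026-12';
```

Comparing the two plans should show a search on the index for the first query and a scan for the second, which is the same trade-off a partitioned warehouse makes when deciding which partitions to read.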

> **Interviewers Watch For**
>
> Strong candidates state the output shape (one row holding one decimal count) and the business-hours boundary rules before writing any SQL, showing they think about the result first.

> **Common Pitfall**
>
> Mixing the December filter (`AND`) with the out-of-hours conditions (`OR`) without parentheses is the most common error. `AND` binds tighter than `OR`, so the date range ends up attached to only one branch and calls from other months leak into the count.

---

## Common follow-up questions

- The `err_msg` column in `api_calls` has roughly 96% NULLs. How does your query handle those rows, and would the result change if NULLs were replaced with zeros? _(Tests whether the candidate understands how NULLs propagate through aggregation functions and whether their WHERE/JOIN conditions implicitly filter them out.)_
- Suppose you break the out-of-hours count down by `method` and sort by the count. If two groups have the same aggregate value, how is the output ordered, and is that deterministic? _(Tests awareness that ORDER BY on a non-unique value produces non-deterministic row order without a tiebreaker.)_
- `call_id` in `api_calls` has ~300M distinct values. What index strategy keeps your query from doing a full table scan? _(Tests whether the candidate can design indexes for high-cardinality columns and understands selectivity.)_
- If the business definition of `method` changed mid-quarter (e.g., a status value was renamed), how would you handle historical consistency? _(Tests awareness of slowly changing dimensions and backward-compatible query design.)_
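
For the NULL question above, the behavior is easy to demonstrate on a three-row table. This is a small sketch assuming SQLite; `api_calls_demo` and its rows are hypothetical and only mirror the `call_id` and `err_msg` columns.

```sql
-- Hypothetical demo table: two rows without an error message, one with.
CREATE TABLE api_calls_demo (
    call_id INTEGER PRIMARY KEY,
    err_msg TEXT
);
INSERT INTO api_calls_demo (call_id, err_msg) VALUES
    (1, NULL),
    (2, NULL),
    (3, 'timeout');

-- COUNT(*) counts rows; COUNT(column) counts only non-NULL values.
SELECT
    COUNT(*)       AS all_rows,       -- 3
    COUNT(err_msg) AS non_null_rows   -- 1
FROM api_calls_demo;

-- A comparison against NULL evaluates to NULL, so those rows are dropped by WHERE.
SELECT COUNT(*) AS not_timeout        -- 0, not 2
FROM api_calls_demo
WHERE err_msg <> 'timeout';
```

An out-of-hours count built on `COUNT(*)` with predicates on `call_time` alone is unaffected by NULLs in `err_msg`; the result changes only if a predicate or an aggregate touches the nullable column.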

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.