# Low Latency API Calls

> Fast endpoints. Confirmed fast.

Canonical URL: <https://datadriven.io/problems/low_latency_api_calls>

Domain: SQL · Difficulty: easy · Seniority: L3

## Problem

The performance team is building a baseline dataset of fast API calls, defined as anything at or below 100 milliseconds. Pull all matching call records.

## Worked solution and explanation

### Why this problem exists in real interviews

Filtering `api_calls` on a latency threshold tests whether you can translate a business requirement into the right column reference and comparison operator. It shows up as a fundamentals check of practical SQL fluency.

---

### Break down the requirements

#### Step 1: Filter to the target rows

Apply a `WHERE` filter on `latency` to keep only calls at or below 100 milliseconds. The threshold is inclusive, so the comparison is `<=`, not `<`.
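
The inclusive boundary can be checked on a few hand-made rows. This is a minimal sketch, assuming Postgres and that `latency` is stored in milliseconds; the `sample_calls` rows are hypothetical and stand in for `api_calls`:

```sql
-- Hypothetical sample rows standing in for api_calls (latency assumed to be in ms).
WITH sample_calls (call_id, latency) AS (
    VALUES
        (1, 99),
        (2, 100),   -- exactly at the threshold: kept by <=
        (3, 101),   -- just over the threshold: dropped
        (4, NULL)   -- unknown latency: NULL <= 100 is not true, so dropped
)
SELECT call_id, latency
FROM sample_calls
WHERE latency <= 100;
```

Only `call_id` 1 and 2 come back. Swapping `<=` for `<` would also drop `call_id` 2.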

#### Step 2: Order the final output

Apply `ORDER BY latency ASC`, as in the solution below, so the fastest calls come first. When tied values exist, add a secondary sort column for determinism.
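
A tie-breaker might look like the sketch below. It assumes Postgres and that `call_id` is unique per row, which the problem statement does not confirm; the `sample_calls` rows are hypothetical:

```sql
-- Hypothetical sample rows; two calls share the same latency.
WITH sample_calls (call_id, endpoint, latency) AS (
    VALUES
        (7, '/login', 40),
        (3, '/search', 40),   -- ties with call_id 7 on latency
        (5, '/health', 12)
)
SELECT call_id, endpoint, latency
FROM sample_calls
WHERE latency <= 100
ORDER BY latency ASC, call_id ASC;   -- call_id breaks ties deterministically
```

The rows come back as `call_id` 5, 3, 7. Without the second sort key, the relative order of 3 and 7 is not guaranteed.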

---

### The solution

**Threshold filter for baseline dataset**

```sql
SELECT call_id, endpoint, method, status, latency, user_id, call_time, err_msg
FROM api_calls
WHERE latency <= 100
ORDER BY latency ASC
```

> **Cost Analysis**
>
> The query scans 80M rows from `api_calls`.
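>
> One way to inspect the scan yourself is to ask the planner. This is a minimal sketch, assuming Postgres; the plan you see depends on the table's indexes and statistics:
>
> ```sql
> -- EXPLAIN shows the estimated plan without running the query.
> EXPLAIN
> SELECT call_id, endpoint, method, status, latency, user_id, call_time, err_msg
> FROM api_calls
> WHERE latency <= 100
> ORDER BY latency ASC;
> ```
>
> A `Seq Scan on api_calls` node in the output means every row is read before the filter is applied.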

> **Interviewers Watch For**
>
> Interviewers expect you to articulate why the boundary is inclusive (`<=`, not `<`) and what happens to rows where `latency` is NULL: they are excluded, because `NULL <= 100` does not evaluate to true.

> **Common Pitfall**
>
> Writing `latency < 100` when the requirement says "at or below 100 milliseconds". The strict comparison silently drops every call that took exactly 100 ms. Always check whether a threshold is inclusive before choosing the operator.

---

## Common follow-up questions

- What happens to your result if `api_calls.err_msg` contains NULLs for some rows? _(Tests whether the candidate knows that NULLs in a column that is selected but not filtered on pass through unchanged.)_
- How would you verify that `api_calls` does not contain duplicate `call_id` rows that would inflate the baseline dataset? _(Tests data quality awareness and deduplication strategies.)_
- With millions of distinct values in `api_calls.call_id`, what index strategy would you use to keep this query performant? _(Tests indexing knowledge specific to high-cardinality columns like `call_id`.)_
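
The duplicate check and the index question can be sketched as follows. This assumes Postgres; the index name `idx_api_calls_latency` is hypothetical, and indexing `latency` rather than `call_id` is a suggestion based on the column this query filters and sorts on, not something the problem prescribes:

```sql
-- Duplicate check: list any call_id that appears more than once.
SELECT call_id, COUNT(*) AS copies
FROM api_calls
GROUP BY call_id
HAVING COUNT(*) > 1;

-- Hypothetical index on the column used in WHERE and ORDER BY.
CREATE INDEX idx_api_calls_latency ON api_calls (latency);
```

An empty result from the first query means `call_id` is unique. Whether the planner uses the index depends on how many rows fall at or below 100 ms; if most calls qualify, a sequential scan may still be cheaper.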

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/low_latency_api_calls)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.