# Rate Limit Rules Per Endpoint

> Threshold rules, endpoint by endpoint.

Canonical URL: <https://datadriven.io/problems/rate_limit_rules_per_endpoint>

Domain: SQL · Difficulty: medium · Seniority: L3

## Problem

We need to document rate limit configurations across endpoints. For each endpoint, surface the minimum and maximum allowed values, plus a summary string in the format 'Allowed > X AND Allowed <= Y => Endpoint = /path'. List results by endpoint.

## Worked solution and explanation

### Why this problem exists in real interviews

This analytics problem uses the `rate_limits` table to evaluate string concatenation for formatted output. Watch how the `endpoint` and `allowed` columns interact in the grouping and filtering logic.

---

### Break down the requirements

#### Step 1: Group by endpoint

`GROUP BY endpoint` aggregates all rate limit records per endpoint.

#### Step 2: Compute min and max allowed

`MIN(allowed)` and `MAX(allowed)` give the range.

#### Step 3: Build the summary string

Concatenate the values into the specified format using `||` string concatenation.

#### Step 4: Order by endpoint

`ORDER BY endpoint` for alphabetical listing.

---

### The solution

**Group by endpoint to find rate limit rules per endpoint**

```sql
SELECT
    endpoint,
    MIN(allowed) AS min_allowed,
    MAX(allowed) AS max_allowed,
    'Allowed > ' || MIN(allowed) || ' AND Allowed <= ' || MAX(allowed) || ' => Endpoint = ' || endpoint AS summary
FROM rate_limits
GROUP BY endpoint
ORDER BY endpoint
```

> **Cost Analysis**
>
> The query scans `rate_limits` (3,000,000 rows). A covering index on the filter and group columns would reduce I/O. At this scale, the full scan is acceptable but becomes costly if the table grows 10x.

> **Interviewers Watch For**
>
> Interviewers evaluate whether you translate the English requirements into the correct SQL clauses on the first attempt. They watch for clean syntax, correct column references, and whether you verify edge cases before declaring the query complete.

> **Common Pitfall**
>
> The most common mistake is misreading the prompt's filtering or grouping requirements. Double-check which columns to group by, which to aggregate, and whether the output should be filtered with `WHERE` (before grouping) or `HAVING` (after grouping).

---

## Common follow-up questions

- What would happen to your result if `rate_limits.blocked` contained duplicate values that you did not expect? _(Tests whether the candidate considers data quality issues in `blocked` and uses DISTINCT or deduplication where needed.)_
- If `rate_limits` grew to contain billions of rows, which part of your query would become the bottleneck given the cardinality of `allowed`? _(Tests ability to identify performance hotspots related to `rate_limits.allowed` at scale.)_
- If this query ran as a scheduled job, how would you add monitoring to detect when the result set is suspiciously empty? _(Tests operational awareness around scheduled query jobs.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/rate_limit_rules_per_endpoint)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.