# The Throttle Ceiling

> Too many requests in too short a timeframe. Throttle it.

Canonical URL: <https://datadriven.io/problems/the_throttle_ceiling>

Domain: Python · Difficulty: medium · Seniority: L4

## Problem

Implement a RateLimiter(max_requests, window_seconds). Each allow(timestamp) call returns True when the number of requests already recorded with timestamps in (timestamp - window_seconds, timestamp] is strictly less than max_requests, then records this timestamp. Returns False otherwise. The test harness passes parallel lists of calls and timestamps and expects a parallel list of booleans.

## Worked solution and explanation

### Why this problem exists in real interviews

Public APIs need to cap traffic per client without rejecting legitimate burstiness or permanently blacklisting anyone who hits the limit once. Interviewers use this to test class design, the sliding-window vs fixed-window distinction, and one specific trap: do denied requests count toward the next window? They want you to ask, then implement the answer correctly.

---

### Break down the requirements

#### Step 1: Lock the API the harness expects

The constructor takes `max_requests` and `window_seconds`. The decision method is `is_allowed(timestamp)` (note: not `allow`). The harness wraps the class in `run_rate_limiter(calls, max_requests, window_seconds)` and expects a parallel list of booleans.

#### Step 2: Evict timestamps outside the rolling window

For each new timestamp T, drop every recorded timestamp t where `T - t >= window_seconds`. The window is half-open: `(T - window, T]`. Anything older than the window is gone, anything inside it counts.

#### Step 3: Allow only when count is below the cap, never record denials

If the remaining count is strictly less than `max_requests`, append T and return True. Otherwise return False without recording. Recording denied requests would let one bad burst lock the client out forever, since each denial keeps the window full.

---

### The solution

**Sliding window limiter with denial-aware semantics**

```python
class RateLimiter:
    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window = window_seconds
        self.timestamps = []

    def is_allowed(self, timestamp):
        self.timestamps = [t for t in self.timestamps if timestamp - t < self.window]
        if len(self.timestamps) < self.max_requests:
            self.timestamps.append(timestamp)
            return True
        return False

def run_rate_limiter(calls, max_requests, window_seconds):
    limiter = RateLimiter(max_requests, window_seconds)
    return [limiter.is_allowed(t) for t in calls]
```

> **Cost Analysis**
>
> Time per call is O(k) where k is the number of timestamps currently in the window, bounded by `max_requests`. The list-comprehension rebuild is the dominant cost. Switching to a `collections.deque` and popping the left side while the head is older than the cutoff drops this to amortized O(1) per call (each timestamp is appended once and popped once across its lifetime). Space is O(max_requests).

> **Interviewers Watch For**
>
> Whether you ask the disambiguating question (sliding vs fixed window, do denials count) before coding, whether you pick `<` over `<=` correctly for the cutoff, and whether you avoid recording denied calls. Strong candidates also volunteer the deque optimization without being asked.

> **Common Pitfall**
>
> Recording the timestamp before the count check, or recording it inside the False branch. Either one inflates the in-window count with denials and bricks the client. The visible test with calls `[0, 4, 5]`, max 1, window 5 only passes if t=4 is denied and not stored, so that t=5 (5 - 0 = 5, which is not strictly less than the window of 5) sees an empty window and is allowed.

---

## Common follow-up questions

- How does this change when one limiter has to serve millions of independent clients in the same process? _(Tests the per-client extension. A dict of `client_id` to deque is the standard answer. Memory grows with active client count, so a follow-up about TTL eviction or LRU pruning is common.)_
- Why would a team intentionally pick a fixed-window limiter over the sliding version you wrote? _(Tests understanding that fixed windows allow a 2x burst at the boundary (N at the end of one window, N at the start of the next), while sliding windows do not. Counter is simpler and cheaper.)_
- How would you make this distributed across N API servers without each server keeping its own count? _(Tests Redis primitives. ZADD with timestamp scores plus ZREMRANGEBYSCORE for eviction and ZCARD for the count, wrapped in a Lua script for atomicity. Token bucket via INCR plus EXPIRE is the simpler alternative.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/the_throttle_ceiling)
- [Python Interview Questions](https://datadriven.io/python-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.