# The Throttle Wall

> Stop the abusers. Let the rest through.

Canonical URL: <https://datadriven.io/problems/the_throttle_wall>

Domain: Python · Difficulty: hard · Seniority: L4

## Problem

Implement RateLimiter(max_requests, window_ms). is_allowed(client_id, timestamp) returns True when the client has strictly fewer than max_requests recorded in the window (timestamp - window_ms, timestamp], then records the current timestamp. Test harness: ['init', max, window] then ['is_allowed', client, ts] calls; returns parallel results (None for init, True/False for each is_allowed).

## Worked solution and explanation

### Why this problem exists in real interviews

API gateways at fintech and ad-tech shops have to throttle thousands of independent clients in the same process, and the FAANG bar is to do it without locking out a noisy client forever. Interviewers use this to test per-client state isolation, the sliding-window pattern in milliseconds (not seconds), and the harness contract where one constructor call returns None and each subsequent decision returns a bool.

---

### Break down the requirements

#### Step 1: Match the harness contract exactly

The constructor takes `max_requests` and `window_ms` (milliseconds, integer). The decision method is `is_allowed(client_id, timestamp)`. The harness loop reads `['init', max, window]` first (yielding None), then a stream of `['is_allowed', client, ts]` ops (yielding True or False each).

#### Step 2: Isolate state per client

A `defaultdict(deque)` keyed by `client_id` gives each client an independent timestamp queue without an explicit existence check. Sharing a single deque across clients lets one chatty client throttle everyone else, which the visible test catches: clientB at t=300 is allowed because its own deque is empty even though clientA is currently throttled.

#### Step 3: Evict in milliseconds, deny without recording

Pop from the head while `timestamp_ms - q[0] >= window_ms`. The window is half-open: a request exactly `window_ms` away from a stored timestamp falls outside and frees a slot. Then if `len(q) < max_requests`, append and return True; otherwise return False without recording.

---

### The solution

**Per-client sliding window in milliseconds**

```python
from collections import defaultdict, deque

class RateLimiter:
    def __init__(self, max_requests, window_ms):
        self.max_requests = max_requests
        self.window_ms = window_ms
        self.clients = defaultdict(deque)

    def is_allowed(self, client_id, timestamp_ms):
        q = self.clients[client_id]
        while q and timestamp_ms - q[0] >= self.window_ms:
            q.popleft()
        if len(q) < self.max_requests:
            q.append(timestamp_ms)
            return True
        return False

def run_throttle_wall(operations):
    obj = None
    results = []
    for op in operations:
        method = op[0]
        args = op[1:]
        if method == "init":
            obj = RateLimiter(*args)
            results.append(None)
        else:
            result = getattr(obj, method)(*args)
            results.append(result)
    return results
```

> **Cost Analysis**
>
> Time per `is_allowed` call is amortized O(1). Each timestamp is enqueued once and dequeued once across its lifetime in the window. Space is O(C * max_requests) where C is the number of active clients. At 10K simultaneous clients with `max_requests = 60`, that is roughly 600K integer slots, comfortably in process memory. The defaultdict never shrinks on its own, so a long-lived process needs a periodic sweep that drops empty deques to bound memory by truly active clients.

> **Interviewers Watch For**
>
> Whether you isolate state per client (defaultdict of deque vs a single shared deque), whether you pick `>=` for the eviction comparison so a request at exactly `window_ms` past an old one frees the slot (the visible test sends t=1101 with window=1000 and expects True), and whether you skip recording denied requests so a denial burst does not extend the cooldown.

> **Common Pitfall**
>
> Confusing the boundary direction. The visible test with max=2, window=1000 sends clientA at 100, 200, 300, 1101. The first two are allowed, 300 is denied (count is 2, not strictly less than 2), and 1101 is allowed because 1101 - 100 = 1001 which is at least 1000 so the t=100 entry evicts. Using `>` instead of `>=` keeps t=100 in the deque and wrongly denies t=1101.

---

## Common follow-up questions

- Your defaultdict grows forever as new client_ids arrive. How do you garbage-collect clients that have gone quiet? _(Tests memory hygiene in long-lived processes. A periodic sweep that pops the front of each deque and drops the entry when empty bounds memory to the number of clients seen recently. Last-seen timestamps plus an LRU eviction is the production answer.)_
- How does this design change when N API servers behind a load balancer all need to share one limiter per client? _(Tests Redis primitives. ZADD with timestamp as score, ZREMRANGEBYSCORE to evict, ZCARD to count, all inside a Lua script so the check-and-record is atomic across replicas. The deque-per-client design ports almost directly to a sorted set per client.)_
- What breaks if the harness sends timestamps that are not strictly increasing per client? _(Tests robustness. With multiple servers, timestamps can arrive out of order. The deque assumes monotonic input, so an out-of-order request can land mid-deque and break the eviction loop. Server-side timestamping or a sorted insertion fixes it.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/the_throttle_wall)
- [Python Interview Questions](https://datadriven.io/python-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.