# The Event Window

> A five-minute window is all that matters.

Canonical URL: <https://datadriven.io/problems/the_event_window>

Domain: Python · Difficulty: medium · Seniority: L4

## Problem

Implement HitCounter with hit(timestamp) (record an event) and get_hits(timestamp) (return number of hits in the last 300 seconds up to and including timestamp, i.e., hits h with timestamp - 300 < h <= timestamp). Test harness returns None for hit and the count for get_hits.

## Worked solution and explanation

### Why this problem exists in real interviews

This tests **sliding window data structure design** for time-series counting. It probes whether a candidate can maintain a compact structure that supports efficient insertion and range queries over a rolling time window.

---

### Break down the requirements

#### Step 1: Record timestamps on hit

Store each hit timestamp in an ordered structure.

#### Step 2: Query hits within the last 300 seconds

Count all stored timestamps that fall within `[timestamp - 299, timestamp]`.

#### Step 3: Optimize for repeated queries

Use a queue or circular buffer to discard expired timestamps.

---

### The solution

**Queue-based hit counter with expiration pruning**

```python
from collections import deque
class HitCounter:
    def __init__(self):
        self.hits = deque()
    def hit(self, timestamp: int):
        self.hits.append(timestamp)
    def get_hits(self, timestamp: int) -> int:
        cutoff = timestamp - 299
        while self.hits and self.hits[0] < cutoff:
            self.hits.popleft()
        count = len(self.hits)
        return count
```

> **Time and Space Complexity**
>
> **Time:** `hit` is O(1). `get_hits` is amortized O(1) since each timestamp is added and removed at most once.
> 
> **Space:** O(w) where w is the number of hits within the current 300-second window.

> **Interviewers Watch For**
>
> Whether you prune expired entries lazily (on query) vs eagerly (on insertion). Lazy pruning in `get_hits` is simpler and achieves the same amortized cost.

> **Common Pitfall**
>
> Using a list and searching with binary search on every query. This is O(log n) per query vs O(1) amortized with a deque.

---

## Common follow-up questions

- What if timestamps can arrive out of order? _(Tests maintaining sorted order or using a different data structure.)_
- How would you support configurable window sizes? _(Tests parameterizing the 300-second constant.)_
- What if you need the hit rate (hits per second)? _(Tests dividing count by window size.)_
- How would you handle millions of hits per second? _(Tests bucket-based counting (one counter per second) instead of storing individual timestamps.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/the_event_window)
- [Python Interview Questions](https://datadriven.io/python-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.