# The Frequency Eviction

> When storage is tight, something has to go.

Canonical URL: <https://datadriven.io/problems/the_frequency_eviction>

Domain: Python · Difficulty: hard · Seniority: L5

## Problem

Implement LFUCache(capacity) with get(key) returning the value (or -1 if missing) and put(key, value). When capacity is exceeded, evict the least-frequently-used entry. Ties are broken by LRU. Test harness passes parallel ['method', args] sequences and expects the list of return values (None for put, value or -1 for get).

## Worked solution and explanation

### Why this problem exists in real interviews

This tests **LFU cache design**, a harder variant of LRU that requires tracking access frequency and breaking ties by recency. It probes advanced data structure composition to achieve O(1) amortized operations.

> **Trick to Solving**
>
> Maintain a dict of frequency-to-OrderedDict mappings. Track each key's frequency and the global minimum frequency. On eviction, pop from the OrderedDict at `min_freq`. On access, move the key from its current frequency bucket to `freq + 1`.

---

### Break down the requirements

#### Step 1: Track key frequency and value

Each key needs a stored value and an access count.

#### Step 2: Maintain frequency buckets

Group keys by their access count using OrderedDicts so LRU ordering within each frequency is preserved.

#### Step 3: Evict the least frequent, least recent key

On capacity overflow, pop from the bucket at `min_freq`.

#### Step 4: Update min_freq on operations

On `put` for a new key, `min_freq` resets to 1. When evicting from `min_freq`, check if the bucket is empty.

---

### The solution

**Frequency-bucketed cache with OrderedDict tiers**

```python
from collections import OrderedDict, defaultdict
class LFUCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.key_val = {}
        self.key_freq = {}
        self.freq_keys = defaultdict(OrderedDict)
        self.min_freq = 0
    def get(self, key):
        if key not in self.key_val:
            return -1
        freq = self.key_freq[key]
        del self.freq_keys[freq][key]
        if not self.freq_keys[freq]:
            del self.freq_keys[freq]
            if self.min_freq == freq:
                self.min_freq += 1
        self.key_freq[key] = freq + 1
        self.freq_keys[freq + 1][key] = True
        return self.key_val[key]
    def put(self, key, value):
        if self.capacity <= 0:
            return
        if key in self.key_val:
            self.key_val[key] = value
            self.get(key)
            return
        if len(self.key_val) >= self.capacity:
            evict_key, _ = self.freq_keys[self.min_freq].popitem(last=False)
            del self.key_val[evict_key]
            del self.key_freq[evict_key]
        self.key_val[key] = value
        self.key_freq[key] = 1
        self.freq_keys[1][key] = True
        self.min_freq = 1
```

> **Time and Space Complexity**
>
> **Time:** O(1) amortized for both `get` and `put`. All dict and OrderedDict operations are O(1).
> 
> **Space:** O(capacity) for the stored keys across all frequency buckets.

> **Interviewers Watch For**
>
> Whether you handle `min_freq` correctly. When a key at `min_freq` is accessed and its bucket becomes empty, `min_freq` must increment. When a new key is inserted, `min_freq` resets to 1.

> **Common Pitfall**
>
> Calling `self.get(key)` inside `put` to update frequency works but returns -1 for new keys. Only call it for existing key updates.

---

## Common follow-up questions

- What is the tradeoff between LFU and LRU in practice? _(Tests that LFU resists scan pollution but is slower to adapt to workload shifts.)_
- How would you implement a windowed LFU that only counts recent accesses? _(Tests combining time-based decay with frequency counting.)_
- How do real caches like Redis implement eviction? _(Tests knowledge of approximate LFU using probabilistic counters.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/the_frequency_eviction)
- [Python Interview Questions](https://datadriven.io/python-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.