# Top N Keys

> Most of them do not matter. The few that do stand out.

Canonical URL: <https://datadriven.io/problems/top_n_keys>

Domain: Python · Difficulty: medium · Seniority: L3

## Problem

Given a dict mapping keys to numeric values and an integer n, return the n keys with the largest values, sorted by value descending, tie-break alphabetically by key. If fewer than n keys exist, return all of them.

## Worked solution and explanation

### Why this problem exists in real interviews

This tests **partial sorting** for top-N selection. It probes whether a candidate sorts the entire dict or uses a more efficient approach when only the top N are needed.

---

### Break down the requirements

#### Step 1: Sort dict keys by their values in descending order

The keys with the highest counts come first.

#### Step 2: Take the first N keys

Slice the sorted result to get only the top N.

---

### The solution

**Full sort with top-N slice**

```python
def top_n_keys(data: dict, n: int) -> list:
    sorted_keys = sorted(data.keys(), key=lambda k: data[k], reverse=True)
    result = []
    for i in range(min(n, len(sorted_keys))):
        result.append(sorted_keys[i])
    return result
```

> **Time and Space Complexity**
>
> **Time:** O(m log m) where m is the number of keys. For large m with small n, `heapq.nlargest` is O(m log n).
> 
> **Space:** O(m) for the sorted key list.

> **Interviewers Watch For**
>
> Mentioning `heapq.nlargest(n, data, key=data.get)` as an optimization when n is much smaller than the dict size.

> **Common Pitfall**
>
> Using `reverse=False` or forgetting to sort descending. The top N keys are the ones with the highest values, not the lowest.

---

## Common follow-up questions

- What if there are ties at the Nth position? _(Tests whether to include all tied keys or strictly limit to N.)_
- When does heapq.nlargest outperform sorted? _(Tests understanding that it is better when n is much smaller than the total count.)_
- How would you do this in a streaming context? _(Tests maintaining a min-heap of size N as new entries arrive.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/top_n_keys)
- [Python Interview Questions](https://datadriven.io/python-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.