# The Duplicate Spotter

> Some values appear more than once - report only those.

Canonical URL: <https://datadriven.io/problems/the_duplicate_spotter>

Domain: Python · Difficulty: easy · Seniority: L3

## Problem

Given a list of integers, return a list of the distinct values that appear more than once. Output order: ascending.

## Worked solution and explanation

### Why this problem exists in real interviews

This tests **set-based counting** to identify duplicates in a single pass. It probes whether a candidate can distinguish between "seen once" and "seen multiple times" using two sets.

---

### Break down the requirements

#### Step 1: Track seen elements

Use a set to record every value encountered so far.

#### Step 2: Track duplicates

When a value is already in the seen set, add it to the duplicates set.

#### Step 3: Return duplicates as a list

Convert the duplicates set to a list for the output.

---

### The solution

**Two-set duplicate detection**

```python
def find_duplicates(nums: list) -> list:
    seen = set()
    duplicates = set()
    for num in nums:
        if num in seen:
            duplicates.add(num)
        else:
            seen.add(num)
    result = list(duplicates)
    return result
```

> **Time and Space Complexity**
>
> **Time:** O(n) single pass with O(1) set lookups.
> 
> **Space:** O(n) for the two sets in the worst case.

> **Interviewers Watch For**
>
> Whether you use two sets instead of a frequency dict. Both work, but the two-set approach is more space-efficient when you only need the binary "duplicate or not" signal.

> **Common Pitfall**
>
> Adding every duplicate occurrence to a list instead of a set, producing duplicate entries in the result. Using a set for duplicates prevents this.

---

## Common follow-up questions

- What if order of the output matters? _(Tests using a list with an `in` check or maintaining insertion order.)_
- How would you find elements that appear exactly k times? _(Tests upgrading to a full frequency dict.)_
- What if the input is sorted? _(Tests that adjacent comparison makes duplicates trivial to detect in O(1) space.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/the_duplicate_spotter)
- [Python Interview Questions](https://datadriven.io/python-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.