# The Overlap Finder

> Two guest lists - who made it onto both?

Canonical URL: <https://datadriven.io/problems/the_overlap_finder>

Domain: Python · Difficulty: easy · Seniority: L3

## Problem

Given two lists of integers, return a sorted list of the distinct values that appear in BOTH.

## Worked solution and explanation

### Why this problem exists in real interviews

This tests **set-based intersection and deduplication**, the same core pattern as data reconciliation. It verifies that candidates can find common values between two datasets efficiently.

---

### Break down the requirements

#### Step 1: Convert one list to a set

This enables O(1) lookups when checking elements from the other list.

#### Step 2: Find common values

Scan the second list and collect elements present in the set, tracking which have already been added to avoid duplicates.

#### Step 3: Sort and return

Return the common values in sorted order.

---

### The solution

**Set intersection with dedup and sort**

```python
def find_overlap(list1, list2):
    set1 = set(list1)
    found = set()
    result = []
    for val in list2:
        if val in set1 and val not in found:
            result.append(val)
            found.add(val)
    result.sort()
    return result
```

> **Time and Space Complexity**
>
> **Time:** O(n + m + k log k) where n and m are list sizes and k is the number of common values.
> 
> **Space:** O(n + k) for the set and result.

> **Interviewers Watch For**
>
> Do you mention that `set(list1) & set(list2)` is a one-liner alternative? Both approaches are valid, but knowing the concise version shows Python fluency.

> **Common Pitfall**
>
> Not deduplicating the result. If both lists contain `[3, 3]`, the output should have 3 only once.

---

## Common follow-up questions

- What if both lists are already sorted? _(Tests the two-pointer approach for O(n + m) without extra space.)_
- What if you needed values in list1 but not in list2? _(Tests set difference instead of intersection.)_
- How would this scale to billions of records? _(Tests distributed set operations or hash-based joins.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/the_overlap_finder)
- [Python Interview Questions](https://datadriven.io/python-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.