# The Matching Manifest

> Two warehouses, one shipment - only load what's in both.

Canonical URL: <https://datadriven.io/problems/the_matching_manifest>

Domain: Python · Difficulty: easy · Seniority: L3

## Problem

Given two lists of integers, return a sorted list of the distinct values present in both.

## Worked solution and explanation

### Why this problem exists in real interviews

This tests **set intersection** and the ability to deduplicate and sort results. It is a fundamental operation in data reconciliation and reveals whether candidates use set operations for O(n) lookup instead of nested loops.

---

### Break down the requirements

#### Step 1: Convert one list to a set

This enables O(1) membership checking when scanning the other list.

#### Step 2: Find common elements

Iterate through the second list and collect elements that exist in the set. Use a second set to avoid adding duplicates.

#### Step 3: Sort and return

The output must be sorted, so sort the collected common values before returning.

---

### The solution

**Set intersection with sorted output**

```python
def find_common(list1, list2):
    set1 = set(list1)
    seen = set()
    common = []
    for val in list2:
        if val in set1 and val not in seen:
            common.append(val)
            seen.add(val)
    common.sort()
    return common
```

> **Time and Space Complexity**
>
> **Time:** O(n + m + k log k) where n and m are list lengths and k is the number of common values. Building the set is O(n), scanning is O(m), sorting is O(k log k).
> 
> **Space:** O(n + k) for the set and the result list.

> **Interviewers Watch For**
>
> Do you use `set(list1) & set(list2)`? While concise, interviewers at this level want to see the manual approach to verify you understand why sets make lookups fast.

> **Common Pitfall**
>
> Forgetting deduplication. If `list2` contains `[3, 3, 3]` and 3 is in `list1`, naively appending on every match produces `[3, 3, 3]` instead of `[3]`.

---

## Common follow-up questions

- What if both lists were already sorted? _(Tests the two-pointer approach which avoids building a set entirely.)_
- What if you needed the count of each common value? _(Tests switching from set intersection to counter-based minimum.)_
- How would this work with millions of records from two databases? _(Tests distributed join strategies and memory constraints.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/the_matching_manifest)
- [Python Interview Questions](https://datadriven.io/python-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.