# The Consecutive Sequence Finder

> Numbers that flow without interruption.

Canonical URL: <https://datadriven.io/problems/the_consecutive_sequence_finder>

Domain: Python · Difficulty: medium · Seniority: L5

## Problem

Given a list of integers (possibly unordered with duplicates), return the length of the longest run of consecutive integers that can be formed (e.g., [100, 4, 200, 1, 3, 2] contains 1,2,3,4 -> 4).

## Worked solution and explanation

### Why this problem exists in real interviews

This tests **set-based algorithmic thinking** to achieve O(n) time without sorting. It probes whether a candidate can identify sequence start points and expand from them, a pattern useful in gap detection and partition analysis.

> **Trick to Solving**
>
> Convert the list to a set. A number is the start of a sequence only if `num - 1` is not in the set. From each start, count consecutive elements forward. This avoids O(n log n) sorting while still examining each element a bounded number of times.

---

### Break down the requirements

#### Step 1: Convert to a set for O(1) lookups

Duplicates are irrelevant for consecutive counting, and set membership is constant time.

#### Step 2: Identify sequence starting points

A number starts a sequence if `num - 1` is not in the set. This ensures each sequence is counted exactly once.

#### Step 3: Expand each sequence and track the longest

From each start, increment and count while the next number exists in the set.

---

### The solution

**Set-based sequence expansion**

```python
def longest_consecutive(nums: list) -> int:
    num_set = set(nums)
    longest = 0
    for num in num_set:
        if num - 1 not in num_set:
            current = num
            streak = 1
            while current + 1 in num_set:
                current += 1
                streak += 1
            if streak > longest:
                longest = streak
    return longest
```

> **Time and Space Complexity**
>
> **Time:** O(n). Each element is visited at most twice: once in the outer loop and once during sequence expansion.
> 
> **Space:** O(n) for the set.

> **Interviewers Watch For**
>
> The `num - 1 not in num_set` guard is the key insight. Without it, you expand from every element and degrade to O(n^2).

> **Common Pitfall**
>
> Sorting the input first. The problem explicitly asks for a solution without sorting, and sorting adds O(n log n).

---

## Common follow-up questions

- What if the input has duplicates? _(Tests that the set naturally deduplicates, so no special handling is needed.)_
- How would you return the actual sequence, not just its length? _(Tests tracking the start and building `range(start, start + length)`.)_
- What if the input is a stream and you cannot store all elements? _(Tests approximate algorithms or external sort strategies.)_
- Can you solve this in O(n log n) with sorting? When is that preferable? _(Tests understanding that sorting uses O(1) extra space vs O(n) for the set.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/the_consecutive_sequence_finder)
- [Python Interview Questions](https://datadriven.io/python-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.