# The Indivisibles

> Numbers that yield only to themselves.

Canonical URL: <https://datadriven.io/problems/the_indivisibles>

Domain: Python · Difficulty: easy · Seniority: L5

## Problem

Given a positive integer n, return all prime numbers strictly less than n in ascending order. Return an empty list when n <= 2.

## Worked solution and explanation

### Why this problem exists in real interviews

This tests the **Sieve of Eratosthenes**, the most efficient algorithm for generating all primes below a threshold. It probes array manipulation, boolean masking, and understanding of the sieve's mathematical basis.

> **Trick to Solving**
>
> Initialize a boolean array of size n where all entries are `True`. Starting from 2, mark all multiples of each prime as `False`. Only start marking from `p*p` since smaller multiples were already marked by smaller primes.

---

### Break down the requirements

#### Step 1: Create a boolean sieve array

Index i represents whether i is prime. Initialize all to `True`, then set 0 and 1 to `False`.

#### Step 2: Mark multiples of each prime

For each number p from 2 to sqrt(n), if `sieve[p]` is `True`, mark `p*p, p*p+p, ...` as `False`.

#### Step 3: Collect remaining True indices

These are the primes below n.

---

### The solution

**Sieve of Eratosthenes**

```python
def primes_below(n: int) -> list:
    if n < 2:
        return []
    sieve = [True] * n
    sieve[0] = False
    sieve[1] = False
    p = 2
    while p * p < n:
        if sieve[p]:
            multiple = p * p
            while multiple < n:
                sieve[multiple] = False
                multiple += p
        p += 1
    result = []
    for i in range(n):
        if sieve[i]:
            result.append(i)
    return result
```

> **Time and Space Complexity**
>
> **Time:** O(n log log n), nearly linear. This is the theoretical complexity of the sieve.
> 
> **Space:** O(n) for the boolean array.

> **Interviewers Watch For**
>
> Whether you start marking from `p*p` instead of `2*p`. Starting from `p*p` is the key optimization that avoids redundant work.

> **Common Pitfall**
>
> Using trial division for each number (O(n * sqrt(n))). The sieve is dramatically faster for generating all primes below a limit.

---

## Common follow-up questions

- How would you check if a single number is prime? _(Tests trial division up to sqrt(n), which is simpler for one-off checks.)_
- What is the segmented sieve? _(Tests memory optimization for very large ranges by processing in blocks.)_
- How many primes are there below n? _(Tests the prime number theorem: approximately n / ln(n).)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/the_indivisibles)
- [Python Interview Questions](https://datadriven.io/python-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.