# The Number Screen

> Some numbers make the cut. Most do not.

Canonical URL: <https://datadriven.io/problems/the_number_screen>

Domain: Python · Difficulty: easy · Seniority: L3

## Problem

Return all prime integers strictly less than n, in ascending order. For n <= 2, return an empty list. Do not use any math libraries; implement divisibility checks yourself.

## Worked solution and explanation

### Why this problem exists in real interviews

Prime sieving is the classic stand in for whether you remember the sqrt optimization, can implement the Sieve of Eratosthenes from scratch, and reason about the difference between strict and non strict upper bounds. Interviewers care less about the specific algorithm and more about how you justify the complexity.

---

### Break down the requirements

#### Step 1: Handle the small n base case

For `n <= 2` return `[]` immediately. The smallest prime is 2 and the spec says strictly less than n, so `n == 2` yields no primes. Stating this base case up front prevents off by one errors in the sieve loop.

#### Step 2: Build a boolean sieve of length n

Mark `is_prime = [True] * n`, set indices 0 and 1 to False, then for each i from 2 up to sqrt(n) cross off multiples starting at `i*i`. Starting at `i*i` not `2*i` skips work already done by smaller primes.

#### Step 3: Collect indices that survived

Return `[i for i in range(2, n) if is_prime[i]]`. The strict `range(2, n)` matches the spec's strict less than rule without an extra comparison.

---

### The solution

**Sieve of Eratosthenes with sqrt cutoff**

```python
def primes_less_than(n: int) -> list[int]:
    if n <= 2:
        return []
    is_prime = [True] * n
    is_prime[0] = is_prime[1] = False
    i = 2
    while i * i < n:
        if is_prime[i]:
            for multiple in range(i * i, n, i):
                is_prime[multiple] = False
        i += 1
    return [i for i in range(2, n) if is_prime[i]]
```

> **Cost Analysis**
>
> Time is O(n log log n) thanks to the sieve, far better than O(n sqrt(n)) trial division. Space is O(n) for the boolean array. The sqrt cutoff on the outer loop is the single optimization that turns a textbook answer into a fast one.

> **Interviewers Watch For**
>
> Whether you start the inner loop at `i*i`, whether you justify the sqrt outer bound, and whether you treat the strict less than rule correctly. Many candidates write `range(2, n + 1)` out of habit and ship a bug.

> **Common Pitfall**
>
> Implementing trial division per number with `for d in range(2, num)` is correct but O(n^2) and gets timed out. The sieve is the expected algorithm for any range request larger than a few thousand.

---

## Common follow-up questions

- How would you compute primes up to 1 billion? _(Switch to a segmented sieve so memory stays bounded by sqrt(n). Discuss why a single boolean array of 1 billion entries is infeasible.)_
- How would you return only the count of primes, not the list? _(Sum the boolean array directly with `sum(is_prime[2:])`. Saves the list materialization cost when the caller only needs cardinality.)_
- What changes if the bound is inclusive instead of strict? _(Use `range(2, n + 1)` and size the sieve to `n + 1`. Walk through the off by one carefully.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/the_number_screen)
- [Python Interview Questions](https://datadriven.io/python-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.