# Tokenize

> Split it apart. Keep the pieces.

Canonical URL: <https://datadriven.io/problems/tokenize>

Domain: Python · Difficulty: easy · Seniority: L3

## Problem

Given a string, split it on runs of whitespace of any length and return the list of non-empty tokens.

## Worked solution and explanation

### Why this problem exists in real interviews

This problem tests **string splitting** on irregular whitespace. It probes whether a candidate knows that Python's `str.split()`, called with no arguments, treats any run of spaces, tabs, or newlines as a single separator and ignores leading and trailing whitespace.

---

### Break down the requirements

#### Step 1: Split on whitespace

Use `.split()` (no arguments) to split on any whitespace sequence.

#### Step 2: Return the list of non-empty tokens

With no arguments, `.split()` treats each run of consecutive whitespace as a single separator and discards leading and trailing whitespace, so the result never contains empty strings.
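
To make the two steps concrete, here is a small sketch that runs the no-argument split on deliberately messy input. The sample string and the variable names `messy` and `tokens` are illustrative assumptions, not part of the problem statement.

```python
# Messy input: leading and trailing spaces, repeated spaces, a tab, and a newline.
messy = "  alpha   beta\tgamma\n delta  "

# No arguments: any run of whitespace is one separator, and no empty tokens appear.
tokens = messy.split()
print(tokens)  # ['alpha', 'beta', 'gamma', 'delta']
```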

---

### The solution

**Default split with automatic whitespace handling**

```python
def tokenize(s: str) -> list[str]:
    # With no arguments, split() breaks on any run of whitespace and drops empty tokens.
    return s.split()
```
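
As a quick check of the edge cases an interviewer is likely to ask about, the sketch below restates the one-line solution and calls it on empty, whitespace-only, and single-token input. The chosen inputs are assumptions made for illustration.

```python
def tokenize(s: str) -> list[str]:
    # Same approach as the solution: no-argument split.
    return s.split()

print(tokenize(""))          # []
print(tokenize("   \t\n"))   # []
print(tokenize("solo"))      # ['solo']
print(tokenize(" a  b "))    # ['a', 'b']
```

Empty and whitespace-only strings both produce an empty list, so no special-case branch is needed.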

> **Time and Space Complexity**
>
> **Time:** O(n) where n is the length of the string.
>
> **Space:** O(n) for the token list, since the tokens together hold at most n characters.
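
One way to see where the linear bounds come from is a hand-rolled single-pass scanner. This is only a sketch for reasoning about complexity, not a description of how CPython implements `str.split()`; the name `tokenize_manual` is hypothetical.

```python
def tokenize_manual(s: str) -> list[str]:
    tokens: list[str] = []
    start = None  # index where the current token began, or None between tokens

    # Each character is visited exactly once, which gives O(n) time.
    for i, ch in enumerate(s):
        if ch.isspace():
            if start is not None:
                tokens.append(s[start:i])  # close the token that just ended
                start = None
        elif start is None:
            start = i  # first non-whitespace character of a new token

    # Flush a token that runs to the end of the string.
    if start is not None:
        tokens.append(s[start:])
    return tokens
```

The slices copy each non-whitespace character at most once, so the output holds at most n characters in total.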

> **Interviewers Watch For**
>
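> Knowing the difference between `.split()` and `.split(' ')`. The latter splits only on the single space character and keeps the empty strings produced by consecutive spaces, while the former splits on any whitespace run and never returns empty strings.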

> **Common Pitfall**
>
> Using `.split(' ')` and then filtering out empty strings manually. This is needlessly verbose, and it only splits on the space character, so tabs and newlines stay embedded in the tokens. `.split()` handles all of this automatically.
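
The difference between the two calls is easiest to see side by side. The sketch below uses an assumed sample string containing a double space, a tab, and a trailing space; the variable names are illustrative.

```python
text = "a  b\tc "

# No-argument split: any whitespace run is a separator, no empty tokens.
default_split = text.split()
print(default_split)  # ['a', 'b', 'c']

# Explicit space delimiter: empty strings appear, and the tab is not a separator.
space_split = text.split(" ")
print(space_split)  # ['a', '', 'b\tc', '']

# Filtering the empties still leaves the tab inside a token.
filtered = [t for t in space_split if t]
print(filtered)  # ['a', 'b\tc']
```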

---

## Common follow-up questions

- What if you needed to split on a custom delimiter like `'|'`? _(Tests `.split('|')` and handling empty tokens.)_
- What if the input contained tab and newline characters? _(Tests that `.split()` handles all whitespace types.)_
- How would you tokenize while preserving the delimiters? _(Tests `re.split` with a capturing group to keep the separators.)_
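
The first and third follow-ups can be sketched as below. The helper names `split_on_pipe` and `tokenize_keep_delimiters` are hypothetical, and dropping empty fields in the pipe case is an assumption; some callers would want to keep them as meaningful blanks.

```python
import re


def split_on_pipe(s: str) -> list[str]:
    # An explicit delimiter keeps empty fields, so strip each field and
    # drop the empty ones (an assumed policy, not the only valid one).
    return [field.strip() for field in s.split("|") if field.strip()]


def tokenize_keep_delimiters(s: str) -> list[str]:
    # A capturing group makes re.split return the separators as well.
    # Empty strings at the edges are filtered out.
    return [piece for piece in re.split(r"(\s+)", s) if piece]


print(split_on_pipe("a||b | c|"))           # ['a', 'b', 'c']
print(tokenize_keep_delimiters("a  b\tc"))  # ['a', '  ', 'b', '\t', 'c']
```

Because the separators are preserved, joining the pieces from `tokenize_keep_delimiters` reproduces the original string.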

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/tokenize)
- [Python Interview Questions](https://datadriven.io/python-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.