# The Lazy Stream

> Yield values one at a time from a potentially infinite source.

Canonical URL: <https://datadriven.io/problems/the_lazy_stream>

Domain: Python · Difficulty: hard · Seniority: L5

## Problem

Given a list that may contain inner lists (one level of nesting), return a flat list in which each inner list is expanded element by element, in place. Strings count as leaves even though they are iterable. Do not recurse into deeper levels.

## Worked solution and explanation

### Why this problem exists in real interviews

Lazy flattening tests whether you understand generators, the string-is-iterable trap, and the difference between one level of flattening and full recursion. Interviewers use it to probe whether you can reach for `yield from` and articulate why a generator is the right abstraction even when the caller materializes the result.

---

### Break down the requirements

#### Step 1: Flatten exactly one level

If an element is a list, yield each of its items individually. Do not recurse into the items themselves. This is the bright line that separates this problem from `the_deep_unpacker`.

#### Step 2: Treat strings as leaves

Even though `str` is iterable, the spec says strings stay intact. Use `isinstance(item, list)` (not a generic iterable check) so a string like `'abc'` is yielded whole rather than as `'a', 'b', 'c'`.
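
To make the string trap concrete, the sketch below contrasts the `isinstance(item, list)` check with a generic iterable check. The helper names `expand_lists_only` and `expand_any_iterable` are hypothetical and exist only for this illustration.

```python
from collections.abc import Iterable


def expand_lists_only(items):
    """Expand inner lists only; strings pass through whole."""
    for item in items:
        if isinstance(item, list):
            yield from item
        else:
            yield item


def expand_any_iterable(items):
    """Buggy variant: a generic iterable check also splits strings."""
    for item in items:
        if isinstance(item, Iterable):
            yield from item
        else:
            yield item


data = ["abc", [1, 2], 3]
print(list(expand_lists_only(data)))    # ['abc', 1, 2, 3]
print(list(expand_any_iterable(data)))  # ['a', 'b', 'c', 1, 2, 3]
```

The second result shows why the narrower check matters: the string is iterated character by character as soon as the predicate accepts any iterable.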

#### Step 3: Use a generator and materialize at the boundary

Define an inner generator that uses `yield from item` for lists and yields the item directly otherwise. Wrap it with `list(...)` at the return so the public signature still gives back a list. The generator holds no intermediate buffer, so the only O(n) allocation is the final list, and dropping the `list(...)` call later turns the function into a true stream.

---

### The solution

**Generator with yield from for one level**

```python
def lazy_flatten(data: list) -> list:
    def _gen(items):
        for item in items:
            if isinstance(item, list):
                yield from item
            else:
                yield item
    return list(_gen(data))
```
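
A few calls illustrate the behaviour. The function is repeated so the snippet runs on its own, and the inputs are chosen for illustration rather than taken from an official test suite.

```python
def lazy_flatten(data: list) -> list:
    # Same function as above, repeated so the example is self-contained.
    def _gen(items):
        for item in items:
            if isinstance(item, list):
                yield from item
            else:
                yield item
    return list(_gen(data))


print(lazy_flatten([1, [2, 3], 4]))        # [1, 2, 3, 4]
print(lazy_flatten(["abc", ["de", "f"]]))  # ['abc', 'de', 'f']
print(lazy_flatten([1, [2, [3, 4]], 5]))   # [1, 2, [3, 4], 5]
print(lazy_flatten([[], 1, []]))           # [1]
print(lazy_flatten([]))                    # []
```

Note that empty inner lists contribute nothing to the output, and a list nested two levels down survives as a list.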

> **Cost Analysis**
>
> Time is O(n), where n is the number of top-level items plus the number of elements inside the inner lists (an empty inner list still costs one visit). Space is O(n) for the materialized list, but the generator itself uses O(1) auxiliary memory, which matters if a downstream caller switches to streaming.

> **Interviewers Watch For**
>
> Whether you use `yield from` instead of a nested for loop, whether you guard strings explicitly, and whether you can explain when returning the generator (instead of a list) would be the better API.

> **Common Pitfall**
>
> Recursing into nested lists turns this into deep flatten and breaks inputs like `[1, [2, [3, 4]], 5]` whose expected output is `[1, 2, [3, 4], 5]`. The 'one level only' rule is the whole point of the problem.

---

## Common follow-up questions

- How would you expose the generator directly instead of materializing? _(Drop the `list(...)` wrapper and rename to `iter_lazy_flatten`. Discuss when callers benefit (large or infinite inputs) and when materialization is safer.)_
- How would you flatten arbitrary depth while still treating strings as leaves? _(Recurse whenever `isinstance(item, list)` holds and `yield from` the recursive call. This is exactly the deep flatten variant.)_
- How would you handle generic iterables like tuples and sets? _(Replace `isinstance(item, list)` with a check for `Iterable` from `collections.abc` while still excluding `str` and `bytes`. Mention that sets have no defined iteration order, so expanding one breaks the in-place ordering guarantee.)_
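
The three follow-ups can be sketched as below. This is one possible shape under stated assumptions, not a reference answer: `iter_lazy_flatten` is the name suggested above, while `deep_flatten` and `flatten_iterables` are hypothetical names introduced for illustration.

```python
from collections.abc import Iterable, Iterator
from itertools import count, islice


def iter_lazy_flatten(items: Iterable) -> Iterator:
    """One-level flatten that returns the generator instead of a list."""
    for item in items:
        if isinstance(item, list):
            yield from item
        else:
            yield item


def deep_flatten(items: Iterable) -> Iterator:
    """Arbitrary-depth flatten; strings stay whole because they are not lists."""
    for item in items:
        if isinstance(item, list):
            yield from deep_flatten(item)
        else:
            yield item


def flatten_iterables(items: Iterable) -> Iterator:
    """One-level flatten over any iterable except str and bytes."""
    for item in items:
        if isinstance(item, Iterable) and not isinstance(item, (str, bytes)):
            yield from item
        else:
            yield item


# An infinite source of pairs: [0, 0], [1, 1], [2, 2], ...
# islice stops after five leaves, so the source is never exhausted.
pairs = ([n, n] for n in count())
print(list(islice(iter_lazy_flatten(pairs), 5)))            # [0, 0, 1, 1, 2]

print(list(deep_flatten([1, [2, [3, ["ab"]]], 5])))         # [1, 2, 3, 'ab', 5]
print(list(flatten_iterables([(1, 2), "ab", b"cd", [3]])))  # [1, 2, 'ab', b'cd', 3]
```

The infinite-source call only works because nothing materializes the stream; wrapping the same input in `list(...)` would never return. The recursive variant is also bounded by Python's recursion limit on very deeply nested input, and `flatten_iterables` would expand a dict into its keys, which may or may not be what a caller expects.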

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/the_lazy_stream)
- [Python Interview Questions](https://datadriven.io/python-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.