# The Page Turner

> Nobody loads everything at once.

Canonical URL: <https://datadriven.io/problems/the_page_turner>

Domain: Python · Difficulty: medium · Seniority: L4

## Problem

Implement a `Paginator` class that takes a list of pages (each page is itself a list of items) and iterates through all items across all pages in order. It must support the Python iterator protocol (`__iter__` and `__next__`). The harness calls `list(Paginator(pages))` and expects the flattened item sequence.

## Worked solution and explanation

### Why this problem exists in real interviews

Pagination shows up in every API and every analytics pipeline. Interviewers use this prompt to check whether you understand the Python iterator protocol (`__iter__` and `__next__`), and whether you can flatten a nested structure lazily without materializing the whole feed in memory.

---

### Break down the requirements

#### Step 1: Make the class its own iterator

`__iter__` returns `self`, so `for item in Paginator(pages)` and `list(Paginator(pages))` both work. The harness calls `list(Paginator(pages))`, which calls `__iter__` once and then calls `__next__` repeatedly until it raises `StopIteration`.
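
To make the protocol concrete, here is a rough sketch of what `list(...)` does, written out by hand. The `consume` helper and the `Countdown` class are hypothetical names used only for illustration; they are not part of the problem or the harness.

```python
def consume(iterable):
    """Roughly what list(iterable) does, spelled out step by step."""
    iterator = iter(iterable)  # calls iterable.__iter__()
    items = []
    while True:
        try:
            items.append(next(iterator))  # calls iterator.__next__()
        except StopIteration:
            return items  # exhaustion is signalled by the exception


class Countdown:
    """Hypothetical minimal iterator whose __iter__ returns self."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        self.current -= 1
        return self.current + 1


print(consume(Countdown(3)))  # [3, 2, 1]
```

Because `__iter__` returns `self`, `iter(obj) is obj` holds, which is the property that marks an object as an iterator rather than just an iterable.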

#### Step 2: Track two indices, not one flat cursor

Keep `page_idx` for which page you are on and `item_idx` for the position inside that page. A single flat cursor would force you to either pre-flatten the input or recompute offsets on every call.
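
For contrast, this is roughly what a single flat cursor would require on each call if the input is not pre-flattened. The `locate` helper is a hypothetical illustration of the offset recomputation, not something the solution uses.

```python
def locate(pages, flat_idx):
    """Translate a flat cursor into (page_idx, item_idx) by walking page lengths."""
    for page_idx, page in enumerate(pages):
        if flat_idx < len(page):
            return page_idx, flat_idx
        flat_idx -= len(page)  # skip past this page and keep looking
    raise IndexError("flat index past the last item")


pages = [[1, 2], [], [3]]
print(locate(pages, 0))  # (0, 0)
print(locate(pages, 2))  # (2, 0)
```

Every lookup rescans the pages from the start, so the cost grows with the number of pages. Storing `page_idx` and `item_idx` directly avoids that rescan.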

#### Step 3: Skip empty pages and stop cleanly

Use a `while` loop in `__next__` so that an empty page advances `page_idx` without returning anything. When `page_idx` runs past the end, raise `StopIteration` (do not return `None`).
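
A small sketch of why returning `None` is not a substitute for raising `StopIteration`. The `ReturnsNone` class is a deliberately buggy, hypothetical iterator; `itertools.islice` caps the consumption so the demonstration terminates.

```python
from itertools import islice


class ReturnsNone:
    """Hypothetical buggy iterator: signals exhaustion with None."""

    def __init__(self, items):
        self.items = items
        self.idx = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.idx < len(self.items):
            item = self.items[self.idx]
            self.idx += 1
            return item
        return None  # bug: should be `raise StopIteration`


# Without the islice cap, list(...) would never finish.
leaked = list(islice(ReturnsNone([1, 2]), 5))
print(leaked)  # [1, 2, None, None, None]
```

The consumer has no way to tell a trailing `None` from a real item, so the loop never ends on its own. Raising `StopIteration` is the only signal `for` and `list` understand.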

---

### The solution

**Custom iterator with two-level index tracking**

```python
class Paginator:
    def __init__(self, pages: list[list]):
        self.pages = pages
        self.page_idx = 0
        self.item_idx = 0

    def __iter__(self):
        return self

    def __next__(self):
        while self.page_idx < len(self.pages):
            page = self.pages[self.page_idx]
            if self.item_idx < len(page):
                item = page[self.item_idx]
                self.item_idx += 1
                return item
            self.page_idx += 1
            self.item_idx = 0
        raise StopIteration
```
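
A quick way to sanity-check the behaviour, as a sketch: the class is repeated here only so the snippet runs on its own, and `itertools.chain.from_iterable` is used as an independent reference for the expected flattening. The sample inputs are made up for illustration.

```python
from itertools import chain


class Paginator:
    """Same class as in the solution, repeated so this snippet is standalone."""

    def __init__(self, pages: list[list]):
        self.pages = pages
        self.page_idx = 0
        self.item_idx = 0

    def __iter__(self):
        return self

    def __next__(self):
        while self.page_idx < len(self.pages):
            page = self.pages[self.page_idx]
            if self.item_idx < len(page):
                item = page[self.item_idx]
                self.item_idx += 1
                return item
            self.page_idx += 1
            self.item_idx = 0
        raise StopIteration


# Empty pages at the start, middle, and end are skipped.
pages = [[], [1, 2], [], [3], []]
assert list(Paginator(pages)) == list(chain.from_iterable(pages)) == [1, 2, 3]

# No pages at all, or only empty pages, yields nothing.
assert list(Paginator([])) == []
assert list(Paginator([[], []])) == []

# The object is a single-pass iterator: a second pass sees nothing.
p = Paginator([[1], [2]])
first_pass = list(p)
second_pass = list(p)
print(first_pass, second_pass)  # [1, 2] []
```

The last check shows a consequence of `__iter__` returning `self`: once exhausted, the same instance stays exhausted, so a fresh `Paginator` is needed for each pass.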

> **Cost Analysis**
>
> Time: O(N + P) to consume the whole iterator, where N is the total number of items and P is the number of pages. Each `__next__` call is O(1) amortized, and each page boundary (including every empty page) costs one extra advance step. Space: O(1) auxiliary state. The pages list is referenced, never copied.

> **Interviewers Watch For**
>
> Whether `__iter__` returns `self` (so the object works directly in a `for` loop as a single-pass iterator), whether you raise `StopIteration` instead of returning a sentinel, and whether empty pages are handled without a special case. Strong candidates also mention that a generator function implements this same protocol automatically.

> **Common Pitfall**
>
> Flattening the input with `[item for page in pages for item in page]` inside `__init__`. That defeats the point of writing an iterator (you allocate O(N) upfront), and it forces every page to be read in full before the first item is returned, which becomes a problem the moment a page is a lazy stream rather than a list.

---

## Common follow-up questions

- Rewrite `Paginator` as a generator function. Which version is shorter, and why might production code still prefer the class form? _(Tests whether the candidate knows generators implement the iterator protocol for free, and that classes win when you need to expose extra state (current page, items seen) for debugging or resumption.)_
- How would you support `peek()` to look at the next item without consuming it? _(Tests buffering. The clean answer is to store one lookahead item and have `__next__` return the buffer first. Watch whether they handle the case where the buffer is empty after exhaustion.)_
- What changes if `pages` is itself a generator of pages rather than a list of lists? _(Tests understanding of `len()` versus iteration. They cannot index `self.pages[self.page_idx]` or call `len(self.pages)`; they must advance the outer generator with `next()` and let its `StopIteration` end the iteration.)_
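
The follow-ups above can be sketched together as follows. This is one possible shape under stated assumptions, not a reference answer: `paginate`, `Peekable`, `fetch_pages`, and the `_MISSING` sentinel are all hypothetical names introduced for illustration.

```python
_MISSING = object()  # sentinel, so a buffered None is not mistaken for "empty"


def paginate(pages):
    """Generator version: accepts a list of pages or any lazy iterable of pages."""
    for page in pages:
        yield from page  # an empty page simply yields nothing


class Peekable:
    """Hypothetical wrapper adding peek() to any iterable via a one-item buffer."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._buffer = _MISSING

    def __iter__(self):
        return self

    def __next__(self):
        if self._buffer is not _MISSING:
            # Hand out the buffered lookahead item first.
            item, self._buffer = self._buffer, _MISSING
            return item
        return next(self._it)  # StopIteration propagates when exhausted

    def peek(self, default=None):
        """Return the next item without consuming it, or default if exhausted."""
        if self._buffer is _MISSING:
            self._buffer = next(self._it, _MISSING)
        return default if self._buffer is _MISSING else self._buffer


def fetch_pages():
    """Hypothetical lazy page source standing in for a paged API client."""
    yield [1, 2]
    yield []
    yield [3]


items = Peekable(paginate(fetch_pages()))
first_peek = items.peek()   # looks at 1 without consuming it
rest = list(items)          # the peeked item is still delivered
after = items.peek("done")  # exhausted, so the default comes back
print(first_peek, rest, after)  # 1 [1, 2, 3] done
```

The generator form never calls `len()` or indexes into `pages`, which is why it handles a generator of pages unchanged. The class form from the solution remains the better fit when `page_idx` and `item_idx` need to be inspected for debugging or resumption.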

---

Source: DataDriven (https://datadriven.io).