# The Zigzag Encoder

> The message snakes its way across the rails.

Canonical URL: <https://datadriven.io/problems/the_zigzag_encoder>

Domain: Python · Difficulty: medium · Seniority: L4

## Problem

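Given a string `s` and an integer `rows` (>= 1), write `s` in a zigzag pattern across that number of rows: characters go down one row at a time from the top row to the bottom row, then back up to the top row, and so on. Then read the characters row by row, top to bottom, to produce the encoded output. If `rows == 1`, the output equals the input.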
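
For a concrete picture, the sketch below lays a sample input out on a grid, one column per character. The helper name `zigzag_grid` and the input `"PAYPALISHIRING"` are illustrative assumptions, not part of the problem statement.

```python
def zigzag_grid(s: str, num_rows: int) -> list[str]:
    """Return the zigzag layout as one space-padded string per row."""
    if num_rows == 1:
        return [s]
    grid = [[" "] * len(s) for _ in range(num_rows)]
    cycle = 2 * (num_rows - 1)  # characters in one down-and-up pass
    for col, ch in enumerate(s):
        pos = col % cycle
        # First half of the cycle moves down, second half moves back up.
        row = pos if pos < num_rows else cycle - pos
        grid[row][col] = ch
    return ["".join(line).rstrip() for line in grid]


for line in zigzag_grid("PAYPALISHIRING", 3):
    print(line)
# P   A   H   N
#  A P L S I I G
#   Y   I   R
```

Reading the three printed rows left to right and skipping the padding gives `PAHNAPLSIIGYIR`.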

## Worked solution and explanation

### Why this problem exists in real interviews

This tests **index pattern recognition** and **string manipulation**. The zigzag encoding requires distributing characters across rows in a specific pattern, then reading row by row. It probes whether a candidate can simulate the zigzag traversal or find a direct formula.

---

### Break down the requirements

#### Step 1: Create row buckets

Allocate one list per row to collect characters as they are distributed.
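
A small sketch of why the buckets should be built one at a time: in Python, multiplying a list of lists repeats the same inner list object, so every "row" would share one bucket. The variable names here are illustrative only.

```python
num_rows = 3

# Pitfall: three references to the SAME list object.
shared = [[]] * num_rows
shared[0].append("a")
print(shared)  # [['a'], ['a'], ['a']]

# One independent list per row.
independent = [[] for _ in range(num_rows)]
independent[0].append("a")
print(independent)  # [['a'], [], []]
```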

#### Step 2: Distribute characters in zigzag order

Move down through rows 0 to `num_rows - 1`, then back up from `num_rows - 2` to 1, repeating this cycle. One full cycle covers `2 * (num_rows - 1)` characters.
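
The cycle described above can be sketched as an explicit sequence of row indices. This is a minimal illustration, assuming a hypothetical helper named `row_sequence`; the solution further down uses a direction flag instead.

```python
from itertools import cycle, islice


def row_sequence(num_rows: int, count: int) -> list[int]:
    """Row index visited by each of the first `count` characters."""
    # Down: 0 .. num_rows-1, then up: num_rows-2 .. 1.
    # For num_rows == 1 the "up" range is empty and the cycle is just [0].
    one_cycle = list(range(num_rows)) + list(range(num_rows - 2, 0, -1))
    return list(islice(cycle(one_cycle), count))


print(row_sequence(4, 10))  # [0, 1, 2, 3, 2, 1, 0, 1, 2, 3]
```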

#### Step 3: Concatenate all rows

Join each row's characters, then join all rows to form the encoded string.

---

### The solution

**Row-bucket distribution with direction toggle**

```python
def zigzag_encode(s: str, num_rows: int) -> str:
    # One row is the identity. With num_rows >= len(s), each character
    # lands on its own row, so reading row by row also returns s.
    if num_rows <= 1 or num_rows >= len(s):
        return s
    # One independent bucket per row.
    rows = [[] for _ in range(num_rows)]
    current_row = 0
    going_down = True
    for ch in s:
        rows[current_row].append(ch)
        # Bounce at the top row and at the bottom row.
        if current_row == 0:
            going_down = True
        elif current_row == num_rows - 1:
            going_down = False
        current_row += 1 if going_down else -1
    # Read the rows top to bottom.
    return "".join("".join(row) for row in rows)
```

> **Time and Space Complexity**
>
> **Time:** O(n) where n is the length of the string. Each character is processed once.
> 
> **Space:** O(n) for the row buckets.

> **Interviewers Watch For**
>
> Handling the edge cases: `num_rows == 1` (no zigzag, return as-is) and `num_rows >= len(s)` (each character on its own row, also return as-is).
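
The two guards play different roles. Without the `num_rows == 1` check, the loop would step from row 0 to row 1, which does not exist. The `num_rows >= len(s)` check is only a shortcut, because the general loop already returns the input unchanged in that case. The variant below (a hypothetical `encode_no_shortcut`, written for illustration and not the solution above) drops the second guard to show this.

```python
def encode_no_shortcut(s: str, num_rows: int) -> str:
    """Zigzag encode without the num_rows >= len(s) early return."""
    if num_rows == 1:
        return s  # required: with one row there is no row 1 to step to
    rows = [[] for _ in range(num_rows)]
    row, step = 0, 1
    for ch in s:
        rows[row].append(ch)
        if row == 0:
            step = 1
        elif row == num_rows - 1:
            step = -1
        row += step
    return "".join("".join(r) for r in rows)


# More rows than characters: each character sits alone on its row.
print(encode_no_shortcut("abc", 5))  # abc
```

The cost of skipping the shortcut is allocating `num_rows` buckets even when most of them stay empty.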

> **Common Pitfall**
>
> Off-by-one in the direction toggle. The bounce happens at row 0 and row `num_rows - 1`, not at row 1 and `num_rows - 2`. Getting this wrong shifts the entire pattern.

---

## Common follow-up questions

- How would you decode a zigzag-encoded string? _(Tests computing how many characters go in each row, then distributing them back.)_
- Can you compute the output without simulating the zigzag? _(Tests the mathematical formula for which output position each input character maps to.)_
- What if the number of rows were very large? _(Tests that the edge case check `num_rows >= len(s)` avoids creating many empty rows.)_
- Where does zigzag encoding appear in practice? _(Tests awareness of the rail fence cipher in cryptography and certain image scan patterns.)_
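
The first two follow-ups can be sketched as below. The function names `zigzag_encode_direct` and `zigzag_decode` are hypothetical, and this is one possible approach rather than a reference answer. Both rely on the cycle length `2 * (num_rows - 1)`.

```python
def zigzag_encode_direct(s: str, num_rows: int) -> str:
    """Encode by index arithmetic, without simulating the traversal."""
    if num_rows == 1 or num_rows >= len(s):
        return s
    cycle = 2 * (num_rows - 1)
    out = []
    for r in range(num_rows):
        for start in range(r, len(s), cycle):
            out.append(s[start])  # character on the downward leg
            partner = start + cycle - 2 * r  # same row, upward leg
            # Top and bottom rows are visited once per cycle, so skip them.
            if 0 < r < num_rows - 1 and partner < len(s):
                out.append(s[partner])
    return "".join(out)


def zigzag_decode(encoded: str, num_rows: int) -> str:
    """Invert the encoding by counting how many characters each row holds."""
    n = len(encoded)
    if num_rows == 1 or num_rows >= n:
        return encoded
    cycle = 2 * (num_rows - 1)
    # Row visited by each original position.
    row_of = [min(i % cycle, cycle - i % cycle) for i in range(n)]
    counts = [0] * num_rows
    for r in row_of:
        counts[r] += 1
    # Cut the encoded string into one slice per row.
    row_iters = []
    start = 0
    for c in counts:
        row_iters.append(iter(encoded[start:start + c]))
        start += c
    # Walk the zigzag again, taking the next character from each row.
    return "".join(next(row_iters[r]) for r in row_of)


encoded = zigzag_encode_direct("PAYPALISHIRING", 4)
print(encoded)                    # PINALSIGYAHRPI
print(zigzag_decode(encoded, 4))  # PAYPALISHIRING
```

Both functions run in O(n) time. The decoder uses O(n) extra space for the row indices and slices.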

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/the_zigzag_encoder)
- [Python Interview Questions](https://datadriven.io/python-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.