# The Column Zipper

> Headers on top, values below, dict in the middle.

Canonical URL: <https://datadriven.io/problems/the_column_zipper>

Domain: Python · Difficulty: easy · Seniority: L3

## Problem

Given `headers` (a list of column names) and `rows` (a list of row lists, each the same length as `headers`), return a list of dicts, one per row, where each dict maps every header to the value at the same position in that row.

## Worked solution and explanation

### Why this problem exists in real interviews

This tests the ability to **zip parallel sequences into structured records**, a fundamental data transformation. It mirrors how CSV parsers combine header rows with data rows, and probes comfort with index alignment and dict construction.

---

### Break down the requirements

#### Step 1: Pair each row with the headers

For each row list, create a dict by zipping headers with row values.

#### Step 2: Handle alignment

Each value at index i maps to the header at index i. Assume headers and row lengths match.

#### Step 3: Return the list of dicts

Each dict represents one row of structured data.

---

### The solution

**Header-row zipping into dicts**

```python
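def zip_columns(headers: list, rows: list) -> list:
    """Return one dict per row, mapping each header to the value at the same index."""
    result = []
    for row in rows:
        record = {}
        # Walk the header positions; assumes len(row) == len(headers).
        for i in range(len(headers)):
            record[headers[i]] = row[i]
        result.append(record)
    return result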
```

> **Time and Space Complexity**
>
> **Time:** O(n * c) where n is the number of rows and c is the number of columns.
>
> **Space:** O(n * c) for the output list of dicts.

> **Interviewers Watch For**
>
> Whether you reach for `zip(headers, row)` or index manually. Both work, but being able to write the manual version shows you understand what `zip` does under the hood.
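
For comparison, the `zip`-based variant might look like the sketch below. The function name `zip_columns_with_zip` and the sample data are illustrative assumptions, not part of the problem statement.

```python
def zip_columns_with_zip(headers: list, rows: list) -> list:
    # zip pairs headers[i] with row[i] and stops at the shorter input;
    # dict() then turns those (header, value) pairs into one record.
    return [dict(zip(headers, row)) for row in rows]


print(zip_columns_with_zip(["id", "name"], [[1, "Ada"], [2, "Grace"]]))
# [{'id': 1, 'name': 'Ada'}, {'id': 2, 'name': 'Grace'}]
```

For inputs whose lengths match, this should return the same records as the manual loop; the two only diverge when a row is shorter or longer than the headers.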

> **Common Pitfall**
>
> Assuming every row has the same length as `headers`. When lengths do not match, the manual loop raises `IndexError` on a short row and silently drops the extra values of a long row, while `zip` silently truncates in both cases.
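
One way to guard against this is to check each row's length before building the record. The sketch below is a minimal illustration; the function name `zip_columns_strict` and the error message format are assumptions. On Python 3.10 or later, `zip(headers, row, strict=True)` raises `ValueError` on a mismatch as well, though without the row position in the message.

```python
def zip_columns_strict(headers: list, rows: list) -> list:
    result = []
    for row_index, row in enumerate(rows):
        # Fail loudly instead of truncating or raising a bare IndexError.
        if len(row) != len(headers):
            raise ValueError(
                f"row {row_index} has {len(row)} values, expected {len(headers)}"
            )
        result.append(dict(zip(headers, row)))
    return result


try:
    zip_columns_strict(["id", "name"], [[1, "Ada"], [2]])
except ValueError as exc:
    print(exc)  # row 1 has 1 values, expected 2
```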

---

## Common follow-up questions

- What if some rows have fewer values than headers? _(Tests defensive coding: pad with `None` or use `zip_longest`.)_
- How would you reverse this operation? _(Tests extracting headers from dict keys and values into parallel lists.)_
- What if the dataset has millions of rows? _(Tests generator-based approach for memory efficiency.)_
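
The three follow-ups above could be approached along the lines of the sketch below. All three function names are hypothetical. Truncating long rows in the padded version and taking headers from the first record in the reverse version are assumptions about the desired behavior, not requirements from the problem.

```python
from itertools import zip_longest
from typing import Iterable, Iterator


def zip_columns_padded(headers: list, rows: list) -> list:
    # zip_longest fills missing values with None for short rows.
    # Slicing the row first keeps a long row from producing a None key.
    return [dict(zip_longest(headers, row[: len(headers)])) for row in rows]


def unzip_columns(records: list) -> tuple:
    # Reverse operation: recover (headers, rows) from a list of dicts.
    # Assumes every record has the same keys as the first one.
    if not records:
        return [], []
    headers = list(records[0])
    rows = [[record[header] for header in headers] for record in records]
    return headers, rows


def iter_columns(headers: list, rows: Iterable) -> Iterator[dict]:
    # Generator version: yields one record at a time, so the full
    # output list never has to sit in memory.
    for row in rows:
        yield dict(zip(headers, row))


print(zip_columns_padded(["a", "b", "c"], [[1], [1, 2, 3, 4]]))
# [{'a': 1, 'b': None, 'c': None}, {'a': 1, 'b': 2, 'c': 3}]
```

The generator version only saves memory if the caller also consumes `rows` lazily, for example by passing a file reader or another generator rather than a prebuilt list.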

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/the_column_zipper)
- [Python Interview Questions](https://datadriven.io/python-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io).