# The File Tree Builder

> Flat paths. Build the nested tree.

Canonical URL: <https://datadriven.io/problems/the_file_tree_builder>

Domain: Python · Difficulty: medium · Seniority: L5

## Problem

Given a list of file paths (slash-separated), build a nested dict representing the directory tree. Directories map to inner dicts; files map to None. Return the resulting dict.

## Worked solution and explanation

### Why this problem exists in real interviews

This tests **trie-like nested dict construction from flat paths**, a pattern used in file system indexing, config trees, and URL routing. It probes iterative dict traversal and dynamic nested structure creation.

---

### Break down the requirements

#### Step 1: Split each path into segments

A path like `src/main/app.py` becomes `['src', 'main', 'app.py']`.

#### Step 2: Navigate and create intermediate dicts

For each segment except the last, create a nested dict if it does not exist. Navigate into it.

#### Step 3: Mark leaf files as None

The last segment is the file. Set its value to `None`.

---

### The solution

**Iterative nested dict construction**

```python
def build_tree(paths: list) -> dict:
    tree = {}
    for path in paths:
        parts = path.split('/')
        current = tree
        for i in range(len(parts) - 1):
            segment = parts[i]
            if segment not in current:
                current[segment] = {}
            current = current[segment]
        leaf = parts[-1]
        current[leaf] = None
    return tree
```

> **Time and Space Complexity**
>
> **Time:** O(n * d) where n is the number of paths and d is the average depth per path.
> 
> **Space:** O(n * d) for the nested dict structure.

> **Interviewers Watch For**
>
> Whether you handle shared prefixes correctly. Paths `src/a.py` and `src/b.py` should share the `src` dict node.

> **Common Pitfall**
>
> Overwriting a directory dict with `None` when a later path treats the same name as a file. In real file systems, a name cannot be both a file and a directory.

---

## Common follow-up questions

- How would you flatten the tree back into paths? _(Tests recursive DFS with path accumulation.)_
- What if you need to count files per directory? _(Tests post-order traversal counting leaves.)_
- How would you handle paths with trailing slashes? _(Tests stripping empty segments from the split result.)_
- What if two paths conflict (one is a file, another treats it as a directory)? _(Tests validation and conflict resolution.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/the_file_tree_builder)
- [Python Interview Questions](https://datadriven.io/python-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.