# The Address Surgeon

> One string hides a street, a city, a state, and a zip.

Canonical URL: <https://datadriven.io/problems/the_address_surgeon>

Domain: Python · Difficulty: easy · Seniority: L3

## Problem

Given a US address string in the format `<street>, <city>, <state> <zip>`, return a dict with the keys `'street'`, `'city'`, `'state'`, and `'zip'`. Assume the format is consistent.

## Worked solution and explanation

### Why this problem exists in real interviews

Parsing a structured string into named fields tests **delimiter-based splitting** and whether you handle multi-level parsing (comma-separated, then space-separated) correctly.

---

### Break down the requirements

#### Step 1: Split by comma to separate major sections

The address has three comma-separated parts: street, city, and state+zip.

#### Step 2: Extract state and zip from the last section

Split the third part by space to separate the state abbreviation from the zip code.

#### Step 3: Strip whitespace from each field

Leading/trailing spaces after comma splits should be removed.
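
The three steps can be traced on a single input to see what each split leaves behind. This is only an illustrative walk-through; the sample address is a made-up value, not part of the problem statement.

```python
# Hypothetical sample input, chosen only for illustration.
address = "742 Evergreen Terrace, Springfield, IL 62704"

# Step 1: split on commas into three sections.
parts = address.split(",")
print(parts)
# ['742 Evergreen Terrace', ' Springfield', ' IL 62704']

# Step 2: split the last section on whitespace to get state and zip.
# split() with no argument also discards the leading space.
state, zip_code = parts[2].split()
print(state, zip_code)
# IL 62704

# Step 3: strip the leftover spaces from the remaining fields.
street = parts[0].strip()
city = parts[1].strip()
print(repr(city))
# 'Springfield'
```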

---

### The solution

**Multi-level delimiter parsing with strip**

```python
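def parse_address(address):
    # Split into [street, city, "state zip"]; assumes exactly two commas.
    parts = address.split(',')
    # Each piece after the first keeps the space that followed its comma.
    street = parts[0].strip()
    city = parts[1].strip()
    # split() with no argument splits on any run of whitespace and
    # ignores leading/trailing spaces, so no separate strip is needed.
    state_zip = parts[2].split()
    state = state_zip[0]
    zip_code = state_zip[1]
    result = {
        'street': street,
        'city': city,
        'state': state,
        'zip': zip_code
    }
    return result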
```
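
A more compact variant is sketched below as an alternative idiom, not as the canonical answer. It relies on tuple unpacking, which has a side effect worth knowing: if the input does not have exactly three comma-separated parts, Python raises `ValueError` instead of silently returning wrong fields. The name `parse_address_compact` and the sample address are hypothetical.

```python
def parse_address_compact(address):
    # Unpacking into exactly three names raises ValueError when the
    # number of comma-separated parts is not three.
    street, city, state_zip = (part.strip() for part in address.split(","))
    # split() with no argument splits on any run of whitespace.
    state, zip_code = state_zip.split()
    return {"street": street, "city": city, "state": state, "zip": zip_code}


print(parse_address_compact("742 Evergreen Terrace, Springfield, IL 62704"))
# {'street': '742 Evergreen Terrace', 'city': 'Springfield', 'state': 'IL', 'zip': '62704'}
```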

> **Time and Space Complexity**
>
> **Time:** O(n) where n is the address string length.
> 
> **Space:** O(n) for the split parts and result dict.

> **Interviewers Watch For**
>
> Calling `.strip()` on each part after the comma split. Without stripping, the city would have a leading space.

> **Common Pitfall**
>
> Not stripping whitespace after splitting by comma. `'Springfield, IL 62704'.split(',')` produces `['Springfield', ' IL 62704']` with a leading space.

---

## Common follow-up questions

- What if the address format varies (e.g., no zip code)? _(Tests defensive parsing with length checks on split results.)_
- How would you parse addresses from multiple countries? _(Tests using regex with named groups or a dedicated address parsing library.)_
- What if the street contains commas? _(Tests using `rsplit(',', maxsplit=2)` to split from the right, so any extra commas stay inside the street field.)_
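
One possible way to approach the first and third follow-ups is sketched here, under assumptions that go beyond the original problem: the street may contain commas, the zip may be missing, and a missing zip is reported as `None`. The function name `parse_address_safe` and the sample inputs are hypothetical. Splitting from the right with `str.rsplit` keeps extra commas inside the street, whereas splitting from the left with `maxsplit=2` would cut the street at its first comma.

```python
def parse_address_safe(address):
    # Split from the right at most twice, so the last two commas
    # delimit city and state/zip and earlier commas stay in the street.
    parts = [part.strip() for part in address.rsplit(",", maxsplit=2)]
    if len(parts) != 3:
        raise ValueError(f"expected '<street>, <city>, <state> <zip>', got {address!r}")

    street, city, state_zip = parts
    tokens = state_zip.split()
    if not tokens:
        raise ValueError(f"missing state in {address!r}")

    state = tokens[0]
    # Assumption: a missing zip is represented as None rather than an error.
    zip_code = tokens[1] if len(tokens) > 1 else None
    return {"street": street, "city": city, "state": state, "zip": zip_code}


print(parse_address_safe("12 Main St, Apt 4, Springfield, IL 62704"))
# {'street': '12 Main St, Apt 4', 'city': 'Springfield', 'state': 'IL', 'zip': '62704'}
print(parse_address_safe("1 Elm St, Portland, OR"))
# {'street': '1 Elm St', 'city': 'Portland', 'state': 'OR', 'zip': None}
```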

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/the_address_surgeon)
- [Python Interview Questions](https://datadriven.io/python-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io).