# The Short Address

> Turn a big number into a compact alphanumeric code.

Canonical URL: <https://datadriven.io/problems/the_short_address>

Domain: Python · Difficulty: medium · Seniority: L4

## Problem

Given non-negative integer n, encode it in base-62 using the alphabet '0'..'9' then 'a'..'z' then 'A'..'Z'. Return the encoded string with the most significant digit first. 0 encodes to '0'.

## Worked solution and explanation

### Why this problem exists in real interviews

This tests **base conversion**, a fundamental computer science concept that appears in URL shorteners, hash encoding, and ID generation systems. It probes whether a candidate can implement modular arithmetic with a custom character set.

> **Trick to Solving**
>
> Recognize this as repeated division by 62, collecting remainders. Each remainder maps to a character in the `0-9a-zA-Z` alphabet. Build the result string in reverse order since you extract least-significant digits first.

---

### Break down the requirements

#### Step 1: Define the base-62 character set

The alphabet is `0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ`, giving 62 characters.

#### Step 2: Handle the zero edge case

The integer 0 should encode to `'0'`, not an empty string.

#### Step 3: Extract digits via repeated modulo and division

Take `n % 62` to get the current digit, map it to a character, then set `n = n // 62`. Repeat until n is 0.

#### Step 4: Reverse the collected characters

Since digits are extracted least-significant first, reverse the result to get the correct order.

---

### The solution

**Repeated division with custom alphabet**

```python
def encode_base62(n: int) -> str:
    chars = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        remainder = n % 62
        digits.append(chars[remainder])
        n = n // 62
    digits.reverse()
    result = "".join(digits)
    return result
```

> **Time and Space Complexity**
>
> **Time:** O(log n) where n is the input integer. Each iteration divides n by 62.
> 
> **Space:** O(log n) for the digits list.

> **Interviewers Watch For**
>
> Does the candidate handle 0 as a special case? Does the character set match the expected encoding (0-9 before a-z before A-Z)? Getting the alphabet order wrong breaks decode compatibility.

> **Common Pitfall**
>
> Building the string by prepending characters (`result = ch + result`) is O(n^2) due to string immutability. Appending to a list and reversing is O(n).

---

## Common follow-up questions

- How would you implement the decode function? _(Tests the reverse operation: multiply accumulated value by 62 and add each character's index.)_
- Why base-62 instead of base-64? _(Tests awareness that base-64 uses `+` and `/` which are not URL-safe without encoding.)_
- How would you guarantee short codes are globally unique in a distributed system? _(Tests knowledge of ID generation strategies like Snowflake IDs or database sequences.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/the_short_address)
- [Python Interview Questions](https://datadriven.io/python-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.