# The Timezone Trap

> Trip data and timezones. They're not the same thing.

Canonical URL: <https://datadriven.io/problems/the_timezone_trap>

Domain: Python · Difficulty: medium · Seniority: L4

## Problem

Given a list of trip dicts (each with 'city', 'status', 'completed_utc' in ISO format with 'Z'), filter to city='San Francisco' AND status='completed'. Convert completed_utc to Pacific time (UTC-8 fixed, ignoring DST for simplicity). Per YYYY-MM (in Pacific time), count completions. Return a dict {year-month: count}.

## Worked solution and explanation

### Why this problem exists in real interviews

This tests **timezone-aware datetime handling**, a notoriously tricky area in data engineering. Converting UTC timestamps to local time before aggregating is a common ETL requirement that probes awareness of DST, UTC offsets, and correct grouping.

---

### Break down the requirements

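#### Step 1: Filter trips and parse UTC timestamps

Keep only trips where `city` is `'San Francisco'` and `status` is `'completed'`. For each remaining trip, read `completed_utc` and parse it as a UTC-aware datetime.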
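
The parsing step can be sketched as follows. This is a minimal illustration that assumes every timestamp ends in a literal `Z`; the helper name `parse_utc` and the sample trip are hypothetical.

```python
from datetime import datetime

def parse_utc(ts: str) -> datetime:
    """Parse an ISO 8601 string ending in 'Z' into a UTC-aware datetime."""
    # datetime.fromisoformat accepts a trailing 'Z' only on Python 3.11+,
    # so swap it for an explicit offset to stay portable to older versions.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))

# Hypothetical sample record using the field names from the problem statement.
trip = {"city": "San Francisco", "status": "completed",
        "completed_utc": "2024-03-01T07:30:00Z"}

completed = parse_utc(trip["completed_utc"])
print(completed.isoformat())  # 2024-03-01T07:30:00+00:00
```

Because the offset is part of the parsed string, the result is already timezone-aware and needs no later `replace(tzinfo=...)` call.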

#### Step 2: Convert to Pacific time

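Shift each UTC datetime to the fixed UTC-8 offset that the problem specifies. A production pipeline would instead use the `America/Los_Angeles` zone so that PST/PDT daylight saving transitions are handled automatically.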
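
Since the problem statement pins Pacific time to a fixed UTC-8 offset, the conversion can be sketched with the standard library alone. The constant name `PACIFIC_FIXED` and the sample timestamp are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC-8 offset, as the problem statement specifies (no DST handling).
PACIFIC_FIXED = timezone(timedelta(hours=-8))

# 07:30 UTC on March 1 is still the evening of February 29 at UTC-8.
utc_dt = datetime(2024, 3, 1, 7, 30, tzinfo=timezone.utc)
local_dt = utc_dt.astimezone(PACIFIC_FIXED)

print(local_dt.isoformat())  # 2024-02-29T23:30:00-08:00
```

The example is chosen so that the conversion crosses a month boundary, which is exactly the case that changes the final counts.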

#### Step 3: Extract year-month and count per period

Group by the local-time year-month and count completed trips.

---

### The solution

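**UTC to Pacific conversion with a fixed UTC-8 offset**

```python
from datetime import datetime, timedelta, timezone

# The problem pins Pacific time to a fixed UTC-8 offset (no DST).
PACIFIC = timezone(timedelta(hours=-8))

def count_trips_by_month(trips: list) -> dict:
    counts = {}
    for trip in trips:
        # Keep only completed San Francisco trips.
        if trip["city"] != "San Francisco" or trip["status"] != "completed":
            continue
        # Swap 'Z' for an explicit offset: fromisoformat accepts 'Z' only on 3.11+.
        utc_dt = datetime.fromisoformat(trip["completed_utc"].replace("Z", "+00:00"))
        local_dt = utc_dt.astimezone(PACIFIC)
        # Bucket by the Pacific-time year and month.
        key = local_dt.strftime("%Y-%m")
        counts[key] = counts.get(key, 0) + 1
    return counts
```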
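
A small hand-built input is a quick way to check a solution against the problem statement. The sketch below is self-contained: the sample trips and the name `count_sf_completions` are made up for illustration, and it assumes every `completed_utc` value ends in `Z`.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

PACIFIC_FIXED = timezone(timedelta(hours=-8))  # fixed UTC-8 per the problem statement

def count_sf_completions(trips):
    """Count completed San Francisco trips per Pacific-time YYYY-MM."""
    keys = (
        datetime.fromisoformat(t["completed_utc"].replace("Z", "+00:00"))
        .astimezone(PACIFIC_FIXED)
        .strftime("%Y-%m")
        for t in trips
        if t["city"] == "San Francisco" and t["status"] == "completed"
    )
    return dict(Counter(keys))

sample_trips = [
    # 03:00 UTC on Feb 1 is 19:00 on Jan 31 at UTC-8, so this counts toward January.
    {"city": "San Francisco", "status": "completed", "completed_utc": "2024-02-01T03:00:00Z"},
    {"city": "San Francisco", "status": "completed", "completed_utc": "2024-02-10T18:00:00Z"},
    # Filtered out: wrong status, then wrong city.
    {"city": "San Francisco", "status": "cancelled", "completed_utc": "2024-02-11T18:00:00Z"},
    {"city": "Oakland", "status": "completed", "completed_utc": "2024-02-12T18:00:00Z"},
]

print(count_sf_completions(sample_trips))  # {'2024-01': 1, '2024-02': 1}
```

Grouping by the raw UTC month would report two February trips here, which is the mistake the problem is designed to expose.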

> **Time and Space Complexity**
>
> **Time:** O(n) where n is the number of trip records. Each conversion and grouping step is O(1).
> 
> **Space:** O(m) where m is the number of distinct year-month keys.

> **Interviewers Watch For**
>
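> Knowing when a fixed offset is acceptable. This problem pins Pacific time to UTC-8 for simplicity, so a fixed offset is correct here. In production, use `zoneinfo.ZoneInfo` (Python 3.9+) or `pytz` instead of manually subtracting 8 hours: a hard-coded offset ignores daylight saving time, so local times are off by one hour for roughly eight months of the year, and trips completed within that hour of a local month boundary land in the wrong bucket.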
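
To make the difference concrete, the sketch below converts one timestamp both ways. The timestamp is a made-up example chosen to fall during daylight saving time, and the code assumes the IANA timezone database is available to `zoneinfo` on the system.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

# A hypothetical completion time during daylight saving (PDT is UTC-7).
utc_dt = datetime(2024, 7, 1, 7, 30, tzinfo=timezone.utc)

fixed = utc_dt.astimezone(timezone(timedelta(hours=-8)))    # fixed UTC-8
aware = utc_dt.astimezone(ZoneInfo("America/Los_Angeles"))  # DST-aware

print(fixed.strftime("%Y-%m"))  # 2024-06 (23:30 on June 30)
print(aware.strftime("%Y-%m"))  # 2024-07 (00:30 on July 1)
```

The two approaches disagree only for timestamps within an hour of a local month boundary during DST, which is why the bug is easy to miss in testing.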

> **Common Pitfall**
>
> Parsing timestamps without explicitly setting UTC. If the input lacks a timezone suffix, `fromisoformat` returns a naive datetime, and `astimezone` will assume the local system timezone instead of UTC.
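
A short sketch of the pitfall and one way to guard against it, assuming an input string with no timezone suffix (the sample value is hypothetical):

```python
from datetime import datetime, timezone

# No 'Z' or offset in the string, so the result is naive.
naive = datetime.fromisoformat("2024-03-01T07:30:00")
print(naive.tzinfo)  # None

# Calling naive.astimezone(...) here would interpret the value as system
# local time. Attaching UTC explicitly makes the intent unambiguous.
utc_dt = naive.replace(tzinfo=timezone.utc)
print(utc_dt.isoformat())  # 2024-03-01T07:30:00+00:00
```

Note that `replace(tzinfo=...)` only labels the datetime and does not shift the clock time, so it is appropriate only when the value is already known to be in UTC.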

---

## Common follow-up questions

- What if some trips span midnight in Pacific time? _(Tests which timestamp to use for bucketing: start time or completion time.)_
- How would you handle a timezone that has half-hour offsets like India? _(Tests that the approach generalizes since `zoneinfo` handles arbitrary offsets.)_
- What about historical timezone changes? _(Tests awareness that the IANA timezone database tracks rule changes over time.)_
- How would you do this in SQL instead? _(Tests `AT TIME ZONE` clause and extraction functions.)_
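
For the half-hour offset follow-up, a minimal sketch shows that the same `astimezone` call works unchanged. It assumes the IANA zone `Asia/Kolkata` is available to `zoneinfo`, and the timestamp is a made-up example.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# India Standard Time is UTC+05:30; no special handling is needed.
utc_dt = datetime(2024, 1, 31, 19, 0, tzinfo=timezone.utc)
india_dt = utc_dt.astimezone(ZoneInfo("Asia/Kolkata"))

print(india_dt.isoformat())         # 2024-02-01T00:30:00+05:30
print(india_dt.strftime("%Y-%m"))   # 2024-02
```

Only the zone name changes between cities, so the bucketing logic can take the zone as a parameter.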

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/the_timezone_trap)
- [Python Interview Questions](https://datadriven.io/python-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.