# When the Hours Touch

> Two sessions that overlap are really one. Find the windows a driver was actually online.

Canonical URL: <https://datadriven.io/problems/when-the-hours-touch>

Domain: Python · Difficulty: medium · Seniority: mid

## Problem

A driver toggles availability throughout a shift, and our status export logs each online stretch as a `[start, end]` pair of timestamps in seconds (carrying sub-second fractions), unsorted and frequently overlapping because a flaky app reconnects mid-session. Collapse these into the distinct windows the driver was actually online, treating two stretches that overlap or meet exactly end-to-end as one continuous window. Return the windows sorted by start time.

## Worked solution and explanation

### Why this problem exists in real interviews

This is interval coalescing wearing a session-cleanup costume. The skill being probed is whether you know the merge only works AFTER you sort by start time. The data hands you sessions in arrival order, and two overlapping stretches are almost never next to each other, so the candidate who walks the raw list comparing each session to the previous one quietly reports overlapping windows as separate. And because the timestamps are continuous, you cannot dodge the merge by dumping every covered instant into a set and regrouping contiguous points; there is no finite grid to enumerate. Sort by start first and every mergeable pair becomes adjacent; that single invariant is the whole problem.

---

### Break down the requirements

#### Step 1: Sort by start time to make overlaps adjacent

Sort the sessions on their start value. Once sorted, any session that overlaps an earlier one must overlap the window you are currently building, because nothing with an earlier start is still ahead of you. This is the invariant the whole sweep leans on, and it is exactly what a single pass over the unsorted input fails to establish.

#### Step 2: Sweep, extending a running window

Seed the result with the first session. For each later session, if its start is less than or equal to the current window's end, it overlaps or touches, so extend the window's end to the max of the two ends. Otherwise it starts a fresh window. The 'less than or equal' is what makes back-to-back sessions like [0.0, 15.0] and [15.0, 20.0] collapse into [0.0, 20.0].

#### Step 3: Take the max of the ends, not the latest end

When a session is fully contained inside the current window, its end is smaller, so blindly assigning the new end would shrink the window and lose coverage. Always extend with max(current_end, new_end). This is the line that handles [1.0, 100.0] swallowing [20.0, 30.0].

---

### The solution

**Sort by start, sweep a running window**

```python
def merge_sessions(sessions):
    if not sessions:
        return []
    ordered = sorted(sessions, key=lambda s: s[0])
    merged = [list(ordered[0])]
    for start, end in ordered[1:]:
        last = merged[-1]
        if start <= last[1]:
            last[1] = max(last[1], end)
        else:
            merged.append([start, end])
    return merged
```

> **Complexity**
>
> Time is O(n log n), dominated by the sort; the sweep is a single O(n) pass. Space is O(n) for the output. A driver's shift produces at most a few dozen sessions, so this is trivial per driver, and across a fleet you apply it per driver_id rather than over the whole stream at once. Note the cost does NOT depend on how long each window spans, which is the whole reason a covered-instant set is the wrong tool on continuous time.

**Single pass, no sort**

Walk the list in arrival order and compare each session to the previous one. On [[10.0, 30.5], [4.5, 12.0], ...] the [4.5, 12.0] arrives after [10.0, 30.5], so a forward-only scan either emits it as a separate window or extends in the wrong direction and loses the 4.5 start. Wrong answer, and it looks right on already-sorted inputs, which is how it slips through.

**Sort then sweep**

Sorting guarantees that if a session overlaps anything earlier, it overlaps the window currently being built. One comparison against the last window suffices, and every overlap is caught.

> **Interviewers Watch For**
>
> Stating the sort-by-start invariant out loud before writing the loop, and deciding deliberately whether touching sessions (end == next start) merge. Strong candidates name the 'less than or equal' choice and the max-of-ends line as the two places the logic actually bites, and they reject the enumerate-every-instant idea the moment they notice time is continuous.

> **Common Pitfall**
>
> Writing last[1] = end instead of last[1] = max(last[1], end). It passes simple overlap cases but silently shrinks the window the moment a fully contained session shows up, so [[1.0, 100.0], [20.0, 30.0]] wrongly collapses to [1.0, 30.0] and drops 70 seconds of coverage.

---

## Common follow-up questions

- Now the export interleaves sessions for thousands of drivers in one list. How do you coalesce per driver? _(Tests grouping by driver_id first (defaultdict of lists) then coalescing each group, versus trying to sweep the mixed stream.)_
- Instead of the windows, return the total time in seconds each driver was online. _(Tests summing (end - start) over the coalesced windows and recognizing you must coalesce first to avoid double-counting overlap.)_
- Sessions arrive as an unbounded stream rather than a finite list. What changes? _(Tests reasoning about ordering guarantees, buffering late events, and emitting a window only once it can no longer be extended.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/when-the-hours-touch)
- [Python Interview Questions](https://datadriven.io/python-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.