# Flat Memory

> Two sorted runs, one buffer, no scratch space. Fold them together where they already live.

Canonical URL: <https://datadriven.io/problems/merge-sorted-runs-in-buffer>

Domain: Python · Difficulty: medium · Seniority: mid

## Problem

An external sort spills sorted runs to disk, then folds them back a pair at a time to keep memory flat. Each `primary` buffer keeps its own `filled` values in order at the front and reserves exactly enough zeroed slots at the tail to absorb the `incoming` run, which is itself already sorted. Combine the two into one ascending sequence using only the room `primary` already holds, and return that buffer. An empty `incoming` leaves the buffer unchanged.

## Worked solution and explanation

### Why this problem exists in real interviews

This is the in-place merge step of an external merge sort wearing a buffer-reuse costume. The skill being probed: fold two sorted runs into the first one's reserved tail without ever allocating a scratch list. Almost everyone writes the forward merge they learned for merge sort, copying the smaller head each step, and clobbers a primary value they have not read yet. The whole trick is which direction you write. Get it wrong and you overwrite live data with the very value you were about to compare against.

> **Trick to Solving**
>
> Fill from the back, not the front. The largest of the two remaining values goes into the highest free slot. Because the reserved tail is exactly len(incoming) slots wide, the write cursor is always at or ahead of the primary read cursor, so you can never stomp a primary value you still need to read.

---

### Break down the requirements

#### Step 1: Point three cursors at the ends

i at the last real primary value (filled - 1), j at the last incoming value (len(incoming) - 1), and write at the last buffer slot (len(primary) - 1). You are going to walk all three leftward.

#### Step 2: Write the larger tail value each step

Compare primary[i] and incoming[j]; the bigger one lands at primary[write], then that source cursor and write both step back one. Drive the loop on j: once every incoming value is placed, any remaining primary values are already in their final positions at the front, so there is nothing left to do.

#### Step 3: Guard the primary cursor running out first

If i drops below zero while incoming still has values, those incoming values are the smallest and copy straight into the front slots. The condition 'i >= 0 and primary[i] > incoming[j]' handles this in one expression: when i is exhausted it falls through to copying incoming.

---

### The solution

**Back-to-front merge into the reserved tail**

```python
def merge_runs(primary, filled, incoming):
    i = filled - 1
    j = len(incoming) - 1
    write = len(primary) - 1
    while j >= 0:
        if i >= 0 and primary[i] > incoming[j]:
            primary[write] = primary[i]
            i -= 1
        else:
            primary[write] = incoming[j]
            j -= 1
        write -= 1
    return primary
```

> **Complexity**
>
> Time O(m + n): every step places exactly one value and the loop runs until incoming is drained. Space O(1) extra, which is the entire point: the merge happens inside the buffer the caller already owns, so an external sort merging thousands of runs never spikes memory per pass.

> **Interviewers Watch For**
>
> Whether you write from the back. A candidate who proposes the forward merge and then 'fixes' it by shifting elements is signaling they have not internalized why direction matters here. The strong answer states the invariant out loud: write >= i always holds, so the destination never collides with an unread source.

**Forward merge (broken)**

Start both cursors at index 0 and write into primary from the front. The first time incoming[0] is smallest you write it into primary[0], destroying the primary value sitting there before you have read it. Recovering means shifting the tail right, which turns the merge O(m*n).

**Backward merge (correct)**

Start at the ends and write into the highest free slot. The reserved tail guarantees the write cursor leads the read cursor, so each overwrite only ever lands on an already-consumed or placeholder slot. One pass, no shifting.

> **Common Pitfall**
>
> Looping on 'i >= 0 and j >= 0' and then forgetting the leftover incoming values. If primary empties first while incoming still has the smallest values, those never get copied and the front of the buffer keeps stale placeholders. Driving the loop on j alone, with i guarded inside, avoids the separate drain loop entirely.

---

## Common follow-up questions

- Now incoming can be longer than the reserved tail. How do you detect the overflow and what do you return? _(Tests whether the candidate validates len(primary) == filled + len(incoming) instead of trusting the contract.)_
- Extend this to a k-way merge of sorted runs while keeping memory bounded. _(Tests moving from two pointers to a heap of run heads, the real external-sort merge.)_
- If equal values must keep incoming ahead of primary for stability, what changes? _(Tests understanding that the > vs >= choice at ties decides merge stability.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/merge-sorted-runs-in-buffer)
- [Python Interview Questions](https://datadriven.io/python-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.