# The Disputed Fare

> One charge, two legs. Finance needs to know which two.

Canonical URL: <https://datadriven.io/problems/the-disputed-fare>

Domain: Python · Difficulty: medium · Seniority: mid

## Problem

A rider disputed a single charge that our billing system actually assembled from two separate ride legs, and finance needs to point to exactly which legs. Given `fares`, the leg charges in the order they were billed, and a disputed `target` total, give back the positions of the two legs whose charges add up to it, with the earlier leg first. When more than one such pair exists, return the one that completes earliest as you read left to right, and return `None` when no two legs account for the total.

## Worked solution and explanation

### Why this problem exists in real interviews

This is Two Sum wearing a fare-reconciliation badge. The real question: can you find the pair that sums to `target` in a single pass instead of testing every leg against every other? Anyone can write the double loop; it just quietly becomes O(n^2) on a driver's multi-year billing history. The move that separates candidates is recording each fare's index in a dict as you scan, so the instant you meet a leg whose complement (`target - fare`) is already on file, you have both positions in O(1). Reach for sorting plus two pointers instead and you throw away the very indices finance asked for.

---

### Break down the requirements

#### Step 1: Remember each fare's first position

Walk `fares` once with enumerate and store fare value to its index in a dict. Keep the FIRST index you saw for a value: that is what makes the earlier leg come first and keeps duplicate charges from clobbering each other.

#### Step 2: Look for the complement BEFORE inserting

For each leg, compute complement = target - fare and check whether it is already in the dict. Checking before you insert the current fare is what stops a single leg from pairing with itself when target is exactly twice that fare.

#### Step 3: Return on the first hit, None at the end

The first j whose complement is on file is by construction the earliest-completing pair, so return [seen[complement], j] immediately. If the scan finishes with no hit, no two legs account for the total, so return None.

---

### The solution

**One pass, value-to-index dict**

```python
def find_fare_pair(fares, target):
    seen = {}
    for j, fare in enumerate(fares):
        complement = target - fare
        if complement in seen:
            return [seen[complement], j]
        if fare not in seen:
            seen[fare] = j
    return None
```

**Nested loop (rejected)**

Two loops over the legs, O(n^2). Correct on the toy input, but a driver with tens of thousands of billed legs turns this into a visible latency spike, and it still has to be careful to report the earlier index first.

**Hash lookup (shipped)**

Single pass, O(n). The dict answers 'have I already seen the complement, and where?' in O(1), and because you scan left to right the first match is automatically the earliest-completing pair.

> **Complexity**
>
> Time O(n): one pass, O(1) average dict membership and insert per leg. Space O(n) for the dict in the worst case (no pair until the end). At the scale this runs, a single rider's or driver's billing history, that linear pass is nowhere near the bottleneck.

> **Interviewers Watch For**
>
> Do you store the INDEX, not just the value (a bare set proves a pair exists but cannot name the legs)? Do you check the complement before inserting? Do you ask what to return when nothing matches instead of assuming a pair always exists? Naming those three decisions out loud reads as senior.

> **Common Pitfall**
>
> Inserting the current fare into the dict before checking for its complement. Then a lone leg equal to target/2 finds ITSELF and you return [i, i], two positions that are actually one leg. Checking first, inserting second, avoids it. The sibling mistake is overwriting seen[fare] on a duplicate, which quietly reports a later index than the earliest occurrence.

---

## Common follow-up questions

- Now return every distinct pair of legs that sums to the target, not just the first. _(Tests moving from an early return to accumulating results, and handling duplicate values without emitting the same pair twice.)_
- The legs arrive as a stream you can only read once and cannot hold entirely in memory. How does your approach change? _(Tests whether the candidate sees that the dict already supports a single-pass streaming model, and can reason about its memory growth.)_
- Suppose fares came in already sorted. Would you still use a dict? _(Tests knowledge of the two-pointer alternative on sorted input (O(1) extra space) and the trade-off that it needs the original indices mapped back.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/the-disputed-fare)
- [Python Interview Questions](https://datadriven.io/python-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.