# The Weight of the Book

> Every holding pulls its weight. Order the asset classes by what they carry, and let the ties settle by name.

Canonical URL: <https://datadriven.io/problems/the-weight-of-the-book>

Domain: Python · Difficulty: medium · Seniority: mid

## Problem

An asset-allocation report needs each asset class placed in order of how much money sits in it, where several `holdings` can carry the same `asset_class` and their market values add up. Assign each class a position starting at 1 for the class holding the most total value, and when two classes hold the same total, give the alphabetically earlier name the stronger position. An empty book has nothing to place.

## Worked solution and explanation

### Why this problem exists in real interviews

This is a group-then-order problem wearing a portfolio report's costume, and the whole thing balances on one line: the sort key. Anyone can sum market values into a dict keyed by asset class. What separates candidates is the tie-break, because the ordering runs in two directions at once: total value from highest to lowest, but equal totals from A to Z. Reach for a plain reverse=True on the totals and you flip the names too, so every tie comes out backwards. That is exactly the row the interviewer plants to see whether you actually thought about ties or just eyeballed the happy path.

> **Trick to solving**
>
> Encode both directions in a single compound key: sorted(totals, key=lambda cls: (-totals[cls], cls)). Negating the total sends the biggest book to the front while leaving the name to sort ascending. One sort, both rules, no reverse flag.

---

### Break down the requirements

#### Step 1: Fold holdings into per-class totals

Several holdings share an asset class, so you cannot order the raw records. Walk the list once and accumulate into a dict with totals.get(cls, 0) + value. This is the step a plain sort of the input can never do for you, and it is why the answer needs a dict at all.

#### Step 2: Order the classes with one compound key

Sort the class names by (-total, name). The negative total gives you largest-first without reverse=True, and the trailing name breaks ties alphabetically in ascending order. Doing this as two separate concerns is where people slip.

#### Step 3: Turn the order into a position map

enumerate(ordered, start=1) hands you position 1, 2, 3 in order. Return a dict of class to position rather than a list, so the grade checks the actual positions you assigned, not just which classes showed up.

---

### The solution

**Aggregate, order by compound key, map to positions**

```python
def rank_asset_classes(holdings):
    totals = {}
    for h in holdings:
        totals[h["asset_class"]] = totals.get(h["asset_class"], 0) + h["market_value"]
    ordered = sorted(totals, key=lambda cls: (-totals[cls], cls))
    return {cls: rank for rank, cls in enumerate(ordered, start=1)}
```

**reverse=True (breaks ties)**

sorted(totals, key=lambda c: totals[c], reverse=True) puts the biggest book first, but reverse flips the entire comparison. Two classes tied at 2000 come out in Z-to-A order, so cash lands behind commodities instead of ahead of it.

**compound key (correct)**

sorted(totals, key=lambda c: (-totals[c], c)) negates only the total. The name still sorts ascending, so tied classes settle cash before commodities. Both rules live in one key with no flag to fight.

> **Complexity**
>
> O(n) to fold the holdings into totals, then O(g log g) to sort where g is the number of distinct asset classes (never more than n, and usually a tiny handful). Space is O(g) for the totals dict and the result. A book with millions of positions but a dozen asset classes still sorts a dozen items.

> **Interviewers watch for**
>
> Whether you spot that ties run opposite to the primary order and reach for a compound key instead of reverse=True. Strong candidates also ask out loud what a tie should do before writing anything, and they return a position map so the ordering is actually testable rather than a bare sorted list.

> **Common pitfall**
>
> Sorting the raw holdings list instead of the per-class totals, which double-counts a class that appears in several rows. The other classic miss is negative market values from short positions: they must still fold into the total, so a class can end up with a smaller or negative total and drop to a lower position, which the reverse=True crowd rarely checks.

---

## Common follow-up questions

- What if two tied classes should share the same position, so the next class jumps from 1 to 3 rather than 2? _(Tests dense versus standard competition ranking and whether the candidate tracks the previous total to detect ties explicitly.)_
- The report now wants only the classes making up the top 90 percent of total value, largest first. How do you extend this? _(Tests a running cumulative sum over the ordered classes and an early cutoff.)_
- Holdings stream in continuously and the position map must stay current. What changes? _(Tests incremental aggregation and whether a full re-sort per update is acceptable at the given scale.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/the-weight-of-the-book)
- [Python Interview Questions](https://datadriven.io/python-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.