# The Host Ranker

> Some hosts have more to offer.

Canonical URL: <https://datadriven.io/problems/the_host_ranker>

Domain: Python · Difficulty: medium · Seniority: L3

## Problem

Given a list of listing dicts (each with 'host_id', 'host_name', 'beds'), sum beds per host. Rank hosts using dense ranking by total beds descending, tie-break by host_name ascending. Return a list of dicts {'host_id', 'host_name', 'total_beds', 'rank'} sorted by rank ascending.

## Worked solution and explanation

### Why this problem exists in real interviews

This tests **group-by aggregation combined with dense ranking**, a multi-step data transformation. It probes the ability to aggregate, sort with multiple criteria, and assign ranks correctly.

---

### Break down the requirements

#### Step 1: Aggregate total beds per host

Sum beds across all listings for each host_id, tracking host_name.
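
A minimal sketch of the aggregation step on its own, assuming the record shape given in the problem statement. The sample listings and the names `totals` and `names` are illustrative and not part of the original solution.

```python
from collections import defaultdict

# Hypothetical sample data in the shape the problem describes.
listings = [
    {'host_id': 1, 'host_name': 'Ana', 'beds': 2},
    {'host_id': 2, 'host_name': 'Ben', 'beds': 3},
    {'host_id': 1, 'host_name': 'Ana', 'beds': 4},
]

totals = defaultdict(int)   # host_id -> summed beds
names = {}                  # host_id -> host_name

for rec in listings:
    totals[rec['host_id']] += rec['beds']
    names[rec['host_id']] = rec['host_name']

print(dict(totals))  # {1: 6, 2: 3}
print(names)         # {1: 'Ana', 2: 'Ben'}
```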

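#### Step 2: Sort by total beds descending, then host name ascending

Higher bed counts rank first. Hosts with equal totals are ordered by `host_name` ascending, as the problem statement requires.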
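
The problem statement also asks for ties to be broken by `host_name` ascending. One common way to express a mixed-direction sort in Python is a tuple key with the numeric field negated. The sketch below uses made-up aggregated hosts to show the idea.

```python
# Hypothetical aggregated hosts; the values are made up for illustration.
hosts = [
    {'host_name': 'Cy', 'total_beds': 5},
    {'host_name': 'Ana', 'total_beds': 5},
    {'host_name': 'Ben', 'total_beds': 8},
]

# Negating the total sorts it descending; the name then sorts ascending
# among hosts whose totals are equal.
ordered = sorted(hosts, key=lambda h: (-h['total_beds'], h['host_name']))

print([h['host_name'] for h in ordered])  # ['Ben', 'Ana', 'Cy']
```

Negation only works for numeric keys. For a descending string key, two stable sorts applied in reverse priority order would be one alternative.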

#### Step 3: Assign dense ranks

Equal totals share the same rank. The next distinct total gets the next consecutive rank number.
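
The rule above can be sketched in isolation. The helper name `dense_ranks` is hypothetical, and it assumes the totals arrive already sorted in descending order.

```python
def dense_ranks(sorted_totals):
    """Return dense ranks for totals already sorted in descending order."""
    ranks = []
    rank = 0
    prev = None
    for total in sorted_totals:
        if total != prev:   # a new distinct total advances the rank by one
            rank += 1
            prev = total
        ranks.append(rank)
    return ranks

print(dense_ranks([10, 10, 7, 5, 5, 2]))  # [1, 1, 2, 3, 3, 4]
```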

---

### The solution

**Aggregate, sort, and dense rank**

```python
def rank_hosts(listings: list) -> list:
    # Step 1: sum beds per host_id, remembering each host's name.
    host_data = {}
    for rec in listings:
        hid = rec['host_id']
        if hid not in host_data:
            host_data[hid] = {'host_id': hid, 'host_name': rec['host_name'], 'total_beds': 0}
        host_data[hid]['total_beds'] += rec['beds']
    # Step 2: total beds descending, then host_name ascending for ties.
    sorted_hosts = sorted(host_data.values(), key=lambda h: (-h['total_beds'], h['host_name']))
    # Step 3: dense rank; the rank advances only when the total changes.
    result = []
    current_rank = 0
    prev_beds = None
    for host in sorted_hosts:
        if host['total_beds'] != prev_beds:
            current_rank += 1
            prev_beds = host['total_beds']
        result.append({
            'host_id': host['host_id'],
            'host_name': host['host_name'],
            'total_beds': host['total_beds'],
            'rank': current_rank
        })
    return result
```
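
To see the expected output shape on concrete data, here is a worked example. So that the snippet runs on its own, it uses a condensed variant named `rank_hosts_compact` (a hypothetical name, not the reference solution) that follows the same three steps; the sample listings are invented.

```python
from collections import defaultdict

def rank_hosts_compact(listings):
    # Hypothetical compact variant, included only so this snippet runs alone.
    totals, names = defaultdict(int), {}
    for rec in listings:
        totals[rec['host_id']] += rec['beds']
        names[rec['host_id']] = rec['host_name']
    # Order host ids by total descending, then by name ascending.
    ordered = sorted(totals, key=lambda hid: (-totals[hid], names[hid]))
    result, rank, prev = [], 0, None
    for hid in ordered:
        if totals[hid] != prev:
            rank, prev = rank + 1, totals[hid]
        result.append({'host_id': hid, 'host_name': names[hid],
                       'total_beds': totals[hid], 'rank': rank})
    return result

sample = [
    {'host_id': 1, 'host_name': 'Ana', 'beds': 2},
    {'host_id': 2, 'host_name': 'Ben', 'beds': 6},
    {'host_id': 1, 'host_name': 'Ana', 'beds': 4},
    {'host_id': 3, 'host_name': 'Cy', 'beds': 1},
]

for row in rank_hosts_compact(sample):
    print(row)
# {'host_id': 1, 'host_name': 'Ana', 'total_beds': 6, 'rank': 1}
# {'host_id': 2, 'host_name': 'Ben', 'total_beds': 6, 'rank': 1}
# {'host_id': 3, 'host_name': 'Cy', 'total_beds': 1, 'rank': 2}
```

Ana and Ben tie at 6 beds and share rank 1, with Ana listed first by name. Cy takes rank 2, not 3, because the ranking is dense.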

> **Time and Space Complexity**
>
> **Time:** O(n + k log k) where n is the number of listings and k is the number of unique hosts.
> 
> **Space:** O(k) for the aggregated data and output.

> **Interviewers Watch For**
>
> Whether you implement dense ranking correctly. If two hosts tie at rank 1, the next host should be rank 2, not rank 3.

> **Common Pitfall**
>
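> Grouping by `host_name` instead of `host_id`. All listings for one host share a `host_id`, but two different hosts may share the same `host_name`, and grouping by name would merge their totals. Aggregate by `host_id` and carry the name along.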
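
A small illustration of the pitfall, assuming host names are not guaranteed to be unique (the problem does not say either way). The data below is invented.

```python
from collections import defaultdict

# Hypothetical data: two different hosts who happen to share a name.
listings = [
    {'host_id': 1, 'host_name': 'Sam', 'beds': 2},
    {'host_id': 2, 'host_name': 'Sam', 'beds': 3},
]

by_name = defaultdict(int)
by_id = defaultdict(int)
for rec in listings:
    by_name[rec['host_name']] += rec['beds']
    by_id[rec['host_id']] += rec['beds']

print(dict(by_name))  # {'Sam': 5}  -> two hosts merged into one
print(dict(by_id))    # {1: 2, 2: 3}
```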

---

## Common follow-up questions

- How would you write this in SQL? _(Tests `SUM(beds) ... GROUP BY host_id` with `DENSE_RANK() OVER(ORDER BY SUM(beds) DESC)`.)_
- What if ties should be broken by host_name alphabetically? _(Tests adding a secondary sort key.)_
- What is the difference between DENSE_RANK, RANK, and ROW_NUMBER? _(Tests understanding of gap behavior in each ranking function.)_
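
For the last question, the gap behavior of the three SQL ranking functions can be mimicked in plain Python. This is a rough sketch rather than a reference implementation: the function name `ranking_variants` is hypothetical, and it assumes the totals are already sorted in descending order.

```python
def ranking_variants(sorted_totals):
    """Mimic ROW_NUMBER, RANK and DENSE_RANK over descending totals.

    Each returned tuple is (total, row_number, rank, dense_rank).
    """
    rows = []
    rank = dense = 0
    prev = None
    for position, total in enumerate(sorted_totals, start=1):
        if total != prev:
            rank = position   # RANK jumps to the row position, leaving gaps
            dense += 1        # DENSE_RANK advances by exactly one
            prev = total
        rows.append((total, position, rank, dense))
    return rows

for row in ranking_variants([9, 9, 7, 4]):
    print(row)
# (9, 1, 1, 1)
# (9, 2, 1, 1)
# (7, 3, 3, 2)
# (4, 4, 4, 3)
```

After the tie at 9, `RANK` skips to 3 while `DENSE_RANK` continues at 2, and `ROW_NUMBER` never repeats a value.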

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/the_host_ranker)
- [Python Interview Questions](https://datadriven.io/python-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.