# The Provinces

> The home office isn't the whole story. Find the zones carrying the rest.

Canonical URL: <https://datadriven.io/problems/the-provinces-regional-earnings>

Domain: SQL · Difficulty: easy · Seniority: mid

## Problem

We run a retail operation split into regional zones, with the US as our home base and main branch. Find the three highest earning zones outside the home region, biggest earners first.

## Worked solution and explanation

### What this problem is really testing

This is a top-three-after-exclusion dressed up as a regional earnings report. The skill being probed: do you drop the home region BEFORE you rank, or after? Anyone can sum profit per region. The tell is whether the filter that removes the main branch runs on the base rows, ahead of the ordering and the LIMIT. Rank first and slice off the top three and you either hand back the US home region you were told to exclude, or you silently bump a real zone off the bottom of the list.

---

### Break down the requirements

#### Step 1: Exclude the home region on the base rows

'US as our home base and main branch' is a filter on region, not a row you remove after aggregating. Put WHERE region <> 'US' before the grouping so US profit never enters any bucket.

#### Step 2: Roll up to the zone grain

GROUP BY region and SUM(profit). The question is about zones, so one row per region is the unit. Note profit can be negative on returns, so SUM (not COUNT) is the earning measure.

#### Step 3: Order high to low, then take three

ORDER BY total_profit DESC puts the biggest earners on top, and LIMIT 3 keeps the leaderboard short. The DESC has to sit on the summed total, not on raw profit.

---

### The solution

**TOP EARNING ZONES OUTSIDE HOME**

```sql
SELECT region,
       SUM(profit) AS total_profit
FROM orders
WHERE region <> 'US'
GROUP BY region
ORDER BY total_profit DESC
LIMIT 3
```

**Exclude after ranking (wrong)**

Sum every region including US, sort, take the top three, then try to drop US. If US is the biggest earner it was inside your top three, so you return two real zones plus a hole, or you keep US entirely.

**Exclude before ranking (right)**

WHERE region <> 'US' removes the home region from the base rows, so the aggregate, the sort, and the LIMIT all operate only on the zones you actually want. The top three are three genuine outside zones.

> **Common Pitfall**
>
> Filtering the main branch out in a HAVING or a wrapping query after LIMIT 3 has already run. LIMIT executes last, so anything you trim afterward shortens the answer instead of shifting it. The exclusion has to happen in WHERE, before the row count is capped.

> **Interviewers Watch For**
>
> Whether you ask what 'main branch' means and pin it to a concrete region, and whether you sum profit rather than count orders. Bonus points for naming that profit can be negative, so a zone with heavy returns can legitimately sink below a smaller but cleaner one.

> **Cost Analysis**
>
> On a 200M-row orders table partitioned by region, WHERE region <> 'US' is a cheap partition-elimination step: the planner never scans the home partition. The remaining work is a hash aggregate over a handful of region keys, then a top-three sort. No join, no window, no self-reference; this stays a single scan of the non-home partitions.

---

## Common follow-up questions

- The main branch isn't fixed; it's whichever region earns the most. Exclude that one dynamically and return the next three. _(Pushes toward a scalar subquery or a window that finds the max region total first, testing whether they can make the exclusion data-driven instead of hardcoded.)_
- Two zones tie on total profit at the third spot. Which one does LIMIT 3 keep, and how would you make it deterministic? _(Tests awareness that LIMIT without a full tiebreaker in ORDER BY is nondeterministic, and that a secondary sort key (region name) fixes it.)_
- Now scope it to this fiscal year only. Where does the date predicate go and why? _(Checks that they add the window to WHERE alongside the region filter so it prunes before aggregation, rather than filtering after the rollup.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/the-provinces-regional-earnings)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.