# Mid-Range Team Spenders

> Above average but not extreme.

Canonical URL: <https://datadriven.io/problems/mid_range_team_spenders>

Domain: SQL · Difficulty: hard · Seniority: L5

## Problem

Find employees whose cloud cost allocation is above average for their team but who are not among that team's top 3 spenders. Show each qualifying employee's team name and amount.

## Worked solution and explanation

### Why this problem exists in real interviews

FinOps and platform teams hand over cost data and ask which allocations look 'oversized but not headline-grabbing'. The interviewer wants to see whether you can mix a per-group aggregate (team average) with a per-row window rank in the SAME query, and whether you remember that DENSE_RANK shares ranks with no gaps. Mixing these two patterns trips juniors because they reach for a self join.

---

### Break down the requirements

#### Step 1: Compute the per team average separately

AVG(amount) GROUP BY team_name lives in its own CTE because every allocation row needs the team mean attached. Trying to combine AVG with a window rank in one SELECT mixes aggregation grain and forces a subquery anyway.

#### Step 2: Rank allocations within each team

DENSE_RANK() OVER (PARTITION BY team_name ORDER BY amount DESC) gives the no gaps behavior the prompt asks for. RANK would skip numbers after ties and silently drop allocations that should qualify. ROW_NUMBER would split ties arbitrarily.

#### Step 3: Apply BOTH filters with AND

Strictly above the team average AND strictly greater than rank 3 means you keep rows with rnk = 4, 5, 6 and so on whose amount also beats the team mean. Use strict inequalities, not >=, because the prompt says 'strictly'.

---

### The solution

**Join team average against window rank**

```sql
WITH team_avg AS (
  SELECT team_name, AVG(amount) AS avg_amount FROM cost_allocs GROUP BY team_name
),
ranked AS (
  SELECT team_name, amount,
         DENSE_RANK() OVER (PARTITION BY team_name ORDER BY amount DESC) AS rnk
  FROM cost_allocs
)
SELECT r.team_name, r.amount
FROM ranked r
INNER JOIN team_avg ta ON r.team_name = ta.team_name
WHERE r.amount > ta.avg_amount AND r.rnk > 3
```

> **Cost Analysis**
>
> Both CTEs scan cost_allocs once. With 25M rows and 70 teams, the GROUP BY for team_avg collapses to 70 rows that broadcast cheaply during the join. The window rank is the dominant cost because it must sort each partition by amount; partitioning by team_name lets the planner sort within team rather than globally.

> **Interviewers Watch For**
>
> Whether you reach for DENSE_RANK (no gaps) instead of RANK (with gaps) or ROW_NUMBER (arbitrary ties), and whether you keep both filters in WHERE rather than splitting into HAVING. Strong candidates also ask whether 'team average' should exclude the row itself; the prompt says no, so all rows feed the average.

> **Common Pitfall**
>
> Writing rnk >= 3 instead of rnk > 3 includes the third highest allocation, which the prompt explicitly excludes. The other common miss is using RANK instead of DENSE_RANK: a tie at rank 3 then bumps the next rank to 5 or later, dropping legitimate allocations that should have been the fourth distinct amount.

---

## Common follow-up questions

- How would the answer change if 'team average' had to exclude the row being evaluated? _(Tests whether the candidate knows about leave one out averages, which require a window AVG with a frame that excludes the current row, or an algebraic trick like (sum total minus current) divided by (count minus one).)_
- Why DENSE_RANK rather than RANK for this prompt? _(Tests whether the candidate can articulate the gap behavior. Two allocations tied for first cause RANK to skip rank 2, so 'rnk > 3' would silently exclude rows the prompt expects.)_
- If the table grew to 1B rows, how would you avoid sorting it twice? _(Tests awareness that team_avg and ranked both scan cost_allocs. A single pass with conditional aggregation in a window or a partition pruned by team_name reduces wall time.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/mid_range_team_spenders)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.