# Teams Below Double Average Spend

> Teams spending under twice the average.

Canonical URL: <https://datadriven.io/problems/teams_below_double_average_spend>

Domain: SQL · Difficulty: medium · Seniority: L4

## Problem

Our FinOps system tracks cost allocations by team. Find teams whose total spend is less than twice the overall average team spend. Show the team name, their total spend, and the average spend of their services.

## Worked solution and explanation

### Why this problem exists in real interviews

This tests whether you can compare group-level aggregates against a global aggregate in a single query. Interviewers use it to probe your understanding of subqueries or window functions for computing cross-grain comparisons, and whether you know that `HAVING` filters groups while `WHERE` filters rows.

> **Trick to Solving**
>
> When a prompt says "less than twice the overall average," it signals a two-level aggregation: first aggregate per group, then compare each group's total to a scalar derived from the same data.
> 
> 1. Compute each team's `SUM(amount)` with `GROUP BY team_name`
> 2. Compute the overall average team spend as a scalar subquery or CTE
> 3. Filter with `HAVING` to keep teams below twice that average

---

### Break down the requirements

#### Step 1: Aggregate spend per team

`GROUP BY team_name` with `SUM(amount)` collapses the 15M `cost_allocs` rows into 50 team-level totals.

#### Step 2: Compute overall average team spend

A scalar subquery computes `AVG(team_total)` across the 50 teams. This must be a nested aggregation: first SUM per team, then AVG across teams.

#### Step 3: Compute per-team service average

For each team, also compute `AVG(amount)` grouped at the service level, or `SUM(amount) / COUNT(DISTINCT svc_name)` to get the average spend across that team's services.

#### Step 4: Filter and return

Use `HAVING` to keep only teams whose total spend is below twice the global average team spend. Return `team_name`, total spend, and average service spend.

---

### The solution

**Aggregate then compare against global average**

```sql
WITH team_totals AS (
    SELECT
        team_name,
        SUM(amount) AS total_spend,
        AVG(amount) AS avg_svc_spend
    FROM cost_allocs
    GROUP BY team_name
)
SELECT
    team_name,
    total_spend,
    avg_svc_spend
FROM team_totals
WHERE total_spend < 2 * (SELECT AVG(total_spend) FROM team_totals)
ORDER BY total_spend
```

> **Cost Analysis**
>
> The CTE scans 15M rows once and produces 50 rows. The scalar subquery against `team_totals` is trivially cheap. Total cost is dominated by the single full-table aggregation scan.

> **Interviewers Watch For**
>
> Strong candidates immediately recognize the nested aggregation requirement: you cannot compare `SUM(amount)` against `AVG(SUM(amount))` in a single `GROUP BY` without a subquery or CTE. Interviewers penalize attempts to do it in one pass.

> **Common Pitfall**
>
> A common mistake is computing `AVG(amount)` across all 15M rows instead of averaging the per-team totals. That gives the average row-level amount, not the average team spend, producing completely wrong results.

---

## Common follow-up questions

- How would you handle it if teams with zero spend should still appear? _(Tests understanding of LEFT JOINs and COALESCE for missing aggregation groups.)_
- What if the query needs to run incrementally as new cost records arrive? _(Probes materialized view or incremental aggregation strategies.)_
- Could you solve this with a window function instead of a CTE? _(Tests knowledge of `AVG() OVER ()` to compute the global average inline.)_
- What index would help if `cost_allocs` had 1B rows? _(Probes index design for aggregate queries: a covering index on `(team_name, amount)`.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/teams_below_double_average_spend)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.