# Top CPU Pods per Namespace

> The two most CPU-hungry pods in each namespace.

Canonical URL: <https://datadriven.io/problems/top_cpu_pods_per_namespace>

Domain: SQL · Difficulty: hard · Seniority: L5

## Problem

Pivot the two highest-CPU pods in each namespace so each row is a namespace with its top and second-place pod names as separate columns.

## Worked solution and explanation

### Why this problem exists in real interviews

Using `k8s_pods`, this tests per-group ranking via `ROW_NUMBER()` or `DENSE_RANK()` partitioned by a grouping key with proper grain management. Strong candidates immediately identify the grouping key and metric column before writing any window function.

---

### Break down the requirements

#### Step 1: Aggregate per pod_name

`GROUP BY pod_name` with the appropriate aggregate function produces one summary row per group from the `k8s_pods` table.

#### Step 2: Rank within each pod_id

Use `ROW_NUMBER() OVER (PARTITION BY pod_id ORDER BY aggregate DESC)` to rank entries within each partition.

#### Step 3: Filter to top entries

Wrap in a subquery and filter `WHERE rn <= N` to keep only the top entries per group.

---

### The solution

**Rank pods by cpu_used within nspace then pivot first and second into columns**

```sql
SELECT pod_id, pod_name, total_cpu_used
FROM (
    SELECT
        pod_id,
        pod_name,
        SUM(cpu_used) AS total_cpu_used,
        ROW_NUMBER() OVER (
            PARTITION BY pod_id
            ORDER BY SUM(cpu_used) DESC
        ) AS rn
    FROM k8s_pods
    GROUP BY pod_id, pod_name
) ranked
WHERE rn <= 10
ORDER BY pod_id, total_cpu_used DESC
```

> **Cost Analysis**
>
> The GROUP BY reduces the 3M-row `k8s_pods` table to the number of distinct `pod_name` values. The window function sorts within each partition. A covering index on `(pod_name, cpu_used)` enables an index-only aggregate scan.

> **Interviewers Watch For**
>
> Interviewers specifically test whether you use `PARTITION BY` in the window function. Omitting it gives a global ranking instead of per-group, which is at its core different.

> **Common Pitfall**
>
> Using `ORDER BY ... LIMIT` instead of a window function for per-group ranking. LIMIT gives N rows globally, not per group. Per-group top-N always requires a window function.

---

## Common follow-up questions

- If a namespace has only one pod, what appears in the second-place column? _(Tests NULL handling in the pivot; MAX(CASE WHEN rank = 2 ...) returns NULL when no row qualifies.)_
- If two pods in the same namespace have identical cpu_used, which one gets rank 1? _(Tests tiebreaker strategy; adding pod_name to the ORDER BY provides a deterministic pick.)_
- Could you achieve this pivot using ARRAY_AGG or STRING_AGG instead of CASE, and what are the trade-offs? _(Tests alternative pivot approaches; ARRAY_AGG preserves order but returns an array type, not separate columns.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/top_cpu_pods_per_namespace)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.