# Third Largest Batch Job

> Bronze medal in the batch job rankings.

Canonical URL: <https://datadriven.io/problems/third_largest_batch_job>

Domain: SQL · Difficulty: easy · Seniority: L3

## Problem

During the weekly platform review, the data engineering lead asked for the third-largest batch job by total rows processed. If multiple jobs tie for that position, include all of them.

## Worked solution and explanation

### Why this problem exists in real interviews

This is a variant of the Nth-value problem applied to batch job sizes. It tests the same ranking fundamentals as other top-N problems.

---

### Break down the requirements

#### Step 1: Identify the ranking metric

Rank by the size or output metric descending.

#### Step 2: Apply DENSE_RANK

`DENSE_RANK() OVER (ORDER BY metric DESC)` assigns rank 1 to the largest.

#### Step 3: Filter to rank 3

`WHERE rnk = 3` extracts the third-largest job.

---

### The solution

**Nth-value extraction with DENSE_RANK**

```sql
SELECT job_name, output_bytes
FROM (
    SELECT
        job_name,
        output_bytes,
        DENSE_RANK() OVER (ORDER BY output_bytes DESC) AS rnk
    FROM batch_jobs
) ranked
WHERE rnk = 3
```

> **Cost Analysis**
>
> The window function sorts all rows by `output_bytes`. An index on `output_bytes DESC` avoids the sort. Since we only need rank 3, a top-N optimization can stop early.

> **Interviewers Watch For**
>
> Interviewers verify you know the difference between DENSE_RANK and LIMIT/OFFSET for Nth-value queries.

> **Common Pitfall**
>
> Using `LIMIT 1 OFFSET 2` does not handle ties. If two jobs share the largest size, offset 2 gives what is tied for first, not the third distinct value.

---

## Common follow-up questions

- What if you needed the third largest per job category? _(Tests PARTITION BY in the window function.)_
- How would you find the top 3 largest jobs instead of just the third? _(Tests changing the filter to rnk <= 3.)_
- What if output_bytes can be NULL? _(Tests NULL handling in ORDER BY.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/third_largest_batch_job)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.