# Fastest Completion Per Day

> Every day has a speed champion.

Canonical URL: <https://datadriven.io/problems/fastest_completion_per_day>

Domain: SQL · Difficulty: medium · Seniority: L5

## Problem

For each date, find the batch job that completed the most rows. Return the start time, job name, and rows completed, ordered by date.

## Worked solution and explanation

### What this is really asking

`DATE(started)` is the partition key, `rows_done DESC` is the rank order. Per-day top-1 on 800k rows, ties allowed because RANK() returns every row at position 1.

---

### Break down the requirements

#### Step 1: Bucket by date

Cast `started` to a date so 23:59 and 00:01 land in different buckets.

#### Step 2: Rank within each day

RANK() OVER (PARTITION BY DATE(started) ORDER BY rows_done DESC). Ties at the max all get rank 1.

#### Step 3: Keep rnk = 1

Window functions cannot live in WHERE, so filter rnk = 1 in the outer query and ORDER BY job_date.

---

### The solution

**FASTEST COMPLETION PER DAY**

```sql
WITH ranked AS (
  SELECT
    DATE(started) AS job_date,
    job_name,
    rows_done,
    RANK() OVER (
      PARTITION BY DATE(started)
      ORDER BY rows_done DESC
    ) AS rnk
  FROM batch_jobs
)
SELECT job_date, job_name, rows_done
FROM ranked
WHERE rnk = 1
ORDER BY job_date
```

> **Cost Analysis**
>
> One pass over 800k rows plus a partitioned sort. An index on (started, rows_done DESC) lets the planner stream partitions. ROW_NUMBER is cheaper but drops tied winners.

> **Interviewers Watch For**
>
> Choice of RANK vs ROW_NUMBER vs DENSE_RANK, casting `started` to a date, and knowing the CTE is required because window functions cannot appear in WHERE.

> **Common Pitfall**
>
> GROUP BY DATE(started) with MAX(rows_done) gives the right number but loses `job_name`. Joining back reintroduces the tie problem the window already solved.

---

### COMMON FOLLOW-UP QUESTIONS

## Common follow-up questions

- How would you break ties deterministically? _(Add a secondary key (job_id ASC) and switch to ROW_NUMBER so one row per day survives.)_
- What if in-flight jobs should be excluded? _(Filter `status = 'completed'` inside the CTE so unfinished rows do not win.)_
- Return slowest job per day too? _(Add a second RANK with ORDER BY rows_done ASC, filter on either rank outside.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/fastest_completion_per_day)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.