# Content Sorted by Duration

> The catalog, sorted by length.

Canonical URL: <https://datadriven.io/problems/content_sorted_by_duration>

Domain: SQL · Difficulty: easy · Seniority: L3

## Problem

The content team wants to see how each item compares to the longest piece in the catalog. Return all content items with their full details, duration as a decimal, and the difference in seconds from the longest item, listed from longest to shortest.

## Worked solution and explanation

### Why this problem exists in real interviews

This tests window functions for computing a difference from a global maximum. The requirement to show the gap from the longest item probes whether you know MAX() OVER () for a table-wide computation alongside row-level output.

---

### Break down the requirements

#### Step 1: Select all content fields

Return content details with `duration_seconds` cast as decimal if needed.

#### Step 2: Compute difference from maximum

`MAX(duration_seconds) OVER () - duration_seconds` gives the gap from the longest item.

#### Step 3: Sort by duration descending

`ORDER BY duration_seconds DESC` as specified.

---

### The solution

**Window function for global max comparison**

```sql
SELECT
    *,
    duration_seconds::DECIMAL AS duration_decimal,
    MAX(duration_seconds) OVER () - duration_seconds AS diff_from_longest
FROM content_items
ORDER BY duration_seconds DESC
```

> **Cost Analysis**
>
> Single scan of 3M rows. The MAX() OVER () window function computes the global maximum in one pass. No GROUP BY or subquery needed.

> **Interviewers Watch For**
>
> Whether the candidate uses `MAX() OVER ()` (windowless partition = entire table) vs a subquery `(SELECT MAX(duration_seconds) FROM content_items)`. Both work, but the window function approach is cleaner.

> **Common Pitfall**
>
> If duration_seconds is NULL for some rows, `MAX() OVER ()` still returns the correct max (ignoring NULLs), but the subtraction produces NULL. Use COALESCE if you want 0 instead.

---

## Common follow-up questions

- What is the difference between MAX() OVER () and a scalar subquery? _(Both compute the same value. The window function avoids a separate scan. Tests optimizer awareness.)_
- How would you also show the rank of each item by duration? _(Add RANK() OVER (ORDER BY duration_seconds DESC).)_
- What if you needed the difference from the average instead of the max? _(Replace MAX with AVG in the window function.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/content_sorted_by_duration)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.