# Runner-Up Cost Without ORDER BY

> The second highest. Without sorting.

Canonical URL: <https://datadriven.io/problems/runner_up_cost_without_order_by>

Domain: SQL · Difficulty: medium · Seniority: L3

## Problem

The FinOps team already knows the peak cost outlier and wants to understand the runner-up. What is the second highest cloud cost amount on record?

## Worked solution and explanation

### Why this problem exists in real interviews

Interviewers use this cloud cost scenario to test whether you can combine an aggregate with a scalar subquery against the `cloud_costs` table. The focus is on the `amount` column: how you exclude the maximum, and what your query returns when the top value is tied.

> **Trick to Solving**
>
> Finding the 'second highest' or 'runner-up' is a classic pattern. Avoid `ORDER BY ... LIMIT 1 OFFSET 1` when the top value can be tied: it returns the peak again instead of the runner-up.
>
> 1. Use a scalar subquery to find the maximum
> 2. Filter the outer query to values strictly less than that maximum
> 3. Take the MAX of the filtered set

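The tie problem behind that advice can be reproduced in a few lines. This is a minimal sketch using made-up sample rows in an inline CTE named `cloud_costs` (it shadows the real table for the duration of the query), with the peak amount deliberately duplicated:

```sql
-- Hypothetical sample data: the peak amount (500) appears twice.
WITH cloud_costs(amount) AS (
    VALUES (500), (500), (320), (120)
)
SELECT amount AS second_highest
FROM cloud_costs
ORDER BY amount DESC
LIMIT 1 OFFSET 1;
-- Returns 500, a duplicate of the peak, not the runner-up (320).
```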
---

### Break down the requirements

#### Step 1: Filter out the maximum

The WHERE clause keeps only rows whose `amount` is strictly less than the overall maximum. The strict `<` matters: it drops every row tied at the peak, so a duplicate of the top value cannot be returned as the runner-up.

#### Step 2: Use a subquery to find the reference value

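The scalar subquery `(SELECT MAX(amount) FROM cloud_costs)` returns a single value, the overall maximum, that the outer query filters against. Because it does not reference the outer query, the database can evaluate it once rather than once per row, and it avoids a self-join.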

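For comparison, the self-join formulation that the scalar subquery replaces might look like the sketch below. The sample rows are made up and the aliases `a` and `b` are illustrative. It returns the same answer, but it pairs every row with every larger row, so the intermediate result can grow quadratically with table size:

```sql
-- Hypothetical sample data standing in for the real table.
WITH cloud_costs(amount) AS (
    VALUES (500), (500), (320), (120)
)
SELECT MAX(a.amount) AS second_highest
FROM cloud_costs AS a
JOIN cloud_costs AS b
  ON a.amount < b.amount;   -- keep rows that have at least one larger amount
-- Returns 320.
```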
---

### The solution

**Filter below the maximum, then take the MAX of what remains**

```sql
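-- Highest amount that is strictly below the overall maximum.
-- Returns NULL when the table has fewer than two distinct amounts.
SELECT MAX(amount) AS second_highest
FROM cloud_costs
WHERE amount < (SELECT MAX(amount) FROM cloud_costs);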
```
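
To check the behavior on ties and on the edge case with no runner-up, the same query can be run against inline sample data. The rows below are invented for illustration, and the CTE named `cloud_costs` stands in for the real table:

```sql
-- Case 1: tied peak. The strict < removes both 500.00 rows.
WITH cloud_costs(amount) AS (
    VALUES (500.00), (500.00), (320.00), (120.00)
)
SELECT MAX(amount) AS second_highest
FROM cloud_costs
WHERE amount < (SELECT MAX(amount) FROM cloud_costs);
-- Returns 320.00

-- Case 2: only one distinct amount, so no row passes the filter.
WITH cloud_costs(amount) AS (
    VALUES (500.00), (500.00)
)
SELECT MAX(amount) AS second_highest
FROM cloud_costs
WHERE amount < (SELECT MAX(amount) FROM cloud_costs);
-- Returns a single row containing NULL
```

If the caller needs something other than NULL in the second case, wrapping the outer `MAX` in `COALESCE` is one option, though what the fallback should be is a business decision.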

> **Cost Analysis**
>
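> With ~10M rows and no index on `amount`, the query scans `cloud_costs` twice: once for the subquery's `MAX(amount)` and once for the outer filter. A B-tree index on `amount` lets Postgres answer both `MAX` lookups from the high end of the index instead of reading the whole table.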

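One way to test the indexing claim is to create the index and compare plans before and after. The index name `idx_cloud_costs_amount` is a hypothetical choice, and the snippet assumes the `cloud_costs` table already exists. The exact plan depends on the Postgres version, table statistics, and data distribution, so treat this as a sketch: with the index in place, the plan would typically show index scans for both `MAX` lookups rather than sequential scans.

```sql
-- Hypothetical index name; a plain B-tree on the filtered column.
CREATE INDEX idx_cloud_costs_amount ON cloud_costs (amount);

-- Inspect the plan. ANALYZE actually executes the query,
-- and BUFFERS reports how many pages were read.
EXPLAIN (ANALYZE, BUFFERS)
SELECT MAX(amount) AS second_highest
FROM cloud_costs
WHERE amount < (SELECT MAX(amount) FROM cloud_costs);
```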
> **Interviewers Watch For**
>
> Whether you reach for a subquery or a self-join and can explain the tradeoffs, and whether you say what the query returns when the maximum is tied or no runner-up exists.

> **Common Pitfall**
>
> Returning extra columns that the prompt did not ask for, or using the wrong column alias, causes a grading mismatch even when the logic is correct.

---

## Common follow-up questions

- What would happen to your result if `cloud_costs.cost_id` contained duplicate values that you did not expect? _(Tests whether the candidate considers data quality issues in `cost_id` and uses DISTINCT or deduplication where needed.)_
- `cloud_costs.cost_id` has roughly 10,000,000 distinct values. What index strategy would you use to avoid a full scan on `cloud_costs`? _(Tests indexing knowledge specific to the high-cardinality `cost_id` column in `cloud_costs`.)_
- How would you modify this query if the business logic required grouping by both `cost_id` and `provider` instead of just one? _(Tests ability to adapt the query structure to changing requirements.)_

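As one possible direction for the last follow-up, the sketch below computes the runner-up per `provider` with a correlated subquery. It assumes `provider` is a column on `cloud_costs`, as the question implies, and it leaves `cost_id` out of the grouping (grouping by an identifier that is unique per row would leave one row per group). The sample rows are invented:

```sql
-- Hypothetical sample data; only provider and amount are modeled.
WITH cloud_costs(provider, amount) AS (
    VALUES ('aws', 500), ('aws', 500), ('aws', 320),
           ('gcp', 210), ('gcp', 90),
           ('azure', 75)
)
SELECT c.provider, MAX(c.amount) AS second_highest
FROM cloud_costs AS c
WHERE c.amount < (
    SELECT MAX(m.amount)
    FROM cloud_costs AS m
    WHERE m.provider = c.provider   -- peak for this row's provider only
)
GROUP BY c.provider;
-- aws -> 320, gcp -> 90; azure has a single amount and is absent from the result.
```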
## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/runner_up_cost_without_order_by)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.