RANK vs DENSE_RANK vs ROW_NUMBER Practice

Q: When does it actually matter which ranking function I pick?

When ties exist on the ORDER BY column. ROW_NUMBER picks 1 winner. RANK ties them and leaves gaps. DENSE_RANK ties them and stays tight. The 3-cell decision tree above shows which to pick based on whether the prompt wants exactly 1 per group, top-N with ties, or the Nth distinct value.

Q: Why does ROW_NUMBER return different rows on different runs?

The ORDER BY isn't total. ROW_NUMBER OVER (PARTITION BY x ORDER BY y) is nondeterministic when 2 records share both x and y. Add a tiebreaker column (usually a primary key) at the end of ORDER BY: ROW_NUMBER OVER (PARTITION BY x ORDER BY y, id).

Q: What's the right function for 'find the Nth highest distinct salary'?

DENSE_RANK. The 3 candidate answers in the code block above show why ROW_NUMBER and RANK both fail under ties at the top. DENSE_RANK doesn't skip after ties, so DENSE_RANK = N always returns the Nth distinct value.

Q: Can I rank without PARTITION BY?

Yes. Omitting PARTITION BY ranks across the entire result set as 1 group. RANK() OVER (ORDER BY score DESC) returns global rank. PARTITION BY is only needed when you want separate rankings per group (per department, per category, per customer).

Q: What's PERCENT_RANK used for?

Percentile-style rank. PERCENT_RANK() returns (rank - 1) / (total_rows - 1), so the top row is 1.0 and the bottom is 0.0. Useful for percentile bucketing, performance ranges, and outlier detection. Rare in interviews but useful to recognize.

Q: What's the difference between RANK and DENSE_RANK with ties?

Both ties get the same rank. RANK then skips the next rank (1, 2, 2, 4); DENSE_RANK doesn't (1, 2, 2, 3). When the question is 'find the Nth row' RANK is correct; when the question is 'find the Nth distinct value' DENSE_RANK is correct.

3 SQL ranking functions look interchangeable until 2 rows tie. ROW_NUMBER picks a winner. RANK ties them and leaves a gap. DENSE_RANK ties them with no gap. Picking the wrong one is a silent bug: the query runs, returns rows, and the interviewer waits to see if you notice the difference. 8 graded problems below, organized around the 3-cell decision tree.

Open problem 1 Rank reference

Decision tree: 3 cells

Memorize this and you'll pick the right function on instinct, every time.

ROW_NUMBERExactly 1 per group

Use when the question is "exactly 1 row per partition": most recent order per customer, deduplication, pagination, single-row sample. ROW_NUMBER never ties.

Tells in the prompt

"Latest per X"
"Most recent per X"
"Pick one per X"
"Deduplicate by X"

RANKTies matter, gaps OK

Use when ties are real and a gap after them is the right semantics: leaderboards, competition rankings, "top N including ties". WHERE rnk <= 3 may return 4 rows if 2 tied at 2.

Tells in the prompt

"Top N including ties"
"Leaderboard"
"Competition ranking"
"Standings"

DENSE_RANKNth distinct value

Use when the question is about the Nth distinct value, not the Nth row: 2nd highest distinct salary, 3rd distinct price tier. DENSE_RANK doesn't skip after ties.

Tells in the prompt

"Nth highest distinct"
"Nth unique X"
"Nth tier"
"Top N distinct values"

How the 3 functions diverge under ties

Same input, same ORDER BY. The divergence on tied rows is the entire interview interest.

Same input, same ORDER BY score DESC, 3 functions

score

ROW_NUMBER

RANK

DENSE_RANK

3 rows tie at score 90. ROW_NUMBER picks an arbitrary order (bug if no tiebreaker). RANK skips ranks 3 and 4, jumps to 5. DENSE_RANK numbers stay tight: 1, 2, 3, 4.

'2nd highest distinct salary' has 4 plausible answers, 2 right ones

-- Prompt: "Find the 2nd highest distinct salary."
-- 3 plausible interpretations; the grader rejects the wrong one explicitly.

-- WRONG. Returns the 2nd row by salary descending, which equals the
-- highest salary when 2+ employees share it.
SELECT salary FROM employees
ORDER BY salary DESC LIMIT 1 OFFSET 1;

-- WRONG. RANK = 2 returns nothing if the top is tied (RANK skips to 3).
SELECT DISTINCT salary FROM (
  SELECT salary, RANK() OVER (ORDER BY salary DESC) AS r FROM employees
) t WHERE r = 2;

-- RIGHT. DENSE_RANK = 2 returns the 2nd distinct salary regardless of
-- ties at the top.
SELECT salary FROM (
  SELECT DISTINCT salary, DENSE_RANK() OVER (ORDER BY salary DESC) AS dr
  FROM employees
) t WHERE dr = 2;

-- ALSO RIGHT, terser. SELECT DISTINCT inside the subquery isn't needed
-- because DENSE_RANK already collapses ties.
WITH ranked AS (
  SELECT salary, DENSE_RANK() OVER (ORDER BY salary DESC) AS dr
  FROM employees
)
SELECT DISTINCT salary FROM ranked WHERE dr = 2;

2 wrong, 2 right. OFFSET 1 and RANK both fail under ties at the top. DENSE_RANK doesn't.

8 ranking practice problems

Each problem maps to a cell of the decision tree. The grader catches wrong-function selection by engineering ties into the seed data.

Latest order per customer (dedup)

ROW_NUMBER

Easy · ~8 min

orders(order_id, customer_id, ordered_at, total)

Return 1 row per customer: the most recent order.

Trap

Seed 4 has 2 orders per customer with identical ordered_at. Without an order_id tiebreaker, the returned row is nondeterministic.

2nd highest distinct salary

DENSE_RANK

Easy · ~10 min

employees(emp_id, salary)

Return the 2nd highest distinct salary value.

Trap

OFFSET 1 LIMIT 1 returns the top salary when it's tied. RANK skips past the tie. DENSE_RANK is the only correct choice.

Top 3 students per class, including ties

RANK

Easy-Medium · ~10 min

scores(student_id, class_id, score)

For each class, return all students whose score is in the top 3 including anyone tied at 3rd.

Trap

Seed 2 has 4 students tied at 3rd place in 1 class. ROW_NUMBER drops 3 of them; RANK with WHERE rnk <= 3 returns all 4 correctly.

Top 3 strictly, even when ties

ROW_NUMBER + tiebreaker

Medium · ~12 min

scores(student_id, class_id, score)

For each class, return exactly 3 students even when scores tie. Break ties on student_id ASC.

Trap

RANK or DENSE_RANK can return more than 3 rows under ties. Explicit tiebreaker policy is required, and the grader checks the alphabetical fallback.

Department with 3rd highest avg salary

DENSE_RANK after aggregation

Medium · ~12 min

employees(emp_id, department, salary)

Find the department with the 3rd highest average salary.

Trap

2 departments may tie at the top. RANK would skip to 3 and miss the actual 3rd distinct value; DENSE_RANK lands on it. Aggregate first, rank second.

Sales rep percentile rank

PERCENT_RANK

Medium-Hard · ~10 min

sales(rep_id, revenue)

For each rep, return their percent rank across all reps (0 to 1, where 1 is top performer).

Trap

PERCENT_RANK returns (rank - 1) / (n - 1). With 1 row, this is 0/0 = NULL; the grader's last seed is a 1-row partition that exposes this.

Composite multi-criteria ranking

ROW_NUMBER with composite ORDER BY

Hard · ~15 min

candidates(candidate_id, technical_score, behavioral_score, years_exp)

Rank candidates by technical DESC, then behavioral DESC, then years_exp DESC. One row per rank.

Trap

Seed 7 has 2 candidates identical on all 3 criteria. The grader requires a 4th-level tiebreaker on candidate_id; without it, the ranking is unstable.

Late-data-aware dedup

ROW_NUMBER with composite ordering

Hard · ~20 min

events(event_id, user_id, event_time, ingested_at)

Return the latest event per user. 'Latest' = event_time DESC, then ingested_at DESC (in case of duplicate emissions).

Trap

Seed 1 has events that arrived in the wrong order: ingested_at is after event_time by hours. Using ingested_at as the primary ORDER BY ranks them incorrectly.

How often ranking functions appear in interviews

Ranking function presence in interview SQL questions

ROW_NUMBER (incl. dedup pattern)

15%

PARTITION BY (across all 3 functions)

21%

RANK / DENSE_RANK

11%

PERCENT_RANK / CUME_DIST

Ranking function FAQ

When does it actually matter which ranking function I pick?+

When ties exist on the ORDER BY column. ROW_NUMBER picks 1 winner. RANK ties them and leaves gaps. DENSE_RANK ties them and stays tight. The 3-cell decision tree above shows which to pick based on whether the prompt wants exactly 1 per group, top-N with ties, or the Nth distinct value.

Why does ROW_NUMBER return different rows on different runs?+

The ORDER BY isn't total. ROW_NUMBER OVER (PARTITION BY x ORDER BY y) is nondeterministic when 2 records share both x and y. Add a tiebreaker column (usually a primary key) at the end of ORDER BY: ROW_NUMBER OVER (PARTITION BY x ORDER BY y, id).

What's the right function for 'find the Nth highest distinct salary'?+

DENSE_RANK. The 3 candidate answers in the code block above show why ROW_NUMBER and RANK both fail under ties at the top. DENSE_RANK doesn't skip after ties, so DENSE_RANK = N always returns the Nth distinct value.

Can I rank without PARTITION BY?+

Yes. Omitting PARTITION BY ranks across the entire result set as 1 group. RANK() OVER (ORDER BY score DESC) returns global rank. PARTITION BY is only needed when you want separate rankings per group (per department, per category, per customer).

What's PERCENT_RANK used for?+

Percentile-style rank. PERCENT_RANK() returns (rank - 1) / (total_rows - 1), so the top row is 1.0 and the bottom is 0.0. Useful for percentile bucketing, performance ranges, and outlier detection. Rare in interviews but useful to recognize.

What's the difference between RANK and DENSE_RANK with ties?+

Both ties get the same rank. RANK then skips the next rank (1, 2, 2, 4); DENSE_RANK doesn't (1, 2, 2, 3). When the question is 'find the Nth row' RANK is correct; when the question is 'find the Nth distinct value' DENSE_RANK is correct.

02 / Why practice

Open problem 1 (dedup pattern)

01
Active recall beats re-reading by 50%
Cognitive-science meta-reviews (Dunlosky et al., 2013) rank practice testing as a top-tier study technique, while re-reading and highlighting rank near the bottom
02
76% of hiring managers reject on the coding task, not the resume
From HackerRank's 2024 Developer Skills Report. Candidates who look strong on paper still fail the live screen if they haven't done timed, executable practice
03
Five problem shapes cover 80% of data engineer loops
Dedup, sessionization, top-N-per-group, slowly-changing dimensions, partition tricks. Writing the shapes by hand turns the unfamiliar into pattern recognition

Open problem 1

Adjacent topics

Rank Reference→

Full syntax, semantics, PARTITION BY patterns, and edge cases.

Window Functions→

Broader family: LAG, LEAD, SUM OVER, frame clauses.

ORDER BY Practice→

Deterministic tiebreakers, NULL placement, stable pagination.