SQL Window Functions Cheat Sheet

Most candidates memorize ROW_NUMBER and call themselves comfortable with window functions. Interviewers do not buy it. Window functions appear in roughly 35% of senior-level DE interviews analyzed, and the ones that actually separate strong candidates from weak ones are never the simple ranking questions. They are the frame clause questions, the LAG with a default value, the LAST_VALUE trap where the default frame silently stops at the current row.

35%
Senior rounds use them
21%
Use PARTITION BY
15%
Use ROW_NUMBER
9%
Aggregate windows

Window Function Syntax

PARTITION BY divides rows into groups. The function resets for each group. Omitting it treats the entire result set as one partition.

ORDER BY defines the row order within each partition. Required for ranking and offset functions. Optional for aggregate functions (but changes their behavior when present).

Frame clause defines which rows within the partition are included in the calculation. Only applies to aggregate functions and FIRST_VALUE / LAST_VALUE.

Window Function Syntax

function_name(args) OVER (
  [PARTITION BY partition_columns]
  [ORDER BY sort_columns [ASC|DESC]]
  [frame_clause]
)

Ranking Functions

Do not default to ROW_NUMBER just because it is the one you remember. The interviewer asking for 'the top three by salary per department' is usually checking whether you reach for DENSE_RANK or RANK when ties matter, and whether you know the difference. ROW_NUMBER with a tied ORDER BY produces nondeterministic output. That is a bug, not a feature, and it is the first thing a sharp interviewer probes.

ROW_NUMBER()

-- Number each employee within their department by salary
SELECT emp_name, department, salary,
  ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS rn
FROM employees
LIMIT 20;
-- rn: 1, 2, 3, 4 (no gaps, no ties)

Assigns a unique sequential integer to each row. Ties get different numbers (arbitrary tiebreak).

RANK()

-- Rank employees by salary (gaps after ties)
SELECT emp_name, department, salary,
  RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS rnk
FROM employees
LIMIT 20;
-- rnk: 1, 2, 2, 4 (gap: 3 is skipped)

Like ROW_NUMBER but ties get the same rank. Leaves gaps after ties.

DENSE_RANK()

-- Dense rank: no gaps after ties
SELECT emp_name, department, salary,
  DENSE_RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS drnk
FROM employees
LIMIT 20;
-- drnk: 1, 2, 2, 3 (no gap)

Like RANK but no gaps. Ties get the same rank. The next distinct value gets the next integer.

NTILE(n)

-- Divide employees into 4 salary quartiles
SELECT emp_name, salary,
  NTILE(4) OVER (ORDER BY salary) AS quartile
FROM employees
LIMIT 20;
-- quartile: 1, 1, 2, 2, 3, 3, 4, 4 (spread across rows)

Divides the partition into n approximately equal groups and assigns a group number (1 to n) to each row.

Aggregate Window Functions

Standard aggregate functions work as window functions when used with OVER(). Without ORDER BY, they compute over the entire partition. With ORDER BY, they compute a running aggregate (using the default frame).

SUM()

-- Running total of revenue, plus partition total and 7-row rolling sum
SELECT
  transaction_id,
  transaction_date,
  total_amount,
  SUM(total_amount) OVER (ORDER BY transaction_date ROWS UNBOUNDED PRECEDING) AS running_total,
  SUM(total_amount) OVER (PARTITION BY product_id) AS product_total,
  SUM(total_amount) OVER (ORDER BY transaction_date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) AS rolling_7
FROM transactions
LIMIT 20;

AVG()

-- Running average, 5-row moving average, and department average
SELECT
  emp_name,
  department,
  hire_date,
  salary,
  AVG(salary) OVER (ORDER BY hire_date ROWS UNBOUNDED PRECEDING) AS running_avg,
  AVG(salary) OVER (ORDER BY hire_date ROWS BETWEEN 4 PRECEDING AND CURRENT ROW) AS moving_avg_5,
  AVG(salary) OVER (PARTITION BY department) AS dept_avg
FROM employees
LIMIT 20;

COUNT()

-- Running count, partition row count, and non-null count in a window
SELECT
  emp_name,
  department,
  hire_date,
  COUNT(*) OVER (ORDER BY hire_date ROWS UNBOUNDED PRECEDING) AS running_count,
  COUNT(*) OVER (PARTITION BY department) AS dept_count,
  COUNT(manager_id) OVER (PARTITION BY department) AS dept_with_manager
FROM employees
LIMIT 20;

MIN() / MAX()

-- Running minimum, group maximum, and 3-row sliding window min
SELECT
  emp_name,
  department,
  hire_date,
  salary,
  MIN(salary) OVER (ORDER BY hire_date ROWS UNBOUNDED PRECEDING) AS running_min,
  MAX(salary) OVER (PARTITION BY department) AS dept_max,
  MIN(salary) OVER (ORDER BY hire_date ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS sliding_min
FROM employees
LIMIT 20;

Offset Functions

Offset functions access a value from a different row relative to the current row. They require ORDER BY and do not use a frame clause (except FIRST_VALUE and LAST_VALUE).

LAG(column, offset, default)

-- Previous transaction's amount and the change from it
SELECT
  transaction_id,
  transaction_date,
  total_amount,
  LAG(total_amount, 1, 0) OVER (ORDER BY transaction_date) AS prev_amount,
  total_amount - LAG(total_amount, 1, 0) OVER (ORDER BY transaction_date) AS change_from_prev
FROM transactions
LIMIT 20;

Returns the value from a previous row. Offset defaults to 1. Default value used when there is no previous row.

LEAD(column, offset, default)

-- Next transaction's amount per user (session-style analysis)
SELECT
  user_id,
  transaction_date,
  total_amount,
  LEAD(total_amount, 1) OVER (PARTITION BY user_id ORDER BY transaction_date) AS next_amount
FROM transactions
LIMIT 20;

Returns the value from a subsequent row. Mirror of LAG.

FIRST_VALUE(column)

-- Earliest-hired employee name in each department
SELECT
  emp_name,
  department,
  hire_date,
  FIRST_VALUE(emp_name) OVER (
    PARTITION BY department ORDER BY hire_date
    ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
  ) AS first_hired
FROM employees
LIMIT 20;

Returns the first value in the window frame. Respects the frame clause.

LAST_VALUE(column)

-- Latest-hired employee name per department (needs the full frame)
SELECT
  emp_name,
  department,
  hire_date,
  LAST_VALUE(emp_name) OVER (
    PARTITION BY department ORDER BY hire_date
    ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
  ) AS last_hired
FROM employees
LIMIT 20;

Returns the last value in the window frame. Important: the default frame ends at CURRENT ROW, so you usually need to specify ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING.

NTH_VALUE(column, n)

-- Second highest salary per department
SELECT
  emp_name,
  department,
  salary,
  NTH_VALUE(salary, 2) OVER (
    PARTITION BY department ORDER BY salary DESC
    ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
  ) AS second_highest
FROM employees
LIMIT 20;

Returns the nth value in the window frame. Useful for getting the second-highest, third-most-recent, etc.

Distribution Functions

PERCENT_RANK()

SELECT emp_name, salary,
  PERCENT_RANK() OVER (ORDER BY salary) AS pct_rank
FROM employees
LIMIT 20;
-- Salary at the 50th percentile gets ~0.5

Relative rank as a percentage: (rank - 1) / (total_rows - 1). Returns 0.0 for the first row, 1.0 for the last.

CUME_DIST()

SELECT emp_name, salary,
  CUME_DIST() OVER (ORDER BY salary) AS cume_dist
FROM employees
LIMIT 20;
-- 0.25 means 25% of rows have a salary <= this row's salary

Cumulative distribution: fraction of rows with values less than or equal to the current row. Always returns a value between 0 (exclusive) and 1 (inclusive).

Frame Clause Reference

The frame clause defines which rows are included in the window calculation. It only applies to aggregate functions and FIRST_VALUE / LAST_VALUE / NTH_VALUE.

Interview note: The most common interview mistake with window functions is forgetting the default frame. When ORDER BY is present, the default is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. This means LAST_VALUE returns the current row, not the actual last row. Always specify the frame explicitly when using FIRST_VALUE, LAST_VALUE, or NTH_VALUE.

Frame Clause Syntax

{ROWS | RANGE | GROUPS} BETWEEN
  {UNBOUNDED PRECEDING | n PRECEDING | CURRENT ROW}
AND
  {CURRENT ROW | n FOLLOWING | UNBOUNDED FOLLOWING}

Frame Clause Quick Reference

FrameMeaningUse Case
ROWS UNBOUNDED PRECEDINGFrom partition start to current rowRunning total, running count
ROWS BETWEEN 6 PRECEDING AND CURRENT ROWCurrent row + 6 rows before7-day rolling average
ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWINGPrevious row, current, next rowSmoothing, 3-point average
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWINGEntire partitionPartition total on every row
ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWINGCurrent row to partition endReverse running total

PARTITION BY + ORDER BY Combos

PatternOVER ClauseResult
Global totalOVER ()Same value on every row
Group totalOVER (PARTITION BY dept)Same value within each dept
Global running totalOVER (ORDER BY date)Cumulative across all rows
Group running totalOVER (PARTITION BY dept ORDER BY date)Cumulative within each dept
Group rankingOVER (PARTITION BY dept ORDER BY salary DESC)Rank resets per dept

Common Interview Patterns

Top-N Per Group

WITH ranked AS (
  SELECT *,
    ROW_NUMBER() OVER (
      PARTITION BY department ORDER BY salary DESC
    ) AS rn
  FROM employees
)
SELECT * FROM ranked WHERE rn <= 3;

Year-over-Year Comparison

WITH monthly_revenue AS (
  SELECT
    substr(transaction_date, 1, 7) AS month,
    SUM(total_amount) AS revenue
  FROM transactions
  GROUP BY substr(transaction_date, 1, 7)
)
SELECT
  month,
  revenue,
  LAG(revenue, 12) OVER (ORDER BY month) AS prev_year,
  revenue - LAG(revenue, 12) OVER (ORDER BY month) AS yoy_change
FROM monthly_revenue;

Percentage of Group Total

SELECT
  department,
  emp_name,
  salary,
  ROUND(
    100.0 * salary / SUM(salary) OVER (PARTITION BY department),
    1
  ) AS pct_of_dept
FROM employees;

Consecutive Streaks

-- Find consecutive days of positive revenue
WITH daily_metrics AS (
  SELECT
    substr(transaction_date, 1, 10) AS day,
    SUM(total_amount) AS revenue
  FROM transactions
  GROUP BY substr(transaction_date, 1, 10)
),
flagged AS (
  SELECT
    day,
    revenue,
    julianday(day) - ROW_NUMBER() OVER (ORDER BY day) AS grp
  FROM daily_metrics
  WHERE revenue > 0
)
SELECT MIN(day) AS streak_start, MAX(day) AS streak_end,
  COUNT(*) AS streak_length
FROM flagged
GROUP BY grp
ORDER BY streak_length DESC;

Window Functions FAQ

What is a window function in SQL?+
A window function computes a value for each row based on a set of rows related to that row (the 'window'). Unlike GROUP BY, which collapses rows into groups, window functions keep every row in the result and add the computed value as a new column. The window is defined by PARTITION BY (which rows to include) and ORDER BY (how to order them within the partition). Window functions include ranking functions (ROW_NUMBER, RANK, DENSE_RANK), aggregate functions (SUM, AVG, COUNT, MIN, MAX), offset functions (LAG, LEAD, FIRST_VALUE, LAST_VALUE), and distribution functions (NTILE, PERCENT_RANK, CUME_DIST).
What is the difference between ROWS and RANGE in a frame clause?+
ROWS counts individual rows. RANGE considers the value of the ORDER BY expression. With ROWS BETWEEN 2 PRECEDING AND CURRENT ROW, the frame always includes exactly 3 rows. With RANGE BETWEEN 2 PRECEDING AND CURRENT ROW, the frame includes all rows whose ORDER BY value is within 2 of the current row's value, which could be more or fewer than 3 rows depending on duplicates and gaps. For most use cases, ROWS is more predictable. RANGE is useful when you want value-based windows (like 'all rows within 7 days of the current row').
What is the default frame clause when ORDER BY is specified?+
When ORDER BY is present and no explicit frame clause is given, the default is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. This means the window extends from the first row in the partition up to and including all rows with the same ORDER BY value as the current row. For SUM, this produces a running total. Be aware that RANGE (not ROWS) is the default: if there are ties in the ORDER BY column, all tied rows get the same running total. To get a strict row-by-row cumulative sum, explicitly specify ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW.
02 / Why practice

The 35% of senior rounds you can't bluff through

  1. 01

    Active recall beats re-reading by 50%

    Cognitive-science meta-reviews (Dunlosky et al., 2013) rank practice testing as a top-tier study technique, while re-reading and highlighting rank near the bottom

  2. 02

    76% of hiring managers reject on the coding task, not the resume

    From HackerRank's 2024 Developer Skills Report. Candidates who look strong on paper still fail the live screen if they haven't done timed, executable practice

  3. 03

    Five problem shapes cover 80% of data engineer loops

    Dedup, sessionization, top-N-per-group, slowly-changing dimensions, partition tricks. Writing the shapes by hand turns the unfamiliar into pattern recognition

Related Guides