Window Functions: Intermediate

Robinhood shows every user a 30-day moving average price line and a daily portfolio change percentage, and those rolling calculations are computed in SQL using LAG, LEAD, and ROWS BETWEEN frame specifications rather than in application code. By computing trailing averages and period-over-period deltas directly in the database, Robinhood's dashboards update in milliseconds instead of seconds, even as millions of users simultaneously check their portfolios during volatile trading sessions. The intermediate window function techniques in this lesson are the exact patterns powering that kind of real-time financial analytics.

LAG and LEAD

Daily Life
Interviews

Compare each row to the one before or after

LAG and LEAD let you look backward and forward across rows without leaving your query. Every dataset has a first row with nothing before it, and that boundary is where period-over-period comparisons can quietly break.

THE CASE OF THE DISAPPEARING GROWTH

Nadia Reyes came in first thing this morning. Her growth report covers six months of revenue data. January is blank where every other month has a number. The source data is clean, no errors, no missing records. Nobody can explain the gap.

Six months. Six revenue figures. No gaps. No NULLs. The source data is clean.

monthly_revenue
month_nummonthrevenue
1Jan45000
2Feb52000
3Mar48000
4Apr61000
5May58000
6Jun72000

LAG and LEAD are window functions that reach backward or forward to access values from other rows without a self-join. They are essential for comparing a row to the row before or after it.

Think of your result set as a spreadsheet. When you are on row 5, LAG lets you peek at row 4 (or row 3, or any row above), and LEAD lets you peek at row 6 (or row 7, or any row below). The database does not need to join the table to itself or use a correlated subquery. It simply remembers what it already processed and knows what comes next.

LAG and LEAD Fundamentals

Understanding the syntax and behavior of these functions is essential before applying them to real analytics scenarios.
//

The Syntax

1SELECT
2 sale_month,
3 revenue,
4 LAG(revenue, 1) OVER (
5 ORDER BY sale_month
6 ) AS prev_revenue
7FROM transactions

LAG accepts up to three arguments: the column to read, an optional offset (defaults to 1), and an optional default value for when there is no previous row. LEAD works identically but looks forward instead of backward.

//

Why Not a Self-Join?

Before window functions existed, comparing a row to its predecessor required joining a table to itself. You would match each row to the previous row by offsetting a date or sequence column. This works, but it has real costs.
Self-Join Approach
  • Scans the table twice
  • Silently drops rows with missing data
  • Complex join conditions to maintain
LAG / LEAD
  • Single ordered pass over data
  • Returns NULL or your default for gaps
  • Clean, readable syntax
//

Month-Over-Month Comparison

The most common use of LAG is period-over-period comparison. By subtracting the previous period value from the current one, you get the change between periods in a single pass.

1SELECT
2 sale_month,
3 revenue,
4 LAG(revenue, 1) OVER (
5 ORDER BY sale_month
6 ) AS prev_month,
7 revenue - LAG(revenue, 1) OVER (
8 ORDER BY sale_month
9 ) AS mom_change
10FROM transactions
Result
sale_monthrevenueprev_monthmom_change
2025-0142000NULLNULL
2025-0245000420003000
2025-033900045000-6000
2025-04510003900012000

The first row has NULL for prev_month because there is no row before January. The arithmetic with NULL also produces NULL for mom_change. You can handle this with COALESCE or by providing a default value in the LAG call itself.

Practical Applications

Beyond simple comparisons, LAG and LEAD unlock powerful patterns for forecasting, change detection, and anomaly identification.

//

LEAD for Change Detection

While LAG is used to look backward, LEAD is equally valuable for forward-looking analysis. A common use case is detecting upcoming changes. For example, if you have a table of subscription plan changes, LEAD can show what plan a customer will switch to next, letting you identify potential downgrades before they happen.

1SELECT
2 customer_id,
3 plan_name,
4 change_date,
5 LEAD(plan_name, 1) OVER (
6 PARTITION BY customer_id
7 ORDER BY change_date
8 ) AS next_plan,
9 LEAD(change_date, 1) OVER (
10 PARTITION BY customer_id
11 ORDER BY change_date
12 ) AS next_change_date
13FROM plan_changes
Result
customer_idplan_namechange_datenext_plannext_change_date
C1Premium2025-01-01Basic2025-04-01
C1Basic2025-04-01NULLNULL
C2Basic2025-02-15Premium2025-03-15
C2Premium2025-03-15NULLNULL

Customer C1 is about to downgrade from Premium to Basic. Customer C2 upgraded from Basic to Premium. The NULL values in next_plan indicate the most recent plan, the one they are still on. This kind of forward-looking analysis powers churn prediction models and retention campaigns.

//

Detecting Sequential Gaps

LAG is also powerful for detecting gaps or anomalies in sequences. If you expect events to occur daily, you can flag days where the gap between consecutive events is larger than expected. This is critical in monitoring pipelines where a missing day of data could indicate a broken ETL job or a sensor failure.

1SELECT
2 event_date,
3 sensor_id,
4 LAG(event_date, 1) OVER (
5 PARTITION BY sensor_id
6 ORDER BY event_date
7 ) AS prev_date,
8 DATE_DIFF(
9 'day',
10 LAG(event_date, 1) OVER (
11 PARTITION BY sensor_id
12 ORDER BY event_date
13 ),
14 event_date
15 ) AS days_since_last
16FROM sensor_readings
Result
event_datesensor_idprev_datedays_since_last
2025-03-01S1NULLNULL
2025-03-02S12025-03-011
2025-03-05S12025-03-023
2025-03-06S12025-03-051

The row with days_since_last = 3 reveals a gap: sensor S1 skipped March 3rd and 4th. You could wrap this in a WHERE clause (via a subquery) to filter only the rows where the gap exceeds your threshold. In a production pipeline, this pattern feeds automated alerting systems that notify engineers when data stops flowing.

TIP
LAG and LEAD always require ORDER BY inside the OVER clause. Without ordering, "previous" and "next" have no meaning. The database will reject the query or produce unpredictable results.
//

Larger Offsets

The offset parameter lets you look further back or ahead. LAG(revenue, 12) looks 12 rows back, useful for year-over-year comparison with monthly data:

1SELECT
2 sale_month,
3 revenue,
4 LAG(revenue, 12) OVER (
5 ORDER BY sale_month
6 ) AS same_month_last_year,
7 revenue - LAG(revenue, 12) OVER (
8 ORDER BY sale_month
9 ) AS yoy_change
10FROM transactions
Result
sale_monthrevenuesame_month_last_yearyoy_change
2024-0138000NULLNULL
2024-0241000NULLNULL
2025-0142000380004000
2025-0245000410004000

The first 12 rows will have NULL for same_month_last_year because there are no rows 12 positions back yet. Once you have a full year of data, every row gets a comparison. Year-over-year analysis is one of the most requested metrics in business intelligence dashboards, and LAG with offset 12 is by far the cleanest way to compute it.

> Complete this query to rank metrics by value, with ties sharing the same rank and no gaps in the numbering.

SELECT
  metric_name,
  metric_value,
  ___() OVER (
    ORDER BY metric_value ___
  ) AS ranking
FROM employee_metrics
DESC
ASC
DENSE_RANK
RANK

DENSE_RANK is the correct choice whenever you want to count how many distinct performance levels exist above a given row, because it assigns consecutive integers with no gaps even when multiple rows share the same value.

Sorting in descending order puts the highest scores at rank 1, which matches the intuitive meaning of "first place" and is the convention used in leaderboards, competitive rankings, and compensation benchmarking tools.

LAG and LEAD are complementary to ranking functions because once you have ranked rows you can use LAG to compare each row to the one immediately above or below it in the ranking, which enables tier-boundary analysis without additional joins.

FIRST, LAST, NTH VALUE

Daily Life
Interviews

Pull specific values from a window

FIRST_VALUE, LAST_VALUE, and NTH_VALUE let you pull specific positional values from a window. They answer questions like "what was the first order amount in each category?" or "what was the third highest salary in each department?"

These functions are different from LAG and LEAD in an important way. LAG and LEAD access rows relative to the current row (1 row back, 2 rows forward). Positional functions access rows relative to the window frame itself (the first row in the frame, the last row, or the Nth row). This distinction matters because the frame can be the entire partition or just a sliding subset of it.

Common Use Cases
  • FIRST_VALUE: First purchase date per customer (acquisition dates)
  • LAST_VALUE: Last login time per user (activity monitoring)
  • NTH_VALUE: Third highest sale per region (outlier detection)

Positional Function Basics

Each positional function accesses a specific row within the window frame. Understanding their syntax and default behaviors is essential for correct usage.
//

FIRST_VALUE

FIRST_VALUE returns the value from the first row in the window frame. This is commonly used to compare every row against the starting point. Unlike LAST_VALUE, FIRST_VALUE works correctly with the default frame because the first row of the frame is always included.

1SELECT
2 employee_name,
3 department,
4 salary,
5 FIRST_VALUE(salary) OVER (
6 PARTITION BY department
7 ORDER BY salary DESC
8 ) AS top_salary,
9 salary - FIRST_VALUE(salary) OVER (
10 PARTITION BY department
11 ORDER BY salary DESC
12 ) AS gap_from_top
13FROM employee_metrics
Result
employee_namedepartmentsalarytop_salarygap_from_top
AliceEngineering1500001500000
BobEngineering130000150000-20000
CarolEngineering120000150000-30000
DianaSales1100001100000
EveSales95000110000-15000
Every employee can see how far their salary is from the highest in their department. The person with the top salary has a gap of zero. This pattern is commonly used in compensation analysis to show each employee their relative position without revealing other individuals' exact salaries.
//

The LAST_VALUE Frame Trap

LAST_VALUE has a subtle but critical gotcha that catches nearly everyone the first time. The default window frame is ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. This means the "last value" in the frame is always the current row itself.

To understand why this happens, picture the frame as a sliding window that starts at the first row of the partition and ends at the current row. When the database processes row 1, the frame contains only row 1, so the last value is row 1. When it processes row 2, the frame contains rows 1 and 2, so the last value is row 2. The frame always ends at "where you are now," so LAST_VALUE just echoes the current row.

employeesalarydefault_lasttrue_last
Alice150000150000120000
Bob130000130000120000
Carol120000120000120000

Notice how default_last just echoes each row's own salary? That is because the frame ends at CURRENT ROW, so the "last" value is always the current row. To get the actual last value in the entire partition, you must explicitly expand the frame:

1SELECT
2 employee_name,
3 salary,
4 LAST_VALUE(salary) OVER (
5 ORDER BY salary DESC
6 ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
7 ) AS lowest_salary
8FROM employee_metrics
Result
employee_namesalarylowest_salary
Alice15000095000
Bob13000095000
Carol12000095000
Diana11000095000
Eve9500095000

By specifying ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING, the frame spans the entire partition. Now LAST_VALUE correctly returns the last value (the lowest salary when ordered descending).

Whenever you use LAST_VALUE or NTH_VALUE, always add ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING unless you specifically want the frame to end at the current row. This is one of the most frequently tested window function concepts in interviews.

//

NTH_VALUE

NTH_VALUE returns the value from the Nth row in the window frame. It is useful when you need a specific position, such as the second highest or third most recent.

1SELECT
2 department,
3 employee_name,
4 salary,
5 NTH_VALUE(salary, 2) OVER (
6 PARTITION BY department
7 ORDER BY salary DESC
8 ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
9 ) AS second_highest
10FROM employee_metrics
Result
departmentemployee_namesalarysecond_highest
EngineeringAlice150000130000
EngineeringBob130000130000
EngineeringCarol120000130000
SalesDiana11000095000
SalesEve9500095000

Notice the full frame specification again. Without it, NTH_VALUE suffers from the same trap as LAST_VALUE: the first row in each partition would return NULL for second_highest because the frame at that point only contains one row.

> Complete this query to show each employee's salary alongside the highest salary in their department.

SELECT
  department,
  metric_name,
  metric_value,
  ___(___) OVER (
    PARTITION BY department
    ORDER BY metric_value DESC
  ) AS top_value
FROM employee_metrics
FIRST_VALUE
LAST_VALUE
metric_value
NTH_VALUE

FIRST_VALUE with ORDER BY DESC gives the maximum value, while LAST_VALUE needs an explicit frame clause to see the entire partition.

Always add ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING when using LAST_VALUE or NTH_VALUE to avoid unexpected NULLs.

NTH_VALUE is useful for specific positions like second highest or third most recent, but also requires an explicit full-partition frame.

Frame Specifications

Daily Life
Interviews

Define exactly which rows a window covers

Frame specifications define exactly which rows participate in a window calculation. Every window function operates on a frame, and understanding frames is the key to unlocking the full power of window functions.
Up to this point, you have been relying on the default frame, which extends from the beginning of the partition to the current row. That default works for running totals but fails for many other patterns. To build moving averages, trailing windows, and centered calculations, you need to take explicit control of the frame boundaries.

Frame Syntax and Types

Frame clauses follow a consistent syntax pattern that gives you fine-grained control over exactly which rows are included in each calculation.
//

ROWS BETWEEN Syntax

The frame clause follows ORDER BY inside the OVER clause. The syntax is ROWS BETWEEN followed by a start bound and an end bound:

UNBOUNDED PRECEDING
UNBOUNDED PRECEDING
Start from the very first row in the partition.
N PRECEDING
N PRECEDING
Start N rows before the current row.
CURRENT ROW
CURRENT ROW
The current row being processed.
N FOLLOWING
N FOLLOWING
End N rows after the current row.
UNBOUNDED FOLLOWING
UNBOUNDED FOLLOWING
Extend to the very last row in the partition.
//

ROWS vs RANGE

SQL offers two frame types: ROWS and RANGE. They behave differently when your ORDER BY column contains duplicate values, which is common in real data.

ROWS counts physical rows. "3 PRECEDING" means exactly 3 rows above the current one, regardless of the values in those rows. RANGE, on the other hand, groups by value. "3 PRECEDING" with RANGE means all rows whose ORDER BY value is within 3 units of the current row's value. If multiple rows share the same value, RANGE treats them as a single logical group.

1SELECT
2 sale_date,
3 revenue,
4 SUM(revenue) OVER (
5 ORDER BY sale_date
6 ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
7 ) AS rolling_sum
8FROM transactions
ROWSRANGE
ROWS
Row position
Fixed sliding windows
RANGE
Column value
Calendar-based windows

In practice, ROWS is used far more often than RANGE. ROWS is simpler to reason about, more predictable, and more widely supported across database engines. Use RANGE only when you specifically need value-based grouping, such as computing a sum of all transactions within a dollar amount range of the current row.

//

Common Frame Patterns

1SELECT
2 sale_month,
3 revenue,
4 SUM(revenue) OVER (
5 ORDER BY sale_month
6 ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
7 ) AS cumulative
8FROM transactions

Real-World Frame Patterns

Moving averages and trailing calculations are ubiquitous in analytics. These patterns smooth out noise and reveal underlying trends in time-series data.
//

Sliding Window Average

Moving averages smooth out noise in time-series data. A 3-day moving average at each point takes the average of the previous day, current day, and next day. This is called a centered moving average because the current row is in the middle of the window.
1SELECT
2 sale_date,
3 revenue,
4 AVG(revenue) OVER (
5 ORDER BY sale_date
6 ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING
7 ) AS moving_avg_3
8FROM daily_revenue
Result
sale_daterevenuemoving_avg_3
2025-03-0112001350
2025-03-0215001400
2025-03-0315001433
2025-03-0413001367
2025-03-0513001300

The first row only has 2 values in its frame (itself and the next row) because there is no preceding row. The last row similarly only has 2 values. The AVG function automatically handles partial frames by averaging whatever rows are present. This edge behavior is important to understand: moving averages at the boundaries of your data are based on fewer data points and may be less reliable.

//

7-Day Trailing Average

A trailing window only looks backward, which is useful when you cannot use future data (such as real-time dashboards where tomorrow has not happened yet). The 7-day trailing average is one of the most widely used patterns in business analytics.
Here is how the frame works for each row. On day 1, the frame contains only day 1 (no preceding rows exist). On day 2, it contains days 1 and 2. By day 7, the frame is full: it contains the current day plus the 6 preceding days. From day 7 onward, the frame always contains exactly 7 rows, sliding forward one day at a time.
1SELECT
2 sale_date,
3 revenue,
4 AVG(revenue) OVER (
5 ORDER BY sale_date
6 ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
7 ) AS trailing_7_day_avg,
8 SUM(revenue) OVER (
9 ORDER BY sale_date
10 ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
11 ) AS trailing_7_day_total
12FROM daily_revenue
Result
sale_daterevenuetrailing_7_day_avgtrailing_7_day_total
2025-03-01120012001200
2025-03-02150013502700
2025-03-03150014004200
2025-03-07180014578700
2025-03-08140014719300

The frame ROWS BETWEEN 6 PRECEDING AND CURRENT ROW creates a trailing 7-day window (6 previous rows plus the current one). Early rows where fewer than 7 days of data exist will average whatever is available. Note that the trailing average stabilizes after the first full week of data, making subsequent values more reliable for trend analysis.

//

Bounded vs Unbounded Frames

Bounded frames (like 6 PRECEDING AND CURRENT ROW) limit the number of rows the function examines at each step. Unbounded frames (like UNBOUNDED PRECEDING AND CURRENT ROW) grow as the partition progresses. This distinction has practical implications.

With a bounded frame, the database only needs to maintain a fixed-size window in memory. Adding one row to the front and dropping one from the back is efficient. With an unbounded frame, the running calculation must account for every row seen so far. For SUM and COUNT, the database can maintain a running total incrementally. But functions that require sorting or comparing all values in the frame cannot take this shortcut, making unbounded frames expensive on very large partitions.

TIP
Use bounded frames for fixed-size windows like moving averages and trailing totals. Use unbounded frames for cumulative calculations like running totals and percentage of total.
The concept of moving averages has a surprisingly long history outside of databases.

> Complete this query to compare each day's cost with the previous day.

SELECT
  bill_date,
  amount,
  ___(amount, ___) OVER (
    ORDER BY bill_date
  ) AS prev_amount
FROM cloud_costs
0
LAG
1
LEAD

LAG with an offset of 1 is the standard tool for day-over-day or period-over-period comparisons because it reads the value from the preceding row in a single pass without requiring the table to be joined to itself.

Providing a default value as the third argument to LAG replaces the NULL that appears on the first row of each partition, which prevents NULL propagation when you subsequently subtract the lagged value from the current value.

Frame specifications and LAG serve different purposes: frames control which rows an aggregate function sees at each step, while LAG always jumps to a fixed offset regardless of any frame clause you specify.

NTILE and Bucketing

Daily Life
Interviews

Split rows into equal-sized groups

NTILE divides the ordered rows in a partition into a specified number of roughly equal groups and assigns each row a bucket number from 1 to N. It is the go-to function for creating quartiles, deciles, and custom percentile buckets.

Percentile bucketing is fundamental to data engineering and analytics. Nearly every organization segments its users, customers, or products into tiers based on some metric. The top 25 percent of customers by spending might receive premium support. The bottom 10 percent of products by revenue might be candidates for discontinuation. NTILE automates this segmentation cleanly.

NTILE Fundamentals

The NTILE function follows a predictable pattern for dividing rows into equal groups, making it straightforward to apply once you understand its behavior.

//

Basic NTILE Usage

Consider a customer segmentation scenario. A marketing team wants to divide customers into four spending tiers to target campaigns differently. The top quartile gets exclusive offers, the bottom quartile gets re-engagement campaigns.
1SELECT
2 customer_name,
3 total_spent,
4 NTILE(4) OVER (
5 ORDER BY total_spent DESC
6 ) AS spending_quartile
7FROM customer_summary
Result
customer_nametotal_spentspending_quartile
Alice152001
Bob128001
Carol98002
Diana76002
Eve54003
Frank32003
Grace18004
Hank9004
With 8 rows divided into 4 groups, each group gets exactly 2 rows. Quartile 1 contains the highest spenders and quartile 4 the lowest. This is a clean division because 8 divides evenly by 4.
//

Handling Remainders

When the row count does not divide evenly, NTILE distributes the extra rows to the earlier buckets. With 10 rows and 4 buckets, the first two buckets get 3 rows each and the last two get 2 rows each.

NTILE Remainder Distribution
  • 10 rows into 4 buckets → 3, 3, 2, 2
  • 11 rows into 4 buckets → 3, 3, 3, 2
  • 12 rows into 4 buckets → 3, 3, 3, 3
  • 100 rows into 10 buckets → 10 each (even)
  • 103 rows into 10 buckets → 11, 11, 11, 10, 10, 10, 10, 10, 10, 10
The rule is simple: if there are R remainder rows, the first R buckets each get one extra row. This ensures the difference in group size is never more than one.
//

Deciles and Percentiles

Use NTILE(3) for tertiles (3 groups), NTILE(10) for deciles (10 groups), or NTILE(100) for percentiles. Deciles are popular for salary analysis and customer segmentation. Percentiles give the finest granularity.

1SELECT
2 employee_name,
3 salary,
4 NTILE(10) OVER (
5 ORDER BY salary
6 ) AS salary_decile
7FROM employee_metrics
Result
employee_namesalarysalary_decile
Eve650001
Frank720002
Bob850003
Carol950005
Alice1300009
Diana15000010
Employees in decile 10 are in the top 10 percent of salaries. Those in decile 1 are in the bottom 10 percent. This is far more informative than a simple rank because it immediately conveys relative position.

NTILE in Practice

Understanding when and how to use NTILE effectively requires comparing it to alternatives and knowing how to combine it with other window clauses.

//

NTILE vs CASE Bucketing

Before NTILE, analysts created buckets using manual CASE statements with hardcoded thresholds. This approach is fragile because the thresholds must be updated whenever the data distribution changes. NTILE adapts automatically.

Manual CASE
  • Hardcoded thresholds
  • Must update when data changes
  • Buckets may be very unequal
  • Verbose: one WHEN per bucket
NTILE
  • Adapts to data automatically
  • Always produces equal groups
  • One line of SQL
  • Works with any partition

Use manual CASE when your bucket boundaries have business meaning (such as "under 50K is junior, 50-100K is mid, over 100K is senior"). Use NTILE when you want equal-sized groups based on relative position, such as "top 25 percent of performers."

//

NTILE with PARTITION BY

Combine NTILE with PARTITION to create buckets within groups. This is essential when the same absolute value means different things in different contexts.

1SELECT
2 department,
3 employee_name,
4 salary,
5 NTILE(4) OVER (
6 PARTITION BY department
7 ORDER BY salary DESC
8 ) AS dept_quartile
9FROM employee_metrics
Result
departmentemployee_namesalarydept_quartile
EngineeringAlice1500001
EngineeringBob1300002
EngineeringCarol1200003
EngineeringDave950004
SalesDiana1100001
SalesEve950002
SalesFrank850003
SalesGrace720004
Now quartiles are relative to each department. Carol earns 120K and is in quartile 3 for Engineering, but Diana earns less (110K) and is in quartile 1 for Sales. Context matters.
How many buckets should you use for your analysis?
1SELECT
2 customer_name,
3 total_spent,
4 NTILE(4) OVER (
5 ORDER BY total_spent DESC
6 ) AS spending_quartile
7FROM customer_summary

Note that NTILE does not consider the actual values when creating buckets. It only considers row position. Two employees with identical salaries could end up in different buckets if they fall on a boundary. If you need value-aware bucketing, use PERCENT_RANK or CUME_DIST instead.

> Complete this query to divide customers into 4 spending tiers.

SELECT
  customer_name,
  total_spent,
  ___(___) OVER (
    ORDER BY total_spent DESC
  ) AS tier
FROM customers
NTILE
4
10
RANK

NTILE distributes rows as evenly as possible. If 10 rows are split into 4 buckets, two buckets get 3 rows and two get 2.

NTILE assigns buckets based on row position, not actual values. Two customers with identical spending could end up in different tiers at a boundary.

For value-aware bucketing where identical values should always share the same tier, use PERCENT_RANK or CUME_DIST instead.

Multiple Windows

Daily Life
Interviews

Run different calculations in one query

Real-world analysis often requires comparing a row from multiple perspectives in the same query. You might need a rank within a department, a running total across the company, and a comparison to the previous month all in one result set.
Rank within category
Rank within category
RANK with PARTITION BY category
Percentage of total revenue
Percentage of total revenue
SUM with UNBOUNDED frame
Month-over-month growth
Month-over-month growth
LAG for period comparison
Quartile placement
Quartile placement
NTILE for percentile bucketing

There is no named WINDOW clause. You must write the full OVER specification for each window function. This means you will see multiple OVER clauses in a single SELECT, each potentially with different partitioning, ordering, and framing.

Multiple Window Definitions

Each window function in a query can have its own partitioning, ordering, and framing. This flexibility lets you answer multiple questions in a single pass over the data.
//

Different Partitions

Here is a practical example: for each sale, show the rank within its category and the rank across all categories.
1SELECT
2 product_name,
3 category,
4 revenue,
5 RANK() OVER (
6 PARTITION BY category
7 ORDER BY revenue DESC
8 ) AS category_rank,
9 RANK() OVER (
10 ORDER BY revenue DESC
11 ) AS overall_rank
12FROM product_sales
Result
product_namecategoryrevenuecategory_rankoverall_rank
Laptop ProElectronics8900011
Smart WatchElectronics4500023
Winter JacketClothing5200012
Running ShoesClothing3100024

The first RANK partitions by category so each category has its own ranking. The second RANK has no partition, ranking all products together. Winter Jacket is ranked 1st in Clothing but 2nd overall.

//

Different Orderings

You can also use different orderings in the same query. This is useful for showing how the same data looks from different angles.
1SELECT
2 employee_name,
3 department,
4 salary,
5 hire_date,
6 ROW_NUMBER() OVER (
7 ORDER BY salary DESC
8 ) AS salary_rank,
9 ROW_NUMBER() OVER (
10 ORDER BY hire_date
11 ) AS seniority_rank
12FROM employee_metrics
Result
employee_namedepartmentsalaryhire_datesalary_rankseniority_rank
AliceEngineering1500002020-03-1513
BobEngineering1300002018-06-0121
CarolSales1200002019-01-1032
DianaSales1100002021-09-2244
Alice earns the most (salary rank 1) but is only the third most senior employee (seniority rank 3). Bob earns less but has been at the company the longest. Multiple window orderings reveal these relationships without needing separate queries.
//

Multiple Query Perspectives

The most powerful pattern combines aggregate windows, positional functions, and ranking functions with different partitions and frames in the same query. This is the bread and butter of analytics dashboards.
1SELECT
2 sale_month,
3 department,
4 revenue,
5 SUM(revenue) OVER (
6 PARTITION BY department
7 ORDER BY sale_month
8 ) AS running_total,
9 LAG(revenue, 1) OVER (
10 PARTITION BY department
11 ORDER BY sale_month
12 ) AS prev_revenue,
13 NTILE(3) OVER (
14 ORDER BY revenue DESC
15 ) AS rev_tier
16FROM monthly_dept_revenue
Result
sale_monthdepartmentrevenuerunning_totalprev_revenuerev_tier
2025-01Engineering5000050000NULL1
2025-02Engineering55000105000500001
2025-03Engineering48000153000550002
2025-01Sales3000030000NULL3
2025-02Sales3500065000300002

This single query produces three different analytical perspectives: a department-level running total, month-over-month comparison via LAG, and a company-wide revenue tier via NTILE. Each OVER clause is tailored to its specific purpose.

Performance Considerations

Understanding how the database processes multiple windows helps you write efficient queries without sacrificing analytical power.
//

Shared Window Performance

When multiple window functions share the same OVER clause (same PARTITION BY, same ORDER BY, same frame), the query optimizer can often compute them in a single pass over the data. The database sorts the data once and evaluates all matching window functions together.

When window functions have different OVER clauses, the database may need to sort the data multiple times, once for each distinct window specification. On large datasets, each additional sort is expensive. This is why grouping window functions by their OVER clause is good practice.

TIP
Do not sacrifice clarity for performance. If your analysis requires different partitions and orderings, write them out. Only optimize window specifications if profiling shows the sorts are a bottleneck.
AggregatePositional
Aggregate
Frame totals
SUM, AVG across the frame
Positional
Row lookups
LAG, LEAD, FIRST_VALUE
Now let's look at the key rules for writing window functions correctly and avoiding common pitfalls:
Do
  • Always include ORDER BY in LAG, LEAD, and positional functions
  • Use explicit frame specs with LAST_VALUE and NTH_VALUE
  • Provide default values in LAG/LEAD to avoid NULL arithmetic
  • Use NTILE for equal-group bucketing instead of manual CASE logic
  • Comment your intent when using multiple windows in one query
  • Use ROWS frames unless you specifically need value-based RANGE grouping
Don't
  • Do not use LAST_VALUE without expanding the frame to UNBOUNDED FOLLOWING
  • Do not assume NTILE buckets are exactly equal when rows do not divide evenly
  • Do not confuse ROWS and RANGE frame types when dealing with duplicate values
  • Do not use reserved words like row or rank as column aliases
  • Do not forget that LAG and LEAD ignore the frame clause entirely

> Fill in the missing parts to calculate how much cost changed from one billing period to the next.

SELECT
  bill_date,
  amount,
  ___(___, 1) OVER (
    ___ bill_date
  ) AS prev_amount,
  amount - ___(amount, 1) OVER (
    ORDER BY bill_date
  ) AS mom_change
FROM cloud_costs
PARTITION BY
ORDER BY
amount
LEAD
LAG
LAG
Offset and cumulative window techniques become intuitive with repetition. Put these patterns to the test with real-world challenges.

Combining LAG for period comparison with NTILE for segmentation in the same query is a common analytics pattern because it simultaneously shows how a row changed from the previous period and where it ranks relative to all other rows.

Each window function in a SELECT list is evaluated independently against its own OVER clause, which means the database may sort the data multiple times if the specifications differ, so aligning OVER clauses where possible improves query performance on large datasets.

PUTTING IT ALL TOGETHER

> You are a growth analyst at Spotify tracking month-over-month revenue changes per product line to surface trends for the quarterly business review, and your stakeholders want both period comparisons and bucket tiers in a single report.

LAG() pulls the prior month revenue for each product line in one pass, letting you subtract it from the current month to compute the change directly.
FIRST_VALUE() and LAST_VALUE() with explicit ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING frames anchor each row to the quarter start and end for boundary comparisons.
Frame specifications using ROWS BETWEEN N PRECEDING AND CURRENT ROW produce a rolling three-month average that smooths seasonal noise across product lines.
NTILE() divides product lines into performance quartiles so leadership can immediately see which lines rank in the top versus bottom tier each quarter.
KEY TAKEAWAYS
LAG/LEAD access previous/next rows in a single pass; use offset and default value params to handle edges
LAST_VALUE and NTH_VALUE need an explicit ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING frame to see the full partition
Frame specs control which rows participate: use ROWS for physical row counts (moving averages) and RANGE for value-based grouping
NTILE(N) divides rows into N equal buckets for quartile/percentile segmentation
Combine multiple window functions with different partitions and orderings in one query for multi-perspective analysis

Look back. Look forward. Frame it.

Category
SQL
Difficulty
intermediate
Duration
39 minutes
Challenges
5 hands-on challenges

Topics covered: LAG and LEAD, FIRST, LAST, NTH VALUE, Frame Specifications, NTILE and Bucketing, Multiple Windows

Lesson Sections

  1. LAG and LEAD (concepts: sqlLagLead)

    LAG and LEAD Fundamentals Understanding the syntax and behavior of these functions is essential before applying them to real analytics scenarios. The Syntax Why Not a Self-Join? Before window functions existed, comparing a row to its predecessor required joining a table to itself. You would match each row to the previous row by offsetting a date or sequence column. This works, but it has real costs. Month-Over-Month Comparison Practical Applications LEAD for Change Detection Detecting Sequential

  2. FIRST, LAST, NTH VALUE (concepts: sqlFirstLastValue)

    Positional Function Basics Each positional function accesses a specific row within the window frame. Understanding their syntax and default behaviors is essential for correct usage. FIRST_VALUE Every employee can see how far their salary is from the highest in their department. The person with the top salary has a gap of zero. This pattern is commonly used in compensation analysis to show each employee their relative position without revealing other individuals' exact salaries. The LAST_VALUE Fr

  3. Frame Specifications (concepts: sqlWindowFrame)

    Frame specifications define exactly which rows participate in a window calculation. Every window function operates on a frame, and understanding frames is the key to unlocking the full power of window functions. Up to this point, you have been relying on the default frame, which extends from the beginning of the partition to the current row. That default works for running totals but fails for many other patterns. To build moving averages, trailing windows, and centered calculations, you need to

  4. NTILE and Bucketing (concepts: sqlNtile)

    NTILE Fundamentals Basic NTILE Usage Consider a customer segmentation scenario. A marketing team wants to divide customers into four spending tiers to target campaigns differently. The top quartile gets exclusive offers, the bottom quartile gets re-engagement campaigns. With 8 rows divided into 4 groups, each group gets exactly 2 rows. Quartile 1 contains the highest spenders and quartile 4 the lowest. This is a clean division because 8 divides evenly by 4. Handling Remainders The rule is simple

  5. Multiple Windows (concepts: sqlMultipleWindows)

    Real-world analysis often requires comparing a row from multiple perspectives in the same query. You might need a rank within a department, a running total across the company, and a comparison to the previous month all in one result set. Multiple Window Definitions Each window function in a query can have its own partitioning, ordering, and framing. This flexibility lets you answer multiple questions in a single pass over the data. Different Partitions Here is a practical example: for each sale,