SQL Interview Deep Dive

SQL Window Functions Practice

Window functions are one of the highest-frequency SQL topics in data engineering interviews. PARTITION BY, ROW_NUMBER, and RANK/DENSE_RANK appear constantly. Master them with real execution.

Write the query, run it against a real database, and verify your output row by row.

6.2%

ROW_NUMBER

4.9%

RANK / DENSE_RANK

High

LAG / LEAD

High

SUM / AVG OVER

Low

NTILE

Med

FIRST_VALUE / LAST_VALUE

Window Functions You Need to Know

Every function below is tested in data engineering interviews. Sorted by interview frequency, based on DataDriven's analysis of verified interview data.

6.2%

Frequency

ROW_NUMBER()

Function

Ranking

Category

ROW_NUMBER() OVER (PARTITION BY col ORDER BY col)

Assigns a unique sequential number to each row within a partition. No ties, no gaps. Appears in 6.2% of all DE interviews, making it the most commonly tested window function.

Common use cases

  • Get the most recent order per customer
  • Deduplicate records (keep latest)
  • Pagination queries
  • First/last event per user session

Interview tip: Interviewers love asking 'get the Nth row per group' questions. ROW_NUMBER with a CTE wrapper is the standard pattern.

4.9%

Frequency

RANK() / DENSE_RANK()

Function

Ranking

Category

RANK() OVER (PARTITION BY col ORDER BY col DESC)

RANK assigns the same number to ties but leaves gaps (1, 2, 2, 4). DENSE_RANK has no gaps (1, 2, 2, 3). Combined, they appear in 4.9% of all DE interviews.

Common use cases

  • Top N products by revenue per category
  • Percentile calculations
  • Competition-style rankings with ties
  • Finding all rows tied for first place

Interview tip: The follow-up question is always 'what if there are ties?' Know the difference between RANK and DENSE_RANK cold.

High

Frequency

LAG() / LEAD()

Function

Offset

Category

LAG(col, offset, default) OVER (ORDER BY col)

Access a value from a previous row (LAG) or next row (LEAD) without a self-join. Essential for time-series comparisons.

Common use cases

  • Calculate day-over-day or month-over-month changes
  • Find gaps between consecutive events
  • Detect session boundaries
  • Compare current value to previous value

Interview tip: LAG/LEAD questions almost always involve calculating a delta or identifying a change. Think 'compare to previous row' whenever you see time-series data.

High

Frequency

SUM() / AVG() OVER

Function

Aggregate

Category

SUM(col) OVER (ORDER BY col ROWS BETWEEN 6 PRECEDING AND CURRENT ROW)

Running totals, moving averages, and cumulative sums. The window frame clause (ROWS BETWEEN) is what makes these powerful.

Common use cases

  • 7-day rolling average of daily metrics
  • Cumulative revenue over time
  • Running balance calculations
  • Percentage of total within a group

Interview tip: Frame clauses trip up many candidates. Know the difference between ROWS and RANGE, and between UNBOUNDED PRECEDING and N PRECEDING.

Low

Frequency

NTILE()

Function

Distribution

Category

NTILE(n) OVER (ORDER BY col)

Divides rows into N roughly equal buckets. Useful for percentile analysis and bucketing.

Common use cases

  • Divide customers into quartiles by spend
  • Create equal-sized test groups
  • Salary band analysis
  • Bucketing continuous values

Interview tip: Less commonly tested than ROW_NUMBER or LAG, but it shows up in 'segment users into groups' questions. Quick to implement, impressive when used correctly.

Med

Frequency

FIRST_VALUE() / LAST_VALUE()

Function

Offset

Category

FIRST_VALUE(col) OVER (PARTITION BY col ORDER BY col)

Get the first or last value in a window. Watch out: LAST_VALUE needs a proper frame clause or it returns the current row.

Common use cases

  • Compare every row to the first event in a session
  • Find the earliest order date per customer inline
  • Calculate distance from first/last value

Interview tip: LAST_VALUE is a trap question. Without ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING, it defaults to the current row. Interviewers use this to test frame clause knowledge.

The 3 Patterns That Cover 80% of Interview Questions

Most window function interview questions are variations of these three patterns. Learn them cold.

Top N Per Group

Get the top N rows within each group. The most common window function interview pattern.

WITH ranked AS (
  SELECT *,
    ROW_NUMBER() OVER (
      PARTITION BY department
      ORDER BY salary DESC
    ) AS rn
  FROM employees
)
SELECT * FROM ranked WHERE rn <= 3;
Running Total

Cumulative sum that grows with each row. Critical for financial and time-series queries.

SELECT
  order_date,
  amount,
  SUM(amount) OVER (
    ORDER BY order_date
    ROWS UNBOUNDED PRECEDING
  ) AS running_total
FROM orders;
Month-over-Month Change

Compare each row to the previous period. The go-to pattern for growth analysis questions.

SELECT
  month,
  revenue,
  revenue - LAG(revenue) OVER (ORDER BY month) AS mom_change,
  ROUND(100.0 * (revenue - LAG(revenue) OVER (ORDER BY month))
    / LAG(revenue) OVER (ORDER BY month), 1) AS mom_pct
FROM monthly_revenue;

Window Functions FAQ

Why are SQL window functions so important for data engineering interviews?+
Window functions are among the most frequently tested SQL topics in DE interviews. PARTITION BY is one of the most common individual keywords interviewers look for. They test your ability to think beyond simple aggregations: ranking, time-series analysis, running calculations, and row-level comparisons within groups.
What is the difference between RANK and ROW_NUMBER?+
ROW_NUMBER always assigns unique sequential numbers (1, 2, 3, 4). RANK assigns the same number to ties and leaves gaps (1, 2, 2, 4). DENSE_RANK assigns the same number to ties with no gaps (1, 2, 2, 3). Use ROW_NUMBER when you need exactly one row per group. Use RANK or DENSE_RANK when ties matter.
How do I practice SQL window functions?+
Write queries, do not just read solutions. Start with ROW_NUMBER (6.2% of interviews), then LAG/LEAD, then aggregate windows (SUM/AVG OVER). Practice the common patterns: top N per group, running totals, and period-over-period comparisons. DataDriven has window function challenges with real SQL execution at interview difficulty.
What are window frame clauses and why do they matter?+
Frame clauses define which rows are included in the window calculation. ROWS BETWEEN 6 PRECEDING AND CURRENT ROW gives you a 7-day rolling window. ROWS UNBOUNDED PRECEDING gives you a running total. Getting the frame wrong is the most common window function bug in interviews. Practice until frames are second nature.

Practice Window Functions Now

Reading about window functions is not the same as writing them under pressure. Practice with real SQL execution and timed drills.