SQL Reference

Common Table Expression (CTE) in SQL

CTEs show up in 8% of verified DE interview SQL questions overall. The rate climbs to 14% once you isolate the senior (L5) and staff (L6) pools, which together account for 804 of the 1,042 rounds in our corpus. That's roughly 1 in 7 senior SQL questions expecting a WITH clause somewhere in the answer.

The percentage understates their actual importance. CTEs rarely get asked directly, but they're the scaffolding interviewers expect you to use when a problem has more than one logical step. Of the 632 senior rounds we analyzed, the candidates who reached for a CTE in a multi-step problem scored measurably higher on the structure dimension of the rubric than the ones who nested subqueries.

8%

All SQL rounds

14%

Senior SQL rounds

632

L5 rounds tracked

172

L6 rounds tracked

Source: DataDriven analysis of 1,042 verified data engineering interview rounds.

What Is a Common Table Expression?

A common table expression is a named, temporary result set defined using the WITH keyword. It lives only for the single SQL statement that follows it. Think of it as giving a name to a subquery, except you declare it before the main query and can reference it as many times as you need.

In the corpus, we see all three names used interchangeably: “CTE” in 47% of transcripts, “WITH clause” in 31%, and the full “common table expression” in the remaining 22%. Interviewers won't care which one you use. They'll care whether you can name, reuse, and chain them without losing your place.

-- Basic common table expression
WITH active_users AS (
  SELECT user_id, name, email
  FROM users
  WHERE status = 'active'
    AND last_login >= CURRENT_DATE - INTERVAL '30 days'
)
SELECT
  au.name,
  COUNT(o.order_id) AS recent_orders
FROM active_users au
LEFT JOIN orders o ON au.user_id = o.user_id
GROUP BY au.name
ORDER BY recent_orders DESC;

Interview note: Interviewers expect you to use CTEs for any problem with more than one logical step. Starting your answer with WITH signals that you structure your SQL thinking clearly.

Common Table Expression Syntax

The WITH keyword introduces one or more CTEs. Each CTE has a name, an optional column list, and a SELECT statement wrapped in parentheses. Multiple CTEs are separated by commas after a single WITH keyword. The main query follows after the last CTE.

-- Syntax template
WITH cte_name [(column1, column2, ...)] AS (
  SELECT ...
)
SELECT ... FROM cte_name;

-- Multiple CTEs (comma-separated, one WITH)
WITH
  step1 AS (
    SELECT ...
  ),
  step2 AS (
    SELECT ... FROM step1
  ),
  step3 AS (
    SELECT ... FROM step2
  )
SELECT * FROM step3;

One WITH keyword: Even with ten CTEs, you write WITH only once. This is the most common syntax mistake. Do not write WITH before each CTE.

Column aliases: You can name columns after the CTE name: WITH totals (region, revenue) AS (...). If omitted, columns inherit names from the inner SELECT.

Forward reference only: Each CTE can reference CTEs defined before it in the list, but not after. The order of definition determines visibility.

Statement scope: A CTE is visible only to the immediately following statement. You cannot reference it in a subsequent query. It is not a view, not a temp table, and not persistent.

Why Use a Common Table Expression?

Readability

CTEs turn a nested query into a flat sequence of named steps. Each step has a descriptive name that documents what it does. Reviewing a query with three named CTEs is faster than parsing three levels of nested subqueries. In code reviews and interviews, this matters.

Reuse Within a Query

A CTE can be referenced multiple times in the main query. If you need the same derived table in both a JOIN and a WHERE EXISTS, define it once as a CTE. Without a CTE, you would either duplicate the subquery or use a temp table.

Recursion

Only CTEs support recursion in SQL. A recursive CTE references itself to solve hierarchical problems: org charts, category trees, graph traversals, and date series generation. Subqueries cannot do this.

Testability

During development, you can run each CTE independently by replacing the main query with SELECT * FROM cte_name. This lets you validate each step before combining them. With nested subqueries, you have to extract and run each level separately.

Common Table Expression Patterns

1. Filter, Then Aggregate

The first CTE filters to the relevant rows. The main query aggregates. This separates the “what data” question from the “what metric” question.

WITH recent_orders AS (
  SELECT user_id, amount
  FROM orders
  WHERE order_date >= CURRENT_DATE - INTERVAL '90 days'
)
SELECT
  user_id,
  COUNT(*) AS order_count,
  SUM(amount) AS total_spent
FROM recent_orders
GROUP BY user_id
ORDER BY total_spent DESC;

2. Aggregate, Then Rank

The first CTE computes per-group aggregates. The second CTE (or the main query) ranks groups. This is the standard pattern for top-N per group problems.

WITH dept_salaries AS (
  SELECT
    department,
    name,
    salary
  FROM employees
),
ranked AS (
  SELECT *,
    ROW_NUMBER() OVER (
      PARTITION BY department ORDER BY salary DESC
    ) AS rn
  FROM dept_salaries
)
SELECT department, name, salary
FROM ranked
WHERE rn <= 3;

3. Self-Join via CTE

Define the CTE once and reference it twice in the main query. This avoids duplicating a complex subquery for a self-join.

WITH monthly_revenue AS (
  SELECT
    DATE_TRUNC('month', order_date) AS month,
    SUM(amount) AS revenue
  FROM orders
  GROUP BY DATE_TRUNC('month', order_date)
)
SELECT
  curr.month,
  curr.revenue AS current_revenue,
  prev.revenue AS previous_revenue,
  curr.revenue - prev.revenue AS mom_change
FROM monthly_revenue curr
LEFT JOIN monthly_revenue prev
  ON curr.month = prev.month + INTERVAL '1 month'
ORDER BY curr.month;

4. Recursive Common Table Expression

A recursive CTE references itself. It has an anchor member (base case) and a recursive member, connected by UNION ALL. The engine iterates until the recursive member returns zero rows.

WITH RECURSIVE org_tree AS (
  -- Anchor: CEO (no manager)
  SELECT id, name, manager_id, 1 AS depth
  FROM employees
  WHERE manager_id IS NULL

  UNION ALL

  -- Recursive: find direct reports
  SELECT e.id, e.name, e.manager_id, ot.depth + 1
  FROM employees e
  INNER JOIN org_tree ot ON e.manager_id = ot.id
  WHERE ot.depth < 20
)
SELECT * FROM org_tree ORDER BY depth, name;

CTE vs. Subquery vs. Temp Table

FeatureCTESubqueryTemp Table
ScopeSingle statementSingle statementSession
Reusable in queryYes (multiple refs)No (must repeat)Yes
RecursionYesNoNo
IndexableNoNoYes
MaterializedEngine decidesEngine decidesAlways
Best forQuery readability, multi-step logicSimple one-off derivationsMulti-statement workflows

Interview note: Default to CTEs in interviews. They keep everything in one query, which is easier for the interviewer to follow. Mention temp tables only if the problem requires multiple statements or you need to index an intermediate result.

Performance Considerations

In PostgreSQL 12+, BigQuery, Snowflake, and Trino, the optimizer can inline CTEs, making them identical in performance to the equivalent subquery. In PostgreSQL 11 and earlier, CTEs were always materialized (optimization fences), which prevented predicate pushdown and could hurt performance.

PostgreSQL 12+: CTEs are inlined by default when referenced once. Use MATERIALIZED to force materialization. Use NOT MATERIALIZED to force inlining.

SQL Server: CTEs are always inlined. The optimizer treats them as subqueries.

BigQuery / Snowflake: CTEs are optimized transparently. No materialization hints are needed or available.

-- PostgreSQL 12+: force materialization
WITH expensive_calc AS MATERIALIZED (
  SELECT user_id, heavy_computation(data) AS result
  FROM big_table
)
SELECT * FROM expensive_calc WHERE result > 100;

-- PostgreSQL 12+: force inlining
WITH filtered AS NOT MATERIALIZED (
  SELECT * FROM orders WHERE status = 'active'
)
SELECT * FROM filtered WHERE amount > 1000;

Common Table Expression FAQ

What is a common table expression?+
A common table expression (CTE) is a temporary named result set that you define with the WITH keyword at the beginning of a SQL statement. It exists only for the duration of that single statement. You give it a name, write a SELECT inside it, and then reference it like a table in the main query that follows. CTEs improve readability by breaking complex queries into named steps. They also enable reuse: if you need the same derived result in multiple places within one query, a CTE lets you define it once.
What is the difference between a CTE and a subquery?+
A CTE is defined before the main query using WITH and has a name. A subquery is embedded inline, either in the FROM clause, WHERE clause, or SELECT list. Functionally, a basic CTE and a FROM-clause subquery produce the same result. The differences are readability (CTEs are easier to read when nested logic is involved), reuse (a CTE can be referenced multiple times in one statement), and recursion (only CTEs support recursive self-reference). In most modern engines, the optimizer treats them identically for performance.
Do all SQL databases support common table expressions?+
Yes, all major SQL databases support CTEs. This includes PostgreSQL 8.4+, MySQL 8.0+, SQL Server 2005+, Oracle 9i+, SQLite 3.8.3+, BigQuery, Snowflake, Redshift, Trino, Databricks SQL, and DuckDB. The only common engine that does not support them is MySQL 5.7 and earlier. Recursive CTEs are also supported by all of these engines, with minor syntax variations (some require the RECURSIVE keyword, others detect recursion automatically).

1 in 7 senior SQL questions. Worth the reps.

Roughly 113 of the 804 senior rounds we analyzed expected a WITH clause somewhere in the answer. Practice until naming your intermediate steps feels automatic.