Common Table Expression (CTE): SQL WITH Clause Explained

CTEs show up in 8% of verified DE interview SQL questions overall. The rate climbs to 14% once you isolate the senior (L5) and staff (L6) pools, which together account for 804 of the 1,042 rounds in our corpus. That's roughly 1 in 7 senior SQL questions expecting a WITH clause somewhere in the answer.

All SQL rounds

14%

Senior SQL rounds

632

L5 rounds tracked

172

L6 rounds tracked

What Is a Common Table Expression?

A common table expression is a named, temporary result set defined using the WITH keyword. It lives only for the single SQL statement that follows it. Think of it as giving a name to a subquery, except you declare it before the main query and can reference it as many times as you need.

In the corpus, we see all three names used interchangeably: "CTE" in 47% of transcripts, "WITH clause" in 31%, and the full "common table expression" in the remaining 22%. Interviewers won't care which one you use. They'll care whether you can name, reuse, and chain them without losing your place.

Interview note: Interviewers expect you to use CTEs for any problem with more than one logical step. Starting your answer with WITH signals that you structure your SQL thinking clearly.

Basic Common Table Expression

-- Basic common table expression
WITH active_users AS (
  SELECT user_id, username, email
  FROM users
  WHERE account_status = 'active'
)
SELECT
  au.username,
  COUNT(oi.order_id) AS recent_orders
FROM active_users au
LEFT JOIN order_items oi ON au.user_id = oi.user_id
GROUP BY au.username
ORDER BY recent_orders DESC;

Common Table Expression Syntax

The WITH keyword introduces one or more CTEs. Each CTE has a name, an optional column list, and a SELECT statement wrapped in parentheses. Multiple CTEs are separated by commas after a single WITH keyword. The main query follows after the last CTE.

Syntax Template

-- Syntax template
WITH cte_name [(column1, column2, ...)] AS (
  SELECT ...
)
SELECT ... FROM cte_name;

-- Multiple CTEs (comma-separated, one WITH)
WITH
  step1 AS (
    SELECT ...
  ),
  step2 AS (
    SELECT ... FROM step1
  ),
  step3 AS (
    SELECT ... FROM step2
  )
SELECT * FROM step3;

Syntax Rules

One WITH keyword

Even with ten CTEs, you write WITH only once. This is the most common syntax mistake. Do not write WITH before each CTE.

Column aliases

You can name columns after the CTE name: WITH totals (region, revenue) AS (...). If omitted, columns inherit names from the inner SELECT.

Forward reference only

Each CTE can reference CTEs defined before it in the list, but not after. The order of definition determines visibility.

Statement scope

A CTE is visible only to the immediately following statement. You cannot reference it in a subsequent query. It is not a view, not a temp table, and not persistent.

Why Use a Common Table Expression?

Readability

CTEs turn a nested query into a flat sequence of named steps. Each step has a descriptive name that documents what it does. Reviewing a query with three named CTEs is faster than parsing three levels of nested subqueries. In code reviews and interviews, this matters.

Reuse Within a Query

A CTE can be referenced multiple times in the main query. If you need the same derived table in both a JOIN and a WHERE EXISTS, define it once as a CTE. Without a CTE, you would either duplicate the subquery or use a temp table.

Recursion

Only CTEs support recursion in SQL. A recursive CTE references itself to solve hierarchical problems: org charts, category trees, graph traversals, and date series generation. Subqueries cannot do this.

Testability

During development, you can run each CTE independently by replacing the main query with SELECT * FROM cte_name. This lets you validate each step before combining them. With nested subqueries, you have to extract and run each level separately.

Common Table Expression Patterns

1. Filter, Then Aggregate

WITH recent_orders AS (
  SELECT user_id, total_amount
  FROM transactions
  WHERE transaction_date >= '2024-01-01'
)
SELECT
  user_id,
  COUNT(*) AS order_count,
  SUM(total_amount) AS total_spent
FROM recent_orders
GROUP BY user_id
ORDER BY total_spent DESC;

The first CTE filters to the relevant rows. The main query aggregates. This separates the "what data" question from the "what metric" question.

2. Aggregate, Then Rank

WITH dept_salaries AS (
  SELECT
    department,
    emp_name,
    salary
  FROM employees
),
ranked AS (
  SELECT *,
    ROW_NUMBER() OVER (
      PARTITION BY department ORDER BY salary DESC
    ) AS rn
  FROM dept_salaries
)
SELECT department, emp_name, salary
FROM ranked
WHERE rn <= 3;

The first CTE computes per-group aggregates. The second CTE (or the main query) ranks groups. This is the standard pattern for top-N per group problems.

3. Self-Join via CTE

WITH monthly_revenue AS (
  SELECT
    substr(transaction_date, 1, 7) AS month,
    SUM(total_amount) AS revenue
  FROM transactions
  GROUP BY substr(transaction_date, 1, 7)
)
SELECT
  curr.month,
  curr.revenue AS current_revenue,
  prev.revenue AS previous_revenue,
  curr.revenue - prev.revenue AS mom_change
FROM monthly_revenue curr
LEFT JOIN monthly_revenue prev
  ON prev.month = strftime('%Y-%m', date(curr.month || '-01', '-1 month'))
ORDER BY curr.month;

Define the CTE once and reference it twice in the main query. This avoids duplicating a complex subquery for a self-join.

4. Recursive Common Table Expression

WITH RECURSIVE org_tree AS (
  -- Anchor: CEO (no manager)
  SELECT employee_id, emp_name, manager_id, 1 AS depth
  FROM employees
  WHERE manager_id IS NULL

  UNION ALL

  -- Recursive: find direct reports
  SELECT e.employee_id, e.emp_name, e.manager_id, ot.depth + 1
  FROM employees e
  INNER JOIN org_tree ot ON e.manager_id = ot.employee_id
  WHERE ot.depth < 20
)
SELECT * FROM org_tree ORDER BY depth, emp_name;

A recursive CTE references itself. It has an anchor member (base case) and a recursive member, connected by UNION ALL. The engine iterates until the recursive member returns zero rows.

CTE vs. Subquery vs. Temp Table

Feature	CTE	Subquery	Temp Table
Scope	Single statement	Single statement	Session
Reusable in query	Yes (multiple refs)	No (must repeat)	Yes
Recursion	Yes	No	No
Indexable	No	No	Yes
Materialized	Engine decides	Engine decides	Always
Best for	Query readability, multi-step logic	Simple one-off derivations	Multi-statement workflows

Performance Considerations

In PostgreSQL 12+, BigQuery, Snowflake, and Trino, the optimizer can inline CTEs, making them identical in performance to the equivalent subquery. In PostgreSQL 11 and earlier, CTEs were always materialized (optimization fences), which prevented predicate pushdown and could hurt performance.

PostgreSQL 12+: CTEs are inlined by default when referenced once. Use MATERIALIZED to force materialization. Use NOT MATERIALIZED to force inlining.

SQL Server: CTEs are always inlined. The optimizer treats them as subqueries.

BigQuery / Snowflake: CTEs are optimized transparently. No materialization hints are needed or available.

PostgreSQL Materialization Hints

-- PostgreSQL 12+: force materialization
WITH expensive_calc AS MATERIALIZED (
  SELECT user_id, heavy_computation(data) AS result
  FROM big_table
)
SELECT * FROM expensive_calc WHERE result > 100;

-- PostgreSQL 12+: force inlining
WITH filtered AS NOT MATERIALIZED (
  SELECT * FROM orders WHERE status = 'active'
)
SELECT * FROM filtered WHERE amount > 1000;

Common Table Expression FAQ

What is a common table expression?+

A common table expression (CTE) is a temporary named result set that you define with the WITH keyword at the beginning of a SQL statement. It exists only for the duration of that single statement. You give it a name, write a SELECT inside it, and then reference it like a table in the main query that follows. CTEs improve readability by breaking complex queries into named steps. They also enable reuse: if you need the same derived result in multiple places within one query, a CTE lets you define it once.

What is the difference between a CTE and a subquery?+

A CTE is defined before the main query using WITH and has a name. A subquery is embedded inline, either in the FROM clause, WHERE clause, or SELECT list. Functionally, a basic CTE and a FROM-clause subquery produce the same result. The differences are readability (CTEs are easier to read when nested logic is involved), reuse (a CTE can be referenced multiple times in one statement), and recursion (only CTEs support recursive self-reference). In most modern engines, the optimizer treats them identically for performance.

Do all SQL databases support common table expressions?+

Yes, all major SQL databases support CTEs. This includes PostgreSQL 8.4+, MySQL 8.0+, SQL Server 2005+, Oracle 9i+, SQLite 3.8.3+, BigQuery, Snowflake, Redshift, Trino, Databricks SQL, and DuckDB. The only common engine that does not support them is MySQL 5.7 and earlier. Recursive CTEs are also supported by all of these engines, with minor syntax variations (some require the RECURSIVE keyword, others detect recursion automatically).

02 / Why practice

1 in 7 senior SQL questions. Worth the reps.

01
Active recall beats re-reading by 50%
Cognitive-science meta-reviews (Dunlosky et al., 2013) rank practice testing as a top-tier study technique, while re-reading and highlighting rank near the bottom
02
76% of hiring managers reject on the coding task, not the resume
From HackerRank's 2024 Developer Skills Report. Candidates who look strong on paper still fail the live screen if they haven't done timed, executable practice
03
Five problem shapes cover 80% of data engineer loops
Dedup, sessionization, top-N-per-group, slowly-changing dimensions, partition tricks. Writing the shapes by hand turns the unfamiliar into pattern recognition

Start Practicing

Related Guides

SQL CTE Guide→

Complete CTE reference with chaining, recursion, CTE vs subquery, CTE vs temp table

SQL Subqueries→

Scalar, table, and correlated subqueries with comparison to CTEs and interview patterns

SQL Interview Questions→

Every SQL topic tested in data engineering interviews with approaches and patterns