SQL CTE: Common Table Expressions with Examples
CTEs didn't exist in standard SQL until SQL:1999. Before that, every complex query was either a nested subquery mess or a temporary table plus three extra statements. The 1999 spec added the WITH clause and, almost as an afterthought, recursive CTEs. That afterthought quietly replaced the ugly self-join loops engineers had been writing since the 1980s.
What Is a CTE?
A CTE is a named temporary result set declared with the WITH keyword, scoped to the single statement that follows. The original 1999 spec called it a "query expression body with a with clause," which is exactly as readable as you'd expect from a committee document. The idea came from research on recursive query languages in the 1980s, and it shipped in DB2 first.
CTEs solve two problems. First, readability: instead of nesting subqueries three levels deep, you write each step as a named CTE and the final SELECT reads like a story. Second, reuse: if you need the same derived table in multiple places (e.g., in both a JOIN and a WHERE EXISTS), a CTE lets you define it once.
Every major SQL engine supports CTEs: PostgreSQL, MySQL 8.0+, SQL Server, Oracle, SQLite, BigQuery, Snowflake, Redshift, Trino, and Databricks. If you are writing SQL in 2024 or later, CTEs are always available.
Interview note: CTEs are the default tool for structuring interview answers. When an interviewer gives you a multi-step problem, start by sketching CTE names on the whiteboard before writing any SQL. This signals structured thinking.
Know CTE the way the interviewer who asks it knows it.
CTE Syntax
The WITH keyword introduces one or more CTEs. Each CTE has a name, an optional column list, and a SELECT statement. The main query follows after the last CTE.
Single CTE
-- Single CTE
WITH active_users AS (
SELECT user_id, name, email
FROM users
WHERE status = 'active'
AND last_login >= CURRENT_DATE - INTERVAL '30 days'
)
SELECT
au.name,
au.email,
COUNT(o.order_id) AS recent_orders
FROM active_users au
LEFT JOIN orders o ON au.user_id = o.user_id
AND o.order_date >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY au.name, au.email
ORDER BY recent_orders DESC;5 CTE Patterns for Interviews
These five patterns cover the range of CTE usage you will encounter in production code and interviews. Each one solves a different category of problem.
Pulled from debriefs where SQL was the gate.
Key Syntax Rules
One WITH keyword
Even with multiple CTEs, you write WITH only once. Separate CTEs with commas.
Column aliases (optional)
You can specify column names after the CTE name: WITH totals (region, revenue) AS (...). If omitted, column names come from the inner SELECT.
Scope
A CTE is visible only to the statement that follows it. You cannot reference it in a separate query. It is not a view or a temp table.
Multiple references
You can reference the same CTE multiple times in the main query (e.g., in a self-join). The engine decides whether to materialize it or inline it.
4. CTE vs Subquery
CTEs and subqueries often produce the same result. The choice is about readability, reuse, and (in some engines) performance.
Use a subquery when: the derived table is used once, is short (under 5 lines), and nesting does not hurt readability.
Use a CTE when: the derived table is referenced multiple times, the query has more than one level of nesting, or the logic has distinct named steps.
CTE with Explicit Column Aliases
-- CTE with explicit column aliases
WITH monthly_revenue (month, region, revenue) AS (
SELECT
DATE_TRUNC('month', order_date),
region,
SUM(amount)
FROM orders
GROUP BY 1, 2
)
SELECT month, region, revenue
FROM monthly_revenue
WHERE revenue > 50000
ORDER BY month, revenue DESC;Recursive CTEs: Deep Dive
Recursive CTEs are the only way to walk hierarchical data in pure SQL without knowing the depth in advance. They replace what would otherwise require a loop in application code or a series of self-joins (one per level). The engine handles the iteration.
Iteration 0 (Anchor): The anchor SELECT runs once. Its output becomes the initial working table.
Iteration 1+: The recursive SELECT runs against the working table from the previous iteration. New rows produced become the next working table. This repeats until the recursive SELECT produces zero rows.
Final result: UNION ALL combines all rows from all iterations into the final output.
1. Basic CTE (Named Subquery)
-- Find departments where average salary
-- exceeds the company average
WITH company_avg AS (
SELECT AVG(salary) AS avg_salary
FROM employees
)
SELECT
department,
ROUND(AVG(salary), 0) AS dept_avg,
(SELECT avg_salary FROM company_avg) AS company_avg
FROM employees
GROUP BY department
HAVING AVG(salary) > (SELECT avg_salary FROM company_avg);Interview note: The company_avg CTE is referenced twice: once in SELECT and once in HAVING. Without a CTE, you would write the same subquery in both places. CTEs eliminate that duplication.
2. Recursive CTE
-- Walk an employee hierarchy from the CEO down
WITH RECURSIVE org_tree AS (
-- Anchor: start at the CEO (no manager)
SELECT id, name, manager_id, 1 AS depth
FROM employees
WHERE manager_id IS NULL
UNION ALL
-- Recursive: find direct reports of current level
SELECT e.id, e.name, e.manager_id, ot.depth + 1
FROM employees e
INNER JOIN org_tree ot ON e.manager_id = ot.id
)
SELECT id, name, depth
FROM org_tree
ORDER BY depth, name;Interview note: Always include a depth or iteration counter and mention the safety limit. PostgreSQL defaults to no limit. SQL Server defaults to 100 (OPTION (MAXRECURSION N) to change it). In interviews, say: I would add a WHERE depth < 20 in the recursive member to prevent infinite loops from bad data.
3. Chained CTEs (Multi-Step Pipeline)
WITH monthly_orders AS (
-- Step 1: aggregate orders by user and month
SELECT
user_id,
DATE_TRUNC('month', order_date) AS month,
COUNT(*) AS order_count,
SUM(amount) AS total_spent
FROM orders
WHERE order_date >= '2024-01-01'
GROUP BY user_id, DATE_TRUNC('month', order_date)
),
ranked AS (
-- Step 2: rank users within each month
SELECT *,
RANK() OVER (
PARTITION BY month ORDER BY total_spent DESC
) AS spend_rank
FROM monthly_orders
),
top_spenders AS (
-- Step 3: keep top 10 per month
SELECT * FROM ranked WHERE spend_rank <= 10
)
-- Step 4: join back to users for names
SELECT
u.name,
ts.month,
ts.order_count,
ts.total_spent,
ts.spend_rank
FROM top_spenders ts
INNER JOIN users u ON ts.user_id = u.id
ORDER BY ts.month, ts.spend_rank;Interview note: Name your CTEs after what they compute, not after where they sit in the pipeline. Names like step1 and temp make the interviewer do extra work to understand your query. Names like monthly_orders and top_spenders are self-documenting.
Subquery vs CTE Comparison
-- Subquery version
SELECT department, avg_sal
FROM (
SELECT department, AVG(salary) AS avg_sal
FROM employees
GROUP BY department
) dept_avgs
WHERE avg_sal > (
SELECT AVG(salary) FROM employees
);
-- CTE version
WITH dept_avgs AS (
SELECT department, AVG(salary) AS avg_sal
FROM employees
GROUP BY department
),
company_avg AS (
SELECT AVG(salary) AS avg_sal
FROM employees
)
SELECT d.department, d.avg_sal
FROM dept_avgs d
WHERE d.avg_sal > (SELECT avg_sal FROM company_avg);5. CTE vs Temp Table
CTEs exist only for one statement. Temp tables persist for the session (or transaction). They solve different problems.
| Feature | CTE | Temp Table |
|---|---|---|
| Scope | Single statement | Session or transaction |
| Indexable | No | Yes |
| Materialized | Engine decides (usually inlined) | Always materialized to disk/memory |
| Multi-statement use | No | Yes |
| Best for | Query readability, single-query pipelines | Multi-query workflows, large intermediate results |
Generating a Date Series
-- Generate every date in January 2024
WITH RECURSIVE date_series AS (
SELECT DATE '2024-01-01' AS dt
UNION ALL
SELECT dt + INTERVAL '1 day'
FROM date_series
WHERE dt < DATE '2024-01-31'
)
SELECT dt FROM date_series;Path Accumulation
WITH RECURSIVE category_path AS (
-- Anchor: top-level categories (no parent)
SELECT
id,
name,
name AS full_path,
1 AS depth
FROM categories
WHERE parent_id IS NULL
UNION ALL
-- Recursive: append child name to parent path
SELECT
c.id,
c.name,
cp.full_path || ' > ' || c.name,
cp.depth + 1
FROM categories c
INNER JOIN category_path cp ON c.parent_id = cp.id
)
SELECT full_path, depth
FROM category_path
ORDER BY full_path;Preventing Infinite Loops
-- Guard against cycles with a depth limit
WITH RECURSIVE org_tree AS (
SELECT id, name, manager_id, 1 AS depth
FROM employees
WHERE manager_id IS NULL
UNION ALL
SELECT e.id, e.name, e.manager_id, ot.depth + 1
FROM employees e
INNER JOIN org_tree ot ON e.manager_id = ot.id
WHERE ot.depth < 50 -- safety limit
)
SELECT * FROM org_tree;
-- SQL Server: set explicit recursion limit
-- SELECT * FROM org_tree OPTION (MAXRECURSION 200);Interview note: Mentioning cycle detection unprompted shows production experience. Say: In production I would add a depth limit and optionally track visited IDs in an array to detect cycles explicitly. PostgreSQL supports array accumulation for this: WHERE NOT e.id = ANY(visited_ids).
6 CTE Interview Questions
CTEs show up in every stage of the interview process. Phone screens test basic WITH syntax. Onsites test recursive traversal and multi-step pipelines.
Q1: Rewrite this nested subquery using a CTE.
What they test: Basic CTE syntax and refactoring skills. The interviewer hands you a two-level nested query and watches you extract it into named steps. They care about your naming choices and whether the result is actually more readable. Approach: Start from the innermost subquery. Extract it as the first CTE. Name it after what it computes (e.g., dept_averages, not temp1). Extract the next level. The final SELECT should read like a sentence.
Q2: Find the top 3 products by revenue in each category.
What they test: CTE chaining with window functions. The classic top-N-per-group pattern requires aggregation, ranking, and filtering. A chained CTE handles each step. Approach: First CTE: aggregate revenue by product and category. Second CTE: add ROW_NUMBER() OVER (PARTITION BY category ORDER BY revenue DESC). Final SELECT: WHERE rn <= 3. Discuss RANK vs ROW_NUMBER if the interviewer asks about ties.
Q3: Write a recursive CTE to find all employees under a given manager, at any depth.
What they test: Recursive CTE syntax and hierarchy traversal. This is the most common recursive CTE question. They want anchor + recursive member + UNION ALL, and they want you to include a depth counter. Approach: Anchor: SELECT the target manager. Recursive: JOIN employees on manager_id = previous level's id. Track depth. Mention the cycle-prevention guard (depth < N). State the termination condition: recursion stops when no new rows are produced.
Q4: What is the difference between a CTE and a temp table? When would you use each?
What they test: Conceptual clarity about scope, materialization, and use cases. This is a knowledge question, not a coding question. Approach: CTEs: single-statement scope, not indexable, optimizer may inline. Temp tables: session scope, indexable, always materialized. Use CTEs for readability in one query. Use temp tables when you need the result across multiple statements, when you need to index it, or when forcing materialization improves a specific query plan.
Q5: Generate a running total of daily revenue using a CTE.
What they test: CTE combined with window functions. The CTE aggregates to the day level; the window function computes the cumulative sum. Approach: CTE: GROUP BY date with SUM(amount) AS daily_revenue. Main SELECT: SUM(daily_revenue) OVER (ORDER BY date) AS running_total. Mention that the window function's default frame (RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) gives the correct cumulative sum.
Q6: A CTE in PostgreSQL 11 is slower than the equivalent subquery. Why?
What they test: Engine-specific CTE behavior. Before PostgreSQL 12, CTEs were optimization fences: the engine always materialized them, preventing predicate pushdown and other optimizations. This is a trick question that separates candidates who have dealt with real performance issues from those who have only read syntax docs. Approach: Explain that PostgreSQL 11 and earlier materialized all CTEs. Filters in the main query could not push into the CTE. In PG 12+, the optimizer can inline CTEs (controlled by the MATERIALIZED / NOT MATERIALIZED hints). If stuck on PG 11, rewrite the CTE as a subquery to let the optimizer push predicates down.
3 CTE Mistakes That Fail Interviews
These show up in real interview debriefs. Each one is avoidable with a clear mental model of how CTEs work.
Writing WITH Before Every CTE
A common syntax error: writing WITH for each CTE instead of separating them with commas. Only the first CTE gets WITH. Every subsequent CTE is separated by a comma. This error is immediate feedback that you have not written many CTEs.
Missing UNION ALL in Recursive CTEs
The anchor and recursive members must be connected by UNION ALL. Using UNION (without ALL) works in some engines but deduplicates rows between iterations, which breaks hierarchy traversals where the same value appears at different depths. Worse, some engines reject UNION entirely in recursive CTEs. Always use UNION ALL in recursive CTEs. If you need deduplication, apply DISTINCT in the final SELECT, not between iterations.
Over-Using CTEs Where a Subquery Is Cleaner
Not every query benefits from a CTE. A single scalar comparison (WHERE amount > (SELECT AVG(amount) FROM orders)) is clearer as an inline subquery. Wrapping it in a CTE adds syntax overhead with no readability gain. Interviewers notice when you add structure without purpose. Use CTEs when they add clarity or enable reuse.
SQL CTE FAQ
What is a CTE in SQL?+
Are CTEs faster than subqueries?+
Can you use multiple CTEs in one query?+
What is a recursive CTE?+
Do CTEs work in SQL Server, MySQL, and PostgreSQL?+
The WITH Clause Is 26 Years Old. Use It Like You Mean It.
- 01
Active recall beats re-reading by 50%
Cognitive-science meta-reviews (Dunlosky et al., 2013) rank practice testing as a top-tier study technique, while re-reading and highlighting rank near the bottom
- 02
76% of hiring managers reject on the coding task, not the resume
From HackerRank's 2024 Developer Skills Report. Candidates who look strong on paper still fail the live screen if they haven't done timed, executable practice
- 03
Five problem shapes cover 80% of data engineer loops
Dedup, sessionization, top-N-per-group, slowly-changing dimensions, partition tricks. Writing the shapes by hand turns the unfamiliar into pattern recognition