SQL Practice

CTE Practice Problems for Data Engineers

Subqueries and CTEs appear in 4.9% of SQL interview questions, but they disproportionately show up in onsite rounds where difficulty is higher. Five problems from basic WITH clauses to recursive traversals.

5 Progressive CTE Problems

Ordered from warm-up to advanced. Each problem builds on a different CTE skill that interviewers test.

Problem 1Warm-up

Filter Before Joining

Write a CTE that pre-filters an orders table to the last 90 days, then join it to a customers table. This tests whether you understand that CTEs let you break a query into readable steps. The naive approach puts the date filter in the WHERE clause after the join. The CTE approach filters first, making the join smaller and the logic clearer.

Skills tested: Basic CTE syntax, query readability, filter pushdown

Problem 2Intermediate

Multi-Step Aggregation

Chain two CTEs: the first aggregates daily revenue per product, the second ranks products by total revenue. The final SELECT pulls the top 5. This is the bread-and-butter CTE pattern. Each CTE does one thing. The result reads top-to-bottom like a recipe.

Skills tested: Chained CTEs, aggregation, ranking

Problem 3Intermediate

CTE for Deduplication

Given a table with duplicate user events (same user_id, same event_type, same timestamp), use a CTE with ROW_NUMBER to keep only the first occurrence per user per event type. Then aggregate the deduplicated result. This pattern appears constantly in real pipelines where source data has duplicates.

Skills tested: ROW_NUMBER inside CTE, deduplication, combining CTEs with window functions

Problem 4Advanced

Recursive Org Chart Traversal

Given an employees table with manager_id, write a recursive CTE that builds the full reporting chain from CEO to every employee. Include a depth column. This is the classic recursive CTE problem. The anchor member selects the root (WHERE manager_id IS NULL). The recursive member joins back to the CTE itself.

Skills tested: Recursive CTE syntax, anchor/recursive members, tree traversal

Problem 5Advanced

Recursive Date Spine

Generate a continuous date series from 2024-01-01 to 2024-12-31 using a recursive CTE, then LEFT JOIN daily revenue data onto it. Days with no revenue should show 0. This solves the 'missing dates' problem that trips up many candidates. Without a date spine, gaps in your data become invisible.

Skills tested: Recursive date generation, LEFT JOIN to fill gaps, COALESCE for defaults

When CTEs Show Up in Interviews

CTEs are not a standalone topic. They are a tool interviewers expect you to reach for when a query gets complex. While CTEs and subqueries account for 4.9% of SQL questions overall, they concentrate in later rounds where questions are harder and the bar is higher.

Refactor a nested subquery into CTEs

Interviewers hand you a 40-line query with 3 levels of nested subqueries and ask you to rewrite it. The test is not whether you get the same result. It is whether you can decompose logic into named, readable steps.

CTE + Window Function combo

First CTE aggregates raw data. Second CTE applies a window function (usually ROW_NUMBER or LAG). Final SELECT filters. This three-layer pattern handles 70% of analytical interview questions.

Recursive hierarchy traversal

Org charts, category trees, bill-of-materials. The interviewer wants to see that you know the anchor/recursive member structure and can add a depth or path column.

CTE for testability

Some interviewers ask you to write a query, then change the requirements. CTEs make modifications easy because each step is isolated. Changing one CTE does not break the others.

Common CTE Mistakes

These errors come up in interviews more often than you would think.

Referencing a CTE before it is defined

CTEs are evaluated in order. If cte_b references cte_a, then cte_a must come first in the WITH clause. This seems obvious, but under interview pressure people get the order wrong.

Forgetting the comma between CTEs

The WITH keyword appears once. Each CTE after the first is separated by a comma. A common syntax error: writing WITH again for the second CTE.

Using a CTE when a subquery is simpler

If you only reference the result once and it is a single aggregation, a subquery in the FROM clause is fine. CTEs add value when you reference the same intermediate result multiple times or need readability across many steps.

Infinite recursion in recursive CTEs

Every recursive CTE needs a termination condition. If your recursive member does not converge (e.g., no WHERE clause limiting depth or no rows being eliminated), the query runs forever. Most engines have a default recursion limit, but relying on it is not a plan.

Worked Example: Recursive Org Chart Traversal

Given an employees table with a self-referencing manager_id, build the full reporting chain from CEO to every employee with their depth in the hierarchy.

WITH RECURSIVE org_tree AS (
  -- Anchor: start from the CEO (no manager)
  SELECT id, name, manager_id, 0 AS depth
  FROM employees
  WHERE manager_id IS NULL

  UNION ALL

  -- Recursive: join each employee to their manager in the tree
  SELECT e.id, e.name, e.manager_id, t.depth + 1
  FROM employees e
  JOIN org_tree t ON e.manager_id = t.id
)
SELECT depth, name
FROM org_tree
ORDER BY depth, name;

The anchor member seeds the recursion with root nodes (employees whose manager_id IS NULL). Each recursive step joins one level deeper. The depth column increments by 1 at each level, so you can see exactly where each person sits in the hierarchy.

Expected output

 depth | name
-------+--------------
     0 | Sarah (CEO)
     1 | James (VP Eng)
     1 | Maria (VP Ops)
     2 | Anil
     2 | Priya

CTE Practice FAQ

What is a CTE in SQL?+
A CTE (Common Table Expression) is a named temporary result set defined with the WITH keyword. It exists only for the duration of the query. Think of it as giving a name to a subquery so you can reference it later. CTEs make complex queries readable by breaking them into labeled steps.
When should I use a CTE instead of a subquery?+
Use a CTE when you reference the same intermediate result more than once, when you need more than two levels of nesting, or when readability matters (it always does in interviews). Use a subquery for simple one-off filters or scalar values. In interviews, CTEs almost always win because the interviewer is reading your query in real time.
Are recursive CTEs commonly asked in interviews?+
Less often than non-recursive CTEs, but they still appear. The most common scenarios are org chart traversal, category hierarchies, and date spine generation. Recursive CTEs tend to show up in onsite rounds rather than phone screens (only 11.7% of interviews include onsite SQL at all, so the questions that do make it there tend to be harder). If you can write a basic recursive CTE from memory, you are ahead of most candidates.
Do CTEs affect query performance?+
It depends on the database engine. PostgreSQL materializes CTEs by default (prior to v12), which can hurt performance. Most modern engines (Trino, BigQuery, Snowflake) inline CTEs and optimize them like subqueries. In interviews, readability matters more than micro-optimization. Write the CTE. Mention the performance nuance if asked.

Build CTE Fluency with Real SQL

Reading CTE syntax is easy. Writing a three-layer CTE under time pressure is not. Practice with real execution and instant feedback.