SQL Subquery Practice with Examples

Subqueries and CTEs account for 4.9% of SQL interview questions. That number sounds small until you realize they cluster in the hardest rounds. Scalar, correlated, EXISTS, and IN subqueries with real execution, plus a clear breakdown of when to use subqueries vs CTEs vs JOINs.

What this guide covers

Subqueries and CTEs account for ~4.9% of SQL interview questions, but they cluster in the hardest rounds. Four subquery shapes cover the corpus: scalar, correlated, EXISTS, IN. Each solves a different kind of problem. Knowing which to reach for, and how it compares to a CTE or JOIN, is the difference between a passing onsite and a stuck candidate.

Four types of SQL subqueries

Each type solves a different kind of problem. Know when to reach for each.

Scalar subquery

Syntax: SELECT name, salary, (SELECT AVG(salary) FROM employees) AS avg_sal FROM employees. Returns a single value. Usable in SELECT, WHERE, or HAVING. The engine runs it once and plugs the result in. If it returns more than one row, the query fails. When to use: you need a single number, such as the average, the max, a count, or a specific lookup value.

Correlated subquery

Syntax: SELECT * FROM orders o WHERE amount > (SELECT AVG(amount) FROM orders WHERE customer_id = o.customer_id). References a column from the outer query. The database re-evaluates it for every outer row. Slower than uncorrelated subqueries on large tables, but solves problems they cannot. The tell: the inner query mentions an alias from the outer query. When to use: the subquery result depends on which row you're currently evaluating in the outer query.

EXISTS subquery

Syntax: SELECT * FROM customers c WHERE EXISTS (SELECT 1 FROM orders WHERE customer_id = c.id). Returns TRUE if the subquery produces at least one row. The database stops scanning as soon as it finds a match. Faster than IN for large datasets. The standard way to check presence (or absence with NOT EXISTS) of related rows. When to use: existence checks. 'Find customers who have at least one order.' 'Find products that have never been returned.'

IN subquery

Syntax: SELECT * FROM products WHERE category_id IN (SELECT id FROM categories WHERE active = true). Checks whether a value matches any row in the subquery result. Functionally similar to EXISTS for simple cases, but behaves differently with NULLs. If the subquery returns a NULL, NOT IN produces unexpected results because NULL comparisons are always unknown. Classic interview gotcha. When to use: simple membership tests where the subquery column is NOT NULL. Prefer EXISTS when NULLs are possible.

Five subquery practice problems

Each problem targets a different subquery type. The last one is deliberately painful to show why window functions were invented.

Warm-up · Scalar

1. Above-average salary

Find all employees whose salary is above the company average. Use a scalar subquery in the WHERE clause. The entry-level subquery question that establishes whether you know the basic syntax.

Intermediate · EXISTS/IN

2. Customers with recent orders

Find customers who placed at least one order in the last 30 days. Solve it twice: once with EXISTS, once with IN. Compare and explain when they differ. Hint: they differ when NULLs are involved.

Intermediate · Correlated

3. Per-group maximum

Find the highest-paid employee in each department. Correlated subquery: for each employee, check if their salary equals the maximum salary in their department. The canonical correlated subquery problem.

Intermediate · NOT EXISTS/NOT IN

4. Products never ordered

Find products that have never appeared in an order. Solve it three ways: LEFT JOIN with IS NULL, NOT EXISTS, NOT IN. All three return the same result if there are no NULLs. Add a NULL product_id to the orders table and see which approach breaks.

Advanced · Correlated

5. Running comparison without window functions

For each order, show the order amount and the cumulative average of all previous orders by the same customer. Use a correlated subquery with no window functions. Hard. The correlated subquery must filter by customer_id AND by order_date less than the current row. Demonstrates why window functions exist.

Subquery vs CTE vs JOIN: when to use what

Interviewers sometimes ask you to rewrite a subquery as a JOIN, or explain why you chose one over the other. Knowing which tool to reach for is part of the answer.

ApproachStrengthsWeaknessesBest for
SubquerySimple, self-contained, works everywhere. Scalar subqueries in SELECT are easy to read. EXISTS is efficient for existence checks.Correlated subqueries re-execute per row. Nested subqueries beyond two levels become unreadable. NOT IN breaks with NULLs.Simple lookups, existence checks, one-off comparisons to aggregates.
CTENamed, readable, top-to-bottom flow. Each step isolated and testable. Can reference the same CTE multiple times.Some engines materialize CTEs. Cannot be correlated to an outer query.Multi-step transformations; queries that reference intermediate results more than once.
JOINOptimizers understand JOINs best. Set-based, efficient for large-scale matching. Clear cardinality semantics.Fan-out risk with one-to-many joins. Hard to read with 4+ tables. Requires understanding NULL behavior in outer joins.Combining tables; matching rows; any problem where you need columns from multiple sources.

Worked example: employees who earn above their department average

SELECT
  e.emp_name,
  e.department,
  e.salary,
  dept_avg.avg_salary
FROM employees e
JOIN (
  SELECT department, ROUND(AVG(salary), 2) AS avg_salary
  FROM employees
  GROUP BY department
) dept_avg ON dept_avg.department = e.department
WHERE e.salary > dept_avg.avg_salary
ORDER BY e.salary - dept_avg.avg_salary DESC;

The subquery in FROM computes each department's average once, then the outer query joins it back to filter individual employees. Avoids re-computing the average per row (which a correlated subquery in WHERE would do). Ordering by the gap surfaces the biggest outliers first.

Subquery FAQ

What is a correlated subquery in SQL?+
A correlated subquery references a column from the outer query. The database evaluates it once for every outer row. Example: finding employees who earn more than their department average. The subquery filters by the current row's department_id, so it produces a different result per row.
When should I use EXISTS instead of IN?+
EXISTS when checking for the presence of related rows, especially on large tables, since EXISTS stops scanning as soon as it finds one match. Also EXISTS when the subquery column might contain NULLs. NOT IN returns no rows if any NULL is in the subquery result. NOT EXISTS handles NULLs correctly.
Are subqueries slower than JOINs?+
Not necessarily. Modern optimizers often rewrite subqueries as joins internally. Scalar and EXISTS subqueries are well-optimized. Correlated subqueries can be slow on large tables because they re-execute per row. Check the execution plan rather than assuming one form is always faster.
Can I use a subquery in the SELECT clause?+
Yes. A scalar subquery in SELECT returns one value per row. Example: SELECT name, (SELECT COUNT(*) FROM orders WHERE orders.user_id = users.id) AS order_count FROM users. That is a correlated scalar subquery, which works but can be slow on large tables. A LEFT JOIN with GROUP BY often performs better.
02 / Why practice

Write subqueries that actually run

  1. 01

    Active recall beats re-reading by 50%

    Cognitive-science meta-reviews (Dunlosky et al., 2013) rank practice testing as a top-tier study technique, while re-reading and highlighting rank near the bottom

  2. 02

    76% of hiring managers reject on the coding task, not the resume

    From HackerRank's 2024 Developer Skills Report. Candidates who look strong on paper still fail the live screen if they haven't done timed, executable practice

  3. 03

    Five problem shapes cover 80% of data engineer loops

    Dedup, sessionization, top-N-per-group, slowly-changing dimensions, partition tricks. Writing the shapes by hand turns the unfamiliar into pattern recognition

Related guides