SQL Practice

SQL LEFT OUTER JOIN Explained

Most candidates think LEFT JOIN "keeps all the data." The interviewer is checking whether you know LEFT JOIN drops rows the moment you put the right-table filter in the WHERE clause instead of the ON clause. That one mistake is the single most common LEFT JOIN bug in phone screens.

LEFT JOIN shows up in 17% of verified DE SQL rounds and FULL OUTER in 6%. The frequency is lower than INNER JOIN's 29%, but the bug surface is much larger. This page covers the syntax, the anti-join, and the three failure modes that silently corrupt outputs without throwing errors.

17%

Questions using LEFT JOIN

6%

FULL OUTER JOIN rate

1

Misplaced WHERE to break it

3

Silent failure modes

Source: DataDriven analysis of 1,042 verified data engineering interview rounds.

LEFT JOIN Syntax

LEFT JOIN and LEFT OUTER JOIN are byte-for-byte identical to the parser. The OUTER keyword is cosmetic. Most teams drop it because five characters aren't worth the debate in code review. The real action is in the ON clause, which is where the interviewer is going to set a trap.

-- These two are identical
SELECT * FROM a LEFT JOIN b ON a.id = b.a_id;
SELECT * FROM a LEFT OUTER JOIN b ON a.id = b.a_id;

-- Full syntax
SELECT
  left_table.column1,
  right_table.column2
FROM left_table
LEFT JOIN right_table
  ON left_table.key = right_table.key
  [AND additional_conditions]
WHERE [post-join_filters];

Interview note: The table listed first (after FROM) is the left table. The table after LEFT JOIN is the right table. Table order matters for LEFT JOIN but not for INNER JOIN. Swapping the tables in a LEFT JOIN changes the semantics entirely.

LEFT JOIN vs INNER JOIN

The difference only matters when unmatched rows exist. If every left-table row has at least one match, both join types return the same result. The divergence appears when some left rows have no corresponding right rows.

BehaviorLEFT JOININNER JOIN
Unmatched left rowsKept, with NULLs for right columnsDropped entirely
Unmatched right rowsDropped (use FULL OUTER JOIN to keep both)Dropped
Result row countAt least as many rows as the left table (more if right has duplicates)Only matching rows
NULL generationYes, for right-side columns where no match existsNo NULLs generated by the join itself
Use casePreserve all entities even without related dataOnly entities with matching data
-- customers: Alice (id=1), Bob (id=2), Carol (id=3)
-- orders: order for Alice (customer_id=1), order for Bob (customer_id=2)

-- LEFT JOIN: 3 rows (Carol has NULLs for order columns)
SELECT c.name, o.order_id
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id;
-- Alice  | 101
-- Bob    | 102
-- Carol  | NULL    <-- preserved by LEFT JOIN

-- INNER JOIN: 2 rows (Carol dropped)
SELECT c.name, o.order_id
FROM customers c
INNER JOIN orders o ON c.customer_id = o.customer_id;
-- Alice  | 101
-- Bob    | 102

5 LEFT JOIN Patterns

From basic entity preservation to the anti-join and the critical ON vs WHERE distinction. These patterns cover the full range of LEFT JOIN usage in interviews and production.

Basic LEFT JOIN

LEFT JOIN returns all rows from the left table. For rows that match the ON condition, columns from the right table are populated. For rows that do not match, right-side columns are filled with NULL. The keyword OUTER is optional: LEFT JOIN and LEFT OUTER JOIN are identical.

-- All customers, with their orders if any
SELECT
  c.customer_id,
  c.name,
  o.order_id,
  o.amount
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id;

-- Customers with no orders will have:
-- order_id = NULL, amount = NULL
-- They are NOT dropped from the result

Anti-Join (LEFT JOIN + WHERE IS NULL)

The anti-join finds rows in the left table that have no match in the right table. LEFT JOIN produces NULLs for unmatched rows. Adding WHERE right_table.key IS NULL filters to only those unmatched rows. This is one of the most powerful and commonly tested patterns in SQL interviews. It answers questions like 'which customers have never ordered' and 'which products have no reviews.'

-- Customers who have never placed an order
SELECT c.customer_id, c.name
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
WHERE o.customer_id IS NULL;

-- Alternative: NOT EXISTS (often same performance)
SELECT c.customer_id, c.name
FROM customers c
WHERE NOT EXISTS (
  SELECT 1 FROM orders o
  WHERE o.customer_id = c.customer_id
);

-- Alternative: NOT IN (careful with NULLs)
SELECT customer_id, name
FROM customers
WHERE customer_id NOT IN (
  SELECT customer_id FROM orders WHERE customer_id IS NOT NULL
);

LEFT JOIN with Aggregation

Combining LEFT JOIN with GROUP BY and aggregate functions is a common reporting pattern. The LEFT JOIN preserves all entities from the left table. The aggregation summarizes the related data. COALESCE handles the NULLs generated by unmatched rows, converting them to zeros or default values for clean reporting.

-- All departments with their employee count and total salary
SELECT
  d.department_name,
  COUNT(e.employee_id) AS headcount,
  COALESCE(SUM(e.salary), 0) AS total_payroll
FROM departments d
LEFT JOIN employees e ON d.department_id = e.department_id
GROUP BY d.department_name;

-- Departments with no employees show headcount = 0, total_payroll = 0
-- COUNT(e.employee_id) correctly returns 0 for NULLs
-- SUM(e.salary) returns NULL for empty groups, so COALESCE is needed

Multiple LEFT JOINs

You can chain multiple LEFT JOINs to bring in data from several related tables while preserving all rows from the leftmost table. Each LEFT JOIN independently adds columns. If any join has no match, those columns are NULL. This pattern builds wide denormalized views for analytics and reporting.

-- Customer profile with latest order and support ticket count
SELECT
  c.customer_id,
  c.name,
  c.email,
  lo.latest_order_date,
  lo.latest_amount,
  COALESCE(t.ticket_count, 0) AS tickets
FROM customers c
LEFT JOIN (
  SELECT DISTINCT ON (customer_id)
    customer_id,
    order_date AS latest_order_date,
    amount AS latest_amount
  FROM orders
  ORDER BY customer_id, order_date DESC
) lo ON c.customer_id = lo.customer_id
LEFT JOIN (
  SELECT customer_id, COUNT(*) AS ticket_count
  FROM support_tickets
  GROUP BY customer_id
) t ON c.customer_id = t.customer_id;

LEFT JOIN with WHERE vs ON Conditions

This is a critical distinction that interviewers test frequently. Conditions in the ON clause filter the right table before the join. Conditions in the WHERE clause filter the joined result after the join. For LEFT JOINs, moving a right-table filter from ON to WHERE can eliminate rows that should have been preserved, effectively turning the LEFT JOIN into an INNER JOIN.

-- CORRECT: filter in ON clause (preserves unmatched customers)
SELECT c.customer_id, c.name, o.order_id
FROM customers c
LEFT JOIN orders o
  ON c.customer_id = o.customer_id
  AND o.status = 'completed';
-- Customers with no completed orders: order_id = NULL (preserved)

-- WRONG: filter in WHERE clause (drops unmatched customers)
SELECT c.customer_id, c.name, o.order_id
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
WHERE o.status = 'completed';
-- Customers with no completed orders are DROPPED
-- because o.status is NULL for them, and NULL != 'completed'

Common LEFT JOIN Mistakes

These three mistakes produce queries that run successfully but return incorrect results. Each one has cost real candidates real offers.

WHERE clause on right table nullifies the LEFT JOIN

The most common LEFT JOIN bug. Adding WHERE right_table.column = 'value' drops all rows where the right table has no match, because the column is NULL for those rows, and NULL does not equal anything. This silently converts the LEFT JOIN into an INNER JOIN. The query runs without errors. The results look plausible. The bug goes unnoticed until someone asks why certain rows are missing.

Fix: Move right-table filter conditions to the ON clause: LEFT JOIN ... ON key = key AND right_column = 'value'. This preserves unmatched left rows while filtering the right table before joining.

Using COUNT(*) instead of COUNT(column) after LEFT JOIN

COUNT(*) counts all rows, including those with NULLs from unmatched joins. If you LEFT JOIN customers to orders and GROUP BY customer, COUNT(*) returns 1 for customers with no orders (because the LEFT JOIN produces one row with NULLs). COUNT(order_id) returns 0, which is the correct count.

Fix: Use COUNT(right_table.column) when counting related entities after a LEFT JOIN. COUNT(o.order_id) correctly returns 0 for customers with no orders.

Not handling NULLs in calculations

LEFT JOINs generate NULLs. Any arithmetic with NULL produces NULL: salary + NULL = NULL. Any comparison with NULL returns UNKNOWN: NULL > 0 is not TRUE. Downstream calculations, CASE expressions, and WHERE conditions all need to account for these generated NULLs. Forgetting this produces incorrect totals, missing rows in filters, and broken reports.

Fix: Wrap right-side columns in COALESCE: COALESCE(o.amount, 0) for arithmetic. Use IS NULL / IS NOT NULL for comparisons. Test your query with entities that have no right-table matches to verify NULL handling.

4 LEFT JOIN Interview Questions

These questions test join mechanics, anti-join patterns, fan-out understanding, and the ON vs WHERE distinction.

Q1: What is the difference between LEFT JOIN and INNER JOIN? Give an example where they return different results.

What they test:

Fundamental join understanding. INNER JOIN drops unmatched rows from both sides. LEFT JOIN keeps all left-table rows, filling NULLs for unmatched right-side columns. The interviewer wants a concrete example, not just the definition. A table of customers with 100 rows and an orders table missing some customers: INNER JOIN returns fewer rows than LEFT JOIN because customers without orders are dropped.

Approach:

Give the example: 100 customers, 80 have orders. INNER JOIN returns rows for the 80 customers who have orders. LEFT JOIN returns rows for all 100 customers; the 20 without orders have NULL in the order columns. Explain when each is appropriate: INNER JOIN when you only need entities with related data, LEFT JOIN when you need all entities regardless.

Q2: Write a query to find all products that have never been ordered.

What they test:

The anti-join pattern. This is one of the most frequently tested join patterns across all SQL interview levels. The interviewer checks whether you use LEFT JOIN + WHERE IS NULL, NOT EXISTS, or NOT IN. All three approaches work, but LEFT JOIN + IS NULL is the most visual and commonly expected in DE interviews.

Approach:

SELECT p.product_id, p.name FROM products p LEFT JOIN order_items oi ON p.product_id = oi.product_id WHERE oi.product_id IS NULL. Explain why the IS NULL check must be on the join key from the right table. Mention NOT EXISTS as an alternative. Warn about NOT IN with NULLs: if the subquery returns any NULL, NOT IN returns empty results.

Q3: A LEFT JOIN query unexpectedly returns more rows than the left table. What happened?

What they test:

Understanding of join fan-out. LEFT JOIN guarantees at least one row per left-table row. But if the right table has multiple matching rows (the join key is not unique on the right side), the left row is duplicated for each match. This is a fan-out. 100 customers joining to an orders table where some customers have 5 orders produces up to 500 rows.

Approach:

Identify the cause: the join key is not unique in the right table. Example: customer_id appears multiple times in orders. Each order creates a separate joined row. Solutions: pre-aggregate the right table before joining (SELECT customer_id, COUNT(*) ... GROUP BY), or use a subquery with ROW_NUMBER to pick one row per key, or use DISTINCT ON.

Q4: Explain why moving a WHERE condition on the right table to the ON clause changes the result of a LEFT JOIN.

What they test:

ON vs WHERE semantics for outer joins. The ON clause filters before the join: rows that do not meet the condition are excluded from the right table, so the left row appears with NULLs. The WHERE clause filters after the join: rows where the right-side column is NULL (no match) are filtered out entirely. This effectively converts the LEFT JOIN to an INNER JOIN for that condition.

Approach:

Walk through an example. LEFT JOIN orders ON customers.id = orders.customer_id AND orders.year = 2024. Customers without 2024 orders still appear (with NULL). Move the year filter to WHERE: WHERE orders.year = 2024. Now customers without 2024 orders are dropped because orders.year is NULL, which fails the WHERE check. Conclusion: right-table filters belong in ON for LEFT JOINs.

LEFT JOIN FAQ

What is a LEFT OUTER JOIN in SQL?+
A LEFT OUTER JOIN (or simply LEFT JOIN) returns all rows from the left table and the matching rows from the right table. For left-table rows that have no match in the right table, the right-side columns are filled with NULL. This preserves every row from the left table regardless of whether a match exists. LEFT JOIN and LEFT OUTER JOIN are identical; the OUTER keyword is optional.
What is the difference between LEFT JOIN and INNER JOIN?+
INNER JOIN returns only rows where both tables have a match on the join condition. LEFT JOIN returns all rows from the left table, even if there is no match in the right table (filling NULLs for unmatched right columns). If every left-table row has at least one match, LEFT JOIN and INNER JOIN produce the same result. The difference only shows when some left-table rows have no corresponding right-table rows.
What is an anti-join in SQL?+
An anti-join finds rows in one table that have no matching rows in another table. The standard SQL pattern is LEFT JOIN combined with WHERE right_key IS NULL. For example: SELECT a.* FROM table_a a LEFT JOIN table_b b ON a.id = b.id WHERE b.id IS NULL returns all rows from table_a that do not exist in table_b. NOT EXISTS and NOT IN are alternative syntax for the same operation.
Why does my LEFT JOIN return more rows than the left table?+
The right table has duplicate values for the join key. LEFT JOIN produces one output row for each matching combination. If a customer has 5 orders, the LEFT JOIN creates 5 rows for that customer. This is called a fan-out. To fix it, aggregate or deduplicate the right table before joining: join to a subquery that has one row per key.

The ON vs WHERE Trap Is Still Free Real Estate for Interviewers

Senior engineers still fail LEFT JOIN questions because they trust the word "outer" and don't check their WHERE clause. Practice the anti-join until the filter placement is muscle memory, not guesswork.