Joins: Intermediate

Twitter's recommendation engine surfaces thousands of new accounts to every user each week, and generating those suggestions requires showing results even when follow data is incomplete. When a user has zero mutual followers with a suggested account, Twitter still needs to include that account with a zero-connection count rather than hiding it entirely. That behavior is powered by outer joins, which preserve rows from one table even when the other table has nothing to match. The join types you are about to learn are the engine behind every "who to follow" feature on social platforms.

LEFT JOIN

Daily Life
Interviews

Keep all rows even without a match

LEFT JOIN guarantees every row from your primary table appears in the result. What SQL puts in the other table's columns when there is no match is where this join type gets interesting.

THE GHOST ROSTER

Lisa Park flagged a pipeline report. Six customers entered the system. Six rows came out. But three rows are half-empty: names and signup dates intact, order columns blank. No WHERE clause. No filter. Nothing was deleted. How do six rows survive a query with half their data missing?

Two tables. Six customers. Three orders.

customers
idnamesignup
1WuNov 15
2PatelDec 01
3OseiJan 10
4RuizJan 18
5KimOct 05
6TanakaJan 22
orders
order_idcust_idamount
ORD-4411$2,400
ORD-4452$1,800
ORD-4475$3,100
Consider a common scenario: You want to analyze customer activity, including customers who haven't ordered yet. Here's the problem:
customers
cust_idname
C1Alice
C2Bob
C3Carol
C4Dave
orders
order_idcust_idtotal
O1C1$50
O2C1$30
O3C3$75

With INNER JOIN, Bob (C2) and Dave (C4) disappear from results because they have no orders. If you're analyzing "all customers," this silently loses data.

1SELECT
2 c.name,
3 o.order_id
4FROM customers AS c
5INNER JOIN orders AS o
6 ON c.cust_id = o.cust_id
Result
nameorder_id
AliceO1
AliceO2
CarolO3

LEFT JOIN preserves all rows from your primary table, filling in NULLs where no match exists in the secondary table.

LEFT JOIN (or LEFT OUTER JOIN) keeps ALL rows from the left table. If a row has no match in the right table, the right table's columns are filled with NULL.

//

Syntax

1SELECT
2 a.first_name,
3 b.status
4FROM customers AS a
5LEFT JOIN orders AS b
6 ON a.id = b.id
Result
col1col2
A1B1
A2NULL
A3B3

Watch how every row from the left table appears in the result. Rows without a matching right-side record receive NULL for those columns.

1SELECT
2 c.name,
3 o.order_id,
4 o.total
5FROM customers AS c
6LEFT JOIN orders AS o
7 ON c.cust_id = o.cust_id
Join Animation[object Object]
customers
cust_idname
C1Alice
C2Bob
C3Carol
C4Dave
orders
order_idcust_id
O1C1
O2C1
O3C3
result
col_0col_1col_2
C1AliceO1C1
Row 1/4

Notice: Bob and Dave appear in the results even though they have no orders. Their order columns show NULL.

1SELECT
2 c.name,
3 o.order_id,
4 o.total
5FROM customers AS c
6LEFT JOIN orders AS o
7 ON c.cust_id = o.cust_id
Result
nameorder_idtotal
AliceO1$50
AliceO2$30
BobNULLNULL
CarolO3$75
DaveNULLNULL

Practical Applications

LEFT JOIN shines when you need complete visibility into one table while optionally pulling in related data from another.

//

Common Use Cases

CUSTOMERSPRODUCTSEMPLOYEESDATES
CUSTOMERS
Missing orders
Find who never bought
PRODUCTS
No sales yet
Spot unsold inventory
EMPLOYEES
Not assigned
Staff without projects
DATES
No activity
Gaps in time coverage

> Find customers who have never placed an order.

SELECT
  c.first_name
FROM customers AS c
___ transactions AS t
  ON c.customer_id = t.user_id
WHERE t.transaction_id ___
IS NOT NULL
IS NULL
LEFT JOIN

LEFT JOIN is the single most commonly used outer join in practice because the need to preserve a primary entity while optionally pulling in related data appears in nearly every reporting scenario.

The NULL values that LEFT JOIN produces for unmatched rows are not a defect but a deliberate signal that tells you exactly which rows had no corresponding record in the joined table.

When you see unexpected NULLs in a LEFT JOIN result, the first thing to check is whether your join condition is correctly matching the foreign key from one table to the primary key of the other.

Finding unmatched rows

Daily Life
Interviews

Detect missing or orphaned records

When a LEFT JOIN produces NULL values in right-table columns, those NULLs tell you something important: no match was found. This becomes a powerful filter technique.

A powerful pattern: Use LEFT JOIN with WHERE to find rows that have NO match:

1SELECT
2 c.name
3FROM customers AS c
4LEFT JOIN orders AS o
5 ON c.cust_id = o.cust_id
6WHERE o.order_id IS NULL
Result
name
Bob
Dave

This finds customers with zero orders. The WHERE clause filters to only rows where the right table had no match (NULL).

Testing Your Understanding

Practice writing LEFT JOINs and using the NULL pattern to find unmatched rows.

> Complete this query to show all customers alongside their orders, even if a customer has no orders yet.

SELECT
  c.name,
  o.total
FROM customers AS c
___ orders AS o
  ___ c.id = o.customer_id
INNER JOIN
ON
LEFT JOIN

The pattern of LEFT JOIN combined with WHERE on the right-table key being IS NULL is one of the most useful data quality patterns in SQL because it exposes orphaned or incomplete records in a single query.

This technique is particularly valuable during data migrations, integration testing, and routine audits where you need to confirm that every record in one table has a corresponding record in another.

Once you recognize that NULLs from a LEFT JOIN represent the absence of a match rather than missing data, you can use them as a precise filter to surface exactly the gaps you need to investigate.

WHERE vs ON clause filtering

Daily Life
Interviews

Control when filters apply in joins

Where you place filter conditions in a LEFT JOIN query dramatically affects which rows survive. Understanding execution order explains why.

One of the most common mistakes with LEFT JOIN is accidentally converting it into an INNER JOIN by filtering on the right table in a WHERE clause. Understanding why this happens requires knowing the order of operations.

//

Query Evaluation

The database processes your query in a specific order. For a LEFT JOIN query, the evaluation happens in these phases:

01
1. FROM
Identify the left table
02
2. JOIN+ON
Pair rows, apply condition
03
3. LEFT fill
Add NULL for unmatched
04
4. WHERE
Filter combined results
05
5. SELECT
Return requested columns

The critical insight: WHERE is evaluated AFTER the join is complete. This means WHERE filters the already-joined result set, including the NULL-filled rows from unmatched left table entries.

//

WHERE on Right Columns

Consider this query that tries to find customers with high-value orders:
1SELECT
2 c.name,
3 o.total
4FROM customers AS c
5LEFT JOIN orders AS o
6 ON c.cust_id = o.cust_id
7WHERE o.total > 50
Result
nametotal
Alice75
TIP
This WHERE clause on the right table column removes NULL rows, effectively converting the LEFT JOIN to an INNER JOIN!
Let's trace through what happens step by step:
STEP-BY-STEP EVALUATION
  • Start with customers (Alice, Bob, Carol)
  • LEFT JOIN with orders on cust_id
  • Bob has no orders, so his row gets NULL for o.total
  • WHERE o.total > 50 filters: Alice (75) kept, Bob (NULL) removed, Carol (30) removed

This behavior stems from how SQL handles NULL in comparisons.

Filter Placement Effects

Moving filters between ON and WHERE clauses changes when conditions are evaluated and which rows survive the query.

//

Filter in ON Clause

Moving the filter condition into the ON clause changes when the filter is applied:

1SELECT
2 c.name,
3 o.total
4FROM customers AS c
5LEFT JOIN orders AS o
6 ON c.cust_id = o.cust_id
7 AND o.total > 50
Result
nametotal
Alice75
BobNULL
CarolNULL
Now the evaluation changes dramatically:
Match with conditions
Match with conditions
Each customer tries to match orders where cust_id matches AND total > 50
Evaluate per customer
Evaluate per customer
Alice matches (total=75). Bob has no orders. Carol's order (total=30) fails the condition.
Fill unmatched with NULL
Fill unmatched with NULL
Bob and Carol get NULL for order columns. No WHERE clause removes them.
All 3 customers appear
All 3 customers appear
The LEFT JOIN preserves every left row regardless of the ON filter outcome.
//

Comparing the Results

With WHERE o.total > 50: Only Alice appears (1 row). Bob and Carol are filtered out because their rows have NULL or values that don't match.

With the filter in ON: All 3 customers appear. Alice shows her total (75), while Bob and Carol show NULL because they don't have matching high-value orders.

Choosing Right Placement

Both ON and WHERE placements have valid use cases. The choice depends on whether you want to preserve unmatched rows.

//

When to Use WHERE vs ON

Both placements have valid use cases - it depends on your intent:
Filter in ON Clause
  • Applied during the join process
  • Unmatched left rows still appear (with NULLs)
  • Use when: "Show all X, with Y data if it meets criteria"
Filter in WHERE
  • Applied after the join completes
  • Removes rows that don't satisfy condition
  • Use when: "Only show X that have Y meeting criteria"
//

Real-World Example

Show all customers with their recent orders (last 30 days), including customers who haven't ordered recently:
1SELECT
2 c.name,
3 c.email,
4 o.order_date,
5 o.total
6FROM customers AS c
7LEFT JOIN orders AS o
8 ON c.cust_id = o.cust_id
9 AND o.order_date >= CURRENT_DATE - INTERVAL '30' DAY
10ORDER BY c.name, o.order_date DESC
Result
nameemailorder_datetotal
Alicealice@email.com2024-01-15150.00
Bobbob@email.comNULLNULL
Carolcarol@email.com2024-01-1085.00

By putting the date filter in the ON clause, customers without recent orders appear with NULL order data instead of being excluded entirely.

TIP
Put right-table filters in ON when you want to preserve all left rows for "optional matching." If you filter on right-table columns in WHERE instead, rows without a match get removed entirely. Remember that NULL comparisons return NULL, not FALSE.

> Show all customers with their high-value orders (over $100), keeping customers without such orders.

SELECT
  c.name,
  o.total
FROM customers AS c
LEFT JOIN orders AS o
  ___ c.id = o.customer_id
  ___ o.total > 100
WHERE
ON
AND

The WHERE vs ON distinction for LEFT JOIN filters is one of the most subtle and consequential decisions in SQL, because the results can look superficially similar while being logically very different.

When a colleague asks why their LEFT JOIN is behaving like an INNER JOIN, the WHERE clause filter on a right-table column is almost always the cause, making this one of the most important debugging patterns to recognize.

A reliable mental model is that the ON clause defines which rows can pair together, while the WHERE clause decides which pairs to keep after all rows have been preserved or filled with NULL.

RIGHT JOIN & FULL OUTER

Daily Life
Interviews

Preserve rows from either or both tables

RIGHT JOIN preserves all rows from the right table instead of the left. While rarely used in practice, understanding it completes the join picture.

RIGHT JOIN is the mirror of LEFT JOIN. It keeps ALL rows from the right table, filling NULL for unmatched left table columns.

//

Syntax

1SELECT
2 a.col1,
3 b.col2
4FROM left_table AS a
5RIGHT JOIN right_table AS b
6 ON a.id = b.id
Result
col1col2
A1B1
NULLB2
A3B3

Notice how every right-side row appears regardless of whether a matching left row exists. Absent left-side values become NULL.

1SELECT
2 a.col1,
3 b.col2
4FROM left_table AS a
5RIGHT JOIN right_table AS b
6 ON a.id = b.id
Join Animation[object Object]
customers
cust_idname
C1Alice
C3Carol
orders
order_idcust_idtotal
O1C1$50
O2C3$30
O3C5$99
result
col_0col_1col_2
C1AliceO1C1$50
Row 1/2

Notice that the result preserves every order row. When a customer has no matching record (C5), the name column fills with NULL rather than dropping the row entirely.

//

Invalid References

Consider a scenario where some orders reference customers that don't exist (orphaned data):
1SELECT
2 c.name,
3 o.order_id,
4 o.total
5FROM customers AS c
6RIGHT JOIN orders AS o
7 ON c.cust_id = o.cust_id
Result
nameorder_idtotal
AliceO1$50
CarolO2$30
NULLO3$99

Order O3 appears with NULL name because it references a customer that doesn't exist. RIGHT JOIN ensures all orders are shown.

TIP
RIGHT JOIN is rarely used in practice. Most queries can be rewritten using LEFT JOIN by swapping the table order. Many SQL style guides recommend using LEFT JOIN exclusively for consistency.
//

RIGHT JOIN vs LEFT JOIN

These two queries produce identical results:
1SELECT
2 c.name,
3 o.order_id
4FROM customers AS c
5RIGHT JOIN orders AS o
6 ON c.cust_id = o.cust_id

FULL OUTER JOIN Essentials

FULL OUTER JOIN preserves all rows from both tables, making it ideal for data reconciliation and comparison tasks.

//

FULL OUTER JOIN

FULL OUTER JOIN combines LEFT and RIGHT JOIN. It keeps ALL rows from BOTH tables, filling NULL where there's no match on either side.

//

Syntax

1SELECT
2 a.col1,
3 b.col2
4FROM table1 AS a
5FULL OUTER JOIN table2 AS b
6 ON a.id = b.id
Result
col1col2
A1B1
A2NULL
NULLB3
A4B4

Observe how neither table loses a row. Both sides are fully preserved, with NULL standing in wherever a match is absent.

1SELECT
2 a.col1,
3 b.col2
4FROM table1 AS a
5FULL OUTER JOIN table2 AS b
6 ON a.id = b.id
Join Animation[object Object]
users
user_idname
U1Alice
U2Bob
U3Carol
user_sessions
user_idemail
U1alice@co
U3carol@co
U4dan@co
result
col_0col_1col_2col_3
U1AliceU1alice@co
Row 1/3

Both tables are fully represented in the output. Rows unique to users get NULL in the right columns, rows unique to user_sessions get NULL in the left columns, and matched rows appear complete.

//

Comparing Two Lists

FULL OUTER JOIN is perfect for reconciliation - comparing two lists to find discrepancies:

users
user_idname
U1Alice
U2Bob
U3Carol
user_sessions
user_idemail
U1alice@co.com
U3carol@co.com
U4dan@co.com
1SELECT
2 a.user_id AS a_id,
3 a.name,
4 b.user_id AS b_id,
5 b.email
6FROM users AS a
7FULL OUTER JOIN user_sessions AS b
8 ON a.user_id = b.user_id
Result
a_idnameb_idemail
U1AliceU1alice@co.com
U2BobNULLNULL
U3CarolU3carol@co.com
NULLNULLU4dan@co.com
The result shows: Bob exists only in users. Dan exists only in user_sessions. Alice and Carol exist in both. This is data reconciliation in action.

> Compare user presence between transactions and push notifications to find all discrepancies.

SELECT
  t.user_id,
  p.user_id
FROM transactions AS t
___ push_notifs AS p
  ___ t.user_id = p.user_id
FULL OUTER JOIN
ON
LEFT JOIN

FULL OUTER JOIN is the most comprehensive join type because it guarantees that no row from either table will be dropped, making it ideal for any scenario where you need a complete picture of both datasets.

Data reconciliation between two systems is the most common real-world use case for FULL OUTER JOIN because it lets you identify records that exist in one system but not the other in a single pass.

RIGHT JOIN can almost always be rewritten as a LEFT JOIN by swapping the table order, which is why most teams adopt a convention of using only LEFT JOIN to keep their SQL consistent and easier to read.

Multiple JOINs

Daily Life
Interviews

Chain three or more tables together

Each join type has specific data preservation characteristics. Understanding these helps you select the right tool for your analysis.
//

Choosing the Right Join

Each join type preserves different rows. Choose based on which data you can't afford to lose:
Join TypePreservesFills NULL
INNEROnly matchingNever
LEFTAll left rowsRight columns
RIGHTAll right rowsLeft columns
FULL OUTERAll rows bothBoth sides
//

Relational Context

For an in-depth exploration of table relationships and cardinality notation, see the Relationships lesson in Data Modeling.

The type of relationship between tables often guides your join choice:
customersPKcust_idnameordersorder_idFKcust_idtotal

In a 1:N relationship from customers to orders, LEFT JOIN from customers shows all customers. Some may have zero orders (NULL in order columns), some may have multiple.

Real-world databases normalize data across multiple tables. To reassemble complete information, you often need to chain multiple JOINs together, connecting tables through their foreign key relationships.

//

Why Multiple Joins

Consider an e-commerce database with three tables: customers, orders, and products. The orders table acts as a "bridge" - it contains customer IDs linking to customers, and product IDs linking to products. To get a complete picture of "who ordered what," you need to join all three tables.
Follow how the orders table bridges customers and products, with each join linking rows through their shared keys.
1SELECT
2 c.name AS customer,
3 o.order_id,
4 p.name AS product
5FROM customers AS c
6INNER JOIN orders AS o
7 ON c.cust_id = o.cust_id
8INNER JOIN products AS p
9 ON o.prod_id = p.prod_id
Join Animation[object Object], [object Object]
customers
cust_idname
C1Alice
C2Bob
orders
order_idcust_idprod_id
O1C1P1
O2C1P2
O3C2P1
products
prod_idname
P1Widget
P2Gadget
P3Gizmo
result
col_0col_1col_2
C1AliceO1C1P1
Row 1/2
Each join produces an intermediate result that feeds into the next. The orders table acts as the bridge, connecting customer names on the left to product names on the right through shared keys.
//

Understanding the Query

1SELECT
2 c.name AS customer,
3 o.order_id,
4 p.name AS product
5FROM customers AS c
6INNER JOIN orders AS o
7 ON c.cust_id = o.cust_id
8INNER JOIN products AS p
9 ON o.prod_id = p.prod_id
Result
customerorder_idproduct
AliceO1Widget
AliceO2Gadget
BobO3Widget
//

Database Evaluation

When the database processes multiple JOINs, it evaluates them sequentially from left to right:

STEP-BY-STEP EXECUTION
  • Start with the FROM table (customers)
  • Execute first JOIN to pair customers with orders on cust_id
  • This creates an intermediate result set
  • Execute second JOIN to pair intermediate with products on prod_id
  • Apply SELECT to extract the requested columns

Conceptually, each JOIN produces a new virtual table that becomes the input for the next JOIN. The first JOIN combines customers and orders, then that combined result joins with products.

Performance Considerations

Multiple joins can significantly impact query performance. Understanding the computational cost helps you write efficient queries.
//

Computational Cost

Multiple joins multiply the work the database must do. Each additional join can significantly increase query time:
Fortunately, databases use sophisticated optimizations:
Index lookups
Index lookups
Using indexes on join keys for O(log n) lookups
Hash joins
Hash joins
Building hash tables for O(1) key matching
Join reordering
Join reordering
Starting with smaller tables to reduce intermediate results
Predicate pushdown
Predicate pushdown
Applying WHERE filters early to reduce data volume
//

Join Order

While the optimizer rearranges INNER JOINs for performance, your written order matters for readability. Follow this pattern for clear, maintainable queries:

Do
  • Start from the "main" entity you're reporting on
  • Add related tables one at a time
  • Use short, consistent aliases (c, o, p)
  • Format each JOIN on its own line
Don't
  • Chain more than 4-5 joins without testing intermediate results
  • Mix alias styles within a single query

Advanced Join Patterns

Complex queries often require mixing different join types and managing table aliases carefully to get the exact results you need.
//

Mixing Join Types

You can combine different join types. Example: Show all orders with customer names, including orders without valid customers:
1SELECT
2 c.name,
3 o.order_id,
4 p.name AS product
5FROM orders AS o
6LEFT JOIN customers AS c
7 ON o.cust_id = c.cust_id
8INNER JOIN products AS p
9 ON o.prod_id = p.prod_id
Result
nameorder_idproduct
AliceORD001Laptop
NULLORD002Monitor
BobORD003Keyboard

This keeps all orders (LEFT JOIN to customers), but only orders with valid products (INNER JOIN to products).

> Show all orders with customer names (even orphaned orders), but only for valid products.

SELECT
  c.name,
  o.order_id,
  p.name
FROM orders AS o
___ customers AS c
  ___ o.cust_id = c.id
___ products AS p
  ___ o.prod_id = p.id
ON
LEFT JOIN
ON
INNER JOIN
RIGHT JOIN
//

Alias Pitfalls

With multiple joins, column name collisions become common. Always use aliases:
Good Alias Habits
  • Short, meaningful aliases (c, o, p)
  • Every column qualified with alias
  • Consistent style across the query
Common Mistakes
  • Mixing aliased and non-aliased refs
  • Reusing the same alias letter
  • Leaving columns ambiguous
Outer joins and multi-table queries are essential for working with real-world datasets where not every record has a match. Put these patterns to the test with hands-on challenges.
PUTTING IT ALL TOGETHER

> You are a data analyst at DoorDash reconciling restaurants, menu items, and order data to produce a completeness report for the operations team. Relationships between the three tables are often incomplete, and missing data must be surfaced rather than silently dropped.

LEFT JOIN on restaurants to orders preserves every restaurant row, filling order columns with NULL when no orders exist yet.
WHERE o.order_id IS NULL after the LEFT JOIN identifies restaurants that have never received a single order.
Placing the date filter in the ON clause instead of WHERE preserves the outer join, keeping all restaurants in the result.
Multiple JOINs chain restaurants to orders then orders to menu_items, with each join result feeding the next stage.
KEY TAKEAWAYS
LEFT JOIN keeps all left table rows; unmatched rows get NULL for right columns
RIGHT JOIN is the mirror of LEFT JOIN; most prefer LEFT JOIN for consistency
FULL OUTER JOIN keeps all rows from both tables; perfect for data reconciliation
WHERE on right-table columns filters out NULLs, converting LEFT to INNER JOIN
Put right-table filters in ON clause to preserve outer join behavior
Multiple joins chain sequentially: first join result feeds into the second
Mix join types carefully; the order and type affect which rows survive
Use LEFT JOIN with a WHERE ... IS NULL check on the right key to find rows with no match

Keeping rows even without a match

Category
SQL
Difficulty
intermediate
Duration
33 minutes
Challenges
0 hands-on challenges

Topics covered: LEFT JOIN, Finding unmatched rows, WHERE vs ON clause filtering, RIGHT JOIN & FULL OUTER, Multiple JOINs

Lesson Sections

  1. LEFT JOIN (concepts: sqlLeftJoin)

    Consider a common scenario: You want to analyze customer activity, including customers who haven't ordered yet. Here's the problem: Syntax Practical Applications Common Use Cases

  2. Finding unmatched rows

    Testing Your Understanding This technique is particularly valuable during data migrations, integration testing, and routine audits where you need to confirm that every record in one table has a corresponding record in another.

  3. WHERE vs ON clause filtering

    Query Evaluation WHERE on Right Columns Consider this query that tries to find customers with high-value orders: Let's trace through what happens step by step: Filter Placement Effects Filter in ON Clause Now the evaluation changes dramatically: Comparing the Results Choosing Right Placement When to Use WHERE vs ON Both placements have valid use cases - it depends on your intent: Real-World Example Show all customers with their recent orders (last 30 days), including customers who haven't ordere

  4. RIGHT JOIN & FULL OUTER (concepts: sqlFullOuterJoin)

    Syntax Invalid References Consider a scenario where some orders reference customers that don't exist (orphaned data): RIGHT JOIN vs LEFT JOIN These two queries produce identical results: FULL OUTER JOIN Essentials FULL OUTER JOIN Syntax Comparing Two Lists The result shows: Bob exists only in users. Dan exists only in user_sessions. Alice and Carol exist in both. This is data reconciliation in action.

  5. Multiple JOINs (concepts: sqlMultipleJoins)

    Each join type has specific data preservation characteristics. Understanding these helps you select the right tool for your analysis. Choosing the Right Join Each join type preserves different rows. Choose based on which data you can't afford to lose: Relational Context For an in-depth exploration of table relationships and cardinality notation, see the Relationships lesson in Data Modeling. The type of relationship between tables often guides your join choice: Why Multiple Joins Consider an e-c