SQL Practice

SQL JOIN Practice for Data Engineers

JOINs are the second most-tested SQL category after aggregation. INNER JOIN is the most common join type by far, with LEFT JOIN close behind. Practice every join type against a real database with instant output comparison.

JOIN Types and Practice Problems

Sample problems from each join type, ordered by interview frequency. INNER and LEFT joins cover most real questions. Start there, then move to self-joins and inequality joins when those feel automatic.

INNER JOIN

Returns only rows that match in both tables. The default join type and the one most candidates reach for first. Interviewers test whether you understand that unmatched rows are silently dropped.

Practice problems

  1. 1Join orders to customers and calculate total spend per customer
  2. 2Find products that have been ordered at least once (filter out never-ordered products)
  3. 3Match employees to their department names and filter by hire date
  4. 4Join three tables: users, subscriptions, and payments to find active paying users
  5. 5Find all pairs of students enrolled in the same course

LEFT JOIN

Returns all rows from the left table, with NULLs where the right table has no match. The most interview-relevant join type. Interviewers use it to test your understanding of NULLs, missing data, and the difference between filtering in ON vs WHERE.

Practice problems

  1. 1Find customers who have never placed an order (LEFT JOIN + WHERE IS NULL)
  2. 2List all products with their most recent review, including products with zero reviews
  3. 3Show every employee with their manager name, including the CEO who has no manager
  4. 4Calculate revenue per category, including categories with $0 revenue
  5. 5Find users who signed up but never completed onboarding

Self-Join

A table joined to itself. Used for hierarchical data, finding pairs, and comparing rows within the same table. Self-joins feel unnatural at first, but they solve a specific class of problems that no other join can.

Practice problems

  1. 1Find each employee and their direct manager from a single employees table
  2. 2Identify customers who placed orders on consecutive days
  3. 3Find all pairs of products in the same category with a price difference under $10
  4. 4Match flights where the arrival city equals the departure city of another flight (connections)
  5. 5Find employees who earn more than their manager

Inequality Join

A join where the condition uses <, >, <=, >=, or BETWEEN instead of =. Less common but shows up in date range problems, event attribution, and interval matching. Many candidates have never written one.

Practice problems

  1. 1Match each sale to the price tier that was active on that date (join on date ranges)
  2. 2Find all events that occurred within 30 minutes of a user signup
  3. 3Attribute conversions to the most recent marketing touch before the conversion timestamp
  4. 4Join salary history to find employees who had overlapping salary records
  5. 5Match each order to the exchange rate that was valid on the order date

CROSS JOIN

Produces the Cartesian product of two tables. Every row in table A paired with every row in table B. Useful for generating combinations, date spines, and filling gaps. Small tables only, or your query will run forever.

Practice problems

  1. 1Generate all possible product-store combinations to find which stores are missing which products
  2. 2Cross join a date spine with a list of regions to create a scaffold for reporting
  3. 3Create all possible team matchups for a round-robin tournament
  4. 4Pair every user with every survey question to identify unanswered questions
  5. 5Generate a multiplication table from two sequences of numbers

JOINs That Trip People Up in Interviews

These are not obscure edge cases. They come up regularly, and getting them wrong usually costs the round.

!

LEFT JOIN with a WHERE filter on the right table

Adding WHERE right_table.column = &apos;value&apos; after a LEFT JOIN converts it into an INNER JOIN. The NULLs from unmatched rows fail the WHERE check and get dropped. Move the filter into the ON clause to keep it a true LEFT JOIN. This is one of the most common interview traps.

!

JOIN that silently duplicates rows

If your join key is not unique on one side, each row on the unique side gets duplicated for every match. Joining orders to order_items gives you one row per item, not one row per order. Interviewers ask you to count orders after a join to see if you catch the fan-out.

!

Forgetting that NULL != NULL in joins

Joining on a nullable column drops rows where either side is NULL, because NULL = NULL evaluates to NULL (falsy). If you need to match NULLs, use COALESCE or IS NOT DISTINCT FROM.

!

Three-way joins with ambiguous column names

When you join three or more tables, always alias your tables and qualify every column. Interviewers watch for ambiguous references. If two tables both have an &apos;id&apos; column and you write SELECT id, the query fails.

Worked Example: Total Spend Per Customer With Order Count

Given an orders table and a customers table, find each customer's name, total number of orders, and total spend. Include customers who have never ordered.

SELECT
  c.name,
  COUNT(o.order_id)              AS order_count,
  COALESCE(SUM(o.amount), 0)     AS total_spend
FROM customers c
LEFT JOIN orders o ON o.customer_id = c.customer_id
GROUP BY c.customer_id, c.name
ORDER BY total_spend DESC;

LEFT JOIN keeps every customer even if they have zero orders. Without COALESCE, customers with no orders would show NULL for total_spend instead of 0. Grouping by customer_id (not just name) avoids merging two customers who happen to share a name.

Expected output

 name        | order_count | total_spend
-------------+-------------+------------
 Alice Chen   |           5 |     1240.00
 Bob Park     |           3 |      870.50
 Carol Jones  |           0 |        0.00

SQL JOINs FAQ

How many types of SQL JOINs are there?+
There are five standard join types: INNER JOIN, LEFT JOIN (LEFT OUTER JOIN), RIGHT JOIN (RIGHT OUTER JOIN), FULL OUTER JOIN, and CROSS JOIN. In practice, INNER and LEFT cover 90% of real work. Self-joins and inequality joins are techniques that use these same join types with different conditions.
What is the difference between LEFT JOIN and INNER JOIN?+
INNER JOIN returns only rows that have a match in both tables. LEFT JOIN returns all rows from the left table, filling in NULLs where the right table has no match. If you need to find 'things that do NOT have a match,' use LEFT JOIN with WHERE right_table.key IS NULL.
Why do SQL JOINs appear in every data engineering interview?+
Because real data lives in normalized tables. Every analytical query requires combining data from multiple sources. Based on DataDriven's analysis of verified interview data, JOINs are the second most-tested SQL category after aggregation. Interviewers test JOINs to verify you can reason about cardinality, handle NULLs from outer joins, and avoid row duplication.
How do I avoid row duplication when joining tables?+
Understand the cardinality of your join keys. If the key is one-to-many, each row on the 'one' side gets duplicated for every match on the 'many' side. Solutions: aggregate before joining, use DISTINCT, or join on a unique key. Always check your row count after a join to catch unexpected fan-out.

Write Multi-Table Joins Under Pressure

Knowing the syntax is step one. Writing a four-table join under interview pressure is something else entirely. Multiple challenge formats, all targeting the join patterns that appear most in real interviews.