SQL Reference

SQL INNER JOIN

INNER JOIN appears in 29% of verified DE SQL interview questions. That's 125 of 429 rounds in our corpus, the most frequent join type by a wide margin. LEFT JOIN clocks in at 17%, and multi-table joins at 5%. If you're calibrating prep time by frequency, INNER JOIN lives in the top three SQL constructs you'll face.

The rule is two words: matched rows. Everything else, cardinality, fan-out, silent row loss, comes from that rule. We'll walk the syntax, the join algorithms, and the 4 failure modes that show up in phone screens.

29%

Questions using INNER JOIN

125

Of 429 SQL rounds

17%

LEFT JOIN comparison

5%

Multi-table join questions

Source: DataDriven analysis of 1,042 verified data engineering interview rounds.

INNER JOIN Syntax

Three moving parts: left table, right table, ON clause. The ON clause resolves roughly 95% of the time to a foreign-key to primary-key comparison, but the language accepts any boolean expression. The part interviewers watch is whether you know which side you put in the driving position.

-- Basic INNER JOIN
SELECT
  o.order_id,
  o.order_date,
  c.name AS customer_name,
  c.email
FROM orders o
INNER JOIN customers c ON o.customer_id = c.id;

The INNER keyword is optional. Writing JOIN alone defaults to INNER JOIN. But in queries that mix join types, writing it explicitly prevents confusion.

How the Engine Processes It

Step 1: The engine picks a driving table (usually the smaller one, chosen by the query planner).

Step 2: For each row in the driving table, it probes the other table for matching rows using the ON condition.

Step 3: Matched pairs are combined into output rows. Unmatched rows from either side are discarded.

-- Join with additional filter
SELECT
  o.order_id,
  o.total,
  p.name AS product_name,
  p.category
FROM orders o
INNER JOIN order_items oi ON o.order_id = oi.order_id
INNER JOIN products p ON oi.product_id = p.id
WHERE o.order_date >= '2024-01-01'
  AND p.category = 'electronics';

Interview note: Interviewers watch where you put filters. Conditions that belong in WHERE (like date ranges or category filters) should not go in the ON clause. ON defines the relationship between tables. WHERE filters the result. Mixing them up changes results with LEFT JOIN and signals confusion about join semantics.

INNER JOIN vs Other Joins

Each join type handles unmatched rows differently. This table shows exactly what happens to rows that exist in one table but not the other.

Join TypeLeft-only rowsRight-only rowsMatched rows
INNER JOINDroppedDroppedIncluded
LEFT JOINIncluded (NULLs for right cols)DroppedIncluded
RIGHT JOINDroppedIncluded (NULLs for left cols)Included
FULL OUTER JOINIncluded (NULLs)Included (NULLs)Included
CROSS JOINAll combinedAll combinedCartesian product

The critical distinction: INNER JOIN is lossy. It drops rows that do not match. If your orders table has 1,000 rows but only 950 have valid customer_ids, INNER JOIN with customers returns 950 rows. Those 50 orphaned orders vanish from the result. LEFT JOIN would keep them with NULL customer columns.

-- INNER JOIN: only orders with valid customers
SELECT o.order_id, c.name
FROM orders o
INNER JOIN customers c ON o.customer_id = c.id;
-- Returns 950 rows (50 orphaned orders dropped)

-- LEFT JOIN: all orders, NULLs for missing customers
SELECT o.order_id, c.name
FROM orders o
LEFT JOIN customers c ON o.customer_id = c.id;
-- Returns 1000 rows (50 rows have NULL name)

Interview note: A common follow-up: “Your query returns fewer rows than expected. What happened?” The answer is almost always an INNER JOIN dropping unmatched rows. State that you would check for NULLs in the join column or switch to LEFT JOIN to debug.

Multi-Table INNER JOIN

Production queries rarely join just two tables. A typical analytics query might join orders to customers, then to products through an order_items bridge table, then to regions. Each additional JOIN adds another ON clause and another opportunity for row multiplication or row loss.

-- 4-table join: orders -> order_items -> products -> categories
SELECT
  c.name AS customer,
  o.order_id,
  o.order_date,
  p.name AS product,
  cat.name AS category,
  oi.quantity,
  oi.unit_price,
  oi.quantity * oi.unit_price AS line_total
FROM orders o
INNER JOIN customers c ON o.customer_id = c.id
INNER JOIN order_items oi ON o.order_id = oi.order_id
INNER JOIN products p ON oi.product_id = p.id
INNER JOIN categories cat ON p.category_id = cat.id
WHERE o.order_date >= '2024-01-01'
ORDER BY o.order_date DESC, line_total DESC;

Row Count Behavior

When you join through a bridge table (like order_items), the output has one row per line item, not one row per order. An order with 3 items produces 3 output rows. An order with 10 items produces 10. This is correct behavior for a line-item report, but it breaks aggregate calculations if you are not careful.

-- BUG: order total is counted once per line item
SELECT
  c.name,
  SUM(o.total) AS revenue  -- WRONG: inflated by item count
FROM orders o
INNER JOIN customers c ON o.customer_id = c.id
INNER JOIN order_items oi ON o.order_id = oi.order_id
GROUP BY c.name;

-- FIX: aggregate line items, not order totals
SELECT
  c.name,
  SUM(oi.quantity * oi.unit_price) AS revenue  -- CORRECT
FROM orders o
INNER JOIN customers c ON o.customer_id = c.id
INNER JOIN order_items oi ON o.order_id = oi.order_id
GROUP BY c.name;

Interview note: The “inflated SUM after JOIN” bug is a top-5 interview trap. When you join to a one-to-many table, any aggregate on the “one” side gets counted multiple times. The fix is to either aggregate at the correct grain or use a subquery/CTE to pre-aggregate before joining.

INNER JOIN Performance

Join performance depends on three things: the join algorithm the engine picks, whether the join columns are indexed, and the cardinality of the tables. Here is what matters in practice.

Join Algorithms

Most SQL engines use one of three algorithms. The optimizer picks based on table sizes and available indexes.

Nested Loop: For each row in the smaller table, scan the larger table for matches. Fast when the inner table has an index on the join column. Common for small-to-large joins.

Hash Join: Build a hash table from the smaller table, then probe it with each row from the larger table. Best for large unsorted tables without useful indexes. O(n + m) time.

Merge Join: Sort both tables on the join column, then walk through them in parallel. Efficient when both inputs are already sorted (e.g., from an index scan or a preceding ORDER BY).

Indexing Join Columns

The single biggest performance factor for INNER JOIN is whether the join column on the probed table is indexed. Without an index, the engine does a full table scan for every probe row.

-- Ensure the FK column is indexed
CREATE INDEX idx_orders_customer_id
ON orders (customer_id);

-- Now this join uses an index lookup instead of a full scan
SELECT o.order_id, c.name
FROM customers c
INNER JOIN orders o ON c.id = o.customer_id;

Join Order

Modern query optimizers reorder joins automatically. Writing tables in a specific FROM clause order does not force the engine to join them in that sequence. PostgreSQL, for instance, tests all possible join orders for up to 8 tables (controlled by join_collapse_limit). For queries with many joins, the optimizer may not explore every permutation, so breaking a complex query into CTEs can sometimes help by giving the planner smaller subproblems.

5 INNER JOIN Interview Questions

These questions appear in phone screens and onsites at companies from startups to FAANG. The first two are warmups. The rest test deeper understanding.

Q1: Write a query to get all orders with their customer names. Only include orders that have a valid customer.

What they test:

Basic INNER JOIN syntax. Can you join two tables on a foreign key and understand that INNER JOIN excludes orders with no matching customer?

Approach:

SELECT o.order_id, c.name FROM orders o INNER JOIN customers c ON o.customer_id = c.id. State that INNER JOIN naturally excludes orphaned orders because it requires a match in both tables.

Q2: Your revenue report shows higher numbers than expected after adding a product details join. What went wrong?

What they test:

Row multiplication awareness. When you join through a one-to-many relationship, the “one” side gets duplicated. Summing a column from the “one” side inflates the total.

Approach:

Explain that joining orders to order_items creates one row per item. If you SUM(o.total), the order total is counted once per line item. Fix by either pre-aggregating in a CTE or summing at the line-item grain (SUM(oi.quantity * oi.unit_price)).

Q3: When would you choose LEFT JOIN over INNER JOIN?

What they test:

Judgment about data completeness. This is a discussion question, not a coding question. They want to hear you reason about when dropping unmatched rows is acceptable and when it is not.

Approach:

Use INNER JOIN when both sides must exist (e.g., order with its customer for a billing report). Use LEFT JOIN when the left side is the authoritative set and missing matches are meaningful (e.g., all users including those with zero orders, or all products including unsold ones).

Q4: This query joining 5 tables is slow. How do you debug it?

What they test:

Performance debugging methodology. They do not want you to guess. They want a systematic approach: examine the query plan, check indexes, verify cardinalities.

Approach:

Run EXPLAIN ANALYZE. Check for sequential scans on large tables (missing indexes on join columns). Check for unexpected row counts at each join step (a sign of a many-to-many join producing a cartesian explosion). Add indexes on foreign key columns. Consider breaking into CTEs to give the optimizer smaller subproblems.

Q5: Write a self-join to find employees who share the same manager.

What they test:

Self-join mechanics. Joining a table to itself requires aliasing and a clear join condition. The trick is avoiding pairing an employee with themselves.

Approach:

SELECT e1.name, e2.name, e1.manager_id FROM employees e1 INNER JOIN employees e2 ON e1.manager_id = e2.manager_id WHERE e1.id < e2.id. The e1.id < e2.id condition prevents duplicate pairs and self-pairs.

3 INNER JOIN Mistakes That Fail Interviews

These come up repeatedly in interview feedback. Each one is easy to avoid once you know the pattern.

Silent Data Loss from INNER JOIN

Using INNER JOIN when the question requires all rows from one side. If a customer has no orders, INNER JOIN drops that customer. If the question says “list all customers and their order count,” customers with zero orders must appear with a count of 0. INNER JOIN gives you no way to do that.

-- WRONG: drops customers with zero orders
SELECT c.name, COUNT(o.order_id) AS order_count
FROM customers c
INNER JOIN orders o ON c.id = o.customer_id
GROUP BY c.name;

-- CORRECT: keeps all customers
SELECT c.name, COUNT(o.order_id) AS order_count
FROM customers c
LEFT JOIN orders o ON c.id = o.customer_id
GROUP BY c.name;

Read the question carefully. “All customers” means LEFT JOIN. “Customers who have ordered” means INNER JOIN. The phrasing tells you which join to use.

Accidental Cartesian Product

Forgetting the ON clause or joining on a non-unique column creates a cartesian product. If table A has 1,000 rows and table B has 1,000 rows, a cartesian product returns 1,000,000 rows. This crashes dashboards and burns through compute credits in cloud warehouses.

-- BUG: joining on date creates many-to-many
SELECT a.*, b.*
FROM daily_sales a
INNER JOIN daily_expenses b ON a.date = b.date;
-- If 50 sales rows and 30 expense rows share the same date,
-- that date produces 50 * 30 = 1500 rows

-- FIX: aggregate before joining
WITH sales AS (
  SELECT date, SUM(amount) AS total_sales
  FROM daily_sales GROUP BY date
),
expenses AS (
  SELECT date, SUM(amount) AS total_expenses
  FROM daily_expenses GROUP BY date
)
SELECT s.date, s.total_sales, e.total_expenses
FROM sales s
INNER JOIN expenses e ON s.date = e.date;

Always check the cardinality of your join columns. If neither side is unique on the join key, you get a many-to-many join. The output row count is the product of matching rows, not the sum.

Putting Filters in ON Instead of WHERE

For INNER JOIN, moving a filter from WHERE to ON produces the same result. But this creates a bad habit. With LEFT JOIN, putting a filter on the right table in ON vs WHERE gives different results. In ON, unmatched left rows still appear with NULLs. In WHERE, they are filtered out, effectively converting the LEFT JOIN to an INNER JOIN.

-- INNER JOIN: both produce the same result
-- But the second is clearer about intent
SELECT o.order_id, c.name
FROM orders o
INNER JOIN customers c ON o.customer_id = c.id
  AND c.country = 'US';  -- filter in ON

SELECT o.order_id, c.name
FROM orders o
INNER JOIN customers c ON o.customer_id = c.id
WHERE c.country = 'US';  -- filter in WHERE (preferred)

Keep ON for relationship conditions. Keep WHERE for filter conditions. This habit protects you when you refactor INNER JOIN to LEFT JOIN later.

SQL INNER JOIN FAQ

What does INNER JOIN do in SQL?+
INNER JOIN combines rows from two tables where the join condition is true. If a row in the left table has no matching row in the right table, it is excluded from the result. If a row in the right table has no match in the left table, it is also excluded. Only rows with matches in both tables appear in the output. INNER JOIN is the default join type: writing JOIN without a qualifier is the same as writing INNER JOIN.
What is the difference between INNER JOIN and LEFT JOIN?+
INNER JOIN returns only rows with matches in both tables. LEFT JOIN returns all rows from the left table, filling in NULLs for columns from the right table when there is no match. If every left row has a match, the results are identical. The difference only appears when some left rows have no corresponding right row. In interviews, start with INNER JOIN unless the question specifically requires preserving unmatched rows.
Is INNER JOIN the same as JOIN?+
Yes. In SQL, JOIN and INNER JOIN are identical. The INNER keyword is optional. Most style guides recommend writing INNER JOIN explicitly because it makes the intent clear when the query also uses LEFT JOIN or FULL JOIN. In interviews, writing INNER JOIN shows precision.
Can INNER JOIN produce duplicate rows?+
Yes. If the join condition matches one row in the left table to multiple rows in the right table (a one-to-many relationship), the left row appears once for each match. For example, joining a customers table to an orders table on customer_id produces one output row per order, duplicating customer columns for customers with multiple orders. This is expected behavior, not a bug. If you want one row per customer, aggregate or use DISTINCT.

29%. Train Where the Volume Is.

125 verified INNER JOIN questions in our corpus. Grab one, run it, read your plan output. Repeat until cardinality intuition is reflex.