SQL Self Join: Syntax, Use Cases, and Interview Questions

Self joins show up in 6% of verified DE interview SQL rounds, and the shape is predictable. Nine times out of ten the table is called employees, the two aliases are e and m, and the question is walking an org chart from IC to reporting manager. It's the archetypal adjacency-list-on-a-single-table problem, and it belongs in the same architectural bucket as recursive CTEs: tools for relating rows within the same dataset.

Self Join Syntax

A self join uses the same table twice with different aliases. The aliases are required because without them the database cannot tell which 'copy' of the table you mean in the ON clause and SELECT list.

6%
Rounds use self joins
9/10
Employee/manager shape
1
Level per join
2+
Use CTE instead

In this query, the employees table appears twice: once as e (the employee) and once as m (the manager). The ON clause says: find the row in m whose employee_id matches the current employee's manager_id.

Architectural note: INNER JOIN drops the root node of the hierarchy, which is almost never what a reporting query wants. LEFT JOIN keeps every employee and projects NULL for the top of the chain, which matches the shape a downstream dashboard or a dimension table expects. Pick the join type based on whether the root is data or noise.

Prepare for the interview
01 / Open invite
02min.

Know Joins the way the interviewer who asks it knows it.

a Joins query, the same shape a screen would give you.
The diff against expected. Where ties broke. What you missed.
sandbox
1SELECT user_id,
2 COUNT(*) AS sessions
3FROM events
4WHERE ts >= NOW() - INTERVAL '7 day'
5
Execute your solution0.4s avg.
Capital OneInterview question
Solve a Joins problem

Basic Self Join

-- Basic self join: employees and their managers
SELECT
  e.employee_id,
  e.name AS employee_name,
  e.manager_id,
  m.name AS manager_name
FROM employees e
INNER JOIN employees m ON e.manager_id = m.employee_id;

Use Case 1: Hierarchies

The classic self join use case. An employee table has an employee_id and a manager_id. The manager_id points to another employee_id in the same table. To get an employee alongside their manager's details, you self join.

Self Join with LEFT JOIN

-- Keep all employees, show NULL for top-level managers
SELECT
  e.employee_id,
  e.name AS employee_name,
  m.name AS manager_name  -- NULL for CEO / top-level
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.employee_id;

Interview note: Always clarify whether the question requires all employees or only those with managers. This determines whether you use INNER JOIN or LEFT JOIN. Asking this question shows the interviewer you think about data completeness.

Each additional level adds another self join. For deep hierarchies (5+ levels), this becomes impractical. At that point, a recursive CTE is the better approach.

Prepare for the interview
03 / From the bank03 of many
03hand-picked.

Top 2 Ad Campaigns by Spend

Medium18 min

Two campaigns. Most of the budget.

Multi-Level Hierarchy

-- Multi-level hierarchy: employee -> manager -> director
SELECT
  e.name AS employee,
  m.name AS manager,
  d.name AS director
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.employee_id
LEFT JOIN employees d ON m.manager_id = d.employee_id;

Use Case 2: Consecutive Event Comparison

Self joins can compare each row to the next or previous row. This is useful for detecting changes, calculating differences, and finding gaps in sequences. Window functions (LAG/LEAD) are often cleaner for this, but self joins work in databases that do not support window functions and are still tested in interviews.

Recursive CTE for Arbitrary Depth

-- Recursive CTE for arbitrary-depth hierarchy
WITH RECURSIVE org_chart AS (
  -- Base case: top-level (no manager)
  SELECT employee_id, name, manager_id, 0 AS depth
  FROM employees
  WHERE manager_id IS NULL

  UNION ALL

  -- Recursive case: join children to parents
  SELECT e.employee_id, e.name, e.manager_id, oc.depth + 1
  FROM employees e
  INNER JOIN org_chart oc ON e.manager_id = oc.employee_id
)
SELECT * FROM org_chart ORDER BY depth, name;

Interview tip: Start with a self join for 1-2 levels. If the interviewer asks 'what if the hierarchy is N levels deep,' switch to a recursive CTE. Showing both approaches demonstrates range.

Use Case 3: Finding Pairs

Self joins find rows that share a common attribute. This pattern answers questions like 'which employees share the same manager' or 'which products are in the same category.'

Consecutive Row Comparison

-- Compare each day's revenue to the previous day
SELECT
  curr.date,
  curr.revenue AS today_revenue,
  prev.revenue AS yesterday_revenue,
  curr.revenue - prev.revenue AS daily_change,
  CASE
    WHEN prev.revenue = 0 THEN NULL
    ELSE ROUND(100.0 * (curr.revenue - prev.revenue) / prev.revenue, 1)
  END AS pct_change
FROM daily_revenue curr
INNER JOIN daily_revenue prev
  ON curr.date = prev.date + INTERVAL '1 day'
ORDER BY curr.date;

-- Find consecutive logins (user logged in two days in a row)
SELECT DISTINCT a.user_id, a.login_date
FROM user_logins a
INNER JOIN user_logins b
  ON a.user_id = b.user_id
  AND a.login_date = b.login_date + INTERVAL '1 day';

Self Join vs LAG/LEAD

-- Window function approach (cleaner for ordered comparisons)
SELECT
  date,
  revenue AS today_revenue,
  LAG(revenue) OVER (ORDER BY date) AS yesterday_revenue,
  revenue - LAG(revenue) OVER (ORDER BY date) AS daily_change
FROM daily_revenue
ORDER BY date;

The window function is shorter and avoids the join entirely. But the self join is more flexible: it can compare non-adjacent rows, handle irregular intervals, and join on multiple conditions that do not fit the strict row-ordering that LAG/LEAD require.

Pair Finding

-- Employees who share the same manager
SELECT
  e1.name AS employee_1,
  e2.name AS employee_2,
  e1.manager_id
FROM employees e1
INNER JOIN employees e2
  ON e1.manager_id = e2.manager_id
  AND e1.employee_id < e2.employee_id  -- unique pairs only
ORDER BY e1.manager_id, e1.name;

-- Products in the same category with similar prices (within 10%)
SELECT
  p1.name AS product_1,
  p2.name AS product_2,
  p1.category,
  p1.price AS price_1,
  p2.price AS price_2
FROM products p1
INNER JOIN products p2
  ON p1.category = p2.category
  AND p1.product_id < p2.product_id
  AND ABS(p1.price - p2.price) <= p1.price * 0.10
ORDER BY p1.category, p1.name;

-- Customers who purchased the same product
SELECT
  o1.customer_id AS customer_1,
  o2.customer_id AS customer_2,
  o1.product_id,
  p.name AS product_name
FROM orders o1
INNER JOIN orders o2
  ON o1.product_id = o2.product_id
  AND o1.customer_id < o2.customer_id
INNER JOIN products p ON o1.product_id = p.product_id
ORDER BY o1.product_id;

3 Self Join Interview Questions

Self join questions are popular in on-site interviews because they test whether you can model relationships within a single table.

Q1

Given an employees table with (id, name, manager_id), write a query to show each employee's name alongside their manager's name.

LEFT JOIN employees e to employees m on e.manager_id = m.id. Select e.name as employee, m.name as manager. LEFT JOIN keeps employees with NULL manager_id (the CEO).

Q2

Find all pairs of users who logged in on the same day. Each pair should appear once.

Self join user_logins a to user_logins b on a.login_date = b.login_date AND a.user_id < b.user_id. This gives every unique pair of users who logged in on the same day, without duplicates or self-pairs.

Q3

Given a table of daily stock prices (date, ticker, price), find days where the price increased compared to the previous day for each ticker.

Self join: curr.ticker = prev.ticker AND curr.date = prev.date + INTERVAL '1 day'. Filter WHERE curr.price > prev.price. Mention that LAG(price) OVER (PARTITION BY ticker ORDER BY date) is an alternative approach.

Common Self Join Mistakes

These errors come up repeatedly in interviews and code reviews.

Common Mistake

Forgetting to Exclude Self-Pairs

Without a condition to prevent it, a self join pairs every row with itself. An employee becomes their own colleague. Add WHERE e1.employee_id != e2.employee_id at minimum, or WHERE e1.employee_id < e2.employee_id to get unique pairs.

Subtle Bug

Using != Instead of < for Unique Pairs

Using != eliminates self-pairs but keeps both orderings: Alice-Bob and Bob-Alice. Using < keeps only one ordering: only (Alice, Bob) where Alice.id < Bob.id. This halves the result set to the correct unique pairs.

Data Loss

Missing the LEFT JOIN for Root Nodes

In hierarchy queries, using INNER JOIN drops root-level rows (CEO, top-level categories) because they have no parent to match. Always consider whether you need LEFT JOIN to preserve these rows.

SQL Self Join FAQ

What is a self join in SQL?+
A self join is when you join a table to itself. The table appears twice in the FROM clause with different aliases, and the ON clause defines how rows in one copy relate to rows in the other copy. Self joins are used when the relationship you need exists within a single table, such as an employee table where each row has a manager_id that references another employee_id in the same table.
When would you use a self join instead of a window function?+
Self joins and window functions often solve the same problems, but each has strengths. Use a self join when you need to compare rows that are not necessarily adjacent (like finding all pairs that share an attribute) or when you need columns from two different rows in the same output row. Use a window function when you need to compare each row to its immediate predecessor or successor (LAG/LEAD) or compute a running aggregate. Window functions are usually more readable for ordered comparisons, while self joins are better for unordered pair-finding.
How do you avoid duplicate pairs in a self join?+
Use an inequality condition like WHERE a.id < b.id. If you use a.id != b.id, each pair appears twice (A-B and B-A). If you use a.id < b.id, each pair appears once (only the version where the first ID is smaller). This is the standard technique for generating unique combinations from a self join.
02 / Why practice

The adjacency primitive, wired into your query plans

  1. 01

    Active recall beats re-reading by 50%

    Cognitive-science meta-reviews (Dunlosky et al., 2013) rank practice testing as a top-tier study technique, while re-reading and highlighting rank near the bottom

  2. 02

    76% of hiring managers reject on the coding task, not the resume

    From HackerRank's 2024 Developer Skills Report. Candidates who look strong on paper still fail the live screen if they haven't done timed, executable practice

  3. 03

    Five problem shapes cover 80% of data engineer loops

    Dedup, sessionization, top-N-per-group, slowly-changing dimensions, partition tricks. Writing the shapes by hand turns the unfamiliar into pattern recognition

Related Guides