SQL Self Join: Syntax, Use Cases, and Interview Questions
Self joins show up in 6% of verified DE interview SQL rounds, and the shape is predictable. Nine times out of ten the table is called employees, the two aliases are e and m, and the question is walking an org chart from IC to reporting manager. It's the archetypal adjacency-list-on-a-single-table problem, and it belongs in the same architectural bucket as recursive CTEs: tools for relating rows within the same dataset.
Self Join Syntax
A self join uses the same table twice with different aliases. The aliases are required because without them the database cannot tell which 'copy' of the table you mean in the ON clause and SELECT list.
In this query, the employees table appears twice: once as e (the employee) and once as m (the manager). The ON clause says: find the row in m whose employee_id matches the current employee's manager_id.
Architectural note: INNER JOIN drops the root node of the hierarchy, which is almost never what a reporting query wants. LEFT JOIN keeps every employee and projects NULL for the top of the chain, which matches the shape a downstream dashboard or a dimension table expects. Pick the join type based on whether the root is data or noise.
Know Joins the way the interviewer who asks it knows it.
Basic Self Join
-- Basic self join: employees and their managers
SELECT
e.employee_id,
e.name AS employee_name,
e.manager_id,
m.name AS manager_name
FROM employees e
INNER JOIN employees m ON e.manager_id = m.employee_id;Use Case 1: Hierarchies
The classic self join use case. An employee table has an employee_id and a manager_id. The manager_id points to another employee_id in the same table. To get an employee alongside their manager's details, you self join.
Self Join with LEFT JOIN
-- Keep all employees, show NULL for top-level managers
SELECT
e.employee_id,
e.name AS employee_name,
m.name AS manager_name -- NULL for CEO / top-level
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.employee_id;Interview note: Always clarify whether the question requires all employees or only those with managers. This determines whether you use INNER JOIN or LEFT JOIN. Asking this question shows the interviewer you think about data completeness.
Each additional level adds another self join. For deep hierarchies (5+ levels), this becomes impractical. At that point, a recursive CTE is the better approach.
Top 2 Ad Campaigns by Spend
Two campaigns. Most of the budget.
Pulled from debriefs where SQL was the gate.
Multi-Level Hierarchy
-- Multi-level hierarchy: employee -> manager -> director
SELECT
e.name AS employee,
m.name AS manager,
d.name AS director
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.employee_id
LEFT JOIN employees d ON m.manager_id = d.employee_id;Use Case 2: Consecutive Event Comparison
Self joins can compare each row to the next or previous row. This is useful for detecting changes, calculating differences, and finding gaps in sequences. Window functions (LAG/LEAD) are often cleaner for this, but self joins work in databases that do not support window functions and are still tested in interviews.
Recursive CTE for Arbitrary Depth
-- Recursive CTE for arbitrary-depth hierarchy
WITH RECURSIVE org_chart AS (
-- Base case: top-level (no manager)
SELECT employee_id, name, manager_id, 0 AS depth
FROM employees
WHERE manager_id IS NULL
UNION ALL
-- Recursive case: join children to parents
SELECT e.employee_id, e.name, e.manager_id, oc.depth + 1
FROM employees e
INNER JOIN org_chart oc ON e.manager_id = oc.employee_id
)
SELECT * FROM org_chart ORDER BY depth, name;Interview tip: Start with a self join for 1-2 levels. If the interviewer asks 'what if the hierarchy is N levels deep,' switch to a recursive CTE. Showing both approaches demonstrates range.
Use Case 3: Finding Pairs
Self joins find rows that share a common attribute. This pattern answers questions like 'which employees share the same manager' or 'which products are in the same category.'
Consecutive Row Comparison
-- Compare each day's revenue to the previous day
SELECT
curr.date,
curr.revenue AS today_revenue,
prev.revenue AS yesterday_revenue,
curr.revenue - prev.revenue AS daily_change,
CASE
WHEN prev.revenue = 0 THEN NULL
ELSE ROUND(100.0 * (curr.revenue - prev.revenue) / prev.revenue, 1)
END AS pct_change
FROM daily_revenue curr
INNER JOIN daily_revenue prev
ON curr.date = prev.date + INTERVAL '1 day'
ORDER BY curr.date;
-- Find consecutive logins (user logged in two days in a row)
SELECT DISTINCT a.user_id, a.login_date
FROM user_logins a
INNER JOIN user_logins b
ON a.user_id = b.user_id
AND a.login_date = b.login_date + INTERVAL '1 day';Self Join vs LAG/LEAD
-- Window function approach (cleaner for ordered comparisons)
SELECT
date,
revenue AS today_revenue,
LAG(revenue) OVER (ORDER BY date) AS yesterday_revenue,
revenue - LAG(revenue) OVER (ORDER BY date) AS daily_change
FROM daily_revenue
ORDER BY date;The window function is shorter and avoids the join entirely. But the self join is more flexible: it can compare non-adjacent rows, handle irregular intervals, and join on multiple conditions that do not fit the strict row-ordering that LAG/LEAD require.
Pair Finding
-- Employees who share the same manager
SELECT
e1.name AS employee_1,
e2.name AS employee_2,
e1.manager_id
FROM employees e1
INNER JOIN employees e2
ON e1.manager_id = e2.manager_id
AND e1.employee_id < e2.employee_id -- unique pairs only
ORDER BY e1.manager_id, e1.name;
-- Products in the same category with similar prices (within 10%)
SELECT
p1.name AS product_1,
p2.name AS product_2,
p1.category,
p1.price AS price_1,
p2.price AS price_2
FROM products p1
INNER JOIN products p2
ON p1.category = p2.category
AND p1.product_id < p2.product_id
AND ABS(p1.price - p2.price) <= p1.price * 0.10
ORDER BY p1.category, p1.name;
-- Customers who purchased the same product
SELECT
o1.customer_id AS customer_1,
o2.customer_id AS customer_2,
o1.product_id,
p.name AS product_name
FROM orders o1
INNER JOIN orders o2
ON o1.product_id = o2.product_id
AND o1.customer_id < o2.customer_id
INNER JOIN products p ON o1.product_id = p.product_id
ORDER BY o1.product_id;3 Self Join Interview Questions
Self join questions are popular in on-site interviews because they test whether you can model relationships within a single table.
Given an employees table with (id, name, manager_id), write a query to show each employee's name alongside their manager's name.
LEFT JOIN employees e to employees m on e.manager_id = m.id. Select e.name as employee, m.name as manager. LEFT JOIN keeps employees with NULL manager_id (the CEO).
Find all pairs of users who logged in on the same day. Each pair should appear once.
Self join user_logins a to user_logins b on a.login_date = b.login_date AND a.user_id < b.user_id. This gives every unique pair of users who logged in on the same day, without duplicates or self-pairs.
Given a table of daily stock prices (date, ticker, price), find days where the price increased compared to the previous day for each ticker.
Self join: curr.ticker = prev.ticker AND curr.date = prev.date + INTERVAL '1 day'. Filter WHERE curr.price > prev.price. Mention that LAG(price) OVER (PARTITION BY ticker ORDER BY date) is an alternative approach.
Common Self Join Mistakes
These errors come up repeatedly in interviews and code reviews.
Forgetting to Exclude Self-Pairs
Without a condition to prevent it, a self join pairs every row with itself. An employee becomes their own colleague. Add WHERE e1.employee_id != e2.employee_id at minimum, or WHERE e1.employee_id < e2.employee_id to get unique pairs.
Using != Instead of < for Unique Pairs
Using != eliminates self-pairs but keeps both orderings: Alice-Bob and Bob-Alice. Using < keeps only one ordering: only (Alice, Bob) where Alice.id < Bob.id. This halves the result set to the correct unique pairs.
Missing the LEFT JOIN for Root Nodes
In hierarchy queries, using INNER JOIN drops root-level rows (CEO, top-level categories) because they have no parent to match. Always consider whether you need LEFT JOIN to preserve these rows.
SQL Self Join FAQ
What is a self join in SQL?+
When would you use a self join instead of a window function?+
How do you avoid duplicate pairs in a self join?+
The adjacency primitive, wired into your query plans
- 01
Active recall beats re-reading by 50%
Cognitive-science meta-reviews (Dunlosky et al., 2013) rank practice testing as a top-tier study technique, while re-reading and highlighting rank near the bottom
- 02
76% of hiring managers reject on the coding task, not the resume
From HackerRank's 2024 Developer Skills Report. Candidates who look strong on paper still fail the live screen if they haven't done timed, executable practice
- 03
Five problem shapes cover 80% of data engineer loops
Dedup, sessionization, top-N-per-group, slowly-changing dimensions, partition tricks. Writing the shapes by hand turns the unfamiliar into pattern recognition
Related Guides
Hierarchy traversals, pair-finding, and consecutive-row comparisons with real SQL execution
INNER, LEFT, CROSS, and self-join problems at interview difficulty
Complete guide to every SQL topic tested in data engineering interviews