SQL Interview Deep Dive

SQL Self-Join Practice

Self-joins are a subset of the JOIN category (19.6% of SQL questions). They appear most often in problems involving hierarchical data, consecutive-event detection, and row-pair comparison.

Practice with real SQL execution. See exactly how self-joins work by running them against actual data.

Self-Join Patterns

Four patterns that cover the majority of self-join interview questions. Each pattern has a distinct shape and use case.

Hierarchy Traversal

SELECT e.name AS employee, m.name AS manager
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.id

The classic self-join. A table references itself through a foreign key (like manager_id pointing to id in the same employees table). This pattern handles org charts, category trees, and any parent-child relationship stored in a single table.

Common use cases

  • Find each employee and their manager name
  • List all direct reports for a given manager
  • Build multi-level org charts (chain multiple self-joins)
  • Find root nodes (WHERE manager_id IS NULL)

Interview tip: Always use LEFT JOIN for hierarchies, not INNER JOIN. INNER JOIN drops the CEO (or any root node) because their manager_id is NULL and has no match.

Row Comparisons Within a Table

SELECT a.date, a.revenue, b.revenue AS prev_day_revenue,
       a.revenue - b.revenue AS daily_change
FROM daily_metrics a
JOIN daily_metrics b ON a.date = b.date + INTERVAL '1 day'

Compare each row to another row in the same table based on a relationship (like consecutive dates, or same user different time periods). This was the standard approach before window functions existed.

Common use cases

  • Day-over-day or period-over-period changes
  • Compare each row to a baseline row
  • Find the next event after a specific event type
  • Calculate deltas between consecutive records

Interview tip: LAG/LEAD window functions now handle most of these cases more cleanly. But self-joins are still better when you need to compare non-adjacent rows or when the comparison logic is complex.

Finding Pairs

SELECT a.user_id AS user_a, b.user_id AS user_b,
       COUNT(*) AS shared_groups
FROM group_members a
JOIN group_members b ON a.group_id = b.group_id
  AND a.user_id < b.user_id
GROUP BY a.user_id, b.user_id

Find all pairs of rows that share a common attribute. The key trick: use a.id < b.id (not !=) to avoid both duplicate pairs and self-pairing. This produces each pair exactly once.

Common use cases

  • Users who belong to the same group
  • Products frequently purchased together
  • Students enrolled in the same courses
  • Find mutual connections in a social graph

Interview tip: Forgetting the < condition (using != instead) doubles your result set with reversed pairs. Forgetting it entirely includes self-pairs. Interviewers watch for this.

Gap Detection

SELECT a.login_date AS gap_start,
       MIN(b.login_date) AS gap_end,
       MIN(b.login_date) - a.login_date AS gap_days
FROM logins a
JOIN logins b ON a.user_id = b.user_id
  AND b.login_date > a.login_date
WHERE NOT EXISTS (
  SELECT 1 FROM logins c
  WHERE c.user_id = a.user_id
    AND c.login_date > a.login_date
    AND c.login_date < MIN(b.login_date)
)
GROUP BY a.user_id, a.login_date

Find gaps or missing entries in a sequence by joining a table to itself and looking for missing intermediate values. Common in time-series analysis and data quality checks.

Common use cases

  • Find gaps in daily login streaks
  • Identify missing sequence numbers
  • Detect periods of inactivity
  • Find holes in scheduling data

Interview tip: Gap detection with self-joins is verbose. LAG/LEAD is usually cleaner for consecutive gap finding. But self-joins handle the case where you need all pairs across the gap, not just adjacent rows.

When Self-Joins Beat Window Functions (and When They Don't)

Window functions replaced self-joins for many common patterns. LAG/LEAD handles consecutive row comparisons more cleanly. ROW_NUMBER handles deduplication without a self-join. Running totals with SUM() OVER are simpler than a correlated self-join.

But self-joins still win in specific situations. Finding all pairs of rows that share an attribute (users in the same group, products bought together) requires a self-join. There is no window function equivalent because the result set is fundamentally different: you need N-squared pairs, not N rows with an extra column.

Hierarchy traversal is another self-join stronghold. While recursive CTEs handle arbitrary depth, a simple two-level or three-level self-join is faster to write and easier to read when the depth is fixed. In an interview setting, chaining three LEFT JOINs for a three-level org chart is faster than setting up a recursive CTE.

The practical rule: if you are comparing each row to one specific other row (previous, next, parent), try a window function first. If you are comparing each row to many other rows or finding pairs, use a self-join. In interviews, knowing both approaches and choosing the right one shows strong SQL judgment.

Practice Problems

Five problems that test self-join patterns. Some are best solved with a self-join, some with window functions. Part of the exercise is choosing the right tool.

Problem 1Easy

Employees Earning More Than Their Manager

Given an employees table (id, name, salary, manager_id), find all employees who earn more than their direct manager. Return the employee name, employee salary, manager name, and manager salary.

Show hint

Self-join employees to itself: FROM employees e JOIN employees m ON e.manager_id = m.id WHERE e.salary > m.salary. This is one of the most commonly asked self-join questions.

Problem 2Hard

Consecutive Login Streaks

Given a logins table (user_id, login_date), find each user&apos;s longest streak of consecutive daily logins. Return user_id and max_streak_days.

Show hint

One approach: self-join logins to itself on user_id and login_date = login_date + 1 to find consecutive pairs, then use a gaps-and-islands technique. Alternatively, use ROW_NUMBER to assign groups and count within each group. The window function approach is cleaner here.

Problem 3Medium

Users in the Same Department Who Never Collaborated

Given employees (id, department_id) and collaborations (user_a, user_b, project_id), find all pairs of employees in the same department who have never appeared together in any collaboration. Return pairs only once (user_a_id < user_b_id).

Show hint

Self-join employees on department_id with a.id < b.id to get all same-department pairs. LEFT JOIN to collaborations and filter WHERE collaborations.project_id IS NULL.

Problem 4Medium

Price Changes: Before and After

Given a price_history table (product_id, effective_date, price), for each price change, show the product_id, old_price, new_price, change_date, and percentage change. A price change is when the same product has a new row with a different price.

Show hint

Self-join price_history on product_id where b.effective_date is the MAX date less than a.effective_date. Or use LAG(price) OVER (PARTITION BY product_id ORDER BY effective_date) for a cleaner solution. Compare both approaches.

Problem 5Hard

Find All Manager Chains (Multi-Level)

Given employees (id, name, manager_id), return every employee with their full management chain up to 4 levels: direct manager, skip-level manager, and skip-skip-level manager. NULLs where the chain ends.

Show hint

Chain four LEFT JOINs: e -> m1 ON e.manager_id = m1.id -> m2 ON m1.manager_id = m2.id -> m3 ON m2.manager_id = m3.id. Each join goes one level up. LEFT JOIN ensures you keep employees even when the chain is shorter than 4 levels.

Worked Example: Employees Earning More Than Their Manager

Given an employees table with id, name, salary, and manager_id, find every employee who earns more than their direct manager. Return both names and salaries.

SELECT
  e.name     AS employee_name,
  e.salary   AS employee_salary,
  m.name     AS manager_name,
  m.salary   AS manager_salary
FROM employees e
JOIN employees m ON e.manager_id = m.id
WHERE e.salary > m.salary
ORDER BY e.salary - m.salary DESC;

The self-join aliases the same table twice: e for the employee row and m for the manager row. The JOIN condition links each employee to their manager via manager_id. The WHERE clause then compares salaries across the two aliases. INNER JOIN is correct here because employees with no manager (manager_id IS NULL) should not appear in this result.

Expected output

 employee_name | employee_salary | manager_name | manager_salary
---------------+-----------------+--------------+---------------
 Anil          |       135000.00 | James        |      120000.00
 Mei           |       110000.00 | James        |      120000.00

Self-Join FAQ

When should I use a self-join instead of a window function?+
Use self-joins when you need to compare rows that are not adjacent (window functions like LAG/LEAD only access fixed offsets), when you need to find all pairs of rows sharing an attribute, or when you need to join on complex conditions beyond simple ordering. Use window functions when you are comparing consecutive rows or computing running aggregates. In interviews, if you can solve it with a window function, that is usually the cleaner answer.
How do I avoid duplicate pairs in a self-join?+
Use a.id < b.id in the JOIN condition instead of a.id != b.id. The < operator ensures each pair appears exactly once (A,B but not B,A) and excludes self-pairs (A,A). Using != gives you both (A,B) and (B,A), doubling your results. Using just a.id <> b.id has the same problem. This is a detail interviewers specifically watch for.
Are self-joins slow on large tables?+
Self-joins create a cross product filtered by the join condition, so they can be expensive. The key is having an index on the join column. A self-join on employees.manager_id with an index on id is fast. A self-join looking for all pairs in a million-row table without good filtering is a performance problem. In interviews, mention indexing and row count awareness. In production, test with EXPLAIN ANALYZE.
Can I self-join more than twice?+
Yes. Multi-level hierarchy queries commonly chain 3-4 self-joins (employee to manager to skip-level to VP). Each join adds one level. For arbitrary-depth hierarchies, use a recursive CTE instead. The rule of thumb: if the depth is fixed and small (under 5 levels), chained self-joins are fine. If the depth varies, use WITH RECURSIVE.

Build Self-Join Intuition

Self-joins click once you have written enough of them. Practice hierarchy queries, pair finding, and row comparisons with real SQL execution.