Self-Referential Tables

Concepts covered: dmSelfReferential

Modeling Hierarchies Within a Single Table Some relationships are between rows in the same table. An employee reports to a manager who is also an employee. A category has a parent category. A comment is a reply to another comment. These are self-referential relationships, and they model hierarchies. The most common approach is the adjacency list: a parent_id column that references the same table's primary key. It is simple to implement but hard to query beyond one level. Getting all descendants of a node requires a recursive query (WITH RECURSIVE in SQL). Querying one level deep is easy: WHERE manager_id = 2 gives Bob's direct reports. Querying all of Alice's reports at every level requires a recursive CTE. On deep or wide hierarchies, recursive queries can be slow. For most data engineeri

About This Interactive Section

This section is part of the Relationships lesson on DataDriven, a free data engineering interview prep platform. Each section includes explanations, worked examples, and hands-on code challenges that execute in real time. SQL queries run against a live PostgreSQL database. Python runs in a sandboxed Docker container. Data modeling problems validate against interactive schema canvases. All content is framed around what data engineering interviewers actually test at companies like Meta, Google, Amazon, Netflix, Stripe, and Databricks.

How DataDriven Lessons Work

DataDriven combines four interview rounds (SQL, Python, Data Modeling, Pipeline Architecture) with adaptive difficulty and spaced repetition. Easy problems get harder as you improve. Weak concepts resurface until you master them. Your readiness score tracks progress across every topic interviewers test. Every lesson section ends with problems you solve by writing and running real code, not by picking multiple-choice answers.