Pipeline DAG Design

Structuring Pipeline Dependencies

A pipeline DAG (Directed Acyclic Graph) defines the execution order of your data transformations. Table A depends on Table B. Table B depends on Table C. The DAG ensures C runs before B, and B runs before A.

Getting the DAG wrong means either: tables run before their dependencies are ready (wrong data), or the entire pipeline serializes unnecessarily (slow).

The DAG should mirror the medallion layers. Bronze tasks (ingestion) have no dependencies on each other. Silver tasks depend on their bronze source. Gold tasks depend on their silver inputs. No gold task should depend directly on bronze. No task should run backward through the layers.

DAG Design Principles

Common DAG Anti-Patterns

The practical test: draw your DAG on a whiteboard. If it looks like a ta