Orchestration and Dependencies: Advanced
Assets, backfills, SLAs, and concurrency are the levers a senior engineer pulls
- Category: Pipeline Architecture
- Difficulty: advanced
- Duration: 40 minutes
- Challenges: 0 hands-on challenges
Topics covered: Asset vs Task Orchestration; Backfills as First-Class Operations; SLAs at the Orchestrator Level; Concurrency, Pools, and Priority; Postmortem: Cadence Change Bug
Lesson Sections
- Asset vs Task Orchestration (concepts: paAssetBasedOrchestration, paTaskBasedOrchestration)
Two philosophies compete in modern orchestration. Task-based orchestration, exemplified by Airflow, treats the work as the primary object: define tasks, declare dependencies between them, and trust that the data will follow. Asset-based orchestration, exemplified by Dagster, treats the data as the primary object: declare the assets the pipeline produces, declare the dependencies between assets, and let the orchestrator infer the work. The two models produce equivalent pipelines on simple cases.
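The contrast between the two models can be sketched in plain Python, without either Airflow or Dagster. This is a toy illustration under stated assumptions, not how either tool is implemented: the task names, the asset functions, and the `materialize` helper are all hypothetical. In the task-style half the edges are written by hand; in the asset-style half each function is treated as the asset named after it, its parameter names are treated as its upstream assets, and the run order is inferred from those declarations.

```python
import inspect
from graphlib import TopologicalSorter

# Task-based style: the work is the primary object and the edges are
# declared by hand (each task maps to the set of tasks it waits on).
task_edges = {
    "extract_orders": set(),
    "clean_orders": {"extract_orders"},
    "build_revenue": {"clean_orders"},
}
task_order = list(TopologicalSorter(task_edges).static_order())


# Asset-based style: the data is the primary object. Each function
# produces the asset named after it; its parameters name upstream assets.
def raw_orders():
    return [{"amount": 10}, {"amount": None}, {"amount": 5}]


def clean_orders(raw_orders):
    return [row for row in raw_orders if row["amount"] is not None]


def daily_revenue(clean_orders):
    return sum(row["amount"] for row in clean_orders)


# Registration order is deliberately scrambled: the order of work is
# inferred from the declared dependencies, not from this dict.
ASSETS = {fn.__name__: fn for fn in (daily_revenue, raw_orders, clean_orders)}


def materialize(assets):
    """Infer a run order from asset dependencies, then compute each asset."""
    deps = {
        name: set(inspect.signature(fn).parameters)
        for name, fn in assets.items()
    }
    order = list(TopologicalSorter(deps).static_order())
    values = {}
    for name in order:
        upstream = {dep: values[dep] for dep in deps[name]}
        values[name] = assets[name](**upstream)
    return order, values


asset_order, asset_values = materialize(ASSETS)
```

On a linear pipeline like this one, both halves arrive at the same execution order, which matches the point that the two models are equivalent on simple cases.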
- Backfills as First-Class Operations (concepts: paBackfill, paIdempotentBackfillRequirement)
A backfill is the operation of running a pipeline over a historical date range that it did not run for at the time. Backfills happen for three reasons: a bug in the pipeline produced wrong data and the corrected pipeline must reprocess the affected dates; a new column was added and the historical data must be regenerated to fill it; a new pipeline is launched and needs initial history. In production, backfills are run constantly, and a system that does not treat them as a first-class operation p
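A backfill over a date range can be sketched as a loop over daily partitions. The sketch below assumes one common way to make reruns safe, which is to overwrite each date's partition rather than append to it; the in-memory `store`, the `source` rows, and the function names are hypothetical stand-ins for a real table and pipeline.

```python
from datetime import date, timedelta


def date_range(start, end):
    """Yield every day from start to end, inclusive."""
    day = start
    while day <= end:
        yield day
        day += timedelta(days=1)


def run_partition(store, source, day):
    """Recompute one day's total and overwrite that day's partition.

    Overwriting (not appending) is what makes a rerun safe: running the
    same date twice leaves the store in the same state as running it once.
    """
    store[day] = sum(amount for row_day, amount in source if row_day == day)


def backfill(store, source, start, end):
    """Run the pipeline for every date in the historical range."""
    for day in date_range(start, end):
        run_partition(store, source, day)
    return store


# Hypothetical source rows: (event date, amount).
source = [
    (date(2024, 1, 1), 10),
    (date(2024, 1, 1), 5),
    (date(2024, 1, 2), 7),
]

store = {}
backfill(store, source, date(2024, 1, 1), date(2024, 1, 3))
after_first_run = dict(store)

# Rerunning the same range must not double-count anything.
backfill(store, source, date(2024, 1, 1), date(2024, 1, 3))
```

If `run_partition` appended instead of overwriting, the second call would double every total, which is the failure mode an idempotency requirement is meant to rule out.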
- SLAs at the Orchestrator Level (concepts: paOrchestratorSla, paFreshnessPolicy)
An SLA, in orchestration terms, is a commitment that a DAG (or an asset) finishes by a stated time. The marketing dashboard SLA might be 'mart.daily_revenue is fresh for the previous day by 6am Pacific'. The SLA is not a hope. It is a configured guarantee that the orchestrator monitors and alerts on when missed. Declaring SLAs at the orchestrator level rather than in a separate runbook turns a soft expectation into an enforced contract, and it changes the on-call response from 'someone noticed l
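The "previous day fresh by 6am Pacific" commitment can be expressed as a small check that a monitor evaluates on a schedule. This is a minimal sketch, not any orchestrator's API: `sla_deadline` and `check_sla` are hypothetical names, and a fixed UTC-8 offset stands in for Pacific time (a real deployment would use an IANA zone such as `America/Los_Angeles` so that daylight saving is handled).

```python
from datetime import date, datetime, time, timedelta, timezone

# Assumption: fixed UTC-8 approximates Pacific time and ignores daylight saving.
PACIFIC = timezone(timedelta(hours=-8))


def sla_deadline(partition_day, due=time(6, 0), tz=PACIFIC):
    """Data for partition_day is due at `due` on the following day."""
    return datetime.combine(partition_day + timedelta(days=1), due, tzinfo=tz)


def check_sla(partition_day, completed_at, now):
    """Return 'met', 'pending', or 'missed' for one day's partition.

    completed_at is None while the run has not finished yet.
    """
    deadline = sla_deadline(partition_day)
    if completed_at is not None and completed_at <= deadline:
        return "met"
    if now < deadline:
        return "pending"
    # Either nothing has finished by the deadline, or it finished late.
    return "missed"


day = date(2024, 3, 4)
on_time = check_sla(
    day,
    completed_at=datetime(2024, 3, 5, 5, 30, tzinfo=PACIFIC),
    now=datetime(2024, 3, 5, 7, 0, tzinfo=PACIFIC),
)
still_running = check_sla(
    day, completed_at=None, now=datetime(2024, 3, 5, 5, 0, tzinfo=PACIFIC)
)
overdue = check_sla(
    day, completed_at=None, now=datetime(2024, 3, 5, 6, 1, tzinfo=PACIFIC)
)
```

The useful property is that `overdue` flips to "missed" as soon as the deadline passes, whether or not the run ever finishes, so an alert can fire from the check itself.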
- Concurrency, Pools, and Priority (concepts: paConcurrencyControl, paPoolsAndPriority)
Every shared system has finite capacity. A Snowflake warehouse has slot limits. A Spark cluster has executor limits. A Postgres replica has connection limits. An orchestrator that submits work without regard for those limits will, eventually, melt the system it depends on. Concurrency, pools, and priority are the three controls that let the orchestrator submit work in a way the downstream system can absorb. They are simple knobs that prevent the most expensive class of incident in mature deploym
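How a pool and a priority interact can be sketched with a small in-memory scheduler. The `Pool` class and the task names below are hypothetical and only loosely modeled on the idea of orchestrator pools: a pool caps how many tasks run at once against a shared system, and priority decides which queued task gets the next free slot.

```python
import heapq


class Pool:
    """A fixed number of slots guarding one shared downstream system."""

    def __init__(self, name, slots):
        self.name = name
        self.slots = slots
        self.running = set()
        self._queue = []
        self._seq = 0  # tie-breaker: FIFO among tasks of equal priority

    def submit(self, task_id, priority=0):
        # heapq is a min-heap, so negate priority to pop the highest first.
        heapq.heappush(self._queue, (-priority, self._seq, task_id))
        self._seq += 1

    def schedule(self):
        """Start queued tasks, highest priority first, until slots run out."""
        started = []
        while self._queue and len(self.running) < self.slots:
            _, _, task_id = heapq.heappop(self._queue)
            self.running.add(task_id)
            started.append(task_id)
        return started

    def finish(self, task_id):
        """Release the slot held by a completed task."""
        self.running.remove(task_id)


pool = Pool("warehouse", slots=2)
pool.submit("adhoc_export", priority=1)
pool.submit("finance_close", priority=10)
pool.submit("nightly_rollup", priority=5)

first_wave = pool.schedule()    # only two slots, so one task must wait
pool.finish("finance_close")
second_wave = pool.schedule()   # the freed slot goes to the waiting task
```

Three tasks are submitted but only two ever run together, so the downstream system never sees more load than the pool allows; the lowest-priority task waits rather than the most important one.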
- Postmortem: Cadence Change Bug (concepts: paOrchestrationPostmortem, paAssetTriggerSeam, paFreshnessSla)
A real production incident, redacted from a fintech postmortem. A daily DAG named `daily_finance_close` had been running cleanly for fourteen months. In month fifteen, it began missing its 6am Pacific SLA twice a week. Nothing in the DAG had changed. The investigation revealed that an upstream Stripe ingestion DAG had been quietly migrated from a 30-minute cadence to a 5-minute cadence three weeks earlier. The change was a clear improvement upstream. It broke the downstream because of every assu