Data Quality and Contracts: Advanced

Contracts make quality enforceable; observability makes it diagnosable; tuning makes it trusted

Category: Pipeline Architecture
Difficulty: advanced
Duration: 38 minutes
Challenges: 0 hands-on challenges

Topics covered: Data Contracts and CI Enforcement, Five Pillars of Observability, Quality SLAs vs Ops SLAs, Tuning Thresholds vs History, Contracts on a Legacy Pipeline

Lesson Sections

  1. Data Contracts and CI Enforcement (concepts: paDataContracts, paContractCi)

    Lesson 1's advanced tier introduced the pipeline-as-product framing: a pipeline has a contract that names producer, consumer, schema, freshness SLA, quality SLA, backfill policy, and deprecation policy. This section turns that framing into a working mechanism. A data contract is the executable form of the commitment. The producer commits to a shape and a set of guarantees; the consumer relies on them; the contract is checked in CI on every change so violations cannot ship. Without enforcement, c
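
    As a rough illustration of "checked in CI on every change", the sketch below compares a proposed schema against the schema a contract promises and returns a non-zero exit code when a promised column is dropped or retyped. The `Contract` class, the function names, and the example columns are hypothetical, not taken from the lesson; a real contract would also carry the SLA and policy fields named above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contract:
    # Hypothetical minimal contract: only the fields this sketch needs.
    producer: str
    consumer: str
    schema: dict  # column name -> type name promised to the consumer


def contract_violations(contract, proposed_schema):
    """List every promised column that the proposed schema drops or retypes."""
    violations = []
    for column, expected in contract.schema.items():
        if column not in proposed_schema:
            violations.append(f"missing column: {column}")
        elif proposed_schema[column] != expected:
            actual = proposed_schema[column]
            violations.append(f"type changed: {column} {expected} -> {actual}")
    return violations


def ci_exit_code(contract, proposed_schema):
    """Return 1 (fail the build) if the change breaks the contract, else 0."""
    problems = contract_violations(contract, proposed_schema)
    for problem in problems:
        print(problem)
    return 1 if problems else 0
```

    Adding a column passes in this sketch because the consumer's guarantees are untouched; only removals and type changes count as violations. That is one possible compatibility rule, assumed here for illustration.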

  2. Five Pillars of Observability (concepts: paFivePillars, paDataObservability, paLineage)

    Barr Moses and the Monte Carlo Data team named the five pillars of data observability: freshness, distribution, volume, schema, and lineage. The naming has caught on widely enough that conversations about quality use it as shorthand. The pillars are useful because they are not a checklist; they are a diagnostic framework. When something is wrong with the data, the pillar that detected the symptom narrows the search for the cause. When designing a quality program, the pillars name the gaps that h

  3. Quality SLAs vs Ops SLAs (concepts: paQualitySla, paOperationalSla)

    An SLA states a commitment. The pipeline-as-product framing from Lesson 1 introduced two SLAs as elements of the contract: freshness SLA and quality SLA. They are commonly conflated. They are different commitments to different things, with different consequences when they fail. A pipeline that meets its operational SLA can fail its quality SLA while every operational check is green. A pipeline that meets its quality SLA can miss its operational SLA without affecting correctness. The producer who treats both as one number end
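
    One way to keep the two commitments from collapsing into a single number is to evaluate them as separate checks over the same run. The sketch below assumes a hypothetical `RunReport` record and a hypothetical valid-row ratio as the quality measure; neither comes from the lesson.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RunReport:
    # Hypothetical per-run facts; field names are illustrative.
    due_at: datetime      # when the data was promised
    landed_at: datetime   # when the data actually arrived
    rows_total: int
    rows_valid: int       # rows that passed the quality checks


def operational_sla_met(report, grace=timedelta(0)):
    """Operational commitment: did the data land on time?"""
    return report.landed_at <= report.due_at + grace


def quality_sla_met(report, min_valid_ratio):
    """Quality commitment: is enough of the data correct?"""
    if report.rows_total == 0:
        return False  # an empty load cannot satisfy a ratio-based guarantee
    return report.rows_valid / report.rows_total >= min_valid_ratio
```

    A run that lands early with 10% invalid rows passes the first check and fails the second; a clean run that lands an hour late does the opposite. Reporting the two results separately keeps each failure attached to its own consequence.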

  4. Tuning Thresholds vs History (concepts: paAlertFatigue, paThresholdTuning)

    A quality system that fires too often gets ignored. The mechanism is simple. On-call engineers receive twenty pages a week. Three of them are real. The remaining seventeen train the engineer to acknowledge alerts without reading them carefully. The next real page lands in the same Slack channel as a false one and is missed. The pipeline that the team thought was protected is, in operational terms, unprotected, because the protection mechanism has been desensitized by its own noise. The fix is no
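
    Going by the section title, one plausible reading of tuning against history is to replay a candidate threshold over past metric values that have been labeled as real incidents or not, and count what would have fired. The sketch below does that; the function names, the labeled-history format, and the selection rule (fewest false alerts while missing no real incident) are assumptions for illustration, not the lesson's method.

```python
def replay(history, threshold):
    """Count alerts a threshold would have produced over labeled history.

    history: list of (metric_value, was_real_incident) pairs.
    An alert fires when metric_value is strictly above the threshold.
    """
    fired = [(value, real) for value, real in history if value > threshold]
    true_alerts = sum(1 for _, real in fired if real)
    missed = sum(1 for value, real in history if real and value <= threshold)
    return {
        "fired": len(fired),
        "true": true_alerts,
        "false": len(fired) - true_alerts,
        "missed": missed,
    }


def pick_threshold(history, candidates):
    """Pick the candidate with the fewest false alerts that misses nothing.

    Returns None when every candidate would have missed a real incident.
    """
    best = None
    for threshold in sorted(candidates):
        result = replay(history, threshold)
        if result["missed"] > 0:
            continue
        if best is None or result["false"] < best[1]["false"]:
            best = (threshold, result)
    return None if best is None else best[0]
```

    The replay makes the trade-off visible before the threshold reaches production: raising it removes false pages until the point where a real incident would have been missed.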

  5. Contracts on a Legacy Pipeline (concepts: paContractRollout, paLegacyMigration)

    Greenfield contracts are easy. Contracts on a legacy pipeline that has run for four years and has dozens of unknown consumers are hard. The mistake most teams make is treating the rollout as a one-shot migration: write the contract, declare the producer compliant, declare consumers responsible for catching up. The mistake produces breakage and erodes the credibility of the contract program. The disciplined rollout treats existing consumer behavior as the starting contract, evolves toward the des
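
    Treating existing consumer behavior as the starting contract could begin with an inventory of which columns consumers actually read. The sketch below assumes such reads can be recovered from something like query logs (an assumption; the lesson does not say how) and splits the current schema into columns with observed readers and columns with none. All names are hypothetical.

```python
def starting_contract(current_schema, observed_reads):
    """Derive a first-draft contract from what consumers are seen to read.

    current_schema: column name -> type name, as the pipeline emits today.
    observed_reads: consumer name -> iterable of column names it reads.
    Returns (guaranteed, unobserved, used_by).
    """
    used_by = {}
    for consumer, columns in observed_reads.items():
        for column in columns:
            used_by.setdefault(column, set()).add(consumer)

    # Columns with at least one observed reader are frozen as-is.
    guaranteed = {
        column: type_name
        for column, type_name in current_schema.items()
        if column in used_by
    }
    # No observed reader is not proof of no reader; treat these as
    # candidates for a deprecation notice, not for immediate removal.
    unobserved = sorted(c for c in current_schema if c not in used_by)
    return guaranteed, unobserved, used_by
```

    The `used_by` map also names who to notify before any guaranteed column changes, which matters when the consumers were unknown at the start.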