Medallion Architecture


Bronze, Silver, Gold: Progressive Refinement

Medallion architecture organizes a data lake into three layers, each with a clear purpose. Raw data enters at bronze. Cleaned and validated data lives in silver. Business-ready aggregates and curated datasets live in gold.

The architectural principle is separation of concerns: raw ingestion should not depend on business logic, and business transformations should not depend on source system quirks. This is the most common architecture at companies using Databricks, Delta Lake, or any modern lakehouse platform. If you work with a data lake, you will encounter this pattern.

Bronze is append-only and never cleaned. This is intentional. If your silver transformation has a bug, you can fix it and reprocess from bronze. If bronze was already cleaned, y
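The three layers can be illustrated with a minimal sketch in plain Python. It is not tied to Databricks or Delta Lake: in-memory lists stand in for tables, and the record fields (`order_id`, `country`, `amount`) and function names (`ingest`, `to_silver`, `to_gold`) are hypothetical, chosen only for illustration. The point it tries to show is that bronze keeps every raw record untouched, while silver and gold are derived views that can be rebuilt from bronze at any time.

```python
from collections import defaultdict

# Bronze: raw records exactly as received (hypothetical order events).
# Append-only: nothing here is cleaned, fixed, or deleted.
bronze = []


def ingest(raw_record):
    """Append a copy of the raw record to bronze without changing its contents."""
    bronze.append(dict(raw_record))


def to_silver(bronze_rows):
    """Clean and validate: parse amounts, normalize country, dedupe by order_id."""
    seen = set()
    silver_rows = []
    for row in bronze_rows:
        try:
            amount = float(row["amount"])
        except (KeyError, TypeError, ValueError):
            continue  # invalid amount: excluded from silver, still kept in bronze
        order_id = row.get("order_id")
        if order_id is None or order_id in seen:
            continue  # missing key or duplicate delivery
        seen.add(order_id)
        silver_rows.append({
            "order_id": order_id,
            "country": str(row.get("country", "")).strip().upper(),
            "amount": amount,
        })
    return silver_rows


def to_gold(silver_rows):
    """Business-ready aggregate: total order amount per country."""
    totals = defaultdict(float)
    for row in silver_rows:
        totals[row["country"]] += row["amount"]
    return dict(totals)


# Source quirks (padding, string amounts, duplicates) land in bronze as-is.
for record in [
    {"order_id": 1, "country": " us ", "amount": "10.50"},
    {"order_id": 2, "country": "DE", "amount": "n/a"},
    {"order_id": 1, "country": " us ", "amount": "10.50"},
    {"order_id": 3, "country": "de", "amount": 4},
]:
    ingest(record)

silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)
```

Because `to_silver` only reads from `bronze`, fixing a bug in the cleaning rules means editing that function and calling it again on the same four raw records. The invalid and duplicate rows are filtered out of silver, but they remain available in bronze for any later reprocessing.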

About This Interactive Section

This section is part of the Design Patterns lesson on DataDriven, a free data engineering interview prep platform. Each section includes explanations, worked examples, and hands-on code challenges that execute in real time. SQL queries run against a live PostgreSQL database. Python runs in a sandboxed Docker container. Data modeling problems validate against interactive schema canvases. All content is framed around what data engineering interviewers actually test at companies like Meta, Google, Amazon, Netflix, Stripe, and Databricks.
