Data Gets Out of Sync

Why Normalization Exists

Normalization exists to prevent a specific class of production failure: data that contradicts itself. When the same fact is stored in multiple rows or multiple tables, any update must touch every copy. Miss one, and your database has two different answers to the same question. This is not a theoretical concern. It is the number one source of 'the numbers do not match' bugs in real systems.

Here is the canonical example: an employee table that looks reasonable until a department name changes. Alice and Bob say Engineering is on Floor 3. Carol says Floor 5. Which is correct? Nobody knows without checking a separate source. This happened because someone updated Carol's row but not Alice's and Bob's. The data contradicts itself.

The fix: store 'Engineering' and its l
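
The contradiction described above can be reproduced in a few lines. What follows is a minimal sketch, not the lesson's own schema: it uses Python's built-in sqlite3 module with an in-memory database instead of PostgreSQL, and the table and column names (employees_flat, departments, employees, floor) are hypothetical. The normalized layout, which keeps each department's floor in exactly one row, is an assumed reading of the fix.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Denormalized layout (hypothetical names): the department's floor is
# repeated on every employee row.
cur.execute(
    "CREATE TABLE employees_flat ("
    "name TEXT PRIMARY KEY, department TEXT, floor INTEGER)"
)
cur.executemany(
    "INSERT INTO employees_flat VALUES (?, ?, ?)",
    [
        ("Alice", "Engineering", 3),
        ("Bob", "Engineering", 3),
        ("Carol", "Engineering", 3),
    ],
)

# A partial update: only Carol's row is touched, so the copies diverge.
cur.execute("UPDATE employees_flat SET floor = 5 WHERE name = 'Carol'")
flat_floors = [
    row[0]
    for row in cur.execute(
        "SELECT DISTINCT floor FROM employees_flat "
        "WHERE department = 'Engineering' ORDER BY floor"
    )
]
print("flat layout floors:", flat_floors)  # two answers to one question

# Normalized layout (assumed fix): the floor lives in one row of a
# departments table, and employees only reference the department.
cur.execute(
    "CREATE TABLE departments ("
    "department TEXT PRIMARY KEY, floor INTEGER NOT NULL)"
)
cur.execute(
    "CREATE TABLE employees ("
    "name TEXT PRIMARY KEY, "
    "department TEXT NOT NULL REFERENCES departments(department))"
)
cur.execute("INSERT INTO departments VALUES ('Engineering', 3)")
cur.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("Alice", "Engineering"), ("Bob", "Engineering"), ("Carol", "Engineering")],
)

# One update changes the single stored copy; every employee sees it.
cur.execute("UPDATE departments SET floor = 5 WHERE department = 'Engineering'")
normalized_floors = [
    row[0]
    for row in cur.execute(
        "SELECT DISTINCT d.floor FROM employees e "
        "JOIN departments d ON d.department = e.department "
        "WHERE e.department = 'Engineering'"
    )
]
print("normalized layout floors:", normalized_floors)

conn.close()
```

In the flat table the query returns both 3 and 5 for Engineering. In the normalized version there is only one row to update, so the join can return only one floor.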

About This Interactive Section

This section is part of the Normalization lesson on DataDriven, a free data engineering interview prep platform. Each section includes explanations, worked examples, and hands-on code challenges that execute in real time. SQL queries run against a live PostgreSQL database. Python runs in a sandboxed Docker container. Data modeling problems validate against interactive schema canvases. All content is framed around what data engineering interviewers actually test at companies like Meta, Google, Amazon, Netflix, Stripe, and Databricks.
