Storage Layers and Table Formats: Advanced

Open table formats turn object storage into a transactional database without giving up the lake

Category: Pipeline Architecture
Difficulty: Advanced
Duration: 38 minutes
Challenges: 0 hands-on challenges

Topics covered: The Lakehouse: ACID on Object, Snapshot Isolation and Time Travel, Schema Evolution Without Rewrites, The Small Files Problem, Choosing Storage Across Workloads

Lesson Sections

  1. The Lakehouse: ACID on Object (concepts: paLakehouse, paTableFormats, paIceberg, paDeltaLake)

    The lakehouse is a marketing term that names a real architectural shift: a metadata layer on top of files in object storage that provides the consistency guarantees a database has and a folder of files lacks. Iceberg, Delta Lake, and Apache Hudi are three implementations of the same idea. The data files are still Parquet (or ORC). The folders look mostly the same. The difference is a small set of metadata files that turns the directory of Parquet files into a transactional table.
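    A minimal sketch of that idea, with no real library's API: the table's state is a metadata record listing data files, and a commit is an atomic swap of the "current metadata" pointer. All names here are illustrative.

```python
# Toy table format: immutable metadata versions plus one mutable pointer.
class ToyTable:
    def __init__(self):
        self.metadata_versions = []   # append-only list of metadata records
        self.current = None           # the atomic "current version" pointer

    def commit(self, added_files):
        # Build a new metadata record from the previous one, then swap the
        # pointer. Readers never see a half-written file list.
        prev = ([] if self.current is None
                else self.metadata_versions[self.current]["files"])
        meta = {"version": len(self.metadata_versions),
                "files": prev + added_files}
        self.metadata_versions.append(meta)
        self.current = meta["version"]

    def scan(self):
        # A reader resolves the pointer once and reads a fixed file list.
        return self.metadata_versions[self.current]["files"]

t = ToyTable()
t.commit(["part-000.parquet"])
t.commit(["part-001.parquet"])
print(t.scan())  # ['part-000.parquet', 'part-001.parquet']
```

    The Parquet files themselves never change; only the small metadata records and the pointer do, which is why the folder of files looks almost the same before and after adopting a table format.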

  2. Snapshot Isolation and Time Travel (concepts: paSnapshotIsolation, paTimeTravel, paOptimisticConcurrency)

    Snapshot isolation is the consistency guarantee that turns a table format into a real table. A reader sees a consistent point-in-time view of the table, even when writers are landing new data concurrently. Time travel is the operational use of the same machinery: read the table as it existed at a previous snapshot. The two features come from the same underlying mechanism, which is that the table's state is defined by an immutable chain of snapshots and the metadata pointer that names which snapshot is current.
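    The shared mechanism can be sketched in a few lines, assuming a toy append-only snapshot chain (not any real engine's API): a reader pins a snapshot id and is unaffected by later commits, and time travel is just reading at an older id.

```python
# Toy snapshot chain: each commit appends an immutable snapshot.
snapshots = []
current_snapshot_id = -1

def commit(files_added):
    global current_snapshot_id
    parent = snapshots[current_snapshot_id]["files"] if snapshots else []
    snapshots.append({"id": len(snapshots), "files": parent + files_added})
    current_snapshot_id = len(snapshots) - 1

def read(as_of=None):
    # as_of=None reads the latest snapshot; an id reads a historical one.
    sid = current_snapshot_id if as_of is None else as_of
    return snapshots[sid]["files"]

commit(["a.parquet"])
reader_pin = current_snapshot_id     # reader begins its query here
commit(["b.parquet"])                # a writer lands data concurrently
print(read(as_of=reader_pin))        # ['a.parquet'] -- reader's view is stable
print(read())                        # ['a.parquet', 'b.parquet'] -- latest
```

    Because old snapshots are immutable, the same `as_of` read serves both purposes: isolation for in-flight queries and time travel for historical ones.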

  3. Schema Evolution Without Rewrites (concepts: paSchemaEvolution, paColumnMapping, paPartitionEvolution)

    A real production table changes shape over time. Producers add fields, old fields get renamed, and columns become obsolete and need to be dropped. In a plain lake, every shape change requires rewriting partitions or splitting into a new table. In an open table format, additive and renaming changes happen at the metadata level and the data files stay where they are. The cost is bytes of metadata, not bytes of data. Renaming is hard in plain Parquet because each file stores its column names in its own footer, so a rename would mean rewriting every file; table formats avoid the rewrite by mapping logical names to stable field IDs in the table metadata.
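    A sketch of the field-ID mechanism (the approach Iceberg takes; Delta's column mapping is analogous), with made-up data: rows are keyed by field id, so a rename touches only the schema record.

```python
# Schema maps stable field ids to logical names; data is keyed by id.
schema = {1: "user_id", 2: "amount"}
data_file = [{1: "u-42", 2: 9.99}]   # rows as written, keyed by field id

def rename(schema, old, new):
    # Metadata-only change: no data file is touched.
    return {fid: (new if name == old else name)
            for fid, name in schema.items()}

def project(rows, schema):
    # Readers resolve field ids to the *current* logical names at scan time.
    return [{schema[fid]: v for fid, v in row.items()} for row in rows]

schema = rename(schema, "amount", "amount_usd")
print(project(data_file, schema))  # [{'user_id': 'u-42', 'amount_usd': 9.99}]
```

    The old files were written under the old name, yet the scan returns the new name, because the name lives in metadata and the id is what is stable.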

  4. The Small Files Problem (concepts: paSmallFiles, paCompaction, paZOrdering)

    A streaming job with a 30-second trigger writes a file every 30 seconds per partition. That is 2,880 files per partition per day. A daily batch ingestion that writes one large file per partition produces one file. The two designs run the same SQL the same way, but the streaming version is dramatically slower because every read has to open thousands of tiny files instead of a few large ones. This is the small files problem, and it is the single most common operational headache in lake and lakehouse environments. The fix is compaction: periodically rewrite many small files into a few large ones.
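    The arithmetic and the fix can be sketched directly; the 2 MiB file size and 128 MiB compaction target below are illustrative assumptions, not recommendations.

```python
# File-count arithmetic for the 30-second streaming writer.
files_per_day = (24 * 60 * 60) // 30      # one file per 30 s
print(files_per_day)                      # 2880

TARGET = 128 * 1024 * 1024                # assumed 128 MiB target file size
small = [2 * 1024 * 1024] * files_per_day # assume 2 MiB per streamed file

def compact(sizes, target):
    # Toy bin-packing compaction: greedily merge small files until the
    # running bucket would exceed the target, then start a new output file.
    out, bucket = [], 0
    for s in sizes:
        if bucket + s > target:
            out.append(bucket)
            bucket = 0
        bucket += s
    if bucket:
        out.append(bucket)
    return out

print(len(compact(small, TARGET)))        # 2880 small files -> 45 files
```

    Real compaction jobs rewrite data files and then commit the result as a new snapshot, so readers switch from the many small files to the few large ones atomically.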

  5. Choosing Storage Across Workloads (concepts: paStorageSelection, paLakehouse, paMultiLayerStorage)

    A real production system rarely has one workload. The example here is a financial services platform with three concurrent demands on the same logical data: a regulatory archive that must retain seven years of transactions, a customer-facing app that needs single-row lookups under 50 milliseconds, and an analytical BI workload that runs daily aggregations across the entire history. No single storage layer is correct for all three. The right answer is a multi-layer architecture in which each workload reads from the layer that matches its access pattern.
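    One way to make the routing decision explicit is a small table of workloads and layers; the layer names below are illustrative assumptions, not prescriptions.

```python
# Map each workload in the example to the layer suited to its access pattern.
workloads = {
    # 7-year retention, rarely read: cheap object-storage archive tier
    "regulatory_archive": "object_store_archive",
    # single-row reads under 50 ms: a key-value / serving store
    "customer_lookup":    "key_value_store",
    # full-history daily aggregations: columnar lakehouse tables
    "daily_bi":           "lakehouse_table",
}

for workload, layer in sorted(workloads.items()):
    print(f"{workload} -> {layer}")
```

    The point is not the specific products but the shape of the answer: one logical dataset, several physical layers, each chosen by access pattern rather than by habit.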