Compaction Strategies
Concepts: paSmallFiles
What They Want to Hear
'I run compaction as a scheduled maintenance job. For Delta Lake tables, OPTIMIZE rewrites small files into target-sized files without rewriting the entire table. For Iceberg, the rewrite_data_files action does the same. I schedule compaction after the pipeline writes and before downstream reads, so readers always see optimized files. I also set auto-compaction for streaming tables that produce many small files per micro-batch.'
This is the answer that shows you treat compaction as infrastructure, not an afterthought.
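To make the answer concrete, here is a minimal Python sketch of what such a maintenance step might look like when driven through Spark SQL. The helper names (compaction_statements, auto_compaction_statement, run_compaction), the table names, and the catalog name are hypothetical and chosen only for illustration. The sketch assumes a Spark session with the Delta Lake or Iceberg SQL extensions enabled, and the delta.autoOptimize.autoCompact table property is only honored on Databricks and on recent Delta Lake releases, so check your runtime before relying on it.

```python
from typing import List


def compaction_statements(table: str, fmt: str, catalog: str = "spark_catalog") -> List[str]:
    """Build the Spark SQL statements for one compaction pass (illustrative helper)."""
    if fmt == "delta":
        # Delta Lake: OPTIMIZE bin-packs small files into larger ones;
        # only the files selected for compaction are rewritten.
        return [f"OPTIMIZE {table}"]
    if fmt == "iceberg":
        # Iceberg: rewrite_data_files is exposed to Spark SQL as a stored
        # procedure under <catalog>.system. The catalog name is an assumption.
        return [f"CALL {catalog}.system.rewrite_data_files(table => '{table}')"]
    raise ValueError(f"unsupported table format: {fmt}")


def auto_compaction_statement(table: str) -> str:
    """Build the statement that turns on auto-compaction for a Delta table.

    Assumes a runtime that honors delta.autoOptimize.autoCompact.
    """
    return (
        f"ALTER TABLE {table} "
        "SET TBLPROPERTIES ('delta.autoOptimize.autoCompact' = 'true')"
    )


def run_compaction(spark, table: str, fmt: str, catalog: str = "spark_catalog") -> None:
    """Execute the compaction statements on any object with a .sql(str) method."""
    for statement in compaction_statements(table, fmt, catalog):
        spark.sql(statement)
```

In an orchestrator, run_compaction would sit as its own task between the write task and the first downstream read, which is the ordering the answer describes. The auto-compaction statement would be applied once to each streaming table, not on every run.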