Cost-Based Optimization
Concepts: paDistributedPrimitives
What They Want to Hear: 'The cost-based optimizer uses table and column statistics to choose between join strategies, predicate ordering, and aggregation methods. When statistics are stale, the optimizer makes wrong decisions: it might sort-merge join a 50MB table instead of broadcasting it. I run ANALYZE TABLE periodically on critical tables, and I rely on AQE as a runtime fallback. For UDFs, the optimizer is blind: it cannot push predicates through a UDF or estimate its output cardinality. This is one reason built-in functions can be 10-100x faster than Python UDFs.' This answer shows that you understand the optimizer's information needs.
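To make the stale-statistics failure concrete, here is a minimal toy sketch in plain Python. It is not any engine's real planner logic: the `TableStats` class, the `choose_join_strategy` function, the table names, and the 100 MB threshold are all hypothetical, chosen only to show how a size-based rule picks a different join once the estimate it sees is out of date.

```python
from dataclasses import dataclass

MB = 1024 * 1024


@dataclass(frozen=True)
class TableStats:
    """Size estimate the optimizer sees; it may lag behind the real table."""
    name: str
    estimated_bytes: int


def choose_join_strategy(left: TableStats, right: TableStats,
                         broadcast_threshold_bytes: int) -> str:
    """Toy rule: broadcast the smaller side if its *estimated* size fits."""
    smaller = min(left, right, key=lambda t: t.estimated_bytes)
    if smaller.estimated_bytes <= broadcast_threshold_bytes:
        return f"broadcast({smaller.name})"
    return "sort_merge"


# Hypothetical scenario: 'dim' is really 50 MB today, but statistics were
# last collected when it was 5 GB (for example, before a cleanup job).
threshold = 100 * MB  # assumed threshold, not a real engine default
facts = TableStats("facts", 200 * 1024 * MB)
dim_stale = TableStats("dim", 5 * 1024 * MB)
dim_fresh = TableStats("dim", 50 * MB)

stale_plan = choose_join_strategy(facts, dim_stale, threshold)
fresh_plan = choose_join_strategy(facts, dim_fresh, threshold)
print(stale_plan)  # sort_merge
print(fresh_plan)  # broadcast(dim)
```

The rule itself never changes between the two calls; only the estimate does. That is the point of refreshing statistics: the optimizer can only be as good as the numbers it is given.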