Adaptive Query Execution

Concepts: paSparkExecutionModel

What They Want to Hear

'AQE re-optimizes the query plan at runtime using actual statistics collected from completed shuffle stages. Three key optimizations: it coalesces small post-shuffle partitions into larger ones, it switches join strategies (typically sort-merge join to broadcast hash join) when runtime statistics show one side is smaller than the optimizer predicted, and it handles skewed joins by splitting oversized partitions into sub-partitions. The traditional optimizer relies on table-level statistics that can be stale or missing; AQE relies on the sizes of the shuffle partitions it has just produced, so its numbers reflect the data actually flowing through the query.'

This answer shows you understand that the Catalyst optimizer makes decisions with imperfect information, and that AQE corrects those decisions at runtime.
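To make the coalescing and skew-splitting ideas concrete, here is a minimal toy sketch in plain Python. It is not Spark's implementation: the function names `coalesce_partitions` and `split_skewed`, the 64 MB target, and the skew rule (larger than `factor` times the median and larger than the target) are simplifying assumptions chosen for illustration. The keys in `AQE_CONF` are real Spark SQL settings, shown only to indicate which switches govern these behaviours.

```python
import statistics

MB = 1024 * 1024

# Real Spark SQL settings that turn on the behaviours modelled below.
AQE_CONF = {
    "spark.sql.adaptive.enabled": "true",
    "spark.sql.adaptive.coalescePartitions.enabled": "true",
    "spark.sql.adaptive.skewJoin.enabled": "true",
}


def coalesce_partitions(sizes, target):
    """Hypothetical helper: greedily merge adjacent partitions.

    A new group starts whenever adding the next partition would push the
    current group's total size above `target`. Returns lists of indices.
    """
    groups, current, current_size = [], [], 0
    for idx, size in enumerate(sizes):
        if current and current_size + size > target:
            groups.append(current)
            current, current_size = [], 0
        current.append(idx)
        current_size += size
    if current:
        groups.append(current)
    return groups


def split_skewed(sizes, target, factor=5):
    """Hypothetical helper: number of sub-partitions per input partition.

    A partition is treated as skewed when it exceeds both `factor` times
    the median size and `target`; it is then split into ceil(size / target)
    pieces. All other partitions are left whole (1 piece).
    """
    median = statistics.median(sizes)
    plan = []
    for size in sizes:
        if size > factor * median and size > target:
            plan.append(-(-size // target))  # ceiling division on ints
        else:
            plan.append(1)
    return plan


# Measured post-shuffle sizes stand in for AQE's runtime statistics.
small = [10 * MB, 5 * MB, 60 * MB, 2 * MB, 3 * MB, 1 * MB]
skewed = [10 * MB, 12 * MB, 8 * MB, 640 * MB]

coalesced = coalesce_partitions(small, target=64 * MB)
split_plan = split_skewed(skewed, target=64 * MB)
```

The point for an interview answer is that both decisions take measured partition sizes as input, which is exactly the information a planning-time optimizer does not have.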