# Let AQE Handle It

> Five tasks take 35 minutes. The other 195 take 30 seconds.

Canonical URL: <https://datadriven.io/problems/spark_aqe_skew_auto_optimization>

Domain: PySpark · Difficulty: medium · Seniority: L5

## Problem

A Spark 3.4 job joins a 400 GB `search_logs` table against a 60 GB `ad_impressions` table on `query_id` and takes 90 minutes. The Spark UI shows moderate skew: the top partition has 8x the median row count. A colleague suggests salting, but the codebase is complex and salting would require changes in three downstream jobs. Instead, enable and configure Adaptive Query Execution (AQE) so that Spark handles the skew at runtime, coalesces small partitions, and optimizes the join strategy automatically.
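
One possible starting point is sketched below; it is not a reference solution. The configuration keys are standard Spark SQL AQE settings, but every value (a skew factor of 4, the 128 MB thresholds) is an assumption chosen so that a partition at 8x the median qualifies as skewed, and should be tuned against the shuffle sizes shown in the Spark UI. The names `AQE_SETTINGS`, `apply_aqe_settings`, and `is_skewed` are hypothetical helpers introduced for illustration. Note that AQE judges skew by partition size in bytes, so the 8x row-count figure is only a proxy.

```python
MB = 1024 * 1024

# Illustrative values only (assumptions, not taken from the problem statement).
AQE_SETTINGS = {
    # Master switch; already on by default in Spark 3.2+, set explicitly for clarity.
    "spark.sql.adaptive.enabled": "true",
    # Split oversized partitions of a sort-merge join into smaller tasks.
    "spark.sql.adaptive.skewJoin.enabled": "true",
    # Default factor is 5; 4 is an assumed, slightly more aggressive setting.
    "spark.sql.adaptive.skewJoin.skewedPartitionFactor": "4",
    # Default threshold is 256 MB; 128 MB is an assumed lower bound.
    "spark.sql.adaptive.skewJoin.skewedPartitionThresholdInBytes": "128m",
    # Merge small post-shuffle partitions toward the advisory size.
    "spark.sql.adaptive.coalescePartitions.enabled": "true",
    "spark.sql.adaptive.advisoryPartitionSizeInBytes": "128m",
    # Lets AQE read shuffle data locally if it switches a join to broadcast.
    "spark.sql.adaptive.localShuffleReader.enabled": "true",
}


def apply_aqe_settings(spark, settings=AQE_SETTINGS):
    """Apply each setting to an existing SparkSession through spark.conf.set."""
    for key, value in settings.items():
        spark.conf.set(key, value)
    return spark


def is_skewed(partition_bytes, median_bytes, factor=4.0, threshold_bytes=128 * MB):
    """Mirror AQE's documented skew rule: a partition counts as skewed only if
    it exceeds both factor * median size and the absolute byte threshold."""
    return partition_bytes > factor * median_bytes and partition_bytes > threshold_bytes
```

With a real session this would be called as `apply_aqe_settings(SparkSession.builder.getOrCreate())` before running the join. Because the skew rule has two conditions, a partition at 8x the median is still left alone if it falls under the byte threshold, which is why the threshold matters as much as the factor. At 60 GB, `ad_impressions` is far too large to broadcast, so the runtime join-strategy switch would likely only help if upstream filters shrink one side substantially.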

---

Source: DataDriven (https://datadriven.io).