Let AQE Handle It
A medium-difficulty Spark interview practice problem on DataDriven. Write and execute real Spark code with instant grading.
- Domain: Spark
- Difficulty: medium
- Seniority: senior
Problem
A Spark 3.4 job joins a 400 GB search_logs table against a 60 GB ad_impressions table on query_id. The job takes 90 minutes. The Spark UI shows moderate skew: the top partition has 8x the median row count. A colleague suggests salting, but the codebase is complex and salting would require changes in three downstream jobs. Instead, enable and configure Adaptive Query Execution (AQE) so that Spark handles the skew at runtime, coalesces small partitions, and optimizes the join strategy automatically.
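One possible starting point is sketched below. It is not the graded reference solution. The configuration keys are standard Spark SQL AQE settings, but the helper names `AQE_SETTINGS`, `apply_aqe_settings`, and `is_skewed` are hypothetical, and the values are illustrative rather than tuned for this workload. Note that AQE judges skew by partition size in bytes, while the problem reports row counts, so the sketch assumes the two are roughly proportional.

```python
# A minimal sketch, assuming an existing SparkSession is passed in as `spark`.
# Helper names are hypothetical; values are illustrative starting points.

AQE_SETTINGS = {
    # Master switch for Adaptive Query Execution.
    "spark.sql.adaptive.enabled": "true",
    # Split skewed shuffle partitions in sort-merge joins at runtime.
    "spark.sql.adaptive.skewJoin.enabled": "true",
    # A partition counts as skewed if it is larger than factor x median size...
    "spark.sql.adaptive.skewJoin.skewedPartitionFactor": "5",
    # ...and also larger than this absolute threshold.
    "spark.sql.adaptive.skewJoin.skewedPartitionThresholdInBytes": "256m",
    # Merge small shuffle partitions after each stage.
    "spark.sql.adaptive.coalescePartitions.enabled": "true",
    # Target size AQE aims for when coalescing or splitting partitions.
    "spark.sql.adaptive.advisoryPartitionSizeInBytes": "128m",
    # Read shuffle data locally when a join is converted to a broadcast join.
    "spark.sql.adaptive.localShuffleReader.enabled": "true",
}


def apply_aqe_settings(spark, settings=AQE_SETTINGS):
    """Apply each AQE setting through the session's runtime config."""
    for key, value in settings.items():
        spark.conf.set(key, value)
    return spark


def is_skewed(partition_bytes, median_bytes, factor=5.0,
              threshold_bytes=256 * 1024 * 1024):
    """Mirror the documented skew rule: both conditions must hold."""
    return (partition_bytes > factor * median_bytes
            and partition_bytes > threshold_bytes)
```

With these helpers, calling `apply_aqe_settings(spark)` before running the existing join on `query_id` leaves the join code and the three downstream jobs unchanged. Under the byte-proportionality assumption, a partition at 8x the median exceeds a factor of 5, so it would be split as long as it is also above the byte threshold. Whether the join strategy changes depends on runtime sizes: a 60 GB side is far too large to broadcast unless upstream filters shrink it substantially.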
Practice This Problem
Solve this Spark problem with real code execution. DataDriven runs your solution and grades it automatically.