Let AQE Handle It

A medium Spark mock interview question on DataDriven. Practice with AI-powered feedback, real code execution, and a hire/no-hire decision.

Domain: Spark
Difficulty: medium
Seniority: L5

Interview Prompt

A Spark 3.4 job joins a 400 GB search_logs table against a 60 GB ad_impressions table on query_id. The job takes 90 minutes. The Spark UI shows moderate skew: the top partition has 8x the median row count. A colleague suggests salting, but the codebase is complex and salting would require changes in three downstream jobs. Enable and configure Adaptive Query Execution (AQE) to let Spark handle the skew at runtime, coalesce small partitions, and optimize the join strategy automatically.
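One possible starting point for the configuration the prompt asks for is sketched below. The configuration keys are standard Spark SQL AQE settings, but every value is an assumption chosen for illustration and would need tuning against the real job. The helper name apply_aqe_conf and the dictionary AQE_CONF are hypothetical, not part of any Spark API.

```python
# A minimal sketch, not a tuned configuration: all values are assumptions.
AQE_CONF = {
    # Master switch for Adaptive Query Execution.
    "spark.sql.adaptive.enabled": "true",
    # Split skewed shuffle partitions at runtime instead of salting keys.
    "spark.sql.adaptive.skewJoin.enabled": "true",
    # A partition must exceed factor * median size (assumed: 5, below the 8x seen).
    "spark.sql.adaptive.skewJoin.skewedPartitionFactor": "5",
    # ...and also exceed this absolute size to count as skewed (assumed value).
    "spark.sql.adaptive.skewJoin.skewedPartitionThresholdInBytes": "256m",
    # Merge small shuffle partitions after each stage.
    "spark.sql.adaptive.coalescePartitions.enabled": "true",
    # Target partition size for coalescing and skew splitting (assumed value).
    "spark.sql.adaptive.advisoryPartitionSizeInBytes": "128m",
    # Runtime size below which AQE may switch a join side to broadcast (assumed value).
    "spark.sql.adaptive.autoBroadcastJoinThreshold": "64m",
}


def apply_aqe_conf(spark, conf=AQE_CONF):
    """Apply each setting to a SparkSession-like object and return it."""
    for key, value in conf.items():
        spark.conf.set(key, value)
    return spark


# Hypothetical usage, assuming pyspark is installed and both tables are registered:
# from pyspark.sql import SparkSession
# spark = apply_aqe_conf(SparkSession.builder.getOrCreate())
# joined = spark.table("search_logs").join(spark.table("ad_impressions"), "query_id")
```

Keeping the settings in one dictionary makes it easy to log exactly what was applied and to compare runs with different thresholds, without touching the three downstream jobs.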

Summary

Five tasks take 35 minutes. The other 195 take 30 seconds.
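Whether AQE treats those slow tasks' partitions as skewed depends on partition size in bytes, not on row count or task time. The sketch below restates the documented rule in plain Python: a partition counts as skewed only if it is larger than skewedPartitionFactor times the median partition size and also larger than skewedPartitionThresholdInBytes. The function name is_skewed and the byte sizes in the examples are hypothetical; the source gives only the 8x row-count ratio.

```python
MB = 1024 * 1024


def is_skewed(partition_bytes, median_bytes, factor=5, threshold_bytes=256 * MB):
    """Mirror of the AQE skew-join rule: both conditions must hold."""
    return (
        partition_bytes > factor * median_bytes
        and partition_bytes > threshold_bytes
    )


# Assumed sizes: an 8x partition over a 200 MB median clears both conditions.
large_case = is_skewed(8 * 200 * MB, 200 * MB)

# Assumed sizes: an 8x partition over a 20 MB median is only 160 MB,
# which is under the 256 MB threshold, so it would not be split.
small_case = is_skewed(8 * 20 * MB, 20 * MB)
```

The second case shows why an 8x ratio alone may not trigger skew handling: if the partitions are small in absolute terms, the byte threshold would have to be lowered before AQE splits them.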

How This Interview Works

Read the vague prompt (just like a real interview)
Ask clarifying questions to the AI interviewer
Write your Spark solution with real code execution
Get instant feedback and a hire/no-hire decision