DataDriven

A medium Spark mock interview question on DataDriven. Practice with AI-powered feedback, real code execution, and a hire/no-hire decision.

Domain
Spark
Difficulty
medium
Seniority
L5

Interview Prompt

A Spark 3.4 job joins a 400 GB search_logs table against a 60 GB ad_impressions table on query_id and takes 90 minutes. The Spark UI shows moderate skew: the top partition has 8x the median row count. A colleague suggests salting, but the codebase is complex and salting would require changes in three downstream jobs. Enable and configure Adaptive Query Execution (AQE) so that Spark handles the skew at runtime, coalesces small partitions, and optimizes the join strategy automatically.
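One possible starting point for the configuration the prompt asks for is sketched below. The property names are standard Spark SQL settings, but the values are illustrative assumptions rather than tuned recommendations, and `AQE_CONF` and `apply_aqe_conf` are hypothetical names introduced here. The helper only needs an object that exposes `conf.set(key, value)`, which is how a `SparkSession` accepts runtime SQL settings.

```python
# Illustrative AQE settings; every value is an assumption to be tuned
# against the real job, not a recommendation.
AQE_CONF = {
    # Master switch for Adaptive Query Execution.
    "spark.sql.adaptive.enabled": "true",
    # Split oversized shuffle partitions in sort-merge joins at runtime.
    "spark.sql.adaptive.skewJoin.enabled": "true",
    # A partition counts as skewed when it is larger than factor * median
    # size AND larger than the byte threshold below.
    "spark.sql.adaptive.skewJoin.skewedPartitionFactor": "5",
    "spark.sql.adaptive.skewJoin.skewedPartitionThresholdInBytes": "256MB",
    # Merge small post-shuffle partitions toward the advisory size.
    "spark.sql.adaptive.coalescePartitions.enabled": "true",
    "spark.sql.adaptive.advisoryPartitionSizeInBytes": "128MB",
    # Read shuffle data locally when AQE converts a join to broadcast.
    "spark.sql.adaptive.localShuffleReader.enabled": "true",
}


def apply_aqe_conf(spark, conf=AQE_CONF):
    """Apply each setting through spark.conf.set and return the session."""
    for key, value in conf.items():
        spark.conf.set(key, value)
    return spark
```

With PySpark this could be called as `apply_aqe_conf(SparkSession.builder.getOrCreate())` before the join runs. Note that a 60 GB table is far too large to broadcast as-is; AQE would only switch to a broadcast join if the runtime size of one side, for example after filtering, falls below the broadcast threshold.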

Summary

Five tasks take 35 minutes. The other 195 take 30 seconds.
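To see why those five tasks dominate, and whether AQE would even flag them, the numbers can be worked through in a small sketch. It assumes the default skew-join rule (a partition is skewed when its size exceeds both the factor times the median size and a byte threshold) and assumes partition bytes scale with the row counts shown in the Spark UI, which the prompt does not state. The function names are hypothetical.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def is_skewed(size_bytes, median_bytes, factor=5.0, threshold_bytes=256 * MIB):
    """AQE-style skew check: both conditions must hold."""
    return size_bytes > factor * median_bytes and size_bytes > threshold_bytes


def slow_task_share(slow_tasks, slow_seconds, fast_tasks, fast_seconds):
    """Fraction of total task time spent in the slow tasks."""
    slow_total = slow_tasks * slow_seconds
    fast_total = fast_tasks * fast_seconds
    return slow_total / (slow_total + fast_total)


# Five tasks at 35 minutes versus 195 tasks at 30 seconds.
share = slow_task_share(5, 35 * 60, 195, 30)
print(f"slow tasks account for {share:.1%} of total task time")

# An 8x-median partition passes the factor test only if it is also large
# enough in absolute terms (hypothetical sizes).
print(is_skewed(8 * GIB, 1 * GIB))
print(is_skewed(80 * MIB, 10 * MIB))
```

The second check illustrates a common pitfall: a partition can be 8x the median and still not be split if it is smaller than the byte threshold, in which case the threshold would need to be lowered.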

How This Interview Works

  1. Read the vague prompt (just like a real interview)
  2. Ask clarifying questions to the AI interviewer
  3. Write your Spark solution with real code execution
  4. Get instant feedback and a hire/no-hire decision
