DataDriven

An easy Spark mock interview question on DataDriven. Practice with AI-powered feedback, real code execution, and a hire/no-hire decision.

Domain: Spark
Difficulty: easy
Seniority: L4

Interview Prompt

The order enrichment job joins a 500M-row orders table (80 GB) against a 5,000-row stores dimension (30 MB) on store_id. The join takes 12 minutes and shuffles 80 GB. The physical plan shows SortMergeJoin with an Exchange (shuffle) on both sides, even though the stores table is only 30 MB. Why did Spark choose SortMergeJoin, and how do you fix it?
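One plausible reading of the prompt, sketched below under stated assumptions: Spark only broadcasts a join side automatically when its estimated size is at or below spark.sql.autoBroadcastJoinThreshold, which defaults to 10 MB, so a 30 MB stores table would fall back to SortMergeJoin. The helper names (would_auto_broadcast, enrich_orders) and the DataFrame names (orders, stores) are hypothetical, the size check is a simplified model of the planner rather than its actual code, and other causes (for example missing or inflated table statistics, or a threshold set to -1) are not ruled out by the prompt.

```python
MB = 1024 * 1024

# Spark's documented default for spark.sql.autoBroadcastJoinThreshold (10 MB).
DEFAULT_AUTO_BROADCAST_THRESHOLD = 10 * MB


def would_auto_broadcast(estimated_size_bytes,
                         threshold_bytes=DEFAULT_AUTO_BROADCAST_THRESHOLD):
    """Simplified model of the planner's size check (an assumption, not Spark code).

    A negative threshold (-1) disables automatic broadcasting entirely.
    """
    if threshold_bytes < 0:
        return False
    return estimated_size_bytes <= threshold_bytes


def enrich_orders(orders, stores):
    """Hypothetical fix: hint that the small dimension should be broadcast.

    `orders` and `stores` are assumed to be PySpark DataFrames sharing a
    store_id column. With the hint, the large orders side no longer needs
    an Exchange for this join.
    """
    from pyspark.sql.functions import broadcast  # imported lazily; requires PySpark

    return orders.join(broadcast(stores), on="store_id")


if __name__ == "__main__":
    # 30 MB exceeds the 10 MB default, so no automatic broadcast.
    print(would_auto_broadcast(30 * MB))
    # Raising the threshold above 30 MB would make the table eligible.
    print(would_auto_broadcast(30 * MB, threshold_bytes=64 * MB))
```

An alternative to the hint is raising the threshold for the session, for example spark.conf.set("spark.sql.autoBroadcastJoinThreshold", str(64 * MB)). Either way, calling explain() on the resulting DataFrame should show BroadcastHashJoin in place of SortMergeJoin if the change took effect.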

Summary

30 MB table. 80 GB shuffle. Read the plan.

How This Interview Works

  1. Read the vague prompt (just like a real interview)
  2. Ask clarifying questions to the AI interviewer
  3. Write your Spark solution with real code execution
  4. Get instant feedback and a hire/no-hire decision
