DataDriven

A daily Spark join runs for hours

A medium Pipeline Design mock interview question on DataDriven. Practice with AI-powered feedback, real code execution, and a hire/no-hire decision.

Domain: Pipeline Design
Difficulty: medium

Interview Prompt

A daily Spark join runs for hours. The Spark UI shows one task running for 90% of the runtime while the rest finish fast: a single hot key is skewing the shuffle. Adding executors didn't help. Design the job so the skewed join completes on time.
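
One common way to approach a single hot key is key salting: rows on the large side get a random salt appended to the hot key, the matching rows on the small side are replicated once per salt value, and the join runs on (key, salt) so the hot key's rows spread over several tasks. The sketch below simulates this in plain Python rather than Spark so it can run anywhere. The names `NUM_PARTITIONS`, `NUM_SALTS`, `HOT_KEYS`, the toy `partition_of` hash, and the sample data are all assumptions made for illustration, not part of the prompt.

```python
import random
from collections import Counter

NUM_PARTITIONS = 8  # assumed shuffle partition count
NUM_SALTS = 8       # assumed fan-out for hot keys
HOT_KEYS = {"hot"}  # assumed to be known from profiling the job


def partition_of(join_key):
    # Toy, deterministic stand-in for a hash partitioner (not Spark's real one).
    return sum(repr(join_key).encode()) % NUM_PARTITIONS


def salt_big_side(rows, rng):
    # Hot-key rows get a random salt; all other rows keep salt 0.
    return [((k, rng.randrange(NUM_SALTS) if k in HOT_KEYS else 0), v)
            for k, v in rows]


def explode_small_side(rows):
    # Hot-key rows are replicated once per salt so every salted key finds a match.
    out = []
    for k, attr in rows:
        salts = range(NUM_SALTS) if k in HOT_KEYS else (0,)
        out.extend(((k, s), attr) for s in salts)
    return out


def hash_join(big, small):
    # Inner join on the full join key; returns (key, value, attr) tuples.
    lookup = {}
    for k, attr in small:
        lookup.setdefault(k, []).append(attr)
    return [(k, v, attr) for k, v in big for attr in lookup.get(k, [])]


def max_partition_load(big):
    # Row count in the busiest partition, i.e. the straggler task's input.
    return max(Counter(partition_of(k) for k, _ in big).values())


rng = random.Random(0)
big = [("hot", i) for i in range(10_000)] + [(f"k{i}", i) for i in range(50)]
small = [("hot", "H")] + [(f"k{i}", f"A{i}") for i in range(50)]

plain = hash_join(big, small)
salted_big = salt_big_side(big, rng)
salted = hash_join(salted_big, explode_small_side(small))

# Drop the salt to recover the original key before comparing results.
unsalted = [(k, v, attr) for (k, _salt), v, attr in salted]
plain_max = max_partition_load(big)
salted_max = max_partition_load(salted_big)
```

Here `unsalted` contains the same rows as `plain`, while `salted_max` is well below `plain_max` because the hot key no longer lands in one partition. In Spark the same idea would typically be expressed by adding a salt column to both sides and joining on the key plus the salt. Two alternatives worth raising in the interview are broadcasting the small side when it fits in executor memory, and, on Spark 3.x, checking whether adaptive skew-join handling (`spark.sql.adaptive.skewJoin.enabled`) already splits the oversized partition.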

How This Interview Works

  1. Read the vague prompt (just like a real interview)
  2. Ask clarifying questions to the AI interviewer
  3. Write your pipeline design solution with real code execution
  4. Get instant feedback and a hire/no-hire decision
