DataDriven

Size the Executors

A medium-difficulty Spark interview practice problem on DataDriven. Write and execute real Spark code with instant grading.

Domain: Spark
Difficulty: medium
Seniority: L5

Problem

A Spark job that builds daily product recommendation features keeps failing, with a different error for each cluster configuration. With 2 large executors (64 GB and 16 cores each), the job dies from long GC pauses. When a colleague tried 32 small executors (4 GB and 1 core each), broadcast joins failed because the 2 GB broadcast variable does not fit in executor memory. Find a balanced executor configuration for a 50-node cluster with 128 GB RAM and 32 cores per node.

Summary

Too big: GC kills you. Too small: broadcast kills you.
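
One way to reason about a middle ground is to work through the per-node arithmetic. The sketch below is not the graded solution; it is a minimal illustration in plain Python that rests on several assumptions not stated in the problem: about 5 cores per executor (a common rule of thumb), 1 core and 8 GB per node reserved for the OS and cluster daemons, and an off-heap overhead of 10% of the heap. The function name `size_executors` and the `ExecutorPlan` type are hypothetical. The configuration keys it emits are standard Spark settings.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ExecutorPlan:
    executors_per_node: int
    total_executors: int
    cores_per_executor: int
    heap_gb: int
    overhead_gb: int

    def to_spark_conf(self) -> dict:
        """Render the plan as Spark configuration key/value pairs."""
        return {
            "spark.executor.instances": str(self.total_executors),
            "spark.executor.cores": str(self.cores_per_executor),
            "spark.executor.memory": f"{self.heap_gb}g",
            "spark.executor.memoryOverhead": f"{self.overhead_gb}g",
        }


def size_executors(
    nodes: int,
    node_ram_gb: int,
    node_cores: int,
    cores_per_executor: int = 5,      # assumption: rule-of-thumb value
    reserved_cores: int = 1,          # assumption: left for OS and daemons
    reserved_ram_gb: int = 8,         # assumption: left for OS and daemons
    overhead_fraction: float = 0.10,  # assumption: off-heap overhead ratio
) -> ExecutorPlan:
    # Cores decide how many executors fit on one node.
    usable_cores = node_cores - reserved_cores
    executors_per_node = usable_cores // cores_per_executor
    if executors_per_node < 1:
        raise ValueError("node is too small for one executor")

    # Split the usable RAM evenly; each share must hold heap plus overhead.
    container_gb = (node_ram_gb - reserved_ram_gb) / executors_per_node
    heap_gb = math.floor(container_gb / (1 + overhead_fraction))
    overhead_gb = math.ceil(heap_gb * overhead_fraction)

    # Shrink the heap if rounding pushed the total past the share.
    while heap_gb > 1 and heap_gb + overhead_gb > container_gb:
        heap_gb -= 1
        overhead_gb = math.ceil(heap_gb * overhead_fraction)

    return ExecutorPlan(
        executors_per_node=executors_per_node,
        # One of these slots may be needed for the driver, depending on
        # how the job is deployed.
        total_executors=nodes * executors_per_node,
        cores_per_executor=cores_per_executor,
        heap_gb=heap_gb,
        overhead_gb=overhead_gb,
    )


plan = size_executors(nodes=50, node_ram_gb=128, node_cores=32)
for key, value in plan.to_spark_conf().items():
    print(f"{key}={value}")
```

Under these assumptions the cluster in the problem gets 6 executors per node, each with 5 cores, an 18 GB heap, and 2 GB of overhead. That heap is far smaller than the 64 GB that triggered GC trouble and leaves ample room for the 2 GB broadcast variable. Whether those exact numbers suit the real workload would still need to be checked against GC logs and shuffle behavior.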
