# Size the Executors

> Too big: GC kills you. Too small: broadcast kills you.

Canonical URL: <https://datadriven.io/problems/spark_executor_memory_config>

Domain: PySpark · Difficulty: medium · Seniority: L5

## Problem

A Spark job that builds daily product-recommendation features keeps failing, with different errors depending on the cluster configuration. With 2 large executors (64 GB and 16 cores each), the job dies from GC pauses. With 32 small executors (4 GB and 1 core each), which a colleague tried, broadcast joins fail because the 2 GB broadcast variable does not fit in executor memory. Find a balanced executor configuration for a 50-node cluster in which each node has 128 GB of RAM and 32 cores.
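One way to reason about a balanced configuration is to work the arithmetic per node. The sketch below is not an official solution. It rests on several assumptions that are not in the problem statement: 1 core and 1 GB per node are reserved for the OS and cluster daemons, each executor gets 5 cores (a common rule of thumb), and off-heap overhead is budgeted at 10% of the heap, matching Spark's default overhead factor. The `plan_executors` helper and the `ExecutorPlan` type are hypothetical names introduced for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ExecutorPlan:
    executors_per_node: int
    total_executors: int
    cores_per_executor: int
    heap_gb: int
    overhead_gb: int


def plan_executors(
    nodes: int,
    node_ram_gb: int,
    node_cores: int,
    cores_per_executor: int = 5,   # assumption: rule-of-thumb executor width
    reserved_cores: int = 1,       # assumption: left for OS and daemons
    reserved_ram_gb: int = 1,      # assumption: left for OS and daemons
    overhead_factor: float = 0.10, # Spark's default memory overhead factor
) -> ExecutorPlan:
    """Split each node into equally sized executors (hypothetical helper)."""
    usable_cores = node_cores - reserved_cores
    usable_ram_gb = node_ram_gb - reserved_ram_gb

    executors_per_node = usable_cores // cores_per_executor
    # Memory available to each executor, heap plus overhead combined.
    container_gb = usable_ram_gb // executors_per_node
    # Carve the heap out of the container so heap + overhead fits.
    heap_gb = math.floor(container_gb / (1 + overhead_factor))
    overhead_gb = container_gb - heap_gb

    return ExecutorPlan(
        executors_per_node=executors_per_node,
        total_executors=executors_per_node * nodes,
        cores_per_executor=cores_per_executor,
        heap_gb=heap_gb,
        overhead_gb=overhead_gb,
    )


# The cluster from the problem: 50 nodes, 128 GB RAM and 32 cores per node.
plan = plan_executors(nodes=50, node_ram_gb=128, node_cores=32)

# Standard Spark property names; the values come from the sketch above.
spark_conf = {
    "spark.executor.instances": str(plan.total_executors),
    "spark.executor.cores": str(plan.cores_per_executor),
    "spark.executor.memory": f"{plan.heap_gb}g",
    "spark.executor.memoryOverhead": f"{plan.overhead_gb}g",
}
```

Under these assumptions the plan comes out to 6 executors per node (300 in total), each with 5 cores, a 19 GB heap, and 2 GB of overhead. That heap is far smaller than the 64 GB heaps that suffered GC pauses, and it leaves room for the 2 GB broadcast variable. On a real cluster, one slot would also need to be held back for the driver, and the numbers should be validated against GC logs.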

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.