Cores and Slots

An executor is not a single worker. It has N cores, and each core can run one task at a time. So an executor with 5 cores can process up to 5 partitions simultaneously. Think of a core as a slot: a place a task can be running right now. Your total parallelism is the sum of all slots across all executors.

Slots and partitions decide your wall-clock time

Put the last two sections together. You have a fixed number of partitions (the work) and a fixed number of slots (the workers). If you have 200 partitions and 50 slots, and the tasks take roughly the same time, Spark runs them in 4 waves: 50, then 50, then 50, then 50. If one partition is much bigger than the rest, its task runs long and a slot is stuck on it while the others sit idle. That is the seed of every skew problem you will ever debug, and it falls straight out of the slots-and-par
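The arithmetic above can be sketched in a few lines of plain Python. This is a simplified model, not Spark's actual scheduler: it assumes each task takes exactly one slot, ignores data locality and scheduling overhead, and hands each task to whichever slot frees up first. The cluster shape (10 executors with 5 cores each) and the task durations are hypothetical values chosen to reproduce the 200-partition, 50-slot example.

```python
import heapq
import math


def total_slots(num_executors, cores_per_executor):
    # One slot per core: total parallelism is the sum over all executors.
    return num_executors * cores_per_executor


def num_waves(num_partitions, slots):
    # With evenly sized tasks, partitions are processed in ceil(P / S) waves.
    return math.ceil(num_partitions / slots)


def wall_clock(task_durations, slots):
    # Simplified simulation: each task goes to the slot that frees up first.
    free_at = [0.0] * slots          # time at which each slot becomes free
    heapq.heapify(free_at)
    finish = 0.0
    for duration in task_durations:
        start = heapq.heappop(free_at)
        end = start + duration
        heapq.heappush(free_at, end)
        finish = max(finish, end)
    return finish


# Hypothetical cluster: 10 executors x 5 cores = 50 slots.
slots = total_slots(num_executors=10, cores_per_executor=5)
waves = num_waves(200, slots)

# 200 evenly sized tasks of 1 time unit each.
even = wall_clock([1.0] * 200, slots)

# Same job, but one partition is 20x larger than the rest.
skewed = wall_clock([1.0] * 199 + [20.0], slots)

print(slots, waves, even, skewed)  # 50 4 4.0 23.0
```

In the skewed run, 199 tasks are done by time 4 and 49 slots sit idle while one slot works on the oversized partition until time 23. Adding more slots would not help here, because the long task cannot be split across them.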

About This Interactive Section

This section is part of the How a Spark Job Runs lesson on DataDriven, a free data engineering interview prep platform.
