Push It Down
A medium Spark mock interview question on DataDriven. Practice with AI-powered feedback, real code execution, and a hire/no-hire decision.
- Domain: Spark
- Difficulty: medium
- Seniority: senior
Interview Prompt
A daily analytics job reads a 3 TB user_events Parquet table partitioned by event_date, filters to yesterday (about 10 GB), and joins against user_profiles. The job takes 40 minutes but should take 5. A colleague wrote the pipeline using a subquery pattern that defeats partition pruning. The physical plan shows a full table scan of all 3 TB. Rewrite the query so Catalyst pushes the date filter down to the file scan.
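The prompt doesn't show the colleague's actual query, so the following is a hypothetical sketch of one common subquery anti-pattern and the rewrite that restores partition pruning. Table and column names (`user_events`, `event_date`, `user_profiles`) come from the prompt; the join key `user_id` and the example date are assumptions.

```python
from datetime import date, timedelta

# Anti-pattern (assumed): resolving "yesterday" with a subquery over the
# same table. Catalyst cannot fold the subquery result into a static
# partition filter at planning time, so the scan reads all 3 TB.
slow_sql = """
SELECT e.*, p.*
FROM user_events e
JOIN user_profiles p ON e.user_id = p.user_id
WHERE e.event_date IN (SELECT max(event_date) FROM user_events)
"""

# Fix: compute "yesterday" on the driver and compare the partition column
# directly to a literal. Catalyst turns this into a PartitionFilters entry
# on the Parquet scan, so only yesterday's ~10 GB partition is read.
yesterday = (date.today() - timedelta(days=1)).isoformat()
fast_sql = f"""
SELECT e.*, p.*
FROM user_events e
JOIN user_profiles p ON e.user_id = p.user_id
WHERE e.event_date = DATE'{yesterday}'
"""

print(fast_sql)
```

After the rewrite, `df.explain()` on the fast query should show the date predicate under `PartitionFilters` in the `FileScan parquet` node rather than a full scan, which is the check the interviewer is looking for.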
How This Interview Works
- Read the vague prompt (just like a real interview)
- Ask clarifying questions to the AI interviewer
- Write your Spark solution with real code execution
- Get instant feedback and a hire/no-hire decision