Fix Skewed Viewing Events Pipeline
A hard Spark mock interview question on DataDriven. Practice with AI-powered feedback, real code execution, and a hire/no-hire decision.
- Domain: Spark
- Difficulty: hard
- Seniority: senior, staff
Interview Prompt
You are the on-call data engineer at a streaming company. The nightly `viewing_engagement` Spark job just paged you. It normally finishes in 45 minutes but has been running for over two hours and is still stuck. The job joins a large `event_data` table (800M rows/day of viewing, playback, and interaction events) against a small `users` dimension (2M subscribers) to produce daily engagement metrics by event type and account tier. Your SLA is 60 minutes. Diagnose the root cause using the Spark UI evidence and fix the job so it meets SLA.
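The prompt does not state the root cause, but the title points at key skew in the join. As a rough, Spark-free illustration of why one hot join key can stall a shuffle stage, and how salting the key spreads that load, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the partition count, the salt fan-out, the hot user id, and the modulo "partitioner" that stands in for a real hash partitioner. It is a toy model, not the job's actual code.

```python
from collections import Counter

NUM_PARTITIONS = 8  # hypothetical shuffle partition count
NUM_SALTS = 8       # hypothetical salt fan-out
HOT_USER = 0        # hypothetical hot key, e.g. a placeholder user id


def partition_of(key: int) -> int:
    # Toy stand-in for a hash partitioner: equal keys always share a partition.
    return key % NUM_PARTITIONS


def rows_per_partition(keys):
    # Count how many rows each shuffle partition would receive.
    counts = Counter(partition_of(k) for k in keys)
    return [counts[p] for p in range(NUM_PARTITIONS)]


def skew_ratio(sizes):
    # Largest partition divided by the mean partition size (1.0 means balanced).
    return max(sizes) / (sum(sizes) / len(sizes))


# Toy event stream: half the rows carry the hot user id,
# the other half are spread evenly over user ids 1..800.
events = [HOT_USER] * 800 + list(range(1, 801))

plain = rows_per_partition(events)

# Salting: combine each user id with a small rotating salt so the hot key
# becomes NUM_SALTS distinct join keys instead of one.
salted_keys = [user_id * NUM_SALTS + (i % NUM_SALTS)
               for i, user_id in enumerate(events)]
salted = rows_per_partition(salted_keys)

print(plain, skew_ratio(plain))    # [900, 100, 100, 100, 100, 100, 100, 100] 4.5
print(salted, skew_ratio(salted))  # [200, 200, 200, 200, 200, 200, 200, 200] 1.0
```

In the unsalted case one partition holds 900 of 1,600 rows, which is the kind of single long-running task a skewed stage tends to show in the Spark UI. In a real salted join the dimension side also has to be replicated once per salt value so every salted key still finds its match. When the dimension is small enough to fit in executor memory, a broadcast join may avoid shuffling the large table at all. Whether that applies to a 2M-row `users` table depends on row width and cluster settings that the prompt does not give.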
How This Interview Works
- Read the vague prompt (just like a real interview)
- Ask clarifying questions to the AI interviewer
- Write your Spark solution with real code execution
- Get instant feedback and a hire/no-hire decision