Databricks Pipeline with Spark Performance Optimization
A medium Pipeline Design interview practice problem on DataDriven. Write and execute real pipeline design code with instant grading.
- Domain: Pipeline Design
- Difficulty: medium
- Seniority: senior
Problem
Our bank runs a Databricks platform for transaction analytics. The pipelines work but are slow: a daily job that should finish in 45 minutes takes 3.5 hours, and the team has been adding compute without understanding the root cause. Design the optimized pipeline architecture and a performance remediation plan that resolves the Spark bottlenecks.
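As a possible starting point for the root-cause analysis, the sketch below quantifies the gap stated in the problem and shows one simple way to flag task skew within a stage. It is a minimal illustration, not the expected solution: the helper names `slowdown_factor` and `skew_ratio` and the sample task durations are hypothetical, and the durations are assumed to be copied by hand from a stage's task list rather than read from any Spark API.

```python
from statistics import median


def slowdown_factor(actual_minutes: float, target_minutes: float) -> float:
    """Return how many times longer the job runs than its target."""
    return actual_minutes / target_minutes


def skew_ratio(task_seconds: list[float]) -> float:
    """Return the slowest task duration divided by the median task duration."""
    return max(task_seconds) / median(task_seconds)


# Figures from the problem statement: 3.5 hours actual, 45 minutes target.
factor = slowdown_factor(3.5 * 60, 45)
print(f"Job runs {factor:.2f}x slower than target")  # 4.67x

# Hypothetical per-task durations (seconds) for a single stage.
sample_stage = [12.0, 11.0, 13.0, 12.0, 240.0]
print(f"Skew ratio: {skew_ratio(sample_stage):.1f}")  # 20.0
```

A ratio far above 1 suggests that one straggler task dominates the stage. In that situation adding executors is unlikely to help much, because the stage cannot finish before its slowest task does, which would be consistent with the team seeing little benefit from extra compute.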