HIPAA-Compliant PHI De-identification Pipeline for Development
A hard Pipeline Design interview practice problem on DataDriven. Write and execute real pipeline design code with instant grading.
- Domain: Pipeline Design
- Difficulty: hard
- Seniority: staff
Problem
Our data engineering team works with protected health information (PHI) in production, and we need a way to give engineers realistic data in development and testing environments without exposing real patient records. Design a pipeline that ingests from production, de-identifies PHI, and delivers a statistically representative synthetic dataset to the dev environment.
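As a starting point for the de-identification stage only, here is a minimal Python sketch loosely modeled on the HIPAA Safe Harbor approach. The field names (`patient_id`, `zip`, `birth_date`, `age`, and the entries in `DIRECT_IDENTIFIERS`) and both function names are hypothetical and not part of the problem statement. The sketch does not cover all 18 Safe Harbor identifier categories, does not handle low-population ZIP prefixes, and does not address ingestion or synthetic data generation.

```python
import hashlib
import hmac
from datetime import date

# Hypothetical column names for illustration; a real schema will differ.
DIRECT_IDENTIFIERS = {"name", "ssn", "email", "phone", "address"}


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Return a keyed hash so one patient maps to one stable token."""
    digest = hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def deidentify_record(record: dict, secret_key: bytes) -> dict:
    """Drop direct identifiers and generalize quasi-identifiers in one record."""
    # Remove fields that identify a person directly.
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}

    # Replace the real key with a token that still supports joins.
    if "patient_id" in out:
        out["patient_id"] = pseudonymize(str(out["patient_id"]), secret_key)

    # Keep only the first three ZIP digits. Low-population prefixes would
    # need further suppression, which this sketch does not do.
    if "zip" in out:
        out["zip"] = str(out["zip"])[:3]

    # Reduce the birth date to a year. Birth years that imply an age over 89
    # would also need generalizing, which this sketch does not do.
    if "birth_date" in out:
        out["birth_year"] = out.pop("birth_date").year

    # Collapse ages of 90 and above into a single top category, stored as 90.
    if "age" in out:
        out["age"] = min(out["age"], 90)

    return out


sample = {
    "patient_id": 123,
    "name": "Jane Doe",
    "ssn": "000-00-0000",
    "zip": "94110",
    "birth_date": date(1980, 5, 17),
    "age": 93,
    "diagnosis_code": "E11.9",
}
cleaned = deidentify_record(sample, secret_key=b"example-key-not-for-production")
```

A keyed hash (HMAC) rather than a plain hash is one possible design choice here: the same patient still maps to the same token across tables, so joins keep working in dev, while someone without the key cannot rebuild the mapping by hashing candidate IDs. In that design the key would have to stay in the production environment.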