Staff Data Engineer Top Tech Architecture
Cloudflare
The Bucket Full of Resumes
Our HR platform receives thousands of resumes monthly as PDFs and scanned images. Right now they sit in an S3 bucket, and searching them means opening files manually. We need a pipeline that extracts structured information from every document (candidate name, skills, work history, education) and makes it queryable. Design the end-to-end ingestion and extraction pipeline.
Ask the interviewer clarifying questions to understand the requirements and constraints before designing.
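As a starting point for those clarifying questions, it can help to write down the target record and the first branching decision the prompt implies. The sketch below is illustrative only: the fields come from the prompt, but the `ResumeRecord` name, the `choose_extraction_path` helper, the extension sets, and the path labels are all assumptions, not part of the question. Routing by file extension is also a simplification, since a PDF can itself be a scanned image with no text layer.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath
from typing import Dict, List, Optional

# Hypothetical extension sets; the prompt only says "PDFs and scanned images".
PDF_EXTENSIONS = {".pdf"}
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".tif", ".tiff"}


@dataclass
class ResumeRecord:
    """One extracted resume. Field names mirror the prompt; types are assumed."""
    source_key: str                      # S3 object key the record came from
    candidate_name: Optional[str] = None
    skills: List[str] = field(default_factory=list)
    work_history: List[Dict[str, str]] = field(default_factory=list)
    education: List[Dict[str, str]] = field(default_factory=list)


def choose_extraction_path(object_key: str) -> str:
    """Pick an extraction path from the object key's extension.

    Returns "pdf_text", "ocr", or "unsupported" (labels are illustrative).
    """
    suffix = PurePosixPath(object_key).suffix.lower()
    if suffix in PDF_EXTENSIONS:
        return "pdf_text"
    if suffix in IMAGE_EXTENSIONS:
        return "ocr"
    return "unsupported"


if __name__ == "__main__":
    for key in ["resumes/2024/jane.PDF", "resumes/scan_017.jpeg", "notes.docx"]:
        print(key, "->", choose_extraction_path(key))
```

Questions this sketch leaves open, and that are worth raising with the interviewer, include how to detect scanned PDFs, what counts as "queryable", and how to handle documents that fail extraction.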
A medium-difficulty Pipeline Design mock interview question on DataDriven. Practice with AI-powered feedback, real code execution, and a hire/no-hire decision.
- Domain: Pipeline Design
- Difficulty: medium
- Seniority: staff
How This Interview Works
- Read the vague prompt (just like a real interview)
- Ask clarifying questions to the AI interviewer
- Write your pipeline design solution with real code execution
- Get instant feedback and a hire/no-hire decision