Document Ingestion and Text Extraction Pipeline

A medium-difficulty Pipeline Design mock interview question on DataDriven. Practice with AI-powered feedback, real code execution, and a hire/no-hire decision.

Domain: Pipeline Design
Difficulty: medium
Seniority: senior

Interview Prompt

Our legal team receives thousands of contract documents every month in PDF and scanned image format. They need to search across all of them and extract key terms like party names, dates, and obligations. Right now every document lives in a shared drive and search is impossible. Design a pipeline to ingest these documents and make the content queryable.
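One possible starting point for the prompt above, as a minimal sketch rather than a reference answer: it assumes text has already been pulled out of each PDF or scanned image by an upstream parsing or OCR step (not shown), then extracts one kind of key term (ISO-formatted dates only) and builds an in-memory inverted index for keyword search. The names `Document`, `extract_terms`, and `InvertedIndex` are hypothetical and chosen only for illustration; a real design would also need party and obligation extraction, durable storage, and a proper search engine.

```python
import re
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Set

# Assumption: only ISO dates (YYYY-MM-DD) are recognised in this sketch.
DATE_RE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")
TOKEN_RE = re.compile(r"[a-z0-9]+")


@dataclass
class Document:
    """Hypothetical record for one ingested contract."""
    doc_id: str
    text: str  # assumed output of an upstream PDF-parsing or OCR step
    dates: List[str] = field(default_factory=list)


def extract_terms(doc_id: str, text: str) -> Document:
    """Attach the dates found in the text to a Document record."""
    return Document(doc_id=doc_id, text=text, dates=DATE_RE.findall(text))


class InvertedIndex:
    """Maps each lowercase token to the set of document ids containing it."""

    def __init__(self) -> None:
        self._postings: Dict[str, Set[str]] = defaultdict(set)

    def add(self, doc: Document) -> None:
        for token in TOKEN_RE.findall(doc.text.lower()):
            self._postings[token].add(doc.doc_id)

    def search(self, query: str) -> Set[str]:
        """Return ids of documents containing every token in the query."""
        tokens = TOKEN_RE.findall(query.lower())
        if not tokens:
            return set()
        result = set(self._postings.get(tokens[0], set()))
        for token in tokens[1:]:
            result &= self._postings.get(token, set())
        return result


if __name__ == "__main__":
    index = InvertedIndex()
    doc = extract_terms("c1", "Acme Corp shall deliver goods by 2024-03-01.")
    index.add(doc)
    print(doc.dates, index.search("acme deliver"))
```

Keeping extraction and indexing as separate steps means a failed OCR pass on one document can be retried without rebuilding the index, which is one of the trade-offs worth raising with the interviewer.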

How This Interview Works

  1. Read the vague prompt (just like a real interview)
  2. Ask clarifying questions to the AI interviewer
  3. Write your pipeline design solution with real code execution
  4. Get instant feedback and a hire/no-hire decision