DataDriven


A medium Pipeline Design mock interview question on DataDriven. Practice with AI-powered feedback, real code execution, and a hire/no-hire decision.

Domain: Pipeline Design
Difficulty: Medium

Interview Prompt

A clickstream pipeline matches the section's worked example: 18 months of mobile events stored as unpartitioned GZIP CSV in S3 (10 TB total). The DAU dashboard scans the full 10 TB on every refresh because none of the four intermediate-tier levers is applied. Apply all four (columnar format, partitioning, splittable compression, pushdown engine) so that the same dashboard SQL scans roughly 100 GB instead of 10 TB.
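One way to sanity-check the 10 TB to roughly 100 GB target is a back-of-the-envelope estimate. The sketch below is a simplified model, not a measurement. Only the 10 TB total and the 18-month window come from the prompt. The 30-day dashboard window, the 20% column share, and every name in the code are hypothetical assumptions chosen for illustration. The model treats splittable compression as a parallelism lever that does not change bytes scanned, and it ignores any size change from re-encoding CSV into a columnar format.

```python
from dataclasses import dataclass

TB = 1000 ** 4
GB = 1000 ** 3


@dataclass(frozen=True)
class ScanAssumptions:
    total_bytes: int = 10 * TB     # from the prompt
    days_stored: int = 540         # ~18 months, assuming 30-day months
    days_queried: int = 30         # hypothetical: dashboard looks back 30 days
    column_fraction: float = 0.2   # hypothetical: byte share of the columns the DAU query reads


def estimated_scan_bytes(a: ScanAssumptions, *, partitioned: bool,
                         columnar: bool, pushdown: bool) -> float:
    """Rough bytes scanned per dashboard refresh under the stated assumptions."""
    scanned = float(a.total_bytes)
    if not pushdown:
        # Without an engine that prunes partitions and projects columns,
        # the layout changes alone do not reduce the bytes read in this model.
        return scanned
    if partitioned:
        # Partition pruning: only the date partitions inside the window are read.
        scanned *= a.days_queried / a.days_stored
    if columnar:
        # Column projection: only the referenced columns are read.
        scanned *= a.column_fraction
    return scanned


if __name__ == "__main__":
    a = ScanAssumptions()
    before = estimated_scan_bytes(a, partitioned=False, columnar=False, pushdown=False)
    after = estimated_scan_bytes(a, partitioned=True, columnar=True, pushdown=True)
    print(f"before: {before / TB:.0f} TB, after: {after / GB:.0f} GB")
```

Under these assumed numbers the estimate lands near 111 GB, the same order of magnitude as the prompt's target. In an interview, the actual dashboard window and column share are worth confirming through clarifying questions, since they drive the result.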

How This Interview Works

  1. Read the vague prompt (just like a real interview)
  2. Ask clarifying questions to the AI interviewer
  3. Write your pipeline design solution with real code execution
  4. Get instant feedback and a hire/no-hire decision
