Cost-Optimized Clickstream Data Lake
A hard Pipeline Design mock interview question on DataDriven. Practice with AI-powered feedback, real code execution, and a hire/no-hire decision.
- Domain: Pipeline Design
- Difficulty: hard
- Seniority: staff
Interview Prompt
Our product generates hundreds of millions of user interaction events every day. We stream them through Kafka, but right now they just pile up, and we have no good way to query them for analytics. Storage costs are already a concern, and the data needs to be queryable for at least two years. Design an architecture to store and query this event data efficiently.
How This Interview Works
- Read the vague prompt (just like a real interview)
- Ask clarifying questions to the AI interviewer
- Write your pipeline design solution with real code execution
- Get instant feedback and a hire/no-hire decision