Log Parsing Pipeline Schema
A medium Data Modeling mock interview question on DataDriven. Practice with AI-powered feedback, real code execution, and a hire/no-hire decision.
- Domain: Data Modeling
- Difficulty: medium
- Seniority: L5
Interview Prompt
We ingest about 2TB of raw application logs daily. Right now they're just text files in S3. We need a structured schema so analysts can query error patterns, correlate events within sessions, and track error rates by service. Can you design the data model?
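As a starting point for discussion, here is a minimal sketch of one possible shape for such a model, not the expected answer. It assumes a single event fact table plus a small service dimension. Every table name, column name, and sample row here (`dim_service`, `fact_log_event`, `session_id`, `error_code`, and so on) is hypothetical, and SQLite in memory merely stands in for whatever warehouse would actually hold 2TB per day.

```python
import sqlite3

# Hypothetical schema: one row per parsed log line, keyed to a service dimension.
DDL = """
CREATE TABLE dim_service (
    service_id   INTEGER PRIMARY KEY,
    service_name TEXT NOT NULL UNIQUE
);
CREATE TABLE fact_log_event (
    event_id   INTEGER PRIMARY KEY,
    event_ts   TEXT NOT NULL,   -- ISO-8601 UTC timestamp
    event_date TEXT NOT NULL,   -- would likely be the partition key in a real warehouse
    service_id INTEGER NOT NULL REFERENCES dim_service(service_id),
    session_id TEXT,            -- nullable: assumes some log lines have no session
    level      TEXT NOT NULL,   -- e.g. INFO, WARN, ERROR
    error_code TEXT,            -- NULL for non-error events
    message    TEXT
);
CREATE INDEX idx_event_session ON fact_log_event (session_id, event_ts);
"""


def build_demo_db():
    """Create an in-memory database with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(DDL)
    conn.executemany(
        "INSERT INTO dim_service (service_id, service_name) VALUES (?, ?)",
        [(1, "checkout"), (2, "search")],
    )
    rows = [
        ("2024-01-01T00:00:01Z", "2024-01-01", 1, "s1", "INFO", None, "cart opened"),
        ("2024-01-01T00:00:02Z", "2024-01-01", 1, "s1", "ERROR", "E500", "payment failed"),
        ("2024-01-01T00:00:03Z", "2024-01-01", 2, "s2", "INFO", None, "query ok"),
        ("2024-01-01T00:00:04Z", "2024-01-01", 2, "s2", "INFO", None, "query ok"),
    ]
    conn.executemany(
        "INSERT INTO fact_log_event "
        "(event_ts, event_date, service_id, session_id, level, error_code, message) "
        "VALUES (?, ?, ?, ?, ?, ?, ?)",
        rows,
    )
    return conn


def error_rate_by_service(conn):
    """Fraction of events per service whose level is ERROR."""
    sql = """
        SELECT s.service_name,
               AVG(CASE WHEN e.level = 'ERROR' THEN 1.0 ELSE 0.0 END) AS error_rate
        FROM fact_log_event AS e
        JOIN dim_service AS s ON s.service_id = e.service_id
        GROUP BY s.service_name
        ORDER BY s.service_name
    """
    return conn.execute(sql).fetchall()


def session_events(conn, session_id):
    """All events in one session, in time order, for correlating events."""
    sql = """
        SELECT event_ts, level, error_code
        FROM fact_log_event
        WHERE session_id = ?
        ORDER BY event_ts
    """
    return conn.execute(sql, (session_id,)).fetchall()
```

The two query helpers map to two of the analyst needs named in the prompt: error rates by service and event correlation within a session. Questions the sketch leaves open, such as partitioning, retention, and how malformed lines are handled, are the kind of thing the clarifying-question stage is meant to surface.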
Summary
Raw text files, terabytes of them, full of buried signals and cryptic error codes.
How This Interview Works
- Read the vague prompt (just like a real interview)
- Ask clarifying questions to the AI interviewer
- Write your data modeling solution with real code execution
- Get instant feedback and a hire/no-hire decision