Log Parsing Pipeline Schema
A medium Data Modeling interview practice problem on DataDriven. Write and execute real data modeling code with instant grading.
- Domain: Data Modeling
- Difficulty: medium
- Seniority: L5
Problem
We ingest about 2TB of raw application logs daily. Right now they're just text files in S3. We need a structured schema so analysts can query error patterns, correlate events within sessions, and track error rates by service. Can you design the data model?
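As one possible starting point (a sketch, not a reference solution), the three analyst needs in the prompt map onto a single event fact table plus a small service dimension. Every table name, column name, and sample row below is a hypothetical choice made for illustration, and SQLite stands in for whatever warehouse would actually sit on top of S3. At about 2 TB per day a real design would also need partitioning (for example by event date) and a columnar storage format, which this sketch only notes in comments.

```python
import sqlite3

# Hypothetical schema: all table and column names are illustrative assumptions.
DDL = """
CREATE TABLE dim_service (
    service_id   INTEGER PRIMARY KEY,
    service_name TEXT NOT NULL UNIQUE
);
CREATE TABLE fact_log_event (
    event_id    INTEGER PRIMARY KEY,
    event_ts    TEXT NOT NULL,     -- ISO-8601 UTC timestamp parsed from the raw line
    event_date  TEXT NOT NULL,     -- would be the partition key in a real warehouse
    service_id  INTEGER NOT NULL REFERENCES dim_service(service_id),
    session_id  TEXT,              -- nullable: not every log line belongs to a session
    level       TEXT NOT NULL,     -- e.g. INFO, WARN, ERROR
    error_code  TEXT,              -- NULL for non-error events
    message     TEXT               -- remaining free-text payload
);
-- Supports "correlate events within sessions" lookups.
CREATE INDEX idx_event_session ON fact_log_event (session_id, event_ts);
"""


def build_demo_db():
    """Create an in-memory database with the sketch schema and made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(DDL)
    conn.executemany(
        "INSERT INTO dim_service (service_id, service_name) VALUES (?, ?)",
        [(1, "checkout"), (2, "search")],
    )
    rows = [  # fabricated sample events, for demonstration only
        ("2024-01-01T00:00:01Z", "2024-01-01", 1, "s1", "INFO", None, "cart opened"),
        ("2024-01-01T00:00:02Z", "2024-01-01", 1, "s1", "ERROR", "E42", "payment declined"),
        ("2024-01-01T00:00:03Z", "2024-01-01", 2, "s2", "INFO", None, "query ok"),
        ("2024-01-01T00:00:04Z", "2024-01-01", 2, "s2", "INFO", None, "query ok"),
    ]
    conn.executemany(
        "INSERT INTO fact_log_event "
        "(event_ts, event_date, service_id, session_id, level, error_code, message) "
        "VALUES (?, ?, ?, ?, ?, ?, ?)",
        rows,
    )
    return conn


def error_rate_by_service(conn):
    """Return (service_name, share of events with level ERROR) per service."""
    sql = """
        SELECT s.service_name,
               AVG(CASE WHEN e.level = 'ERROR' THEN 1.0 ELSE 0.0 END) AS error_rate
        FROM fact_log_event AS e
        JOIN dim_service AS s ON s.service_id = e.service_id
        GROUP BY s.service_name
        ORDER BY s.service_name
    """
    return conn.execute(sql).fetchall()


def session_events(conn, session_id):
    """Return (event_ts, level, error_code) for one session in time order."""
    sql = """
        SELECT event_ts, level, error_code
        FROM fact_log_event
        WHERE session_id = ?
        ORDER BY event_ts
    """
    return conn.execute(sql, (session_id,)).fetchall()


if __name__ == "__main__":
    demo = build_demo_db()
    print(error_rate_by_service(demo))
    print(session_events(demo, "s1"))
```

Keeping one row per log event leaves error-pattern queries, session correlation, and per-service error rates as plain filters and aggregations over the same table; pre-aggregated rollups could be layered on later if query cost at this volume requires it.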
Summary
The input is terabytes of raw text files, full of buried signals and cryptic error codes.