

A medium Pipeline Design mock interview question on DataDriven. Practice with AI-powered feedback, real code execution, and a hire/no-hire decision.

Domain: Pipeline Design
Difficulty: medium

Interview Prompt

A 2014-era system on the canvas runs the canonical Lambda architecture this section just walked through: an immutable Kafka event log, a nightly Hadoop batch layer producing the canonical view in Snowflake, and a Flink speed layer producing a real-time approximate delta in Redis. Both views exist, but the analytics dashboard reads them separately and shows inconsistent numbers because no serving layer merges the two at query time.

Apply the Lambda three-layer framing this section just taught and add the missing serving layer: a serving-layer storage node (HBase, Cassandra, or a cache layer like Redis or DynamoDB) that the dashboard reads and that fronts both the batch view (for everything older than the last batch run) and the speed view (for everything since).

The dashboard reads only from the serving layer; it does not query the batch view or the speed view directly. Do not change the existing batch or speed views; the only architectural delta is the serving layer that merges them.
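One way to picture the merge the prompt asks for is the minimal Python sketch below. It is an illustration under stated assumptions, not a reference solution: the `ServingLayer` class, the hourly bucketing, the `batch_cutoff` field, and the sample numbers are all hypothetical, and plain dictionaries stand in for the Snowflake batch view and the Redis speed view. The idea it shows is that each time bucket is answered by exactly one view, chosen by comparing the bucket to the last batch run, so overlapping buckets are never counted twice.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, Tuple

# Hypothetical key shape: (metric name, start of the hourly bucket).
Key = Tuple[str, datetime]


@dataclass
class ServingLayer:
    """Toy serving layer: in-memory dicts stand in for the real stores."""
    batch_view: Dict[Key, int]   # canonical counts, complete up to batch_cutoff
    speed_view: Dict[Key, int]   # approximate counts for recent buckets
    batch_cutoff: datetime       # assumed: end of the window the last batch run covered

    def read(self, metric: str, bucket: datetime) -> int:
        # Each bucket is served by exactly one view, never both.
        source = self.batch_view if bucket < self.batch_cutoff else self.speed_view
        return source.get((metric, bucket), 0)

    def total(self, metric: str, start: datetime, end: datetime) -> int:
        # Sum hourly buckets in the half-open range [start, end).
        total = 0
        bucket = start
        while bucket < end:
            total += self.read(metric, bucket)
            bucket += timedelta(hours=1)
        return total


# Made-up sample data: the speed view still holds a stale, approximate
# value for the 23:00 bucket that the batch run has already finalized.
cutoff = datetime(2024, 1, 2, 0)
batch = {
    ("orders", datetime(2024, 1, 1, 22)): 100,
    ("orders", datetime(2024, 1, 1, 23)): 120,
}
speed = {
    ("orders", datetime(2024, 1, 1, 23)): 118,
    ("orders", datetime(2024, 1, 2, 0)): 40,
    ("orders", datetime(2024, 1, 2, 1)): 15,
}

serving = ServingLayer(batch_view=batch, speed_view=speed, batch_cutoff=cutoff)
merged = serving.total("orders", datetime(2024, 1, 1, 22), datetime(2024, 1, 2, 2))
naive = sum(batch.values()) + sum(speed.values())

print(merged)  # 275: batch for buckets before the cutoff, speed for the rest
print(naive)   # 393: adding both views double-counts the overlapping bucket
```

In this sketch the dashboard would call only `total` or `read`, which matches the constraint that it never queries the batch or speed view directly. A real design would also need to say how `batch_cutoff` is advanced after each batch run, which the sketch leaves out.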

How This Interview Works

Read the vague prompt (just like a real interview)
Ask clarifying questions to the AI interviewer
Write your pipeline design solution with real code execution
Get instant feedback and a hire/no-hire decision
