A medium Pipeline Design interview practice problem on DataDriven. Write and execute real pipeline design code with instant grading.

Domain: Pipeline Design
Difficulty: medium

Problem

A 2014-era system on the canvas runs the canonical Lambda architecture this section just walked through: an immutable Kafka event log, a nightly Hadoop batch layer producing the canonical view in Snowflake, and a Flink speed layer producing a real-time approximate delta in Redis. Both views exist, but the analytics dashboard reads them separately and shows inconsistent numbers because no serving layer merges the two at query time.

Apply the Lambda three-layer framing this section just taught and add the missing serving layer: a serving-layer storage node (HBase, Cassandra, or a low-latency key-value store such as Redis or DynamoDB) that the dashboard reads. This node fronts both the batch view (for everything older than the last batch run) and the speed view (for everything since).

The dashboard reads only from the serving layer; it does not query the batch view or the speed view directly. Do not change the existing batch or speed views; the only architectural delta is the serving layer that merges them.
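The query-time merge described above can be sketched in a few lines of Python. This is a minimal illustration, not the graded solution. It assumes the metric is an additive count, that speed-layer deltas arrive in time buckets, and that the batch run publishes a cutoff timestamp. The class `ServingLayer` and its methods are hypothetical names, and in-memory dictionaries stand in for the real serving store and for the views in Snowflake and Redis.

```python
from collections import defaultdict
from typing import Dict


class ServingLayer:
    """Hypothetical in-memory stand-in for the serving-layer store."""

    def __init__(self) -> None:
        # Canonical totals as of the last batch run (copied from the batch view).
        self.batch_totals: Dict[str, int] = {}
        # Events with a timestamp before this cutoff are covered by the batch view.
        self.batch_cutoff: int = 0
        # key -> {bucket timestamp -> approximate delta} (copied from the speed view).
        self.speed_deltas: Dict[str, Dict[int, int]] = defaultdict(dict)

    def load_batch_view(self, totals: Dict[str, int], cutoff: int) -> None:
        """Swap in a new batch snapshot and drop speed buckets it now covers."""
        self.batch_totals = dict(totals)
        self.batch_cutoff = cutoff
        for buckets in self.speed_deltas.values():
            for ts in [t for t in buckets if t < cutoff]:
                del buckets[ts]

    def record_speed_delta(self, key: str, bucket_ts: int, amount: int) -> None:
        """Accumulate a real-time delta into its time bucket."""
        buckets = self.speed_deltas[key]
        buckets[bucket_ts] = buckets.get(bucket_ts, 0) + amount

    def query(self, key: str) -> int:
        """Merge: batch total plus speed deltas at or after the batch cutoff."""
        recent = sum(
            delta
            for ts, delta in self.speed_deltas.get(key, {}).items()
            if ts >= self.batch_cutoff
        )
        return self.batch_totals.get(key, 0) + recent


if __name__ == "__main__":
    serving = ServingLayer()
    serving.record_speed_delta("orders", bucket_ts=90, amount=5)
    serving.record_speed_delta("orders", bucket_ts=110, amount=3)
    serving.load_batch_view({"orders": 100}, cutoff=100)
    print(serving.query("orders"))
```

The dashboard would call only `query`, so it never sees the two views separately. Filtering on the cutoff inside `query` means a delta is never counted twice, even if a stale bucket has not yet been pruned.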
