The architecture decision that defines your pipeline
Topics covered: When Would You Use Streaming?, Design for Both Batch and Stream, What About Late Data?, Micro-batch or True Streaming?, How Do You Handle Failures?
The Decision Framework

The interviewer wants to hear a number, not a vibe. Your first sentence should be: "What's the latency SLA?" Not "do we want real-time" - that's a wish, not a requirement. If they say "daily is fine," you just eliminated streaming from the conversation. That's a senior signal - you saved six months of unnecessary infrastructure. There are exactly four factors that drive the batch-vs-stream decision. Your answer should walk through all four. Miss any one of them and the int
Lambda Architecture in Practice

Here's the reality the interviewer is testing: production data platforms almost always need both batch and streaming. Batch gives you correctness. Streaming gives you speed. The question is how you merge them without creating a maintenance nightmare. Your answer should acknowledge both sides and then explain the merge strategy. The Lambda architecture formalized this pattern: a speed layer (streaming) serves approximate results immediately, while a batch layer rep
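The merge strategy can be sketched in plain Python. This is a toy model, not framework code: the `merge_views` name, the date-keyed count dictionaries, and the cutoff convention are all illustrative. The idea is that the serving layer answers queries from the exact batch view up to the last completed batch run, and overlays the approximate speed-layer view only for data the batch layer hasn't covered yet.

```python
def merge_views(batch_view: dict, speed_view: dict, batch_cutoff: str) -> dict:
    """Serve exact batch results through the cutoff; overlay the
    streaming (speed-layer) results only for keys after it.

    Keys are date strings here purely for illustration.
    """
    # Batch layer wins for everything it has recomputed (exact results).
    merged = {k: v for k, v in batch_view.items() if k <= batch_cutoff}
    # Speed layer fills in the still-open period (approximate results,
    # replaced on the next batch run).
    merged.update({k: v for k, v in speed_view.items() if k > batch_cutoff})
    return merged

# Batch has recomputed through 2024-01-02; the speed layer covers the
# open day and a slightly stale count for the already-finalized day.
batch = {"2024-01-01": 100, "2024-01-02": 200}
speed = {"2024-01-02": 198, "2024-01-03": 40}
print(merge_views(batch, speed, "2024-01-02"))
# → {'2024-01-01': 100, '2024-01-02': 200, '2024-01-03': 40}
```

Note the design choice: the batch value (200) silently overrides the speed layer's approximate 198 for the finalized day, which is exactly the "batch replaces streaming" repair loop the Lambda pattern relies on.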
Watermarks, Windows, and Late Arrivals

This is the #1 follow-up question after you propose a streaming architecture. Interviewers commonly ask some version of: "A user clicks at 11:59 PM but the event arrives at 12:03 AM. Which day does it belong to?" If you've already closed the daily window at midnight, you've got a problem. The interviewer wants to hear three terms: watermark, allowed lateness, dead-letter. Start your answer here: a watermark is the system's estimate of "how far behind is the
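The three-tier policy can be shown as a small sketch, assuming an illustrative 10-minute allowed-lateness setting (the function name `classify` and the thresholds are made up for the example; real engines like Spark or Flink apply the same logic per window):

```python
from datetime import datetime, timedelta

ALLOWED_LATENESS = timedelta(minutes=10)  # illustrative policy, not a default

def classify(event_time: datetime, watermark: datetime) -> str:
    """Route an event based on how far behind the watermark it is.

    - at or ahead of the watermark: on time, joins the open window
    - behind, but within allowed lateness: accepted; the already-emitted
      window result is updated
    - older than that: sent to a dead-letter sink for offline repair
    """
    if event_time >= watermark:
        return "on-time"
    if watermark - event_time <= ALLOWED_LATENESS:
        return "late-accepted"
    return "dead-letter"

# The 11:59 PM click arriving at 12:03 AM: if the watermark (e.g. max
# event time seen minus a delay) has only advanced to 11:58 PM, the
# event is still on time and lands in the correct daily window.
wm = datetime(2024, 1, 1, 23, 58)
print(classify(datetime(2024, 1, 1, 23, 59), wm))  # → on-time
print(classify(datetime(2024, 1, 1, 23, 50), wm))  # → late-accepted
print(classify(datetime(2024, 1, 1, 23, 40), wm))  # → dead-letter
```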
Spark Structured Streaming vs Flink

The interviewer is testing whether you pick a framework based on requirements or based on resume keywords. "Streaming" is a spectrum, not a binary. On one end, Spark Structured Streaming processes data in small batches (micro-batches) with a latency floor around 100 milliseconds. On the other end, Apache Flink processes each event individually. Your answer should match the framework to the SLA, not to your personal preference. The trigger interval in Spark SS
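Why a trigger interval imposes a latency floor is easy to see with a toy calculation (plain Python, not Spark; `micro_batch_latency` is a hypothetical helper): in a micro-batch engine, an event waits until the next trigger fires before it can even be processed, so the trigger interval is a lower bound on end-to-end latency that a per-event engine like Flink doesn't have.

```python
import math

def micro_batch_latency(arrival: float, trigger: float) -> float:
    """Seconds an event waits for the next micro-batch trigger to fire,
    ignoring all processing time (so this is a best-case floor)."""
    next_trigger = math.ceil(arrival / trigger) * trigger
    return round(next_trigger - arrival, 3)

# With a 1-second trigger, an event arriving at t=2.1s sits in the
# source buffer for 0.9s before its batch even starts; a true
# event-at-a-time engine would pick it up immediately.
print(micro_batch_latency(2.1, 1.0))  # → 0.9
print(micro_batch_latency(3.0, 1.0))  # → 0.0
```

The practical takeaway for the interview: if the SLA is seconds, micro-batching is fine and operationally simpler; if it's tens of milliseconds, the trigger interval alone rules it out.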
Exactly-Once, Offsets, and Checkpoints

Every streaming system will fail. The interviewer knows this. The question isn't whether failures happen - it's whether your pipeline produces correct results when they do. Your answer framework: start with delivery guarantees, then explain offset management, then describe your idempotency strategy. Hit all three and you've covered the full rubric. The interviewer wants to hear that at-least-once is the production default. Not exactly-once. The trap is answ
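The idempotency piece of that framework can be sketched in a few lines of plain Python (the `apply_events` helper and the event shape are illustrative, not a real sink API). Under at-least-once delivery, a crash between processing and offset commit means the same events get redelivered; if the sink is an idempotent upsert keyed by a unique event ID, the replay overwrites identical rows instead of double-counting:

```python
def apply_events(store: dict, events: list) -> dict:
    """Idempotent upsert keyed by event_id: replaying the same batch
    after a checkpoint restore overwrites rather than duplicates."""
    for e in events:
        store[e["event_id"]] = e["amount"]
    return store

batch = [{"event_id": "e1", "amount": 10},
         {"event_id": "e2", "amount": 5}]

store = apply_events({}, batch)
store = apply_events(store, batch)  # redelivery after a simulated crash
print(sum(store.values()))          # → 15, not 30
```

This is the answer shape the section describes: accept at-least-once from the transport, and manufacture effectively-exactly-once at the sink with keys, not with transport guarantees alone.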