DataDriven

© 2026 DataDriven

Streaming Systems

Answer the Kafka and streaming questions with confidence

Category
Pipeline Architecture
Difficulty
Beginner
Duration
20 minutes
Challenges
0 hands-on challenges

Topics covered: Event Platforms, Event-Driven Architecture, Late-Arriving Data, Dead Letter Queues, Micro-Batch vs True Streaming

Lesson Sections

  1. Event Platforms (concepts: paEventPlatforms)

    What They Want to Hear: 'Kafka is a distributed event streaming platform. Producers write events to topics. Each topic is split into partitions for parallel processing. Consumers read from partitions using consumer groups, where each partition is assigned to exactly one consumer in the group. The key difference from a traditional message queue: Kafka retains events after they are read, so multiple consumers can independently replay the same data.' That is the answer. Topics, partitions, consumer groups, retention.
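The model above can be sketched as a toy in-memory simulation (pure Python, no Kafka client; `partition_for`, `topic`, and `consume_all` are invented names for illustration, and Kafka's real partitioner uses murmur2, not MD5): keys hash to partitions, each partition is an append-only log, and each consumer group tracks its own offsets, so retention lets every group replay the same events.

```python
from hashlib import md5

NUM_PARTITIONS = 3

def partition_for(key: str) -> int:
    # Deterministic key -> partition mapping (stand-in for Kafka's hash partitioner)
    return int(md5(key.encode()).hexdigest(), 16) % NUM_PARTITIONS

# A "topic": one append-only log per partition; events are retained after reads
topic = {p: [] for p in range(NUM_PARTITIONS)}

def produce(key: str, value: str) -> None:
    topic[partition_for(key)].append((key, value))

for i in range(6):
    produce(f"user-{i}", f"event-{i}")

# Each consumer group tracks its OWN offsets, so groups read independently:
# retention is what makes replay possible.
def consume_all(offsets: dict) -> list:
    events = []
    for p, log in topic.items():
        events.extend(log[offsets[p]:])
        offsets[p] = len(log)  # advance this group's offset to end of log
    return events

group_a = {p: 0 for p in range(NUM_PARTITIONS)}
group_b = {p: 0 for p in range(NUM_PARTITIONS)}
assert len(consume_all(group_a)) == 6
assert len(consume_all(group_b)) == 6  # same data, replayed independently
```

Note that events with the same key always land in the same partition, which is how Kafka preserves per-key ordering.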

  2. Event-Driven Architecture (concepts: paEventDriven)

    What They Want to Hear: 'In event-driven architecture, services communicate by publishing events instead of calling each other directly. When an order is placed, the order service publishes an event. The inventory service, the notification service, and the analytics pipeline each consume that event independently. No service needs to know about the others. This decouples teams and systems.' That is the answer. Publish, not call. Independent consumers. Decoupled teams.
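A minimal publish/subscribe sketch of this decoupling (plain Python; `subscribe` and `publish` are hypothetical helpers, not a real broker API). The publisher never references its consumers:

```python
from collections import defaultdict

subscribers = defaultdict(list)

def subscribe(event_type: str, handler) -> None:
    subscribers[event_type].append(handler)

def publish(event_type: str, payload: dict) -> None:
    # The publisher does not know who is listening, or how many listeners exist
    for handler in subscribers[event_type]:
        handler(payload)

handled = []
subscribe("order_placed", lambda e: handled.append(("inventory", e["order_id"])))
subscribe("order_placed", lambda e: handled.append(("notify", e["order_id"])))
subscribe("order_placed", lambda e: handled.append(("analytics", e["order_id"])))

# The order service just publishes; three services react independently
publish("order_placed", {"order_id": 42})
```

Adding a fourth consumer is another `subscribe` call; the order service's code does not change, which is the decoupling the answer describes.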

  3. Late-Arriving Data (concepts: paLateData)

    What They Want to Hear: 'Late data arrives after the window it belongs to has already been processed. A click that happened at 11:58 PM might arrive at 12:03 AM, after the hourly window closed. I handle this with watermarks: a threshold that says how late I am willing to wait. If my watermark is 10 minutes, I keep the window open for 10 extra minutes to accept late events. Events that arrive after the watermark are either dropped or sent to a dead letter queue for reprocessing.' That is the answer.
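The watermark rule can be sketched in a few lines (a toy single-process model with invented names `WINDOW`, `WATERMARK`, `process`; real engines like Flink track watermarks per stream and per operator). An event is late when the maximum event time seen so far has moved past its window's end plus the watermark:

```python
WINDOW = 3600       # hourly windows, in seconds
WATERMARK = 600     # willing to wait 10 extra minutes for late events

windows, dlq = {}, []
max_event_time = 0  # high-water mark of event time seen so far

def process(event_time: int, value: str) -> None:
    global max_event_time
    max_event_time = max(max_event_time, event_time)
    window_start = event_time - event_time % WINDOW
    if max_event_time > window_start + WINDOW + WATERMARK:
        # Past the watermark: the window has been finalized, so dead-letter it
        dlq.append((event_time, value))
    else:
        windows.setdefault(window_start, []).append(value)

process(3500, "click-a")  # on time, lands in window [0, 3600)
process(4300, "click-b")  # advances event time into the next window
process(3590, "click-c")  # 710s behind the clock: past the 600s watermark
```

The tradeoff is explicit in the two constants: a larger `WATERMARK` accepts more late data but delays finalizing each window.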

  4. Dead Letter Queues (concepts: paDeadLetterQueue)

    What They Want to Hear: 'A dead letter queue (DLQ) is where events go when they cannot be processed. Instead of crashing the pipeline or blocking the stream, the bad event is moved to a separate topic for investigation. This keeps the main pipeline flowing. I monitor DLQ depth as a health metric: if it grows, something is systematically wrong. I reprocess DLQ events after fixing the root cause.' That is the answer. DLQ = safety valve. Monitor depth. Fix root cause, then replay.
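A sketch of the DLQ pattern as a plain processing loop (toy data, no real broker; `handle` and the sample events are made up): a bad event is caught and routed aside instead of stopping the stream, and replay happens after the fix.

```python
main_topic = [
    {"id": 1, "amount": "10"},
    {"id": 2, "amount": "oops"},  # malformed: will fail parsing
    {"id": 3, "amount": "5"},
]
dlq = []
total = 0

def handle(event: dict) -> int:
    return int(event["amount"])  # raises ValueError on malformed input

for event in main_topic:
    try:
        total += handle(event)
    except ValueError:
        dlq.append(event)  # don't crash the pipeline: dead-letter the event

# DLQ depth is the health metric: growth means a systematic problem
assert len(dlq) == 1

# After fixing the root cause, replay the dead-lettered events
for event in dlq:
    event["amount"] = "0"  # stand-in for the upstream fix
    total += handle(event)
```

The key property: the two good events were processed even though the bad one sat between them.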

  5. Micro-Batch vs True Streaming (concepts: paMicroBatchVsTrue)

    What They Want to Hear: 'Micro-batch processes events in small time windows, typically every few seconds. Spark Structured Streaming uses this model. True streaming processes each event as it arrives with no batching delay. Flink uses this model. The practical difference is latency: micro-batch has a floor around 100 milliseconds. True streaming can process in single-digit milliseconds. For most use cases, micro-batch is good enough and simpler to operate.' That is the answer. Micro-batch = small windows. True streaming = per-event.
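The latency difference can be illustrated with a back-of-envelope simulation (pure Python; the arrival times and the 1-second trigger interval are made up, and per-event processing cost is ignored). In micro-batch, each event waits until its batch fires; in true streaming it is handled on arrival:

```python
import math

arrivals = [0.1, 0.4, 0.9, 1.3, 2.05]  # event arrival times, in seconds
TRIGGER = 1.0                           # micro-batch trigger interval

# Micro-batch: an event waits for the next trigger boundary after it arrives
micro_latency = [math.ceil(t / TRIGGER) * TRIGGER - t for t in arrivals]

# True streaming: each event is processed immediately (cost idealized to zero)
stream_latency = [0.0 for _ in arrivals]
```

In this toy model micro-batch latency is bounded by the trigger interval but never better than the wait to the next batch, which is exactly the latency floor the answer describes; shrinking `TRIGGER` narrows the gap at the cost of more scheduling overhead.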
