DataDriven
LearnPracticeInterviewDiscussDaily
HelpContactPrivacyTermsSecurityiOS App

© 2026 DataDriven

Loading lesson...

  1. Home
  2. Learn
  3. Merge Intervals: Intermediate

Merge Intervals: Intermediate

Sweep events, not intervals. One pass reveals maximum overlap, gaps, and schedules.

Sweep events, not intervals. One pass reveals maximum overlap, gaps, and schedules.

Category
Python
Difficulty
intermediate
Duration
30 minutes
Challenges
0 hands-on challenges

Topics covered: Meeting Rooms: How Many Rooms Do You Need?, Sweep Line: Events Drive Everything, Interval Trees and SortedList: When O(log n) Stabbing Queries Matter, Range Coverage: Minimum Intervals to Cover a Target, Merging Intervals with Metadata

Lesson Sections

  1. Meeting Rooms: How Many Rooms Do You Need? (concepts: pyMeetingRooms, pyMinHeapScheduling, pySweepTwoPointer)

    Meeting rooms II is the most common interval problem in on-site data engineering interviews at big tech. The question: given a list of meeting intervals, find the minimum number of meeting rooms required to hold all meetings simultaneously. The naive read is 'find the maximum number of overlapping intervals at any point in time.' There are two clean approaches. The heap approach is more intuitive; the event sweep approach is more generalizable. Know both. Approach 1: Min-Heap on End Times Sort m

  2. Sweep Line: Events Drive Everything (concepts: pySweepLine, pyCoverageGaps, pyEventSort)

    The sweep line algorithm is the generalization of the two-pointer meeting rooms approach. Convert every interval into two events: a +1 at the start and a -1 at the end. Sort all events by time (breaking ties: end events before start events at the same time — more on this shortly). Sweep through events, maintaining a running count. The running count at any moment is the number of active intervals. From this single sweep you can read off: maximum overlap, total covered time, gap locations, and cov

  3. Interval Trees and SortedList: When O(log n) Stabbing Queries Matter (concepts: pyIntervalTree, pySortedList, pyStabbingQuery)

    Sort-merge covers the batch case perfectly: you have all intervals upfront, sort once, scan once. But what happens when intervals arrive dynamically — one at a time — and you need to answer 'which intervals contain point P?' after each insertion? Sorting the entire list and scanning is O(n) per query. An interval tree does it in O(log n + k) where k is the number of matching intervals. This is the data structure that makes range queries on dynamic datasets practical at scale. What an Interval Tr

  4. Range Coverage: Minimum Intervals to Cover a Target (concepts: pyRangeCoverage, pyGreedyCoverage, pyJumpGame)

    The range coverage problem: given a target range [start, end] and a list of intervals, find the minimum number of intervals needed to fully cover the target. This appears in data engineering as: 'Given a set of pipeline jobs that each cover a time range, what is the minimum number of jobs needed to ensure no data gap in the target window?' The greedy strategy is: sort by start, greedily pick the interval that extends coverage furthest. Greedy Coverage Algorithm The greedy correctness argument: a

  5. Merging Intervals with Metadata (concepts: pyIntervalMetadata, pyAggregationSemantics, pyLogWindowMerge)

    In production, intervals are never just [start, end]. They carry data: error counts for a time window, user IDs in a session, labels from an anomaly detector. When you merge overlapping intervals, you must also merge their payloads. How you merge the payload depends on what it represents: sums for counts, unions for label sets, weighted averages for metrics. This is where the algorithm meets real data engineering — and it is what makes interval merging a non-trivial production problem. Merging L

Related

  • All Lessons
  • Practice Problems
  • Mock Interview Practice
  • Daily Challenges