Merge Intervals — Sort-by-End Breaks Greedy Invariant
Sort-by-end broke the greedy invariant: [1,10],[2,3],[4,5] merged to [2,3],[4,10] instead of [1,10].
- Sort by start time: O(n log n). This is mandatory — without it, you need O(n^2) comparisons.
- Overlap condition: currentStart <= lastEnd. Use strict < for non-overlap check.
- Merge: update lastEnd = max(lastEnd, currentEnd). Never assign directly.
- Time: O(n log n). Space: O(n) for output list.
- Calendar apps merge busy slots to show free time.
- Database query optimizers merge overlapping index range scans.
- Firewall engines merge CIDR blocks before evaluating packet rules.
- Forgetting Math.max() when extending the merged end. A contained interval like [2,5] inside [1,10] shrinks the end to 5 instead of keeping 10.
Imagine you're booking meeting rooms for a busy office. Several meetings overlap — one runs 9-11am and another 10am-12pm, so you can just block off 9am-12pm as one big chunk. The merge intervals problem is exactly this: given a list of time slots (or any numeric ranges), collapse all the overlapping ones into the smallest set of non-overlapping blocks. It's the algorithm behind your calendar app when it shows you a clean schedule instead of a jumbled mess.
Interval merging is a foundational pattern in calendar systems, database query optimizers, network firewalls, and resource schedulers. Every system that deals with overlapping time ranges or numeric spans uses some variant of this algorithm.
The problem: given an unsorted list of intervals, collapse all overlapping pairs into consolidated ranges. The naive approach compares every pair: O(n^2). The optimal approach sorts by start time first, then merges in a single linear pass: O(n log n). The sort is the entire secret — it transforms the problem from pairwise comparison to sequential merging.
The common misconception is that the merge logic itself is the hard part. It is not. The merge is a simple one-line check: if currentStart <= lastEnd, update lastEnd = max(lastEnd, currentEnd). The hard part is recognizing that sorting by start time makes this single-pass approach correct. Without the sort, a later interval could overlap an earlier one that was already processed.
Worked Example — Merging Overlapping Intervals
Intervals: [[1,3],[2,6],[8,10],[15,18]].
- Sort by start time: already sorted → [[1,3],[2,6],[8,10],[15,18]].
- merged = [[1,3]]. Compare [2,6]: start=2 <= last_end=3, so merge → update last to [1,6].
- Compare [8,10]: start=8 > last_end=6, no overlap → append [8,10]. merged=[[1,6],[8,10]].
- Compare [15,18]: start=15 > last_end=10, no overlap → append. merged=[[1,6],[8,10],[15,18]].
- Result: [[1,6],[8,10],[15,18]].
For insert-interval [[1,5],[6,9]] with new interval [2,7]: 1. Add all intervals ending before new.start=2: [1,5] ends at 5 which is >= 2, skip. 2. Merge all overlapping: [1,5] and [2,7] → [1,7]; [6,9] start=6 <= 7 → merge to [1,9]. 3. Add remaining: none. Result: [[1,9]].
- Sort by start time: O(n log n). This is the key that makes the greedy approach work.
- Merge: if currentStart <= lastEnd, update lastEnd = max(lastEnd, currentEnd).
- Non-overlap: currentStart > lastEnd (strict). Append new interval.
- The sort guarantees that if the current interval doesn't overlap the last merged one, it doesn't overlap any earlier one.
- Most interval problems are variants of this pattern: insert interval, meeting rooms, free slots.
How Merge Intervals Works — Plain English and Algorithm
Given a list of intervals, merge all overlapping ones into consolidated intervals. Two intervals [a,b] and [c,d] overlap if c <= b (the second starts before or when the first ends).
Algorithm: 1. Sort intervals by their start time: O(n log n). 2. Initialize result with the first interval. 3. For each subsequent interval [start, end]: a. If start <= result[-1][1]: the current interval overlaps the last merged interval. Update result[-1][1] = max(result[-1][1], end). b. Else: no overlap. Append [start, end] to result. 4. Return result.
Worked example — merge [[1,3],[2,6],[8,10],[15,18]]: Sort: already sorted. result = [[1,3]]. [2,6]: 2<=3, overlap. Update result[-1] end to max(3,6)=6. result=[[1,6]]. [8,10]: 8>6, no overlap. Append. result=[[1,6],[8,10]]. [15,18]: 15>10, no overlap. Append. result=[[1,6],[8,10],[15,18]]. Final: [[1,6],[8,10],[15,18]].
- a[0] - b[0]: overflows on extreme values. Wrong sort order.
- Integer.compare(a[0], b[0]): overflow-safe. Correct sort order.
- Comparator.comparingInt(a -> a[0]): overflow-safe, readable, recommended.
- In production: always use Integer.compare or Comparator.comparingInt.
- In interviews: mention the overflow risk to signal production awareness.
Why Sorting First Is the Entire Secret to This Problem
Here's the most have to compare every interval against every other interval — that's O(n²) comparisons. Sort first by start time and suddenly you only ever need to look at the PREVIOUS merged interval. Why? Because once intervals are sorted by start, if the current interval's start is beyond the previous interval's end, no future interval can possibly overlap the previous one either. You can commit to that merge and move on.
Think of it like sorting your mail by date before filing. Once you've sorted it, you can walk through the pile once, stacking letters that belong in the same date range together, and never look backwards.
The overlap condition itself is: two intervals [a, b] and [c, d] overlap if and only if c <= b. That is, the new interval starts BEFORE or exactly when the current one ends. When they overlap, the merged interval is [a, max(b, d)] — you keep the earlier start and push the end out as far as needed. The max() call is critical important insight in this whole article: without sorting, you'd because the new interval might be entirely contained inside the existing one.
- Sort by start: ensures intervals are processed in start-time order.
- Greedy invariant: if current.start > lastMerged.end, no earlier interval can overlap current.
- Proof: lastMerged.end >= all earlier intervals' ends (because merging extends the end).
- If current.start > lastMerged.end, then current.start > all earlier ends. No overlap possible.
- This is why sorting is mandatory — without it, the greedy invariant breaks.
Handling the Sneaky Edge Cases That Trip Up Most Candidates
The basic algorithm handles obvious overlaps, but interviews are won or lost on edge cases. There are three you need to have cold.
FIRST: Touching intervals. Do [1,4] and [4,7] overlap? That depends on the problem definition. The standard LeetCode version treats touching as overlapping, meaning [1,4] and [4,7] become [1,7]. Our condition lastEnd < currentStart handles this correctly — if they're equal (touching), the condition is false, so they merge. Make sure you're using strict less-than, not less-than-or-equal.
SECOND: A fully contained interval. If you have [1,10] followed by [2,5], the second is completely swallowed by the first. The Math.max() call saves you here — max(10,5) correctly keeps 10 as the end.
THIRD: A single interval input, or an empty array. Both are handled naturally by the loop — empty input returns empty output, single input goes straight into mergedList and comes back unchanged. Always test your solution with these before an interview.
FOURTH (the real trap): an already-sorted input with no overlaps at all. Every interval gets added independently. This is the worst case for interview candidates who hard-code assumptions about input state.
- mergedList.add(currentInterval): stores reference to original array. Caller mutation corrupts result.
- mergedList.add(new int[]{start, end}): stores defensive copy. Caller mutation does not affect result.
- Test: modify input after merge, check if result changes. If it does, defensive copy is missing.
- In production: always use defensive copies. In interviews: mention it to signal production awareness.
- This applies to any function that stores references to mutable input objects.
Real-World Patterns — Where Merge Intervals Actually Shows Up in Production
Understanding a problem in isolation is only half the job. Knowing when to reach for this pattern in the wild is what separates a strong engineer from someone who just passed LeetCode.
CALENDAR & SCHEDULING: Any system that shows you free/busy time blocks merges intervals under the hood. When you check someone's availability in Google Calendar, it collapses all their overlapping events before showing you the gaps.
DATABASE INDEX SCANS: Query optimizers merge overlapping range scans on indexed columns. If you ask for rows where age between 20 and 40, AND separately where age between 35 and 55, the engine merges these into one scan from 20 to 55.
NETWORK & SECURITY: Firewall rules often specify IP ranges. Before applying rules, systems merge overlapping CIDR blocks to reduce the number of rules evaluated per packet.
VIDEO/AUDIO EDITING: Timelines merge overlapping clips or subtitle ranges. Any NLE (non-linear editor) uses this when you apply a fade to overlapping audio regions.
The variant problems you'll encounter include: finding the total length covered by all intervals (sum of merged lengths), finding the gaps between merged intervals (invert the merged list), and the insert interval problem (insert a new interval and re-merge).
- Step 1: Merge each person's busy slots independently.
- Step 2: Find free slots for each person (gaps between merged busy slots).
- Step 3: Find intersection of the two free-slot lists. This is another interval problem.
- Alternative: merge both persons' busy slots together, then find gaps. Simpler but less flexible.
- The pattern composes: merge → invert → intersect. Each step is an interval operation.
Calendar Service Double-Booked Users: Sorted by End Time Instead of Start Time
- Sort by start time, never by end time. The greedy single-pass approach only works with start-time ordering.
- The merge invariant is: 'if current interval doesn't overlap the last merged interval, it doesn't overlap any earlier one.' This only holds with sort-by-start.
- Test with a long interval followed by short contained intervals: [[1,10],[2,3],[4,5]]. Expected: [[1,10]].
- Add post-merge validation: verify no output interval overlaps another. This catches sort-order bugs.
- Document the sort requirement in code comments. Future developers will not know why start-time sorting is mandatory.
Key takeaways
currentStart <= lastMergedEnd (strictMath.max() on the end — never a plain assignment — to handle fully contained intervals correctly.new int[]{start, end}) to prevent input mutation from silently corrupting your output.Interview Questions on This Topic
Frequently Asked Questions
That's Arrays & Strings. Mark it forged?
4 min read · try the examples if you haven't