Merge Intervals — Sort-by-End Breaks Greedy Invariant
Sort-by-end broke the greedy invariant: [1,10],[2,3],[4,5] merged to [2,3],[4,10] instead of [1,10].
20+ years shipping performance-critical code where algorithms decide the bill. Notes here come from systems that actually shipped.
- Sort by start time: O(n log n). This is mandatory — without it, you need O(n^2) comparisons.
- Overlap condition: currentStart <= lastEnd. Use strict < for non-overlap check.
- Merge: update lastEnd = max(lastEnd, currentEnd). Never assign directly.
- Time: O(n log n). Space: O(n) for output list.
- Calendar apps merge busy slots to show free time.
- Database query optimizers merge overlapping index range scans.
- Firewall engines merge CIDR blocks before evaluating packet rules.
- Forgetting Math.max() when extending the merged end. A contained interval like [2,5] inside [1,10] shrinks the end to 5 instead of keeping 10.
Imagine you're booking meeting rooms for a busy office. Several meetings overlap — one runs 9-11am and another 10am-12pm, so you can just block off 9am-12pm as one big chunk. The merge intervals problem is exactly this: given a list of time slots (or any numeric ranges), collapse all the overlapping ones into the smallest set of non-overlapping blocks. It's the algorithm behind your calendar app when it shows you a clean schedule instead of a jumbled mess.
Interval merging is a foundational pattern in calendar systems, database query optimizers, network firewalls, and resource schedulers. Every system that deals with overlapping time ranges or numeric spans uses some variant of this algorithm.
The problem: given an unsorted list of intervals, collapse all overlapping pairs into consolidated ranges. The naive approach compares every pair: O(n^2). The optimal approach sorts by start time first, then merges in a single linear pass: O(n log n). The sort is the entire secret — it transforms the problem from pairwise comparison to sequential merging.
The common misconception is that the merge logic itself is the hard part. It is not. The merge is a simple one-line check: if currentStart <= lastEnd, update lastEnd = max(lastEnd, currentEnd). The hard part is recognizing that sorting by start time makes this single-pass approach correct. Without the sort, a later interval could overlap an earlier one that was already processed.
Merge Intervals — The Sorting Trap That Breaks Greedy Invariants
The merge intervals problem asks: given a collection of intervals [start, end], merge all overlapping intervals into a single non-overlapping range. The core mechanic is simple — two intervals overlap if one's start ≤ the other's end — but the sorting order you choose determines whether the greedy algorithm holds. Sort by start time, and you can merge in a single pass: keep a current interval, extend its end if the next start ≤ current end, else push and reset. This works because sorting by start guarantees that any later interval cannot start before an earlier one, preserving the invariant that the current interval is always the earliest unmerged range.
In practice, the critical property is that sorting by end time breaks this invariant. If you sort by end, a later interval might start before an earlier one's start, causing the greedy pass to miss overlaps. For example, intervals [1,5], [6,10], [2,3] sorted by end become [2,3], [1,5], [6,10]; merging greedily would produce [2,5], [6,10] instead of the correct [1,10]. The algorithm runs in O(n log n) due to sorting, with O(n) for the merge pass and O(1) extra space if you reuse the input list.
Use merge intervals whenever you need to collapse overlapping time ranges, IP address blocks, or resource allocation windows. It's the foundation for calendar conflict detection, network route aggregation, and memory region coalescing. In real systems, the sorting choice isn't academic — a production bug in a scheduling service that sorted by end instead of start caused a 30-minute window of double-booked resources every day until the root cause was found.
Worked Example — Merging Overlapping Intervals
Intervals: [[1,3],[2,6],[8,10],[15,18]].
- Sort by start time: already sorted → [[1,3],[2,6],[8,10],[15,18]].
- merged = [[1,3]]. Compare [2,6]: start=2 <= last_end=3, so merge → update last to [1,6].
- Compare [8,10]: start=8 > last_end=6, no overlap → append [8,10]. merged=[[1,6],[8,10]].
- Compare [15,18]: start=15 > last_end=10, no overlap → append. merged=[[1,6],[8,10],[15,18]].
- Result: [[1,6],[8,10],[15,18]].
For insert-interval [[1,5],[6,9]] with new interval [2,7]: 1. Add all intervals ending before new.start=2: [1,5] ends at 5 which is >= 2, skip. 2. Merge all overlapping: [1,5] and [2,7] → [1,7]; [6,9] start=6 <= 7 → merge to [1,9]. 3. Add remaining: none. Result: [[1,9]].
- Sort by start time: O(n log n). This is the key that makes the greedy approach work.
- Merge: if currentStart <= lastEnd, update lastEnd = max(lastEnd, currentEnd).
- Non-overlap: currentStart > lastEnd (strict). Append new interval.
- The sort guarantees that if the current interval doesn't overlap the last merged one, it doesn't overlap any earlier one.
- Most interval problems are variants of this pattern: insert interval, meeting rooms, free slots.
How Merge Intervals Works — Plain English and Algorithm
Given a list of intervals, merge all overlapping ones into consolidated intervals. Two intervals [a,b] and [c,d] overlap if c <= b (the second starts before or when the first ends).
Algorithm: 1. Sort intervals by their start time: O(n log n). 2. Initialize result with the first interval. 3. For each subsequent interval [start, end]: a. If start <= result[-1][1]: the current interval overlaps the last merged interval. Update result[-1][1] = max(result[-1][1], end). b. Else: no overlap. Append [start, end] to result. 4. Return result.
Worked example — merge [[1,3],[2,6],[8,10],[15,18]]: Sort: already sorted. result = [[1,3]]. [2,6]: 2<=3, overlap. Update result[-1] end to max(3,6)=6. result=[[1,6]]. [8,10]: 8>6, no overlap. Append. result=[[1,6],[8,10]]. [15,18]: 15>10, no overlap. Append. result=[[1,6],[8,10],[15,18]]. Final: [[1,6],[8,10],[15,18]].
- a[0] - b[0]: overflows on extreme values. Wrong sort order.
- Integer.compare(a[0], b[0]): overflow-safe. Correct sort order.
- Comparator.comparingInt(a -> a[0]): overflow-safe, readable, recommended.
- In production: always use Integer.compare or Comparator.comparingInt.
- In interviews: mention the overflow risk to signal production awareness.
Why Sorting First Is the Entire Secret to This Problem
Here's the most have to compare every interval against every other interval — that's O(n²) comparisons. Sort first by start time and suddenly you only ever need to look at the PREVIOUS merged interval. Why? Because once intervals are sorted by start, if the current interval's start is beyond the previous interval's end, no future interval can possibly overlap the previous one either. You can commit to that merge and move on.
Think of it like sorting your mail by date before filing. Once you've sorted it, you can walk through the pile once, stacking letters that belong in the same date range together, and never look backwards.
The overlap condition itself is: two intervals [a, b] and [c, d] overlap if and only if c <= b. That is, the new interval starts BEFORE or exactly when the current one ends. When they overlap, the merged interval is [a, max(b, d)] — you keep the earlier start and push the end out as far as needed. The max() call is critical important insight in this whole article: without sorting, you'd because the new interval might be entirely contained inside the existing one.
- Sort by start: ensures intervals are processed in start-time order.
- Greedy invariant: if current.start > lastMerged.end, no earlier interval can overlap current.
- Proof: lastMerged.end >= all earlier intervals' ends (because merging extends the end).
- If current.start > lastMerged.end, then current.start > all earlier ends. No overlap possible.
- This is why sorting is mandatory — without it, the greedy invariant breaks.
Handling the Sneaky Edge Cases That Trip Up Most Candidates
The basic algorithm handles obvious overlaps, but interviews are won or lost on edge cases. There are three you need to have cold.
FIRST: Touching intervals. Do [1,4] and [4,7] overlap? That depends on the problem definition. The standard LeetCode version treats touching as overlapping, meaning [1,4] and [4,7] become [1,7]. Our condition lastEnd < currentStart handles this correctly — if they're equal (touching), the condition is false, so they merge. Make sure you're using strict less-than, not less-than-or-equal.
SECOND: A fully contained interval. If you have [1,10] followed by [2,5], the second is completely swallowed by the first. The Math.max() call saves you here — max(10,5) correctly keeps 10 as the end.
THIRD: A single interval input, or an empty array. Both are handled naturally by the loop — empty input returns empty output, single input goes straight into mergedList and comes back unchanged. Always test your solution with these before an interview.
FOURTH (the real trap): an already-sorted input with no overlaps at all. Every interval gets added independently. This is the worst case for interview candidates who hard-code assumptions about input state.
- mergedList.add(currentInterval): stores reference to original array. Caller mutation corrupts result.
- mergedList.add(new int[]{start, end}): stores defensive copy. Caller mutation does not affect result.
- Test: modify input after merge, check if result changes. If it does, defensive copy is missing.
- In production: always use defensive copies. In interviews: mention it to signal production awareness.
- This applies to any function that stores references to mutable input objects.
Real-World Patterns — Where Merge Intervals Actually Shows Up in Production
Understanding a problem in isolation is only half the job. Knowing when to reach for this pattern in the wild is what separates a strong engineer from someone who just passed LeetCode.
CALENDAR & SCHEDULING: Any system that shows you free/busy time blocks merges intervals under the hood. When you check someone's availability in Google Calendar, it collapses all their overlapping events before showing you the gaps.
DATABASE INDEX SCANS: Query optimizers merge overlapping range scans on indexed columns. If you ask for rows where age between 20 and 40, AND separately where age between 35 and 55, the engine merges these into one scan from 20 to 55.
NETWORK & SECURITY: Firewall rules often specify IP ranges. Before applying rules, systems merge overlapping CIDR blocks to reduce the number of rules evaluated per packet.
VIDEO/AUDIO EDITING: Timelines merge overlapping clips or subtitle ranges. Any NLE (non-linear editor) uses this when you apply a fade to overlapping audio regions.
The variant problems you'll encounter include: finding the total length covered by all intervals (sum of merged lengths), finding the gaps between merged intervals (invert the merged list), and the insert interval problem (insert a new interval and re-merge).
- Step 1: Merge each person's busy slots independently.
- Step 2: Find free slots for each person (gaps between merged busy slots).
- Step 3: Find intersection of the two free-slot lists. This is another interval problem.
- Alternative: merge both persons' busy slots together, then find gaps. Simpler but less flexible.
- The pattern composes: merge → invert → intersect. Each step is an interval operation.
The Off-by-One Nightmare — Closed vs. Half-Open Intervals
Here's where most merge interval code silently fails: interval boundary semantics.
Is [1,4] and [4,5] overlapping? Depends on your definition. If intervals are closed (both sides inclusive), they touch but don't overlap. Merge them and you get [1,5]. But some production systems treat intervals as half-open — [1,4) and [4,5) don't overlap because 4 belongs to the second range. Your merge code needs to know which space it lives in.
Calendar booking APIs use half-open intervals. Think 9:00-10:00 means you can book 10:00-11:00 right after. Video timeline segmenters use closed intervals — frame 100 and frame 100 overlap if one segment ends at 100 and another starts there. Mix them up and you'll either double-count edges or drop valid overlaps.
The fix is explicit comparators. Don't rely on "first endpoint < second startpoint" magic. Write a comparator enum: CLOSED vs HALF_OPEN. Then your merge logic becomes a single configurable condition. Your 4 AM page will thank you.
What No One Tells You: Your Mutable Intervals Are a Liability
I've seen production systems that merge millions of intervals per second — real-time ad slot allocation, network flow logs, financial trade windows. The rookie move? Mutating the original intervals in-place to save an allocation. Sounds fast, right? Wrong. You just created a debugging nightmare and a potential data race.
Every merge operation that modifies an existing interval's end value invalidates the original data for anyone holding a reference. In Java, that means downstream consumers read corrupted ranges. The fix: create new Interval objects for the merged result. Yes, it costs allocation. No, you don't care unless you're running on a toaster.
Use a List<Interval> output buffer. Append a new Interval(start, maxEnd) each time you merge. The original input stays pristine. Your future self — or the poor soul inheriting your code — will find the bug within minutes instead of hours. Defensive copying isn't a nice-to-have. It's a contractual reliability guarantee.
On a JVM, object allocation is cheap for single-threaded work. The real cost is cache misses from pointer chasing. Profile first, optimize never.
List.copyOf() instead of the ArrayList itself. Callers can't cast and mutate. If they need a mutable list, they copy it. That's their problem.Calendar Service Double-Booked Users: Sorted by End Time Instead of Start Time
- Sort by start time, never by end time. The greedy single-pass approach only works with start-time ordering.
- The merge invariant is: 'if current interval doesn't overlap the last merged interval, it doesn't overlap any earlier one.' This only holds with sort-by-start.
- Test with a long interval followed by short contained intervals: [[1,10],[2,3],[4,5]]. Expected: [[1,10]].
- Add post-merge validation: verify no output interval overlaps another. This catches sort-order bugs.
- Document the sort requirement in code comments. Future developers will not know why start-time sorting is mandatory.
Print sort order: System.out.println(Arrays.deepToString(intervals)) — verify intervals are sorted by index 0Test with unsorted input [[3,4],[1,2],[5,6]] — expected [[1,2],[3,4],[5,6]]Key takeaways
currentStart <= lastMergedEnd (strictMath.max() on the end — never a plain assignment — to handle fully contained intervals correctly.new int[]{start, end}) to prevent input mutation from silently corrupting your output.Practice These on LeetCode
Interview Questions on This Topic
Frequently Asked Questions
20+ years shipping performance-critical code where algorithms decide the bill. Notes here come from systems that actually shipped.
That's Arrays & Strings. Mark it forged?
7 min read · try the examples if you haven't