Divide and Conquer — Integer Overflow at 2^30 Elements

mid = (low + high) / 2 overflows once an array passes 2^30 elements, producing negative indices and segfaults.
🧑‍💻 Beginner-friendly — no prior DSA experience needed
In this tutorial, you'll learn
  • Three steps: Divide into equal subproblems, Conquer recursively (base case when trivially small), Combine — the combine step determines the overall complexity.
  • Master Theorem for T(n) = aT(n/b) + O(n^c): compare log_b(a) with c. Equal → O(n^c log n). Less → O(n^c). Greater → O(n^log_b(a)).
  • When D&C pays off: independent subproblems (no overlap), balanced splits (near-equal halves), efficient combine (O(n) or less). Imbalanced splits degrade to O(n^2).
✦ Plain-English analogy ✦ Real code with output ✦ Interview questions
Quick Answer
  • Divide and conquer solves problems by splitting into smaller identical subproblems, solving recursively, and combining.
  • Three steps: Divide (split), Conquer (solve recursively), Combine (merge).
  • Complexity is driven by the combine step — with two halves, an O(n) combine gives O(n log n); with a single recursive call (binary search), an O(1) combine gives O(log n).
  • Master Theorem cases: compare log_b(a) to exponent c in the combine term.
  • Production trap: imbalanced splits (e.g., quick sort worst-case) degrade to O(n^2).
🚨 START HERE

Quick Debug Cheat Sheet for Divide and Conquer

When your D&C algorithm breaks in production, use these commands and checks to diagnose the issue fast.
🟡

Recursion never terminates

Immediate Action: Add a depth limit counter and log when exceeded.
Commands
strace -p <pid> 2>&1 | grep -c SIGSEGV (Linux) — stack overflow arrives as a segfault
ulimit -a | grep stack to see stack limit
Fix Now: Increase the stack size with `ulimit -s 65536`, or set the JVM flag `-Xss4m`.
🟡

Wrong result for large input

Immediate Action: Reproduce with size 2 and size 3 first. Then try a size that is a power of 2.
Commands
python -c "print((2**31 - 1) + 1)" prints 2147483648, the first value past INT_MAX — Python's ints won't overflow, but a 32-bit int in Java/C++ will
Add `print(mid)` in Python or `System.out.println(mid)` in Java before each recursive call.
Fix Now: Apply `mid = low + (high - low) // 2`.
🟠

CPU at 100% but no progress

Immediate Action: Attach strace to see system calls. Likely an infinite loop in the combine step.
Commands
strace -p <pid> -e trace=write,read 2>&1 | head -100
jstack <pid> (Java) to check threads stuck in recursion.
Fix Now: Force-kill the process, fix the base case condition, redeploy.
🔴

OutOfMemoryError in Java / MemoryError in Python

Immediate Action: Check whether the algorithm creates new arrays on each recursive call. Use indices instead.
Commands
jmap -heap <pid> (Java) to check heap usage
In Python, call sys.getsizeof() on the copies each recursive call creates.
Fix Now: Refactor to use indices into a shared array.
Production Incident

Binary Search on Rotated Array — The Overflow Bug

A financial data pipeline used binary search on a rotated sorted array to find a timestamp pivot. In production, the search started returning wrong indexes under high load, corrupting time-series merges.
Symptom: Random segmentation faults and wrong pivot positions for arrays larger than 2^30 elements. The error was intermittent, appearing only under heavy concurrent access.
Assumption: Binary search on rotated arrays is always O(log n) — the algorithm was proven correct in unit tests on small arrays.
Root cause: The divide step used mid = (low + high) // 2. When low + high exceeded INT_MAX (~2^31), integer overflow produced a negative mid, causing out-of-bounds access. Production arrays had grown past 2^30 (over a billion) elements.
Fix: Changed to mid = low + (high - low) // 2 to avoid overflow. Also added a base case that switches to a linear scan for segments smaller than 2^20 elements, keeping recursion shallower than log2 of the maximum array size.
Key Lesson
Integer overflow in the mid calculation is a classic bug in divide and conquer — it manifests silently at scale. Always use low + (high - low) / 2 to compute mid for large arrays. Base-case termination thresholds must consider both recursion depth and input size to avoid stack overflow.
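To see the root cause concretely, here is a minimal sketch that simulates the bug in Python. Python's own integers never overflow, so the 32-bit wraparound is emulated with a hypothetical to_int32 helper; the index values are illustrative.

overflow_demo.py · PYTHON
def to_int32(x):
    # Emulate 32-bit signed wraparound (hypothetical helper for demonstration)
    x &= 0xFFFFFFFF
    return x - 2**32 if x >= 2**31 else x

# In an array of just over 2^30 elements, low and high can both reach 2^30
low = high = 2**30
buggy_mid = to_int32(low + high) // 2        # low + high wraps past INT_MAX
safe_mid  = low + to_int32(high - low) // 2  # the difference never exceeds INT_MAX

print(buggy_mid)  # negative index -> out-of-bounds access
print(safe_mid)
▶ Output
-1073741824
1073741824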
Production Debug Guide

Common symptoms and actions when D&C algorithms misbehave at scale

Stack overflow on large inputs: Check recursion depth. Set a base-case threshold (e.g., n < 1000) and use iteration for small subproblems. Increase the stack size via compiler flags or JVM -Xss.
Algorithm returns wrong results intermittently: Log the divide indices and combine results. Most likely cause: integer overflow in the mid calculation or an incorrect base-case return.
Performance degraded to O(n^2) on real data: Profile the combine step. If it's O(n^2), the combine dominates and the whole algorithm is Θ(n^2) (Master Theorem Case 3). Also check whether subproblems are balanced — imbalanced splits (e.g., quick sort on sorted data) cause quadratic behaviour.
Memory usage spikes during recursion: Check whether each recursive call creates new data copies (e.g., list slicing in Python, Arrays.copyOfRange in Java). Use indices or views instead of copying entire arrays.

Divide and conquer is the algorithmic pattern behind merge sort, binary search, fast Fourier transform, Strassen's matrix multiplication, and closest pair of points. If you understand D&C deeply, you understand why all of these algorithms work and how to design new ones.

The pattern is deceptively simple: split the problem, solve each half, combine. The complexity depends almost entirely on the combine step — and the Master Theorem gives you an algebraic shortcut to avoid solving the recurrence from scratch each time. Learning the Master Theorem once saves you dozens of recurrence derivations throughout your career.

The Three-Step Pattern

  1. Divide: Split the problem into subproblems (usually of equal or near-equal size)
  2. Conquer: Solve each subproblem recursively (base case when small enough)
  3. Combine: Merge subproblem solutions into the overall solution

The efficiency depends critically on the combine step. Merge sort's O(n log n) comes from an O(n) combine. Binary search has O(1) combine (no merging needed).

That's it. If you can describe your algorithm as split, solve halves, merge results, you have a divide-and-conquer algorithm. The rest is just details about how fast that merge is.

dc_template.py · PYTHON
def divide_and_conquer(problem):
    # Base case
    if is_small_enough(problem):
        return solve_directly(problem)
    # Divide
    subproblems = divide(problem)
    # Conquer
    solutions = [divide_and_conquer(sub) for sub in subproblems]
    # Combine
    return combine(solutions)
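The four helpers (is_small_enough, solve_directly, divide, combine) are problem-specific placeholders. As a minimal runnable instantiation — chosen for clarity, not because a plain loop wouldn't be simpler — here is the template specialised to finding the maximum of a list; dc_max is a hypothetical name for this sketch:

dc_max.py · PYTHON
def dc_max(arr):
    # Base case: a single element is its own maximum
    if len(arr) == 1:
        return arr[0]
    # Divide: split at the midpoint
    mid = len(arr) // 2
    # Conquer: solve each half recursively
    left, right = dc_max(arr[:mid]), dc_max(arr[mid:])
    # Combine: one O(1) comparison, so T(n) = 2T(n/2) + O(1) -> O(n) (Case 1)
    return left if left >= right else right

print(dc_max([3, 7, 2, 9, 4]))  # 9
▶ Output
9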
📊 Production Insight
The combine step is where production failures hide. An O(n) combine keeps the total at O(n log n); an O(n^2) combine drags it to Θ(n^2) (Master Theorem Case 3), so the recursion adds overhead without asymptotic benefit.
Always measure the combine's complexity before you ship.
🎯 Key Takeaway
Every D&C algorithm follows divide, conquer, combine.
The combine step determines the final time complexity.

The Master Theorem

For recurrences of the form T(n) = aT(n/b) + f(n) where a ≥ 1, b > 1:

  • Case 1: f(n) = O(n^(log_b(a) - ε)) → T(n) = Θ(n^log_b(a))
  • Case 2: f(n) = Θ(n^log_b(a)) → T(n) = Θ(n^log_b(a) · log n)
  • Case 3: f(n) = Ω(n^(log_b(a) + ε)) → T(n) = Θ(f(n))

Examples:
  • Merge sort: T(n) = 2T(n/2) + O(n) → Case 2 → O(n log n)
  • Binary search: T(n) = T(n/2) + O(1) → Case 2 → O(log n)
  • Strassen: T(n) = 7T(n/2) + O(n^2) → Case 1 → O(n^2.807)
  • Simple recursion: T(n) = 2T(n/2) + O(1) → Case 1 → O(n)

🔥 Quick Master Theorem Check
For T(n) = aT(n/b) + n^c: if log_b(a) < c → O(n^c). If log_b(a) = c → O(n^c log n). If log_b(a) > c → O(n^log_b(a)). The larger of the recursive work vs combine work dominates.
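As a sketch, that three-way comparison mechanises into a few lines of Python. The master_theorem helper below is hypothetical and ignores Case 3's regularity condition, which holds for the ordinary polynomial combine costs seen here:

master_check.py · PYTHON
import math

def master_theorem(a, b, c):
    # Classify T(n) = a*T(n/b) + Theta(n^c)
    crit = math.log(a, b)              # critical exponent log_b(a)
    if math.isclose(crit, c):
        return f"Theta(n^{c} log n)"   # Case 2: work balanced across levels
    if crit < c:
        return f"Theta(n^{c})"         # Case 3: combine dominates
    return f"Theta(n^{crit:.3f})"      # Case 1: recursion dominates

print(master_theorem(2, 2, 1))  # merge sort    -> Theta(n^1 log n)
print(master_theorem(1, 2, 0))  # binary search -> Theta(n^0 log n) = Theta(log n)
print(master_theorem(7, 2, 2))  # Strassen      -> Theta(n^2.807)
▶ Output
Theta(n^1 log n)
Theta(n^0 log n)
Theta(n^2.807)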
📊 Production Insight
The Master Theorem assumes equal-sized subproblems. Real-world data often breaks that — quick sort's worst-case T(n)=T(n-1)+T(0)+n falls outside Master Theorem scope.
When b varies dynamically, use Akra-Bazzi or trace the actual recursion tree.
🎯 Key Takeaway
For T(n) = aT(n/b) + n^c: compare log_b(a) with c.
Equal → O(n^c log n). Less → O(n^c). Greater → O(n^log_b(a)).
This covers 90% of recurrences you'll see.

Canonical Examples

These three algorithms are the classic divide and conquer triad — they show how the same pattern yields wildly different complexities depending on the combine step.

Binary Search: Divide the search space in half, conquer by checking the mid, combine is O(1) (just return mid or recurse). Result: O(log n).

Merge Sort: Divide at midpoint, conquer recursively, combine by merging two sorted halves in O(n) time. Result: O(n log n).

Fast Power: Divide exponent by 2, conquer by squaring the half-result, combine is O(1) multiplication. Result: O(log n) versus naive O(n).

The key insight: the combine step is the only thing that changes between these three. The divide and conquer skeleton stays the same.

dc_examples.py · PYTHON
# Binary Search — O(log n)
def binary_search(arr, target, lo, hi):
    if lo > hi: return -1
    mid = (lo + hi) // 2
    if arr[mid] == target: return mid
    if arr[mid] < target: return binary_search(arr, target, mid+1, hi)
    return binary_search(arr, target, lo, mid-1)

# Merge Sort — O(n log n)
def merge_sort(arr):
    if len(arr) <= 1: return arr
    mid = len(arr) // 2
    left  = merge_sort(arr[:mid])
    right = merge_sort(arr[mid:])
    return merge(left, right)

def merge(left, right):
    result, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]: result.append(left[i]); i += 1
        else:                   result.append(right[j]); j += 1
    return result + left[i:] + right[j:]

# Power function — O(log n)
def fast_power(base, exp):
    if exp == 0: return 1
    if exp % 2 == 0:
        half = fast_power(base, exp // 2)
        return half * half
    return base * fast_power(base, exp - 1)

print(fast_power(2, 10))  # 1024
▶ Output
1024
📊 Production Insight
Notice the binary search uses mid = (lo + hi) // 2. In production, for arrays larger than 2^30 elements, this overflows in Java and C++. Always use mid = lo + (hi - lo) // 2.
Also, the merge sort above creates new list slices — that's O(n) memory per recursion level. In production, use in-place indices.
🎯 Key Takeaway
Binary search: combine = O(1), total O(log n). Merge sort: combine = O(n), total O(n log n). Fast power: combine = O(1), total O(log n).
The combine step is the only differentiator.
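As the insight above suggests, production merge sorts pass indices and reuse one auxiliary buffer instead of slicing. A minimal sketch with a hypothetical merge_sort_inplace that sorts the list in place using O(n) total extra memory:

merge_sort_indices.py · PYTHON
def merge_sort_inplace(arr):
    aux = arr[:]  # one shared buffer, allocated once

    def sort(lo, hi):  # sorts arr[lo:hi]
        if hi - lo <= 1:
            return
        mid = lo + (hi - lo) // 2      # overflow-safe mid
        sort(lo, mid)
        sort(mid, hi)
        aux[lo:hi] = arr[lo:hi]        # snapshot both halves
        i, j = lo, mid
        for k in range(lo, hi):        # merge back into arr[lo:hi]
            if j >= hi or (i < mid and aux[i] <= aux[j]):
                arr[k] = aux[i]; i += 1
            else:
                arr[k] = aux[j]; j += 1

    sort(0, len(arr))
    return arr

print(merge_sort_inplace([5, 2, 9, 1, 5, 6]))
▶ Output
[1, 2, 5, 5, 6, 9]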

When Divide and Conquer Pays Off

D&C pays off when:
  1. Subproblems are independent (no shared state to communicate)
  2. Subproblems are of equal or balanced size (imbalanced splits degrade to O(n²))
  3. The combine step is efficient (O(n) or less)

D&C does NOT pay off when
  • Subproblems overlap heavily (use dynamic programming instead)
  • The combine step is expensive (an O(n²) combine makes the total Θ(n²): all of D&C's overhead with no gain over a naive quadratic algorithm)
📊 Production Insight
Quick sort is a great example of D&C failing in production: when the pivot picks a minimum/maximum, subproblems are size 1 and n-1, recurrence becomes T(n)=T(n-1)+O(n) → O(n²).
Mitigate by using median-of-three pivot or random pivot selection.
Always test on sorted, reverse-sorted, and nearly-sorted data.
🎯 Key Takeaway
If subproblems overlap, use DP.
If combine is O(n²), the total is Θ(n²), no better than a naive quadratic algorithm.
If splits are unbalanced, performance collapses to quadratic.
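To make the mitigation above concrete, here is a sketch of quick sort with random pivot selection (Lomuto partition; the function names are illustrative). A random pivot gives expected O(n log n) even on sorted input, because no fixed input order can consistently force the 1 / n-1 split:

quicksort_random_pivot.py · PYTHON
import random

def quicksort(arr, lo=0, hi=None):
    if hi is None:
        hi = len(arr) - 1
    if lo >= hi:
        return arr
    # Random pivot: defeats adversarial (sorted / reverse-sorted) inputs
    p = random.randint(lo, hi)
    arr[p], arr[hi] = arr[hi], arr[p]
    pivot, i = arr[hi], lo
    for j in range(lo, hi):            # Lomuto partition
        if arr[j] <= pivot:
            arr[i], arr[j] = arr[j], arr[i]
            i += 1
    arr[i], arr[hi] = arr[hi], arr[i]
    quicksort(arr, lo, i - 1)          # conquer both sides
    quicksort(arr, i + 1, hi)
    return arr

print(quicksort(list(range(10))))      # already-sorted input, no quadratic blow-up
▶ Output
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]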

Master Theorem Extended: When the Basic Cases Don't Apply

Not all recurrences fit the clean form T(n) = aT(n/b) + f(n). Real-world D&C algorithms often break the assumptions:

  1. Non-constant a or b: The number of subproblems or the split factor may vary per recursion level. Example: quick sort's pivot selection leads to varying b.
  2. Non-integer splits: Algorithms like Strassen's matrix multiplication split matrices into halves, but odd dimensions need padding — this breaks the clean n/b assumption.
  3. f(n) is non-polynomial: If the combine cost is Θ(n log n), the Master Theorem's polynomial comparison fails. You need the extended Case 2, which absorbs the extra log factor.

Akra-Bazzi Theorem handles variable subproblem sizes. For recurrences of the form T(n) = Σ a_i T(b_i n) + f(n), compute p such that Σ a_i (b_i)^p = 1. Then T(n) = Θ(n^p (1 + ∫_1^n f(u)/u^(p+1) du)).

This is rarely needed in interviews, but if you work on custom D&C algorithms (e.g., in numerical computing), it's essential.
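Since Σ a_i (b_i)^p is strictly decreasing in p when 0 < b_i < 1, the exponent can be found numerically by bisection. A sketch with a hypothetical akra_bazzi_p helper, checked against T(n) = T(n/3) + T(2n/3) + n, where p = 1 exactly:

akra_bazzi.py · PYTHON
def akra_bazzi_p(terms, lo=-10.0, hi=10.0, tol=1e-9):
    # terms = [(a_i, b_i), ...] with a_i > 0 and 0 < b_i < 1
    def g(p):
        return sum(a * b**p for a, b in terms) - 1.0
    # g decreases in p, so bisect for the root of g(p) = 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# T(n) = T(n/3) + T(2n/3) + n: (1/3)^p + (2/3)^p = 1 at p = 1,
# and the integral then yields Theta(n log n)
print(round(akra_bazzi_p([(1, 1/3), (1, 2/3)]), 6))
▶ Output
1.0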

Mental Model
The Big Picture: Master Theorem is a Shortcut, Not the Truth
The Master Theorem is a shortcut for the most common pattern — it's not the whole truth. Think of it as a precomputed lookup table for recurrences.
  • Works only when subproblems are equal size (b constant, a constant).
  • Requires f(n) to be a polynomial or polylog.
  • Case 2 has a more general form: if f(n) = Θ(n^c log^k n), then T(n) = Θ(n^c log^(k+1) n).
  • When assumptions break, you must trace the recursion tree or use Akra-Bazzi.
  • For interviews, 95% of questions use the basic three-case Master Theorem.
📊 Production Insight
Never assume a recurrence fits the Master Theorem without checking the base case size. Real datasets often produce odd-sized subproblems at deep recursion levels — you'll see runtime that doesn't match your theoretical bound.
Performance tests with powers-of-two input sizes hide these non-conforming cases.
Test with random sizes.
🎯 Key Takeaway
Master Theorem assumes equal splits. Real data doesn't.
Use Akra-Bazzi for variable split factors.
Always test runtime on non-power-of-two sizes.
🗂 Divide and Conquer vs Other Paradigms
When to choose which approach
Paradigm | Split | Subproblem Overlap | Combine | Example
Divide and Conquer | Equal/balanced | Independent | Merge | Merge Sort
Dynamic Programming | Any | Overlapping | Tabulation/store | Fibonacci (DP)
Greedy | Single path | Not applicable | Local decision | Huffman Coding
Backtracking | Explore all | Not applicable | Prune/revert | N-Queens

🎯 Key Takeaways

  • Three steps: Divide into equal subproblems, Conquer recursively (base case when trivially small), Combine — the combine step determines the overall complexity.
  • Master Theorem for T(n) = aT(n/b) + O(n^c): compare log_b(a) with c. Equal → O(n^c log n). Less → O(n^c). Greater → O(n^log_b(a)).
  • When D&C pays off: independent subproblems (no overlap), balanced splits (near-equal halves), efficient combine (O(n) or less). Imbalanced splits degrade to O(n^2).
  • When to use DP instead: when subproblems overlap (Fibonacci without memoisation is D&C with exponential redundancy — DP fixes this).
  • The fast_power function is the clearest minimal D&C example — O(log n) vs O(n) naive. Every cryptographic system relies on fast modular exponentiation built this way (see the sketch after this list).
  • Integer overflow in mid calculation is a silent killer at scale. Use low + (high-low)//2.
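On the cryptographic point above: production systems use the iterative form of fast power with a modulus applied at every step, so intermediate values stay small. A minimal sketch (mod_pow is a hypothetical name; it computes the same result as Python's built-in pow(base, exp, mod)):

mod_pow.py · PYTHON
def mod_pow(base, exp, mod):
    # Repeated squaring: O(log exp) multiplications, values stay below mod^2
    result = 1
    base %= mod
    while exp > 0:
        if exp & 1:                    # low bit set: fold this power in
            result = result * base % mod
        base = base * base % mod       # square for the next binary digit
        exp >>= 1
    return result

print(mod_pow(2, 10, 1000))                      # 1024 mod 1000
print(mod_pow(2, 10, 1000) == pow(2, 10, 1000))  # matches the built-in
▶ Output
24
True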

⚠ Common Mistakes to Avoid

    Using D&C when subproblems have heavy overlap
    Symptom

    The algorithm runs in exponential time where a linear DP exists (e.g., naive Fibonacci).

    Fix

    Check if subproblems are independent. If the same subproblem is solved many times, switch to dynamic programming with memoisation or tabulation.

    Ignoring the combine step's complexity
    Symptom

    Total runtime is orders of magnitude worse than expected (e.g., Θ(n²) instead of O(n log n)).

    Fix

    Derive the recurrence as T(n) = aT(n/b) + f(n) and apply the Master Theorem before coding. If f(n) is Θ(n^2) and a=b=2, Case 3 gives Θ(n^2): all the overhead of recursion with no asymptotic gain over a naive quadratic algorithm.

    Using integer division for mid calculation without overflow protection
    Symptom

    For large arrays (size > 2^30), mid becomes negative, causing ArrayIndexOutOfBounds or segmentation fault.

    Fix

    Use mid = low + (high - low) // 2 instead of (low + high) // 2. Python's arbitrary-precision integers make either form safe, but in Java, C++, and C# the guard is critical.

    Not handling odd-sized subproblems correctly
    Symptom

    The algorithm fails or has off-by-one errors for inputs not a power of two.

    Fix

    Ensure base cases cover all sizes. Test with input sizes 1, 2, 3, 4, 5, 7, 8, 9, 15, 16 (an oracle-test sketch follows this list). The divide step must work for any n > 1.

    Creating new arrays at each recursion level
    Symptom

    Each recursion level allocates fresh O(n) copies, so total allocations reach O(n log n) instead of a single O(n) buffer — causes OutOfMemory in production for large inputs.

    Fix

    Use indices into a shared array (e.g., merge sort with one auxiliary array). In Python, list slicing copies data — switch to indices for large inputs.
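A cheap way to enforce the odd-size and base-case checks above is an oracle test: compare your D&C sort against Python's sorted() at every small size. A sketch, assuming the merge_sort defined in dc_examples.py earlier in this article:

oracle_test.py · PYTHON
import random

def test_against_oracle(sort_fn, sizes=range(0, 17), trials=50):
    # Exercise every small size: odd, even, prime, power of two
    for n in sizes:
        for _ in range(trials):
            data = [random.randint(-100, 100) for _ in range(n)]
            assert sort_fn(list(data)) == sorted(data), f"failed at size {n}: {data}"
    print("all sizes passed")

test_against_oracle(merge_sort)  # merge_sort as defined in dc_examples.py above
▶ Output
all sizes passed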

Interview Questions on This Topic

  • Q (Junior): What are the three steps of divide and conquer?
    Divide the problem into smaller independent subproblems (usually equal size), Conquer each subproblem recursively (base case: solve directly), Combine the subproblem solutions into the overall solution. The combine step's complexity determines the total runtime.
  • Q (Mid-level): Apply the Master Theorem to T(n) = 3T(n/3) + O(n).
    Here a=3, b=3, f(n)=n. Compute log_b(a) = log_3(3) = 1. Compare with exponent c in f(n)=n^c: c=1. Since log_b(a) = c (1=1), this is Case 2: T(n) = Θ(n^c log n) = Θ(n log n).
  • Q (Mid-level): What makes quick sort's worst case O(n²) while merge sort is always O(n log n)?
    Quick sort's divide step depends on pivot selection. If the pivot is always the smallest or largest element (e.g., sorted input with first-element pivot), the split is 1 and n-1, leading to recurrence T(n)=T(n-1)+O(n), which solves to O(n²). Merge sort always divides exactly in half (equal split), so its recurrence is T(n)=2T(n/2)+O(n) → O(n log n). Merge sort's guarantee comes from the balanced divide step, not the combine.
  • Q (Senior): When should you use divide and conquer vs dynamic programming?
    Use D&C when subproblems are independent (no overlap). Use DP when subproblems overlap (the same subproblem is solved multiple times). D&C solves each subproblem freshly; DP stores results to avoid recomputation. Example: Fibonacci via recursion (D&C) is exponential; Fibonacci with memoisation (DP) is linear. A good rule: if you can identify identical subproblems in your recursion tree, switch to DP.
  • Q (Senior): Explain how to apply the Akra-Bazzi theorem to a recurrence with variable split sizes.
    Akra-Bazzi handles recurrences where subproblem sizes are fractions of n that may vary. For T(n) = Σ a_i T(b_i n) + f(n), you compute p such that Σ a_i (b_i)^p = 1. Then T(n) = Θ(n^p + n^p ∫_1^n f(u)/u^(p+1) du). This is advanced — typically used in numerical algorithm analysis. For standard interviews, Master Theorem suffices.

Frequently Asked Questions

What is the difference between divide and conquer and dynamic programming?

D&C solves independent subproblems — each subproblem is solved fresh. DP solves overlapping subproblems — stores results to avoid redundant computation. Fibonacci via recursion is D&C (exponential). Fibonacci via memoisation/tabulation is DP (linear).

Can the Master Theorem handle recurrences where f(n) is not a polynomial?

Yes, Case 2 extends to f(n) = Θ(n^c log^k n). Then T(n) = Θ(n^c log^(k+1) n). For example, T(n) = 2T(n/2) + n log n → log_b(a)=1, c=1, k=1 → Case 2 → T(n)=Θ(n log^2 n).

What happens if subproblems are not the same size?

The Master Theorem no longer applies directly. For example, quick sort with a fixed pivot can have sizes 1 and n-1. You must either use the recursion tree method or Akra-Bazzi. In practice, ensure your divide step produces balanced subproblems.

How do I debug a divide and conquer algorithm that works on small inputs but fails on large ones?

Check integer overflow in the mid calculation first. Then check whether base cases cover all sizes (test every size from 1 to 10). Finally, verify the combine step returns correct types — a common bug is returning a single element when the merge expects a list.

