Binary Search Overflow — The Bug Hidden in Java for 9 Years
Binary search breaks silently on large arrays when mid calculation overflows.
20+ years shipping performance-critical code where algorithms decide the bill. Everything here is grounded in real deployments.
- Core concept: halve the search space every step by comparing target with middle element
- Requires sorted data — unsorted input gives silently wrong answers, no exception, no warning
- Works in O(log n) time: for 1 million items, at most 20 comparisons; for 1 billion, at most 30
- Two implementations: iterative (while loop, O(1) space) and recursive (function call overhead, O(log n) stack space)
- Biggest pitfall: overflow in
(left + right) / 2— useleft + (right - left) / 2or(left + right) >>> 1 - Production insight: silently returns -1 for existing targets if array is unsorted — always sort first and validate
- Java built-ins:
Arrays.binarySearch()for primitives,Collections.binarySearch()for Lists — both require pre-sorted input
Imagine you're looking up the word 'Mango' in a physical dictionary. You don't start at page 1 and read every word — you crack the book open near the middle, see you're at 'J', and think 'M comes after J, so I'll look in the right half.' You keep halving the remaining pages until you land on 'Mango'. That's binary search — instead of checking every item one by one, you repeatedly cut the search space in half. What might take 1,000 steps the slow way takes just 10 steps with binary search.
Binary search halves your problem space on every check. It's the difference between guessing a number between 1 and 1,000 in ten tries versus a thousand. Get the logic wrong, and you'll ship a bug that silently corrupts data for years—exactly what happened in Java's own library for nearly a decade.
What Binary Search Actually Does — and the One-Off Bug That Hid for Nearly a Decade
Binary search finds an element in a sorted array by repeatedly halving the search interval. Given a sorted list of n elements, it compares the target against the middle element, then discards the half that cannot contain the target. This reduces the search space from n to n/2 to n/4 … in O(log n) comparisons — exponentially faster than linear scan’s O(n).
In practice, binary search works only on sorted data and random-access structures (arrays, array-backed lists). Each iteration computes mid = (low + high) / 2. The classic Java implementation used (low + high) / 2, which overflows when low + high exceeds Integer.MAX_VALUE — a bug that silently corrupts results for large arrays (size > 2^30). The fix: mid = low + (high - low) / 2.
Use binary search whenever you need fast lookups on static or rarely-changed sorted data — database index seeks, dictionary lookups, versioned data snapshots. It’s the foundation of every balanced tree and B-tree operation. Misunderstanding the overflow edge case has caused production outages in search engines, trading systems, and distributed caches.
Why Searching One-by-One Breaks Down (Linear Search vs Binary Search)
Before binary search makes sense, you need to feel the pain it solves. The naive way to search a list is linear search — start at item 1, check it, move to item 2, check it, and keep going until you find your target or run out of items. For a list of 10 things, that's fine. For a list of 1,000,000 things, you could be checking up to one million items. In the absolute worst case, your target is the very last item, or it isn't there at all.
This approach is called O(n) in Big O notation — as the list grows, your work grows at exactly the same rate. Double the data, double the time.
Binary search flips the script. Instead of O(n), it runs in O(log n) time. For a list of 1,000,000 items, binary search needs at most 20 comparisons — not one million. For 1 billion items, at most 30 comparisons. The catch? Your data must be sorted first. That single requirement is what makes the halving trick possible: if your list is sorted, you can always look at the middle and know with certainty which half your target lives in.
Here's the critical subtlety most tutorials skip: if you only search once, sorting first costs O(n log n) plus O(log n) for the search, totalling O(n log n) — worse than a single O(n) linear scan. Binary search pays off when you sort once and search many times. For k searches on a sorted array, total cost is O(n log n + k log n). Linear search for k queries costs O(kn). Binary search wins when k is large enough that k log n < kn, which is true for any k ≥ 1 after sorting.
for (int i=1; i<arr.length; i++) assert arr[i-1] <= arr[i];How Binary Search Actually Works — The Step-by-Step Logic
Here's the core idea, plain and simple. You have a sorted array. You want to find a target value. Binary search maintains two pointers — a left boundary and a right boundary — that represent the section of the array still worth searching. At the start, left is index 0 and right is the last index.
Each round, you calculate the middle index of the current window. You look at the value sitting at that middle index and ask one of three questions: 1. Is the middle value equal to my target? If yes — you're done, return the index. 2. Is the middle value less than my target? Then the target must be in the right half. Move left up to middle + 1. 3. Is the middle value greater than my target? Then the target must be in the left half. Move right down to middle - 1.
You repeat this until either you find the target or left exceeds right (meaning the target isn't in the array at all).
Each iteration eliminates half the remaining candidates. Starting with 1,000,000 items: after step 1 you have 500,000 candidates. After step 2, 250,000. After 20 steps, you're down to 1. That's the magic of logarithms — log base 2 of 1,000,000 is about 20.
Notice the boundary updates use mid + 1 and mid - 1, not just mid. This is critical. If you use left = mid when the target is greater than arr[mid], and the window has only 2 elements (left and left+1), then mid = left and you set left = mid = left — the window never shrinks, and you loop forever.
left <= right — using left < right skips the last element when the search window narrows to exactly one element.<= to ensure the single-element window is checked before the loop exits.Binary Search the Recursive Way — Same Logic, Different Shape
The iterative version uses a while loop. The recursive version calls itself with a smaller window each time. Both approaches are identical in logic and performance — it's a style choice, though iterative is usually preferred in production because it avoids stack overhead.
In the recursive version, you pass the current left and right boundaries as parameters. When you hit the base case (left > right), you return -1. When you find the target, you return its index. Otherwise you call yourself again with the updated half.
Seeing both versions is valuable because many interview questions ask you to implement one or the other, and understanding recursion here is a perfect stepping stone to harder recursive algorithms like merge sort and quicksort.
Notice how the recursive version reads almost like the English description of binary search — it's self-documenting. The tradeoff is that each recursive call adds a frame to the call stack. For an array of one million elements, that's at most 20 recursive calls deep, which is perfectly fine. For extremely large arrays (billions of elements), the stack depth reaches ~30 frames — still well within default stack limits. But it's good to be aware of the difference, especially in languages without tail-call optimisation (Java, Python).
Common Mistakes That Break Binary Search (And Exactly How to Fix Them)
Binary search looks simple, but it has a handful of classic traps that catch even experienced developers. The most infamous is the integer overflow bug — it lived undetected in Java's standard library for nearly a decade before being reported by Joshua Bloch in 2006. Understanding these mistakes doesn't just help you avoid bugs; it shows interviewers you think carefully about edge cases.
The second most common mistake is applying binary search to unsorted data. The algorithm will run without throwing any errors — it will just silently return the wrong answer. There's no warning, no exception. This is the sneakiest kind of bug because your tests on small sorted datasets pass, and the failure only manifests in production when unsorted data slips through.
Third: getting the boundary updates wrong. Using middleIndex instead of middleIndex + 1 when updating the left boundary causes an infinite loop when your window is two elements wide. The loop never terminates because the window never shrinks. Specifically: when left=2 and right=3, mid=2 (integer division), and setting left = mid gives left=2 again — no progress.
Fourth: using left < right instead of left <= right. This skips the case where the window narrows to exactly one element. If your target happens to be that single remaining element, the loop exits before checking it, and you return -1 incorrectly.
Let's look at each of these concretely in code so you can recognize them on sight.
Arrays.sort(yourArray) before searching is a one-liner insurance policy.Using Java's Built-in Binary Search: Arrays.binarySearch() and Collections.binarySearch()
Instead of implementing binary search from scratch every time, you can use the battle-tested implementations in Java's standard library. The method Arrays.binarySearch(int[] a, int key) performs binary search on a sorted array of primitives, returning the index of the key if found, or a negative value — -(insertion point) - 1 — if not found. This negative return value tells you where the key would be inserted to maintain sorted order. For example, if the key is greater than all elements, the insertion point is a.length, so the method returns -(a.length) - 1.
For List objects, use Collections.binarySearch(List<? extends Comparable<? super T>> list, T key). This works with ArrayList (random access) efficiently in O(log n), but degrades to O(n) per step with LinkedList because finding the midpoint requires traversal. The list must be sorted according to the natural ordering of its elements (or a provided Comparator).
These methods were the ones that carried the infamous integer overflow bug for 9 years in the Arrays.binarySearch primitive overloads — fixed in Java 6 update 10, but a reminder that even library code can have edge-case bugs. For most production code, prefer these built-in methods for correctness and readability, but always ensure the array or list is sorted beforehand. If you need the leftmost or rightmost occurrence in a dataset with duplicates, you'll need to write your own variant, as the built-ins return any matching index — not necessarily the first or last.
-(insertion point) - 1 means you can recover the insertion point with int ip = -result - 1;. This is useful for inserting into a sorted array or finding the nearest value — without a separate search step.Arrays.binarySearch over custom implementation for primitives — it's optimised, tested, and handles the overflow correctly since the fix. But always validate sorted input, especially after data mutations. A common mistake: calling Arrays.binarySearch on an array that was sorted by a Comparator but searched using natural ordering, or vice versa — this produces silently wrong results.Arrays.binarySearch and Collections.binarySearch for standard search needs.Finding First and Last Occurrence in Duplicate Arrays
Standard binary search returns an arbitrary occurrence when duplicates exist. For many interview problems — and real-world scenarios like finding the earliest log entry with a certain timestamp or the latest user action — you need the leftmost (first) or rightmost (last) occurrence. The trick is to modify the standard algorithm so that instead of returning immediately on a match, you record the match and continue searching in the left or right half.
Leftmost (first occurrence): When arr[mid] == target, record mid as a candidate result, then set right = mid - 1 and keep looping. This forces the search to continue leftward, finding any earlier occurrence. After the loop ends, result holds the first occurrence (or -1 if no match was found).
Rightmost (last occurrence): When arr[mid] == target, record mid as a candidate result, then set left = mid + 1 and keep looping. This forces the search to continue rightward, finding any later occurrence.
This pattern appears frequently in real-world searches across sorted arrays of objects where you want the earliest or latest match. It's also a common building block for count-of-occurrence (lastIndex - firstIndex + 1) and range queries. Both variants maintain O(log n) time complexity — the loop still halves the window each step, even when recording matches.
firstOccurrence and lastOccurrence and combine them. Total time complexity: O(log n) + O(log n) = O(log n). This is a common follow-up to the basic binary search question and appears in LeetCode 34.Binary Search in Python and C++ — Multi-Language Implementations
Binary search is language-agnostic. The same logic works across languages with only syntactic differences. Here are iterative implementations in Python and C++, side by side.
Python's bisect module provides built-in binary search via bisect_left (equivalent to first occurrence / lower bound) and bisect_right (equivalent to upper bound). These are implemented in C for speed and should be your default choice in production Python code. For learning purposes, writing it from scratch solidifies your understanding.
C++ offers std::binary_search (returns bool), std::lower_bound (iterator to first element ≥ target), and std::upper_bound (iterator to first element > target). These work on any sorted range and are the production standard in C++.
One language-specific nuance: Python handles arbitrarily large integers natively, so the overflow bug from the Java incident cannot occur in pure Python. However, if you're writing C extensions or using ctypes with fixed-width integers, the same overflow applies. In C++, the overflow is identical to Java for int types.
// is floor division and integers have arbitrary precision — no overflow possible. In C++, int division truncates toward zero, and overflow behaves identically to Java (wraps around for signed types). In C++ with very large arrays, use size_t or ptrdiff_t for indices to avoid overflow. Always be aware of how your language handles integer limits.bisect module is implemented in C under the hood and is highly optimised. Always prefer bisect.bisect_left and bisect.bisect_right over hand-written binary search in production Python. In C++, std::lower_bound and std::upper_bound give iterators for first/last occurrence equivalents and are the idiomatic production choice. std::binary_search only returns a boolean — it doesn't give you the index.Where Binary Search Shines (and Where It Doesn't)
Binary search isn't a universal hammer. Its power depends on sorted data and random-access memory. Here's where it wins and where you should use something else.
Wins: - Large static datasets searched repeatedly (e.g., a phone book, dictionary, sorted log entries). The O(n log n) sort cost is amortised over thousands of O(log n) searches. - Database indexing: B-trees and B+ trees are built on the same halving principle, enabling O(log n) lookups across billions of rows. - Finding roots of monotonic functions (binary search on answer) — used in numerical methods, machine learning hyperparameter tuning, and optimisation. - Sorted file processing: searching within a sorted flat file or memory-mapped region.
Losses: - Linked lists: random access is O(n) per step, so binary search degrades to O(n log n) overall — worse than a single O(n) linear scan. Use linear search or convert to array first. - Tiny datasets (less than ~64–128 items): linear search is often faster due to CPU cache locality and branch prediction. The overhead of computing midpoints and branching isn't worth it. - Data that changes frequently: re-sorting each time costs O(n log n), outweighing search benefits. Use a balanced BST (TreeMap, TreeSet) or skip list instead. - Unsorted data searched only once: sort + binary search = O(n log n) + O(log n). A single linear scan = O(n). Linear wins. - Hash-based exact lookups: if you only need exact matches (not ranges, not ordered queries), a HashMap gives O(1) average time.
Real-world analogy: Binary search is like using an index in a book. If you're looking up one word, the index saves time. If you're reading the whole book cover to cover, the index is useless. Match the tool to the access pattern.
- Sorted array = alphabetical list of keywords in the index
- Middle element = the page you open the book to
- Halving = skipping entire sections not containing your word
- Linear search = reading every page from start to finish
- The index fails if the book isn't alphabetical (unsorted data)
- Building the index (sorting) takes effort — worth it only if you look up many words
Applications of Binary Search: Monotonic Functions, Search Space Reduction, and Feasibility Problems
Binary search is not limited to searching for a value in an array. It is a general algorithmic technique for solving problems where the search space is monotonic — i.e., a property holds for one side and not the other. This pattern is often called 'binary search on the answer.'
1. Monotonic Functions: Given a function f(x) that is monotonic (non-decreasing or non-increasing) over some range, you can binary search for the x where f(x) equals a target or crosses a threshold. For example, finding the integer square root of a number: binary search on x from 0 to n, checking if x*x ≤ n. Similarly, finding the first bad version in a sequence of software builds (if version v is bad, all versions after v are also bad) is a classic binary search application.
2. Search Space Reduction: Problems like 'guess the number' where you have a known range and get feedback (higher/lower) are inherently binary search. This extends to optimisation problems where you search for the minimum or maximum feasible value. For example, find the minimum capacity of a ship to transport all packages within D days (LeetCode 1011).
3. Feasibility Problems: Many algorithmic problems ask: 'Is it possible to achieve a certain objective under given constraints?' If the answer is monotonic (e.g., if capacity C works, any larger capacity also works), you can binary search on the parameter. Common examples: allocating books to students to minimise the maximum pages per student, splitting an array into subarrays with a bounded maximum sum, or scheduling jobs with a minimum completion time.
Binary search on the answer requires you to write a helper function feasible(mid) that returns true/false for a given candidate answer. Then you binary search across the possible range of the answer, narrowing to the optimal value.
The key insight: any problem where you can frame the question as 'what is the minimum/maximum X such that condition(X) is true?' and condition(X) is monotonic — binary search applies.
Advantages and Disadvantages of Binary Search
The following table summarises the key strengths and weaknesses of binary search.
| Feature | Advantage | Disadvantage |
|---|---|---|
| Time Complexity | O(log n) — extremely fast even for billions of items (30 comparisons for 1 billion) | Only achievable after sorting; if data is unsorted, you pay O(n log n) to sort first |
| Space Complexity | O(1) for iterative version — no extra memory beyond a few variables | Recursive version uses O(log n) stack space; may be a concern in stack-constrained environments |
| Data Structure Requirement | Works with any random-access data structure (arrays, ArrayLists, memory-mapped files) | Inefficient on linked lists (O(n) access per midpoint, overall O(n log n) — worse than linear scan) |
| Implementation Complexity | Core logic is simple — three comparisons and two pointer updates per iteration | Subtle bugs are extremely common: integer overflow, off-by-one boundary errors, infinite loops, and silent wrong answers on unsorted data |
| Correctness | Deterministic and complete — will always find the target if it exists and data is sorted | Silent failure mode: unsorted input produces wrong answers with no error or exception |
| Use Cases | Large static or infrequently updated sorted datasets; database indexing; optimisation problems; monotonic function root-finding | Not suitable for tiny datasets (<64 items), frequently changing data, unsorted arrays, linked lists, or fuzzy/partial matches |
In practice, the advantages strongly outweigh the disadvantages for the right use cases. Always evaluate your data size, mutation frequency, and access pattern before choosing binary search over alternatives.
Arrays.binarySearch, bisect, std::lower_bound) to avoid implementation bugs. Only hand-roll binary search when you need custom variants like first/last occurrence, rotated array search, or search on answer. The built-in implementations have been battle-tested across millions of production systems and handle edge cases you might miss.Search in Rotated Sorted Array — The Interview Classic
A rotated sorted array is a sorted array that's been shifted at some pivot point. Example: [4,5,6,7,0,1,2] — this was originally [0,1,2,4,5,6,7], rotated at index 4. Standard binary search fails because the array isn't fully sorted. But you can still search in O(log n) by identifying which half is properly sorted and then deciding where the target lives.
Here's the logic: compute mid. Check if the left half is sorted (arr[left] <= arr[mid]). If it is, test if the target lies within that sorted range (arr[left] <= target && target < arr[mid]). If yes, narrow to the left half; otherwise go to the right half. If the left half isn't sorted, then the right half must be sorted — apply the same sorted-range check there.
Edge case: duplicates (LeetCode 81). When arr[left] == arr[mid] == arr[right], you can't determine which half is sorted. The safe fallback is to increment left and decrement right, degrading to O(n) in the worst case. This is unavoidable — with all-equal elements, there's no information to eliminate a half.
This pattern appears in production when dealing with log rotation (new logs appended, old logs archived or wrapped around), circular buffers (ring buffers for network packets, sensor readings), or data that was rotated during a failover or rebalancing operation.
arr[i] > arr[i+1]) on each lookup, then used the rotated search variant.Conditions for Binary Search — When It Works and When It Betrays You
Binary search isn't a magic wand. You can't throw it at any array and expect logarithmic speed. Two hard preconditions must hold, and violating either will give you wrong answers silently.
First, the data must be sorted. Not 'kinda sorted.' Not 'sorted in some places.' Fully sorted according to the same comparison operator you're using to search. A single unsorted element breaks the assumption that mid splits the search space correctly. Production story: a developer once sorted a list of user IDs by string comparison but searched with numeric operators. Binary search returned garbage for two years.
Second, random access must be O(1). Arrays and ArrayLists work. Linked lists? Forget it. Traversing a linked list to find mid kills the logarithmic advantage — you're back to O(n) per step, which defeats the entire purpose. Binary search on a linked list is a party trick, not a solution.
Always validate these conditions before you write a single line of search logic.
Complexity Analysis — Why O(log N) Isn't Just a Flex
You know binary search is O(log N). But what does that actually mean when your data goes from 1,000 to 1,000,000 entries?
Linear search on 1,000 items takes up to 1,000 comparisons. Binary search? Ten. On a million items, linear search hits a million comparisons worst-case. Binary search does twenty. That's the difference between a response in milliseconds vs. a timeout that wakes up an SRE at 3 AM.
Here's the trade-off: binary search's space complexity depends on implementation. The iterative version runs in O(1) extra space — just three integer variables: low, high, mid. The recursive version burns O(log N) stack space because each recursive call frames the next half. On a 32-bit system with a 1-million-element array, that's about 20 stack frames. Negligible on modern hardware, but in embedded systems or recursion-heavy code, it can blow your stack.
Always prefer the iterative version in production. It's harder to get wrong and doesn't make your ops team nervous.
Binary Search Pseudocode — The Skeleton That Kills Off-by-One Errors
You don't memorize binary search. You internalize its loop invariant. The pseudocode isn't academic fluff — it's the contract that prevents the decade-old bug that plagued Java's own library. Every real implementation is just this skeleton with language-specific syntax.
The core rule: maintain two pointers, low and high, such that the target must lie between them. Each iteration shrinks that range by half. The mid-point calculation mid = low + (high - low) / 2 isn't fancy — it's explicitly there to prevent integer overflow when low + high exceeds Integer.MAX_VALUE. That's not theory; that's a production vulnerability.
Termination happens when low > high. At that point, if the target wasn't found, you return a sentinel (typically -1). In a rotated sorted array or finding boundaries, you adjust the condition to low < high to lock onto the transition point. The pseudocode stays the same; only the comparison operator changes.
mid = (low + high) / 2. In languages with fixed integer sizes, low + high overflows when both are large (close to Integer.MAX_VALUE). Always prefer low + (high - low) / 2 — it's mathematically identical and overflow-proof.Related Links — The Battle-Tested Resources Every Senior Dev Actually Uses
You don't need a library of tutorials. You need the three places real engineers go when binary search breaks in production. First: the official Java Arrays.binarySearch() documentation. The contract for undefined behavior on unsorted arrays is non-negotiable — skip reading it once, and you'll debug a false negative for hours.
Second: the research paper "Binary Search in a Sorted Rotated Array" by Bentley and Sedgewick. It's the canonical reference for the interview classic and contains the exact proofs for why the algorithm terminates. Don't argue with your peer about edge cases — cite this.
Third: the java.util.Collections source code. Reading the actual implementation of binarySearch() for lists reveals how the JDK handles the RandomAccess interface check. When your list is a linked list, binary search degrades to linear search internally. The docs won't scream that at you; the source code will.
Skip generic blog posts that regurgitate textbook examples. These three resources — official docs, foundational papers, and source code — are what you reference in code reviews and incident postmortems.
Collections.binarySearch() and Arrays.binarySearch(). When a junior argues that binary search is always O(log n), show them line 245 where it checks instanceof RandomAccess. Lists without random access degrade to linear scan.Binary Search Visualization — See Why O(log n) Actually Matters
Binary search is notoriously hard to internalize from text alone. A good visualization turns abstract index arithmetic into a concrete, spatial operation. Start with a sorted array of 16 elements. Draw a horizontal line representing the array. Mark three pointers: low (leftmost), high (rightmost), and mid (computed as (low + high) / 2). Animate each step: highlight the current mid, compare it to the target, then collapse the search space by moving low to mid+1 or high to mid-1. The key insight is the geometric shrinkage — after one comparison, half the elements are discarded. After four comparisons, you eliminate 15 out of 16 items. This visual feedback cements why binary search works on arrays of any size, including edge cases like empty arrays or single elements. Avoid the bug where mid becomes a floating-point number — integer arithmetic ensures exact indices.
How to Get Started with DSA — Binary Search as Your First Algorithm
Start with binary search because it has a clear precondition (sorted data), a single failure mode (off-by-one), and immediate real-world payoff (database indexing, autocomplete, debuggers). Don't jump to complex data structures first. Write binary search from memory on pen and paper until you can do it without referencing any tutorial. Then add two constraints: find the first occurrence (use while low < high with mid = low + (high - low) / 2) and find the last occurrence (use while low < high with mid = low + (high - low + 1) / 2). After mastering these, move to rotated arrays and peak finding — these are just binary search with a twist. The goal isn't to memorize code but to internalize the invariant: before each iteration, the target must be in [low, high]. This invariant is reusable across thousands of DSA problems. Skip linear search entirely — it teaches bad intuition.
Integer Overflow in Midpoint Calculation Crashes Search in Production
(left + right) / 2 is mathematically safe because the sum of two positive integers divided by 2 should stay within range. The team assumed Java's int type would handle any valid array index pair.int overflow: when left and right are both > 1.07 billion, their sum exceeds Integer.MAX_VALUE (2,147,483,647). The overflow wraps to a negative number, dividing by 2 yields a negative index. For example, left=1,500,000,000 and right=1,600,000,000 gives sum=3,100,000,000, which overflows to -1,194,967,296, and dividing by 2 gives -597,483,648. This bug lived in Java's Arrays.binarySearch() implementation for nearly a decade — Joshua Bloch reported it in 2006, and it affected every Java program using the standard library's binary search on very large arrays.(left + right) / 2 with left + (right - left) / 2. This never overflows because right - left is always non-negative and always less than Integer.MAX_VALUE when both are valid array indices. Alternatively use (left + right) >>> 1 (unsigned right shift), which treats the bit pattern as unsigned and correctly halves even overflowed sums. Java's standard library was patched to use the unsigned shift approach.- Never assume addition of two positive integers stays positive in languages with fixed-width integers (Java, C, C++, C#).
- Use the safe midpoint formula as a reflex — it's mathematically identical and overflow-proof.
- Test binary search on edge cases: largest possible array sizes, first/last elements, single-element arrays, and missing values.
- Python developers are not immune — while Python handles arbitrary-precision integers natively, the same bug exists in C extensions and in any language with fixed-width integers.
- Code review should flag
(left + right) / 2as a known anti-pattern in any binary search implementation.
Arrays.sort(array) before searching. Verify with a linear scan on a small subset. Also check whether the sort order matches what the binary search expects — ascending vs descending sorts require different comparison logic.(left + right) / 2, replace with left + (right - left) / 2 immediately. This is integer overflow. Print left, right, and their sum to confirm the overflow is occurring.left = mid + 1 and right = mid - 1 (not left = mid or right = mid). With a 2-element window, mid equals left and if you set left = mid, the window never shrinks. Add a debug print of left, right, and mid each iteration to observe stagnation.right = mid - 1 and record result). For rightmost, continue right (left = mid + 1 and record result).System.out.println("left=" + left + " right=" + right + " mid=" + mid + " val=" + arr[mid]);for (int i = 1; i < arr.length; i++) { if (arr[i-1] > arr[i]) { System.out.println("UNSORTED at index " + i); break; } }Key takeaways
left + (right - left) / 2 to prevent integer overflow. The bug from (left + right) / 2 hid in Java's standard library for nearly a decade.left <= right with boundary updates left = mid + 1 and right = mid - 1; using left < right misses single-element windows, and using left = mid causes infinite loops.Common mistakes to avoid
5 patternsApplying binary search on an unsorted array
Arrays.sort(array) before searching, or assert that the array is sorted in debug mode: for (int i=1; i<arr.length; i++) assert arr[i-1] <= arr[i];. If data mutates between sort and search, re-sort or use a self-sorting data structure (TreeSet, TreeMap).Integer overflow in midpoint calculation using (left + right) / 2
(left + right) / 2 with left + (right - left) / 2 or use unsigned shift (left + right) >>> 1 in Java. Both produce the same result but never overflow. This applies to any language with fixed-width integer types (Java, C, C++, C#, Rust).Wrong boundary update causing infinite loop (left = mid instead of left = mid + 1)
left = mid + 1 when target > arr[mid], and right = mid - 1 when target < arr[mid]. Both boundaries must strictly exclude mid to guarantee the window shrinks every iteration.Using `left < right` instead of `left <= right` in the loop condition
while (left <= right) to include the single-element window check. The loop correctly terminates when left > right, meaning the entire search space has been exhausted.Not handling duplicates — expecting first or last occurrence from standard binary search
Practice These on LeetCode
Interview Questions on This Topic
Implement binary search iteratively and recursively. Explain the tradeoffs.
java
// Iterative — O(log n) time, O(1) space
public int binarySearch(int[] arr, int target) {
int left = 0, right = arr.length - 1;
while (left <= right) {
int mid = left + (right - left) / 2;
if (arr[mid] == target) return mid;
else if (arr[mid] < target) left = mid + 1;
else right = mid - 1;
}
return -1;
}
// Recursive — O(log n) time, O(log n) stack space
public int binarySearchRecursive(int[] arr, int target, int left, int right) {
if (left > right) return -1;
int mid = left + (right - left) / 2;
if (arr[mid] == target) return mid;
else if (arr[mid] < target)
return binarySearchRecursive(arr, target, mid + 1, right);
else
return binarySearchRecursive(arr, target, left, mid - 1);
}
`
Key details to mention: safe midpoint formula to avoid overflow, left <= right loop condition, and mid ± 1` boundary updates to prevent infinite loops.Frequently Asked Questions
20+ years shipping performance-critical code where algorithms decide the bill. Everything here is grounded in real deployments.
That's Searching. Mark it forged?
19 min read · try the examples if you haven't