Closest Pair of Points — Why Your Strip Takes O(n²)
An unsorted strip turned 10K points into 30-second queries.
20+ years shipping performance-critical code where algorithms decide the bill. Notes here come from systems that actually shipped.
- Divide and conquer splits points by median x, finds closest pair in each half recursively, then checks a narrow strip around the dividing line.
- The strip’s width is δ (minimum distance from left/right halves). Only points within δ of the midline are considered.
- Each point in the strip checks at most 7 neighbours, thanks to a geometric packing argument in a 2δ×δ rectangle.
- Performance insight: the combine step runs in O(n) when the strip is sorted by y, giving overall O(n log n).
- Production insight: forgetting to sort the strip by y can silently balloon the combine step to O(n²) on dense datasets.
- Biggest mistake: assuming the 7-neighbour bound holds without sorting the strip by y-coordinate.
Given n points on a plane, find the two closest ones. Naive: check all pairs — O(n²). D&C: split points by x-coordinate, find closest pair in each half recursively, then check if a closer pair crosses the dividing line. The key insight: you only need to check a narrow strip around the dividing line, and within that strip, each point has at most 7 neighbours to check.
The closest pair problem is a beautiful example of computational geometry enabling an otherwise-quadratic combine step to run in linear time. The naive O(n^2) solution checks every pair. The D&C solution achieves O(n log n) by proving a remarkable geometric fact: in the critical 'strip' around the dividing line, each point has at most 7 other points it needs to check.
This seven-point bound — derived from packing arguments in a 2δ×δ rectangle — is the kind of geometric reasoning that separates algorithm designers from algorithm users. It is also a template for many computational geometry optimisations: use the structure of space to bound the number of interactions.
Why the Closest Pair of Points Problem Is a Divide-and-Conquer Trap
The closest pair of points problem asks: given n points in a plane, find the pair with the smallest Euclidean distance. The naive O(n²) solution checks every pair — fine for 1,000 points, fatal for 100,000. The divide-and-conquer solution achieves O(n log n) by recursively splitting the set, solving each half, and then checking only a narrow strip of width 2δ around the midline. The critical insight: within that strip, sorting by y-coordinate and comparing each point to its next 7 neighbors guarantees correctness without blowing up to O(n²).
In practice, the strip is where most implementations silently regress to O(n²). The strip can contain O(n) points in the worst case (e.g., all points lie near the midline). Without the y-sorted neighbor limit, you'll compare every point in the strip against every other — exactly the naive algorithm again. The 7-point bound is tight: it relies on the fact that points within a δ×2δ rectangle cannot be too dense without violating the minimum distance from each half.
Use this algorithm when you need to compute proximity in spatial data — collision detection, GIS nearest-neighbor queries, or clustering initialization. It's the foundation for higher-dimensional variants (e.g., closest pair in 3D) and for approximate nearest-neighbor search. In real systems, the O(n log n) version is mandatory for n > 10,000; below that, the constant factors of recursion and sorting often make the naive O(n²) faster.
Algorithm Overview
- Sort points by x-coordinate: O(n log n)
- Recursively find closest pair in left half (δ_L) and right half (δ_R)
- δ = min(δ_L, δ_R)
- Find all points within δ of the dividing line (the strip)
- Check pairs in the strip sorted by y-coordinate
- The key: each point in the strip has at most 7 other points to check
The 7-Neighbour Proof
Why does each point in the strip have at most 7 other points to check? Consider a δ×2δ rectangle straddling the midline. In each δ×δ half, the closest pair distance is ≥ δ (we already found the best in each half). So in each δ×δ quarter, we can pack at most 4 points at distance ≥ δ from each other (a 2×2 grid). Total: 8 points maximum in the 2δ×2δ strip rectangle including the reference point itself — so at most 7 others.
This bounds the inner loop to constant work per point → O(n) combine step.
Optimisation — Pre-Sort by Y
Re-sorting the strip each time adds a log factor. The optimisation: maintain two sorted arrays throughout — one sorted by x (for dividing) and one sorted by y (for strip checking). This gives T(n) = 2T(n/2) + O(n) → O(n log n) with smaller constants.
Implementation detail: during recursion, merge the y-sorted lists of the left and right halves (like merge sort) to produce the parent's y-sorted list. The strip can then be filtered in O(|strip|) from the y-sorted list without a full sort.
Complexity Analysis and Recurrence
The recurrence for the naive version (no pre-sort): T(n) = 2T(n/2) + O(n log n). Solving via Master Theorem: a=2, b=2, f(n)=O(n log n) = O(n^(log₂2) log n) = O(n log n). This falls under Case 2 of the Master Theorem (actually a special case). The solution is O(n log² n). Because the combine step sorts the strip, it's O(n log n), leading to T(n) = 2T(n/2) + O(n log n) = O(n log² n). Wait — careful: if we sort the strip each time, the combine is O(k log k) where k ≤ n. That sums to O(n log² n) in the worst case. Pre-sorting brings combine to O(n) and recurrence to O(n log n).
- Without pre-sort: T(n) = 2T(n/2) + O(n log n) → by Master Theorem Case 2, T(n) = Θ(n log² n).
- With pre-sort: T(n) = 2T(n/2) + O(n) → by Master Theorem Case 2, T(n) = Θ(n log n).
So pre-sorting doesn't just reduce constants — it changes the asymptotic complexity.
Implementation Details and Edge Cases
Base Case: n ≤ 3. Brute force is efficient and avoids recursion overhead. Use combinations for clarity, but a triple-nested loop is fine for n=3.
Median Selection: Sort by x once, then the median is simply p[mid]. No need for linear-time selection — the sort dominates anyway.
Duplicate Points: If two points have identical coordinates, the distance is 0. The algorithm should handle this correctly because δ becomes 0 and the strip width becomes 0 — only points exactly on the midline are considered, but duplicates on the same side will be caught in the recursive step. However, if duplicates exist across halves, they may be missed because the strip is empty (δ=0). Fix: before divide, deduplicate or handle zero-distance case explicitly.
Vertical and Horizontal Lines: Sorting by x works fine. If many points share the same x, the median may be ambiguous. Use a stable sort or choose the middle index after sorting.
Floating-Point Precision: Distance calculations can suffer from floating-point errors. Use math.dist which is stable, but for comparison, prefer squared distances to avoid sqrt overhead.
Integer Coordinates: If all coordinates are integers, use squared distances to avoid sqrt entirely, only taking sqrt at the end.
Extensions and Higher Dimensions
The D&C approach generalises to k dimensions. The strip becomes a slab of width 2δ, and the neighbour bound becomes O(3^k), which grows exponentially with dimension. For 3D, each point checks at most 31 neighbours (3^3 - 1 = 26? Actually the bound is more complex). But the combinatorial explosion makes D&C impractical for dimensions > 3. For high-dimensional data, use space-filling curves or KD-trees.
Related Problems: The packing argument also applies to finding the minimal distance between two convex polygons, or computing the diameter of a point set using rotating calipers.
Practical Note: In production, for 2D and 3D, the D&C algorithm is the gold standard. For higher dimensions, approximate methods (like LSH) are used.
The Naive Approach: Why You Still Need to Know O(n²)
Before you dismiss the brute-force solution, understand this: it's the baseline every optimization must beat. It's also the only correct answer in interviews when the divide-and-conquer implementation has a bug you can't fix in 30 minutes.
The naive approach is trivial — compute Euclidean distance for every unordered pair, track the minimum. O(n²) time, O(1) space. For n=1000 that's ~500k distance calculations. For n=10⁵ it's catastrophic (~5 billion). But here's the kicker: it's the simplest correct implementation. When you're debugging the divide-and-conquer version and the strip merge logic is melting your brain, being able to fall back to the brute-force version to verify outputs isn't weakness — it's discipline.
I've seen production systems ship with brute-force because the constraints were small enough, and the team didn't trust the D&C implementation edge cases. Know your limits. Use brute-force as the reference implementation, not the final answer.
The Divide-and-Conquer Expected Approach: Where the Real Money Is
The O(n log n) solution is the one that makes this problem famous. Here's the skeleton: sort points by x, recursively find min distance in left and right halves, then handle the strip where points straddle the dividing line.
The critical insight that most tutorials gloss over: the strip check is not O(n²) because you can prove only 7 points can exist in any δ × 2δ rectangle. That's the 7-neighbour theorem. After sorting strip points by y, for each point you only check the next 7. This keeps the merge step O(n) and the recurrence T(n) = 2T(n/2) + O(n), giving O(n log n).
Where implementations almost always fail: floating-point comparison for equality in the strip boundary, and off-by-one errors in the recursive calls. Use a tolerance, not ==. And for god's sake, pre-sort the points by y once and pass around index arrays — sorting the strip fresh each recursion kills your complexity.
Getting Started With Python’s min() and max() Functions
Before diving into the closest pair of points algorithm, you need a firm grasp of Python's min() and max() functions. These built-in functions are deceptively simple yet powerful tools for data extraction and comparison. At their core, they return the smallest or largest item in an iterable or among multiple arguments. Understanding their behavior prevents subtle bugs when processing point coordinates or distances during divide-and-conquer steps. The functions work by comparing elements using the less-than operator by default, but this can be customized. Knowing how to call these functions correctly with a single iterable versus multiple arguments is essential for clean code. For example, when you have a list of points as tuples, min(points) compares the first element of each tuple unless you specify a key. This foundational skill will help you implement efficient distance comparisons in the closest pair algorithm without reinventing the wheel.
min() and max() on an empty iterable raise ValueError unless you provide a default. Always check for empty data when processing coordinate lists.min() and max() with a default parameter to avoid crashes on empty collections.Processing Dictionaries With min() and max()
When working with dictionaries in the closest pair problem, you often need to find the point with minimum or maximum distance or coordinate value. Python's min() and max() on a dictionary by default operate on keys, not values. This common pitfall leads to incorrect results when you intend to compare distances or y-coordinates stored as dictionary values. To correctly find the key associated with the minimum or maximum value, use the key parameter: min(dict, key=dict.get). This returns the key whose value is smallest. For example, if you map point IDs to their y-coordinates, min(point_map, key=point_map.get) gives you the ID of the point with the smallest y. In a divide-and-conquer context, this technique is invaluable for selecting candidate points from a strip sorted by y. Understanding this subtlety prevents off-by-one errors and ensures your algorithm compares the right attributes. Always verify whether you need keys or values before calling these functions.
min() directly on a dictionary returns the smallest key lexicographically, not the key with the smallest value — a silent logic error.Tweaking the Standard Behavior of min() and max() With key and default
In the closest pair of points algorithm, you'll perform many comparisons that depend on custom criteria — like distance between points rather than raw values. Python's min() and max() accept a key parameter that transforms each element before comparison, without modifying the original data. For instance, to find the point pair with smallest Euclidean distance, use key=lambda pair: dist(pair[0], pair[1]). This avoids writing explicit loops. Additionally, the default parameter prevents errors when the iterable is empty, returning a fallback value instead of raising ValueError. This is critical when processing an empty list of candidate pairs in the divide step. Combining key with default makes your code robust and concise. For example, min(pair_list, key=distance_func, default=None) safely returns None if no pairs exist. Mastering these tweaks transforms min() and max() from simple utilities into powerful algorithmic tools, saving development time and reducing bugs in production code.
Closest Pair Blowup in Geospatial Query
strip.sort(key=lambda p: p[1]) and enforced the 7-point check with break condition on y-difference. Profiling confirmed O(n log n) after the fix.- The geometric bound only applies when points are sorted by y in the strip.
- Always verify assumptions about pre-sorting when using D&C combine steps.
- Profile before assuming O(n) — a missing sort can silently degrade performance.
Key takeaways
Common mistakes to avoid
5 patternsForgetting to sort the strip by y-coordinate
strip.sort(key=lambda p: p[1]) before the inner loop. Alternatively, maintain a y-sorted list during recursion to avoid the sort entirely.Using '<=' instead of '<' for strip width
< delta (strict less than) when filtering points within the strip. The strip should include points strictly less than δ away from the midline.Not handling duplicate points
Base case too small (n ≤ 1) or too large (n ≤ 10)
Not updating delta after finding a closer pair in the strip
delta after each closer pair found: if d < delta: best = ...; delta = d. Then the inner loop's break condition uses the updated delta.Practice These on LeetCode
Interview Questions on This Topic
Why does each point in the strip need at most 7 comparisons?
Frequently Asked Questions
20+ years shipping performance-critical code where algorithms decide the bill. Notes here come from systems that actually shipped.
That's Recursion. Mark it forged?
9 min read · try the examples if you haven't