Float Collinearity Causes Self-Intersecting Polygons
Naive line intersection using raw float equality failed on nearly collinear vertices, causing a self-intersecting polygon destroying three prototypes..
20+ years shipping performance-critical code where algorithms decide the bill. Drawn from code that ran under real load.
- Orientation test: cross product of vectors (q-p) x (r-p). Positive = CCW, Negative = CW, Zero = collinear. No division — exact for integer coordinates.
- Segment intersection: segments AB and CD intersect iff {A,B} straddle CD AND {C,D} straddle AB. Collinear cases require an on-segment bounding-box check.
- Intersection point: solve parametric line equations. Denominator near zero means parallel — check with epsilon.
- Performance: O(1) per test. No allocations, no division in orientation. Batch tests dominate sweep line algorithms.
- Production insight: floating-point precision is the #1 source of geometry bugs. Use integer coordinates or epsilon comparisons — never raw float equality.
- Biggest mistake: forgetting collinear endpoint cases. Naive implementations return wrong results when an endpoint lies exactly on the other segment.
Two line segments on a plane: do they cross? This sounds simple but requires careful handling of collinear points, endpoints, and floating-point precision. The cross product orientation test answers this in O(1) — and it is the building block of convex hull, polygon clipping, and sweep line algorithms.
Line intersection is the foundational primitive of computational geometry. Collision detection, polygon clipping, sweep line algorithms, and convex hull construction all reduce to one question: do two segments intersect? The cross product orientation test answers this in O(1) with no division, making it numerically robust and fast.
Production geometry code fails not on the common case but on edge cases: collinear overlapping segments, endpoints lying exactly on segments, and floating-point precision at near-parallel intersections. A single missed collinear check can corrupt a convex hull, create self-intersecting polygons, or cause collision detection to miss contacts.
A common misconception is that line equations are equivalent to the cross product approach. They are not. Line equations require division, which introduces floating-point error. The cross product uses only multiplication and subtraction — exact for integer coordinates and significantly more robust for floats.
How Line Intersection Algorithms Actually Work
Line intersection algorithms determine whether two line segments cross in 2D space and compute the exact intersection point. The core mechanic is solving the parametric equations of each segment: P = A + t(B - A) and Q = C + u(D - C). The intersection exists when 0 ≤ t ≤ 1 and 0 ≤ u ≤ 1, meaning both points lie within their respective segments. This is fundamentally an orientation test using cross products, not slope-intercept math.
In practice, the algorithm relies on orientation predicates: given three points, do they form a clockwise, counterclockwise, or collinear turn? Two segments intersect only if their orientations are opposite (or one is collinear and the other straddles). The cross product sign is the key — but floating-point precision makes exact collinearity detection unreliable. A tolerance (epsilon) of 1e-9 or 1e-12 is standard, but too large an epsilon introduces false positives.
Use this algorithm whenever you need to detect polygon self-intersection, compute boolean operations on shapes, or validate geometry in GIS or CAD systems. It's O(1) per segment pair, but naive O(n²) checks on all pairs in a polygon with n vertices can be slow. Spatial indexing (e.g., sweep-line, R-tree) reduces this to O(n log n). The real-world cost of a missed or false intersection is a corrupted polygon that breaks downstream rendering or physics engines.
Orientation Test — The Core Primitive
The orientation of three points p, q, r is determined by the cross product of vectors (q-p) and (r-p): - Positive: counterclockwise (left turn) - Negative: clockwise (right turn) - Zero: collinear
This is the single operation that underlies all segment intersection tests. The cross product formula is:
val = (q.y - p.y) (r.x - q.x) - (q.x - p.x) (r.y - q.y)
No division is involved — only multiplication and subtraction. This makes the test exact for integer coordinates and significantly more robust than line-equation approaches that require dividing by a potentially-zero denominator.
- Cross product = 2 * signed area of triangle (p, q, r)
- Positive area = points are counterclockwise
- Zero area = points are collinear (triangle degenerates to a line)
- No division means no rounding error for integer inputs
Segment Intersection
Two segments AB and CD intersect if their orientations 'straddle' each other. Specifically, A and B must be on opposite sides of line CD, AND C and D must be on opposite sides of line AB. This is checked by verifying that orientation(A,B,C) != orientation(A,B,D) and orientation(C,D,A) != orientation(C,D,B).
Edge case: when one or more orientations are zero (collinear), the segments may still intersect if an endpoint lies on the other segment. The on_segment test checks this by verifying the collinear point falls within the bounding box of the segment.
- o1 == 0: p2 is collinear with p1q1 — check if p2 lies on segment p1q1
- o2 == 0: q2 is collinear with p1q1 — check if q2 lies on segment p1q1
- o3 == 0: p1 is collinear with p2q2 — check if p1 lies on segment p2q2
- o4 == 0: q1 is collinear with p2q2 — check if q1 lies on segment p2q2
Line Intersection Point
Given that two lines intersect, the intersection point is computed by solving the parametric equations of both lines simultaneously. For lines through p1-p2 and p3-p4, the denominator is the cross product of the direction vectors. If the denominator is zero, the lines are parallel. The parameter t gives the position along the first line, and the intersection point is p1 + t*(p2-p1).
This function operates on infinite lines, not segments. To find the intersection of segments, first verify they intersect using segments_intersect, then compute the point. For near-parallel lines, the denominator approaches zero and the intersection point becomes numerically unstable — handle this with an epsilon check.
- line_intersection: where do the infinite lines through these points cross?
- segments_intersect: do the finite segments between these points overlap?
- To get segment intersection point: call segments_intersect first, then line_intersection
- Calling line_intersection alone gives false positives for non-overlapping segments
The Naive Approach — Pairwise Hell
You could check every segment against every other segment. That's O(n²). For small datasets—say, under 50 segments—it works fine. For anything larger, it's a death sentence.
The logic itself is simple: for each pair, run the orientation test on both endpoints. If the orientations are opposite (or collinear with overlapping ranges), they intersect. That check is O(1). So the whole thing is O(n²) time, O(1) space. No mystery.
Here's the truth: the naive approach isn't stupid, it's honest. It tells you exactly when you're in over your head. If your geometry engine handles 100K segments and you use this, you'll be watching a monitor go orange at 2 AM. Sweep line exists for a reason.
Sweep Line — The Only Algorithm You Need
The sweep line algorithm changes the game from O(n²) to O(n log n). Here's how it works: imagine a vertical line sliding from left to right across your plane. As it moves, it maintains a list of 'active' segments—those currently crossing the sweep line, sorted by their Y-coordinate at the current X.
The key insight: intersections only happen between segments that are neighbors in this sorted order. So instead of checking every pair, you only check pairs that become adjacent as the sweep line passes. When a new segment enters, you test it against its immediate neighbors. When one leaves, you test the neighbors against each other.
You need a balanced BST or a similar structure for the active set. In Java, that's TreeMap with a custom comparator—but careful, floating point comparisons will bite you. Use integer arithmetic or epsilon-tolerant comparisons.
For line segments (not infinite lines), you also need to handle endpoint events: when the sweep line hits a left endpoint, insert the segment; when it hits a right endpoint, remove it. Insertion and neighbor queries are O(log n) each.
Orientation of Three Points — The 2D Gut Check
Orientation is the foundation of everything here. Given three points A, B, C, orientation tells you if C is to the left, right, or collinear with the directed line AB. It's a cross product sign test, no trigonometry needed.
The formula: (B.y - A.y) (C.x - B.x) - (B.x - A.x) (C.y - B.y). If positive -> counterclockwise (left turn). Negative -> clockwise (right turn). Zero -> collinear.
Why this works: the cross product of vectors AB and BC gives twice the signed area of triangle ABC. If the area is positive, C is left of AB. That's it. No floating point errors if you use integers.
Collinear cases are the ugly part. When orientation is zero, you need an extra containment check: does point C lie within the bounding box of AB? That's onSegment(). Without it, you'll miss intersections where segments share an endpoint or overlap partially.
Graham Scan — Convex Hull in O(n log n)
Why care? Line intersection algorithms often assume you already know which segments are relevant. The Graham Scan eliminates irrelevant points by computing the convex hull — the smallest convex polygon containing all points. This is foundational: any line that doesn't intersect the hull never intersects the points inside it. Graham Scan works by first finding the lowest-y point (pivot), then sorting all other points by polar angle relative to that pivot. It scans the sorted list, maintaining a stack. For each new point, it checks orientation of the last two stack points and the new point — if the turn is clockwise (right turn), the middle point is inside the hull and gets popped. Only counter-clockwise turns stay. The result is a tight boundary you can feed directly into sweep-line algorithms. The critical nuance: collinear points on hull edges require special handling — either keep them for precision or discard them depending on whether you need minimal hull. This single pass after sorting gives O(n log n) total, beating naive pairwise methods.
Chand–Kapur Algorithm — Avoiding Degenerate Sorting Perils
Why this matters? The Graham Scan's polar sort fails when points share the same angle (collinear with pivot) or when the convex hull has multiple collinear boundary points. Chand–Kapur (also called the Tentative Hull algorithm) fixes this by skipping the angle sort entirely. Instead, it builds the hull incrementally by traversing the sorted-by-x points and maintaining an upper and lower chain separately. For each chain, you push points and check orientation — if a right turn occurs, you pop the middle point. This avoids the collinear-angle ambiguity because you process points in monotonic x-order. The critical insight: process left-to-right for the upper hull, right-to-left for the lower hull. No angle computation, no floating-point. This makes Chand–Kapur more robust for integer coordinates and avoids edge-case crashes common in competitive programming. Complexity remains O(n log n) from sorting by x, but the constant is lower because you skip the angle comparison. Use this when your input contains many collinear points or when you need exact integer output without rounding errors.
Self-Intersecting Polygon in CAD Export Crashes CNC Machine
- Floating-point equality in geometry is never safe. Always use epsilon comparisons or exact arithmetic.
- Polygon simplification algorithms can produce self-intersecting output if the intersection predicate is imprecise.
- Post-condition validation (checking that output polygons are simple) catches bugs that the algorithm itself misses.
- Test with adversarial inputs: near-collinear vertices, near-parallel edges, degenerate triangles. Real CAD data is full of these.
Print orientation values for all four point triples to verify they are near-zeroCheck if on_segment bounding-box test uses raw float or epsilon comparisonKey takeaways
Practice These on LeetCode
Interview Questions on This Topic
Frequently Asked Questions
20+ years shipping performance-critical code where algorithms decide the bill. Drawn from code that ran under real load.
That's Geometry. Mark it forged?
7 min read · try the examples if you haven't