Prime Factorisation — 10^9 Calls Killed API
A batch of 10^9 trial divisions requires ~3×10^13 iterations — 3.5 days of CPU.
20+ years shipping performance-critical code where algorithms decide the bill. Written from production experience, not tutorials.
- Prime factorization breaks numbers into unique primes: 360 = 2³ × 3² × 5
- Tau(n) = ∏(exponent+1) gives divisor count; sigma(n) = ∏(p^(a+1)-1)/(p-1) gives sum
- Trial division O(√n) per number – fine for n ≤ 10¹², deadly above
- SPF sieve drops per-number cost to O(log n) after O(n log log n) preprocessing
- Pollard's rho factors composite numbers up to 10¹⁸ in seconds – but has failure modes
Every integer > 1 is either prime or can be broken down into a unique product of primes — like 360 = 2³ × 3² × 5. This is the Fundamental Theorem of Arithmetic. Factorizing a number unlocks many properties: number of divisors, sum of divisors, Euler's totient, and more. The speed of factorization depends heavily on the algorithm chosen.
RSA's security depends on one fact: factoring the product of two large primes is computationally hard. The best known classical algorithm (General Number Field Sieve) runs in sub-exponential but super-polynomial time. A 2048-bit RSA modulus would take longer than the age of the universe to factor on current hardware.
For small numbers — up to ~10^12 — trial division works fine. For competitive programming — up to ~10^6 per query — the SPF sieve turns factorization from O(√n) per query to O(log n). Knowing which regime you're in and choosing the right tool is the practical skill.
The real trap? Assuming one algorithm works for all sizes. Engineers waste hours because they don't detect when trial division becomes impractical or when Pollard's rho loops indefinitely on a prime input.
Why Trial Division Kills Your API at Scale
Prime factorization is the decomposition of a composite number into a product of smaller primes, each prime appearing exactly as many times as its multiplicity. The core mechanic is simple: divide the input by the smallest prime (2), then 3, then 5, etc., until the quotient becomes 1. This is the fundamental theorem of arithmetic — every integer greater than 1 has a unique prime factorization.
In practice, the naive trial division runs in O(√n) per number. For a single 10^12 integer, that's up to 10^6 divisions — trivial. But when your API receives 10^9 requests per day, each requiring factorization of numbers up to 10^12, the total operations hit 10^15. That's not just slow; it's a guaranteed timeout and resource exhaustion. Precomputation via a sieve up to √max(n) (e.g., 10^6) reduces each factorization to O(number of prime factors), typically under 10 divisions.
Use prime factorization when you need to compute GCD/LCM, reduce fractions, or solve modular arithmetic problems. In production, it's the foundation for cryptographic key generation (RSA) and hash table sizing. The difference between O(√n) and O(log n) per call determines whether your service stays up or burns CPU.
Trial Division
Check divisibility by all primes up to √n. O(√n) per number. This is the simplest and most intuitive algorithm. For a single n ≤ 10^12, trial division completes in microseconds. But scale it to 10^6 numbers and you're looking at days.
The key optimization: after checking 2, only check odd numbers. That halves the work. For even better performance, precompute a list of primes up to √max_n using a simple sieve, then only divide by those primes.
Number of Divisors and Sum of Divisors
- Number of divisors: τ(n) = (a1+1)(a2+1)...(ak+1)
- Sum of divisors: σ(n) = Π (p^(a+1)
- 1) / (p
- 1)
- Euler's totient: φ(n) = n × Π (1
- 1/p) for each prime p dividing n
These formulas are direct consequences of the multiplicative property. For example, τ(360) = τ(2³)τ(3²)τ(5¹) = (3+1)(2+1)(1+1) = 4×3×2 = 24 divisors. The sum σ(360) = (2^(3+1)-1)/(2-1) × (3^(2+1)-1)/(3-1) × (5^(1+1)-1)/(5-1) = (16-1)/1 × (27-1)/2 × (25-1)/4 = 15 × 13 × 6 = 1170.
Listing All Divisors
To list all divisors, iterate up to √n and collect pairs. Complexity O(√n) – but you can't beat that because output size is O(τ(n)) which is at most O(√n). For n=10^12, τ(n) is at most ~6720, so enumerating divisors is feasible.
The algorithm: for each divisor d ≤ √n, if n % d == 0, add d and n//d. Sort the result at the end.
SPF Sieve for Batch Factorization
When factorizing many numbers up to n, precompute the Smallest Prime Factor (SPF) sieve. Each factorization then takes O(log n) instead of O(√n).
Construction: Create an array spf[0..n] filled with 0. For i from 2 to n, if spf[i]==0 (i is prime), set spf[i]=i and for j in range(i*i, n+1, i), if spf[j]==0, set spf[j]=i. Then to factor a number x, repeatedly divide by spf[x] and record the count.
This is the standard for competitive programming and any scenario where you need to factorise a batch of numbers up to ~10^7.
Pollard's Rho Algorithm for Large Composite Numbers
When n exceeds ~10^12 and you need to factorise a single number, trial division becomes impractical. Pollard's rho is a probabilistic algorithm that finds a non-trivial factor in O(n^(1/4)) expected time.
It works by iterating a pseudo-random function f(x) = (x^2 + c) mod n and detecting cycles using Floyd's cycle detection. When a cycle is found, gcd(|x-y|, n) gives a factor.
It's not guaranteed – for prime n it will loop forever. Always pair it with a primality test (Miller-Rabin) before running rho.
- The function f(x) = x^2 + c mod n produces a sequence that must eventually repeat.
- Once it repeats, the difference between two pointers (tortoise and hare) often shares a factor with n.
- If the sequence repeats without hitting a factor, change c and restart.
- Expected number of steps before finding a factor is O(√p) where p is the smallest prime factor.
Factorization Using Trial Division – O(√n) Time and O(log n) Space
Most devs jump straight into trial division because it's the first thing that comes to mind: loop through numbers, check divisibility, collect factors. At scale, this pattern is a production incident waiting to happen. But for small n or one-off computations, you need to know the sharpest version of it. The variant that avoids redundant checks and unnecessary work.
The trick: strip out all factors of 2 first, then check only odd numbers up to √n. This cuts the loop iterations nearly in half. Every time you find a factor, divide the number down aggressively — this shrinks the search space on the fly. If n is still > 2 after the loop, it's a prime factor itself. That handles the edge case where n is prime. The space stays O(log n) because you only store the discovered factors, and the recursion depth is bounded by the number of distinct prime factors.
This isn't going to win any speed contests for batch processing or massive composites. But for a single query where n fits in 32-bit, this is your surgical strike. No sieves, no memory overhead, just clean brute force with a haircut.
i <= sqrt(n) and recompute sqrt inside every iteration, you'll tank performance. Compute the bound once before the loop. Also, never modify the loop's upper bound variable inside — n shrinks, but your bound stays static.Sieve of Eratosthenes – Smallest Prime Factor (SPF) for Batch Domination
When you need to factor a hundred thousand numbers in under a second, trial division is going to get you paged at 3 AM. This is where the Smallest Prime Factor sieve becomes your hammer. Precompute the smallest prime divisor for every number up to some limit N in O(N log log N), then each factorization degrades to O(log n) — just walk the chain of divisions.
The idea is dead simple: create an array spf where spf[i] holds the smallest prime that divides i. Initialize by marking spf[i]=i for all i. Then run the sieve: for each prime p, for each multiple m of p, if spf[m] is still m, set it to p. Once built, factoring n is a loop: grab spf[n], record it, divide n by spf[n], repeat until n=1. No division operations at all during factorization — just array lookups and integer division.
Memory is the trade-off. For N=10⁷, that's 40 MB just for the int array. Fine for most servers, but watch your heap. The payoff is batch throughput: you can factor millions of numbers in the time trial division takes for a hundred. This is what every competitive programmer and production batch processor reaches for when latency matters.
IntSupplier or static initializer to lazily build it. This turns every subsequent factorization into an O(log n) lookup chain.Pollard's Rho – When n Kicks Your Sieve Out of RAM
You've maxed out your SPF sieve at 10⁷ and somebody hands you a 64-bit composite. Trial division would take until retirement. Pollard's Rho is your escape hatch. It's a probabilistic algorithm that finds a non-trivial factor in O(√√n) expected time — that's O(n^¼) for the number crunchers. For a 10¹² number, you're looking at around 10⁶ iterations worst-case. Not free, but survivable.
The core idea is perversely simple: generate a pseudo-random sequence modulo n, look for collisions using Floyd's cycle detection (tortoise and hare). When you detect a cycle, compute gcd(a - b, n). If that gcd is > 1 and < n, you've found a factor. Recursively apply to both halves. Pure Miller-Rabin to test primality of the results. The whole pipeline — Miller-Rabin + Pollard's Rho — is what's behind every serious factorization library.
You don't write this from scratch for production unless you're a masochist. Java's BigInteger doesn't ship with Pollard's Rho. You'll need to roll it or pull in a lib. But understanding it is the difference between "works on my machine" and "works on the customer's 10¹⁸ invoice ID."
10^9 Calls to trial_division() Collapsed the API
- Always compute total work (calls × cost per call) before committing to an algorithm.
- Trial division is for single queries, not batch processing.
- When the input range is ≤ 10^6 per query, use SPF sieve. For larger single numbers (>10^12), use Pollard's rho.
is_prime = all(n % d != 0 for d in range(2, int(n**0.5)+1))Add Miller-Rabin primality test for n > 10^12Key takeaways
Common mistakes to avoid
3 patternsUsing trial division in a loop without early exit on prime
Forgetting to add the final prime factor after trial division loop
if n > 1: factors[n] = factors.get(n, 0) + 1 after the loop.Assuming Pollard's rho works on prime numbers
Practice These on LeetCode
Interview Questions on This Topic
How many divisors does 10^6 have? How do you compute this from its factorization?
Frequently Asked Questions
20+ years shipping performance-critical code where algorithms decide the bill. Written from production experience, not tutorials.
That's Number Theory. Mark it forged?
6 min read · try the examples if you haven't