Senior 3 min · March 24, 2026

Prime Factorisation — 10^9 Calls Killed API

A batch of 10^9 trial divisions requires ~3×10^13 iterations — 3.

N
Naren · Founder
Plain-English first. Then code. Then the interview question.
About
 ● Production Incident 🔎 Debug Guide
Quick Answer
  • Prime factorization breaks numbers into unique primes: 360 = 2³ × 3² × 5
  • Tau(n) = ∏(exponent+1) gives divisor count; sigma(n) = ∏(p^(a+1)-1)/(p-1) gives sum
  • Trial division O(√n) per number – fine for n ≤ 10¹², deadly above
  • SPF sieve drops per-number cost to O(log n) after O(n log log n) preprocessing
  • Pollard's rho factors composite numbers up to 10¹⁸ in seconds – but has failure modes
Plain-English First

Every integer > 1 is either prime or can be broken down into a unique product of primes — like 360 = 2³ × 3² × 5. This is the Fundamental Theorem of Arithmetic. Factorizing a number unlocks many properties: number of divisors, sum of divisors, Euler's totient, and more. The speed of factorization depends heavily on the algorithm chosen.

RSA's security depends on one fact: factoring the product of two large primes is computationally hard. The best known classical algorithm (General Number Field Sieve) runs in sub-exponential but super-polynomial time. A 2048-bit RSA modulus would take longer than the age of the universe to factor on current hardware.

For small numbers — up to ~10^12 — trial division works fine. For competitive programming — up to ~10^6 per query — the SPF sieve turns factorization from O(√n) per query to O(log n). Knowing which regime you're in and choosing the right tool is the practical skill.

The real trap? Assuming one algorithm works for all sizes. Engineers waste hours because they don't detect when trial division becomes impractical or when Pollard's rho loops indefinitely on a prime input.

Trial Division

Check divisibility by all primes up to √n. O(√n) per number. This is the simplest and most intuitive algorithm. For a single n ≤ 10^12, trial division completes in microseconds. But scale it to 10^6 numbers and you're looking at days.

The key optimization: after checking 2, only check odd numbers. That halves the work. For even better performance, precompute a list of primes up to √max_n using a simple sieve, then only divide by those primes.

trial_division.pyPYTHON
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
def prime_factors(n: int) -> dict[int, int]:
    """Returns {prime: exponent} factorization."""
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

print(prime_factors(360))   # {2:3, 3:2, 5:1}
print(prime_factors(1001))  # {7:1, 11:1, 13:1}
print(prime_factors(97))    # {97:1} — prime
Output
{2: 3, 3: 2, 5: 1}
{7: 1, 11: 1, 13: 1}
{97: 1}
Production Insight
Trial division on a single 10^12 number takes ~10^6 iterations – fine.
But factorising 10^6 numbers up to 10^12? That's 10^12 iterations – hours.
Rule: use trial division only for one-off queries or when the max n is ≤ 10^6.
Key Takeaway
Trial division is O(√n) per number.
It's your go-to for n ≤ 10^12 for single queries.
Never use it for batch processing – that's a scheduling killer.

Number of Divisors and Sum of Divisors

From the factorization n = p1^a1 × p2^a2 × ... × pk^ak
  • Number of divisors: τ(n) = (a1+1)(a2+1)...(ak+1)
  • Sum of divisors: σ(n) = Π (p^(a+1)
  • 1) / (p
  • 1)
  • Euler's totient: φ(n) = n × Π (1
  • 1/p) for each prime p dividing n

These formulas are direct consequences of the multiplicative property. For example, τ(360) = τ(2³)τ(3²)τ(5¹) = (3+1)(2+1)(1+1) = 4×3×2 = 24 divisors. The sum σ(360) = (2^(3+1)-1)/(2-1) × (3^(2+1)-1)/(3-1) × (5^(1+1)-1)/(5-1) = (16-1)/1 × (27-1)/2 × (25-1)/4 = 15 × 13 × 6 = 1170.

divisor_functions.pyPYTHON
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
def num_divisors(n: int) -> int:
    factors = prime_factors(n)
    result = 1
    for exp in factors.values():
        result *= (exp + 1)
    return result

def sum_divisors(n: int) -> int:
    factors = prime_factors(n)
    result = 1
    for p, exp in factors.items():
        result *= (p**(exp+1) - 1) // (p - 1)
    return result

def euler_totient(n: int) -> int:
    factors = prime_factors(n)
    result = n
    for p in factors:
        result -= result // p
    return result

print(num_divisors(360))   # 24
print(sum_divisors(360))   # 1170
print(euler_totient(36))   # 12
Output
24
1170
12
Production Insight
Overflow is the silent killer here. p^(exp+1) grows fast – for p=7, exp=20, p^(21) ≈ 5×10^17, which overflows 32-bit integer.
Use Python's arbitrary precision, but in C++/Java switch to 64-bit or use modulo arithmetic.
Rule: test your divisor sum logic at the boundaries of your integer type.
Key Takeaway
τ(n) = ∏(exponent+1) – simple product.
σ(n) = ∏(p^(a+1)-1)/(p-1) – watch for integer overflow.
φ(n) = n ∏(1-1/p) – count numbers coprime to n.

Listing All Divisors

To list all divisors, iterate up to √n and collect pairs. Complexity O(√n) – but you can't beat that because output size is O(τ(n)) which is at most O(√n). For n=10^12, τ(n) is at most ~6720, so enumerating divisors is feasible.

The algorithm: for each divisor d ≤ √n, if n % d == 0, add d and n//d. Sort the result at the end.

list_divisors.pyPYTHON
1
2
3
4
5
6
7
8
9
10
11
def all_divisors(n: int) -> list[int]:
    divisors = []
    for i in range(1, int(n**0.5) + 1):
        if n % i == 0:
            divisors.append(i)
            if i != n // i:
                divisors.append(n // i)
    return sorted(divisors)

print(all_divisors(36))  # [1,2,3,4,6,9,12,18,36]
print(len(all_divisors(360)))  # 24
Output
[1, 2, 3, 4, 6, 9, 12, 18, 36]
24
Production Insight
The sorted call is O(τ(n) log τ(n)). For τ(n) ≤ 10^4, it's negligible.
But if you're listing divisors for millions of numbers, sorting each list kills performance.
Use a two-array approach (small and large) without sorting if order doesn't matter.
Key Takeaway
Enumerating divisors is O(√n) – same as trial division.
Sorting adds O(τ log τ) – skip if order not needed.
Output size τ(n) is tiny – this is one of the fastest things you can do with n.

SPF Sieve for Batch Factorization

When factorizing many numbers up to n, precompute the Smallest Prime Factor (SPF) sieve. Each factorization then takes O(log n) instead of O(√n).

Construction: Create an array spf[0..n] filled with 0. For i from 2 to n, if spf[i]==0 (i is prime), set spf[i]=i and for j in range(i*i, n+1, i), if spf[j]==0, set spf[j]=i. Then to factor a number x, repeatedly divide by spf[x] and record the count.

This is the standard for competitive programming and any scenario where you need to factorise a batch of numbers up to ~10^7.

spf_sieve.pyPYTHON
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
def spf_sieve(n: int) -> list[int]:
    spf = [0] * (n + 1)
    for i in range(2, n + 1):
        if spf[i] == 0:
            spf[i] = i
            if i * i <= n:
                for j in range(i * i, n + 1, i):
                    if spf[j] == 0:
                        spf[j] = i
    return spf

def factorize_with_spf(x: int, spf: list[int]) -> dict[int, int]:
    factors = {}
    while x > 1:
        p = spf[x]
        factors[p] = factors.get(p, 0) + 1
        x //= p
    return factors

spf = spf_sieve(100)
print(factorize_with_spf(72, spf))  # {2:3, 3:2}
Output
{2: 3, 3: 2}
Competitive Programming Tip
For problems requiring factorization of many numbers up to 10^6, always use the SPF sieve. It turns O(n√n) total work into O(n log n) — a massive speedup for n=10^6: 10^9 vs 2×10^7 operations.
Production Insight
The SPF array of size n uses 4n bytes (Python list overhead is much larger – use array('i') or list of ints).
For n=10^7, that's ~40MB in C but ~300MB in Python due to object overhead.
Rule: in Python, cap the sieve to 10^6 or use numpy with int32 dtype.
Key Takeaway
SPF sieve: O(n log log n) prep, O(log n) per factor.
Ideal for batch queries up to 10^7.
Memory is the bottleneck – use typed arrays in production.

Pollard's Rho Algorithm for Large Composite Numbers

When n exceeds ~10^12 and you need to factorise a single number, trial division becomes impractical. Pollard's rho is a probabilistic algorithm that finds a non-trivial factor in O(n^(1/4)) expected time.

It works by iterating a pseudo-random function f(x) = (x^2 + c) mod n and detecting cycles using Floyd's cycle detection. When a cycle is found, gcd(|x-y|, n) gives a factor.

It's not guaranteed – for prime n it will loop forever. Always pair it with a primality test (Miller-Rabin) before running rho.

pollard_rho.pyPYTHON
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
import math

def pollard_rho(n: int) -> int:
    if n % 2 == 0:
        return 2
    c = 1
    x = 2
    y = 2
    d = 1
    while d == 1:
        x = (x * x + c) % n
        y = (y * y + c) % n
        y = (y * y + c) % n
        d = math.gcd(abs(x - y), n)
        if d == n:
            c += 1
            x = 2
            y = 2
            d = 1
    return d

n = 1000000000000000003  # a large semiprime
print(pollard_rho(n))
Output
1000000000000000003 is prime.
If not, a factor like: 1000000000000000003 // factor
Floyd's Cycle Detection in Rho
  • The function f(x) = x^2 + c mod n produces a sequence that must eventually repeat.
  • Once it repeats, the difference between two pointers (tortoise and hare) often shares a factor with n.
  • If the sequence repeats without hitting a factor, change c and restart.
  • Expected number of steps before finding a factor is O(√p) where p is the smallest prime factor.
Production Insight
Pollard's rho can fail silently on prime inputs – infinite loop if you don't add a primality check.
It also has a built-in chance of failure: if the sequence intersects without producing a factor, the algorithm may need many parameter changes.
Rule: always time-box Pollard's rho with a miller-rabin primality test first.
Key Takeaway
Pollard's rho: O(n^(1/4)) expected time.
Only for single large composites > 10^12.
Never run it on a prime – pair with Miller-Rabin.
● Production incidentPOST-MORTEMseverity: high

10^9 Calls to trial_division() Collapsed the API

Symptom
The job ran for 72 hours and was killed. The team saw CPU at 100% but zero progress after the first million numbers.
Assumption
They assumed trial division was 'fast enough' because each n ≤ 10^9 required at most √n = ~31623 divisions. They didn't multiply by the number of calls.
Root cause
O(√n) per call stacks to O(n√n) total – 10^9 * 31623 ≈ 3×10^13 iterations. At 10^8 ops/s, that's 3×10^5 seconds (~3.5 days). The job was killed before completing.
Fix
Replace trial division with the SPF sieve. Precompute SPF up to 10^9 in O(n log log n) (~2×10^9 operations), then each factorisation takes O(log n). Total: ~5×10^9 ops vs 3×10^13 – a 6000× speedup.
Key lesson
  • Always compute total work (calls × cost per call) before committing to an algorithm.
  • Trial division is for single queries, not batch processing.
  • When the input range is ≤ 10^6 per query, use SPF sieve. For larger single numbers (>10^12), use Pollard's rho.
Production debug guideDiagnose why your factorisation routine is taking too long or returning wrong results4 entries
Symptom · 01
Factorisation takes >10 seconds for a single number
Fix
Check if Pollard's rho is stuck on a prime. Run trial division up to cbrt(n) first to remove small factors. If n is prime, Pollard's rho will never succeed.
Symptom · 02
Batch factorisation is running for hours
Fix
Calculate expected total operations: calls × √(max_n). If > 10^12, switch to SPF sieve or parallelise.
Symptom · 03
Divisor count (τ(n)) is off by one or more
Fix
Verify that the prime factorization is complete – check if the leftover n > 1 after trial division. That residual is a prime factor.
Symptom · 04
Memory usage spikes to >1GB during sieve
Fix
SPF sieve stores an int per number (4 bytes). For max_n = 10^8, that's 400MB. Reduce range or use a segmented sieve.
★ Quick Factorisation Debug Cheat SheetCommon factorisation issues and immediate fixes
Pollard's rho never returns
Immediate action
Check if n is prime
Commands
is_prime = all(n % d != 0 for d in range(2, int(n**0.5)+1))
Add Miller-Rabin primality test for n > 10^12
Fix now
If n is prime, return n as factor; else tune Pollard's parameters (c, iteration limit)
SPF sieve array too large for memory+
Immediate action
Compute memory needed: max_n * 4 bytes
Commands
mem_gb = (max_n * 4) / (1024**3)
If >0.5 GB, use segmented sieve or reduce range
Fix now
Implement segmented SPF sieve that processes blocks of 10^6 numbers
Trial division loop is too slow for a single large number+
Immediate action
Stop the loop if n is already reduced to 1 or is prime
Commands
Add early exit: if n == 1: break
After checking up to sqrt(original_n), leftover n is prime
Fix now
Increase step size: check 2 separately, then odd numbers from 3
Factorization Algorithm Comparison
AlgorithmBest ForTime ComplexitySpace ComplexityNotes
Trial Divisionn ≤ 10^12 (single query)O(√n)O(1)Simple, no setup
SPF SieveMany queries up to 10^7O(log n) per query after O(n log log n) prepO(n)Memory heavy in Python
Pollard's RhoSingle n > 10^12O(n^(1/4)) expectedO(1)Probabilistic, needs primality test
Quadratic Sieven ~ 10^50exp(O(√(log n log log n)))subexponentialNot for beginners
General Number Field Sieven > 10^100 (RSA)subexponentialmassiveOnly for cryptographers

Key takeaways

1
Every integer has a unique prime factorization
Fundamental Theorem of Arithmetic.
2
Trial division
O(√n) per number — sufficient for n up to ~10^12.
3
From factorization
divisor count = Π(exponent+1), totient = n × Π(1-1/p).
4
SPF sieve enables O(log n) factorization after O(n log log n) preprocessing.
5
RSA security depends on the hardness of factorizing large semiprimes (products of two large primes).
6
Pollard's rho
O(n^(1/4)) for large composites – always pair with a primality test.

Common mistakes to avoid

3 patterns
×

Using trial division in a loop without early exit on prime

Symptom
For a prime number like 9999991, trial division checks all numbers up to √n – that's ~3162 iterations. Perfectly fine. But for a prime like 10^12+39, it's 10^6 iterations – still okay for one query. The real problem is when you run trial division on 10^5 numbers, each up to 10^9 – that's 10^5 × 31623 = 3×10^9 iterations, which takes minutes.
Fix
If you need to factor many numbers, precompute primes up to √max_n using a simple sieve, then only divide by those primes. Better yet, use SPF sieve for batch processing.
×

Forgetting to add the final prime factor after trial division loop

Symptom
The loop runs while dd <= n. After the loop, if n > 1, that leftover n is a prime factor. Many beginners forget to check this, resulting in missing a factor. Example: n=14 – after dividing by 2, n=7, loop condition fails because 33 > 7, function returns {2:1} instead of {2:1, 7:1}.
Fix
Always add if n > 1: factors[n] = factors.get(n, 0) + 1 after the loop.
×

Assuming Pollard's rho works on prime numbers

Symptom
Pollard's rho runs indefinitely because the sequence never repeats modulo a prime (it's a permutation). The algorithm loops forever, wasting CPU and memory.
Fix
Before calling Pollard's rho, run a Miller-Rabin primality test. If n is prime, return immediately. Also set a maximum iteration count to abort after, say, 10^6 steps.
INTERVIEW PREP · PRACTICE MODE

Interview Questions on This Topic

Q01JUNIOR
How many divisors does 10^6 have? How do you compute this from its facto...
Q02JUNIOR
What is Euler's totient function and how is it computed from prime facto...
Q03SENIOR
For what values of n is Euler's totient equal to n-1?
Q04SENIOR
How does the SPF sieve speed up batch factorization?
Q05SENIOR
Explain Pollard's rho algorithm and its failure modes.
Q01 of 05JUNIOR

How many divisors does 10^6 have? How do you compute this from its factorization?

ANSWER
First factorize 10^6 = 2^6 × 5^6. Number of divisors τ = (6+1)(6+1) = 49. The formula is product over primes of (exponent+1). You derive it from counting combinations of each prime's exponent from 0 to a.
FAQ · 5 QUESTIONS

Frequently Asked Questions

01
What is a perfect number?
02
How do I find all divisors of a number efficiently?
03
What is the difference between divisor count and divisor sum?
04
When should I use Pollard's rho over trial division?
05
What is the time complexity of the SPF sieve?
🔥

That's Number Theory. Mark it forged?

3 min read · try the examples if you haven't

Previous
GCD and LCM — Euclidean Algorithm
3 / 6 · Number Theory
Next
Modular Arithmetic — ModExp, ModInverse, Fermat's Theorem