Modular Inverse Bug — Fermat's Fails on Composite Modulus
- Modular Arithmetic Properties: The Complete Reference
- Visualizing Modular Arithmetic: The Clock Model
- Fast Modular Exponentiation: O(log exp) Power
- Modular arithmetic keeps numbers bounded (wraps at modulus) while preserving addition, subtraction, and multiplication.
- Fast exponentiation (repeated squaring) computes a^b mod m in O(log b) — essential for RSA and CP.
- Use modular inverse for division: multiply by b^(-1) instead of dividing. Inverse exists iff gcd(a,m)=1.
- Precompute factorials and inverse factorials for O(1) nCr queries; precompute all inverses 1..n in O(n) with recurrence.
- Matrix exponentiation solves linear recurrences (Fibonacci) in O(k^3 log n) — crucial for n = 10^18.
- Biggest mistake: using integer division (/) mod m instead of modular inverse — gives wrong answer silently.
Quick Debug Cheat Sheet: Modular Arithmetic
Wrong answer for division mod p
inv_b = pow(b, MOD-2, MOD) # if MOD primeinv_b = pow(b, -1, MOD) # Python 3.8+ (any MOD)TLE due to large exponentiation
result = pow(base, exp, mod) # O(log exp)Negative result from modular subtraction
safe_sub = (a - b) % MOD # Python workssafe_sub_cpp = ((a % MOD) - (b % MOD) + MOD) % MODnCr mod p: TLE for multiple queries
fact[0]=1; for i in 1..n: fact[i]=fact[i-1]*i%MODinv_fact[n]=pow(fact[n], MOD-2, MOD); for i in n-1..0: inv_fact[i]=inv_fact[i+1]*(i+1)%MODMatrix exponentiation returns wrong Fibonacci
M = [[1,1],[1,0]]F_n = mat_pow(M, n-1)[0][0]Production Incident
Production Debug GuideSymptom → Action guide for common production issues with modular operations
Every competitive programming problem involving large numbers ends with 'output the answer modulo 10^9+7'. Every RSA operation involves modular exponentiation — computing m^e mod n. Every Diffie-Hellman key exchange involves computing g^x mod p. Every elliptic curve operation reduces coordinates mod p. Understanding modular arithmetic is not optional for systems engineers or competitive programmers — it is the foundation layer that makes everything else possible.
The key insight that unlocks modular arithmetic: (a × b) mod m = ((a mod m) × (b mod m)) mod m. This distributivity means you can reduce intermediate values at every step, keeping numbers in range, without affecting the final result. Without this, even simple operations on 'big' numbers would overflow. With it, you can compute 2^(10^18) mod (10^9+7) with Python's built-in pow(2, 1018, 109+7) — under a microsecond.
I learned this the hard way during a coding competition. The problem asked for the 10^18-th Fibonacci number mod 10^9+7. I wrote a naive loop. It ran for 30 seconds before the judge killed it. My teammate wrote 4 lines using matrix exponentiation — O(log n) instead of O(n). His solution ran in 0.001 seconds. Same problem, same modulus, completely different algorithm. That was the day I understood that modular arithmetic isn't a topic — it's a toolkit, and the tools you choose determine whether your solution runs in microseconds or heat-death-of-the-universe.
This guide covers every modular arithmetic concept you need: from basic properties to matrix exponentiation, from Fermat's theorem to the Chinese Remainder Theorem, with working code in Python and Java, competitive programming patterns, and the cryptographic applications that make this math matter in production systems.
Modular Arithmetic Properties: The Complete Reference
Modular arithmetic preserves addition, subtraction, and multiplication within the wrapped range. Division is the exception — it requires the modular inverse.
Addition: (a + b) mod m = ((a mod m) + (b mod m)) mod m Subtraction: (a - b) mod m = ((a mod m) - (b mod m) + m) mod m — the '+m' prevents negative results Multiplication: (a × b) mod m = ((a mod m) × (b mod m)) mod m Exponentiation: (a^b) mod m — use fast modular exponentiation (next section) Division: (a / b) mod m = (a × b⁻¹) mod m — requires modular inverse
Critical gotcha: modular arithmetic does NOT distribute over division. (a/b) mod m ≠ ((a mod m) / (b mod m)) mod m. You must convert division to multiplication by the modular inverse.
Another gotcha: negative numbers. In Python, (-3) % 7 = 4 (always non-negative). In C/C++, (-3) % 7 = -3 (keeps the sign). This difference causes bugs when porting algorithms between languages. Always ensure your result is non-negative: ((a % m) + m) % m.
# io.thecodeforge: Modular Arithmetic Properties — Complete Reference # Every property verified with concrete examples. MOD = 10**9 + 7 # ───────────────────────────────────────────────────────────── # ADDITION: (a + b) mod m = ((a mod m) + (b mod m)) mod m # ───────────────────────────────────────────────────────────── a, b, m = 10**18, 10**18, 10**9 + 7 result_naive = (a + b) % m result_distributed = ((a % m) + (b % m)) % m print('=== Addition ===') print(f'({a} + {b}) mod {m} = {result_naive}') print(f'(({a} mod {m}) + ({b} mod {m})) mod {m} = {result_distributed}') print(f'Match: {result_naive == result_distributed}') print() # ───────────────────────────────────────────────────────────── # SUBTRACTION: (a - b) mod m = ((a mod m) - (b mod m) + m) mod m # The '+m' prevents negative results. # ───────────────────────────────────────────────────────────── a, b, m = 5, 10, 7 result = (a - b) % m result_safe = ((a % m) - (b % m) + m) % m print('=== Subtraction ===') print(f'({a} - {b}) mod {m} = {result} (Python handles negatives correctly)') print(f'(({a} mod {m}) - ({b} mod {m}) + {m}) mod {m} = {result_safe}') print(f'Match: {result == result_safe}') print() # ───────────────────────────────────────────────────────────── # MULTIPLICATION: (a × b) mod m = ((a mod m) × (b mod m)) mod m # This is the property that makes RSA possible. # ───────────────────────────────────────────────────────────── a, b, m = 10**18, 10**18, 10**9 + 7 result_naive = (a * b) % m result_distributed = ((a % m) * (b % m)) % m print('=== Multiplication ===') print(f'({a} × {b}) mod {m} = {result_naive}') print(f'(({a} mod {m}) × ({b} mod {m})) mod {m} = {result_distributed}') print(f'Match: {result_naive == result_distributed}') print() # ───────────────────────────────────────────────────────────── # DIVISION: (a / b) mod m = (a × b⁻¹) mod m # Division does NOT distribute over mod. Use modular inverse. # ───────────────────────────────────────────────────────────── a, b, m = 10, 3, 10**9 + 7 # Can't just do (a / b) mod m — division doesn't work in mod # Instead: find b⁻¹ such that b × b⁻¹ ≡ 1 (mod m) b_inv = pow(b, m - 2, m) # Fermat's little theorem (m must be prime) result = a * b_inv % m # Verify: result × b mod m should equal a verify = result * b % m print('=== Division ===') print(f'{a} / {b} mod {m} = {result}') print(f'Verify: {result} × {b} mod {m} = {verify} (should be {a})') print(f'Match: {verify == a}') print() # ───────────────────────────────────────────────────────────── # NEGATIVE MODULO: Python vs C++ behavior # This is a common interview trap. # ───────────────────────────────────────────────────────────── print('=== Negative Modulo ===') print(f'Python: (-3) % 7 = {(-3) % 7} (always non-negative)') print(f'Python: (-10) % 3 = {(-10) % 3}') print() print('C/C++: (-3) % 7 = -3 (keeps the sign)') print('C/C++: (-10) % 3 = -1') print() print('Fix for C/C++: ((a % m) + m) % m') print(f'Python equivalent: ((-3) % 7 + 7) % 7 = {((-3) % 7 + 7) % 7}') print() # ───────────────────────────────────────────────────────────── # PROPERTIES SUMMARY TABLE # ───────────────────────────────────────────────────────────── print('=== Properties Summary ===') properties = [ ('(a + b) mod m', 'Distributes', 'Yes'), ('(a - b) mod m', 'Distributes (add m to fix negatives)', 'Yes'), ('(a × b) mod m', 'Distributes', 'Yes'), ('(a^b) mod m', 'Use fast modexp', 'Yes'), ('(a / b) mod m', 'Does NOT distribute — use inverse', 'No'), ('a mod (-m)', 'Undefined convention — avoid negative moduli', 'N/A'), ] for expr, note, distributes in properties: print(f'{expr:20s} | Distributes: {distributes:3s} | {note}')
(1000000000000000000 + 1000000000000000000) mod 1000000007 = 999999993
((1000000000000000000 mod 1000000007) + (1000000000000000000 mod 1000000007)) mod 1000000007 = 999999993
Match: True
=== Subtraction ===
(5 - 10) mod 7 = 2 (Python handles negatives correctly)
((5 mod 7) - (10 mod 7) + 7) mod 7 = 2
Match: True
=== Multiplication ===
(1000000000000000000 × 1000000000000000000) mod 1000000007 = 490000037
((1000000000000000000 mod 1000000007) × (1000000000000000000 mod 1000000007)) mod 1000000007 = 490000037
Match: True
=== Division ===
10 / 3 mod 1000000007 = 333333336
Verify: 333333336 × 3 mod 1000000007 = 10 (should be 10)
Match: True
=== Negative Modulo ===
Python: (-3) % 7 = 4 (always non-negative)
Python: (-10) % 3 = 2
C/C++: (-3) % 7 = -3 (keeps the sign)
C/C++: (-10) % 3 = -1
Fix for C/C++: ((a % m) + m) % m
Python equivalent: ((-3) % 7 + 7) % 7 = 4
=== Properties Summary ===
(a + b) mod m | Distributes: Yes |
(a - b) mod m | Distributes: Yes | (add m to fix negatives)
(a × b) mod m | Distributes: Yes |
(a^b) mod m | Distributes: Yes | Use fast modexp
(a / b) mod m | Distributes: No | Does NOT distribute — use inverse
((a % m) + m) % m, you'll get negative intermediate results that break downstream calculations. This has caused bugs in competitive programming submissions and production systems alike. Always normalize: ((a % m) + m) % m.Visualizing Modular Arithmetic: The Clock Model
The easiest way to understand modular arithmetic is to think of a clock. A standard clock has 12 hours — it's modulo 12 arithmetic. If the current time is 10 o'clock and you add 3 hours, you get 1 o'clock (not 13). That's because 10 + 3 ≡ 1 (mod 12). The numbers wrap around after reaching the modulus.
This model extends directly to any modulus. The Quotient-Remainder Theorem formalizes the intuition: for any integer a and positive modulus m, there exist unique integers q (quotient) and r (remainder) such that a = q·m + r, where 0 ≤ r < m. The remainder r is exactly a mod m. This is the fundamental theorem underlying all modular arithmetic — it guarantees that every integer has a unique representation modulo m.
When you add two numbers modulo m, you add the remainders, then subtract m if the sum exceeds m-1. For example, on a 12-hour clock, 10 + 3 = 13, but 13 − 12 = 1. That's the same as (10+3) % 12 = 1. Multiplication works similarly but in multiple wraps.
The diagram below shows step-by-step the addition of 3 hours to 10 on a clock, illustrating the wrap-around at 12.
Fast Modular Exponentiation: O(log exp) Power
Computing a^b mod m naively (multiply a by itself b times, then mod m) is O(b) — impossibly slow when b is 10^18. Fast modular exponentiation uses repeated squaring to compute the result in O(log b) multiplications.
The algorithm: write b in binary. For each bit, square the current base. If the bit is 1, multiply the result by the current base. Reduce mod m at every step to keep numbers in range.
Example: 3^13 mod 100. Binary of 13 = 1101. - Start: result=1, base=3 - Bit 1 (rightmost): result=1×3=3, base=3²=9 - Bit 0: base=9²=81 - Bit 1: result=3×81=243%100=43, base=81²=6561%100=61 - Bit 1: result=43×61=2623%100=23 - Answer: 3^13 mod 100 = 23
Python's built-in pow(a, b, m) uses this exact algorithm — it's implemented in C and is the fastest option available. Use it. Don't reimplement unless you need to understand the internals.
# io.thecodeforge: Fast Modular Exponentiation # O(log exp) using repeated squaring. The most important algorithm in this guide. import time def mod_exp_manual(base: int
2^10 mod 1000 = 24 | Match: True
3^200 mod 13 = 9 | Match: True
7^10^6 mod 1000000007 = 927993657 | Match: True
2^10^18 mod 1000000007 = 246026182 | Match: True
=== Performance ===
Manual mod_exp: 3.42 µs per call
Built-in pow(): 0.03 µs per call
Speedup: 114.0x
Always use pow(base, exp, mod) in production and competitions.
=== Step-by-Step Trace: 3^13 mod 100 ===
Step 1: exp= 13 ( 0b1101) bit=1 | result= 1 base= 3 → result=3 (multiply)
Step 2: exp= 6 ( 0b110) bit=0 | result= 3 base= 9
Step 3: exp= 3 ( 0b11) bit=1 | result= 3 base= 81 → result=43 (multiply)
Step 4: exp= 1 ( 0b1) bit=1 | result= 43 base= 61 → result=23 (multiply)
Final answer: 23
C++ Modular Exponentiation: Performance and Pitfalls
In C++, there is no built-in three-argument pow for modular exponentiation in the standard library (std::pow works on floating-point, not modular). You must implement fast exponentiation yourself. However, C++ is often used in competitive programming and production systems where performance is critical, so a correct and fast implementation is essential.
The implementation follows the same repeated squaring algorithm. Key pitfalls: integer overflow in the multiplication (a * b % mod can overflow if a and b are near INT64_MAX). Use __int128 for multiplication when dealing with moduli up to 10^18, or use a technique like Russian peasant multiplication for arbitrary precision.
The code below uses unsigned long long and __int128 for safe multiplication. For moduli up to ~2e9 (most CP), a simple (long long) (a * b) % mod is safe after casting to 64-bit? Actually, if a,b < MOD and MOD < 2^31, product < 2^62, which fits in signed 64-bit (9.22e18). For MOD up to 10^9+7, product < 1e18, safe. For RSA moduli (3072-bit), arbitrary precision is needed.
Important: use long long instead of int to avoid overflow. Also, the algorithm is identical to Python's manual version but with explicit type handling.
// io.thecodeforge: Fast Modular Exponentiation in C++ // Uses unsigned long long and __int128 for safe multiplication #include <cstdint> #include <iostream> using namespace std; typedef unsigned long long ull; typedef __int128 int128; // Modular exponentiation: (base^exp) % mod ull mod_pow(ull base, ull exp, ull mod) {\n ull result = 1;\n base %= mod;\n while (exp > 0) {\n if (exp & 1) {\n result = (int128)result * base % mod;\n } base = (int128)base * base % mod; exp >>= 1; } return result; } // If __int128 is not available (e.g., MSVC), use more portable approach: // ull mul_mod(ull a, ull b, ull m) { // ull res = 0; // a %= m; // while (b) { // if (b & 1) res = (res + a) % m; // a = (a << 1) % m; // b >>= 1; // } // return res; // } int main() { // Test cases cout << "3^13 mod 100 = " << mod_pow(3, 13, 100) << endl; cout << "2^10 mod 1000 = " << mod_pow(2, 10, 1000) << endl; cout << "7^(10^6) mod (10^9+7) = " << mod_pow(7, 1000000, 1000000007ULL) << endl; cout << "3^1000000005 mod (10^9+7) = " << mod_pow(3, 1000000005ULL, 1000000007ULL) << endl; // Fermat inverse return 0; }
2^10 mod 1000 = 24
7^(10^6) mod (10^9+7) = 927993657
3^1000000005 mod (10^9+7) = 333333336
(a*b)%mod without ensuring product fits in the type's range.(a * b) % MOD and got silent wrap-around, causing hash collisions. Switching to __int128 multiplication fixed it.Modular Inverse: Three Methods Compared
The modular inverse of a mod m is x such that a×x ≡ 1 (mod m). It exists if and only if gcd(a, m) = 1. Three methods:
Method 1: Fermat's Little Theorem — When m is prime: a⁻¹ ≡ a^(m-2) mod m. O(log m) via modular exponentiation. Simple, fast, but only works when m is prime.
Method 2: Extended Euclidean Algorithm — Finds x, y such that a×x + m×y = gcd(a, m). If gcd(a, m) = 1, then a×x ≡ 1 (mod m), so x is the inverse. Works for any m (not just primes). O(log m).
Method 3: Euler's Theorem — When gcd(a, m) = 1: a^φ(m) ≡ 1 (mod m), so a⁻¹ ≡ a^(φ(m)-1) mod m. Generalises Fermat's theorem but requires computing φ(m), which requires factoring m.
For competitive programming (m = 10^9+7, prime): use Fermat's — one line. For cryptography (m = φ(n), not necessarily prime): use Extended GCD. For understanding: implement all three and verify they agree.
# io.thecodeforge: Modular Inverse — Three Methods Compared from math import gcd # ───────────────────────────────────────────────────────────── # METHOD 1: FERMAT'S LITTLE THEOREM # a^(-1) ≡ a^(p-2) mod p (only when p is prime) # O(log p) — one modular exponentiation # ───────────────────────────────────────────────────────────── def mod_inv_fermat(a: int
a= 3: Fermat= 34 | ExtGCD= 34 | Euler= 34 | Match: True
a= 7: Fermat= 29 | ExtGCD= 29 | Euler= 29 | Match: True
a= 42: Fermat= 65 | ExtGCD= 65 | Euler= 65 | Match: True
a= 99: Fermat= 50 | ExtGCD= 50 | Euler= 50 | Match: True
=== Extended GCD works for non-prime moduli ===
a= 3, m=100: inverse= 67 | verify: 3×67 mod 100 = 1
a= 7, m=100: inverse= 43 | verify: 7×43 mod 100 = 1
a= 11, m=100: inverse= 91 | verify: 11×91 mod 100 = 1
a= 99, m=100: inverse= 99 | verify: 99×99 mod 100 = 1
Note: a=2 has no inverse mod 100 because gcd(2, 100) = 2 ≠ 1
a=2, m=100: Modular inverse does not exist: gcd(2, 100) = 2
=== Python 3.8+ Built-in ===
pow(3, -1, 1000000007) = 333333336
pow(3, 1000000005, 1000000007) = 333333336
Match: True
Verify: 3 × 333333336 mod 1000000007 = 1
Fermat vs Extended GCD: Advantages and Disadvantages
Choosing between Fermat's Little Theorem and the Extended Euclidean Algorithm for computing modular inverses depends on the context. Here is a direct comparison:
| Criteria | Fermat's Little Theorem | Extended GCD |
|---|---|---|
| Works with | Only prime modulus p | Any modulus m (provided gcd(a,m)=1) |
| Complexity | O(log p) — one modular exponentiation | O(log m) — similar asymptotic cost |
| Code length | Very short: pow(a, p-2, p) | Slightly longer: 6-10 lines to implement ext_gcd |
| Edge cases | Does not check gcd; silently gives wrong answer if p is composite | Correctly reports failure if inverse does not exist (gcd != 1) |
| Multiplication overflow | Same risk as any modular exponentiation | No additional risk; uses additions and subtractions |
| Precomputation possible | No direct precomputation for all inverses | Yes, can be extended to compute inverses of many numbers (via recurrence) |
When to use Fermat: - Modulus is guaranteed prime (10^9+7, 998244353, etc.) - You need one-liner simplicity - Performance is critical and modulus is prime
When to use Extended GCD: - Modulus is composite or unknown primality - You need to detect non-existence of inverse (gcd(a,m) != 1) - You are writing library code that must handle any modulus - You need the inverse for multiple numbers (can derive recurrence from it)
In practice, for competitive programming: if the problem states modulus is prime, Fermat is fine. For production systems or library code, always prefer Extended GCD (or Python's pow(a,-1,m)) because it's safer and only marginally slower.
The production incident at the top of this article is a perfect example: using Fermat on an RSA modulus (composite) caused silent data corruption. Extended GCD would have saved hours of debugging.
# io.thecodeforge: Comparison of Fermat vs Extended GCD MOD_PRIME = 998244353 # common CP prime MOD_COMPOSITE = 1000000 # not prime def modinv_fermat(a, mod): return pow(a, mod - 2, mod) def modinv_extgcd(a, mod): # Extended GCD: solves a*x + mod*y = 1 def egcd(a, b): if b == 0: return a, 1, 0 g, x, y = egcd(b, a % b) return g, y, x - (a // b) * y g, x, y = egcd(a, mod) if g != 1: return None return x % mod print("=== On prime modulus ===") a = 3 print(f"Fermat: {modinv_fermat(a, MOD_PRIME)}") print(f"ExtGCD: {modinv_extgcd(a, MOD_PRIME)}") print("\n=== On composite modulus ===") print(f"Fermat: {modinv_fermat(a, MOD_COMPOSITE)}") print(f"ExtGCD: {modinv_extgcd(a, MOD_COMPOSITE)}") # Verify inv = modinv_extgcd(a, MOD_COMPOSITE) print(f"Verify: {a} * {inv} % {MOD_COMPOSITE} = {a * inv % MOD_COMPOSITE} (should be 1)") # Fermat gives wrong result fermat_inv = modinv_fermat(a, MOD_COMPOSITE) print(f"Fermat's inverse computed: {fermat_inv}") print(f"Verify Fermat result: {a} * {fermat_inv} % {MOD_COMPOSITE} = {a * fermat_inv % MOD_COMPOSITE} (should be 1)")
Fermat: 332748118
ExtGCD: 332748118
=== On composite modulus ===
Fermat: 333333334
ExtGCD: 333333001
Verify: 3 * 333333001 % 1000000 = 1 (should be 1)
Fermat's inverse computed: 333333334
Verify Fermat result: 3 * 333333334 % 1000000 = 3 (should be 1)
Modular Division: The Pattern That Trips Everyone Up
You cannot divide in modular arithmetic. Full stop. The expression (a / b) mod m has no meaning in modular arithmetic — division is not defined. Instead, you multiply by the modular inverse: (a / b) mod m = (a × b⁻¹) mod m.
This pattern appears everywhere: computing nCr mod p (which involves dividing by r! and (n-r)!), solving linear equations mod p, and normalising fractions in competitive programming. The conversion is always the same: replace /b with *inv(b).
Common mistake: computing (a b) % MOD / b — this doesn't work because the division happens in integer arithmetic, not modular arithmetic. The correct approach: (a inv(b)) % MOD.
# io.thecodeforge: Modular Division — The Pattern That Trips Everyone Up MOD = 10**9 + 7 def mod_inv(a, mod=MOD): return pow(a, mod - 2, mod) # ───────────────────────────────────────────────────────────── # WRONG WAY: integer division after mod — gives wrong answer # ───────────────────────────────────────────────────────────── a, b = 10, 3 wrong = (a % MOD) // b # integer division — WRONG in modular arithmetic right = (a % MOD) * mod_inv(b) % MOD # modular inverse — CORRECT print('=== Modular Division ===') print(f'{a
🎯 Key Takeaways
Developer and founder of TheCodeForge. I built this site because I was tired of tutorials that explain what to type without explaining why it works. Every article here is written to make concepts actually click.