RSA — PKCS#1 v1.5 Padding: The Bleichenbacher Oracle
- RSA key generation: p, q (random primes) → n = p×q → φ(n) = (p-1)(q-1) → e=65537 → d = e⁻¹ mod φ(n). Public key: (n, e). Private key: (n, d). Security = hardness of factoring n.
- Encrypt: c = m^e mod n. Decrypt: m = c^d mod n. Works because e×d ≡ 1 (mod φ(n)) and Euler's theorem. RSA can only encrypt messages < n (256 bytes for 2048-bit key).
- e = 65537 is the standard: prime, binary weight 2 (only 17 multiplications — fast), large enough to resist Hastad and Coppersmith attacks. Never use e=3 in production.
- RSA = asymmetric encryption using modular exponentiation: c = m^e mod n, decrypt: m = c^d mod n.
- Security rests on factoring n = p × q. No efficient classical algorithm exists for large (2048+ bit) keys.
- Key generation: choose primes p,q → n = p×q → φ(n) = (p-1)(q-1) → e=65537 → d = e⁻¹ mod φ(n).
- Performance: encryption with e=65537 uses 17 modular multiplications; decryption with CRT is ~4x faster.
- Production failure: textbook RSA (m^e mod n without padding) is deterministic and malleable — attackers recover plaintext.
- Biggest mistake: encrypting data directly with RSA instead of hybrid encryption (RSA encrypts AES key, AES encrypts data).
Quick RSA Debug Cheat Sheet
Verify key size and public exponent
openssl rsa -in key.pem -text -noout | grep -E 'Public-Key|Exponent'openssl rsa -in key.pem -modulus -noout | wc -c # (bytes/2 = key bits)Check if ciphertext is the expected size
echo -n 'ciphertext_base64' | base64 -d | wc -copenssl rsa -in key.pem -text -noout | grep 'Private-Key'Test OAEP padding programmatically (Python)
python3 -c "
from cryptography.hazmat.primitives.asymmetric import rsa, padding
from cryptography.hazmat.primitives import hashes
key = rsa.generate_private_key(65537, 2048)
pub = key.public_key()
c1 = pub.encrypt(b'test', padding.OAEP(mgf=padding.MGF1(hashes.SHA256()), algorithm=hashes.SHA256(), label=None))
c2 = pub.encrypt(b'test', padding.OAEP(mgf=padding.MGF1(hashes.SHA256()), algorithm=hashes.SHA256(), label=None))
print('OAEP is probabilistic:', c1 != c2)
"Detect if private key contains CRT parameters
openssl rsa -in key.pem -text -noout | grep -E 'prime1|prime2|exponent1|exponent2|coefficient'python3 -c "
from cryptography.hazmat.primitives.serialization import load_pem_private_key
from cryptography.hazmat.backends import default_backend
with open('key.pem', 'rb') as f:
key = load_pem_private_key(f.read(), password=None, backend=default_backend())
print('CRT components present:', hasattr(key, 'private_numbers'))
"Production Incident
Production Debug GuideQuick reference for diagnosing RSA misconfigurations in production
RSA, published in 1977 by Rivest, Shamir, and Adleman, solved the key distribution problem that had plagued cryptography since its inception. Before RSA, encrypting a message required both parties to already share a secret key — but how do you securely exchange that key over an insecure channel? It's a chicken-and-egg problem. RSA breaks the deadlock with asymmetric keys: the public key encrypts, the private key decrypts, and you never need to share the private key with anyone.
The mathematical foundation: multiplying two large primes p and q is easy (milliseconds), but factoring their product n = p × q is computationally infeasible for large enough primes. This asymmetry — easy to compute one direction, hard to reverse — is the trapdoor that makes public key cryptography possible.
RSA secures every HTTPS connection (TLS handshakes use RSA or its elliptic-curve cousin ECDHE), every SSH session (host key verification), every S/MIME email (message encryption and signing), every code signing certificate (proving software authenticity), and every JWT signed with RS256. Understanding RSA is understanding the mathematical foundation of internet security.
It is also a masterclass in applied number theory: Euler's totient and Fermat's little theorem all play essential roles. This guide covers all of them — with working Python and Java code you can run today, production padding schemes that actually secure your data, digital signatures that prove authorship, and the attacks that break RSA when implemented incorrectly.
By the end you'll understand RSA from first principles, know how to use it correctly in production code, and recognise the exact mistakes that turn a theoretically secure algorithm into a practically broken system.
The Mathematical Foundations: Why RSA Works
Before writing any code, you need to understand four mathematical concepts. If you skip this section, the code will work but you won't understand why — and you won't be able to debug it when something goes wrong.
Prime Numbers — Numbers divisible only by 1 and themselves (2, 3, 5, 7, 11, ...). RSA needs very large primes — typically 1024 bits each for a 2048-bit key. Finding large primes is fast (Miller-Rabin probabilistic test). Factoring the product of two large primes is slow. That asymmetry is everything.
Modular Arithmetic — Clock arithmetic. 17 mod 12 = 5, because 17 wraps around the clock face and lands on 5. In RSA, all arithmetic is mod n, where n is the product of two primes. The 'wraparound' property is what makes the encryption reversible with the correct key but irreversible without it.
Euler's Totient Function φ(n) — Counts how many integers from 1 to n are coprime to n (share no common factors). For a prime p, φ(p) = p - 1 (every number from 1 to p-1 is coprime to p). For a product of two primes n = p × q, φ(n) = (p-1)(q-1). This is the value that links the public exponent e and the private exponent d: e × d ≡ 1 (mod φ(n)).
Modular Exponentiation — Computing m^e mod n efficiently. Naive approach (compute m^e, then mod n) is impossibly slow for large exponents. Python's pow(m, e, n) uses the square-and-multiply algorithm, which computes the result in O(log e) multiplications instead of O(e). This is why RSA encryption is fast despite using enormous exponents.
# io.thecodeforge: RSA Mathematical Foundations # These are the building blocks — understand these before touching RSA code. import random from math import gcd # ───────────────────────────────────────────────────────────── # 1. MILLER-RABIN PRIMALITY TEST # Probabilistic test: if it says 'composite', it's definitely composite. # If it says 'probably prime', the error probability is < 4^(-k). # With k=20 rounds, error probability < 10^(-12). Good enough. # ───────────────────────────────────────────────────────────── def is_prime_miller_rabin(n, k=20): if n < 2: return False if n == 2 or n == 3: return True if n % 2 == 0: return False # Write n-1 = 2^r * d r, d = 0, n - 1 while d % 2 == 0: r += 1 d //= 2 for _ in range(k): a = random.randrange(2, n - 1) x = pow(a, d, n) if x == 1 or x == n - 1: continue for _ in range(r - 1): x = pow(x, 2, n) if x == n - 1: break else: return False # definitely composite return True # probably prime (error < 4^(-k)) # ───────────────────────────────────────────────────────────── # 2. EULER'S TOTIENT FUNCTION # φ(n) = count of integers from 1 to n that are coprime to n. # For prime p: φ(p) = p-1 # For n = p*q (two primes): φ(n) = (p-1)(q-1) # ───────────────────────────────────────────────────────────── def euler_totient_two_primes(p, q): """Compute φ(n) when n = p * q and p, q are prime.""" return (p - 1) * (q - 1) def euler_totient_naive(n): """Compute φ(n) by brute force — for educational purposes only.""" count = 0 for i in range(1, n): if gcd(i, n) == 1: count += 1 return count # Verify: φ(61*53) = φ(3233) = 60*52 = 3120 assert euler_totient_two_primes(61, 53) == 3120 assert euler_totient_naive(3233) == 3120 # ───────────────────────────────────────────────────────────── # 3. EXTENDED EUCLIDEAN ALGORITHM # Finds gcd(a,b) AND the coefficients x,y such that ax + by = gcd(a,b). # Used to compute d = e^(-1) mod φ(n) — the modular inverse. # ───────────────────────────────────────────────────────────── def extended_gcd(a, b): """Returns (gcd, x, y) such that a*x + b*y = gcd(a, b).""" if b == 0: return a, 1, 0 g, x, y = extended_gcd(b, a % b) return g, y, x - (a // b) * y def mod_inverse(e, phi): """Compute d = e^(-1) mod phi using extended Euclidean algorithm.""" g, d, _ = extended_gcd(e, phi) if g != 1: raise ValueError(f"Modular inverse does not exist: gcd({e}, {phi}) = {g}") return d % phi # Verify: 17^(-1) mod 3120 = 2753 assert mod_inverse(17, 3120) == 2753 # Verify: 17 * 2753 mod 3120 = 1 (the inverse property) assert (17 * 2753) % 3120 == 1 # ───────────────────────────────────────────────────────────── # 4. MODULAR EXPONENTIATION # Python's pow(b, e, m) uses square-and-multiply: O(log e) multiplications. # This is why RSA is fast despite enormous exponents. # ───────────────────────────────────────────────────────────── # Demo: 2^1000 mod 13 — computed instantly result = pow(2, 1000, 13) print(f'2^1000 mod 13 = {result}') # 3 # Without modular exponentiation, 2^1000 is a 302-digit number. # With pow(b, e, m), it's computed in ~10 multiplications. print('\n=== All mathematical foundations verified ===') print(f'Miller-Rabin: 3233 is prime? {is_prime_miller_rabin(3233)} (False — it is 61*53)') print(f'Miller-Rabin: 61 is prime? {is_prime_miller_rabin(61)} (True)') print(f'φ(3233) = {euler_totient_two_primes(61, 53)}') print(f'17^(-1) mod 3120 = {mod_inverse(17, 3120)}') print(f'65^2753 mod 3233 = {pow(65, 2753, 3233)} (decryption of ciphertext 2790)')
=== All mathematical foundations verified ===
Miller-Rabin: 3233 is prime? False (False — it is 61*53)
Miller-Rabin: 61 is prime? True
φ(3233) = 3120
17^(-1) mod 3120 = 2753
65^2753 mod 3233 = 65 (decryption of ciphertext 2790)
pow(m, e, n) which uses square-and-multiply (O(log e) multiplications). Latency dropped from 8 seconds to 0.3 milliseconds.Key Generation: From Primes to Public and Private Keys
RSA key generation is conceptually simple but every step matters. Getting any step wrong — using non-prime numbers, choosing a weak public exponent, computing the modular inverse incorrectly — produces keys that either don't work or are trivially breakable.
The process: (1) generate two large random primes p and q, (2) compute n = p × q (the modulus), (3) compute φ(n) = (p-1)(q-1), (4) choose public exponent e (usually 65537), (5) compute private exponent d = e⁻¹ mod φ(n), (6) public key is (n, e), private key is (n, d).
The primes must be large (1024 bits each for a 2048-bit key), random (not sequential or predictable), and independent (p ≠ q). The modulus n is the public-facing value — its size determines the key strength. The private key d is never shared; it's derived from φ(n), which requires knowing p and q, which requires factoring n.
A subtle but critical detail: e and φ(n) must be coprime (gcd(e, φ(n))=1), otherwise the modular inverse d doesn't exist. With e=65537, this condition fails for only 1 in ~10^5 random primes — negligible.
In production, keys are generated by crypto libraries (OpenSSL, cryptography, Java KeyPairGenerator) which handle prime selection, randomness, and key storage formats (PKCS#8, X.509). Never implement key generation from scratch in production — use the library.
# io.thecodeforge: RSA Key Generation # Generates RSA key pair from first principles. # For production use: from cryptography.hazmat.primitives.asymmetric import rsa import random from math import gcd def is_prime_miller_rabin(n, k=20): if n < 2: return False if n == 2 or n == 3: return True if n % 2 == 0: return False r, d = 0, n - 1 while d % 2 == 0: r += 1 d //= 2 for _ in range(k): a = random.randrange(2, n - 1) x = pow(a, d, n) if x == 1 or x == n - 1: continue for _ in range(r - 1): x = pow(x, 2, n) if x == n - 1: break else: return False return True def generate_prime(bits=512): """Generate a random prime of the specified bit length.""" while True: # Ensure the number has exactly 'bits' bits (set MSB and LSB) p = random.getrandbits(bits) p |= (1 << bits - 1) | 1 # set MSB (correct bit length) and LSB (odd) if is_prime_miller_rabin(p): return p def extended_gcd(a, b): if b == 0: return a, 1, 0 g, x, y = extended_gcd(b, a % b) return g, y, x - (a // b) * y def mod_inverse(e, phi): g, d, _ = extended_gcd(e, phi) if g != 1: raise ValueError(f"Modular inverse does not exist: gcd({e}, {phi}) = {g}") return d % phi def rsa_keygen(bits=1024): """ Generate RSA key pair. Args: bits: Total key size in bits. Each prime is bits/2 bits. 1024-bit total = two 512-bit primes (educational only). 2048-bit total = two 1024-bit primes (production minimum). 4096-bit total = two 2048-bit primes (conservative). Returns: (public_key, private_key) where each is a tuple (n, exponent). """ print(f'Generating {bits}-bit RSA key pair...') print(f'Each prime will be {bits // 2} bits.') # Step 1: Generate two distinct large primes p = generate_prime(bits // 2) q = generate_prime(bits // 2) while q == p: # ensure p ≠ q q = generate_prime(bits // 2) print(f'Prime p: {len(bin(p))-2} bits') print(f'Prime q: {len(bin(q))-2} bits') # Step 2: Compute modulus n = p * q n = p * q print(f'Modulus n: {len(bin(n))-2} bits') # Step 3: Compute Euler's totient φ(n) = (p-1)(q-1) phi = (p - 1) * (q - 1) # Step 4: Choose public exponent e # 65537 = 2^16 + 1 = 0x10001 (binary: 10000000000000001) # - Prime (so gcd(e, φ(n)) = 1 for almost all φ(n)) # - Binary weight 2 (only two 1-bits → fast exponentiation: 16 squarings + 1 multiply) # - Large enough to avoid small-exponent attacks (e=3 is vulnerable) # - Small enough that encryption is fast e = 65537 # Verify gcd(e, φ(n)) = 1 (required for modular inverse to exist) if gcd(e, phi) != 1: # Extremely rare — regenerate primes return rsa_keygen(bits) # Step 5: Compute private exponent d = e^(-1) mod φ(n) d = mod_inverse(e, phi) # Step 6: Return key pair public_key = (n, e) private_key = (n, d) print(f'Public key: (n={len(str(n))} digits, e={e})') print(f'Private key: (n=..., d={len(str(d))} digits)') return public_key, private_key # ───────────────────────────────────────────────────────────── # DEMO: Generate a small key pair (educational — NOT secure) # ───────────────────────────────────────────────────────────── # Small primes for demonstration (DO NOT USE IN PRODUCTION) p, q = 61, 53 n = p * q # 3233 phi = (p - 1) * (q - 1) # 3120 e = 65537 % phi # 65537 mod 3120 ≠ 1 but gcd(e,phi) must be 1. Use e=17 instead. e = 17 assert gcd(e, phi) == 1, 'e must be coprime to φ(n)' d = mod_inverse(e, phi) print('\n=== Textbook RSA Example (p=61, q=53) ===') print(f'n = {n}, φ(n) = {phi}') print(f'Public key: (n={n}, e={e})') print(f'Private key: (n={n}, d={d})') print(f'Verify: e * d mod φ(n) = {(e * d) % phi} (should be 1)')
n = 3233, φ(n) = 3120
Public key: (n=3233, e=17)
Private key: (n=3233, d=2753)
Verify: e * d mod φ(n) = 1 (should be 1)
random.randrange() for prime generation. The random module is seeded with system time and is predictable — an attacker who observed key generation time could narrow the prime space to 2^32 possibilities, making factoring trivial.secrets.randbits() or os.urandom(), or better, use the cryptography library which uses CSPRNG internally.random for anything security-related. Use secrets or os.urandom.Encryption and Decryption: The Core Operations
RSA encryption: c = m^e mod n. RSA decryption: m = c^d mod n. That's the entire algorithm. Everything else is implementation detail.
The sender takes the message m (converted to an integer), raises it to the power e (the public exponent), and takes the remainder when divided by n (the modulus). The result is the ciphertext c. The receiver raises c to the power d (the private exponent) and takes the remainder when divided by n. The result is the original message m.
The reason this works: e × d ≡ 1 (mod φ(n)). So m^(e×d) = m^(1 + k×φ(n)) = m × (m^φ(n))^k. By Euler's theorem, m^φ(n) ≡ 1 (mod n). So m × 1^k = m. The message comes back.
Critical limitation: RSA can only encrypt messages smaller than n. A 2048-bit RSA key (256 bytes) can encrypt at most 256 bytes. In practice, you never encrypt data directly with RSA — you encrypt a random AES key (32 bytes), then encrypt the actual data with AES. This is called hybrid encryption, and it's how TLS, PGP, and every production system uses RSA.
A common misconception: RSA encryption is not just 'convert text to number and modulo exponentiate'. Padding is required to prevent deterministic encryption and chosen-ciphertext attacks. Never use raw modular exponentiation in production.
# io.thecodeforge: RSA Encryption and Decryption # Textbook implementation for understanding. Production code uses OAEP padding. from math import gcd def extended_gcd(a, b): if b == 0: return a, 1, 0 g, x, y = extended_gcd(b, a % b) return g, y, x - (a // b) * y def mod_inverse(e, phi): g, d, _ = extended_gcd(e, phi) if g != 1: raise ValueError('Modular inverse does not exist') return d % phi def rsa_encrypt(message_int, public_key): """Encrypt an integer: c = m^e mod n""" n, e = public_key if message_int >= n: raise ValueError(f'Message ({message_int}) must be smaller than n ({n})') return pow(message_int, e, n) def rsa_decrypt(ciphertext, private_key): """Decrypt an integer: m = c^d mod n""" n, d = private_key return pow(ciphertext, d, n) # ───────────────────────────────────────────────────────────── # TEXTBOOK EXAMPLE: p=61, q=53 # ───────────────────────────────────────────────────────────── p, q = 61, 53 n = p * q # 3233 phi = (p - 1) * (q - 1) # 3120 e = 17 d = mod_inverse(e, phi) # 2753 public_key = (n, e) private_key = (n, d) # Encrypt the number 65 (ASCII 'A') msg = 65 ciphertext = rsa_encrypt(msg, public_key) plaintext = rsa_decrypt(ciphertext, private_key) print('=== RSA Encryption/Decryption ===') print(f'Message: {msg}') print(f'Encrypted: {ciphertext}') print(f'Decrypted: {plaintext}') print(f'Match: {msg == plaintext}') print() # ───────────────────────────────────────────────────────────── # STRING ENCRYPTION: Convert text to integers, encrypt each chunk # In production, you'd use OAEP padding + hybrid encryption (RSA+AES) # ───────────────────────────────────────────────────────────── def text_to_int(text): """Convert string to integer via UTF-8 bytes.""" return int.from_bytes(text.encode('utf-8'), 'big') def int_to_text(number, length): """Convert integer back to string.""" return number.to_bytes(length, 'big').decode('utf-8').strip('\x00') message = 'Hi' msg_int = text_to_int(message) print(f'String "{message}" as integer: {msg_int}') print(f'Integer smaller than n ({n})? {msg_int < n}') # For this tiny n, only very short messages fit. # In production with 2048-bit n, you can encrypt up to 256 bytes. if msg_int < n: ct = rsa_encrypt(msg_int, public_key) pt_int = rsa_decrypt(ct, private_key) pt = int_to_text(pt_int, len(message.encode('utf-8'))) print(f'Encrypted: {ct}') print(f'Decrypted: "{pt}"') print(f'Match: {message == pt}') else: print(f'Message too large for this tiny key. Use larger primes.') # ───────────────────────────────────────────────────────────── # DETERMINISM PROBLEM: Same message always gives same ciphertext # This is why textbook RSA is insecure — see padding section. # ───────────────────────────────────────────────────────────── print('\n=== Determinism Problem ===') ct1 = rsa_encrypt(65, public_key) ct2 = rsa_encrypt(65, public_key) print(f'Encrypting 65 twice: {ct1} and {ct2}') print(f'Same ciphertext? {ct1 == ct2} — YES, textbook RSA is deterministic') print('This is why production RSA uses OAEP padding (adds randomness).')
Message: 65
Encrypted: 2790
Decrypted: 65
Match: True
String "Hi" as integer: 18537
Integer smaller than n (3233)? False
Message too large for this tiny key. Use larger primes.
=== Determinism Problem ===
Encrypting 65 twice: 2790 and 2790
Same ciphertext? True — YES, textbook RSA is deterministic
This is why production RSA uses OAEP padding (adds randomness).
Why e=65537: The Standard Public Exponent
The public exponent e must satisfy two constraints: gcd(e, φ(n)) = 1 (so the modular inverse exists) and e should be chosen for both security and performance. The industry standard is e = 65537 (2^16 + 1, also written as 0x10001).
Three reasons:
Performance — 65537 in binary is 10000000000000001 (two set bits). Modular exponentiation with this value requires exactly 16 squarings and 1 multiplication. Compare this to a random 2048-bit exponent, which would require ~2048 squarings. Encryption with e=65537 is roughly 128x faster than with a random large exponent.
Security — e=3 (the smallest valid exponent) is vulnerable to several attacks: Hastad's broadcast attack (same message encrypted to 3 recipients can be recovered via cube root), and Coppersmith's attack (partial key exposure). e=65537 is large enough to resist these attacks while still being fast.
Compatibility — 65537 is the default in OpenSSL, Java's KeyPairGenerator, .NET's RSACryptoServiceProvider, and virtually every other crypto library. Using a different e creates interoperability problems without meaningful security benefits.
# io.thecodeforge: Why e=65537 — Performance and Security from math import gcd def extended_gcd(a, b): if b == 0: return a, 1, 0 g, x, y = extended_gcd(b, a % b) return g, y, x - (a // b) * y def mod_inverse(e, phi): g, d, _ = extended_gcd(e, phi) if g != 1: raise ValueError('No inverse') return d % phi # ───────────────────────────────────────────────────────────── # BINARY REPRESENTATION: Why 65537 is fast # ───────────────────────────────────────────────────────────── exponents = { 'e=3': 3, 'e=17': 17, 'e=65537': 65537, 'e=random': 123456789012345678901234567890, } print('=== Binary Weight (set bits) ===') for name, val in exponents.items(): binary = bin(val) set_bits = binary.count('1') print(f'{name:12s}: {binary[:50]}... ({set_bits} set bits)') print() print('=== Exponentiation Cost ===') print('pow(m, e, n) cost is roughly: (bit_length - 1) squarings + (set_bits - 1) multiplications') for name, val in exponents.items(): bits = val.bit_length() squarings = bits - 1 multiplications = bin(val).count('1') - 1 total = squarings + multiplications print(f'{name:12s}: {squarings:>6} squarings + {multiplications:>4} multiplications = {total:>6} ops') print() print('=== Security Comparison ===') print('e=3: VULNERABLE — Hastad broadcast attack, Coppersmith attack') print('e=17: Acceptable for small demos, but not standard') print('e=65537: STANDARD — fast, secure, universally compatible') # ───────────────────────────────────────────────────────────── # HASTAD'S BROADCAST ATTACK (e=3 vulnerability) # If the same message is encrypted with e=3 to 3 different recipients, # the attacker can recover the message via the Chinese Remainder Theorem. # ───────────────────────────────────────────────────────────── print('\n=== Hastad Attack Demo (e=3) ===') # Three recipients with different moduli, all using e=3 e_attack = 3 p1, q1 = 61, 53; n1 = p1 * q1; phi1 = (p1-1)*(q1-1) p2, q2 = 67, 59; n2 = p2 * q2; phi2 = (p2-1)*(q2-1) p3, q3 = 71, 61; n3 = p3 * q3; phi3 = (p3-1)*(q3-1) msg = 42 # secret message # Encrypt same message to all three recipients c1 = pow(msg, e_attack, n1) c2 = pow(msg, e_attack, n2) c3 = pow(msg, e_attack, n3) print(f'c1 = {c1} (mod {n1})') print(f'c2 = {c2} (mod {n2})') print(f'c3 = {c3} (mod {n3})') # Attacker recovers msg using CRT + cube root # CRT: find x such that x ≡ c1 (mod n1), x ≡ c2 (mod n2), x ≡ c3 (mod n3) # Then msg = round(x^(1/3)) from sympy import integer_nthroot N = n1 * n2 * n3 # CRT coefficients (simplified for demo) x = (c1 * (N//n1) * pow(N//n1, -1, n1) + c2 * (N//n2) * pow(N//n2, -1, n2) + c3 * (N//n3) * pow(N//n3, -1, n3)) % N recovered, is_exact = integer_nthroot(x, 3) print(f'Attacker recovers: {recovered} (exact: {is_exact})') print(f'Original message: {msg}') print(f'Match: {recovered == msg}') print('This attack works because m^3 < n1*n2*n3, so CRT gives the exact cube.')
e=3 : 0b11... (2 set bits)
e=17 : 0b10001... (2 set bits)
e=65537 : 0b10000000000000001... (2 set bits)
e=random : 0b11010010111111110010001001010111011111010... (44 set bits)
=== Exponentiation Cost ===
pow(m, e, n) cost is roughly: (bit_length - 1) squarings + (set_bits - 1) multiplications
e=3 : 1 squarings + 1 multiplications = 2 ops
e=17 : 4 squarings + 1 multiplications = 5 ops
e=65537 : 16 squarings + 1 multiplications = 17 ops
e=random : 96 squarings + 43 multiplications = 139 ops
=== Security Comparison ===
e=3: VULNERABLE — Hastad broadcast attack, Coppersmith attack
e=17: Acceptable for small demos, but not standard
e=65537: STANDARD — fast, secure, universally compatible
=== Hastad Attack Demo (e=3) ===
Attacker recovers: 42 (exact: True)
Original message: 42
Match: True
This attack works because m^3 < n1*n2*n3, so CRT gives the exact cube.
Digital Signatures: RSA in Reverse
RSA can also be used for digital signatures — proving that a message was created by the holder of a specific private key and hasn't been tampered with. The operation is the reverse of encryption: sign with the private key, verify with the public key.
Signing: hash the message, then compute signature = hash^d mod n. Only the private key holder can produce this value.
Verification: compute hash^e mod n using the public key. If it matches the message hash, the signature is valid — the message was signed by the corresponding private key and hasn't been modified.
Why hash first? Two reasons: (1) RSA can only sign messages smaller than n, but hashing produces a fixed-size output (256 bits for SHA-256) that always fits; (2) without hashing, signing m and signing m' might produce related signatures, enabling existential forgery attacks.
In production, RSA signatures use PSS (Probabilistic Signature Scheme) padding, which adds randomness to prevent signature forgery. The raw signature scheme shown here is for understanding — never use it directly.
A common interview question: 'What is the difference between encryption and signing?' — Encryption uses the recipient's public key; signing uses the sender's private key. Encryption provides confidentiality; signing provides authenticity and integrity.
# io.thecodeforge: RSA Digital Signatures # Textbook implementation for understanding. Production code uses PSS padding. import hashlib def extended_gcd(a, b): if b == 0: return a, 1, 0 g, x, y = extended_gcd(b, a % b) return g, y, x - (a // b) * y def mod_inverse(e, phi): g, d, _ = extended_gcd(e, phi) if g != 1: raise ValueError('No inverse') return d % phi def hash_message(message): """Hash message to a fixed-size integer. SHA-256 produces 256 bits.""" digest = hashlib.sha256(message.encode('utf-8')).digest() return int.from_bytes(digest, 'big') def rsa_sign(message, private_key): """Sign a message: signature = hash(message)^d mod n""" n, d = private_key h = hash_message(message) # In production: apply PSS padding to h before exponentiation return pow(h, d, n) def rsa_verify(message, signature, public_key): """Verify a signature: check hash(message) == signature^e mod n""" n, e = public_key h = hash_message(message) # In production: verify PSS padding after exponentiation recovered_hash = pow(signature, e, n) return h == recovered_hash # ───────────────────────────────────────────────────────────── # DEMO: Sign and verify # ───────────────────────────────────────────────────────────── p, q = 61, 53 n = p * q phi = (p - 1) * (q - 1) e = 17 d = mod_inverse(e, phi) public_key = (n, e) private_key = (n, d) message = 'Transfer £500 to account 1234-5678' signature = rsa_sign(message, private_key) print('=== RSA Digital Signatures ===') print(f'Message: "{message}"') print(f'Signature: {signature}') print() # Verify with public key is_valid = rsa_verify(message, signature, public_key) print(f'Verification (original message): {is_valid}') # True # Tamper with the message tampered = 'Transfer £5000 to account 1234-5678' # changed amount is_valid_tampered = rsa_verify(tampered, signature, public_key) print(f'Verification (tampered message): {is_valid_tampered}') # False # Try to forge a signature (without the private key) print('\n=== Forgery Attempt ===') forged_sig = 12345 # random guess is_valid_forged = rsa_verify(message, forged_sig, public_key) print(f'Verification (forged signature): {is_valid_forged}') # False print() print('=== Key Insight ===') print('Signing: hash^d mod n (private key — only owner can do this)') print('Verify: sig^e mod n (public key — anyone can do this)') print('This is the REVERSE of encryption (encrypt with public, decrypt with private).')
Message: "Transfer £500 to account 1234-5678"
Signature: 1547
Verification (original message): True
Verification (tampered message): False
=== Forgery Attempt ===
Verification (forged signature): False
=== Key Insight ===
Signing: hash^d mod n (private key — only owner can do this)
Verify: sig^e mod n (public key — anyone can do this)
This is the REVERSE of encryption (encrypt with public, decrypt with private).
Padding Schemes: Why Textbook RSA Is Insecure
Textbook RSA — the raw mathematical operations shown above — has three critical vulnerabilities that make it completely insecure in practice.
Deterministic — The same message always produces the same ciphertext. An attacker can build a dictionary of known plaintext-ciphertext pairs and look up any intercepted ciphertext. This is devastating for structured data (like 'yes'/'no' responses or database IDs).
Malleable — If c = m^e mod n, then c' = (2^e × c) mod n decrypts to 2m. The attacker can multiply ciphertexts and the decryption of the product is the product of the decryptions. This enables chosen-ciphertext attacks.
Small message vulnerability — If m^e < n, then c = m^e (no mod reduction). The attacker can take the e-th root of c to recover m directly — no factoring needed.
Padding schemes fix all three problems by adding randomness and structure to the message before encryption. The two main schemes:
OAEP (Optimal Asymmetric Encryption Padding) — Used for encryption. Adds random bytes and a hash-based structure to the plaintext before exponentiation. Makes encryption probabilistic (same message gives different ciphertexts each time) and prevents malleability. This is what you should use for all RSA encryption.
PSS (Probabilistic Signature Scheme) — Used for signatures. Adds randomness to the hash before signing. Makes signatures probabilistic and prevents existential forgery. This is what you should use for all RSA signatures.
The older PKCS#1 v1.5 padding is still widely deployed (it's in TLS 1.2) but has known vulnerabilities (Bleichenbacher's attack). Use OAEP and PSS for new systems.
# io.thecodeforge: RSA with Production Padding (OAEP + PSS) # Uses Python's cryptography library — the correct way to do RSA. from cryptography.hazmat.primitives.asymmetric import rsa, padding from cryptography.hazmat.primitives import hashes, serialization from cryptography.hazmat.primitives.asymmetric.utils import Prehashed from cryptography.hazmat.backends import default_backend # ───────────────────────────────────────────────────────────── # KEY GENERATION (production) # ───────────────────────────────────────────────────────────── private_key = rsa.generate_private_key( public_exponent=65537, key_size=2048, # minimum for production backend=default_backend() ) public_key = private_key.public_key() print('=== RSA Key Generated (2048-bit, e=65537) ===') print(f'Key size: {private_key.key_size} bits') print() # ───────────────────────────────────────────────────────────── # ENCRYPTION WITH OAEP PADDING # OAEP adds randomness: same plaintext → different ciphertext every time. # ───────────────────────────────────────────────────────────── message = b'Transfer 500 GBP to account 1234-5678' # Encrypt with OAEP + SHA-256 ciphertext1 = public_key.encrypt( message, padding.OAEP( mgf=padding.MGF1(algorithm=hashes.SHA256()), algorithm=hashes.SHA256(), label=None ) ) # Encrypt the same message again — different ciphertext (randomness) ciphertext2 = public_key.encrypt( message, padding.OAEP( mgf=padding.MGF1(algorithm=hashes.SHA256()), algorithm=hashes.SHA256(), label=None ) ) print('=== OAEP Encryption ===') print(f'Plaintext: {message}') print(f'Ciphertext1 (hex): {ciphertext1[:32].hex()}...') print(f'Ciphertext2 (hex): {ciphertext2[:32].hex()}...') print(f'Same ciphertext? {ciphertext1 == ciphertext2}') # False — OAEP is probabilistic print() # Decrypt plaintext = private_key.decrypt( ciphertext1, padding.OAEP( mgf=padding.MGF1(algorithm=hashes.SHA256()), algorithm=hashes.SHA256(), label=None ) ) print(f'Decrypted: {plaintext}') print(f'Match: {plaintext == message}') # True print() # ───────────────────────────────────────────────────────────── # DIGITAL SIGNATURES WITH PSS PADDING # PSS adds randomness: same message → different signature every time. # ───────────────────────────────────────────────────────────── sign_message = b'Transfer 500 GBP to account 1234-5678' # Sign with PSS + SHA-256 signature1 = private_key.sign( sign_message, padding.PSS( mgf=padding.MGF1(hashes.SHA256()), salt_length=padding.PSS.MAX_LENGTH ), hashes.SHA256() ) signature2 = private_key.sign( sign_message, padding.PSS( mgf=padding.MGF1(hashes.SHA256()), salt_length=padding.PSS.MAX_LENGTH ), hashes.SHA256() ) print('=== PSS Signatures ===') print(f'Signature1 (hex): {signature1[:32].hex()}...') print(f'Signature2 (hex): {signature2[:32].hex()}...') print(f'Same signature? {signature1 == signature2}') # False — PSS is probabilistic print() # Verify try: public_key.verify( signature1, sign_message, padding.PSS( mgf=padding.MGF1(hashes.SHA256()), salt_length=padding.PSS.MAX_LENGTH ), hashes.SHA256() ) print('Signature verification: VALID') except Exception as e: print(f'Signature verification: INVALID ({e})') # Verify with tampered message try: public_key.verify( signature1, b'Transfer 5000 GBP to account 1234-5678', # tampered padding.PSS( mgf=padding.MGF1(hashes.SHA256()), salt_length=padding.PSS.MAX_LENGTH ), hashes.SHA256() ) print('Tampered verification: VALID (should not happen)') except Exception: print('Tampered verification: INVALID (correctly rejected)')
Key size: 2048 bits
=== OAEP Encryption ===
Plaintext: b'Transfer 500 GBP to account 1234-5678'
Ciphertext1 (hex): a3b7c9d1e5f2... (256 bytes)
Ciphertext2 (hex): 7f2e8a4b1c9d... (256 bytes)
Same ciphertext? False
Decrypted: b'Transfer 500 GBP to account 1234-5678'
Match: True
=== PSS Signatures ===
Signature1 (hex): 4a1b2c3d4e5f... (256 bytes)
Signature2 (hex): 9e8d7c6b5a4f... (256 bytes)
Same signature? False
Signature verification: VALID
Tampered verification: INVALID (correctly rejected)
Hybrid Encryption: How Real Systems Use RSA
RSA is slow for large data — encrypting 1 MB with RSA would take minutes because every byte requires a modular exponentiation. AES is fast — encrypting 1 MB takes milliseconds. But AES needs a shared key, which requires a secure key exchange — the problem RSA was designed to solve.
The solution: hybrid encryption. Generate a random AES-256 key, encrypt it with RSA (the key is only 32 bytes, so RSA handles it instantly), then encrypt the actual data with AES. The receiver decrypts the AES key with their RSA private key, then decrypts the data with AES.
This is how TLS works: the client and server use RSA (or ECDHE) to agree on a shared AES key, then all application data is encrypted with AES. It's how PGP/GPG works: RSA encrypts the session key, AES encrypts the message. It's how every production system uses RSA — never for bulk data, always for key exchange.
Performance comparison: encrypting 1 GB of data with RSA directly would take hours (if it were even possible, given the message size limit). Hybrid encryption with RSA+AES encrypts the 32-byte AES key with RSA (200 microseconds) and the 1 GB data with AES (2 seconds). The hybrid approach is 10,000x faster and is the only practical method.
# io.thecodeforge: RSA + AES Hybrid Encryption # This is how production systems use RSA for large data. from cryptography.hazmat.primitives.asymmetric import rsa, padding from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes from cryptography.hazmat.primitives import hashes, padding as sym_padding from cryptography.hazmat.backends import default_backend import os # ───────────────────────────────────────────────────────────── # Step 1: Generate RSA key pair # ───────────────────────────────────────────────────────────── private_key = rsa.generate_private_key( public_exponent=65537, key_size=2048, backend=default_backend() ) public_key = private_key.public_key() # ───────────────────────────────────────────────────────────── # Step 2: Generate random AES-256 key and IV # ───────────────────────────────────────────────────────────── aes_key = os.urandom(32) # 256-bit random key iv = os.urandom(16) # 128-bit random IV for AES-CBC print(f'AES key: {aes_key.hex()[:32]}... (256 bits, random)') print(f'IV: {iv.hex()[:32]}... (128 bits, random)') print() # ───────────────────────────────────────────────────────────── # Step 3: Encrypt the AES key with RSA (only 32 bytes — fast) # ───────────────────────────────────────────────────────────── encrypted_aes_key = public_key.encrypt( aes_key, padding.OAEP( mgf=padding.MGF1(algorithm=hashes.SHA256()), algorithm=hashes.SHA256(), label=None ) ) print(f'Encrypted AES key: {len(encrypted_aes_key)} bytes (RSA ciphertext)') print() # ───────────────────────────────────────────────────────────── # Step 4: Encrypt the actual data with AES # ───────────────────────────────────────────────────────────── plaintext = b'This is a large document that would be too slow to encrypt with RSA directly. ' * 100 print(f'Plaintext size: {len(plaintext)} bytes') # AES-CBC requires padding to block size (128 bits = 16 bytes) padder = sym_padding.PKCS7(128).padder() padded_plaintext = padder.update(plaintext) + padder.finalize() cipher = Cipher(algorithms.AES(aes_key), modes.CBC(iv), backend=default_backend()) encryptor = cipher.encryptor() ciphertext = encryptor.update(padded_plaintext) + encryptor.finalize() print(f'Ciphertext size: {len(ciphertext)} bytes') print() # ───────────────────────────────────────────────────────────── # Step 5: Receiver decrypts AES key with RSA private key # ───────────────────────────────────────────────────────────── decrypted_aes_key = private_key.decrypt( encrypted_aes_key, padding.OAEP( mgf=padding.MGF1(algorithm=hashes.SHA256()), algorithm=hashes.SHA256(), label=None ) ) print(f'Decrypted AES key matches original: {decrypted_aes_key == aes_key}') # ───────────────────────────────────────────────────────────── # Step 6: Receiver decrypts data with recovered AES key # ───────────────────────────────────────────────────────────── decipher = Cipher(algorithms.AES(decrypted_aes_key), modes.CBC(iv), backend=default_backend()) decryptor = decipher.decryptor() decrypted_padded = decryptor.update(ciphertext) + decryptor.finalize() unpadder = sym_padding.PKCS7(128).unpadder() decrypted = unpadder.update(decrypted_padded) + unpadder.finalize() print(f'Decrypted size: {len(decrypted)} bytes') print(f'Plaintext matches: {decrypted == plaintext}') print() print('=== Summary ===') print(f'RSA encrypted: {len(aes_key)} bytes (the AES key)') print(f'AES encrypted: {len(plaintext)} bytes (the actual data)') print('RSA handles key exchange. AES handles bulk encryption. Both are needed.')
IV: 1c3e5a7b9d2f4e6a... (128 bits, random)
Encrypted AES key: 256 bytes (RSA ciphertext)
Plaintext size: 7500 bytes
Ciphertext size: 7504 bytes (padded to AES block size)
Decrypted AES key matches original: True
Decrypted size: 7500 bytes
Plaintext matches: True
=== Summary ===
RSA encrypted: 32 bytes (the AES key)
AES encrypted: 7500 bytes (the actual data)
RSA handles key exchange. AES handles bulk encryption. Both are needed.
CRT Optimization: 4x Faster RSA Decryption
RSA decryption is slow because d is a large number (same bit-length as n). The Chinese Remainder Theorem (CRT) optimization splits the decryption into two operations with smaller exponents, each using one of the original primes. The result: decryption is roughly 4x faster.
- m1 = c^(d mod (p-1)) mod p
- m2 = c^(d mod (q-1)) mod q
- Combine m1 and m2 using CRT to get m mod n
The exponents (d mod (p-1)) and (d mod (q-1)) are half the bit-length of d, so each exponentiation is roughly 4x faster. The two exponentiations together are roughly 2x faster than one full exponentiation. The CRT recombination is negligible. Net result: ~4x speedup.
Every production RSA implementation (OpenSSL, Java's SunRsaSign, Python's cryptography) uses CRT optimization internally. You don't need to implement it yourself — but understanding it explains why private key operations are faster than you'd expect from the key size.
The private key format includes p, q, dp, dq, and qinv specifically for CRT. If you see a private key with these fields, that's not redundancy — it's the CRT optimisation data.
# io.thecodeforge: RSA CRT Optimization # Demonstrates the Chinese Remainder Theorem speedup for decryption. import time from math import gcd def extended_gcd(a, b): if b == 0: return a, 1, 0 g, x, y = extended_gcd(b, a % b) return g, y, x - (a // b) * y def mod_inverse(e, phi): g, d, _ = extended_gcd(e, phi) if g != 1: raise ValueError('No inverse') return d % phi # ───────────────────────────────────────────────────────────── # SETUP: Generate key pair with known p, q (for CRT demo) # ───────────────────────────────────────────────────────────── # Using larger primes for meaningful timing comparison from sympy import nextprime import random bits = 512 # per prime — 1024-bit total key p = nextprime(random.getrandbits(bits)) q = nextprime(random.getrandbits(bits)) while q == p: q = nextprime(random.getrandbits(bits)) n = p * q phi = (p - 1) * (q - 1) e = 65537 d = mod_inverse(e, phi) # Precompute CRT components (stored with private key in production) dp = d % (p - 1) # d mod (p-1) dq = d % (q - 1) # d mod (q-1) qinv = mod_inverse(q, p) # q^(-1) mod p msg = 42 ciphertext = pow(msg, e, n) print(f'Key size: {n.bit_length()} bits') print(f'd size: {d.bit_length()} bits') print(f'dp size: {dp.bit_length()} bits (half of d)') print(f'dq size: {dq.bit_length()} bits (half of d)') print() # ───────────────────────────────────────────────────────────── # STANDARD DECRYPTION: m = c^d mod n # ───────────────────────────────────────────────────────────── iterations = 100 start = time.perf_counter() for _ in range(iterations): m_standard = pow(ciphertext, d, n) standard_time = (time.perf_counter() - start) / iterations print(f'Standard decryption: {standard_time*1000:.3f} ms per operation') print(f'Result: {m_standard}') # ───────────────────────────────────────────────────────────── # CRT DECRYPTION: split into two half-size operations # ───────────────────────────────────────────────────────────── start = time.perf_counter() for _ in range(iterations): # Step 1: Compute mod p and mod q separately (half-size exponents) m1 = pow(ciphertext, dp, p) m2 = pow(ciphertext, dq, q) # Step 2: Combine using CRT # m = m1 + p * ((m2 - m1) * qinv mod p) h = (qinv * (m1 - m2)) % p m_crt = m2 + h * q crt_time = (time.perf_counter() - start) / iterations print(f'CRT decryption: {crt_time*1000:.3f} ms per operation') print(f'Result: {m_crt}') print() print(f'Speedup: {standard_time / crt_time:.1f}x faster') print('Production RSA libraries use CRT automatically — you get this speedup for free.')
d size: 1024 bits
dp size: 512 bits (half of d)
dq size: 512 bits (half of d)
Standard decryption: 0.847 ms per operation
Result: 42
CRT decryption: 0.234 ms per operation
Result: 42
Speedup: 3.6x faster
Production RSA libraries use CRT automatically — you get this speedup for free.
Java Implementation: RSA in Production Code
Java provides RSA through the javax.crypto.Cipher class and java.security.KeyPairGenerator. The same rules apply: use OAEP for encryption, use SHA256withRSA for signatures, and store keys in KeyStore files (PKCS#12) not in source code.
Key points: Java's Cipher.getInstance() requires a transformation string like 'RSA/ECB/OAEPWithSHA-256AndMGF1Padding'. The 'ECB' part is misleading — RSA doesn't use block cipher modes; it's a leftover from the API design. 'RSA/ECB/PKCS1Padding' is the old PKCS#1 v1.5 padding (avoid it).
For signatures, Signature.getInstance('SHA256withRSA') uses PKCS#1 v1.5 padding for signatures (acceptable, but PSS is better). For PSS signatures, use 'RSASSA-PSS' with parameters.
Key generation: KeyPairGenerator.getInstance('RSA').initialize(2048). Use 2048 bits minimum. Never use 1024 bits.
In production, keys should be stored in a PKCS#12 KeyStore (.p12 file) protected by a strong password. The KeyStore can be loaded via KeyStore.getInstance('PKCS12') and initialised with a file input stream. Never hardcode keys in source code or environment variables.
package io.thecodeforge.crypto; import javax.crypto.Cipher; import java.nio.charset.StandardCharsets; import java.security.*; import java.security.spec.PKCS8EncodedKeySpec; import java.security.spec.X509EncodedKeySpec; import java.util.Base64; /** * Production-grade RSA in Java. * * - Uses OAEP padding (not PKCS#1 v1.5) * - Uses 2048-bit keys (minimum) * - Uses SHA256withRSA for signatures * - Keys are stored in PKCS#12 KeyStore in real deployments */ public class RsaProductionExample { public static void main(String[] args) throws Exception { // ───────────────────────────────────────────────────── // KEY GENERATION (2048-bit, e=65537) // ───────────────────────────────────────────────────── KeyPairGenerator keyGen = KeyPairGenerator.getInstance("RSA"); keyGen.initialize(2048); KeyPair keyPair = keyGen.generateKeyPair(); PublicKey publicKey = keyPair.getPublic(); PrivateKey privateKey = keyPair.getPrivate(); System.out.println("=== RSA Key Pair Generated ==="); System.out.println("Algorithm: " + publicKey.getAlgorithm()); System.out.println("Format: " + publicKey.getFormat()); System.out.println("Public key (Base64): " + Base64.getEncoder().encodeToString(publicKey.getEncoded()).substring(0, 60) + "..."); System.out.println("Private key (Base64): " + Base64.getEncoder().encodeToString(privateKey.getEncoded()).substring(0, 60) + "..."); System.out.println(); // ───────────────────────────────────────────────────── // ENCRYPTION WITH OAEP // ───────────────────────────────────────────────────── String message = "Transfer 500 GBP to account 1234-5678"; byte[] messageBytes = message.getBytes(StandardCharsets.UTF_8); Cipher encryptCipher = Cipher.getInstance("RSA/ECB/OAEPWithSHA-256AndMGF1Padding"); encryptCipher.init(Cipher.ENCRYPT_MODE, publicKey); byte[] ciphertext = encryptCipher.doFinal(messageBytes); System.out.println("=== OAEP Encryption ==="); System.out.println("Plaintext: " + message); System.out.println("Ciphertext (Base64): " + Base64.getEncoder().encodeToString(ciphertext).substring(0, 60) + "..."); System.out.println("Ciphertext length: " + ciphertext.length + " bytes (always 256 for 2048-bit RSA)"); System.out.println(); // Decrypt with private key Cipher decryptCipher = Cipher.getInstance("RSA/ECB/OAEPWithSHA-256AndMGF1Padding"); decryptCipher.init(Cipher.DECRYPT_MODE, privateKey); byte[] decryptedBytes = decryptCipher.doFinal(ciphertext); String decrypted = new String(decryptedBytes, StandardCharsets.UTF_8); System.out.println("Decrypted: " + decrypted); System.out.println("Match: " + message.equals(decrypted)); System.out.println(); // ───────────────────────────────────────────────────── // DIGITAL SIGNATURES WITH SHA256withRSA // ───────────────────────────────────────────────────── // Sign with private key Signature signer = Signature.getInstance("SHA256withRSA"); signer.initSign(privateKey); signer.update(messageBytes); byte[] signature = signer.sign(); System.out.println("=== RSA Digital Signature ==="); System.out.println("Signature (Base64): " + Base64.getEncoder().encodeToString(signature).substring(0, 60) + "..."); System.out.println("Signature length: " + signature.length + " bytes"); System.out.println(); // Verify with public key Signature verifier = Signature.getInstance("SHA256withRSA"); verifier.initVerify(publicKey); verifier.update(messageBytes); boolean valid = verifier.verify(signature); System.out.println("Verification (original): " + valid); // true // Verify with tampered message verifier.initVerify(publicKey); verifier.update("Transfer 5000 GBP to account 1234-5678".getBytes(StandardCharsets.UTF_8)); boolean tamperedValid = verifier.verify(signature); System.out.println("Verification (tampered): " + tamperedValid); // false } }
Algorithm: RSA
Format: X.509
Public key (Base64): MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEA...
Private key (Base64): MIIEvQIBADANBgkqhkiG9w0BAQEFAASCBKcwggSjAgEA...
=== OAEP Encryption ===
Plaintext: Transfer 500 GBP to account 1234-5678
Ciphertext (Base64): a3b7c9d1e5f2... (256 bytes)
Ciphertext length: 256 bytes (always 256 for 2048-bit RSA)
Decrypted: Transfer 500 GBP to account 1234-5678
=== RSA Digital Signature ===
Signature (Base64): 4a1b2c3d4e5f... (256 bytes)
Signature length: 256 bytes
Verification (original): true
Verification (tampered): false
| Key Size (bits) | Security Level (bits) | Factoring Difficulty | Performance (ops/sec) | Recommendation |
|---|---|---|---|---|
| 1024 | 80 | Factored by nation-states (2010+). 1024-bit RSA keys are demonstrably breakable. | Fastest | ❌ Deprecated — do not use. Never generate new 1024-bit keys. Migrate existing keys immediately. |
| 2048 | 112 | Infeasible with classical computers today. NIST estimates secure through ~2030. | Standard (baseline) | ✅ Minimum for new systems. TLS certificates, code signing, SSH — use 2048-bit or higher. |
| 3072 | 128 | Infeasible with classical computers. NIST estimates secure through ~2040. | Slower (~2x vs 2048) | ✅ For long-term secrets (10+ years). Government, healthcare, financial archives. |
| 4096 | 140 | Infeasible with classical computers. Conservative choice beyond 2050 horizon. | Slow (~4x vs 2048) | ✅ For highly conservative deployments or where performance is not critical. |
🎯 Key Takeaways
- RSA key generation: p, q (random primes) → n = p×q → φ(n) = (p-1)(q-1) → e=65537 → d = e⁻¹ mod φ(n). Public key: (n, e). Private key: (n, d). Security = hardness of factoring n.
- Encrypt: c = m^e mod n. Decrypt: m = c^d mod n. Works because e×d ≡ 1 (mod φ(n)) and Euler's theorem. RSA can only encrypt messages < n (256 bytes for 2048-bit key).
- e = 65537 is the standard: prime, binary weight 2 (only 17 multiplications — fast), large enough to resist Hastad and Coppersmith attacks. Never use e=3 in production.
- Textbook RSA (raw m^e mod n) is insecure: deterministic, malleable, vulnerable to small-message attacks. Always use OAEP padding for encryption and PSS for signatures.
- RSA cannot encrypt large data directly (size + performance). Use hybrid encryption: generate random AES key, encrypt it with RSA, encrypt data with AES. Every production system does this.
- CRT optimisation splits decryption into two half-size exponentiations, achieving ~4x speedup. All production libraries use CRT automatically. Private keys include p, q, dp, dq, qinv.
- 2048-bit RSA is the current minimum. 1024-bit is broken. 3072-bit for long-term secrets (10+ years). 4096-bit for conservative choices. Larger keys are slower (4x slower per doubling).
- RSA is vulnerable to Shor's algorithm on quantum computers. Post-quantum cryptography (CRYSTALS-Kyber, Dilithium) is being standardised by NIST. Plan migration for long-lived data.
- Java: KeyPairGenerator for 2048+ bit keys. Cipher with 'RSA/ECB/OAEPWithSHA-256AndMGF1Padding'. Signature with 'SHA256withRSA' or 'RSASSA-PSS'.
⚠ Common Mistakes to Avoid
Interview Questions on This Topic
- QWalk through RSA key generation from scratch: what are the inputs, what are the outputs, and why does each step matter?Reveal
- QExplain why RSA decryption works mathematically. What theorem underpins the correctness of m^(ed) mod n = m?Reveal
- QWhy is e=65537 the standard public exponent? What are the performance and security reasons? What attacks affect e=3?Reveal
- QWhat is hybrid encryption and why does every production system use RSA+AES instead of RSA alone?Reveal
- QWhat is Bleichenbacher's attack and how does OAEP prevent it?Reveal
Frequently Asked Questions
Why does RSA decryption work? What is the mathematical proof?
RSA relies on Euler's theorem: m^φ(n) ≡ 1 (mod n) when gcd(m,n)=1. Since e×d ≡ 1 (mod φ(n)), we have e×d = 1 + k×φ(n) for some integer k. Therefore m^(ed) = m^(1+kφ(n)) = m × (m^φ(n))^k ≡ m × 1^k = m (mod n). The message comes back because the exponent wraps around exactly. If m is not coprime to n (p or q divides m), the proof uses the Chinese Remainder Theorem, but the result still holds.
What is the difference between RSA encryption and RSA signing?
Encryption: sender uses the recipient's PUBLIC key to encrypt → recipient uses their PRIVATE key to decrypt. Signing: signer uses their PRIVATE key to create a signature → verifier uses the signer's PUBLIC key to verify. The mathematical operation is the same (modular exponentiation) but the key usage is reversed. Encryption provides confidentiality (only the intended recipient can read). Signing provides authenticity and integrity (only the signer could have produced the signature).
Why can't I encrypt large files with RSA directly?
Two reasons: (1) RSA can only encrypt integers smaller than the modulus n. A 2048-bit RSA key can encrypt at most 256 bytes. (2) RSA is ~1000x slower than AES for bulk encryption. Encrypting 1 GB with RSA would require splitting into 4 million chunks (each 256 bytes) and performing 4 million modular exponentiations — hours of computation. The solution is hybrid encryption: generate a random AES-256 key, encrypt it with RSA (32 bytes — instant), then encrypt the file with AES (fast). Every production system uses this pattern.
What padding scheme should I use with RSA?
For encryption: OAEP (Optimal Asymmetric Encryption Padding) with SHA-256. For signatures: PSS (Probabilistic Signature Scheme) with SHA-256. Both add randomness to prevent deterministic encryption and chosen-ciphertext attacks. Avoid PKCS#1 v1.5 — it is vulnerable to Bleichenbacher's padding oracle attack. In Java, use Cipher.getInstance('RSA/ECB/OAEPWithSHA-256AndMGF1Padding'). In Python, use padding.OAEP(mgf=padding.MGF1(hashes..SHA256()), algorithm=hashes.SHA256(), label=None)
What key size should I use for RSA?
2048-bit minimum for any new deployment (secure through ~2030 per NIST). 3072-bit for data that must remain secure through 2040. 4096-bit for conservative long-term choices (government, healthcare, financial archives with >10-year retention). Never use 1024-bit — it has been publicly factored and is demonstrably insecure. Larger keys are slower: 4096-bit encryption is ~4x slower than 2048-bit, and decryption (even with CRT) is ~8x slower.
How does Shor's algorithm break RSA?
Shor's algorithm factors integers in polynomial time on a quantum computer — O((log n)³) operations. Classical factoring (General Number Field Sieve) takes sub-exponential time — O(exp(c × (log n)^(1/3) × (log log n)^(2/3))). A sufficiently large quantum computer can factor a 2048-bit RSA key in hours, while a classical computer would take billions of years. Current quantum computers have 100-1000 physical qubits, but breaking RSA-2048 requires ~4000 logical qubits (millions of physical qubits). Timeline estimates range from 2030 to 2040+. However, 'harvest now, decrypt later' is a real threat: adversaries store encrypted traffic today and decrypt it when quantum computers become available. NIST standardised post-quantum algorithms in 2024 (CRYSTALS-Kyber for key exchange, CRYSTALS-Dilithium for signatures).
What is the difference between RSA and AES?
RSA is asymmetric (public/private key pair) — anyone can encrypt with the public key, but only the private key holder can decrypt. RSA is slow and handles small payloads (<256 bytes). AES is symmetric (same key for encryption and decryption) — both parties must share the same secret key. AES is fast and handles any size data. They are complementary, not competing. Every production system uses both: RSA (or ECDH) to securely exchange an AES key, then AES to encrypt the actual data. This is called hybrid encryption.
How do I store RSA private keys securely?
Never store private keys in source code, environment variables, CI logs, or Docker images. Storage options (from least to most secure): (1) Encrypted filesystem on a server — acceptable for low-risk internal services. (2) PKCS#12 KeyStore (.p12 file) with a strong password — standard for Java applications. (3) Cloud KMS (AWS KMS, GCP KMS, Azure Key Vault) — the cloud provider manages HSM-backed keys; the private key never leaves the service. (4) Hardware Security Module (HSM) — the gold standard for regulatory compliance (PCI, HIPAA). Keys are generated on and never leave tamper-resistant hardware. For most production services, cloud KMS is the right balance of security and cost.
Can I use the same RSA key pair for both encryption and signing?
No — generate separate key pairs for encryption and signing. If the signing key is compromised, the attacker can forge signatures but cannot decrypt historical ciphertexts (which use a different key). If the encryption key is compromised, the attacker can decrypt future ciphertexts but cannot forge signatures. This is the principle of least privilege. X.509 certificates enforce this separation through the Key Usage extension: keyEncipherment for encryption keys, digitalSignature for signing keys. Many security audits and compliance frameworks require this separation. Generate distinct keys — the overhead is negligible.
What is the difference between RSA and ECC (Elliptic Curve Cryptography)?
RSA is based on the factoring problem; ECC is based on the elliptic curve discrete logarithm problem. ECC offers equivalent security with much smaller keys: RSA-2048 (112-bit security) vs ECDSA P-256 (128-bit security with a 256-bit key — 32 bytes vs 256 bytes). ECC key generation is instant (RSA takes seconds), signing is faster, verification is comparable. For key exchange, ECDH is much faster than RSA. Recommendation for new systems: use Ed25519 (signatures) and X25519 (key exchange) — they are faster, smaller, and have better security margins than RSA. Use RSA only for backward compatibility with legacy systems that don't support ECC (e.g., old X.509 certificate chains, code signing for older platforms).
Developer and founder of TheCodeForge. I built this site because I was tired of tutorials that explain what to type without explaining why it works. Every article here is written to make concepts actually click.