Vigenère Cipher — Polyalphabetic Encryption
- Vigenère: apply Caesar shift from repeating keyword — different shift per position disrupts simple frequency analysis.
- Kasiski test: find repeated ciphertext substrings, GCD of their distances = key length. Breaks the cipher's dependency on key repetition.
- Index of Coincidence: IC ≈ 0.065 for English, ≈ 0.038 for random. Correctly-grouped Vigenère groups show English-level IC.
For three centuries, the Vigenère cipher was known as 'le chiffre indéchiffrable' — the indecipherable cipher. Then in 1854, Charles Babbage (the father of computing) cracked it. He never published his solution, so Friedrich Kasiski independently rediscovered and published the attack in 1863. The Kasiski test — finding repeated substrings to determine keyword length — then applying frequency analysis to each Caesar-shifted subgroup — is one of the most elegant examples of turning a cipher's internal structure against itself.
Vigenère sits at the conceptual midpoint between Caesar (trivially broken) and modern stream ciphers (using a truly random keystream). Understanding why Vigenère's repeating key fails is exactly why one-time pads use a non-repeating key the same length as the message.
Implementation
Vigenère encrypts by adding the key character's position to each plaintext character position, cycling through the key.
def vigenere_encrypt(text: str, key: str) -> str: key = key.upper() result = [] ki = 0 for ch in text: if ch.isalpha(): base = ord('A') if ch.isupper() else ord('a') shift = ord(key[ki % len(key)]) - ord('A') result.append(chr((ord(ch) - base + shift) % 26 + base)) ki += 1 else: result.append(ch) return ''.join(result) def vigenere_decrypt(text: str, key: str) -> str: key = key.upper() result = [] ki = 0 for ch in text: if ch.isalpha(): base = ord('A') if ch.isupper() else ord('a') shift = ord(key[ki % len(key)]) - ord('A') result.append(chr((ord(ch) - base - shift) % 26 + base)) ki += 1 else: result.append(ch) return ''.join(result) ct = vigenere_encrypt('ATTACKATDAWN', 'LEMON') print(f'Encrypted: {ct}') print(f'Decrypted: {vigenere_decrypt(ct, "LEMON")}')
Decrypted: ATTACKATDAWN
Kasiski Test — Finding Keyword Length
If the same plaintext substring aligns with the same part of the key, it produces the same ciphertext substring. The distances between repeated ciphertext substrings are multiples of the key length. GCD of those distances reveals the key length.
from math import gcd from functools import reduce def kasiski_test(ciphertext: str, min_len: int = 3) -> int: """Estimate Vigenère key length from repeated substrings.""" ct = ciphertext.upper().replace(' ', '') distances = [] for length in range(min_len, min_len + 5): for i in range(len(ct) - length): substr = ct[i:i+length] j = ct.find(substr, i + length) while j != -1: distances.append(j - i) j = ct.find(substr, j + 1) if not distances: return 1 return reduce(gcd, distances) long_msg = 'WEAREDISCOVEREDRUNATONCE' * 3 ct = vigenere_encrypt(long_msg, 'CRYPTO') key_len = kasiski_test(ct) print(f'Estimated key length: {key_len}') # Should be 6 (CRYPTO)
Index of Coincidence — Confirming Key Length
The Index of Coincidence (IC) measures how 'English-like' a text is. Random text IC ≈ 0.038. English text IC ≈ 0.065. Once you know the key length k, split the ciphertext into k groups (every k-th letter). Each group is a Caesar cipher — apply frequency analysis to each.
from collections import Counter def index_of_coincidence(text: str) -> float: text = [c for c in text.upper() if c.isalpha()] n = len(text) if n < 2: return 0 counts = Counter(text) return sum(c * (c-1) for c in counts.values()) / (n * (n-1)) # English IC ≈ 0.065, random IC ≈ 0.038 print(f'English IC: {index_of_coincidence("the quick brown fox jumps over the lazy dog"):.3f}') long_ct = vigenere_encrypt('WEAREDISCOVEREDRUNATONCE' * 5, 'LEMON') print(f'Vigenère IC: {index_of_coincidence(long_ct):.3f}') # Between 0.038 and 0.065
Vigenère IC: 0.051
Why Vigenère Still Fails — And What Fixes It
Vigenère fails because the key repeats. The Kasiski test exploits this repetition to find the key length, then frequency analysis breaks each individual Caesar stream.
The theoretical fix: use a key that never repeats and is truly random — the One-Time Pad. The Lorenz cipher (used by Hitler's High Command in WWII) was a mechanical Vigenère variant. It was broken by British codebreakers at Bletchley Park partly because German operators reused key settings — introducing the very repetition that Vigenère's security depends on avoiding.
Modern stream ciphers (ChaCha20, RC4-successors) use a cryptographically secure pseudorandom keystream that appears non-repeating for any practical message length. They are Vigenère with an effectively infinite, unpredictable key.
🎯 Key Takeaways
- Vigenère: apply Caesar shift from repeating keyword — different shift per position disrupts simple frequency analysis.
- Kasiski test: find repeated ciphertext substrings, GCD of their distances = key length. Breaks the cipher's dependency on key repetition.
- Index of Coincidence: IC ≈ 0.065 for English, ≈ 0.038 for random. Correctly-grouped Vigenère groups show English-level IC.
- Once key length is known, each position is independently a Caesar cipher — apply frequency analysis to each.
- Fix: truly random non-repeating key (One-Time Pad) or cryptographically secure pseudorandom stream (ChaCha20, AES-CTR).
Interview Questions on This Topic
- QHow does the Kasiski test determine the key length of a Vigenère cipher?
- QWhat is the Index of Coincidence and how is it used to confirm key length?
- QWhy does Vigenère fail despite using multiple Caesar shifts?
- QHow does a modern stream cipher relate to Vigenère cipher conceptually?
Frequently Asked Questions
Was Vigenère cipher actually invented by Vigenère?
No — it was described by Giovan Battista Bellaso in 1553. Vigenère invented a stronger autokey cipher in 1586. The misattribution happened in the 19th century and stuck. The cipher was also independently described by Blaise de Vigenère — hence the name — but his version was actually the stronger autokey variant.
Developer and founder of TheCodeForge. I built this site because I was tired of tutorials that explain what to type without explaining why it works. Every article here is written to make concepts actually click.