Home DSA Vigenère Cipher — Polyalphabetic Encryption

Vigenère Cipher — Polyalphabetic Encryption

Where developers are forged. · Structured learning · Free forever.
📍 Part of: Cryptography → Topic 2 of 8
Learn the Vigenère cipher — how it defeated frequency analysis for 300 years, the Kasiski and Index of Coincidence attacks, and why polyalphabetic substitution still falls to statistical analysis.
⚙️ Intermediate — basic DSA knowledge assumed
In this tutorial, you'll learn:
  • Vigenère: apply Caesar shift from repeating keyword — different shift per position disrupts simple frequency analysis.
  • Kasiski test: find repeated ciphertext substrings, GCD of their distances = key length. Breaks the cipher's dependency on key repetition.
  • Index of Coincidence: IC ≈ 0.065 for English, ≈ 0.038 for random. Correctly-grouped Vigenère groups show English-level IC.
✦ Plain-English analogy ✦ Real code with output ✦ Interview questions
⚡ Quick Answer
The Vigenère cipher uses a keyword to apply a different Caesar shift to each letter. The word 'LEMON' gives shifts 11,4,12,14,13 — repeating for the full message. This scrambles letter frequencies enough to defeat simple frequency analysis. But the keyword repeats, and that repetition is the fatal flaw Kasiski exploited in 1863 after 300 years of Vigenère being considered unbreakable.

For three centuries, the Vigenère cipher was known as 'le chiffre indéchiffrable' — the indecipherable cipher. Then in 1854, Charles Babbage (the father of computing) cracked it. He never published his solution, so Friedrich Kasiski independently rediscovered and published the attack in 1863. The Kasiski test — finding repeated substrings to determine keyword length — then applying frequency analysis to each Caesar-shifted subgroup — is one of the most elegant examples of turning a cipher's internal structure against itself.

Vigenère sits at the conceptual midpoint between Caesar (trivially broken) and modern stream ciphers (using a truly random keystream). Understanding why Vigenère's repeating key fails is exactly why one-time pads use a non-repeating key the same length as the message.

Implementation

Vigenère encrypts by adding the key character's position to each plaintext character position, cycling through the key.

vigenere.py · PYTHON
12345678910111213141516171819202122232425262728293031
def vigenere_encrypt(text: str, key: str) -> str:
    key = key.upper()
    result = []
    ki = 0
    for ch in text:
        if ch.isalpha():
            base = ord('A') if ch.isupper() else ord('a')
            shift = ord(key[ki % len(key)]) - ord('A')
            result.append(chr((ord(ch) - base + shift) % 26 + base))
            ki += 1
        else:
            result.append(ch)
    return ''.join(result)

def vigenere_decrypt(text: str, key: str) -> str:
    key = key.upper()
    result = []
    ki = 0
    for ch in text:
        if ch.isalpha():
            base = ord('A') if ch.isupper() else ord('a')
            shift = ord(key[ki % len(key)]) - ord('A')
            result.append(chr((ord(ch) - base - shift) % 26 + base))
            ki += 1
        else:
            result.append(ch)
    return ''.join(result)

ct = vigenere_encrypt('ATTACKATDAWN', 'LEMON')
print(f'Encrypted: {ct}')
print(f'Decrypted: {vigenere_decrypt(ct, "LEMON")}')
▶ Output
Encrypted: LXFOPVEFRNHR
Decrypted: ATTACKATDAWN

Kasiski Test — Finding Keyword Length

If the same plaintext substring aligns with the same part of the key, it produces the same ciphertext substring. The distances between repeated ciphertext substrings are multiples of the key length. GCD of those distances reveals the key length.

kasiski.py · PYTHON
12345678910111213141516171819202122
from math import gcd
from functools import reduce

def kasiski_test(ciphertext: str, min_len: int = 3) -> int:
    """Estimate Vigenère key length from repeated substrings."""
    ct = ciphertext.upper().replace(' ', '')
    distances = []
    for length in range(min_len, min_len + 5):
        for i in range(len(ct) - length):
            substr = ct[i:i+length]
            j = ct.find(substr, i + length)
            while j != -1:
                distances.append(j - i)
                j = ct.find(substr, j + 1)
    if not distances:
        return 1
    return reduce(gcd, distances)

long_msg = 'WEAREDISCOVEREDRUNATONCE' * 3
ct = vigenere_encrypt(long_msg, 'CRYPTO')
key_len = kasiski_test(ct)
print(f'Estimated key length: {key_len}')  # Should be 6 (CRYPTO)
▶ Output
Estimated key length: 6

Index of Coincidence — Confirming Key Length

The Index of Coincidence (IC) measures how 'English-like' a text is. Random text IC ≈ 0.038. English text IC ≈ 0.065. Once you know the key length k, split the ciphertext into k groups (every k-th letter). Each group is a Caesar cipher — apply frequency analysis to each.

index_of_coincidence.py · PYTHON
12345678910111213
from collections import Counter

def index_of_coincidence(text: str) -> float:
    text = [c for c in text.upper() if c.isalpha()]
    n = len(text)
    if n < 2: return 0
    counts = Counter(text)
    return sum(c * (c-1) for c in counts.values()) / (n * (n-1))

# English IC ≈ 0.065, random IC ≈ 0.038
print(f'English IC: {index_of_coincidence("the quick brown fox jumps over the lazy dog"):.3f}')
long_ct = vigenere_encrypt('WEAREDISCOVEREDRUNATONCE' * 5, 'LEMON')
print(f'Vigenère IC: {index_of_coincidence(long_ct):.3f}')  # Between 0.038 and 0.065
▶ Output
English IC: 0.065
Vigenère IC: 0.051

Why Vigenère Still Fails — And What Fixes It

Vigenère fails because the key repeats. The Kasiski test exploits this repetition to find the key length, then frequency analysis breaks each individual Caesar stream.

The theoretical fix: use a key that never repeats and is truly random — the One-Time Pad. The Lorenz cipher (used by Hitler's High Command in WWII) was a mechanical Vigenère variant. It was broken by British codebreakers at Bletchley Park partly because German operators reused key settings — introducing the very repetition that Vigenère's security depends on avoiding.

Modern stream ciphers (ChaCha20, RC4-successors) use a cryptographically secure pseudorandom keystream that appears non-repeating for any practical message length. They are Vigenère with an effectively infinite, unpredictable key.

🔥
The Autokey CipherAn improvement: use the plaintext itself as the key extension after the initial keyword — preventing exact repetition. This was proposed by Vigenère himself. It is stronger but still broken by more sophisticated statistical attacks since the key is correlated with the plaintext.

🎯 Key Takeaways

  • Vigenère: apply Caesar shift from repeating keyword — different shift per position disrupts simple frequency analysis.
  • Kasiski test: find repeated ciphertext substrings, GCD of their distances = key length. Breaks the cipher's dependency on key repetition.
  • Index of Coincidence: IC ≈ 0.065 for English, ≈ 0.038 for random. Correctly-grouped Vigenère groups show English-level IC.
  • Once key length is known, each position is independently a Caesar cipher — apply frequency analysis to each.
  • Fix: truly random non-repeating key (One-Time Pad) or cryptographically secure pseudorandom stream (ChaCha20, AES-CTR).

Interview Questions on This Topic

  • QHow does the Kasiski test determine the key length of a Vigenère cipher?
  • QWhat is the Index of Coincidence and how is it used to confirm key length?
  • QWhy does Vigenère fail despite using multiple Caesar shifts?
  • QHow does a modern stream cipher relate to Vigenère cipher conceptually?

Frequently Asked Questions

Was Vigenère cipher actually invented by Vigenère?

No — it was described by Giovan Battista Bellaso in 1553. Vigenère invented a stronger autokey cipher in 1586. The misattribution happened in the 19th century and stuck. The cipher was also independently described by Blaise de Vigenère — hence the name — but his version was actually the stronger autokey variant.

🔥
Naren Founder & Author

Developer and founder of TheCodeForge. I built this site because I was tired of tutorials that explain what to type without explaining why it works. Every article here is written to make concepts actually click.

← PreviousCaesar Cipher — Substitution EncryptionNext →RSA Algorithm — Public Key Cryptography
Forged with 🔥 at TheCodeForge.io — Where Developers Are Forged