
Caesar Cipher — Substitution Encryption

Where developers are forged. · Structured learning · Free forever.
📍 Part of: Cryptography → Topic 1 of 8
Learn the Caesar cipher — the simplest substitution cipher, frequency analysis attacks, and why it matters for understanding modern cryptography fundamentals.
🧑‍💻 Beginner-friendly — no prior DSA experience needed
In this tutorial, you'll learn:
  • Caesar cipher: shift each letter by k positions mod 26. Only 26 possible keys — trivially brute-forced.
  • Frequency analysis: most common ciphertext letter is likely 'E'. Compute shift, decrypt. Works on any monoalphabetic substitution cipher.
  • Fails on confusion: linear key-ciphertext relationship. Fails on diffusion: each letter encrypted independently.
⚡ Quick Answer
The Caesar cipher shifts every letter by a fixed number. A shift of 3 turns 'HELLO' into 'KHOOR'. Julius Caesar used it to communicate with his generals. It is completely broken by modern standards — but understanding exactly why it breaks teaches you the two fundamental properties every secure cipher must have: confusion and diffusion.

The Caesar cipher is the entry point to cryptography for a reason: it is simple enough to understand completely, and broken in enough ways to illustrate the core weaknesses of classical ciphers. ROT13, the internet's favourite 'obfuscation', is a Caesar cipher with shift 13. The Vigenère cipher, which resisted cryptanalysis for about 300 years, is a sequence of Caesar ciphers selected by a keyword. And frequency analysis, the attack that breaks Caesar, breaks every monoalphabetic substitution cipher and remains a foundational idea in statistical cryptanalysis.

Start with Caesar. Understand why it fails. Every modern cipher is essentially a series of answers to the questions Caesar's failure raises.

Implementation

Caesar cipher with shift k: number the letters A = 0 through Z = 25, encrypt position x as (x + k) mod 26, and decrypt as (x - k) mod 26. Non-letters pass through unchanged.

caesar.py · PYTHON
def caesar_encrypt(text: str, shift: int) -> str:
    result = []
    for ch in text:
        if ch.isascii() and ch.isalpha():  # ASCII letters only; 'é' and similar pass through unchanged
            base = ord('A') if ch.isupper() else ord('a')
            result.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            result.append(ch)
    return ''.join(result)

def caesar_decrypt(text: str, shift: int) -> str:
    return caesar_encrypt(text, -shift)

print(caesar_encrypt('Hello, World!', 3))   # Khoor, Zruog!
print(caesar_decrypt('Khoor, Zruog!', 3))   # Hello, World!
print(caesar_encrypt('Hello', 13))          # ROT13: Uryyb
print(caesar_encrypt('Uryyb', 13))          # ROT13 is self-inverse: Hello
▶ Output
Khoor, Zruog!
Hello, World!
Uryyb
Hello

Breaking Caesar — Brute Force and Frequency Analysis

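Caesar has only 26 possible keys (25 if you exclude the do-nothing shift of 0), so brute force simply tries them all. Frequency analysis gets there faster: in English, 'E' appears ~13% of the time, 'T' ~9%, 'A' ~8%. In a reasonably long English ciphertext, the most frequent letter is probably the encryption of 'E'. Find it, compute the shift, decrypt. On short or unusual texts the guess can be wrong, so check the result or fall back to brute force.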

caesar_break.py · PYTHON
from collections import Counter
# Assumes caesar_encrypt and caesar_decrypt from caesar.py are already in scope.

def break_caesar(ciphertext: str) -> tuple[int, str]:
    """Break Caesar cipher using frequency analysis."""
    letters = [c.upper() for c in ciphertext if c.isascii() and c.isalpha()]
    if not letters:
        return 0, ciphertext
    # Most frequent letter in English is 'E'
    most_common = Counter(letters).most_common(1)[0][0]
    shift = (ord(most_common) - ord('E')) % 26
    return shift, caesar_decrypt(ciphertext, shift)

# Brute force — try all 26 shifts
def brute_force_caesar(ciphertext: str) -> list:
    return [(shift, caesar_decrypt(ciphertext, shift)) for shift in range(26)]

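# 'e' is the most common letter in this sample, so the heuristic applies.
# It fails on texts where that is not true, e.g. the pangram
# 'The quick brown fox jumps over the lazy dog', where 'o' is most common.
ct = caesar_encrypt('Defend the east wall of the castle before the enemy gets there', 7)
shift, plaintext = break_caesar(ct)
print(f'Detected shift: {shift}')
print(f'Decrypted: {plaintext}')
▶ Output
Detected shift: 7
Decrypted: Defend the east wall of the castle before the enemy gets there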
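Guessing from the single most common letter is fragile: it goes wrong whenever 'E' is not the top letter of the plaintext. A more forgiving approach, sketched below, combines brute force with frequency analysis by decrypting under all 26 shifts and keeping the candidate whose letter distribution is closest to English, measured with a chi-squared score. The names `shift_text`, `chi_squared` and `break_caesar_chi2` are illustrative, and the frequency table holds rounded, approximate values (published tables differ slightly).

```python
from collections import Counter

# Approximate relative frequencies (percent) of letters in English text.
# Rounded values; treat them as an assumption, not an authoritative table.
ENGLISH_FREQ = {
    'A': 8.2, 'B': 1.5, 'C': 2.8, 'D': 4.3, 'E': 12.7, 'F': 2.2, 'G': 2.0,
    'H': 6.1, 'I': 7.0, 'J': 0.15, 'K': 0.8, 'L': 4.0, 'M': 2.4, 'N': 6.7,
    'O': 7.5, 'P': 1.9, 'Q': 0.1, 'R': 6.0, 'S': 6.3, 'T': 9.1, 'U': 2.8,
    'V': 1.0, 'W': 2.4, 'X': 0.15, 'Y': 2.0, 'Z': 0.07,
}

def shift_text(text: str, shift: int) -> str:
    """Caesar-shift ASCII letters by `shift`; other characters pass through."""
    out = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord('A') if ch.isupper() else ord('a')
            out.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            out.append(ch)
    return ''.join(out)

def chi_squared(text: str) -> float:
    """Lower score means the letter distribution is closer to typical English."""
    letters = [c.upper() for c in text if c.isascii() and c.isalpha()]
    if not letters:
        return 0.0
    counts = Counter(letters)
    total = len(letters)
    score = 0.0
    for letter, pct in ENGLISH_FREQ.items():
        expected = total * pct / 100
        score += (counts[letter] - expected) ** 2 / expected
    return score

def break_caesar_chi2(ciphertext: str) -> tuple[int, str]:
    """Try all 26 shifts and keep the candidate that looks most like English."""
    candidates = [(k, shift_text(ciphertext, -k)) for k in range(26)]
    return min(candidates, key=lambda pair: chi_squared(pair[1]))

message = ('It was the best of times, it was the worst of times, '
           'it was the age of wisdom, it was the age of foolishness')
ct = shift_text(message, 7)
shift, plaintext = break_caesar_chi2(ct)
print(f'Detected shift: {shift}')
print(f'Decrypted: {plaintext}')
```

Because every letter contributes to the score, one unusual letter count is less likely to derail the guess. Very short ciphertexts can still fool it, in which case reading all 26 brute-force candidates by eye remains the fallback.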

Why Caesar Fails — The Two Properties of Secure Ciphers

Caesar fails on both criteria that define a secure cipher:

Confusion (key relationship to ciphertext should be complex): Caesar's relationship between key and ciphertext is linear and trivially invertible. Knowing one plaintext-ciphertext pair reveals the key instantly.

Diffusion (each plaintext bit should affect many ciphertext bits): Caesar maps each letter independently. 'E' always maps to the same letter. No letter affects any other. Letter frequencies are perfectly preserved.

Modern ciphers (AES, ChaCha20) address both: non-linear operations (AES's S-box, ChaCha20's repeated add-rotate-XOR rounds) provide confusion, and mixing steps propagate each input bit throughout the entire block for diffusion. Caesar has neither.
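Both failures are easy to observe directly. The sketch below is illustrative only (the helper names `caesar`, `recover_key` and `frequency_profile` are made up for this example): it recovers the key from a single known letter pair, shows that changing one plaintext letter changes exactly one ciphertext letter, and confirms that the pattern of letter counts survives encryption.

```python
from collections import Counter

def caesar(text: str, shift: int) -> str:
    """Caesar-shift ASCII letters; other characters pass through."""
    out = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord('A') if ch.isupper() else ord('a')
            out.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            out.append(ch)
    return ''.join(out)

def recover_key(plain_letter: str, cipher_letter: str) -> int:
    """Known-plaintext attack: one letter pair gives the shift directly."""
    return (ord(cipher_letter.upper()) - ord(plain_letter.upper())) % 26

def frequency_profile(text: str) -> list[int]:
    """Letter counts sorted high to low, ignoring which letter is which."""
    counts = Counter(c.upper() for c in text if c.isascii() and c.isalpha())
    return sorted(counts.values(), reverse=True)

plain = 'attack at dawn'
cipher = caesar(plain, 3)

# Confusion failure: a single known pair reveals the whole key.
key = recover_key(plain[0], cipher[0])
print(f'Recovered key: {key}')

# Diffusion failure, part 1: change one plaintext letter ('dawn' -> 'down')
# and count how many ciphertext positions differ.
other = caesar('attack at down', 3)
differing = sum(a != b for a, b in zip(cipher, other))
print(f'Ciphertext positions changed: {differing}')

# Diffusion failure, part 2: the shape of the frequency distribution is unchanged.
print(frequency_profile(plain) == frequency_profile(cipher))
```

With a modern block cipher, the same one-letter change would be expected to alter about half the bits of the affected ciphertext block, and no single known pair would expose the key.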

🔥
ROT13 is Caesar(13)
ROT13 uses shift 13, making it self-inverse: ROT13(ROT13(x)) = x. It was used on Usenet newsgroups to 'hide' spoilers and offensive content. It provides zero security: it is pure obfuscation and should never be used where actual confidentiality is needed.
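Python ships ROT13 as a text-transform codec in the standard library, which makes the self-inverse property quick to check. This is a minimal sketch using `codecs.encode` with the `rot_13` codec name:

```python
import codecs

# Applying ROT13 twice shifts by 13 + 13 = 26 positions, i.e. a full lap.
encoded = codecs.encode('Hello, World!', 'rot_13')
decoded = codecs.encode(encoded, 'rot_13')

print(encoded)  # Uryyb, Jbeyq!
print(decoded)  # Hello, World!
```

Note that the same call both encodes and decodes. That convenience is the point of ROT13, and also the reason it hides nothing from anyone who wants to read it.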

Historical Context and the Road to Vigenère

Julius Caesar used shift 3. His great-nephew and adopted heir Augustus used shift 1. Suetonius documented both in 'The Twelve Caesars' (121 AD), making the Caesar cipher one of the earliest documented substitution ciphers.

Simple substitution ciphers had no documented general attack for roughly nine centuries after Caesar, until Al-Kindi described letter frequency analysis around 850 AD. Once frequency analysis was understood, Caesar and every other monoalphabetic cipher could be broken systematically.

The response was the polyalphabetic cipher now known as the Vigenère cipher (described by Giovan Battista Bellaso in 1553 and later misattributed to Blaise de Vigenère): use a different shift for each letter position, determined by a keyword. It resisted simple frequency analysis for about 300 years, until Charles Babbage (1854) and Friedrich Kasiski (1863) independently discovered how to detect the keyword length. The pattern of attack-and-response continues to define cryptographic history.
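To make "a different shift for each letter position" concrete, here is a minimal sketch of the keyword scheme. The function name `vigenere` and its `decrypt` flag are illustrative choices, and the sketch assumes the keyword advances only on letters, which is one common convention.

```python
from itertools import cycle

def vigenere(text: str, keyword: str, decrypt: bool = False) -> str:
    """Shift each letter by the amount given by the next keyword letter."""
    if not (keyword.isascii() and keyword.isalpha()):
        raise ValueError('keyword must be non-empty ASCII letters')
    # Keyword letter A means shift 0, B means shift 1, and so on, repeating.
    shifts = cycle(ord(k) - ord('A') for k in keyword.upper())
    sign = -1 if decrypt else 1
    out = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord('A') if ch.isupper() else ord('a')
            # Only letters consume a keyword position; spaces and
            # punctuation pass through without advancing the keyword.
            out.append(chr((ord(ch) - base + sign * next(shifts)) % 26 + base))
        else:
            out.append(ch)
    return ''.join(out)

ct = vigenere('ATTACKATDAWN', 'LEMON')
print(ct)                                   # LXFOPVEFRNHR
print(vigenere(ct, 'LEMON', decrypt=True))  # ATTACKATDAWN
```

The four 'A's in the plaintext encrypt to 'L', 'O', 'E' and 'N', so a single-alphabet frequency count no longer lines up with English. Once the keyword length is known, however, each keyword position is an ordinary Caesar cipher again.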

🎯 Key Takeaways

  • Caesar cipher: shift each letter by k positions mod 26. Only 26 possible keys — trivially brute-forced.
  • Frequency analysis: most common ciphertext letter is likely 'E'. Compute shift, decrypt. Works on any monoalphabetic substitution cipher.
  • Fails on confusion: linear key-ciphertext relationship. Fails on diffusion: each letter encrypted independently.
  • ROT13 is Caesar(13) — self-inverse, zero security, pure obfuscation. Never use for actual confidentiality.
  • Historical entry point: Al-Kindi's frequency analysis (850 AD) → Vigenère (1553) → Babbage/Kasiski (1854/1863) → modern ciphers. Each step is a response to the one before it, alternating between attack and defence.

Interview Questions on This Topic

  • Q: What are the two fundamental properties (confusion and diffusion) that a secure cipher must have, and why does Caesar fail both?
  • Q: How does frequency analysis break any monoalphabetic substitution cipher?
  • Q: What is ROT13 and when is it appropriate to use?
  • Q: How many keys does Caesar cipher have, and what does this tell you about the security of short key spaces?

Frequently Asked Questions

Is Caesar cipher ever used in practice today?

Only for obfuscation, not security. ROT13 (Caesar 13) still appears on some online forums to hide spoilers and puzzle answers. Never use it where actual confidentiality is required: any competent attacker breaks it in seconds.

What is a monoalphabetic substitution cipher?

Any cipher where each plaintext letter always maps to the same ciphertext letter. Caesar is monoalphabetic. Given enough ciphertext, every monoalphabetic cipher falls to frequency analysis, no matter how scrambled the substitution alphabet is, because the plaintext's letter frequencies carry over unchanged.
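A quick way to see why a bigger key space does not help: the sketch below builds a general substitution alphabet by shuffling a to z (a hypothetical helper, `random_substitution_key`, seeded only so the run is repeatable) and checks that the pattern of letter counts is identical before and after encryption. `random` is used purely for illustration here and is not suitable for generating real keys.

```python
import math
import random
import string
from collections import Counter

def random_substitution_key(seed: int) -> dict[int, int]:
    """Build a translation table mapping a-z onto a shuffled alphabet."""
    rng = random.Random(seed)  # illustration only; not a secure key source
    shuffled = list(string.ascii_lowercase)
    rng.shuffle(shuffled)
    return str.maketrans(string.ascii_lowercase, ''.join(shuffled))

def frequency_profile(text: str) -> list[int]:
    """Letter counts sorted high to low, ignoring which letter is which."""
    counts = Counter(c for c in text if c.isascii() and c.isalpha())
    return sorted(counts.values(), reverse=True)

plain = 'the enemy expects the message to be sent before the evening'
table = random_substitution_key(seed=42)
cipher = plain.translate(table)

# 26! possible alphabets, roughly 4 x 10^26 (about 88 bits of key).
key_space = math.factorial(26)
print(f'Possible keys: {key_space:.2e}')

# The frequency profile is untouched, so the most common ciphertext
# letter still stands for the most common plaintext letter.
print(frequency_profile(plain) == frequency_profile(cipher))
```

Brute force is hopeless against that many keys, yet the attacker never needs it: matching the ciphertext's frequency ranking against English narrows the mapping letter by letter.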

🔥
Naren Founder & Author

Developer and founder of TheCodeForge. I built this site because I was tired of tutorials that explain what to type without explaining why it works. Every article here is written to make concepts actually click.
