Homeβ€Ί DSAβ€Ί Hamming Code β€” Error Detection and Correction

Hamming Code β€” Error Detection and Correction

Where developers are forged. Β· Structured learning Β· Free forever.
πŸ“ Part of: Hashing β†’ Topic 11 of 11
Learn Hamming codes for single-bit error detection and correction.
βš™οΈ Intermediate β€” basic DSA knowledge assumed
In this tutorial, you'll learn:
  • Hamming distance: bits that differ between two strings β€” minimum distance 3 enables 1-bit correction.
  • Parity bits at power-of-2 positions each cover a specific subset of data positions.
  • Syndrome decoding: XOR parity checks form a binary number pointing to the error position.
✦ Plain-English analogy ✦ Real code with output ✦ Interview questions
⚑ Quick Answer
When data travels over a network or is stored on disk, bits can flip. Hamming code adds a few extra 'parity' bits that together act like a GPS for errors β€” they can pinpoint exactly which bit flipped and correct it. Richard Hamming invented this in 1950 after getting frustrated with punch card machines that crashed on single-bit errors.

In 1947, Richard Hamming was working at Bell Labs on the Model V relay computer. Every time the computer hit an error while he was away for the weekend, it would simply stop and wait for an operator to fix it β€” losing his entire Saturday batch job. Frustrated, he thought: 'If the machine can detect an error, why can't it correct it?' Over the next two years he developed Hamming codes β€” the first error-correcting codes.

Today, ECC (Error Correcting Code) RAM uses Hamming codes on every memory operation. Server-grade machines will not pass certification without ECC RAM. Every time you access memory on a server, a Hamming code is silently verifying and potentially correcting the read. The DRAM inside your laptop likely performs billions of Hamming-protected reads per second.

Hamming Distance

The Hamming distance between two binary strings is the number of positions where they differ. Hamming(7,4) has minimum distance 3 β€” any two valid codewords differ in at least 3 positions. This means: - Any 1-bit error produces an invalid codeword (detectable) - The invalid codeword is closest to exactly one valid codeword (correctable)

General rule: minimum distance d allows detecting d-1 errors and correcting ⌊(d-1)/2βŒ‹ errors.

Hamming(7,4) Encoding

Place data bits at positions 1,2,4,7 (non-powers-of-2... wait, actually data at non-power-of-2 positions). Parity bits at positions 1,2,4 (powers of 2). Each parity bit covers a specific set of positions.

hamming.py Β· PYTHON
1234567891011121314151617181920212223242526272829303132333435363738394041
def hamming_encode(data: list[int]) -> list[int]:
    """Encode 4 data bits into 7-bit Hamming code."""
    # Positions 1-7 (1-indexed): p1,p2,d1,p4,d2,d3,d4
    d = data  # [d1, d2, d3, d4]
    # Build 7-bit codeword (0-indexed internally)
    code = [0] * 7
    code[2] = d[0]  # position 3
    code[4] = d[1]  # position 5
    code[5] = d[2]  # position 6
    code[6] = d[3]  # position 7
    # Compute parity bits
    # p1 (pos 1): covers positions 1,3,5,7
    code[0] = code[2] ^ code[4] ^ code[6]
    # p2 (pos 2): covers positions 2,3,6,7
    code[1] = code[2] ^ code[5] ^ code[6]
    # p4 (pos 4): covers positions 4,5,6,7
    code[3] = code[4] ^ code[5] ^ code[6]
    return code

def hamming_decode(code: list[int]) -> tuple[list[int], int]:
    """Decode and correct single-bit errors. Returns (data, error_pos)."""
    s1 = code[0] ^ code[2] ^ code[4] ^ code[6]  # p1 check
    s2 = code[1] ^ code[2] ^ code[5] ^ code[6]  # p2 check
    s4 = code[3] ^ code[4] ^ code[5] ^ code[6]  # p4 check
    syndrome = s4 * 4 + s2 * 2 + s1             # error position (1-indexed)
    if syndrome:
        code[syndrome - 1] ^= 1  # flip the erroneous bit
    data = [code[2], code[4], code[5], code[6]]
    return data, syndrome

# Encode 1011
original = [1, 0, 1, 1]
codeword = hamming_encode(original)
print(f'Data: {original} β†’ Codeword: {codeword}')

# Introduce error at position 5
erroneous = codeword[:]
erroneous[4] ^= 1
print(f'Received (error at pos 5): {erroneous}')
decoded, err_pos = hamming_decode(erroneous)
print(f'Corrected: {decoded}, error at position {err_pos}')
β–Ά Output
Data: [1, 0, 1, 1] β†’ Codeword: [0, 1, 1, 1, 0, 1, 1]
Received (error at pos 5): [0, 1, 1, 1, 1, 1, 1]
Corrected: [1, 0, 1, 1], error at position 5

Applications

ECC RAM: Error Correcting Code memory adds Hamming-like parity bits to each 64-bit word. Used in servers, workstations, and anywhere data corruption is unacceptable.

RAID-2: Uses Hamming codes across disk drives (largely superseded by RAID-5/6).

QR Codes: Use Reed-Solomon codes (a generalisation) allowing correction of up to 30% data loss.

Satellite/Deep Space: Voyager probes use Golay codes (24,12) β€” correcting 3 errors per 24-bit block across billions of kilometres.

🎯 Key Takeaways

  • Hamming distance: bits that differ between two strings β€” minimum distance 3 enables 1-bit correction.
  • Parity bits at power-of-2 positions each cover a specific subset of data positions.
  • Syndrome decoding: XOR parity checks form a binary number pointing to the error position.
  • Hamming(7,4): encode 4 bits in 7 β€” 3 parity bits overhead for single-bit correction.
  • ECC RAM, QR codes, RAID, satellite comm all use Hamming or generalised codes (Reed-Solomon).

Interview Questions on This Topic

  • QHow many parity bits are needed to correct a single-bit error in an n-bit codeword?
  • QExplain syndrome decoding β€” how does the syndrome identify the error position?
  • QWhat is the relationship between minimum Hamming distance and error correction capability?
  • QHow does ECC RAM differ from regular RAM?

Frequently Asked Questions

How many parity bits does Hamming code need for n data bits?

You need r parity bits where 2^r β‰₯ n + r + 1. For 4 data bits: 2^3=8 β‰₯ 4+3+1=8 β†’ 3 parity bits (Hamming(7,4)). For 8 data bits: 2^4=16 β‰₯ 8+4+1=13 β†’ 4 parity bits (Hamming(12,8)).

πŸ”₯
Naren Founder & Author

Developer and founder of TheCodeForge. I built this site because I was tired of tutorials that explain what to type without explaining why it works. Every article here is written to make concepts actually click.

← PreviousUniversal Hashing and Perfect Hashing
Forged with πŸ”₯ at TheCodeForge.io β€” Where Developers Are Forged