SHA-256 — Cryptographic Hash Function Explained

Learn SHA-256 — how the Merkle-Damgård construction works, what makes a hash cryptographic, avalanche effect, and practical applications in passwords, Bitcoin, and digital signatures.
⚙️ Intermediate — basic DSA knowledge assumed
In this tutorial, you'll learn:
  • SHA-256 is a hash function, NOT a password hashing function. The 2012 LinkedIn breach exposed unsalted SHA-1 password hashes, a textbook case of a fast hash function misused for passwords.
  • Four properties to know: pre-image resistance, second pre-image resistance, collision resistance, avalanche effect. Know which property breaking causes which attack.
  • For passwords: bcrypt, Argon2id, or PBKDF2 with ≥100k iterations. Always add a unique per-user salt. Never roll your own.
⚡ Quick Answer
SHA-256 is a one-way blender for data. Put in any amount of data, get back exactly 256 bits that look completely random. Change one character and the output is completely different. There is no known feasible way to reverse it, that is, to find the input from the output. This one-wayness makes it a building block of Bitcoin mining, digital signatures, and (inside deliberately slow constructions such as PBKDF2) password storage.

In 2012, LinkedIn's password database was breached. The full dump, which surfaced in 2016, covered roughly 117 million accounts. The stored hashes were unsalted SHA-1, and the large majority were reportedly cracked within days using precomputed tables and GPU dictionary attacks. That breach is the clearest real-world demonstration of why understanding cryptographic hash functions, what they guarantee and what they don't, is not academic knowledge. It is engineering survival.

SHA-256 produces a 256-bit digest for any input. It is deterministic, fast to compute, and as of 2026 has no known practical attacks. It underpins HTTPS certificates, Bitcoin's proof-of-work, code signing, and Git's SHA-256 object format (Git originally addressed objects with SHA-1). But knowing that SHA-256 is "secure" is table stakes. The senior engineer understands the specific properties it provides, which ones it doesn't provide (it is NOT a password hashing function), and exactly where in the stack it belongs.

Properties of Cryptographic Hash Functions

A cryptographic hash function is a one-way function with specific mathematical guarantees. Not all hashes are cryptographic: CRC32, Adler-32, and FNV are designed for speed and error detection, not security. Using a non-cryptographic hash where a cryptographic one is needed is a recurring source of vulnerabilities.

The guarantees you need to know cold:

Pre-image resistance (one-way): Given h, it is computationally infeasible to find m where H(m) = h. Password verification relies on this property: store a hash of the password, then verify by hashing the input and comparing, so you never need to reverse it. (For real passwords the hash must be a salted, deliberately slow function rather than raw SHA-256, as covered under password storage.) If pre-image resistance breaks, an attacker who steals your hash database can recover a working password for every account.

Second pre-image resistance: Given m1, it is hard to find m2 ≠ m1 where H(m1) = H(m2). This property protects code signing: an attacker can't craft a malware binary that has the same hash as the legitimate signed binary.

Collision resistance: It is hard to find any two distinct m1, m2 where H(m1) = H(m2). Note: collision resistance implies second pre-image resistance, but not vice versa. This is the property that failed for MD5 and SHA-1: practical collision attacks exist for both, while no practical pre-image attack is known for either.
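To make the collision property concrete, here is a minimal sketch that finds a collision on a deliberately weakened hash: only the first 2 bytes (16 bits) of SHA-256. The helper names and the `msg-N` input format are illustrative assumptions. By the birthday bound a 16-bit hash collides after a few hundred attempts, whereas the full 256-bit output would need on the order of 2^128.

```python
import hashlib

def truncated_sha256(data: bytes, n_bytes: int = 2) -> bytes:
    """Return only the first n_bytes of the SHA-256 digest (a deliberately weak hash)."""
    return hashlib.sha256(data).digest()[:n_bytes]

def find_truncated_collision(n_bytes: int = 2) -> tuple[bytes, bytes]:
    """Hash counter strings until two distinct inputs share a truncated digest."""
    seen: dict[bytes, bytes] = {}   # truncated digest -> first message that produced it
    counter = 0
    while True:
        msg = f'msg-{counter}'.encode()
        tag = truncated_sha256(msg, n_bytes)
        if tag in seen:
            return seen[tag], msg   # two different messages, same truncated hash
        seen[tag] = msg
        counter += 1

m1, m2 = find_truncated_collision()
print(f'{m1!r} and {m2!r} both truncate to {truncated_sha256(m1).hex()}')
```

The two messages collide only on the truncated value; their full SHA-256 digests still differ, which is why output length matters for collision resistance.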

Avalanche effect: Flip one bit in the input, and approximately half the output bits change. SHA-256("Hello") and SHA-256("hello") share zero structural similarity. This property ensures that similar passwords produce completely different hashes — no information leaks from hash output about input similarity.

🔥 Avalanche Effect Demonstration
SHA-256('Hello') and SHA-256('hello') differ in ~128 of 256 bits. A one-bit change in input flips approximately half the output bits: this is the avalanche effect.
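The bit count can be checked directly. The sketch below (the function name `bit_difference` is an illustrative choice) XORs the two digests and counts the set bits. 'H' (0x48) and 'h' (0x68) differ in exactly one input bit, so this measures the effect of a single-bit change.

```python
import hashlib

def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bit positions where the SHA-256 digests of a and b differ."""
    da = int.from_bytes(hashlib.sha256(a).digest(), 'big')
    db = int.from_bytes(hashlib.sha256(b).digest(), 'big')
    return bin(da ^ db).count('1')   # XOR leaves a 1 wherever the digests disagree

diff = bit_difference(b'Hello', b'hello')
print(f'{diff} of 256 output bits differ')
```

For a well-behaved hash the count should land near 128, with some spread from pair to pair.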

Using SHA-256 in Python

sha256_usage.py · PYTHON
import hashlib

# Basic usage
msg = b'Hello, TheCodeForge!'
digest = hashlib.sha256(msg).hexdigest()
print(f'SHA-256: {digest}')
print(f'Length: {len(digest)} hex chars = {len(digest)//2} bytes = 256 bits')

# Avalanche effect
print(f"\nSHA-256('Hello'): {hashlib.sha256(b'Hello').hexdigest()}")
print(f"SHA-256('hello'): {hashlib.sha256(b'hello').hexdigest()}")

# File integrity
def file_hash(filepath: str) -> str:
    sha256 = hashlib.sha256()
    with open(filepath, 'rb') as f:
        for chunk in iter(lambda: f.read(8192), b''):
            sha256.update(chunk)
    return sha256.hexdigest()

# HMAC — keyed hash for authentication
import hmac
secret = b'secret_key'
token = hmac.new(secret, b'message', hashlib.sha256).hexdigest()
print(f'\nHMAC-SHA256: {token}')
▶ Output
SHA-256: <64 hex characters, the digest of the message>
Length: 64 hex chars = 32 bytes = 256 bits

SHA-256('Hello'): 185f8db32271fe25f561a6fc...07d1764826381969
SHA-256('hello'): 2cf24dba5fb0a30e26e83b2a...73043362938b9824

HMAC-SHA256: <64 hex characters, dependent on the key and message>
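The HMAC lines in sha256_usage.py only produce a tag. A receiver also has to check it, so here is a minimal sketch of both sides. The `sign` and `verify` names and the sample key are assumptions for illustration; `hmac.compare_digest` is used so the comparison takes constant time.

```python
import hashlib
import hmac

def sign(key: bytes, message: bytes) -> str:
    """Return the HMAC-SHA256 tag of message as a hex string."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()

def verify(key: bytes, message: bytes, tag: str) -> bool:
    """Recompute the tag and compare in constant time to avoid timing leaks."""
    expected = sign(key, message)
    return hmac.compare_digest(expected, tag)

key = b'secret_key'                            # illustrative key only
tag = sign(key, b'message')
print(verify(key, b'message', tag))            # True
print(verify(key, b'tampered message', tag))   # False
```

Without the key an attacker cannot produce a valid tag for a modified message, which a plain SHA-256 digest does not guarantee.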

SHA-256 in Password Storage

This is where most engineers have the wrong mental model and it costs their users.

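SHA-256 is fast: hundreds of megabytes per second on a single modern CPU core. A single high-end GPU can attempt billions of SHA-256 hashes per second, and a cluster far more. There are 62^8 ≈ 2.2 × 10^14 eight-character alphanumeric passwords, so if your password database uses raw SHA-256, one such GPU can exhaust all of them in a matter of hours and a small multi-GPU rig in under an hour.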
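The brute-force estimate is plain arithmetic: keyspace divided by hash rate. The sketch below works it through. The rate of 2e10 hashes per second per GPU is an assumed round figure, not a measured benchmark, and the answer scales linearly with whatever rate you plug in.

```python
def brute_force_hours(alphabet_size: int, length: int, hashes_per_second: float) -> float:
    """Hours needed to try every password of the given length over the alphabet."""
    keyspace = alphabet_size ** length
    return keyspace / hashes_per_second / 3600

ASSUMED_GPU_RATE = 2e10   # assumption: ~20 billion raw SHA-256 hashes/s on one GPU

single = brute_force_hours(62, 8, ASSUMED_GPU_RATE)       # one GPU
rig = brute_force_hours(62, 8, 8 * ASSUMED_GPU_RATE)      # hypothetical 8-GPU rig

print(f'Keyspace: {62 ** 8:,} candidates')
print(f'One GPU: {single:.1f} hours, 8-GPU rig: {rig * 60:.0f} minutes')
```

Under these assumptions a single GPU needs about 3 hours and the 8-GPU rig about 23 minutes. A password hash that costs 100 ms per guess instead of a fraction of a nanosecond pushes the same search far out of reach.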

The LinkedIn breach wasn't just about SHA-1 being weak. The stored hashes were unsalted. This means identical passwords produce identical hashes. An attacker builds one rainbow table and cracks all matching passwords simultaneously. Adding a unique random salt per user prevents this — each hash becomes unique even for identical passwords.
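The effect of a salt is easy to see in a few lines. This is a sketch for illustration only: the helper names and the sample password are made up, and it still uses fast SHA-256, so it shows what a salt changes rather than how passwords should be stored.

```python
import hashlib
import os

def unsalted(password: str) -> str:
    """Raw SHA-256 of the password: identical passwords give identical hashes."""
    return hashlib.sha256(password.encode()).hexdigest()

def salted(password: str, salt: bytes) -> str:
    """SHA-256 over salt + password: the hash now depends on the per-user salt."""
    return hashlib.sha256(salt + password.encode()).hexdigest()

alice_salt, bob_salt = os.urandom(16), os.urandom(16)   # unique random salt per user

print(unsalted('hunter2') == unsalted('hunter2'))                     # True
print(salted('hunter2', alice_salt) == salted('hunter2', bob_salt))   # False
```

With unsalted hashes, one precomputed table covers every user at once. With per-user salts, the attacker has to attack each hash separately.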

But salt alone doesn't solve the speed problem. Dedicated password hashing functions are deliberately slow, and the strongest are also memory-hard:

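  • bcrypt: Configurable cost factor, where each increment doubles computation time. Target: 100-300ms per hash on your production hardware.
  • Argon2 (winner of the Password Hashing Competition, 2015): Memory-hard, meaning it requires a large RAM allocation, which makes GPU attacks expensive. Three variants: Argon2i (data-independent memory access, side-channel resistant), Argon2d (data-dependent memory access, strongest against GPU cracking but exposed to side channels), Argon2id (recommended hybrid).
  • PBKDF2-SHA256: HMAC-SHA256 iterated 100,000+ times with salt (OWASP's current guidance for PBKDF2-HMAC-SHA256 is 600,000 iterations). Not memory-hard, but widely supported and FIPS-approved. Django uses this by default.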

Rule: if you're hashing passwords, the function name should contain "bcrypt", "argon2", or "pbkdf2". If it contains "sha", "md5", or "blake", you're doing it wrong.
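One way to pick a cost setting is to measure it on the machine that will run it. The sketch below does this for PBKDF2, since it ships with Python's standard library. The function name, the doubling strategy, and the 0.2 second default target are assumptions for illustration. bcrypt and Argon2 need third-party packages and are not shown.

```python
import hashlib
import os
import time

def calibrate_pbkdf2(target_seconds: float = 0.2, start: int = 100_000) -> int:
    """Double the iteration count until one PBKDF2 hash takes at least target_seconds."""
    salt = os.urandom(16)
    iterations = start
    while True:
        began = time.perf_counter()
        hashlib.pbkdf2_hmac('sha256', b'benchmark-password', salt, iterations)
        elapsed = time.perf_counter() - began
        if elapsed >= target_seconds:
            return iterations
        iterations *= 2   # too fast on this machine: double the work and retry

chosen = calibrate_pbkdf2()
print(f'Use at least {chosen:,} PBKDF2 iterations on this machine')
```

The result depends on the hardware and on system load, so treat it as a floor to compare against published guidance rather than a fixed constant.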

password_hashing.py · PYTHON
import hashlib, hmac, os  # hmac is needed for compare_digest in verify_password

# NEVER do this in production:
bad_hash = hashlib.sha256(b'password123').hexdigest()

# Use PBKDF2 with salt (built into Python):
def hash_password(password: str) -> tuple[bytes, bytes]:
    salt = os.urandom(32)
    key = hashlib.pbkdf2_hmac('sha256', password.encode(), salt, 100_000)
    return salt, key

def verify_password(password: str, salt: bytes, stored_key: bytes) -> bool:
    key = hashlib.pbkdf2_hmac('sha256', password.encode(), salt, 100_000)
    return hmac.compare_digest(key, stored_key)

salt, key = hash_password('my_secure_password')
print(verify_password('my_secure_password', salt, key))   # True
print(verify_password('wrong_password', salt, key))        # False
▶ Output
True
False

Bitcoin and Proof of Work

Bitcoin mining is essentially: find a nonce such that SHA-256(SHA-256(block_header + nonce)) < target. The target determines difficulty: the lower the target, the harder the search. Because SHA-256 behaves like a random function, the only known way to find a valid nonce is brute force, and that is the proof-of-work. Each attempt succeeds with probability target / 2^256, so the expected work is about 2^256 / target hashes per valid block (roughly difficulty × 2^32).

pow_demo.py · PYTHON
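import hashlib

# Simplified: a single SHA-256 and a hex-prefix check, not Bitcoin's double hash and numeric target.
def simple_proof_of_work(data: str, difficulty: int) -> tuple[int, str]:
    """Find a nonce so the hex digest starts with `difficulty` zero characters."""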
    target = '0' * difficulty
    nonce = 0
    while True:
        candidate = f'{data}{nonce}'.encode()
        h = hashlib.sha256(candidate).hexdigest()
        if h.startswith(target):
            return nonce, h
        nonce += 1

nonce, h = simple_proof_of_work('block_data', difficulty=4)
print(f'Nonce: {nonce}, Hash: {h[:20]}...')
▶ Output
Nonce: <first nonce whose digest starts with 0000>, Hash: 0000<next 16 hex chars>...
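The demo above checks a hex prefix on a single SHA-256. A sketch closer to the description, with a double hash compared numerically against a target, might look like the following. The header bytes and the toy target are made-up values, not a real 80-byte Bitcoin header, and the digest is read as a little-endian integer as an assumption that mirrors Bitcoin's convention.

```python
import hashlib

def double_sha256(data: bytes) -> bytes:
    """SHA-256 applied twice, as in Bitcoin's block hashing."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def mine(header: bytes, target: int) -> tuple[int, int]:
    """Try nonces 0, 1, 2, ... until the double hash, read as an integer, is below target."""
    nonce = 0
    while True:
        # 4-byte little-endian nonce; a toy target is hit long before 2**32 attempts
        candidate = header + nonce.to_bytes(4, 'little')
        value = int.from_bytes(double_sha256(candidate), 'little')
        if value < target:
            return nonce, value
        nonce += 1

TOY_TARGET = 1 << 244   # expected work: 2**256 / 2**244 = 4096 hashes
nonce, value = mine(b'toy-block-header', TOY_TARGET)
print(f'Nonce: {nonce}, hash as integer is below target: {value < TOY_TARGET}')
```

Halving the target doubles the expected number of attempts, which is how difficulty is adjusted without changing the hash function.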

🎯 Key Takeaways

  • SHA-256 is a hash function, NOT a password hashing function. The 2012 LinkedIn breach exposed unsalted SHA-1 password hashes, a textbook case of a fast hash function misused for passwords.
  • Four properties to know: pre-image resistance, second pre-image resistance, collision resistance, avalanche effect. Know which property breaking causes which attack.
  • For passwords: bcrypt, Argon2id, or PBKDF2 with ≥100k iterations. Always add a unique per-user salt. Never roll your own.
  • SHA-256 is appropriate for: HMAC authentication, file integrity verification, Git repositories that use the SHA-256 object format, TLS certificate fingerprints, and anywhere you need a tamper-evident digest.
  • Bitcoin mining is brute-forcing SHA-256(SHA-256(block_header + nonce)) below a target — the only way to find a valid nonce is trial and error. That's the proof-of-work.

Interview Questions on This Topic

  • Q: What are the four required properties of a cryptographic hash function?
  • Q: Why is SHA-256 unsuitable for direct password hashing?
  • Q: What is the avalanche effect?
  • Q: How does Bitcoin's proof-of-work use SHA-256?

Frequently Asked Questions

Has SHA-256 been broken?

No. As of 2026, SHA-256 has no known practical attacks. SHA-1, which NIST had deprecated since 2011, was abandoned in practice after SHAttered demonstrated a practical collision in 2017 (two different PDFs with the same SHA-1 hash). SHA-256 has a larger internal state (256 vs 160 bits) and a different compression function that has resisted all known cryptanalytic techniques. NIST recommends the SHA-2 family (including SHA-256) or SHA-3 (Keccak). The NSA's Suite B standard specified SHA-256 for SECRET and SHA-384 for TOP SECRET information; its successor, the CNSA suite, specifies SHA-384.

🔥
Naren Founder & Author

Developer and founder of TheCodeForge. I built this site because I was tired of tutorials that explain what to type without explaining why it works. Every article here is written to make concepts actually click.
