
Java volatile Keyword Explained — Memory Visibility, Happens-Before, and When It's Not Enough

In Plain English 🔥
Imagine a busy office where five colleagues share a whiteboard showing the current stock count. Each person also has a sticky note on their desk with the last number they saw. If Alice updates the whiteboard but Bob only looks at his sticky note, Bob acts on stale information. Java's volatile keyword is the rule that says: 'No sticky notes allowed — everyone must always read and write directly on the whiteboard.' It forces every thread to go straight to main memory instead of using its own cached copy.
⚡ Quick Answer
volatile guarantees visibility and ordering: every read of a volatile field sees the latest write made by any thread, and a volatile write publishes all the writes that preceded it. It does NOT make compound operations like counter++ atomic — for those you need AtomicInteger, synchronized, or a lock.

In any system running more than one thread, there's a silent enemy you can't see in a debugger: CPU caching and instruction reordering. Modern processors don't read RAM on every instruction — they pull values into fast per-core L1/L2 caches, and the compiler and CPU reorder code for performance. That's brilliant for single-threaded code, but in multithreaded Java it means one thread can be happily looping on a value that another thread already changed minutes ago. This isn't a bug in your code; it's the hardware and the JIT working exactly as designed. The Java Memory Model (JMM) exists precisely to let you control when one thread's writes become visible to another.

The Stale-Read Problem: Why Threads See Phantom Values

The Java Memory Model allows the JVM and the CPU to reorder instructions and cache variable values inside a thread's working memory — a logical representation of its CPU registers and cache lines. Without any synchronization construct, there is zero guarantee that a write performed by Thread A will ever become visible to Thread B, even if the write happened 'earlier' in wall-clock time.

This isn't theoretical. The JVM's JIT compiler can hoist a variable read out of a loop because, from a single-threaded perspective, the value 'can't change'. The result: an infinite loop that should have stopped. The canonical example is a shutdown flag. A background worker checks a boolean each iteration; the main thread sets it to true. Without volatile, the worker may never see the update.

The fix isn't to add a print statement (which accidentally introduces memory barriers) or call Thread.sleep (which may or may not flush caches). The fix is to declare the flag volatile, which instructs the JVM to never cache the value in a thread's working memory and to insert the necessary memory barriers at every read and write site.

StaleReadDemo.java · JAVA
/**
 * Demonstrates the stale-read problem WITHOUT volatile,
 * then fixes it WITH volatile.
 *
 * Run with: java StaleReadDemo
 * JVM flag to make the bug more reproducible: -server
 */
public class StaleReadDemo {

    // ❌ Without volatile — the JIT may cache this in a register.
    // The worker thread might loop forever even after main sets it true.
    private static boolean shutdownRequestedUnsafe = false;

    // ✅ With volatile — every read goes to main memory. No caching allowed.
    private static volatile boolean shutdownRequested = false;

    public static void main(String[] args) throws InterruptedException {

        Thread worker = new Thread(() -> {
            long iterationCount = 0;

            // The JIT compiler, in server mode, may decide this value
            // never changes (from *this* thread's perspective) and hoist
            // the read outside the loop entirely — an infinite loop ensues.
            while (!shutdownRequested) {
                iterationCount++;

                // Intentionally empty loop body — we're testing visibility only.
                // Adding a System.out.println here would mask the bug because
                // println acquires a monitor lock, which flushes thread-local memory.
            }

            System.out.println("Worker stopped cleanly after " + iterationCount + " iterations.");
        }, "background-worker");

        worker.start();

        // Give the worker time to get fully JIT-compiled and into its hot loop.
        Thread.sleep(100);

        // Main thread updates the flag.
        shutdownRequested = true;
        System.out.println("Main thread set shutdownRequested = true");

        // Wait for the worker to notice. With volatile this completes quickly.
        // Without volatile (using shutdownRequestedUnsafe) this may hang forever.
        worker.join(2000);

        if (worker.isAlive()) {
            System.out.println("BUG: Worker is still running — it never saw the update!");
            worker.interrupt(); // clean up the demo
        }
    }
}
▶ Output
Main thread set shutdownRequested = true
Worker stopped cleanly after 48291774 iterations.
⚠️ Watch Out: The Invisible JIT Optimisation
Adding System.out.println() inside the loop masks the stale-read bug because PrintStream.println() is synchronized, which happens to flush thread-local memory. Never rely on this as a fix — it's a coincidence, not a solution. Use volatile or a proper synchronization construct instead.

How volatile Actually Works — Memory Barriers and the JMM Happens-Before Rule

When you mark a field volatile, the JVM inserts memory barrier instructions around every read and write to that field. A memory barrier (also called a memory fence) is a CPU instruction that forces all pending writes to be flushed to main memory before the barrier, and all subsequent reads to be fetched from main memory after it. On x86 this maps roughly to MFENCE or LOCK-prefixed instructions; on ARM it's DMB/DSB instructions.
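Since Java 9 you can spell out these barriers yourself with the static fence methods on java.lang.invoke.VarHandle. The sketch below (illustrative only — real code should simply declare the flag volatile) shows the release/acquire pairing that a volatile write and read perform implicitly; the class and field names are made up for the example:

```java
import java.lang.invoke.VarHandle;

/**
 * Sketch: the fence pairing a volatile write/read performs implicitly,
 * written out with the explicit fence API available since Java 9.
 */
public class FenceSketch {
    static int payload;        // plain field — visibility via explicit fences
    static boolean published;  // plain flag — normally this would be volatile

    static void writer() {
        payload = 42;              // ordinary write
        VarHandle.releaseFence();  // no earlier write may be reordered below this
        published = true;          // publish (a volatile write implies this fence)
    }

    static boolean reader() {
        boolean seen = published;  // read the flag
        VarHandle.acquireFence();  // no later read may be reordered above this
        return seen && payload == 42;
    }

    public static void main(String[] args) {
        writer();
        System.out.println("reader sees payload: " + reader());
    }
}
```

Run single-threaded this prints `reader sees payload: true`; the point is the ordering discipline, not the demo itself.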

More formally, the Java Memory Model defines a happens-before relationship. If Thread A writes to a volatile field and Thread B subsequently reads that same volatile field, then everything Thread A did before the write is guaranteed visible to Thread B after the read. This is not just about that one field — it's a full memory fence that drags all preceding writes along with it.

This has a powerful implication: publishing an object safely via a volatile reference guarantees that the object's constructor-initialized fields are visible to any thread that reads that reference afterward. This is why the double-checked locking pattern requires the singleton instance field to be volatile — without it, a thread could see a non-null reference to a partially constructed object.

VolatileHappensBefore.java · JAVA
/**
 * Demonstrates the happens-before guarantee of volatile.
 * The volatile write acts as a full memory fence — all writes
 * before it become visible to any thread that sees the volatile write.
 *
 * This is the safe publication pattern used in production systems.
 */
public class VolatileHappensBefore {

    // The configuration object being published to worker threads.
    static class ServerConfig {
        final String hostName;        // final fields are safe, but
        final int    port;            // non-final fields also get
        String       environment;     // dragged across by the volatile fence.

        ServerConfig(String hostName, int port, String environment) {
            this.hostName    = hostName;
            this.port        = port;
            this.environment = environment;
        }
    }

    // ✅ volatile ensures: once a thread reads a non-null configRef,
    // it is guaranteed to see fully-initialized ServerConfig fields.
    private static volatile ServerConfig configRef = null;

    public static void main(String[] args) throws InterruptedException {

        Thread reader = new Thread(() -> {
            // Spin until the config is published (busy-wait for demo purposes only).
            while (configRef == null) {
                Thread.onSpinWait(); // hint to the CPU: we're in a spin loop
            }

            // Because configRef is volatile, the read of configRef happens-after
            // the write to configRef in the writer thread.
            // That means ALL writes the writer did BEFORE writing configRef
            // are also visible here — including the non-final 'environment' field.
            ServerConfig config = configRef;
            System.out.println("Reader sees host:        " + config.hostName);
            System.out.println("Reader sees port:        " + config.port);
            System.out.println("Reader sees environment: " + config.environment);
        }, "config-reader");

        reader.start();

        Thread.sleep(50); // let the reader get into its spin loop

        // Writer constructs the object fully BEFORE the volatile write.
        ServerConfig freshConfig = new ServerConfig("api.example.com", 8443, "production");

        // This single volatile write acts as a release fence.
        // Every write above this line is flushed to main memory before this store.
        configRef = freshConfig;
        System.out.println("Writer published config via volatile write.");

        reader.join();
    }
}
▶ Output
Writer published config via volatile write.
Reader sees host: api.example.com
Reader sees port: 8443
Reader sees environment: production
🔥 Interview Gold: The Happens-Before Chain
volatile gives you a happens-before edge in the JMM graph — not just cache flushing. Everything the writing thread did before the volatile write is visible to any thread that subsequently reads that volatile field. Interviewers love this distinction because most candidates only describe it as 'forces reads from main memory', missing the broader ordering guarantee.

volatile's Hard Limit: Why It Can't Replace synchronized for Compound Actions

Here's where developers get burned in production: volatile guarantees visibility and ordering, but it does NOT guarantee atomicity for compound operations. A compound operation is any read-modify-write sequence — incrementing a counter, checking-then-acting, or swapping two values.

Consider a volatile int counter. The statement counter++ looks atomic in Java source, but javac compiles it into a sequence of bytecodes — getstatic (read), iconst_1 and iadd (increment), putstatic (write). Two threads can both read the same stale value, both increment, and both write back the same result — losing one update entirely. The counter was volatile the whole time, yet you still lost data.

For single-variable atomic operations use java.util.concurrent.atomic classes (AtomicInteger, AtomicLong, AtomicReference). They use CPU-level Compare-And-Swap (CAS) instructions which are both visible AND atomic — no locks required. Use synchronized or ReentrantLock when your critical section spans multiple variables that must stay consistent with each other. volatile is the right tool only when: (a) exactly one thread writes and one or more threads only read, OR (b) the write is already atomic (assigning a reference or a primitive other than long/double on most JVMs) and you only need the visibility/ordering guarantee.
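Under the hood, AtomicInteger.incrementAndGet() is conceptually a CAS retry loop. A minimal sketch (the real JDK implementation delegates to VarHandle/Unsafe intrinsics, but the retry structure is the same idea):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class CasLoopSketch {

    // Conceptual equivalent of AtomicInteger.incrementAndGet():
    static int incrementAndGet(AtomicInteger atomic) {
        int oldValue, newValue;
        do {
            oldValue = atomic.get();     // volatile read of the current value
            newValue = oldValue + 1;
            // compareAndSet succeeds only if no other thread changed the value
            // in between; on failure we simply re-read and retry — no lock,
            // no blocking, just another lap through the loop.
        } while (!atomic.compareAndSet(oldValue, newValue));
        return newValue;
    }

    public static void main(String[] args) {
        AtomicInteger counter = new AtomicInteger(41);
        System.out.println(incrementAndGet(counter)); // prints 42
    }
}
```

This is why AtomicInteger never loses updates: the read and the write are fused into one atomic compare-and-swap, unlike the three separate steps of a volatile ++.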

VolatileCounterRaceCondition.java · JAVA
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Proves that volatile does NOT prevent lost updates on compound operations.
 * Runs 10 threads each incrementing a counter 10,000 times.
 * Expected final value: 100,000
 *
 * Three counter implementations are compared:
 *   1. volatile int          — loses updates (race condition)
 *   2. synchronized int      — correct, but slower
 *   3. AtomicInteger         — correct and fast (preferred)
 */
public class VolatileCounterRaceCondition {

    // ❌ volatile does NOT make ++ atomic.
    private static volatile int volatileCounter = 0;

    // ✅ synchronized makes the entire read-modify-write atomic.
    private static int synchronizedCounter = 0;

    // ✅ AtomicInteger uses CAS — atomic AND lock-free.
    private static final AtomicInteger atomicCounter = new AtomicInteger(0);

    private static final int THREAD_COUNT      = 10;
    private static final int INCREMENTS_EACH   = 10_000;
    private static final int EXPECTED_TOTAL    = THREAD_COUNT * INCREMENTS_EACH;

    public static void main(String[] args) throws InterruptedException {
        CountDownLatch startGate = new CountDownLatch(1); // all threads start at once
        CountDownLatch doneGate  = new CountDownLatch(THREAD_COUNT);

        for (int threadIndex = 0; threadIndex < THREAD_COUNT; threadIndex++) {
            new Thread(() -> {
                try {
                    startGate.await(); // wait for the starting gun

                    for (int i = 0; i < INCREMENTS_EACH; i++) {
                        volatileCounter++;          // ❌ read-modify-write — NOT atomic

                        synchronized (VolatileCounterRaceCondition.class) {
                            synchronizedCounter++;  // ✅ entire operation is serialized
                        }

                        atomicCounter.incrementAndGet(); // ✅ single CAS instruction
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    doneGate.countDown();
                }
            }).start();
        }

        startGate.countDown(); // fire the starting gun
        doneGate.await();      // wait for all threads to finish

        System.out.println("Expected total       : " + EXPECTED_TOTAL);
        System.out.println("volatile result      : " + volatileCounter      + (volatileCounter      == EXPECTED_TOTAL ? " ✅" : " ❌ LOST UPDATES!"));
        System.out.println("synchronized result  : " + synchronizedCounter  + (synchronizedCounter  == EXPECTED_TOTAL ? " ✅" : " ❌"));
        System.out.println("AtomicInteger result : " + atomicCounter.get()  + (atomicCounter.get()  == EXPECTED_TOTAL ? " ✅" : " ❌"));
    }
}
▶ Output
Expected total : 100000
volatile result : 94371 ❌ LOST UPDATES!
synchronized result : 100000 ✅
AtomicInteger result : 100000 ✅
⚠️ Pro Tip: The Single-Writer Rule
volatile is safe for compound read-modify-write ONLY when a single designated thread is the sole writer. If only one thread ever sets the value and all others only read it (a common pattern for shutdown flags, config toggles, and heartbeat timestamps), volatile is exactly the right tool — it's cheaper than a lock and gives you exactly the guarantees you need.
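The single-writer pattern can be sketched like this — a hypothetical heartbeat monitor where exactly one thread calls beat() and any number of watchdog threads call isAlive() (class and method names are made up for the example):

```java
/**
 * Sketch of the single-writer rule: one thread writes the volatile field,
 * any number of threads read it. No lock is needed because no two writes
 * ever race — volatile's visibility guarantee is all the readers require.
 */
public class HeartbeatMonitor {

    // Written ONLY by the designated heartbeat thread; read by watchdogs.
    private volatile long lastBeatMillis = System.currentTimeMillis();

    void beat() {                           // called by the single writer
        lastBeatMillis = System.currentTimeMillis();
    }

    boolean isAlive(long timeoutMillis) {   // safe from any reader thread
        return System.currentTimeMillis() - lastBeatMillis < timeoutMillis;
    }

    public static void main(String[] args) {
        HeartbeatMonitor monitor = new HeartbeatMonitor();
        monitor.beat();
        System.out.println("alive: " + monitor.isAlive(5_000));
    }
}
```

If a second thread ever started calling beat(), nothing breaks here either — timestamps are single atomic writes — but the moment the update becomes read-modify-write (e.g. lastBeatMillis++), the single-writer constraint is what keeps it correct.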

Double-Checked Locking and Safe Publication — The Classic volatile Production Pattern

Double-checked locking (DCL) is the most famous real-world use of volatile. It solves the expensive problem of initializing a heavy singleton lazily without paying synchronization cost on every access after initialization.

The broken pre-Java-5 version skipped volatile and suffered from instruction reordering: the JVM is allowed to write the reference to the singleton field BEFORE fully running the constructor body. A second thread could see a non-null reference and skip the synchronized block, then dereference an incompletely constructed object and read garbage field values. This was not hypothetical — it caused subtle production bugs on multi-socket servers where memory ordering between CPUs was weak.

The fix is exactly one volatile keyword on the instance field. The volatile write that assigns the reference acts as a release barrier: the JVM cannot reorder the reference publication before any write inside the constructor. Any thread that reads a non-null instance is guaranteed to see a fully constructed object. Similar safe-publication techniques appear throughout java.util.concurrent — for example in FutureTask's volatile state field and ConcurrentHashMap's internals.
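For contrast, here is the broken pre-Java-5 shape — shown only to make the failure mode concrete, with a hypothetical class name and an illustrative field. Do not use this:

```java
/**
 * ❌ BROKEN pre-Java-5 double-checked locking.
 * Note the MISSING volatile on 'instance'.
 */
public class BrokenSingletonDCL {
    private static BrokenSingletonDCL instance;  // ❌ not volatile!

    private final int answer;
    private BrokenSingletonDCL() { this.answer = 42; }

    public static BrokenSingletonDCL getInstance() {
        if (instance == null) {                        // unsynchronized read
            synchronized (BrokenSingletonDCL.class) {
                if (instance == null) {
                    // The JIT/CPU may publish the reference BEFORE the
                    // constructor's writes complete. Another thread can then
                    // pass the first null-check, skip the lock entirely,
                    // and read answer == 0 from a half-built object.
                    instance = new BrokenSingletonDCL();
                }
            }
        }
        return instance;
    }

    public static void main(String[] args) {
        // Single-threaded the bug never shows — which is exactly why this
        // version slipped through testing for years.
        System.out.println(getInstance().answer);
    }
}
```

The race is real but timing-dependent and architecture-dependent, so it compiles, runs, and prints 42 in any single-threaded test.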

SafeSingletonDCL.java · JAVA
/**
 * Thread-safe lazy singleton using Double-Checked Locking (DCL).
 * This is the correct, production-safe pattern as of Java 5+.
 *
 * The volatile keyword on 'instance' is NOT optional.
 * Without it, instruction reordering can expose a partially
 * constructed object to threads that win the second null-check.
 */
public class SafeSingletonDCL {

    // ✅ volatile prevents the JIT from reordering the constructor
    // writes with the store to this reference field.
    // Without volatile, another thread could see non-null instance
    // but still read uninitialised field values from the object.
    private static volatile SafeSingletonDCL instance = null;

    private final String databaseUrl;
    private final int    maxConnections;

    // Private constructor simulates expensive initialisation.
    private SafeSingletonDCL() {
        // Pretend this takes 200ms to initialise a connection pool.
        this.databaseUrl    = "jdbc:postgresql://prod-db:5432/app";
        this.maxConnections = 20;
        System.out.println("[" + Thread.currentThread().getName() + "] Constructor running — expensive setup done.");
    }

    public static SafeSingletonDCL getInstance() {
        // First check — no lock. The vast majority of calls take this fast path
        // once the singleton is initialised. volatile ensures we see the true value.
        if (instance == null) {

            // Only one thread should run the constructor.
            synchronized (SafeSingletonDCL.class) {

                // Second check — we now hold the lock, but another thread
                // might have already initialised it while we were waiting.
                if (instance == null) {
                    instance = new SafeSingletonDCL(); // volatile write = release barrier
                }
            }
        }
        return instance; // volatile read = acquire barrier
    }

    public String getDatabaseUrl()    { return databaseUrl; }
    public int    getMaxConnections() { return maxConnections; }

    // ---- Demo main -------------------------------------------------------
    public static void main(String[] args) throws InterruptedException {
        int threadCount = 8;
        Thread[] threads = new Thread[threadCount];

        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(() -> {
                SafeSingletonDCL singleton = SafeSingletonDCL.getInstance();
                // Every thread must print the same object hash and URL.
                System.out.printf("[%-20s] instance@%d  url=%s  maxConn=%d%n",
                    Thread.currentThread().getName(),
                    System.identityHashCode(singleton),
                    singleton.getDatabaseUrl(),
                    singleton.getMaxConnections());
            }, "worker-" + i);
        }

        for (Thread t : threads) t.start();
        for (Thread t : threads) t.join();
    }
}
▶ Output
[worker-0 ] Constructor running — expensive setup done.
[worker-0 ] instance@1829164700 url=jdbc:postgresql://prod-db:5432/app maxConn=20
[worker-3 ] instance@1829164700 url=jdbc:postgresql://prod-db:5432/app maxConn=20
[worker-1 ] instance@1829164700 url=jdbc:postgresql://prod-db:5432/app maxConn=20
[worker-5 ] instance@1829164700 url=jdbc:postgresql://prod-db:5432/app maxConn=20
[worker-2 ] instance@1829164700 url=jdbc:postgresql://prod-db:5432/app maxConn=20
[worker-7 ] instance@1829164700 url=jdbc:postgresql://prod-db:5432/app maxConn=20
[worker-4 ] instance@1829164700 url=jdbc:postgresql://prod-db:5432/app maxConn=20
[worker-6 ] instance@1829164700 url=jdbc:postgresql://prod-db:5432/app maxConn=20
🔥 Interview Gold: Why DCL Needs volatile, Not Just synchronized
The synchronized block prevents two threads running the constructor simultaneously. volatile prevents a different category of failure: instruction reordering inside the JIT that publishes the reference before the object is fully constructed. They solve different problems. An interviewer who asks 'why is the volatile necessary if you already have synchronized?' is testing whether you understand memory ordering vs mutual exclusion.
| Feature / Aspect | volatile | synchronized / ReentrantLock | AtomicInteger / AtomicReference |
|---|---|---|---|
| Visibility guarantee | ✅ Yes — always reads/writes main memory | ✅ Yes — on monitor exit/entry | ✅ Yes — via CAS memory semantics |
| Atomicity of compound ops (e.g. ++) | ❌ No — separate read/modify/write steps | ✅ Yes — critical section is serialised | ✅ Yes — single CPU CAS instruction |
| Mutual exclusion (only one thread at a time) | ❌ No | ✅ Yes | ❌ No (lock-free, not mutex-free) |
| Risk of deadlock | ❌ None | ⚠️ Yes if locks acquired out of order | ❌ None |
| Performance overhead | Lowest (memory fence only) | Highest (OS-level monitor + context switch risk) | Middle (CAS loop may retry under contention) |
| Instruction reorder prevention | ✅ Full fence — no reorder past barrier | ✅ Within the synchronized block | ✅ Tied to each CAS operation |
| Safe for long / double fields on 32-bit JVM | ✅ Yes — volatile makes them atomic | ✅ Yes (if properly locked) | ✅ AtomicLong handles this |
| Typical use case | Flags, status fields, safe publication of references | Multi-variable invariants, check-then-act | Counters, accumulators, CAS-based state machines |
| JDK examples | FutureTask.state, Thread.parkBlocker | Vector, Collections.synchronized* wrappers | ConcurrentHashMap cell counts, Striped64 |

🎯 Key Takeaways

  • volatile guarantees visibility and ordering via memory barriers — every read fetches from main memory and every write flushes to main memory, but it does NOT make compound operations like ++ atomic.
  • The JMM happens-before rule means a volatile write drags ALL preceding writes into main memory — not just the volatile field itself. This is what makes safe publication of objects via a volatile reference correct.
  • Double-checked locking requires volatile on the instance reference because synchronized prevents two threads running the constructor simultaneously, while volatile prevents the JIT from publishing the reference before the constructor body has finished executing — two completely different failure modes.
  • Use volatile for: shutdown flags (single-writer, many readers), publishing immutable objects safely, and long/double fields on 32-bit JVMs. Switch to AtomicInteger/AtomicReference for CAS-based updates, and to synchronized or locks when multiple variables must stay mutually consistent.

⚠ Common Mistakes to Avoid

  • Mistake 1: Using volatile for counters shared by multiple writers — counter++ on a volatile int is compiled to three bytecodes (read, increment, write), so two threads can both read 5, both compute 6, and both write 6, losing one increment. The symptom is a final count lower than expected, and it's non-deterministic so it passes unit tests. Fix: replace volatile int with AtomicInteger and use incrementAndGet() instead of ++.
  • Mistake 2: Thinking volatile replaces synchronized for check-then-act logic — code like 'if (cache == null) { cache = compute(); }' is still a race condition even if cache is volatile, because the read of null and the write of the result are two separate operations with no atomicity between them. Two threads can both see null, both compute, and both assign — wasting work or worse, assigning inconsistent objects. Fix: use synchronized around the entire check-then-act block, or use AtomicReference.compareAndSet() for a lock-free approach.
  • Mistake 3: Omitting volatile on the instance field in Double-Checked Locking — the code looks correct, passes single-threaded tests, and may even pass most concurrent tests on x86 (which has a relatively strong memory model). But on ARM-based servers or under aggressive JIT optimization the reference can be published before the constructor finishes writing all fields. Fix: always declare the DCL instance field volatile. If you find volatile unacceptable, use the Initialization-on-Demand Holder idiom (a static inner class) instead — it's lazily loaded by the classloader with full thread-safety and zero volatile overhead.
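The Initialization-on-Demand Holder idiom mentioned in Mistake 3 can be sketched as follows (class names are illustrative). It relies on the JVM's class-initialization guarantee rather than volatile or explicit locking:

```java
/**
 * Initialization-on-Demand Holder idiom: lazy, thread-safe, no volatile,
 * no locks in user code — the JVM's class-initialization machinery does
 * all the synchronization.
 */
public class HolderSingleton {

    private HolderSingleton() {
        System.out.println("expensive init runs exactly once");
    }

    // The nested class is NOT initialized until first referenced, and the
    // JVM guarantees class initialization is atomic and happens-before
    // any use of Holder.INSTANCE by any thread.
    private static class Holder {
        static final HolderSingleton INSTANCE = new HolderSingleton();
    }

    public static HolderSingleton getInstance() {
        return Holder.INSTANCE;  // first call triggers Holder's initialization
    }

    public static void main(String[] args) {
        System.out.println(getInstance() == getInstance()); // prints true
    }
}
```

Compared with DCL this is shorter, harder to get wrong, and pays zero per-read cost after initialization — the main reason to prefer DCL over it is when the lazily initialized thing is an instance field rather than a static.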

Interview Questions on This Topic

  • Q: Can you explain the difference between volatile and synchronized in Java, and give a real example where volatile is the better choice and one where it's dangerously insufficient?
  • Q: What exactly does the Java Memory Model's happens-before guarantee mean in the context of a volatile write followed by a volatile read — and does it affect fields other than the volatile field itself?
  • Q: Why does the double-checked locking pattern for a singleton specifically require the instance field to be volatile, even though the constructor call is already inside a synchronized block? What exact failure mode does removing volatile introduce?

Frequently Asked Questions

Does volatile make a variable thread-safe in Java?

Partially — volatile guarantees that reads and writes to a variable are always performed on main memory, not a CPU cache, so every thread sees the latest written value. However, it does not make compound operations like increment (++) atomic. For full thread safety on read-modify-write operations use AtomicInteger or synchronized instead.

Is volatile better than synchronized in Java?

They solve different problems. volatile is faster (no lock acquisition, no thread blocking) and appropriate when a single thread writes and others only read. synchronized is necessary when you need mutual exclusion or when a critical section covers multiple related variables that must stay consistent. Using volatile where you need synchronized is a correctness bug; using synchronized where volatile suffices is an unnecessary performance penalty.

Why do long and double fields need volatile in multithreaded Java code?

On 32-bit JVMs, the JLS permits reading and writing 64-bit types (long and double) as two separate 32-bit operations. Without volatile, a thread can read the high 32 bits from one write and the low 32 bits from a different write, producing a completely garbage value that never existed. Declaring the field volatile forces the JVM to treat reads and writes as atomic 64-bit operations. On modern 64-bit JVMs this tearing is unlikely in practice, but the spec does not guarantee it without volatile.
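The tear itself cannot be reliably reproduced on a modern 64-bit JVM, so the sketch below only shows the two declarations the JLS (§17.7) distinguishes; the field names are illustrative:

```java
/**
 * Sketch of the long/double word-tearing rule (JLS §17.7).
 */
public class WordTearing {

    // ❌ On a 32-bit JVM this write may be split into two 32-bit stores;
    // a concurrent reader could observe the high half of one write mixed
    // with the low half of another — a value that was never written.
    static long unsafeStamp;

    // ✅ volatile obliges the JVM to read and write all 64 bits atomically,
    // in addition to the usual visibility guarantee.
    static volatile long safeStamp;

    public static void main(String[] args) {
        safeStamp = 0x1234_5678_9ABC_DEF0L;
        System.out.println(Long.toHexString(safeStamp));
    }
}
```

Note this is a different guarantee from atomicity of compound operations: volatile makes each individual 64-bit read or write atomic, but safeStamp++ would still be a racy read-modify-write.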

TheCodeForge Editorial Team · Verified Author

Written and reviewed by senior developers with real-world experience across enterprise, startup and open-source projects. Every article on TheCodeForge is written to be clear, accurate and genuinely useful — not just SEO filler.

Forged with 🔥 at TheCodeForge.io — Where Developers Are Forged