JVM Memory Model Deep Dive: Heap, Stack, GC and Thread Visibility
- Heap = shared memory for all objects (GC-managed)
- Stack = per-thread memory (local variables, freed on return)
- Metaspace = class metadata (outside heap, control via -XX:MaxMetaspaceSize)
- GC pauses = stop-the-world events (G1 default, ZGC for <1ms latency)
- Happens-before = thread visibility guarantee (use volatile or synchronized)
Every Java performance crisis, every mysterious NullPointerException in production at 3 AM, and every subtle data-race bug ultimately traces back to the same root cause: the developer didn't have a clear mental model of how the JVM manages memory. It's not an academic concern — OutOfMemoryErrors, thread-visibility bugs, and stop-the-world GC pauses are day-one realities on any high-traffic service. Yet most Java developers can describe the syntax of a HashMap far better than they can explain why two threads can see different values for the same variable without any apparent concurrency bug.
The JVM Memory Model (JMM) solves two distinct but interrelated problems. First, it defines the physical layout of memory — where objects live, how long they live, and how the garbage collector reclaims them. Second, it defines the visibility and ordering guarantees between threads — the rules that determine whether a write made by Thread A is actually observable by Thread B. Mixing up these two concerns is the source of enormous confusion. The JMM specification (JSR-133, baked into the Java Language Specification since Java 5) is one of the most carefully engineered pieces of the Java platform, and understanding it separates senior engineers from the rest.
I've debugged JVM memory issues across payment processing systems handling 50,000 TPS, recommendation engines running 60 GB heaps, and microservices dying silently from metaspace exhaustion after hot-deploy cycles. The patterns are always the same: developers who understand the memory layout fix problems in minutes; developers who don't spend days chasing phantom bugs.
By the end of this article you'll be able to walk through a running JVM and name exactly what lives where and why. You'll understand the happens-before relationship well enough to reason about data races without guessing. You'll know how to tune GC regions for low-latency workloads, avoid the common memory-layout mistakes that cause silent correctness bugs, and answer the JMM interview questions that trip up even experienced engineers.
> ⚠️ Terminology note: This guide covers two distinct concepts that share confusingly similar names. JVM Memory (heap, stack, metaspace, GC) is the runtime memory structure — where objects live and how they're reclaimed. Java Memory Model (JMM) (happens-before, volatile, synchronized) is the thread visibility specification — the rules that determine when one thread's writes are observable by another. Both are covered here because they're deeply interrelated in production debugging.
What Is the JVM Memory Model?
The JVM Memory Model defines two things that engineers constantly conflate:
- The memory layout — how the JVM divides process memory into regions (heap, stack, metaspace, etc.), what lives in each region, and when memory is reclaimed.
- The visibility model — the happens-before rules that determine when a write by one thread is guaranteed to be visible to another thread. This is what volatile, synchronized, java.util.concurrent, and final fields are built on.
Every OutOfMemoryError you've ever seen is a failure of the first part. Every 'works on my machine but not in production' concurrency bug is a failure of the second part. They're different problems requiring different tools, and confusing them is the single most common mistake I see in JMM discussions.
The JVM spec divides runtime memory into five areas: heap, stack (per-thread), program counter register (per-thread), native method stack (per-thread), and the method area, which HotSpot implements as metaspace (native memory, replacing PermGen) since Java 8. The heap is shared across all threads. The stack, PC register, and native method stack are per-thread — no synchronization needed. Metaspace is shared but rarely mutated after class loading.
```java
// io.thecodeforge.jvm.memory.MemoryLayoutDemo
// Demonstrates the five JVM memory areas and what lives where.
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.lang.management.ThreadMXBean;

public class MemoryLayoutDemo {

    private String applicationName = "TheCodeForge";  // instance field — stored on the heap inside the object
    private static final int MAX_CONNECTIONS = 1024;  // class data — referenced via metaspace

    public static void main(String[] args) {
        MemoryMXBean memoryBean = ManagementFactory.getMemoryMXBean();
        System.out.println("=== JVM MEMORY LAYOUT ===");
        System.out.println();

        MemoryUsage heap = memoryBean.getHeapMemoryUsage();
        System.out.println("HEAP (shared across all threads):");
        System.out.printf("  Init:      %,d bytes (%.1f MB)%n", heap.getInit(), heap.getInit() / 1048576.0);
        System.out.printf("  Used:      %,d bytes (%.1f MB)%n", heap.getUsed(), heap.getUsed() / 1048576.0);
        System.out.printf("  Committed: %,d bytes (%.1f MB)%n", heap.getCommitted(), heap.getCommitted() / 1048576.0);
        System.out.printf("  Max:       %s%n", heap.getMax() == -1 ? "unlimited"
                : String.format("%,d bytes (%.1f MB)", heap.getMax(), heap.getMax() / 1048576.0));
        System.out.println("  Contains: all objects, arrays, string pool contents");
        System.out.println();

        MemoryUsage nonHeap = memoryBean.getNonHeapMemoryUsage();
        System.out.println("NON-HEAP (includes Metaspace):");
        System.out.printf("  Init: %,d bytes (%.1f MB)%n", nonHeap.getInit(), nonHeap.getInit() / 1048576.0);
        System.out.printf("  Used: %,d bytes (%.1f MB)%n", nonHeap.getUsed(), nonHeap.getUsed() / 1048576.0);
        System.out.printf("  Max:  %s%n", nonHeap.getMax() == -1 ? "unlimited"
                : String.format("%,d bytes (%.1f MB)", nonHeap.getMax(), nonHeap.getMax() / 1048576.0));
        System.out.println("  Contains: class metadata, method bytecode, JIT code cache");
        System.out.println();

        System.out.println("MEMORY POOLS (heap regions + non-heap regions):");
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            System.out.printf("  %-30s used: %6.1f MB  max: %s  type: %s%n",
                    pool.getName(),
                    usage.getUsed() / 1048576.0,
                    usage.getMax() == -1 ? "unlimited" : String.format("%.1f MB", usage.getMax() / 1048576.0),
                    pool.getType());
        }
        System.out.println();

        ThreadMXBean threadBean = ManagementFactory.getThreadMXBean();
        System.out.println("STACK (per-thread — each thread has its own):");
        System.out.printf("  Active threads: %d%n", threadBean.getThreadCount());
        System.out.printf("  Current thread stack: %s%n", Thread.currentThread().getName());
        System.out.println("  Contains: local variables, method parameters, return addresses");
        System.out.println("  Each stack frame = one method call on the call stack");
        System.out.println();

        System.out.println("PROGRAM COUNTER REGISTER (per-thread):");
        System.out.println("  Points to the next JVM instruction to execute");
        System.out.println("  For native methods: undefined (native code manages its own PC)");
        System.out.println();

        System.out.println("=== SUMMARY ===");
        System.out.println("  HEAP:      shared, objects/arrays, garbage collected");
        System.out.println("  STACK:     per-thread, local variables, auto-managed");
        System.out.println("  METASPACE: shared, class metadata, grows until MaxMetaspaceSize");
        System.out.println("  PC REG:    per-thread, current instruction pointer");
        System.out.println("  NATIVE:    per-thread, for JNI/native method calls");
    }
}
```
- Heap = the JVM's RAM — all objects live here, shared across threads
- Stack = per-thread workspace — each thread has its own, no sharing needed
- Metaspace = blueprint storage — class definitions, loaded once at startup
Heap Memory — Young Generation, Old Generation, and How Objects Age
The heap is where all Java objects live. It's shared across all threads, and it's where garbage collection operates.
📊 Heap Flow:

```
┌──────────────┐      ┌───────────────┐      ┌──────────────┐
│     EDEN     │ ──→  │   SURVIVOR    │ ──→  │   OLD GEN    │
│ (new objects)│      │ (aged objects)│      │ (long-lived) │
└──────────────┘      └───────────────┘      └──────────────┘
       ↓                      ↓                     ↓
   Minor GC               Minor GC              Full GC
    (fast)                (copying)              (slow)
```
Young Generation (New Space): Where new objects are allocated.
- Eden: All new objects start here. When Eden fills up, a minor GC runs.
- Survivor Space 0 (S0) and Survivor Space 1 (S1): Two equal-sized spaces. Objects that survive minor GCs get copied between them, aging each time.
- Promotion: When an object's age exceeds the tenuring threshold (default: 15), it moves to the Old Generation.
Old Generation (Tenured Space): Long-lived objects. When Old Gen fills up, a major GC (or full GC) runs — expensive, often stop-the-world.
The generational hypothesis: 90-98% of objects die young. Minor GCs are fast (1-10ms). Full GCs are slow (100ms to seconds).
When NOT to rely on the generational defaults:
- Ultra-low latency systems (<1ms pauses): G1's generational model still causes stop-the-world pauses. Use ZGC instead (-XX:+UseZGC).
- Heaps > 64 GB: G1's region management overhead grows. Consider ZGC or Shenandoah.
- Short-lived batch jobs: GC tuning won't help if the JVM exits in seconds. Focus on allocation rate.
```java
// io.thecodeforge.jvm.memory.HeapStructureDemo
// Shows how objects move through heap generations.
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

public class HeapStructureDemo {

    static class OrderEvent {
        private final String orderId;
        private final String customerId;
        private final double amount;
        private final long timestamp;
        private final byte[] payload;

        OrderEvent(String orderId, String customerId, double amount) {
            this.orderId = orderId;
            this.customerId = customerId;
            this.amount = amount;
            this.timestamp = System.currentTimeMillis();
            this.payload = new byte[256];  // padding to make allocation pressure visible
        }
    }

    public static void main(String[] args) {
        System.out.println("=== HEAP GENERATION TRACKING ===");
        System.out.println();
        printMemoryPools("BEFORE allocation");

        System.out.println("\nPhase 1: Allocating 100,000 short-lived objects...");
        for (int batch = 0; batch < 10; batch++) {
            List<OrderEvent> shortLived = new ArrayList<>();
            for (int i = 0; i < 10_000; i++) {
                shortLived.add(new OrderEvent(
                        "ORD-" + batch + "-" + i, "CUST-" + (i % 1000), Math.random() * 500));
            }
            // shortLived goes out of scope each iteration — these objects die in Eden
        }
        printMemoryPools("AFTER short-lived allocation");

        System.out.println("\nRequesting GC...");
        System.gc();
        printMemoryPools("AFTER GC — Eden should be nearly empty");

        System.out.println("\nPhase 2: Allocating 50,000 long-lived objects...");
        List<OrderEvent> longLived = new ArrayList<>();
        for (int i = 0; i < 50_000; i++) {
            longLived.add(new OrderEvent(
                    "LONG-" + i, "CUST-PERM-" + (i % 100), Math.random() * 1000));
        }
        printMemoryPools("AFTER long-lived allocation");

        System.out.println("\nPhase 3: Multiple GCs to promote survivors to Old Gen...");
        for (int i = 0; i < 5; i++) {
            System.gc();
            System.out.println("  GC cycle " + (i + 1) + " complete");
        }
        printMemoryPools("AFTER promotion cycles — Old Gen should have grown");
    }

    static void printMemoryPools(String label) {
        System.out.println("\n  " + label + ":");
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) {
                MemoryUsage usage = pool.getUsage();
                System.out.printf("    %-30s used: %6.1f MB  committed: %8.1f MB%n",
                        pool.getName(), usage.getUsed() / 1048576.0, usage.getCommitted() / 1048576.0);
            }
        }
    }
}
```
- New objects land on the belt (Eden) — most die here instantly
- Survivors move to a holding area (Survivor spaces), aging each pass
- Long-lived objects graduate to the warehouse (Old Gen)
- Cleaning the belt = fast (minor GC). Cleaning the warehouse = slow (full GC)
- Monitor aging with -XX:+PrintTenuringDistribution (use -Xlog:gc+age=trace on Java 9+); target a 70-80% survivor survival rate

Garbage Collection — How the JVM Reclaims Memory
The garbage collector automatically reclaims memory occupied by objects that are no longer reachable from any GC root (local variables, static fields, active threads, JNI references).
GC Root types: Local variables, static fields, active threads, JNI references, monitors.
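Reachability from a GC root can be observed directly. Below is a minimal sketch (the class name and array size are my own, not from the demos in this article) using a WeakReference: while a local variable — a GC root — strongly references the array, no collector may reclaim it; once that reference is cleared, a subsequent collection is free to do so. Note that System.gc() is only a hint, so the second result is JVM-dependent.

```java
import java.lang.ref.WeakReference;

public class GcRootSketch {
    public static void main(String[] args) {
        // 'data' is a local variable — a GC root referent. While it holds
        // the only strong reference, the array cannot be reclaimed.
        byte[] data = new byte[8 * 1024 * 1024];
        WeakReference<byte[]> ref = new WeakReference<>(data);

        System.gc();
        // Reading data.length here keeps the array strongly reachable
        // through this point, so the weak reference cannot be cleared yet.
        System.out.println("collected while rooted? " + (ref.get() == null)
                + " (length=" + data.length + ")"); // collected while rooted? false

        // Clear the last strong reference: the array is now only weakly
        // reachable, and the next collection may reclaim it.
        data = null;
        System.gc();
        System.out.println("collected after root cleared? " + (ref.get() == null));
    }
}
```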
Major GC algorithms:
G1 (Garbage First) — Default since Java 9. Divides the heap into equal-sized regions (1-32 MB, set via -XX:G1HeapRegionSize). Collects the regions with the most garbage first. Target pause time: -XX:MaxGCPauseMillis (default 200ms). Best for: heaps 4-64 GB, moderate latency.
ZGC — Ultra-low latency. Sub-millisecond pauses regardless of heap size (tested to 16 TB). Uses colored pointers + load barriers. Experimental since Java 11, production-ready since Java 15, generational since Java 21. Best for: heaps > 16 GB, sub-ms latency requirements.
Parallel GC — Throughput-optimized. Multiple GC threads, stop-the-world pauses. Maximizes total application throughput at the cost of longer individual pauses. Best for: batch jobs, ETL, analytics.
When NOT to use G1:
- Ultra-low latency systems (<1ms pauses): G1 still has stop-the-world phases. Use ZGC.
- High-throughput batch processing: G1's concurrent overhead reduces throughput. Use Parallel GC.
- Heaps < 2 GB: G1's region management overhead isn't worth it. Use Serial GC (-XX:+UseSerialGC).
→ Java Memory Leaks and Prevention — Fix container OOMKills and set correct memory limits
```java
// io.thecodeforge.jvm.memory.GCTuningDemo
// Demonstrates GC behavior and collector selection.
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

public class GCTuningDemo {

    public static void main(String[] args) {
        System.out.println("=== GC INFORMATION ===");
        System.out.println();
        System.out.println("Active Garbage Collectors:");
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("  Name: %-30s Collections: %d  Time: %d ms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
        System.out.println();

        System.out.println("=== GC SELECTION GUIDE ===");
        System.out.println();
        System.out.println("┌────────────────────────────────┬──────────────────────────────┐");
        System.out.println("│ USE G1 IF:                     │ USE ZGC IF:                  │");
        System.out.println("├────────────────────────────────┼──────────────────────────────┤");
        System.out.println("│ • Heap 4-64 GB                 │ • Heap > 16 GB               │");
        System.out.println("│ • Moderate latency (50-200ms)  │ • Sub-millisecond pauses     │");
        System.out.println("│ • Default, no tuning needed    │ • Real-time / trading systems│");
        System.out.println("├────────────────────────────────┼──────────────────────────────┤");
        System.out.println("│ USE PARALLEL GC IF:            │ AVOID G1 IF:                 │");
        System.out.println("│ • Batch jobs / ETL             │ • Ultra-low latency (<1ms)   │");
        System.out.println("│ • Max throughput needed        │ • Heap < 2 GB                │");
        System.out.println("│ • GC pauses don't matter       │ • High-throughput batch      │");
        System.out.println("└────────────────────────────────┴──────────────────────────────┘");
        System.out.println();

        System.out.println("=== RECOMMENDED GC FLAGS ===");
        System.out.println();
        System.out.println("Low-latency (web services):");
        System.out.println("  -XX:+UseG1GC -XX:MaxGCPauseMillis=50 -XX:G1HeapRegionSize=4m");
        System.out.println();
        System.out.println("Ultra-low latency (<1ms):");
        System.out.println("  -XX:+UseZGC -XX:+ZGenerational");
        System.out.println();
        System.out.println("Throughput (batch):");
        System.out.println("  -XX:+UseParallelGC -XX:ParallelGCThreads=<cores>");
        System.out.println();
        System.out.println("GC logging (ALWAYS enable):");
        System.out.println("  -Xlog:gc*:file=gc.log:time,uptime,level,tags:filecount=5,filesize=50m");

        System.out.println("\n=== ALLOCATION PRESSURE TEST ===");
        long gcCountBefore = getTotalGCCount();
        long gcTimeBefore = getTotalGCTime();
        List<byte[]> pressure = new ArrayList<>();
        for (int i = 0; i < 200; i++) {
            pressure.add(new byte[1024 * 1024]);  // 1 MB per allocation
            if ((i + 1) % 50 == 0) {
                long gcCountNow = getTotalGCCount();
                long gcTimeNow = getTotalGCTime();
                System.out.printf("  Allocated %d MB — GC count: %d (+%d), GC time: %d ms (+%d ms)%n",
                        i + 1, gcCountNow, gcCountNow - gcCountBefore,
                        gcTimeNow, gcTimeNow - gcTimeBefore);
            }
        }
        pressure.clear();
        System.gc();
        System.out.printf("%n  Total GC events: %d%n", getTotalGCCount() - gcCountBefore);
        System.out.printf("  Total GC time:   %d ms%n", getTotalGCTime() - gcTimeBefore);
    }

    static long getTotalGCCount() {
        return ManagementFactory.getGarbageCollectorMXBeans().stream()
                .mapToLong(GarbageCollectorMXBean::getCollectionCount).sum();
    }

    static long getTotalGCTime() {
        return ManagementFactory.getGarbageCollectorMXBeans().stream()
                .mapToLong(GarbageCollectorMXBean::getCollectionTime).sum();
    }
}
```
- G1 — cleans the messiest aisles first (Garbage First). Best for 4–64 GB heaps
- ZGC — hires a night crew that cleans while you work. Sub-ms pauses, any size heap
- Parallel GC — brings the whole team in. Max throughput, stop-the-world pauses
Happens-Before — Thread Visibility and the Rules That Prevent Data Races
This is the second half of the JMM — and the half that causes the most subtle bugs. The memory layout (heap, stack, GC) determines where objects live. The happens-before rules determine when one thread's writes are visible to another thread.
The core problem: Modern CPUs have multiple cores, each with its own L1/L2 cache. Without synchronization, there is NO guarantee that Thread B sees Thread A's write.
The JMM solution — happens-before: A partial ordering of operations. If A happens-before B, then A's writes are visible to B.
Key rules:
1. Program order: Within one thread, every action happens-before later actions.
2. Monitor lock: An unlock happens-before every subsequent lock on the same monitor (synchronized).
3. Volatile variable: A write to a volatile field happens-before every subsequent read of that field.
4. Thread start: Thread.start() happens-before all actions in the started thread.
5. Thread join: All of a thread's actions happen-before Thread.join() on it returns.
6. Transitivity: If A happens-before B and B happens-before C, then A happens-before C.
When NOT to use volatile:
- Compound operations (e.g. count++): Volatile only provides visibility, not atomicity. Use AtomicInteger or synchronized.
- Multiple variables needing consistent state: Volatile on one variable doesn't create happens-before for others. Use synchronized or Lock.
- When you need mutual exclusion: Volatile doesn't block threads. Use synchronized or ReentrantLock.
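For the compound-operation case, AtomicInteger is the drop-in fix. A minimal sketch (class and field names are my own): getAndIncrement()/incrementAndGet() perform the read-modify-write as a single atomic hardware operation, so no update is ever lost.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class AtomicCounterSketch {
    // incrementAndGet() is one atomic read-modify-write (a CAS loop),
    // so concurrent increments are never lost — unlike volatile int++.
    private static final AtomicInteger counter = new AtomicInteger();

    public static void main(String[] args) throws InterruptedException {
        Thread[] workers = new Thread[10];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < 10_000; j++) {
                    counter.incrementAndGet();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join();
        }
        // 10 threads x 10,000 increments — always exactly 100,000.
        System.out.println("counter = " + counter.get()); // prints "counter = 100000"
    }
}
```

Atomic classes also establish happens-before edges, so the final get() in main is guaranteed to see every worker's writes.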
⚠️ x86 Hides Concurrency Bugs — ARM Exposes Them: x86 has strong memory ordering (TSO). Many data races 'work' on x86 but crash on ARM (Graviton, Apple Silicon). If you deploy to ARM, test there. Always establish happens-before edges — never rely on architecture-specific behavior.
→ Multithreading in Java — Concurrent collections and thread-safe patterns
```java
// io.thecodeforge.jvm.memory.HappensBeforeDemo
// Demonstrates visibility, volatile, and data races.
public class HappensBeforeDemo {

    private static boolean running = true;                   // plain field — NO visibility guarantee
    private static volatile boolean volatileRunning = true;  // volatile — happens-before on every read/write
    private static volatile int volatileCounter = 0;

    public static void main(String[] args) throws InterruptedException {
        System.out.println("=== HAPPENS-BEFORE DEMONSTRATION ===");
        System.out.println();

        System.out.println("--- Demo 1: Non-volatile flag (NO happens-before) ---");
        Thread worker = new Thread(() -> {
            int iterations = 0;
            while (running) {
                iterations++;
            }
            System.out.println("  Worker exited after " + iterations + " iterations");
        });
        worker.start();
        Thread.sleep(100);
        running = false;
        System.out.println("  Main set running=false. Worker MAY never see it.");
        worker.join(1000);
        if (worker.isAlive()) {
            System.out.println("  ❌ Worker still running — data race! (x86 may hide this)");
            worker.interrupt();
        }
        System.out.println();

        System.out.println("--- Demo 2: Volatile flag (happens-before guaranteed) ---");
        Thread worker2 = new Thread(() -> {
            int iterations = 0;
            while (volatileRunning) {
                iterations++;
            }
            System.out.println("  Worker exited after " + iterations + " iterations");
        });
        worker2.start();
        Thread.sleep(100);
        volatileRunning = false;
        worker2.join();
        System.out.println("  ✅ Worker exited — happens-before guaranteed");
        System.out.println();

        System.out.println("--- Demo 3: Volatile does NOT provide atomicity ---");
        Thread[] incrementers = new Thread[10];
        for (int i = 0; i < 10; i++) {
            incrementers[i] = new Thread(() -> {
                for (int j = 0; j < 10_000; j++) {
                    volatileCounter++;  // read-modify-write: three steps, not atomic
                }
            });
        }
        for (Thread t : incrementers) t.start();
        for (Thread t : incrementers) t.join();
        System.out.printf("  10 threads * 10,000 increments = 100,000 expected%n");
        System.out.printf("  volatileCounter = %d (likely less — lost updates!)%n", volatileCounter);
        System.out.println("  Fix: Use AtomicInteger or synchronized");
        System.out.println();

        System.out.println("--- Demo 4: Double-checked locking (requires volatile) ---");
        System.out.println("  Before Java 5, double-checked locking was BROKEN.");
        System.out.println("  Java 5+ requires volatile for correctness:");
        System.out.println("    private static volatile MyClass instance;");
        System.out.println("    if (instance == null) {");
        System.out.println("        synchronized (MyClass.class) {");
        System.out.println("            if (instance == null) {");
        System.out.println("                instance = new MyClass();");
        System.out.println("            }");
        System.out.println("        }");
        System.out.println("    }");
    }
}
```
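Demo 4 only prints the pattern as text. Here is a runnable version as a minimal sketch (LazyConfig and its field are illustrative names, not from the demos). The volatile modifier is what prevents another thread from observing a reference to a partially constructed instance:

```java
public class LazyConfig {
    // volatile is essential: it creates a happens-before edge between the
    // write of 'instance' and every later read, so no thread can see a
    // reference to a half-constructed object.
    private static volatile LazyConfig instance;

    private final String name;

    private LazyConfig() {
        this.name = "config";
    }

    public static LazyConfig getInstance() {
        LazyConfig local = instance;          // single volatile read on the fast path
        if (local == null) {                  // first check — no lock taken
            synchronized (LazyConfig.class) {
                local = instance;
                if (local == null) {          // second check — under the lock
                    local = instance = new LazyConfig();
                }
            }
        }
        return local;
    }

    public static void main(String[] args) {
        System.out.println(LazyConfig.getInstance() == LazyConfig.getInstance()); // prints "true"
    }
}
```

The local variable trick (copying the volatile field once) is the standard idiom: it avoids a second volatile read on the common path where the instance already exists.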
- volatile = bulletin board: writes posted for all threads to see (visibility only)
- synchronized = meeting room: one thread inside at a time (visibility + atomicity)
- Thread.start() / Thread.join() = handshake: guarantees ordering across thread boundaries
Common Production Mistakes and Debugging Patterns
These are the mistakes I've seen in production systems and the debugging patterns that caught them. Every one of these has caused a real incident.
→ Java Memory Leaks and Prevention — Take and analyze heap dumps step by step
```java
// io.thecodeforge.jvm.memory.ProductionMistakesDemo
// Demonstrates common production memory mistakes and fixes.
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

public class ProductionMistakesDemo {

    private static final ThreadLocal<List<byte[]>> leakedThreadLocal = new ThreadLocal<>();

    public static void main(String[] args) throws Exception {
        System.out.println("=== PRODUCTION JVM MEMORY MISTAKES ===");
        System.out.println();

        System.out.println("--- Mistake 1: -Xmx too large for container ---");
        MemoryMXBean bean = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = bean.getHeapMemoryUsage();
        MemoryUsage nonHeap = bean.getNonHeapMemoryUsage();
        System.out.printf("  Heap: %.1f MB, Non-heap: %.1f MB, Thread stacks: ~10 MB%n",
                heap.getCommitted() / 1048576.0, nonHeap.getUsed() / 1048576.0);
        System.out.println("  🔧 Fix: -Xmx = 75-80% of container limit");
        System.out.println();

        System.out.println("--- Mistake 2: Off-heap (direct buffer) memory ---");
        List<ByteBuffer> directBuffers = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            directBuffers.add(ByteBuffer.allocateDirect(1024 * 1024));  // 1 MB each, outside the heap
        }
        System.out.printf("  Allocated 100 MB direct buffers — heap unchanged at %.1f MB%n",
                bean.getHeapMemoryUsage().getUsed() / 1048576.0);
        System.out.println("  🔧 Monitor: -XX:NativeMemoryTracking=summary");
        System.out.println("  🔧 Inspect: jcmd <pid> VM.native_memory summary");
        System.out.println();

        System.out.println("--- Mistake 3: ThreadLocal leak ---");
        Thread[] pool = new Thread[3];
        for (int i = 0; i < 3; i++) {
            pool[i] = new Thread(() -> {
                // Value is set but never removed — it leaks for the life of the thread
                leakedThreadLocal.set(new ArrayList<>(List.of(new byte[1024 * 1024])));
            });
            pool[i].start();
            pool[i].join();
        }
        System.out.println("  3 threads leaked 1 MB each → 3 MB unreachable");
        System.out.println("  🔧 Fix: try { tl.set(value); } finally { tl.remove(); }");
        System.out.println();

        System.out.println("--- Mistake 4: String.intern() on user input ---");
        System.out.println("  ❌ String userId = request.getParameter(\"id\").intern();");
        System.out.println("  🔧 Fix: Use equals(), never intern user input");
        System.out.println();

        System.out.println("=== PRODUCTION FLAGS (COPY-PASTE READY) ===");
        System.out.println();
        System.out.println("-Xms2g -Xmx2g");
        System.out.println("-XX:MaxMetaspaceSize=512m");
        System.out.println("-XX:ReservedCodeCacheSize=256m");
        System.out.println("-XX:+UseG1GC -XX:MaxGCPauseMillis=50");
        System.out.println("-XX:+DisableExplicitGC");
        System.out.println("-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/jvm/");
        System.out.println("-Xlog:gc*:file=gc.log:time,uptime,level,tags:filecount=5,filesize=50m");
        System.out.println("-XX:NativeMemoryTracking=summary");
        System.out.println("-XX:StartFlightRecording");  // JFR; the old -XX:+FlightRecorder flag is obsolete on JDK 11+
    }
}
```
- OOM: Java heap space → take a heap dump, analyze with Eclipse MAT
- Latency spikes → enable GC logging, check pause times
- OOMKilled (no Java exception) → non-heap memory; set -Xmx to 75% of limit
- Inconsistent values between threads → missing happens-before; add volatile or lock
| Memory Area | Shared/Per-Thread | Contents | Management | Key Flag |
|---|---|---|---|---|
| Heap | Shared | All objects, arrays, string pool | Garbage Collector | -Xms, -Xmx |
| Eden (Young Gen) | Shared | Newly allocated objects | Minor GC | -XX:NewRatio, -XX:SurvivorRatio |
| Survivor (Young Gen) | Shared | Objects surviving minor GC | Minor GC (copying) | -XX:SurvivorRatio, -XX:MaxTenuringThreshold |
| Old Gen (Tenured) | Shared | Long-lived objects | Major/Full GC | -XX:NewRatio |
| Metaspace | Shared | Class metadata, bytecode | ClassLoader GC | -XX:MaxMetaspaceSize |
| Stack | Per-thread | Local variables, frames | Automatic (pop on return) | -Xss |
| PC Register | Per-thread | Current instruction pointer | JVM internal | N/A |
| Code Cache | Shared | JIT-compiled code | Code cache flushing | -XX:ReservedCodeCacheSize |
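The -Xss row in the table above is easy to verify empirically. This sketch (class and method names are my own) recurses until StackOverflowError and reports the depth reached; rerun it with -Xss256k and then -Xss2m to watch the maximum depth scale with the per-thread stack size:

```java
public class StackDepthSketch {
    private static int depth = 0;

    // Each call pushes one frame (locals, parameters, return address)
    // onto the current thread's stack; recursion ends when it's exhausted.
    private static void recurse() {
        depth++;
        recurse();
    }

    public static void main(String[] args) {
        try {
            recurse();
        } catch (StackOverflowError expected) {
            // Depth varies with -Xss, frame size, and JIT state.
            System.out.println("Stack overflowed after ~" + depth + " frames");
        }
    }
}
```

The exact frame count is not stable across JVMs or runs, but the proportional growth with -Xss is.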
🎯 Key Takeaways
- 1 — Heap as a conveyor belt: Fast turnover matters more than size. Reduce allocation rate, not heap size. 90-98% of objects die young.
- 2 — Old Gen as a warehouse: Expensive to clean. Prevent premature promotion by sizing Survivor spaces correctly. Monitor with -XX:+PrintTenuringDistribution (-Xlog:gc+age=trace on Java 9+).
- 3 — GC selection as tradeoffs: Lower latency = more frequent GC = higher CPU. Higher throughput = less frequent GC = longer pauses. When NOT to use G1: ultra-low latency (use ZGC), high-throughput batch (use Parallel GC), heaps < 2 GB (use Serial GC).
- 4 — Happens-before as a contract: Without it, threads live in parallel universes. volatile = visibility only. synchronized = visibility + atomicity + mutual exclusion. When NOT to use volatile: compound operations, multiple variables, mutual exclusion needed.
- 5 — Metaspace as a safety valve: Always set -XX:MaxMetaspaceSize. Native memory OOM kills the process with no heap dump — silent and deadly.
- 6 — Stack is cheap, but not free: 500 threads × 1 MB = 500 MB before heap allocation. Virtual threads (Java 21+) fix this for I/O-bound workloads.
- 7 — Memory problems follow patterns: Learn the pattern → recognize symptom → apply fix. Quick reference: OOM → heap dump, latency spikes → GC logs, OOMKilled → native memory tracking, inconsistent values → happens-before.
- 🚨 Production Incident Recap: The 4GB container with -Xmx4g kept getting OOMKilled. Root cause: non-heap memory (metaspace, thread stacks, code cache, direct buffers) pushed total process memory over the limit. Fix: -Xmx3g + NativeMemoryTracking. Lesson: Heap ≠ total memory. Always leave 20-25% headroom.
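Takeaway 6 mentions virtual threads. A minimal sketch of the difference, assuming a Java 21+ runtime (names are illustrative): a virtual thread's stack lives on the heap and grows on demand, so it does not reserve the ~1 MB that -Xss implies for each platform thread.

```java
public class VirtualThreadSketch {
    public static void main(String[] args) throws InterruptedException {
        // Platform threads reserve a fixed native stack (-Xss, ~1 MB each).
        // Virtual threads keep small, growable stacks on the heap, so tens
        // of thousands are cheap for I/O-bound workloads.
        Thread vt = Thread.ofVirtual().name("vt-demo").start(() ->
                System.out.println("virtual: " + Thread.currentThread().isVirtual()));
        vt.join(); // prints "virtual: true"
    }
}
```

Virtual threads remove the thread-count ceiling, not the need for happens-before: the JMM visibility rules apply to them unchanged.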
Interview Questions on This Topic
- Q: Explain the difference between heap and stack memory. What lives in each?
- Q: Walk me through what happens when I write new Object() — where does it go, and how does it move through GC generations?
- Q: What is the purpose of Survivor spaces? What happens if they're too small? Too large?
- Q: Compare G1, ZGC, and Parallel GC. When would you choose each? When would you NOT choose G1?
- Q: What is the happens-before relationship? List 4 rules that create happens-before edges.
- Q: Why is double-checked locking broken without volatile? How does volatile fix it?
- Q: You see 'OOMKilled' in Kubernetes but no heap dump. What happened and how do you debug?
- Q: What is a ClassLoader leak? How does it cause Metaspace exhaustion? How do you detect it?
- Q: What is the difference between volatile and synchronized? When can't you use volatile?
- Q: Why does x86 hide concurrency bugs that ARM exposes? How do you protect against this?
Frequently Asked Questions
How do I choose the right garbage collector?
Quick version: Web services with 4-64 GB heap → G1. Ultra-low latency (<1ms) → ZGC. Batch jobs → Parallel GC. Heaps < 2 GB → Serial GC. When NOT to use G1: ultra-low latency systems (use ZGC), high-throughput batch (use Parallel GC), heaps < 2 GB (use Serial GC). The wrong GC choice can make latency 10x worse.
What's the single most common cause of full GCs?
Premature promotion — objects moving to Old Gen before they die. Caused by Survivor spaces that are too small or a tenuring threshold that is too low. Fix: decrease -XX:SurvivorRatio (a lower ratio means larger Survivor spaces) and increase -XX:MaxTenuringThreshold. Monitor with -XX:+PrintTenuringDistribution (-Xlog:gc+age=trace on Java 9+).
How do I detect a memory leak without a heap dump?
Monitor with jstat -gcutil <pid> 1s. If Old Gen grows monotonically without decreasing after GC, you have a leak. If Metaspace grows continuously, you have a ClassLoader leak. If the process's RSS grows but heap is stable, you have a native memory leak (direct buffers, JNI).
What's the difference between minor GC, major GC, and full GC?
Minor GC = Young Gen only (Eden + Survivor). Fast (1-10ms). Major GC = Old Gen only (rare in G1). Full GC = entire heap + metaspace. Slow (100ms to seconds). Your goal: eliminate full GCs entirely. If you see them in logs, something is wrong.
Can I rely on x86's strong memory model?
No. Never. Code that works on x86 may fail on ARM (AWS Graviton, Apple Silicon, Android). The JMM guarantees are the minimum you can rely on across all platforms. If you don't establish a happens-before edge, you have a bug — even if it doesn't crash on your laptop.
Developer and founder of TheCodeForge. I built this site because I was tired of tutorials that explain what to type without explaining why it works. Every article here is written to make concepts actually click.