JVM Memory Model — OOMKilled by Non-Heap Overhead
Setting -Xmx4g in a 4 GB container leaves zero headroom; roughly 490 MB of non-heap overhead then pushes the process over the limit and it is OOMKilled.
- Heap: Shared memory for all objects, managed by the garbage collector. Divided into young generation (eden + survivors) and old generation.
- Stack: Per-thread memory holding local variables and method frames. Freed automatically on method return — not GC-managed.
- Metaspace: Stores class metadata outside the heap. Unbounded by default — always set -XX:MaxMetaspaceSize in production to prevent runaway growth.
- GC pauses: Stop-the-world events where all application threads halt. G1 is the default (50–200ms). Use ZGC for sub-10ms pause requirements.
- Happens-before: The JMM guarantee that memory writes in one thread are visible to another. Established by volatile, synchronized, and Lock — without it, changes may never be seen.
Imagine your Java program is a busy restaurant kitchen. The heap is the giant walk-in fridge where all the ingredients (objects) are stored — anyone on the team can grab from it. Each chef (thread) has their own small personal workbench (stack) for chopping and prep — nobody else touches it. The maitre d' (garbage collector) periodically walks the fridge and tosses anything nobody is using anymore. The JVM Memory Model is simply the blueprint that describes exactly how that kitchen is laid out, who can access what, and the rules for keeping orders from getting mixed up.
Many Java performance crises, many mysterious OutOfMemoryErrors in production at 3 AM, and most subtle data-race bugs trace back to the same root cause: the developer didn't have a clear mental model of how the JVM manages memory. This is not an academic concern: OutOfMemoryErrors, thread-visibility bugs, and stop-the-world GC pauses are day-one realities on any high-traffic service. Yet most Java developers can describe the API of a HashMap far better than they can explain why two threads can see different values for the same variable without any apparent concurrency bug.
This guide addresses two distinct but interrelated problems. The first is the physical layout of memory: where objects live, how long they live, and how the garbage collector reclaims them. The second is the set of visibility and ordering guarantees between threads: the rules that determine whether a write made by Thread A is actually observable by Thread B. Mixing up these two concerns is the source of enormous confusion. The Java Memory Model (JMM) specification (JSR-133, part of the Java Language Specification since Java 5) governs the second problem. It is one of the most carefully engineered pieces of the Java platform, and understanding it separates senior engineers from the rest.
I've debugged JVM memory issues across payment processing systems handling 50,000 TPS, recommendation engines running 60 GB heaps, and microservices dying silently from metaspace exhaustion after hot-deploy cycles. The patterns are always the same: developers who understand the memory layout fix problems in minutes; developers who don't spend days chasing phantom bugs.
By the end of this article you'll be able to walk through a running JVM and name exactly what lives where and why. You'll understand the happens-before relationship well enough to reason about data races without guessing. You'll know how to tune GC regions for low-latency workloads, avoid the common memory-layout mistakes that cause silent correctness bugs, and answer the JMM interview questions that trip up even experienced engineers.
> ⚠️ Terminology note: This guide covers two distinct concepts that share confusingly similar names. JVM Memory (heap, stack, metaspace, GC) is the runtime memory structure — where objects live and how they're reclaimed. Java Memory Model (JMM) (happens-before, volatile, synchronized) is the thread visibility specification — the rules that determine when one thread's writes are observable by another. Both are covered here because they're deeply interrelated in production debugging.
What is JVM Memory Model?
The JVM Memory Model defines two things that engineers constantly conflate:
- The memory layout — how the JVM divides process memory into regions (heap, stack, metaspace, etc.), what lives in each region, and when memory is reclaimed.
- The visibility model — the happens-before rules that determine when a write by one thread is guaranteed to be visible to another thread. This is what volatile, synchronized, java.util.concurrent, and final fields are built on.
Every OutOfMemoryError you've ever seen is a failure of the first part. Every 'works on my machine but not in production' concurrency bug is a failure of the second part. They're different problems requiring different tools, and confusing them is the single most common mistake I see in JMM discussions.
The JVM spec divides runtime memory into five main areas: heap, stack (per-thread), program counter register (per-thread), native method stack (per-thread), and the method area for class metadata, which HotSpot has implemented as metaspace since Java 8. The heap is shared across all threads. The stack, PC register, and native method stack are per-thread, so they need no synchronization. Metaspace is shared but rarely mutated after class loading.
- Heap = the JVM's RAM — all objects live here, shared across threads
- Stack = per-thread workspace — each thread has its own, no sharing needed
- Metaspace = blueprint storage: class definitions, each loaded once when the class is first used
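To see these regions on a live JVM, the standard `java.lang.management` API exposes them as memory pools. The following is a minimal sketch, not a monitoring tool: the class name `MemoryRegionsDemo` is made up for illustration, and the pool names it prints (for example an Eden or Metaspace pool) depend on the JVM vendor and the garbage collector in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

class MemoryRegionsDemo {
    /** Counts the memory pools of the given type (HEAP or NON_HEAP) in this JVM. */
    static long countPools(MemoryType type) {
        return ManagementFactory.getMemoryPoolMXBeans().stream()
                .filter(pool -> pool.getType() == type)
                .count();
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // pool is no longer valid
            }
            // getMax() returns -1 when no upper limit is defined for the pool
            String max = usage.getMax() < 0
                    ? "unbounded"
                    : (usage.getMax() / 1024) + " KB";
            System.out.printf("%-16s %-36s used=%d KB, max=%s%n",
                    pool.getType(), pool.getName(), usage.getUsed() / 1024, max);
        }
    }
}
```

Heap pools correspond to the young and old generation spaces; non-heap pools typically include metaspace and the JIT code cache. Thread stacks and the PC register do not appear here because they are not managed as memory pools.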
PC Register and Native Method Stack: The Overlooked Per‑Thread Memory Regions
While heap and stack get all the attention, two smaller per-thread regions play a critical role in execution: the Program Counter (PC) register and the Native Method Stack.
Program Counter (PC) Register
- Each thread has its own PC register, which holds the address of the JVM instruction currently being executed.
- For Java methods, the PC is the offset of the current instruction in the method's bytecode.
- For native methods (methods marked native, implemented in C/C++), the PC value is undefined: the native code manages its own program counter.
- The PC register is small (a few bytes) and never causes memory errors directly. Understanding it still helps when reading HotSpot crash logs (hs_err_pid files), which report the faulting instruction pointer as pc=0x....
Native Method Stack
- Also per-thread, the Native Method Stack supports calls to native methods via the Java Native Interface (JNI).
- Each native method call pushes a frame holding the native function's local variables and its references to Java objects.
- Unlike the Java stack, it has no dedicated JVM flag and its size is platform-dependent. On HotSpot, native frames share the OS thread stack with Java frames (typically 512 KB to 1 MB by default), so for JVM-created threads -Xss is the only lever.
- If native code recurses deeply or allocates large local arrays, it can overflow the stack. Depending on where the overflow is detected, this surfaces as a StackOverflowError or as a native crash (SIGSEGV), and the message does not always make the cause clear.
Why These Regions Matter in Production
- Thread dumps show each thread's current frames, and HotSpot crash logs show the faulting PC (as pc=0x...). Together they identify where a thread is stuck (e.g., infinite loop, blocking I/O) or where it crashed.
- Native method stack exhaustion is rare but can happen with JNI-intensive libraries. Symptoms: the process freezes or crashes with no heap dump. Diagnose with the hs_err crash log and native memory tracking (-XX:NativeMemoryTracking=summary).
- Virtual threads (Java 21+) share the carrier thread's native stack but have their own PC register state, a subtle detail that matters when debugging virtual thread pinning.
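The per-frame bytecode offset described above can be observed from Java code through the standard `StackWalker` API (Java 9+), whose `getByteCodeIndex()` reports the index into the method's bytecode for each frame. This is a small sketch for illustration; the class name `BytecodeIndexDemo` is an assumption, and the exact index values depend on how the compiler lays out the bytecode.

```java
import java.util.List;
import java.util.stream.Collectors;

class BytecodeIndexDemo {
    /** Describes the top frames of the calling thread, including bytecode offsets. */
    static List<String> currentFrames() {
        return StackWalker.getInstance().walk(frames ->
                frames.limit(5)
                      .map(f -> f.getClassName() + "." + f.getMethodName()
                              // native frames have no meaningful bytecode index
                              + (f.isNativeMethod()
                                      ? " (native)"
                                      : " bci=" + f.getByteCodeIndex()))
                      .collect(Collectors.toList()));
    }

    public static void main(String[] args) {
        currentFrames().forEach(System.out::println);
    }
}
```

The first entry is the method that called `walk`, so it reflects where `currentFrames` itself was executing. Native frames are flagged rather than given an index, which mirrors the "undefined for native methods" rule.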
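Plain jstack output lists stack frames but not the PC. To see PC values, look for the pc=0x... entries in an hs_err crash log. On x86, you can match such an address to the generated assembly (use -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly, which requires the hsdis disassembler plugin). For most developers, the PC value is less useful than the stack frame listing, but it's essential for JVM developers and profiler tooling.
JVM Memory Regions: Visual Overview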
The JVM divides its process memory into five primary regions, each with a distinct role. The overview below groups them into shared (heap, metaspace) and per‑thread (stack, PC register, native method stack).
Heap (Shared)
- All objects, arrays, and the string pool live here.
- Garbage collector reclaims unreachable objects.
- Tuned via -Xms, -Xmx, and GC algorithm flags.
Metaspace (Shared, since Java 8)
- Holds class metadata (bytecode, method tables, field layouts).
- Unbounded by default. Always set -XX:MaxMetaspaceSize to prevent runaway growth from classloader leaks.
- Replaced PermGen; now uses native memory, not heap.
Java Stack (Per‑Thread)
- Stores method call frames (local variables, operand stack, return address).
- Size controlled by -Xss (default ~1 MB on most platforms).
- StackOverflowError occurs when stack depth exceeds the limit (usually a recursion bug).
PC Register (Per‑Thread)
- Holds the address of the JVM instruction currently being executed.
- For native methods, the value is undefined.
- Tiny memory footprint; never a source of OOM.
Native Method Stack (Per‑Thread)
- Supports JNI calls; each native method gets a frame.
- Size is platform-dependent and has no dedicated flag (on HotSpot it shares the thread stack sized by -Xss).
- Exhaustion inside native code usually leads to a SIGSEGV crash rather than a clean Java exception.
This five-region layout is the foundation for all JVM memory management. Every production memory issue maps to one or more of these regions: high heap usage → GC tuning, metaspace growth → classloader leak, stack overflow → recursion, native crash → JNI issue.
Run jcmd <pid> VM.native_memory summary to see a breakdown of all JVM memory regions. Enable it by starting the JVM with -XX:NativeMemoryTracking=summary. This is the closest you can get to a live diagram of your JVM's memory layout.
Stack vs Heap: Side‑by‑Side Comparison
The stack and heap are the two most important memory regions developers interact with daily. Below is a direct comparison of their key characteristics:
| Property | Stack | Heap |
|---|---|---|
| Access speed | Very fast (direct memory access, no GC) | Slower (allocation + GC overhead) |
| Thread safety | Naturally thread‑safe (per‑thread) | Not thread‑safe (shared; needs synchronization) |
| Size | Small, fixed per thread (default 1 MB) | Large, configurable (GBs) |
| Overflow error | StackOverflowError (deep recursion) | OutOfMemoryError: Java heap space |
| Storage type | Local variables, method parameters, return addresses | Objects, arrays, string pool |
| Visibility | Only owning thread can access | All threads can access (with references) |
| Lifetime | Until method returns (freed automatically) | Until unreachable (garbage collected) |
| Memory management | Automatic on method exit (pop frame) | Garbage collector (mark‑sweep or copying) |
Key Takeaways
- Stack: fast, small, private. It holds primitives and object references.
- Heap: slower, large, shared. It holds objects, including those that outlive the method or need to be accessed by multiple threads.
- Common misconception: a large array or collection held in a local variable of a deeply recursive method does not fill the stack by itself. The array object is allocated on the heap and only its reference lives in the stack frame. What overflows the stack is the number of frames, not the size of the objects they point to.
When to Worry About Stack Size
- Recursive algorithms (DFS, tree traversal): increase -Xss or convert to iteration.
- Deep call chains in enterprise frameworks (e.g., Spring AOP, many filters).
- Virtual threads (Java 21+): they don't each reserve an OS stack, but they run on a carrier thread whose stack is fixed, so deep recursion can still overflow.
- Stack: locals, method params, return addresses — gone when method returns
- Heap: objects, arrays — live until GC decides they're unreachable
- Both: an object reference lives on the stack; the object itself lives on the heap
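The frame-count limit is easy to demonstrate. The sketch below (class and method names are illustrative) sums 1..n recursively and iteratively: the recursive version needs one stack frame per step and overflows for large n, while the iterative version runs in a single frame. It assumes a typical default thread stack size; with an unusually large -Xss the recursive call might need a bigger n to overflow.

```java
class StackDepthDemo {
    /** One stack frame per step: depth grows linearly with n. */
    static long sumRecursive(long n) {
        return n == 0 ? 0 : n + sumRecursive(n - 1);
    }

    /** Constant stack usage: a single frame regardless of n. */
    static long sumIterative(long n) {
        long total = 0;
        for (long i = 1; i <= n; i++) {
            total += i;
        }
        return total;
    }

    /** Returns true if the recursive version exhausts the thread stack for this n. */
    static boolean overflows(long n) {
        try {
            sumRecursive(n);
            return false;
        } catch (StackOverflowError e) {
            // The stack unwinds to here, so the thread remains usable.
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("iterative sum: " + sumIterative(10_000_000L));
        System.out.println("recursive overflowed: " + overflows(10_000_000L));
    }
}
```

Note that neither version allocates anything large: the overflow comes purely from the number of frames.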
Heap Memory — Young Generation, Old Generation, and How Objects Age
The heap is where all Java objects live. It's shared across all threads, and it's where garbage collection operates.
📊Heap Flow:
```
┌──────────────┐     ┌──────────────┐     ┌──────────────┐
│     EDEN     │ ──→ │   SURVIVOR   │ ──→ │   OLD GEN    │
│ (new objects)│     │(aged objects)│     │ (long-lived) │
└──────────────┘     └──────────────┘     └──────────────┘
       ↓                    ↓                    ↓
    Minor GC             Minor GC             Full GC
     (fast)             (copying)             (slow)
```
Young Generation (New Space): Where new objects are allocated.
- Eden: All new objects start here. When Eden fills up, a minor GC runs.
- Survivor Space 0 (S0) and Survivor Space 1 (S1): Two equal-sized spaces. Objects that survive minor GCs get copied between them, aging each time.
- Promotion: When an object's age exceeds the tenuring threshold (default: 15), it moves to Old Generation.
Old Generation (Tenured Space): Long-lived objects. When Old Gen fills up, a major GC (or full GC) runs — expensive, often stop-the-world.
The generational hypothesis: 90-98% of objects die young. Minor GCs are fast (1-10ms). Full GCs are slow (100ms to seconds).
- Ultra-low latency systems (<1ms pauses): G1's generational model still causes stop-the-world. Use ZGC instead (-XX:+UseZGC).
- Heaps > 64 GB: G1's region management overhead grows. Consider ZGC or Shenandoah.
- Short-lived batch jobs: GC tuning won't help if the JVM exits in seconds. Focus on allocation rate.
- New objects land on the belt (Eden) — most die here instantly
- Survivors move to a holding area (Survivor spaces), aging each pass
- Long-lived objects graduate to the warehouse (Old Gen)
- Cleaning the belt = fast (minor GC). Cleaning the warehouse = slow (full GC)
-XX:+PrintTenuringDistribution, target 70-80% survival rate
Garbage Collection — How the JVM Reclaims Memory
The garbage collector automatically reclaims memory occupied by objects that are no longer reachable from any GC root.
GC root types: local variables on thread stacks, static fields, active threads, JNI references, and monitors.
Major GC algorithms:
G1 (Garbage First): Default since Java 9. Divides the heap into equal-sized regions (1-32 MB each, chosen from the heap size). Collects the regions with the most garbage first. Target pause time: -XX:MaxGCPauseMillis (default 200ms). Best for: heaps 4-64 GB, moderate latency.
ZGC: Ultra-low latency. Sub-millisecond pauses regardless of heap size (supports heaps up to 16 TB). Uses colored pointers + load barriers. Production-ready since Java 15, generational since Java 21. Best for: heaps > 16 GB, sub-ms latency requirements.
Parallel GC: Throughput-optimized. Multiple GC threads, fully stop-the-world. Maximizes the share of CPU time spent in application code rather than in GC. Best for: batch jobs, ETL, analytics.
- Ultra-low latency systems (<1ms pauses): G1 still has stop-the-world phases. Use ZGC.
- High-throughput batch processing: G1's concurrent overhead reduces throughput. Use Parallel GC.
- Heaps < 2 GB: G1's region management overhead isn't worth it. Use Serial GC (-XX:+UseSerialGC).
- G1 — cleans the messiest aisles first (Garbage First). Best for 4–64 GB heaps
- ZGC — hires a night crew that cleans while you work. Sub-ms pauses, any size heap
- Parallel GC — brings the whole team in. Max throughput, stop-the-world pauses
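To check which collector a JVM is actually running and how much time it has spent collecting, the standard `GarbageCollectorMXBean` interface is usually enough for a first look. This is a sketch with an illustrative class name; the bean names (for example a young-generation and an old-generation collector under G1) vary by JVM and by the selected GC, so the code makes no assumption about them.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.stream.Collectors;

class GcReport {
    /** Names of the collectors active in this JVM (depends on the chosen GC). */
    static List<String> collectorNames() {
        return ManagementFactory.getGarbageCollectorMXBeans().stream()
                .map(GarbageCollectorMXBean::getName)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both values are cumulative since JVM start; -1 means "not available".
            System.out.printf("%-28s collections=%d, total time=%d ms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}
```

Sampling these counters periodically gives a rough collection rate and average pause cost per collector without enabling GC logging.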
JVM Flags Reference: Setting Heap, Stack, Metaspace, and Code Cache
Configuration flags are the first line of defense against memory-related production incidents. Below is a reference table of the five essential JVM memory flags, with their purpose, typical values, and critical notes.
| Flag | Sets | Typical Value | Notes |
|---|---|---|---|
| -Xms | Initial heap size | -Xms2g | JVM reserves this at startup. Set equal to -Xmx to avoid resizing overhead. |
| -Xmx | Maximum heap size | -Xmx2g (75% of container limit) | Never use 100% of container memory; leave headroom for non-heap. |
| -Xss | Thread stack size | -Xss1m (default) | Common mistake: 1000 threads × 1 MB = 1 GB stack overhead. Consider 256–512 KB when running many platform threads; -Xss does not size virtual threads. |
| -XX:MaxMetaspaceSize | Maximum metaspace size | -XX:MaxMetaspaceSize=512m | Always set this. Unbounded metaspace can silently consume all native memory. |
| -XX:ReservedCodeCacheSize | Maximum JIT code cache size | -XX:ReservedCodeCacheSize=256m | Code cache fills up with a large codebase or many JIT compilations. Flushes cause performance drops. |
Interaction Between Flags
- -Xms and -Xmx control only the heap. Non-heap regions are additive.
- Metaspace (-XX:MaxMetaspaceSize) is separate from heap: an application can run out of native memory even if heap is 50% free.
- Code cache (-XX:ReservedCodeCacheSize) is also native memory and competes with metaspace for the non-heap budget.
- Thread stacks (-Xss) multiply by thread count: 500 threads × 1 MB = 500 MB of native memory.
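When auditing a running service, it helps to confirm which of these flags the JVM actually received and what maximum heap it settled on. The sketch below uses only standard APIs (`Runtime.maxMemory()` and `RuntimeMXBean.getInputArguments()`); the class name and the list of flag prefixes it filters on are illustrative choices, not an exhaustive set.

```java
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.stream.Collectors;

class EffectiveMemorySettings {
    /** Maximum heap the JVM will attempt to use, in MB. */
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    /** Memory-related flags passed on the command line (empty if defaults are used). */
    static List<String> memoryFlags() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments().stream()
                .filter(arg -> arg.startsWith("-Xms")
                        || arg.startsWith("-Xmx")
                        || arg.startsWith("-Xss")
                        || arg.startsWith("-XX:MaxMetaspaceSize")
                        || arg.startsWith("-XX:ReservedCodeCacheSize"))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println("Max heap (MB): " + maxHeapMb());
        System.out.println("Explicit memory flags: " + memoryFlags());
    }
}
```

An empty flag list means every region is running on defaults, which in a container usually deserves a second look.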
Container + JVM Memory Budget Calculation
Total process memory ≈ Heap + Metaspace + (Threads × StackSize) + CodeCache + DirectBuffers + GC overhead
Example for a 4 GB container with 200 threads, default 1 MB stacks, 512 MB metaspace, 256 MB code cache:
- Heap: 3 GB (75%)
- Stacks: 200 MB
- Metaspace: 512 MB
- Code cache: 256 MB
- GC overhead: ~10% of heap = 300 MB
- Total: ~4.3 GB → OOM risk.
Solution: reduce -Xmx to 2.5 GB, which brings the total to roughly 3.7 GB. Lowering the stack size to 512 KB saves only about 100 MB, so it helps but is not enough on its own.
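The same budget can be checked mechanically. The sketch below encodes the formula with the simplifying assumptions used in the example: GC overhead is taken as a flat 10% of heap, direct buffers are ignored, and metaspace and code cache are counted at their configured maximums. The class and method names are made up for illustration, and all sizes are in binary MB, so the totals differ slightly from the rounded figures in the text.

```java
class MemoryBudget {
    /**
     * Estimated worst-case process footprint in MB.
     * Assumes GC overhead of 10% of heap and no direct buffers.
     */
    static long estimateMb(long heapMb, int threads, long stackKb,
                           long metaspaceMb, long codeCacheMb) {
        long stacksMb = threads * stackKb / 1024;
        long gcOverheadMb = heapMb / 10;
        return heapMb + stacksMb + metaspaceMb + codeCacheMb + gcOverheadMb;
    }

    public static void main(String[] args) {
        long containerMb = 4096;

        // 3 GB heap, 200 threads with 1 MB stacks: over the 4 GB limit
        long original = estimateMb(3072, 200, 1024, 512, 256);   // 4347
        // Same heap with 512 KB stacks: still over the limit
        long smallStacks = estimateMb(3072, 200, 512, 512, 256); // 4247
        // 2.5 GB heap with 1 MB stacks: fits
        long smallerHeap = estimateMb(2560, 200, 1024, 512, 256); // 3784

        System.out.printf("3 GB heap:        %d MB (limit %d)%n", original, containerMb);
        System.out.printf("512 KB stacks:    %d MB (limit %d)%n", smallStacks, containerMb);
        System.out.printf("2.5 GB heap:      %d MB (limit %d)%n", smallerHeap, containerMb);
    }
}
```

Treat the result as an upper-bound estimate for capacity planning; actual usage should be confirmed with native memory tracking.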
Without -XX:MaxMetaspaceSize, the JVM will let metaspace grow until it consumes all available native memory. This is especially dangerous in containers because the Linux OOM killer will terminate the process without a heap dump. Always set a limit based on your application's class metadata footprint (typically 128–512 MB).
Happens-Before — Thread Visibility and the Rules That Prevent Data Races
This is the second half of the JMM — and the half that causes the most subtle bugs. The memory layout (heap, stack, GC) determines where objects live. The happens-before rules determine when one thread's writes are visible to another thread.
The core problem: Modern CPUs have multiple cores, each with its own L1/L2 cache. Without synchronization, there is NO guarantee that Thread B sees Thread A's write.
The JMM solution — happens-before: A partial ordering of operations. If A happens-before B, then A's writes are visible to B.
Key rules:
1. Program order: Within one thread, every action happens-before later actions.
2. Monitor lock: Unlock happens-before subsequent lock on same monitor (synchronized).
3. Volatile variable: Write to volatile happens-before subsequent read of that volatile.
4. Thread start: Thread.start() happens-before actions in started thread.
5. Thread join: Thread's actions happen-before Thread.join() returns.
6. Transitivity: If A happens-before B and B happens-before C, then A happens-before C.
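The rules above compose. In the sketch below (class and field names are illustrative), a plain field is written before a volatile flag. Program order plus the volatile rule plus transitivity guarantee that a reader who sees the flag as true also sees the payload. The thread-join rule then makes the reader's result visible to the caller.

```java
class VolatilePublish {
    private int payload;            // plain field, no synchronization of its own
    private volatile boolean ready; // volatile flag that publishes the payload

    void publish(int value) {
        payload = value; // (1) ordinary write
        ready = true;    // (2) volatile write: (1) happens-before (2) by program order
    }

    int awaitPayload() {
        while (!ready) {          // (3) volatile read: (2) happens-before (3) once it sees true
            Thread.onSpinWait();
        }
        return payload;           // (4) guaranteed to see the value from (1) by transitivity
    }

    /** Runs a reader thread against a publisher and returns what the reader saw. */
    static int demo() {
        VolatilePublish box = new VolatilePublish();
        int[] seen = new int[1];
        Thread reader = new Thread(() -> seen[0] = box.awaitPayload());
        reader.start();
        box.publish(42);
        try {
            reader.join(); // join rule: the reader's write to seen[0] is visible after this
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 42
    }
}
```

If `ready` were not volatile, the reader would have no happens-before edge to the writer: it could spin forever or, in principle, observe the flag without the payload.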
- Compound operations (count++, x = y): Volatile only provides visibility, not atomicity. Use AtomicInteger or synchronized.
- Multiple variables needing consistent state: Volatile on one variable doesn't create happens-before for others. Use synchronized or Lock.
- When you need mutual exclusion: Volatile doesn't block threads. Use synchronized or ReentrantLock.
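For the compound-operation case, `AtomicInteger` from `java.util.concurrent.atomic` provides an atomic read-modify-write. The sketch below (illustrative class name and helper) increments one counter from several threads and always arrives at the exact total. A `volatile int` incremented with `count++` in the same loop could lose updates, because the read, add, and write are three separate steps.

```java
import java.util.concurrent.atomic.AtomicInteger;

class AtomicCounterDemo {
    /** Increments a shared counter from several threads and returns the final value. */
    static int countWithAtomic(int threads, int incrementsPerThread) {
        AtomicInteger counter = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    counter.incrementAndGet(); // atomic read-modify-write
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            join(worker);
        }
        return counter.get();
    }

    private static void join(Thread thread) {
        try {
            thread.join(); // also makes the worker's updates visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countWithAtomic(4, 100_000)); // prints 400000
    }
}
```

When several fields must change together, an atomic counter is no longer enough and a lock (synchronized or ReentrantLock) is the appropriate tool.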
⚠️ x86 Hides Concurrency Bugs, ARM Exposes Them: x86 has strong memory ordering (TSO). Many data races appear to work on x86 but fail on ARM (Graviton, Apple Silicon), whose weaker ordering allows more reorderings to become visible. If you deploy to ARM, test there. Always establish happens-before edges and never rely on architecture-specific behavior.
Common Production Mistakes and Debugging Patterns
These are the mistakes I've seen in production systems and the debugging patterns that caught them. Every one of these has caused a real incident.
The 4GB Container That Kept Dying
- Heap ≠ total memory
- Non-heap consumes 20–30% of your container budget
- Always leave headroom