JVM Memory Model — OOMKilled by Non-Heap Overhead
Setting -Xmx4g in a 4 GB container leaves zero headroom; roughly 490 MB of non-heap overhead then gets the pod OOMKilled.
20+ years shipping production Java in banking & fintech. Everything here is grounded in real deployments.
- Heap: Shared memory for all objects, managed by the garbage collector. Divided into young generation (eden + survivors) and old generation.
- Stack: Per-thread memory holding local variables and method frames. Freed automatically on method return — not GC-managed.
- Metaspace: Stores class metadata outside the heap. Unbounded by default — always set -XX:MaxMetaspaceSize in production to prevent runaway growth.
- GC pauses: Stop-the-world events where all application threads halt. G1 is the default (50–200ms). Use ZGC for sub-10ms pause requirements.
- Happens-before: The JMM guarantee that memory writes in one thread are visible to another. Established by volatile, synchronized, and Lock — without it, changes may never be seen.
Imagine your Java program is a busy restaurant kitchen. The heap is the giant walk-in fridge where all the ingredients (objects) are stored — anyone on the team can grab from it. Each chef (thread) has their own small personal workbench (stack) for chopping and prep — nobody else touches it. The maitre d' (garbage collector) periodically walks the fridge and tosses anything nobody is using anymore. The JVM Memory Model is simply the blueprint that describes exactly how that kitchen is laid out, who can access what, and the rules for keeping orders from getting mixed up.
Every Java performance crisis, every mysterious NullPointerException in production at 3 AM, and every subtle data-race bug ultimately traces back to the same root cause: the developer didn't have a clear mental model of how the JVM manages memory. It's not an academic concern — OutOfMemoryErrors, thread-visibility bugs, and stop-the-world GC pauses are day-one realities on any high-traffic service. Yet most Java developers can describe the syntax of a HashMap far better than they can explain why two threads can see different values for the same variable without any apparent concurrency bug.
Two distinct but interrelated topics hide behind the phrase "JVM memory model". The first is the runtime memory layout: where objects live, how long they live, and how the garbage collector reclaims them. The second is the Java Memory Model proper (JMM): the visibility and ordering guarantees between threads, the rules that determine whether a write made by Thread A is actually observable by Thread B. Mixing up these two concerns is the source of enormous confusion. The JMM specification (JSR-133, part of the Java Language Specification since Java 5) is one of the most carefully engineered pieces of the Java platform, and understanding it separates senior engineers from the rest.
I've debugged JVM memory issues across payment processing systems handling 50,000 TPS, recommendation engines running 60 GB heaps, and microservices dying silently from metaspace exhaustion after hot-deploy cycles. The patterns are always the same: developers who understand the memory layout fix problems in minutes; developers who don't spend days chasing phantom bugs.
By the end of this article you'll be able to walk through a running JVM and name exactly what lives where and why. You'll understand the happens-before relationship well enough to reason about data races without guessing. You'll know how to tune GC regions for low-latency workloads, avoid the common memory-layout mistakes that cause silent correctness bugs, and answer the JMM interview questions that trip up even experienced engineers.
> ⚠️ Terminology note: This guide covers two distinct concepts that share confusingly similar names. JVM Memory (heap, stack, metaspace, GC) is the runtime memory structure — where objects live and how they're reclaimed. Java Memory Model (JMM) (happens-before, volatile, synchronized) is the thread visibility specification — the rules that determine when one thread's writes are observable by another. Both are covered here because they're deeply interrelated in production debugging.
JVM Memory Model — The Two-Heap Trap
The JVM memory model defines how Java applications allocate and manage memory at runtime, split into two primary regions: Heap and Non-Heap (Metaspace, Code Cache, thread stacks, direct buffers). Heap stores object instances and arrays; Non-Heap holds class metadata, JIT-compiled code, and native allocations. The JVM garbage collector manages Heap automatically, but Non-Heap memory is largely outside GC control — it grows with class loading, JIT compilation, and direct buffer usage.
In practice, the JVM starts with a fixed maximum Heap (-Xmx) but imposes no hard cap on Non-Heap by default. Metaspace expands as classes are loaded; the Code Cache fills with compiled methods; each thread consumes a native stack (~1 MB default). A typical microservice with 2 GB Heap (-Xmx2g) can silently accumulate 500 MB–1 GB of Non-Heap overhead under heavy load or dynamic class generation, leading to unexpected container OOM kills.
Understanding this split is critical when running JVMs in memory-constrained environments like Kubernetes. You must budget for both Heap and Non-Heap in your container memory request. Ignoring Non-Heap overhead is the #1 cause of unexplained OOMKilled pods in production — the JVM respects -Xmx but the OS sees total RSS exceeding the limit.
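To make the heap versus non-heap split concrete, here is a minimal sketch that reads both figures from the standard `java.lang.management` API. The class name `HeapVsNonHeap` and its helper methods are illustrative, not from the original text. Note the assumption stated in the comments: the non-heap figure reported by the MXBean covers pools such as Metaspace and the code cache, but not thread stacks or direct buffers, so real RSS will be higher.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper class for illustration only.
final class HeapVsNonHeap {

    static long toMiB(long bytes) {
        return bytes / (1024 * 1024);
    }

    // Formats one MemoryUsage snapshot. getMax() returns -1 when no limit is defined,
    // which is common for non-heap memory.
    static String describe(String label, MemoryUsage usage) {
        String max = usage.getMax() < 0 ? "unbounded" : toMiB(usage.getMax()) + " MiB";
        return label + ": used=" + toMiB(usage.getUsed()) + " MiB, committed="
                + toMiB(usage.getCommitted()) + " MiB, max=" + max;
    }

    public static void main(String[] args) {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        // Heap is bounded by -Xmx.
        System.out.println(describe("heap", memory.getHeapMemoryUsage()));
        // Non-heap here means JVM-managed pools (Metaspace, code cache, ...).
        // Thread stacks and direct buffers are NOT included in this number.
        System.out.println(describe("non-heap", memory.getNonHeapMemoryUsage()));
    }
}
```

Running this inside the container next to the pod's reported memory usage gives a rough sense of how much of the footprint sits outside `-Xmx`.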
PC Register and Native Method Stack: The Overlooked Per‑Thread Memory Regions
While heap and stack get all the attention, two smaller per-thread regions play a critical role in execution: the Program Counter (PC) register and the Native Method Stack.
Program Counter (PC) Register
- Each thread has its own PC register, which holds the address of the JVM instruction currently being executed.
- For Java methods, the PC holds the offset of the current instruction in the method's bytecode.
- For native methods (methods marked native, implemented in C/C++), the PC value is undefined: the native code manages its own program counter.
- The PC register is small (a few bytes) and never causes memory errors directly. Understanding it still helps when reading crash logs and profiler output, where the top frame's instruction pointer is reported.

Native Method Stack
- Also per-thread, the Native Method Stack supports calls to native methods via the Java Native Interface (JNI).
- Each native method call pushes a native frame holding that function's local variables.
- In HotSpot, native frames share the same OS thread stack as Java frames, so the effective limit is the thread's stack size (typically 512 KB to 1 MB by default, depending on platform).
- If native code recurses deeply or allocates large local arrays, it can exhaust that stack. The result may be a StackOverflowError or a native crash, and the error report is often unclear.

Why These Regions Matter in Production
- HotSpot crash logs (hs_err_pid<pid>.log) record the PC value (as pc=0x...) of the crashing thread, and thread dumps show each thread's top frame. Both are useful for identifying where a thread is stuck (e.g., infinite loop, blocking I/O).
- Native method stack exhaustion is rare but can happen with JNI-intensive libraries. Symptoms: the process freezes or crashes with no heap dump. Diagnose with the crash log and native memory tracking (-XX:NativeMemoryTracking=summary).
- Virtual threads (Java 21+) run on the carrier thread's native stack while mounted but keep their own PC state, a subtle detail that matters when debugging virtual thread pinning.
When a thread dump or crash log reports a PC value, you can match that address to the JIT-generated assembly on x86 (use -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly, which requires the hsdis disassembler plugin). For most developers, the PC value is less useful than the stack frame listing, but it's essential for JVM developers and profiler tooling.

JVM Memory Regions: Visual Overview
The JVM divides its process memory into five primary regions, each with a distinct role. They fall into two groups: shared (heap, metaspace) and per‑thread (stack, PC register, native method stack).
Heap (Shared)
- All objects, arrays, and the string pool live here.
- The garbage collector reclaims unreachable objects.
- Tuned via -Xms, -Xmx, and GC algorithm flags.

Metaspace (Shared, since Java 8)
- Holds class metadata (bytecode, method tables, field layouts).
- Unbounded by default: always set -XX:MaxMetaspaceSize to prevent runaway growth from classloader leaks.
- Replaced PermGen; now uses native memory, not heap.

Java Stack (Per‑Thread)
- Stores method call frames (local variables, operand stack, return address).
- Size controlled by -Xss (default ~1 MB on most platforms).
- StackOverflowError occurs when stack depth exceeds the limit (usually a recursion bug).

PC Register (Per‑Thread)
- Holds the address of the JVM instruction currently being executed.
- For native methods, the value is undefined.
- Tiny memory footprint; never a source of OOM.

Native Method Stack (Per‑Thread)
- Supports JNI calls; each native method gets a frame.
- In HotSpot it shares the OS thread stack with Java frames, so its size follows the thread's stack size.
- Exhaustion inside native code usually ends in a SIGSEGV crash rather than a clean Java exception.
This five-region layout is the foundation for all JVM memory management. Every production memory issue maps to one or more of these regions: high heap usage → GC tuning, metaspace growth → classloader leak, stack overflow → recursion, native crash → JNI issue.
Run jcmd <pid> VM.native_memory summary to see a breakdown of all JVM memory regions. Enable it with -XX:NativeMemoryTracking=summary. This is the closest you can get to a live diagram of your JVM's memory layout.

Stack vs Heap: Side‑by‑Side Comparison
The stack and heap are the two most important memory regions developers interact with daily. Below is a direct comparison of their key characteristics:
| Property | Stack | Heap |
|---|---|---|
| Access speed | Very fast (direct memory access, no GC) | Slower (allocation + GC overhead) |
| Thread safety | Naturally thread‑safe (per‑thread) | Not thread‑safe (shared; needs synchronization) |
| Size | Small, fixed per thread (default 1 MB) | Large, configurable (GBs) |
| Overflow error | StackOverflowError (deep recursion) | OutOfMemoryError: Java heap space |
| Storage type | Local variables, method parameters, return addresses | Objects, arrays, string pool |
| Visibility | Only owning thread can access | All threads can access (with references) |
| Lifetime | Until method returns (freed automatically) | Until unreachable (garbage collected) |
| Memory management | Automatic on method exit (pop frame) | Garbage collector (mark‑sweep or copying) |
Key Takeaways
- Stack: fast, small, private. It holds primitives and object references that are local to a method.
- Heap: slower, large, shared. It holds every object, including those that outlive the method or are accessed by multiple threads.
- Common misconception: a large array or collection held in a local variable does not bloat the stack. The array itself is allocated on the heap; only its reference occupies a slot in the stack frame. What overflows the stack in a deep recursive method is the number of frames, not the size of the objects they reference.

When to Worry About Stack Size
- Recursive algorithms (DFS, tree traversal): increase -Xss or convert to iteration.
- Deep call chains in enterprise frameworks (e.g., Spring AOP, many filters).
- Virtual threads (Java 21+): they don't reserve an OS thread stack of their own, but the carrier thread they run on still has a fixed stack, so deep call chains and pinning remain a risk.
- Stack: locals, method params, return addresses — gone when method returns
- Heap: objects, arrays — live until GC decides they're unreachable
- Both: an object reference lives on the stack; the object itself lives on the heap
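The distinction between frame count and object size can be sketched in a few lines. This is an illustrative example, not code from the original article: the class `StackDepthDemo` and its methods are hypothetical names. It recurses until the thread stack is exhausted and reports how many frames fit, then shows the same kind of work done iteratively in a single frame. The depth reached depends on `-Xss`, the JIT, and the platform, so no specific number is claimed.

```java
// Hypothetical demo class for illustration only.
final class StackDepthDemo {
    private static int depth;

    // Unbounded recursion: every call pushes one more frame onto this thread's stack.
    private static void descend() {
        depth++;
        descend();
    }

    // Recurses until StackOverflowError and returns how many frames were pushed.
    static int maxDepth() {
        depth = 0;
        try {
            descend();
        } catch (StackOverflowError expected) {
            // By the time we get here the frames have been popped again.
        }
        return depth;
    }

    // Iterative version: one frame, regardless of n.
    static long sumIterative(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += i;
        }
        return total;
    }

    public static void main(String[] args) {
        // About 40 MB of ints live on the heap; only the reference 'big' sits in this frame.
        int[] big = new int[10_000_000];
        System.out.println("array length (heap object): " + big.length);
        System.out.println("frames before StackOverflowError: " + maxDepth());
        System.out.println("iterative sum to 1,000,000: " + sumIterative(1_000_000));
    }
}
```

Re-running with a smaller `-Xss` (for example `-Xss256k`) should lower the frame count while leaving the array allocation unaffected, which is the point of the takeaways above.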
Heap Memory — Young Generation, Old Generation, and How Objects Age
The heap is where all Java objects live. It's shared across all threads, and it's where garbage collection operates.
📊 Heap Flow: objects move Eden → Survivor → Old Gen as they survive collections.

| Region | Holds | Collected by |
|---|---|---|
| Eden | New objects | Minor GC (fast) |
| Survivor | Aged objects | Minor GC (copying) |
| Old Gen | Long-lived objects | Full GC (slow) |

Young Generation (New Space): Where new objects are allocated.
- Eden: All new objects start here. When Eden fills up, a minor GC runs.
- Survivor Space 0 (S0) and Survivor Space 1 (S1): Two equal-sized spaces. Objects that survive minor GCs get copied between them, aging each time.
- Promotion: When an object's age exceeds the tenuring threshold (default: 15), it moves to the Old Generation.
Old Generation (Tenured Space): Long-lived objects. When Old Gen fills up, a major GC (or full GC) runs — expensive, often stop-the-world.
The generational hypothesis: 90-98% of objects die young. Minor GCs are fast (1-10ms). Full GCs are slow (100ms to seconds).
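The generations described above are visible at runtime as memory pools. The following is a small sketch using the standard `MemoryPoolMXBean` API; the class name `MemoryPools` is illustrative. Pool names depend on the collector in use (with G1 they typically look like "G1 Eden Space", "G1 Survivor Space", and "G1 Old Gen"), so treat the names as something to observe rather than something to hard-code.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class for illustration only.
final class MemoryPools {

    // One line per memory pool: type (HEAP or NON_HEAP), name, and current usage.
    static List<String> poolSummaries() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            lines.add(pool.getType() + "  " + pool.getName()
                    + "  used=" + usage.getUsed() / 1024 + " KiB");
        }
        return lines;
    }

    public static void main(String[] args) {
        // Eden, Survivor, and Old Gen show up as HEAP pools; Metaspace and the
        // code cache show up as NON_HEAP pools.
        poolSummaries().forEach(System.out::println);
    }
}
```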
Where generational tuning with G1 falls short:
- Ultra-low latency systems (<1ms pauses): G1's generational model still causes stop-the-world pauses. Use ZGC instead (-XX:+UseZGC).
- Heaps > 64 GB: G1's region management overhead grows. Consider ZGC or Shenandoah.
- Short-lived batch jobs: GC tuning won't help if the JVM exits in seconds. Focus on allocation rate.
- New objects land on the belt (Eden) — most die here instantly
- Survivors move to a holding area (Survivor spaces), aging each pass
- Long-lived objects graduate to the warehouse (Old Gen)
- Cleaning the belt = fast (minor GC). Cleaning the warehouse = slow (full GC)
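Check object aging with -XX:+PrintTenuringDistribution (Java 8; on Java 9+ the unified-logging equivalent is -Xlog:gc+age=trace) and target a 70-80% survival rate.

Garbage Collection: How the JVM Reclaims Memory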
The garbage collector automatically reclaims memory occupied by objects that are no longer reachable from any GC root.
GC root types: local variables, static fields, active threads, JNI references, monitors.
Major GC algorithms:
G1 (Garbage First): Default since Java 9. Divides the heap into equal-sized regions (1–32 MB each, chosen from the heap size). Collects the regions with the most garbage first. Pause-time target: -XX:MaxGCPauseMillis (default 200 ms). Best for: heaps of 4–64 GB with moderate latency requirements.
ZGC — Ultra-low latency. Sub-millisecond pauses regardless of heap size (tested to 16 TB). Uses colored pointers + load barriers. Available since Java 15, generational since Java 21. Best for: heaps > 16 GB, sub-ms latency requirements.
Parallel GC: Throughput-optimized. Uses multiple GC threads, and every collection is stop-the-world. Maximizes the share of time spent in application code rather than in GC. Best for: batch jobs, ETL, analytics.
When G1 is the wrong choice:
- Ultra-low latency systems (<1ms pauses): G1 still has stop-the-world phases. Use ZGC.
- High-throughput batch processing: G1's concurrent overhead reduces throughput. Use Parallel GC.
- Heaps < 2 GB: G1's region management overhead isn't worth it. Use Serial GC (-XX:+UseSerialGC).
- G1 — cleans the messiest aisles first (Garbage First). Best for 4–64 GB heaps
- ZGC — hires a night crew that cleans while you work. Sub-ms pauses, any size heap
- Parallel GC — brings the whole team in. Max throughput, stop-the-world pauses
JVM Flags Reference: Setting Heap, Stack, Metaspace, and Code Cache
Configuration flags are the first line of defense against memory-related production incidents. Below is a reference table of the five essential JVM memory flags, with their purpose, typical values, and critical notes.
| Flag | Sets | Typical Value | Notes |
|---|---|---|---|
| -Xms | Initial heap size | -Xms2g | JVM pre-allocates this at startup. Set equal to -Xmx to avoid resizing overhead. |
| -Xmx | Maximum heap size | -Xmx2g (75% of container limit) | Never use 100% of container memory; leave headroom for non-heap. |
| -Xss | Thread stack size | -Xss1m (default) | Common mistake: 1000 threads × 1 MB = 1 GB stack overhead. Consider 256–512 KB for thread-heavy services. Virtual threads are not sized by -Xss. |
| -XX:MaxMetaspaceSize | Maximum metaspace size | -XX:MaxMetaspaceSize=512m | Always set this. Unbounded metaspace can silently consume all native memory. |
| -XX:ReservedCodeCacheSize | Maximum JIT code cache size | -XX:ReservedCodeCacheSize=256m | Code cache fills up if you have a large codebase or many JIT compilations. Flushes cause performance drops. |
Interaction Between Flags
- -Xms and -Xmx control only the heap. Non-heap regions are additive.
- Metaspace (-XX:MaxMetaspaceSize) is separate from heap: an application can run out of native memory even if heap is 50% free.
- Code cache (-XX:ReservedCodeCacheSize) is also native memory and competes with metaspace for the non-heap budget.
- Thread stacks (-Xss) multiply by thread count: 500 threads × 1 MB = 500 MB of native memory.

Container + JVM Memory Budget Calculation
Total process memory ≈ Heap + Metaspace + (Threads × StackSize) + CodeCache + DirectBuffers + GC overhead
Example for a 4 GB container with 200 threads, default 1 MB stacks, 512 MB metaspace, 256 MB code cache:
- Heap: 3 GB (75%)
- Stacks: 200 MB
- Metaspace: 512 MB
- Code cache: 256 MB
- GC overhead: ~10% of heap = 300 MB
- Total: ~4.3 GB, which exceeds the 4 GB limit → OOM risk.

Solution: reduce -Xmx to 2.5 GB, which brings the total to about 3.7 GB. Lowering the stack size to 512 KB saves only 100 MB, so it helps but is not enough on its own.
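The budget arithmetic above is easy to script so it can be re-run whenever a flag changes. The sketch below is an assumption-laden illustration: the class `ContainerBudget` and method `estimateMiB` are hypothetical names, direct buffers are left out (as in the worked example), and the 10% GC overhead is the article's rule of thumb, not a measured value. It works in MiB, so the totals differ slightly from the rounded GB figures in the text.

```java
// Hypothetical calculator for illustration only.
final class ContainerBudget {

    // Estimated process footprint in MiB:
    // heap + thread stacks + metaspace + code cache + GC overhead.
    // Direct buffers are omitted; add them separately if the service uses NIO heavily.
    static long estimateMiB(long heapMiB, int threads, long stackKiB,
                            long metaspaceMiB, long codeCacheMiB,
                            double gcOverheadFraction) {
        long stacksMiB = threads * stackKiB / 1024;
        long gcMiB = Math.round(heapMiB * gcOverheadFraction);
        return heapMiB + stacksMiB + metaspaceMiB + codeCacheMiB + gcMiB;
    }

    public static void main(String[] args) {
        long limitMiB = 4096; // 4 GB container

        long original = estimateMiB(3072, 200, 1024, 512, 256, 0.10);
        long smallerStacks = estimateMiB(3072, 200, 512, 512, 256, 0.10);
        long smallerHeap = estimateMiB(2560, 200, 1024, 512, 256, 0.10);

        System.out.println("-Xmx3g, 1 MB stacks:     " + original + " MiB");      // 4347
        System.out.println("-Xmx3g, 512 KB stacks:   " + smallerStacks + " MiB"); // 4247
        System.out.println("-Xmx2560m, 1 MB stacks:  " + smallerHeap + " MiB");   // 3784
        System.out.println("fits in limit? " + (smallerHeap <= limitMiB));        // true
    }
}
```

Only the reduced heap brings the estimate under the 4096 MiB limit; shrinking stacks alone still overshoots.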
Without -XX:MaxMetaspaceSize, the JVM will let metaspace grow until it consumes all available native memory. This is especially dangerous in containers because the Linux OOM killer will terminate the process without a heap dump. Always set a limit based on your application's class metadata footprint (typically 128–512 MB).

Happens-Before: Thread Visibility and the Rules That Prevent Data Races
This is the second half of the JMM — and the half that causes the most subtle bugs. The memory layout (heap, stack, GC) determines where objects live. The happens-before rules determine when one thread's writes are visible to another thread.
The core problem: Modern CPUs have multiple cores, each with its own L1/L2 cache. Without synchronization, there is NO guarantee that Thread B sees Thread A's write.
The JMM solution — happens-before: A partial ordering of operations. If A happens-before B, then A's writes are visible to B.
Key rules:
1. Program order: Within one thread, every action happens-before later actions.
2. Monitor lock: Unlock happens-before subsequent lock on the same monitor (synchronized).
3. Volatile variable: Write to a volatile happens-before subsequent read of that volatile.
4. Thread start: Thread.start() happens-before actions in the started thread.
5. Thread join: A thread's actions happen-before Thread.join() on it returns.
6. Transitivity: If A happens-before B and B happens-before C, then A happens-before C.
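The rules above combine in practice. As a minimal sketch (the class `VolatilePublish` and its members are illustrative names, and `Thread.onSpinWait()` assumes Java 9 or later), a plain field can be safely published through a volatile flag: program order links the plain write to the volatile write, the volatile rule links the write to the read, and transitivity carries the plain value across.

```java
// Hypothetical demo class for illustration only.
final class VolatilePublish {
    private int payload;             // plain field, no synchronization of its own
    private volatile boolean ready;  // the volatile flag creates the happens-before edge

    void writer() {
        payload = 42;   // (1) ordinary write
        ready = true;   // (2) volatile write, ordered after (1) by program order
    }

    int reader() {
        while (!ready) {          // (3) volatile read, spins until it observes (2)
            Thread.onSpinWait();
        }
        return payload;           // (4) sees 42: (1) hb (2) hb (3) hb (4) by transitivity
    }

    // Starts a reader thread, publishes from the calling thread, returns what the reader saw.
    static int demo() {
        VolatilePublish shared = new VolatilePublish();
        int[] seen = new int[1];
        Thread reader = new Thread(() -> seen[0] = shared.reader());
        reader.start();
        shared.writer();
        try {
            reader.join();  // join rule: the reader's write to seen[0] is visible after this
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println("reader saw: " + demo()); // reader saw: 42
    }
}
```

If `ready` were not volatile, the reader would have no happens-before edge to the writer and could legally spin forever or read a stale `payload`.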
Where volatile alone is not enough:
- Compound operations (count++, x = y): Volatile only provides visibility, not atomicity. Use AtomicInteger or synchronized.
- Multiple variables needing consistent state: Volatile on one variable doesn't make a group of variables update atomically. Use synchronized or Lock.
- When you need mutual exclusion: Volatile doesn't block threads. Use synchronized or ReentrantLock.
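The visibility-versus-atomicity gap is easiest to see with a counter. This sketch (class `CounterRace` and method `run` are hypothetical names) increments a volatile int and an AtomicInteger from several threads. The atomic total is always exact; the volatile total can come out lower because `count++` is a read, an add, and a write that other threads can interleave with. How many updates are lost varies from run to run.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class for illustration only.
final class CounterRace {
    private volatile int volatileCount;                       // visible, but ++ is not atomic
    private final AtomicInteger atomicCount = new AtomicInteger();

    // Returns { volatile total, atomic total } after all workers finish.
    static int[] run(int threads, int incrementsPerThread) {
        CounterRace counters = new CounterRace();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    counters.volatileCount++;                // read-modify-write: updates can be lost
                    counters.atomicCount.incrementAndGet();  // single atomic operation
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return new int[] { counters.volatileCount, counters.atomicCount.get() };
    }

    public static void main(String[] args) {
        int[] totals = run(4, 100_000);
        System.out.println("volatile++ total: " + totals[0] + " (at most 400000, often less)");
        System.out.println("atomic total:     " + totals[1]); // 400000
    }
}
```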
⚠️ x86 Hides Concurrency Bugs — ARM Exposes Them: x86 has strong memory ordering (TSO). Many data races 'work' on x86 but crash on ARM (Graviton, Apple Silicon). If you deploy to ARM, test there. Always establish happens-before edges — never rely on architecture-specific behavior.
Common Production Mistakes and Debugging Patterns
These are the mistakes I've seen in production systems and the debugging patterns that caught them. Every one of these has caused a real incident.
The Method Area: Your Class's DNA and Why Metaspace Changed Everything
Here's where most tutorials lie to you. They call it 'Metaspace' and wave it off as 'just class metadata.' No. The Method Area is the genetic blueprint of every object you'll ever allocate. It stores class structures — runtime constant pool, field and method data, the bytecode for constructors and methods, and those special final static variables. Before Java 8, this lived in the Permanent Generation (PermGen), a contiguous heap region that burned you with java.lang.OutOfMemoryError: PermGen space every time you did hot redeploys in an app server.
Then came Metaspace (Java 8+). Oracle killed PermGen and moved this data to native memory — outside the Java heap entirely. Why? Because native memory grows dynamically. You still hit OOM, but now it's Metaspace instead of PermGen, and the default max is unbounded. That's the trade-off: no more fixed ceiling, but you can eat your entire machine's RAM on class metadata if you're careless. The Method Area is where your classloader leaks live. Every framework that creates classloaders (Spring Boot devtools, OSGI, application servers) can leave rotting class metadata here unless you understand that -XX:MaxMetaspaceSize exists.
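Set -XX:MaxMetaspaceSize to cap it, and use jcmd <pid> GC.class_stats to identify leaking loaders (on recent JDKs, where GC.class_stats is no longer available, jcmd <pid> VM.classloader_stats gives a per-loader breakdown).

String Pool: How Your String.intern() Every Morning Costs You 200ms of GC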
Every Java developer learns about string interning in week one. They use it in production maybe twice before it bites them. The String Pool is a hashmap inside the heap (used to be in PermGen, now in the regular heap) that holds interned string literals and explicitly interned strings. When you write String a = "hello", the JVM checks this pool first. If "hello" exists, a gets the same reference. If not, the JVM allocates the string in the heap and adds it to the pool.
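The pool lookup described above can be observed directly with reference comparison. This is a small illustrative sketch (the class `InternDemo` is a hypothetical name); `==` is used here deliberately to compare identities, which is exactly what you should not do for ordinary string equality.

```java
// Hypothetical demo class for illustration only.
final class InternDemo {

    // Returns, in order:
    // literal == sameLiteral, literal == fresh, literal == interned, literal.equals(fresh)
    static boolean[] compare() {
        String literal = "hello";             // resolved through the string pool
        String sameLiteral = "hello";         // same pooled instance
        String fresh = new String("hello");   // a new heap object, not the pooled one
        String interned = fresh.intern();     // returns the pooled instance
        return new boolean[] {
            literal == sameLiteral,
            literal == fresh,
            literal == interned,
            literal.equals(fresh)
        };
    }

    public static void main(String[] args) {
        boolean[] r = compare();
        System.out.println("literal == sameLiteral : " + r[0]); // true
        System.out.println("literal == fresh       : " + r[1]); // false
        System.out.println("literal == interned    : " + r[2]); // true
        System.out.println("literal.equals(fresh)  : " + r[3]); // true
    }
}
```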
Here's the part the textbooks skip: the pool is backed by a hash table whose bucket count is set by -XX:StringTableSize. The default is 60013 on Java 8 (65536 on Java 11 and later). If you intern 10 million unique strings, you get hash collisions. Lots of them. Suddenly your String.intern() call becomes a linked-list traversal that burns CPU and serializes GC threads. And because the pool lives in the heap, interning a 40-character string creates a char[] and a String object, about 88 bytes per entry. 10 million entries? Nearly a gig of heap you can't GC until you drop all references. Your next full GC just doubled in duration.
Oracle's engineering notes recommend tuning -XX:StringTableSize if you create more than 100,000 unique interned strings. I've debugged Elasticsearch clusters where interned field keys ballooned to 2GB of heap. The fix wasn't to stop interning — it was to bump StringTableSize to 1,000,019 and review the code that was writing 800,000 unique field names.
Set -XX:StringTableSize=10000019 (a prime number near the anticipated count of unique interned strings). Prime bucket counts reduce collisions. Monitor with jcmd <pid> VM.stringtable every time you do a redeployment with heavy interning.

GC Logging: Your First Diagnostic Tool
Why logging GC matters: without it, you diagnose heap problems blind. GC logs reveal pause times, collection frequency, and promotion failures, which is the data you need before tuning any flag. Enable them with -Xlog:gc (Java 9+) or -XX:+PrintGCDetails (Java 8). Focus on young GC pause durations: under 10 ms is healthy, over 100 ms signals a problem. Old GC pauses over 1 second demand immediate action; check heap size and object retention.

Always log to a rotating file with timestamps, for example -Xlog:gc*:file=gc-%t.log:time,uptime:filecount=5,filesize=10M (the rotation options belong in the fourth colon-separated field, after the decorators). Parse logs with a tool such as GCeasy (gceasy.io). Track promotion rates: if objects move to the old generation faster than it is collected, increase the young generation. Frequent full GCs with low heap usage indicate premature promotion.

The key insight: GC logs don't lie. They show exactly where throughput and latency trade off. Set up logging in production from day one, never after an outage.
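GC logs are the primary record, but the same counters are also available in-process, which is handy for a health endpoint or a quick sanity check. The following is a sketch using the standard `GarbageCollectorMXBean` API; the class `GcStats` is an illustrative name. Collector names depend on the GC in use (with G1 they are typically "G1 Young Generation" and "G1 Old Generation"), and a count or time of -1 means the collector does not report that value.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper class for illustration only.
final class GcStats {

    // One line per collector: cumulative collection count and total time since JVM start.
    static String summary() {
        StringBuilder sb = new StringBuilder();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            sb.append(gc.getName())
              .append(": collections=").append(gc.getCollectionCount())
              .append(", totalTimeMs=").append(gc.getCollectionTime())
              .append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Allocate some short-lived garbage so a young collection is likely (not guaranteed).
        for (int i = 0; i < 200_000; i++) {
            byte[] scratch = new byte[1024];
            scratch[0] = 1;
        }
        System.out.print(summary());
    }
}
```

These are cumulative totals, not per-pause durations, so they complement the log file rather than replace it.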
32/64 Bit JVM: The Pointer Width Trap
Why 32 vs 64 matters: memory addressing limits and compressed OOPs. A 32-bit JVM can address at most 4 GB in total, and the usable heap is typically well below that, a hard limit for legacy systems. A 64-bit JVM supports terabytes of heap but doubles reference size (8 bytes vs 4), wasting memory if your heap is under 32GB. Compressed OOPs (-XX:+UseCompressedOops, default on 64-bit with heap < 32GB) pack object references into 32 bits, saving 40% memory in many apps. The dirty secret: running a 32-bit JVM on 64-bit hardware gains nothing; use 64-bit with compressed OOPs instead. Watch for: when heap exceeds 32GB, compressed OOPs are disabled automatically, reference size doubles, and memory per object jumps. Benchmark your app at 30GB vs 34GB: you may see worse performance at the higher heap size due to pointer bloat. Direct memory (NIO buffers) on 64-bit still uses full 8-byte pointers. Key rule: stay under 32GB heap to keep compressed OOPs active, or accept 10-20% memory overhead.
The 4GB Container That Kept Dying
- Heap ≠ total memory
- Non-heap consumes 20–30% of your container budget
- Always leave headroom
Common mistakes to avoid
- Setting -Xmx equal to container memory limit
- Not setting -XX:MaxMetaspaceSize
- Using volatile without understanding atomicity
- Assuming x86 memory ordering is universal
Interview Questions on This Topic
What is the difference between stack and heap memory in the JVM?