Java Thread States — Lock During I/O Causes BLOCKED
A 30-second 503? Check thread dumps for BLOCKED threads queued on a lock that is held during I/O. That is the real production incident we decode step by step.
20+ years shipping production Java in banking & fintech. Drawn from code that ran under real load.
- Java threads cycle through 6 states: NEW, RUNNABLE, BLOCKED, WAITING, TIMED_WAITING, TERMINATED
- Thread.getState() returns the current state; a thread dump prints it for every thread
- BLOCKED and WAITING are not the same — monitor contention vs indefinite park
- TIMED_WAITING is WAITING with a timeout — always bound
- Thread state transitions are driven by JVM internals and OS scheduling
- Biggest mistake: treating RUNNABLE as "actively running" — it includes ready-to-run
Imagine a chef in a restaurant kitchen. Sometimes they're cooking, or standing at the pass ready to cook the moment a burner frees up (RUNNABLE). Sometimes they're waiting for ingredients to arrive (WAITING). Sometimes a timer is going off and they'll be ready in 30 seconds (TIMED_WAITING). Sometimes another chef is using the stove and our chef is standing right there ready to grab it the moment it's free (BLOCKED). Before their shift starts, they haven't even put on their apron yet (NEW). When the shift ends and the kitchen closes, they're done for the night (TERMINATED). Java threads are exactly like that chef, and the JVM is the kitchen manager deciding who gets the stove.
Every production outage involving threads — the deadlock that froze your payment service at 2 AM, the thread pool that silently starved under load, the race condition that corrupted user data — traces back to a misunderstanding of what a thread is actually doing at any given moment. The Java thread lifecycle isn't just an academic diagram you memorize for interviews. It's the mental model that lets you read a thread dump, diagnose a hung application, and design concurrent systems that hold up under real traffic.
The problem is that most resources treat the lifecycle as a static state machine — here are the six boxes, here are the arrows, done. But threads don't live in boxes. They transition between states in ways that depend on OS scheduling, JVM implementation details, monitor ownership, and the specific flavor of waiting you've asked them to do. Miss those nuances and you'll write code that looks correct, passes unit tests, and then silently misbehaves in production with 200 concurrent users.
By the end of this article you'll be able to read a real thread dump and know exactly what each thread is doing and why. You'll understand the difference between BLOCKED and WAITING at the JVM level — not just the textbook definition. You'll know which state transitions are guaranteed, which are platform-dependent, and which ones hide the bugs that take senior engineers days to find. Let's build that mental model from the ground up.
What Thread Lifecycle in Java Actually Means
A Java thread lifecycle is the state machine every thread passes through from creation to termination: NEW, RUNNABLE, BLOCKED, WAITING, TIMED_WAITING, and TERMINATED. The JVM manages transitions between these states based on thread scheduling, lock acquisition, and I/O operations. Understanding this lifecycle is critical because a thread in BLOCKED state is not consuming CPU but is holding resources, which can cascade into system-wide stalls.
Threads start in NEW after instantiation but before start() is called. Once started, they enter RUNNABLE — the only state where the thread can actually execute on a CPU core. From RUNNABLE, a thread can transition to BLOCKED when it fails to acquire an intrinsic lock (synchronized block), to WAITING via Object.wait() or LockSupport.park(), or to TIMED_WAITING via sleep() or timed waits. The key property: BLOCKED threads are waiting for a lock held by another thread, while WAITING threads are waiting for a signal from another thread.
In real systems, the most dangerous state is BLOCKED because it often indicates lock contention. A thread doing slow I/O inside a synchronized block (say, reading from a slow database connection) keeps holding its lock for the whole call, and the JVM usually reports that thread as RUNNABLE even though it is stuck in a native read. Every other thread that needs the lock piles up in BLOCKED. This can collapse throughput from thousands of requests per second to near zero. The rule: never hold locks during blocking I/O operations.
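The effect is easy to reproduce in isolation. The sketch below is a minimal illustration, not the incident code: the class name, the thread names, and the latch standing in for a slow database or HTTP call are all assumptions made for the demo.

```java
import java.util.concurrent.CountDownLatch;

class LockDuringIoDemo {
    private static final Object LOCK = new Object();

    /** Reports the state of a thread that wants a lock held across simulated I/O. */
    static Thread.State stateOfContender() throws InterruptedException {
        CountDownLatch lockHeld = new CountDownLatch(1);
        CountDownLatch ioDone = new CountDownLatch(1);

        Thread holder = new Thread(() -> {
            synchronized (LOCK) {
                lockHeld.countDown();
                try {
                    ioDone.await(); // stand-in for a slow database or HTTP call
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "holder");

        Thread contender = new Thread(() -> {
            synchronized (LOCK) { /* needs the same lock */ }
        }, "contender");

        holder.start();
        lockHeld.await();            // make sure the holder owns the lock first
        contender.start();

        // Poll until the contender is parked on the monitor's entry set.
        while (contender.getState() != Thread.State.BLOCKED) {
            Thread.sleep(5);
        }
        Thread.State observed = contender.getState();

        ioDone.countDown();          // the "I/O" finishes, the lock is released
        holder.join();
        contender.join();
        return observed;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("contender while lock is held: " + stateOfContender());
    }
}
```

The contender stays BLOCKED for exactly as long as the simulated call takes. Moving the slow call outside the synchronized block, and only taking the lock to publish the result, removes that dependency.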
The 6 Thread States: What Each Actually Means
Java defines six thread states in java.lang.Thread.State. They're not just labels — each maps to a specific JVM or OS condition.
- NEW: Thread created but start() not called. Not yet alive.
- RUNNABLE: Thread is executing in the JVM (or ready to execute, waiting for CPU). Includes both running and ready-to-run.
- BLOCKED: Thread is waiting for a monitor lock to enter a synchronized block/method.
- WAITING: Thread is waiting indefinitely for another thread to perform a specific action (e.g., wait(), join(), park()).
- TIMED_WAITING: Same as WAITING but with a timeout (sleep, wait(timeout), join(timeout), parkNanos).
- TERMINATED: Thread has completed (run() finished or exception).
The key insight: RUNNABLE does not mean 'using CPU right now'. It means the thread is eligible for scheduling. The OS decides when it actually runs. This is why busy-wait loops (while(!flag)) keep a thread in RUNNABLE but waste CPU.
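To see these states on a live thread, you can sample Thread.getState() at known points. The following is a small sketch with assumed names (ThreadStateWalk, walker); it uses Thread.onSpinWait(), so it needs Java 9 or later. The busy-wait is there on purpose to show a thread that is RUNNABLE while doing nothing useful.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

class ThreadStateWalk {
    /** Samples one thread's state at four points in its life. */
    static List<Thread.State> observe() throws InterruptedException {
        AtomicBoolean proceed = new AtomicBoolean(false);
        Thread t = new Thread(() -> {
            while (!proceed.get()) {
                Thread.onSpinWait();          // busy-wait: RUNNABLE, burning CPU
            }
            try {
                Thread.sleep(10_000);         // timed wait: TIMED_WAITING
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // woken early, exit
            }
        }, "walker");

        List<Thread.State> seen = new ArrayList<>();
        seen.add(t.getState());               // before start(): NEW
        t.start();
        seen.add(t.getState());               // spinning: RUNNABLE
        proceed.set(true);
        while (t.getState() != Thread.State.TIMED_WAITING) {
            Thread.sleep(5);
        }
        seen.add(t.getState());               // inside sleep(): TIMED_WAITING
        t.interrupt();
        t.join();
        seen.add(t.getState());               // run() returned: TERMINATED
        return seen;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(observe());
    }
}
```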
State Transitions — The Arrows Between the Boxes
Threads don't jump randomly. Each transition has a trigger:
- NEW → RUNNABLE: Calling start().
- RUNNABLE → BLOCKED: Attempting to enter a synchronized block/method without the lock. JVM puts you on the monitor's entry set.
- BLOCKED → RUNNABLE: The lock holder releases the lock (exits synchronized block) and this thread acquires it.
- RUNNABLE → WAITING: Calling Object.wait(), Thread.join(), or LockSupport.park(). With wait(), the thread releases the monitor and is put in that monitor's wait set.
- WAITING → RUNNABLE: Another thread calls notify()/notifyAll() on the same monitor, or the thread is interrupted. But the thread must re-acquire the lock before proceeding, so it goes to BLOCKED first, then RUNNABLE.
- RUNNABLE → TIMED_WAITING: Thread.sleep(time), wait(timeout), join(timeout), parkNanos().
- TIMED_WAITING → RUNNABLE: Timeout expires, or notify/interrupt.
- RUNNABLE → TERMINATED: run() completes.
The critical detail: after notify(), the waiting thread doesn't run immediately. It must re-acquire the monitor lock. This is why waiting code should always loop on the condition (spurious wakeup).
- Entry set: Threads trying to enter the synchronized block (BLOCKED). Bouncer holds them back until the current occupant leaves.
- Wait set: Threads that called wait() (WAITING). They voluntarily stepped aside and wait for a signal from the bouncer.
- When notify() is called, one thread moves from wait set to entry set. It's still BLOCKED until it actually grabs the lock.
- Multiple notify() calls move multiple threads, but only one gets the lock at a time.
- Always wait inside a while loop, because of spurious wakeups and the gap between notify() and lock acquisition.
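The entry set / wait set rules translate into a standard guarded-wait shape. This is a minimal sketch, assuming a hypothetical ReadyGate class with a single boolean condition; the names are illustrative only.

```java
class ReadyGate {
    private final Object monitor = new Object();
    private boolean ready;                 // the condition, guarded by monitor

    void awaitReady() throws InterruptedException {
        synchronized (monitor) {
            while (!ready) {               // re-check after every wakeup
                monitor.wait();            // releases monitor, joins the wait set
            }
        }
    }

    void signalReady() {
        synchronized (monitor) {
            ready = true;                  // set the state first
            monitor.notifyAll();           // then move waiters to the entry set
        }
    }

    /** Returns true if the waiter got past the gate. */
    static boolean demo() throws InterruptedException {
        ReadyGate gate = new ReadyGate();
        boolean[] passed = new boolean[1];
        Thread waiter = new Thread(() -> {
            try {
                gate.awaitReady();
                passed[0] = true;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "waiter");
        waiter.start();
        gate.signalReady();                // may run before or after the waiter waits
        waiter.join();                     // join() makes passed[0] visible here
        return passed[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("waiter passed: " + demo());
    }
}
```

Because the waiter checks the flag before it ever calls wait(), it does not matter whether signalReady() runs first: a signal that arrives early is remembered in the flag instead of being lost.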
A common failure: one thread calls notify() but the condition the waiting thread checks is still false because of a race. The waiting thread wakes, checks the condition, finds it false, and goes back to WAITING. The notifier never calls notify() again, so the waiter waits forever. This is why the while loop around wait() is non-negotiable.
After notify(), the waiting thread must still re-acquire the lock (BLOCKED) before proceeding.
If your wait() isn't in a while loop, you'll hit a production bug within a year.
BLOCKED vs WAITING: The JVM Difference
At the JVM level, BLOCKED and WAITING are distinct in the thread dump output:
- BLOCKED (on object monitor): The thread is in the entry set of a monitor, waiting to acquire the lock. The dump shows the lock as 'waiting to lock <monitor>'; the owner is the thread whose stack shows 'locked <monitor>' with the same address.
- WAITING (on object monitor): The thread is in the wait set, having called wait() on that monitor. The dump shows 'waiting on <monitor>' but not who will wake it.
- WAITING (parking): Thread used LockSupport.park(), typically from java.util.concurrent (e.g., ForkJoinPool workers, CompletableFuture).
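The performance impact: a BLOCKED thread consumes essentially no CPU. In HotSpot, a thread that loses the race for a contended monitor typically spins briefly and is then parked at the OS level until the owner releases the lock. A WAITING thread is likewise descheduled until it is notified or unparked. Both are idle, but the reason matters for debugging: a BLOCKED thread is waiting for a lock to be released, a WAITING thread is waiting for a signal.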
A thread dump might show hundreds of BLOCKED threads all waiting on the same lock — that's a contention hotspot. WAITING threads on the same condition often indicate a missing notify. WAITING threads with 'parking' are usually normal (thread pool idle).
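The same distinction is available programmatically through java.lang.management. The sketch below is an illustration under assumed names (BlockedVsWaiting, and threads called owner, blocked, waiting); it asks ThreadInfo who owns the lock each thread is stuck on. Thread.getId() is used for compatibility with older JDKs; newer ones offer threadId().

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.util.concurrent.CountDownLatch;

class BlockedVsWaiting {
    /** Formats a thread's state and the owner of the lock it is stuck on. */
    static String describe(Thread t) {
        ThreadInfo info = ManagementFactory.getThreadMXBean().getThreadInfo(t.getId());
        return info.getThreadState() + " owner=" + info.getLockOwnerName();
    }

    static String[] snapshot() throws InterruptedException {
        Object lock = new Object();
        Object condition = new Object();
        boolean[] signalled = {false};               // guarded by condition
        CountDownLatch lockHeld = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);

        Thread owner = new Thread(() -> {
            synchronized (lock) {
                lockHeld.countDown();
                try {
                    release.await();                 // keep the lock until told otherwise
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "owner");
        Thread blocked = new Thread(() -> {
            synchronized (lock) { /* entry set until owner leaves */ }
        }, "blocked");
        Thread waiting = new Thread(() -> {
            synchronized (condition) {
                while (!signalled[0]) {
                    try {
                        condition.wait();            // wait set, monitor released
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                }
            }
        }, "waiting");

        owner.start();
        lockHeld.await();
        blocked.start();
        waiting.start();
        while (blocked.getState() != Thread.State.BLOCKED
                || waiting.getState() != Thread.State.WAITING) {
            Thread.sleep(5);
        }
        String[] result = {describe(blocked), describe(waiting)};

        release.countDown();                         // let everything finish
        synchronized (condition) {
            signalled[0] = true;
            condition.notifyAll();
        }
        owner.join();
        blocked.join();
        waiting.join();
        return result;
    }

    public static void main(String[] args) throws InterruptedException {
        for (String line : snapshot()) {
            System.out.println(line);
        }
    }
}
```

For the BLOCKED thread there is a named owner to go and inspect. For the thread in wait() nobody owns the monitor, which matches what a dump tells you: it shows what the thread waits on, not who is supposed to wake it.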
If threads sit in WAITING on the same monitor indefinitely, suspect a missing notify(), or a notify() that happened before wait(). Check that the notifier sets a boolean flag and that the waiter checks it.
Thread Dump Analysis: Reading the Lifecycle in Action
When a production incident hits, your first tool is the thread dump. Here's what you're looking for:
- Thread name: Often configured in thread pools. 'http-nio-8080-exec-1' indicates a Tomcat worker.
- State: One of the six above.
- Stack trace: Shows exactly where the thread is blocked.
- Lock details: 'waiting to lock <0x00000007>', 'locked <0x00000008>'. The hex ID identifies the monitor.
Key patterns to recognize:
- Deadlock: Thread A holds lock L1 and wants L2. Thread B holds L2 and wants L1. Both are BLOCKED. The dump explicitly says 'Found one Java-level deadlock'.
- Lock contention: Many threads BLOCKED on the same lock, one owner.
- Missed signal: Threads WAITING on a condition, nobody holding the lock.
- Spinning: Thread state is RUNNABLE but the stack trace shows a tight loop (while(!flag){ } ) — consumes CPU without progress.
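The deadlock pattern can also be detected from inside the JVM, which is useful for a health check. Below is a hedged sketch: DeadlockProbe and the thread names A and B are made up for the demo, and the two threads are deliberately deadlocked and left that way (as daemon threads) so the detector has something to find.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

class DeadlockProbe {
    /** Builds a two-thread monitor deadlock and returns how many threads the JVM reports. */
    static int deadlockedThreadCount() throws InterruptedException {
        Object l1 = new Object();
        Object l2 = new Object();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);

        lockInOrder("A", l1, l2, bothHoldFirst).start();
        lockInOrder("B", l2, l1, bothHoldFirst).start();

        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids;
        // Returns null until a cycle of threads BLOCKED on monitors exists.
        while ((ids = mx.findMonitorDeadlockedThreads()) == null) {
            Thread.sleep(10);
        }
        return ids.length;
    }

    private static Thread lockInOrder(String name, Object first, Object second,
                                      CountDownLatch bothHoldFirst) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                bothHoldFirst.countDown();
                try {
                    bothHoldFirst.await();   // wait until the other thread holds its lock
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                synchronized (second) { /* never reached */ }
            }
        }, name);
        t.setDaemon(true);                   // deadlocked forever; must not keep the JVM alive
        return t;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("deadlocked threads: " + deadlockedThreadCount());
    }
}
```

In a real service you would call findMonitorDeadlockedThreads() (or findDeadlockedThreads(), which also covers java.util.concurrent locks where supported) periodically and log the ThreadInfo for any ids it returns.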
Common Pitfalls and How to Avoid Them
Even experienced engineers fall into these traps. Here are the most common production failures linked to thread lifecycle misunderstanding:
Pitfall 1: Holding a lock during I/O. A synchronized block wraps a database call or HTTP request. If the external call hangs, every other thread wanting that lock is stuck in BLOCKED. Fix: Move I/O outside the synchronized block, or use a read/write lock, or apply a timeout on the I/O and recheck inside.
Pitfall 2: Notify without state flag. Calling notify() but forgetting to set a condition variable that the waiting thread checks. The waiting thread wakes, checks the condition, finds it false, and goes back to WAITING, never to be woken again. Fix: Always use a boolean flag in conjunction with wait/notify.
Pitfall 3: Calling start() twice. Thread.start() can only be called once. A second call throws IllegalThreadStateException. This happens often when reusing a thread object. Fix: Create a new Thread instance for each execution.
Pitfall 4: Assuming RUNNABLE means 'working'. Resource exhaustion may cause many threads to be in RUNNABLE but not progressing because they're waiting on CPU scheduling. Monitoring tools that only show thread count in RUNNABLE can mislead. Fix: Combine thread dump with CPU profiling (jstack + top -H).
Pitfall 5: Ignoring interrupted flag. When InterruptedException is caught, forgetting to restore the interrupt flag (Thread.currentThread().interrupt()) can cause the thread to miss shutdown signals. Fix: Always preserve the interrupt status in catch blocks.
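Pitfall 5 is easy to verify for yourself. The sketch below uses an assumed helper class, InterruptStatusDemo, to show that the interrupt status is gone after the catch block unless the code puts it back.

```java
class InterruptStatusDemo {
    /** Returns whether the worker still saw its interrupt status after the catch block. */
    static boolean interruptedAfterCatch(boolean restore) throws InterruptedException {
        boolean[] seen = new boolean[1];
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(10_000);
            } catch (InterruptedException e) {
                // sleep() cleared the interrupt status when it threw.
                if (restore) {
                    Thread.currentThread().interrupt(); // put it back for callers
                }
            }
            seen[0] = Thread.currentThread().isInterrupted();
        }, "worker");
        worker.start();
        worker.interrupt();
        worker.join();                 // join() makes seen[0] visible here
        return seen[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("swallowed: " + interruptedAfterCatch(false));
        System.out.println("restored:  " + interruptedAfterCatch(true));
    }
}
```

Code further up the stack, such as a loop that checks isInterrupted() or an executor handling shutdownNow(), only sees the request in the restored case.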
The Hidden Cost of Thread Transitions You're Ignoring
Everyone parrots the six states. Nobody talks about what it costs to move between them. That's where your production problems live. A state transition isn't free: when it involves blocking, it's a context switch paid in CPU cycles, cache misses, and wall-clock time. When your thread goes from RUNNABLE to BLOCKED on a contended lock, the OS saves its register state and schedules something else; when the thread finally gets the lock, it typically resumes with cold caches. That's microseconds you can't buy back. The gap between 'it works on my machine' and 'it melts in prod' is usually a bunch of threads ping-ponging between RUNNABLE and BLOCKED every few milliseconds. You don't feel it until you're at 200 threads fighting over one synchronized block. Then the system collapses. The lifecycle isn't a CS diagram. It's a cost model. Map the transitions in your critical path. If you see threads bouncing between BLOCKED and RUNNABLE more than a few times per second, you're leaving performance on the floor.
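One way to map those transitions is to read the per-thread blocked counter the JVM already keeps. This is a rough sketch with assumed names (BlockedCountProbe, worker); the main thread deliberately holds the lock while sleeping, purely to force one contended entry.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.util.concurrent.CountDownLatch;

class BlockedCountProbe {
    /** Forces one contended monitor entry and returns the worker's blocked count. */
    static long blockedCountAfterOneContention() throws InterruptedException {
        Object lock = new Object();
        CountDownLatch done = new CountDownLatch(1);

        Thread worker = new Thread(() -> {
            synchronized (lock) { /* contended entry: RUNNABLE -> BLOCKED -> RUNNABLE */ }
            try {
                done.await();          // stay alive so the counter can be read
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "worker");

        synchronized (lock) {          // hold the lock on purpose to create contention
            worker.start();
            while (worker.getState() != Thread.State.BLOCKED) {
                Thread.sleep(5);
            }
        }
        while (worker.getState() != Thread.State.WAITING) {
            Thread.sleep(5);           // worker got the lock and is now on the latch
        }

        ThreadInfo info = ManagementFactory.getThreadMXBean().getThreadInfo(worker.getId());
        long count = info.getBlockedCount();

        done.countDown();
        worker.join();
        return count;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("times blocked: " + blockedCountAfterOneContention());
    }
}
```

Sampling getBlockedCount() for your request threads at two points in time gives a per-thread rate of contended monitor entries, which is a more direct signal than eyeballing successive dumps.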
Why Your Sleeping Threads Are Lying to You
TIMED_WAITING looks harmless. It's not. When a thread calls Thread.sleep(1000), it's not taking a nap – it's handing its time slice back to the OS scheduler. The JVM can't guarantee it'll wake up in exactly 1000 milliseconds. It's a best-effort. On a loaded system, '1000 ms' becomes 1050, 1200, or 1500. The thread is alive but useless. That's why timeouts on database connections or HTTP calls get weird: your waiting thread wakes up late, grabs a lock it should have released already, and you see latency spikes. The root cause isn't the network. It's your thread sleeping like it's on vacation. The fix: use timed wait methods only for polling and housekeeping, never for latency-sensitive operations. Use CountDownLatch with a deadline or CompletableFuture with timeout for real work. And never, ever sleep inside a synchronized block. That's how you turn a 'harmless' pause into a system-wide choke point.
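As a sketch of the CompletableFuture alternative (Java 9 or later), the helper below puts a deadline on an upstream result instead of sleeping and polling. The class and method names and the "timed out" marker are assumptions for illustration. Note that orTimeout() acts on the future you pass in, so that future itself is completed exceptionally when the deadline passes.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class TimeoutInsteadOfSleep {
    /** Waits for upstream, but never longer than the given deadline. */
    static String fetchWithDeadline(CompletableFuture<String> upstream, long millis) {
        return upstream
            .orTimeout(millis, TimeUnit.MILLISECONDS)   // fails with TimeoutException on expiry
            .exceptionally(ex -> {
                Throwable cause = (ex instanceof CompletionException && ex.getCause() != null)
                        ? ex.getCause() : ex;
                if (cause instanceof TimeoutException) {
                    return "timed out";                 // hypothetical fallback value
                }
                throw new CompletionException(cause);   // other failures propagate
            })
            .join();
    }

    public static void main(String[] args) {
        // A future nobody completes stands in for a hung upstream call.
        System.out.println(fetchWithDeadline(new CompletableFuture<>(), 100));
        System.out.println(fetchWithDeadline(CompletableFuture.completedFuture("ok"), 100));
    }
}
```

The calling thread still waits in join(), but it holds no monitor while doing so, and the deadline is enforced by the future rather than by a sleep that may overshoot.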
Replace Thread.sleep() polling with CompletableFuture.orTimeout(1, TimeUnit.SECONDS). It completes the future with a TimeoutException when the deadline passes, doesn't hold locks while waiting, and makes the deadline explicit. Sleeping is a code smell.
How to Make Your Threads Die Cleanly (They Deserve It)
A thread in TERMINATED state is done. Its stack is reclaimed, its monitors are released, and its ThreadLocal values become eligible for garbage collection. But only if you let it die cleanly. The biggest sin I see is threads that never terminate because they're stuck in an infinite loop with no exit condition. Or worse, threads that get interrupted but swallow the interrupt. When a thread is TERMINATED, it means its run() method finished. That's it. But many developers don't handle interruption properly. When you call thread.interrupt(), you're just setting a flag. The thread doesn't magically stop. It has to check Thread.interrupted() or catch InterruptedException and actually stop. If you eat the exception inside a loop, the thread keeps running indefinitely. Memory leaks, thread leaks, connection leaks: all starting from a thread that won't die. The pattern: always use a volatile boolean running flag for graceful shutdown. Check it every iteration. If interrupted, clean up resources and break. Legacy threads that skip this are the reason your production JVM runs out of memory at 3 AM.
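A minimal sketch of that shutdown pattern follows. StoppableWorker, its queue of String jobs, and the thread name are hypothetical; the point is the combination of a volatile flag for the loop and interrupt() to wake a thread parked in take().

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class StoppableWorker implements Runnable {
    private volatile boolean running = true;
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private final Thread thread = new Thread(this, "stoppable-worker");

    void start() {
        thread.start();
    }

    void submit(String job) {
        queue.add(job);
    }

    /** Requests shutdown and waits until the worker has terminated. */
    void stop() throws InterruptedException {
        running = false;        // cooperative: seen at the next loop check
        thread.interrupt();     // wakes the worker if it is parked in take()
        thread.join();
    }

    Thread.State state() {
        return thread.getState();
    }

    @Override
    public void run() {
        while (running) {
            try {
                String job = queue.take();   // WAITING while the queue is empty
                // ... process job ...
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the status
                break;                               // and actually stop
            }
        }
        // release resources here before run() returns
    }

    public static void main(String[] args) throws InterruptedException {
        StoppableWorker worker = new StoppableWorker();
        worker.start();
        worker.submit("job-1");
        worker.stop();
        System.out.println("worker state: " + worker.state());
    }
}
```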
Use a volatile flag and interrupt() together. One for cooperative shutdown, one to wake from blocking calls. Never rely on a single mechanism.
Why Thread Lifecycle Starts Before You Call start()
Most Java developers assume a thread's lifecycle begins when start() executes. That's wrong. The lifecycle starts the moment you instantiate a Thread object, and the thread sits in NEW until start() is called. In HotSpot, a NEW thread is an ordinary heap object: the constructor records its name, priority, daemon flag, and thread group, but the native thread and its stack are only created by start(). That is where the real cost lands. The default stack size is about 1 MB on most 64-bit JVMs (tunable with -Xss), and it is reserved outside the Java heap. For 10,000 started threads, that's up to 10 GB of reserved address space before your code does any useful work. Unstarted Thread objects that pile up in a collection cost far less, but they still pin whatever their Runnable captures. Always measure thread counts against native memory limits, not just heap limits. Use thread pools (ExecutorService) to amortize the creation cost. If you must create threads manually, start them promptly and drop references you no longer need, so NEW-state objects don't accumulate. The lifecycle's first hidden cost is the one you never see: creating the thread at all.
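To make the pooling advice concrete, the sketch below (class and method names are assumptions) runs many small tasks on a fixed pool and counts how many distinct threads actually did the work.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class PoolReuseDemo {
    /** Runs the given number of tasks and returns how many distinct threads executed them. */
    static int distinctThreadsFor(int tasks, int poolSize) throws InterruptedException {
        Set<String> names = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            for (int i = 0; i < tasks; i++) {
                pool.execute(() -> names.add(Thread.currentThread().getName()));
            }
        } finally {
            pool.shutdown();                       // no new tasks, queued ones still run
        }
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return names.size();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("threads used for 100 tasks: " + distinctThreadsFor(100, 4));
    }
}
```

A hundred tasks never use more than four threads here, so the thread creation cost is paid at most four times instead of a hundred.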
TERMINATED Is Not the End: JVM Cleanup You Must Trigger
A thread entering TERMINATED state stops executing, but that alone doesn't finish the job from your code's point of view. In HotSpot, the JVM releases the thread's native stack and OS thread handle on its own when run() returns; unlike a raw pthread, a Java thread does not have to be joined for that to happen. What the JVM does not do is clean up after your code. The garbage collector reclaims the Thread object only when no references remain, so terminated threads kept in lists or fields stay on the heap. Sockets, files, and connections the thread opened stay open unless its code closes them. And nothing tells the rest of your program that the work is done. That is what join() is for: it blocks until the thread has terminated and guarantees that everything the thread wrote is visible afterwards. For long-running applications with short-lived threads, skipping these steps lets leaks accumulate silently. The fix: always call join() or use a CountDownLatch for one-shot threads. For pools, ExecutorService.shutdown() and awaitTermination() wait for all workers to exit. TERMINATED threads don't tidy up after your code on their own; you must finish the lifecycle contract.
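For pools, the shutdown contract usually takes the two-phase shape sketched below. PoolShutdown and shutdownGracefully are illustrative names; the calls themselves (shutdown, awaitTermination, shutdownNow) are the standard ExecutorService methods.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class PoolShutdown {
    /** Returns true once every worker in the pool has terminated. */
    static boolean shutdownGracefully(ExecutorService pool, long timeout, TimeUnit unit)
            throws InterruptedException {
        pool.shutdown();                           // stop accepting new tasks
        if (pool.awaitTermination(timeout, unit)) {
            return true;                           // queued work finished in time
        }
        pool.shutdownNow();                        // interrupt workers that are still busy
        return pool.awaitTermination(timeout, unit);
    }

    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        pool.execute(() -> System.out.println("task ran on " + Thread.currentThread().getName()));
        System.out.println("terminated: " + shutdownGracefully(pool, 5, TimeUnit.SECONDS));
    }
}
```

The second phase only works if tasks respond to interruption, which is why restoring the interrupt status in catch blocks matters.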
Call join() or use shutdown mechanisms so you know a thread has reached TERMINATED before you depend on its results.
The Case of the Frozen Payment Service
- Never hold a lock during I/O operations — you block all other threads waiting for that lock.
- Always use timeouts on external calls inside synchronized blocks, or better, avoid blocking I/O entirely when holding locks.
- On-call engineers need to know how to read thread dumps — the fix was straightforward once they saw the BLOCKED pattern.
Threads stuck in WAITING: suspect a missed notify()/signal(). Check the code that should wake them. Use 'jstack' to see which thread holds the monitor and what it's doing. Add logging around signal() to confirm it's called.
Threads in TIMED_WAITING on a work queue (e.g., a queue poll with a timeout): check if the pool is idle at those times. Adjust keepAliveTime or use synchronous handoff. Also verify that the timed wait is not masking a slow upstream.
jstack -l <pid> > /tmp/threaddump.txt
grep -E 'BLOCKED|WAITING \(on object monitor\)' /tmp/threaddump.txt | head -20
Key takeaways
- After notify(), the waiting thread must re-acquire the lock before proceeding.
- Always loop around wait() to handle spurious wakeups and missed signals.
Common mistakes to avoid
4 patterns
- Treating RUNNABLE as 'actively using CPU'
- Calling notify() without a condition flag. Symptom: a thread stays in WAITING despite notify() being called. The waiting thread checks the condition after waking, finds it false, and goes back to WAITING. Fix: set a boolean flag before calling notify(). The waiting thread must check that flag in a while loop.
- Holding a lock during blocking I/O
- Ignoring the interrupt flag after InterruptedException. Symptom: shutdown signals fail to interrupt the thread. Fix: call Thread.currentThread().interrupt() in the catch block to restore the interrupt status.
Interview Questions on This Topic
Explain the difference between BLOCKED and WAITING thread states in Java.
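A BLOCKED thread is in a monitor's entry set, waiting to acquire the lock for a synchronized block or method. A WAITING thread called wait(), join(), or park(); after wait() it sits in the monitor's wait set. The key difference: a BLOCKED thread is waiting for a lock to be released and may still hold other locks (which is how deadlocks form); a thread in wait() has released that monitor and stays parked until notified, while join() and park() release no locks at all.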