Process: isolated OS unit with own memory, expensive to create
Thread: lightweight, shares heap with siblings, faster context-switch
Scheduler preempts threads every 1–10ms; context switch has real cost
Java: ProcessBuilder for processes, Thread class for threads, Virtual Threads for scale
Pitfall: data race from unsynchronised shared state; use AtomicInteger or synchronized
Debugging: jstack finds deadlocks; thread dumps show BLOCKED/WAITING states
Plain-English First
Imagine a restaurant kitchen. Each dish on the menu is a process — it has its own ingredients, its own space on the counter, and its own set of instructions. The chefs actually cooking that dish are threads — multiple chefs can work on the same dish at the same time, sharing the same counter space. The head chef (the OS scheduler) decides who cooks what and when, making sure no one burns anything or starves waiting for the stove.
Every time you open Spotify while your browser streams a video and Slack pings you in the background, your operating system is performing a silent juggling act of extraordinary complexity. It's carving up one physical CPU into dozens of seemingly simultaneous workers, each isolated from the others, each convinced it has the machine to itself. This isn't magic — it's process and thread management, and understanding it is the difference between writing code that works and writing code that performs.
Before multi-processing and multi-threading, programs ran one at a time, start to finish. You launched a program, waited, then launched the next one. That was fine for a 1970s mainframe printing payroll. It's catastrophic for a modern web server that needs to handle ten thousand simultaneous HTTP requests. The OS needed a way to isolate programs from each other (so a crashed browser tab doesn't nuke your entire machine) and simultaneously share CPU time fairly among them. Processes and threads are the solution to both problems.
By the end of this article you'll understand exactly what a process and a thread are at the OS level, why threads exist inside processes rather than as standalone units, how the scheduler decides who runs when, how to create and manage both in Java with real runnable code, and — crucially — what goes wrong when you get this wrong. You'll also be ready for the interview questions that trip up even experienced candidates.
What Is a Process — and Why Does the OS Bother Isolating Them?
A process is a running instance of a program. Not the program itself — the .exe or .class file sitting on disk is just instructions. When the OS loads it into memory and starts executing it, that living, breathing execution environment is a process.
Every process gets its own private sandbox: a dedicated chunk of virtual memory (split into code, stack, heap, and data segments), its own file descriptor table, and its own process ID (PID). That isolation is the entire point. If Chrome's renderer crashes, it doesn't corrupt your terminal session, because they live in completely separate address spaces. The OS enforces that wall at the hardware level using the MMU (Memory Management Unit).
Creating a process is expensive. The OS must allocate a new virtual address space, copy or map the program's code, set up a stack, and create a process control block (PCB) for it: a kernel data structure that tracks everything about that process, including its PID, memory maps, open files, CPU register state, and scheduling priority. That overhead is why threads were invented: they give you concurrency at a fraction of the cost.
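To make the per-process bookkeeping a little more tangible, here is a minimal sketch that asks the JVM what it can see about its own OS process. The class name ProcessSnapshot and its describe() helper are hypothetical names chosen for illustration; ProcessHandle and ProcessHandle.Info are standard since Java 9, and each Info field is optional because the OS may not expose it.

```java
import java.time.Instant;

class ProcessSnapshot {
    // Builds a one-line summary of what the JVM can see about an OS process.
    static String describe(ProcessHandle handle) {
        ProcessHandle.Info info = handle.info();
        // Every Info accessor returns an Optional: the OS may withhold the value.
        String command = info.command().orElse("<unavailable>");
        String started = info.startInstant().map(Instant::toString).orElse("<unavailable>");
        String cpu = info.totalCpuDuration().map(d -> d.toMillis() + " ms").orElse("<unavailable>");
        return "pid=" + handle.pid() + " command=" + command
                + " started=" + started + " cpu=" + cpu;
    }

    public static void main(String[] args) {
        System.out.println(describe(ProcessHandle.current()));
    }
}
```

This only surfaces a small slice of what the kernel tracks; memory maps and register state are not visible through this API.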
ProcessInspector.java (Java)
import java.lang.management.ManagementFactory;
import java.lang.management.RuntimeMXBean;

public class ProcessInspector {
    public static void main(String[] args) {
        // The JVM exposes the current process's info through the Runtime bean
        RuntimeMXBean runtimeBean = ManagementFactory.getRuntimeMXBean();

        // ProcessHandle is available from Java 9+. pid() returns this process's OS-level PID.
        long currentPid = ProcessHandle.current().pid();

        // How long has this JVM process been alive, in milliseconds?
        long uptimeMs = runtimeBean.getUptime();

        System.out.println("=== Current JVM Process Info ===");
        System.out.println("Process ID (PID) : " + currentPid);
        System.out.println("JVM Name : " + runtimeBean.getVmName());
        System.out.println("Process Uptime (ms) : " + uptimeMs);

        // Now let's SPAWN a child process: a completely separate OS process.
        // We'll run the 'java -version' command as its own isolated process.
        System.out.println("\n=== Spawning a Child Process ===");
        try {
            ProcessBuilder processBuilder = new ProcessBuilder("java", "-version");
            // Redirect stderr to stdout so we can read the version output easily
            processBuilder.redirectErrorStream(true);
            Process childProcess = processBuilder.start();

            // Read the child process's output stream
            String output = new String(childProcess.getInputStream().readAllBytes());

            // waitFor() BLOCKS the current thread until the child process terminates
            int exitCode = childProcess.waitFor();

            System.out.println("Child process output : " + output.strip());
            System.out.println("Child exit code : " + exitCode);
            // Exit code 0 = success. Non-zero = something went wrong.

            // The child has its own PID, separate memory space, and lifecycle
            System.out.println("Child PID : " + childProcess.pid());
            System.out.println("Parent PID : " + currentPid);
        } catch (Exception ex) {
            System.err.println("Failed to spawn child process: " + ex.getMessage());
        }
    }
}
Output
=== Current JVM Process Info ===
Process ID (PID) : 18423
JVM Name : OpenJDK 64-Bit Server VM
Process Uptime (ms) : 142
=== Spawning a Child Process ===
Child process output : openjdk version "21.0.2" 2024-01-16
Child exit code : 0
Child PID : 18431
Parent PID : 18423
Why the PIDs differ by ~8:
Linux typically hands out PIDs sequentially, and thread IDs come from the same number space, so the parent JVM's own threads and other background processes grabbed a few IDs between the parent starting and the child starting. This is perfectly normal; never assume a child PID is parent+1.
Production Insight
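Creating a process typically costs on the order of milliseconds (new address space, page tables, loading the executable), far more than creating a thread; exact figures depend on the OS and the program.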
If you spawn a process per request, expect latency spikes.
Rule: reuse processes via pools (like Apache prefork) or use threads for concurrency.
Key Takeaway
A process is a heavy, isolated execution unit.
Use it where fault containment is critical.
For concurrency inside an app, prefer threads – they share memory and context-switch faster.
Process vs Thread in Production
If: You need crash isolation between components (e.g., payment & inventory) → Use: separate processes (microservices).
If: You need high-throughput concurrency within one app (e.g., web server handling requests) → Use: threads (platform or virtual).
If: You have 1000+ concurrent I/O-bound tasks → Use: virtual threads or async I/O; platform threads will saturate the scheduler.
Threads — Lightweight Workers That Share the Same Kitchen Counter
A thread is the smallest unit of execution the OS scheduler actually runs. Every process starts with one thread (the main thread). But you can spawn more, and here's the key insight: all threads inside one process share the same heap memory and the same open file handles. They do each get their own stack (for local variables and method call frames) and their own program counter (so each thread knows where it is in the code).
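The split between a private stack and a shared heap can be sketched in a few lines. In this illustrative example (StackVsHeapDemo and partialSums are hypothetical names), each worker accumulates into a local variable that lives on its own stack, then publishes the result into its own slot of a heap-allocated array that the main thread reads after join().

```java
class StackVsHeapDemo {
    // Each worker sums its own range in a local variable (on that thread's stack),
    // then writes the result into a shared array (on the heap).
    static long[] partialSums(int workers, int perWorker) {
        long[] shared = new long[workers];           // heap: reachable from every thread
        Thread[] threads = new Thread[workers];
        for (int w = 0; w < workers; w++) {
            final int slot = w;
            threads[w] = new Thread(() -> {
                long local = 0;                       // stack: private to this thread
                int start = slot * perWorker;
                for (int i = start; i < start + perWorker; i++) {
                    local += i;
                }
                shared[slot] = local;                 // each thread writes only its own slot
            }, "Summer-" + w);
            threads[w].start();
        }
        for (Thread t : threads) {
            try {
                t.join();                             // join() makes the worker's writes visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while joining", e);
            }
        }
        return shared;
    }

    public static void main(String[] args) {
        long total = 0;
        for (long part : partialSums(4, 250)) {
            total += part;
        }
        System.out.println("Sum of 0..999 = " + total);
    }
}
```

Because no two threads ever write the same slot, this particular pattern needs no lock; the moment two threads update the same location, synchronisation becomes necessary.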
That shared memory is both threads' superpower and their greatest danger. Two threads can communicate by just writing to a shared variable — no sockets, no pipes, no serialisation. But if they both try to modify that variable at the same time without synchronisation, you get a data race, and your program produces wrong answers silently. The OS won't warn you. The compiler won't warn you. It'll just be wrong.
Java makes threading first-class via the Thread class and the Runnable interface, and since Java 21, via Virtual Threads (Project Loom) — lightweight threads managed by the JVM rather than the OS, capable of running millions simultaneously. We'll cover both so you understand the evolution, not just the current API.
ThreadLifecycleDemo.java (Java)
import java.util.concurrent.atomic.AtomicInteger;

public class ThreadLifecycleDemo {
    // AtomicInteger is thread-safe. A plain int here would be a data race.
    // We'll demonstrate BOTH to show the difference.
    private static final AtomicInteger safeCounter = new AtomicInteger(0);
    private static int unsafeCounter = 0; // <-- this WILL misbehave under concurrency

    public static void main(String[] args) throws InterruptedException {
        System.out.println("Main thread PID : " + ProcessHandle.current().pid());
        System.out.println("Main thread ID : " + Thread.currentThread().threadId());
        System.out.println("Main thread name : " + Thread.currentThread().getName());

        // --- Creating threads via Runnable (preferred over extending Thread) ---
        // Runnable separates the TASK from the execution mechanism.
        Runnable incrementTask = () -> {
            for (int i = 0; i < 1000; i++) {
                safeCounter.incrementAndGet(); // atomic: read-modify-write as one operation
                unsafeCounter++;               // NOT atomic: read, then modify, then write separately
            }
            System.out.println("Thread " + Thread.currentThread().getName()
                    + " finished. Safe counter now: " + safeCounter.get());
        };

        // Spawn 5 threads all running the same task
        Thread[] workers = new Thread[5];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(incrementTask, "Worker-" + (i + 1));
        }

        // Start all threads; the OS scheduler decides the actual execution order
        System.out.println("\nLaunching 5 worker threads...");
        for (Thread worker : workers) {
            worker.start(); // Moves thread from NEW state to RUNNABLE state
        }

        // join() blocks main thread until each worker finishes.
        // Without join(), main might print results before workers are done.
        for (Thread worker : workers) {
            worker.join();
        }

        System.out.println("\n=== Final Results (5 threads x 1000 increments = 5000 expected) ===");
        System.out.println("Safe counter : " + safeCounter.get()); // Always 5000
        System.out.println("Unsafe counter : " + unsafeCounter);   // Probably NOT 5000
    }
}
Output
Main thread PID : 19201
Main thread ID : 1
Main thread name : main
Launching 5 worker threads...
Thread Worker-1 finished. Safe counter now: 2000
Thread Worker-3 finished. Safe counter now: 3000
Thread Worker-2 finished. Safe counter now: 4000
Thread Worker-5 finished. Safe counter now: 4891
Thread Worker-4 finished. Safe counter now: 5000
=== Final Results (5 threads x 1000 increments = 5000 expected) ===
Safe counter : 5000
Unsafe counter : 4347
Watch Out: The unsafe counter won't always give the SAME wrong answer
Data races are non-deterministic. On one run you might get 4347, on the next 4891. That unpredictability is what makes them so dangerous in production — they pass your tests and then fail in the wild under load.
Production Insight
A data race in production often appears as 'intermittent wrong values' under load.
It passes unit tests because single-threaded tests don't trigger the race.
Rule: guard shared mutable state with AtomicX or synchronized; volatile alone guarantees visibility, not an atomic read-modify-write.
Key Takeaway
Threads share heap – communication is free, but synchronisation is mandatory.
A plain int incremented from two threads will produce wrong answers.
Always use thread-safe primitives or locks for shared mutable state.
Safe Concurrent Access in Java
If: Single variable updated by multiple threads → Use: AtomicInteger, AtomicLong, etc. for simplest atomic updates.
If: Multiple variables updated together (compound action) → Use: synchronized block or ReentrantLock to ensure atomicity.
If: Read-mostly, rare writes → Use: volatile or ReadWriteLock for higher read throughput.
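For the compound-action row, a sketch may help show why a pair of atomics is not enough when two fields must change together. StockLedger, reserveOne() and runDemo() are hypothetical names invented for this illustration; the point is only that both fields are read and written under the same lock, so no thread can observe one updated without the other.

```java
class StockLedger {
    private final Object lock = new Object();
    private int available;
    private int reserved;

    StockLedger(int initial) {
        this.available = initial;
    }

    // Moves one unit from available to reserved. The check and both updates
    // happen under one lock, so available + reserved never changes.
    boolean reserveOne() {
        synchronized (lock) {
            if (available == 0) {
                return false;
            }
            available--;
            reserved++;
            return true;
        }
    }

    int[] snapshot() {
        synchronized (lock) {
            return new int[] { available, reserved };
        }
    }

    // Starts `threads` workers that each attempt `attempts` reservations.
    static int[] runDemo(int initial, int threads, int attempts) {
        StockLedger ledger = new StockLedger(initial);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int a = 0; a < attempts; a++) {
                    ledger.reserveOne();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return ledger.snapshot();
    }

    public static void main(String[] args) {
        int[] result = runDemo(1000, 4, 500);
        System.out.println("available=" + result[0] + " reserved=" + result[1]);
    }
}
```

With 4 threads making 500 attempts each against 1000 units, every unit ends up reserved exactly once and the surplus attempts are rejected rather than driving the count negative.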
The OS Scheduler — Who Runs When, and Why It Matters to You
Having threads is great, but if you have 200 threads and only 8 CPU cores, not everyone can run simultaneously. The OS scheduler is the traffic cop that decides which thread runs on which core at any given millisecond.
Modern schedulers (Linux's CFS, Windows' multilevel feedback queue) use a combination of priority, fairness, and time-slicing. Each thread gets a small time slice — typically 1–10ms. When the slice expires, the scheduler preempts the thread (saves its register state into its thread control block) and picks the next candidate. This context switch has a real cost: saving and restoring registers, potentially invalidating CPU cache lines.
This is why spawning thousands of OS threads for a high-throughput server is a bad idea — the scheduler drowns in context switches before your actual work gets done. Java 21's Virtual Threads solve this by using a small pool of OS threads ('carrier threads') to run a huge number of lightweight JVM-managed threads, parking them when they block on I/O instead of consuming an OS thread the whole time.
VirtualThreadDemo.java (Java)
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.Executors;

public class VirtualThreadDemo {
    // Simulates a blocking I/O operation (like a database query or HTTP call)
    private static void simulateDatabaseQuery(int queryId) throws InterruptedException {
        // Thread.sleep() voluntarily yields the thread back to the scheduler.
        // With virtual threads, this PARKS the virtual thread (frees the carrier OS thread)
        // rather than blocking a real OS thread.
        Thread.sleep(50); // pretend this is a 50ms DB round-trip
        System.out.println("Query " + queryId + " complete on: " + Thread.currentThread());
    }

    public static void main(String[] args) throws InterruptedException {
        int numberOfTasks = 500; // try this with platform threads and watch it crawl

        // --- Approach 1: Traditional platform (OS) threads ---
        Instant platformStart = Instant.now();
        try (var platformExecutor = Executors.newFixedThreadPool(50)) {
            // Fixed pool of 50 OS threads handling 500 tasks.
            // Up to 450 tasks sit in the queue while the first 50 run.
            for (int i = 1; i <= numberOfTasks; i++) {
                final int taskId = i;
                platformExecutor.submit(() -> {
                    try { simulateDatabaseQuery(taskId); }
                    catch (InterruptedException e) { Thread.currentThread().interrupt(); }
                });
            }
        } // executor.close() waits for all tasks to finish (Java 19+ AutoCloseable)
        long platformMs = Duration.between(platformStart, Instant.now()).toMillis();

        // --- Approach 2: Virtual threads (Java 21+) ---
        Instant virtualStart = Instant.now();
        try (var virtualExecutor = Executors.newVirtualThreadPerTaskExecutor()) {
            // Creates a NEW virtual thread per task. That sounds expensive, but virtual
            // threads are so cheap (~1KB stack) the JVM creates them without hesitation.
            for (int i = 1; i <= numberOfTasks; i++) {
                final int taskId = i;
                virtualExecutor.submit(() -> {
                    try { simulateDatabaseQuery(taskId); }
                    catch (InterruptedException e) { Thread.currentThread().interrupt(); }
                });
            }
        }
        long virtualMs = Duration.between(virtualStart, Instant.now()).toMillis();

        System.out.println("\n=== Throughput Comparison: 500 tasks, each with 50ms I/O ===");
        System.out.println("Platform threads (pool of 50) : " + platformMs + " ms");
        System.out.println("Virtual threads : " + virtualMs + " ms");
        System.out.println("Speedup factor : ~" + (platformMs / Math.max(virtualMs, 1)) + "x");
    }
}
Output
Query 47 complete on: VirtualThread[#52]/runnable@ForkJoinPool-1-worker-3
Query 12 complete on: VirtualThread[#17]/runnable@ForkJoinPool-1-worker-1
... (500 lines of query completions) ...
=== Throughput Comparison: 500 tasks, each with 50ms I/O ===
Platform threads (pool of 50) : 551 ms
Virtual threads : 68 ms
Speedup factor : ~8x
Pro Tip: Virtual threads aren't faster for CPU-bound work
Virtual threads shine when threads spend most of their time waiting (I/O, sleep, locks). If your threads are crunching numbers non-stop, you still want a small pool sized to your CPU core count — more threads than cores means context-switch overhead with no benefit.
Production Insight
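Context switches cost roughly 1-2µs of direct CPU time each. At 100k switches/sec, that is 0.1-0.2 CPU-seconds every second: 10-20% of one core.
On a 16-core server that is only about 1% of total capacity; losing 10% of the machine (1.6 cores) would take roughly a million switches/sec, before counting indirect costs such as cold caches.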
Rule: keep active threads <= 2x CPU cores for CPU-bound; use async I/O or virtual threads for I/O-bound.
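A quick back-of-envelope check of this kind of estimate can be written down directly. The sketch below (SwitchBudget and fractionOfMachine are hypothetical names) only models the direct per-switch cost and assumes it is constant; real overhead also includes cache and TLB effects that this arithmetic ignores.

```java
class SwitchBudget {
    // Fraction of the whole machine's CPU capacity spent on the direct cost of switching.
    static double fractionOfMachine(double switchesPerSecond, double microsPerSwitch, int cores) {
        // CPU-seconds burned on switching during each wall-clock second
        double cpuSecondsPerSecond = switchesPerSecond * microsPerSwitch / 1_000_000.0;
        return cpuSecondsPerSecond / cores;
    }

    public static void main(String[] args) {
        System.out.printf("100k/s at 1us on 1 core   : %.2f%%%n",
                100 * fractionOfMachine(100_000, 1.0, 1));
        System.out.printf("100k/s at 2us on 16 cores : %.2f%%%n",
                100 * fractionOfMachine(100_000, 2.0, 16));
    }
}
```

Plugging in your own measured switch rate and core count gives a first-order sense of whether switching is worth investigating at all.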
Key Takeaway
The scheduler decides thread order, never assume execution order.
Thousands of runnable platform threads can push the scheduler into thrashing.
Virtual threads are a game-changer for I/O-bound services, but profile first.
Choosing Thread Type Based on Workload
If: Workload is CPU-bound (no I/O waits) → Use: platform threads with pool sized to Runtime.getRuntime().availableProcessors().
If: Workload is I/O-bound (HTTP calls, DB queries, file reads) → Use: virtual threads (Java 21+) or async frameworks (CompletableFuture, reactive).
If: Mixed workload, need legacy Java version (<21) → Use: a larger platform thread pool (e.g., 200 threads for a 16-core machine) but monitor context switching.
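For the CPU-bound row, one possible shape is a fixed pool with exactly one thread per core and the work split into one chunk per thread. The prime-counting workload and the names CpuBoundPool and countPrimes are assumptions made up for this sketch; any pure computation would do.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class CpuBoundPool {
    // Counts primes below `limit`, splitting the range into one chunk per core.
    static int countPrimes(int limit) {
        int cores = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(cores);
        try {
            int chunk = (limit + cores - 1) / cores;   // ceiling division
            List<Future<Integer>> parts = new ArrayList<>();
            for (int c = 0; c < cores; c++) {
                final int from = c * chunk;
                final int to = Math.min(limit, from + chunk);
                Callable<Integer> task = () -> countInRange(from, to);
                parts.add(pool.submit(task));
            }
            int total = 0;
            for (Future<Integer> part : parts) {
                total += part.get();                   // blocks until that chunk is done
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    private static int countInRange(int from, int to) {
        int count = 0;
        for (int n = Math.max(from, 2); n < to; n++) {
            if (isPrime(n)) {
                count++;
            }
        }
        return count;
    }

    private static boolean isPrime(int n) {
        for (int d = 2; (long) d * d <= n; d++) {
            if (n % d == 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("Primes below 10000: " + countPrimes(10_000));
    }
}
```

Equal-sized chunks are the simplest split, not the best one: later ranges are more expensive to test, so a real workload might use smaller chunks or a work-stealing pool.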
Thread States, Synchronisation, and Avoiding Deadlock
A thread isn't just 'running' or 'not running'. It moves through a state machine: NEW (created but not started), RUNNABLE (eligible to run, may or may not be on a core right now), BLOCKED (waiting to acquire a monitor lock), WAITING (parked via wait() or join() with no timeout), TIMED_WAITING (parked with a timeout, like sleep()), and TERMINATED (finished).
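Some of these states can be observed directly with Thread.getState(). The sketch below (ThreadStateProbe, awaitState and observe are hypothetical names) forces a worker into BLOCKED by holding the monitor it needs, then watches it reach TERMINATED once the monitor is released. It polls with a two-second cap, which is an arbitrary choice for the demo.

```java
class ThreadStateProbe {
    // Polls until the thread reaches the wanted state or roughly two seconds pass.
    static Thread.State awaitState(Thread t, Thread.State wanted) {
        long deadline = System.nanoTime() + 2_000_000_000L;
        while (t.getState() != wanted && System.nanoTime() < deadline) {
            Thread.onSpinWait();
        }
        return t.getState();
    }

    // Returns the states seen before start, while contending for a monitor, and after finishing.
    static Thread.State[] observe() {
        Object monitor = new Object();
        Thread worker = new Thread(() -> {
            synchronized (monitor) {
                // Nothing to do: the worker only needs the monitor that main is holding.
            }
        }, "Probe");

        Thread.State beforeStart = worker.getState();   // created but not started
        Thread.State whileBlocked;
        synchronized (monitor) {
            worker.start();
            // main still owns the monitor, so the worker cannot enter its synchronized block
            whileBlocked = awaitState(worker, Thread.State.BLOCKED);
        }
        Thread.State afterEnd = awaitState(worker, Thread.State.TERMINATED);
        return new Thread.State[] { beforeStart, whileBlocked, afterEnd };
    }

    public static void main(String[] args) {
        for (Thread.State state : observe()) {
            System.out.println(state);
        }
    }
}
```

getState() is a snapshot meant for monitoring and debugging; it should not be used to drive synchronisation logic.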
Understanding these states is critical for debugging. If a thread is stuck in BLOCKED for a long time, it's fighting for a lock. If it's in WAITING forever, something forgot to call notify(). Thread dumps — printable via kill -3 on Linux or jstack — show you every thread's state and stack trace at a point in time. That's how you diagnose production hangs.
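A thread dump can also be produced from inside the JVM, which is handy when shell access to the host is awkward. This is a minimal sketch using the standard ThreadMXBean; the class name InProcessThreadDump and the output layout are my own and only loosely imitate what jstack prints.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

class InProcessThreadDump {
    // Renders name, state and the top stack frames of every live thread in this JVM.
    static String dump(int maxFrames) {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        StringBuilder out = new StringBuilder();
        // false, false: skip locked-monitor and locked-synchronizer details to keep it cheap
        for (ThreadInfo info : bean.dumpAllThreads(false, false)) {
            out.append('"').append(info.getThreadName()).append("\" state=")
               .append(info.getThreadState()).append('\n');
            StackTraceElement[] frames = info.getStackTrace();
            for (int i = 0; i < Math.min(maxFrames, frames.length); i++) {
                out.append("    at ").append(frames[i]).append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump(5));
    }
}
```

Taking a dump briefly pauses the JVM, so it is better triggered on demand or by an alert than on a tight schedule.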
Deadlock is the most feared concurrency bug: Thread A holds Lock 1 and waits for Lock 2, while Thread B holds Lock 2 and waits for Lock 1. Neither can proceed. The fix is to always acquire multiple locks in a consistent global order across all threads — if everyone agrees 'Lock 1 before Lock 2', the circular dependency is impossible.
DeadlockPreventionDemo.java (Java)
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

public class DeadlockPreventionDemo {
    // Two shared resources; imagine these are bank accounts
    private static final Lock accountAlpha = new ReentrantLock();
    private static final Lock accountBeta = new ReentrantLock();

    // DEADLOCK-PRONE version: each thread acquires locks in OPPOSITE order
    static void transferDeadlockProne(String threadName, boolean reverseOrder)
            throws InterruptedException {
        Lock firstLock = reverseOrder ? accountBeta : accountAlpha;
        Lock secondLock = reverseOrder ? accountAlpha : accountBeta;

        firstLock.lock();
        try {
            System.out.println(threadName + " acquired first lock, waiting for second...");
            Thread.sleep(50); // makes the race window obvious in demos
            secondLock.lock();
            try {
                System.out.println(threadName + " transferred funds (deadlock-prone path)");
            } finally {
                secondLock.unlock();
            }
        } finally {
            firstLock.unlock(); // released even if sleep() is interrupted
        }
    }

    // SAFE version: both threads ALWAYS acquire locks in the same order (alpha → beta)
    static void transferSafe(String threadName) throws InterruptedException {
        // Consistent global ordering: always lock accountAlpha before accountBeta.
        // No matter how many threads call this, circular wait is impossible.
        accountAlpha.lock();
        try {
            System.out.println(threadName + " acquired alpha lock");
            Thread.sleep(20);
            accountBeta.lock();
            try {
                System.out.println(threadName + " acquired beta lock — transfer complete!");
            } finally {
                accountBeta.unlock();
            }
        } finally {
            accountAlpha.unlock(); // always unlock in reverse order of acquisition
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("=== Safe Transfer Demo (consistent lock ordering) ===");
        Thread sender = new Thread(() -> {
            try { transferSafe("Sender"); }
            catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        }, "Sender");
        Thread receiver = new Thread(() -> {
            try { transferSafe("Receiver"); }
            catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        }, "Receiver");

        sender.start();
        receiver.start();
        sender.join();
        receiver.join();
        System.out.println("Both transfers completed. No deadlock.");

        // To observe the current thread state programmatically:
        Thread monitorThread = new Thread(() -> {
            try { Thread.sleep(1000); } // TIMED_WAITING during sleep
            catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        }, "MonitorThread");
        monitorThread.start();
        Thread.sleep(10); // let monitorThread enter sleep before we check
        System.out.println("\nMonitorThread state: " + monitorThread.getState()); // TIMED_WAITING
        monitorThread.join();
        System.out.println("MonitorThread state: " + monitorThread.getState()); // TERMINATED
    }
}
Output
=== Safe Transfer Demo (consistent lock ordering) ===
Sender acquired alpha lock
Sender acquired beta lock — transfer complete!
Receiver acquired alpha lock
Receiver acquired beta lock — transfer complete!
Both transfers completed. No deadlock.
MonitorThread state: TIMED_WAITING
MonitorThread state: TERMINATED
Interview Gold: How do you detect a deadlock in production?
Run 'jstack <PID>' or use JVisualVM to take a thread dump. Look for 'Found one Java-level deadlock' in the output — the JVM actually detects cycles in lock dependency graphs and reports them explicitly. Knowing this command exists will impress interviewers.
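The same cycle detection is reachable from code through ThreadMXBean.findDeadlockedThreads(), which could back a health check. The sketch below deliberately builds a two-thread deadlock, waits for the JVM to report it, then interrupts both threads to clean up. DeadlockDetectorSketch and its helper names are hypothetical, the five-second polling cap is arbitrary, and the call assumes the JVM supports synchronizer monitoring (HotSpot does).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.LockSupport;
import java.util.concurrent.locks.ReentrantLock;

class DeadlockDetectorSketch {
    // Creates a deliberate two-thread deadlock and returns how many threads the JVM reports as stuck.
    static int detectDeliberateDeadlock() {
        ReentrantLock lockA = new ReentrantLock();
        ReentrantLock lockB = new ReentrantLock();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        Thread t1 = worker("Deadlock-1", lockA, lockB, bothHoldFirst);
        Thread t2 = worker("Deadlock-2", lockB, lockA, bothHoldFirst);
        t1.start();
        t2.start();

        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        long[] stuck = null;
        long deadline = System.nanoTime() + 5_000_000_000L;
        while (stuck == null && System.nanoTime() < deadline) {
            stuck = bean.findDeadlockedThreads();   // null means no deadlock found yet
            LockSupport.parkNanos(10_000_000L);     // pause ~10 ms between checks
        }

        // lockInterruptibly() in the workers lets an interrupt break the cycle for cleanup.
        t1.interrupt();
        t2.interrupt();
        return stuck == null ? 0 : stuck.length;
    }

    private static Thread worker(String name, ReentrantLock first, ReentrantLock second,
                                 CountDownLatch bothHoldFirst) {
        Thread t = new Thread(() -> {
            first.lock();
            try {
                bothHoldFirst.countDown();
                bothHoldFirst.await();              // wait until the other thread holds its first lock
                second.lockInterruptibly();         // opposite order per thread: circular wait
                second.unlock();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                first.unlock();
            }
        }, name);
        t.setDaemon(true);                          // never keep the JVM alive because of the demo
        return t;
    }

    public static void main(String[] args) {
        System.out.println("Deadlocked threads detected: " + detectDeliberateDeadlock());
    }
}
```

In a real service the detection loop would run on a scheduled background task and log the offending ThreadInfo entries rather than interrupting anything.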
Production Insight
Deadlock symptoms: app freezes, thread dumps show circular wait.
Without jstack, you'd restart and never know the root cause.
Always keep a script to take thread dumps on CPU >80% or hung request alerts.
Key Takeaway
Deadlock is prevented by consistent lock ordering.
jstack is your first tool for diagnosing thread hangs.
Deadlock Prevention Strategies
If: Multiple locks must be acquired → Use: the same global acquisition order across all threads.
If: You cannot guarantee lock order (e.g., calling external library) → Use: ReentrantLock.tryLock() with a timeout and handle failure gracefully (release all locks, retry).
If: Shared state is read-mostly → Use: ReadWriteLock or StampedLock to allow concurrent reads.
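The tryLock-with-timeout strategy can be sketched as a helper that either holds both locks while it runs an action or backs out holding nothing. TryLockTransfer, withBothLocks, withRetry and demo are hypothetical names; the 50 ms timeout and the 1-5 ms random pause are arbitrary values picked for the illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.LockSupport;
import java.util.concurrent.locks.ReentrantLock;

class TryLockTransfer {
    // Runs `action` while holding both locks; on timeout it backs out and holds nothing.
    static boolean withBothLocks(Lock first, Lock second, long timeoutMs, Runnable action) {
        try {
            if (!first.tryLock(timeoutMs, TimeUnit.MILLISECONDS)) {
                return false;
            }
            try {
                if (!second.tryLock(timeoutMs, TimeUnit.MILLISECONDS)) {
                    return false;                   // the finally below still releases `first`
                }
                try {
                    action.run();
                    return true;
                } finally {
                    second.unlock();
                }
            } finally {
                first.unlock();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // Retries with a short random pause so two contenders do not stay in lockstep.
    static boolean withRetry(Lock first, Lock second, int attempts, Runnable action) {
        for (int i = 0; i < attempts; i++) {
            if (withBothLocks(first, second, 50, action)) {
                return true;
            }
            LockSupport.parkNanos(ThreadLocalRandom.current().nextLong(1_000_000L, 5_000_000L));
        }
        return false;
    }

    // Demo: one uncontended call, then one call while another thread holds the second lock.
    // Returns { uncontended result, contended result, whether alpha is still locked }.
    static boolean[] demo() {
        ReentrantLock alpha = new ReentrantLock();
        ReentrantLock beta = new ReentrantLock();
        boolean uncontended = withBothLocks(alpha, beta, 50, () -> { });

        CountDownLatch betaHeld = new CountDownLatch(1);
        CountDownLatch done = new CountDownLatch(1);
        Thread holder = new Thread(() -> {
            beta.lock();
            try {
                betaHeld.countDown();
                done.await();                       // keep beta until the demo is finished
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                beta.unlock();
            }
        }, "BetaHolder");
        holder.setDaemon(true);
        holder.start();

        boolean contended = false;
        try {
            betaHeld.await();
            contended = withBothLocks(alpha, beta, 50, () -> { });
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            done.countDown();
        }
        return new boolean[] { uncontended, contended, alpha.isLocked() };
    }

    public static void main(String[] args) {
        boolean[] r = demo();
        System.out.println("uncontended=" + r[0] + " contended=" + r[1] + " alphaStillLocked=" + r[2]);
    }
}
```

Timeouts turn a potential permanent hang into a recoverable failure, at the price of having to decide what the caller does when the retry budget runs out.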
Process States and Context Switching — How the OS Manages the Microscopic Juggle
A process isn't always running either. It moves through states: NEW (being created), READY (waiting for CPU), RUNNING (executing on a core), BLOCKED (waiting for I/O or event), and TERMINATED. The OS scheduler moves processes between READY and RUNNING so many times per second that humans perceive concurrency as parallelism.
But this movement has a price: context switching. When the OS swaps one process out and another in, it must save the entire CPU register set, switch to the new process's page tables, and (on CPUs without tagged TLB entries) flush the TLB (translation lookaside buffer). That's why process context switches are comparatively heavy (roughly 5–10µs). Thread switches within the same process are lighter (roughly 1–2µs) because they share the same address space, so the TLB usually survives. Treat both figures as order-of-magnitude estimates; they vary with hardware and kernel.
Understanding this cost changes how you architect. If you have 200 processes all doing 1ms of work, you'll spend more time switching than computing. That's why event-driven architectures (NGINX, Node.js) or virtual threads exist — they minimise expensive context switches by keeping work on the same thread or using lightweight concurrency.
ContextSwitchSimulator.java (Java)
import java.util.concurrent.CountDownLatch;

public class ContextSwitchSimulator {
    private static final int NUM_PROCESSES = 100;
    private static final int WORK_UNITS = 100_000;
    private static volatile long sink; // consumes results so the JIT cannot discard the loop

    public static void main(String[] args) throws InterruptedException {
        long start = System.nanoTime();
        CountDownLatch latch = new CountDownLatch(NUM_PROCESSES);

        // Simulate many processes by spawning many threads (each thread = one process-like workload)
        for (int i = 0; i < NUM_PROCESSES; i++) {
            final int id = i;
            new Thread(() -> {
                // Simulate CPU work: busy spin
                long sum = 0;
                for (int j = 0; j < WORK_UNITS; j++) {
                    sum += j * (id & 7); // artificial work
                }
                sink = sum;
                latch.countDown();
            }).start();
        }

        latch.await(); // wait for all threads
        long elapsed = System.nanoTime() - start;
        System.out.println(NUM_PROCESSES + " threads completed " + WORK_UNITS + " units each in "
                + elapsed / 1_000_000 + " ms");
        // Wall-clock time divided by thread count. This includes thread creation, the work
        // itself and scheduling, so it is NOT a direct context-switch measurement.
        System.out.printf("Average wall-clock time per thread: ~%.2f ms (rough estimate)%n",
                elapsed / 1_000_000.0 / NUM_PROCESSES);
        // In reality, the OS schedules threads on available cores; context switch overhead is baked in.
    }
}
Output
100 threads completed 100000 units each in 212 ms
Average wall-clock time per thread: ~2.12 ms (rough estimate)
Context Switch Analogy
Each recipe has its own ingredients (memory map) and tools (registers).
If the chef switches recipes every minute (time slice), the kitchen loses time to cleanup/setup.
Switching between two dishes from the same cuisine (threads in same process) is faster than switching from Italian to Chinese (different processes).
The scheduler decides the recipe order; too many recipes per second means less cooking, more cleanup.
Production Insight
Sustained high context switching (roughly >100k/sec on Linux, as a rule of thumb) is a symptom of oversubscription.
Check with 'vmstat 1' (cs column) or 'perf stat -e context-switches'.
If cs stays above ~50k/sec, investigate: reduce thread count or switch to asynchronous processing.
Key Takeaway
Context switching is not free; it costs microseconds.
Process switches are heavier than thread switches.
Measure context switch rate before tuning thread counts.
Minimising Context Switch Impact
If: Application does small CPU bursts (e.g., 1ms) per request → Use: batching or an event loop (single thread) to avoid switching.
If: Application does I/O waits (sleep, read, write) → Use: virtual threads or async I/O to block only lightweight entities, not OS threads.
If: You have many long-running CPU tasks → Use: a thread pool sized to the number of cores; don't exceed it unless I/O waits are involved.
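The event-loop option can be approximated in plain Java with a single-threaded executor: every small burst is queued and run on the same thread, so the tasks never contend with each other and their shared state needs no lock. TinyEventLoop and runBursts are hypothetical names for this sketch, and it is a toy model rather than how NGINX or Node.js are actually implemented.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class TinyEventLoop {
    // Runs many short tasks on one thread; returns { tasks handled, distinct threads used }.
    static int[] runBursts(int tasks) {
        ExecutorService loop = Executors.newSingleThreadExecutor();
        Set<String> threadsSeen = ConcurrentHashMap.newKeySet();
        int[] handled = {0};   // only ever touched by the loop thread, so no lock is needed
        try {
            for (int i = 0; i < tasks; i++) {
                loop.execute(() -> {
                    threadsSeen.add(Thread.currentThread().getName());
                    handled[0]++;
                });
            }
            // The final read also runs on the loop thread; Future.get() hands it to the caller.
            int total = loop.submit(() -> handled[0]).get();
            return new int[] { total, threadsSeen.size() };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            loop.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] result = runBursts(1000);
        System.out.println(result[0] + " tasks handled on " + result[1] + " thread(s)");
    }
}
```

The trade-off is that one slow task stalls everything queued behind it, which is why event loops pair badly with blocking calls.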
Production incident (post-mortem, severity: high)
The Vanishing HTTP Requests – Thread Pool Exhaustion from Blocking I/O Inside Sync Blocks
Symptom
Requests taking >5s, thread dumps showing dozens of threads in BLOCKED state on the same lock, and CPU usage below 20%.
Assumption
The team assumed the database was slow and added connection pool size. No improvement.
Root cause
A synchronized block around the entire request handler included a slow external HTTP call. Every thread waited for the lock, effectively serializing all I/O-bound work.
Fix
Refactored the handler: moved the HTTP call outside the synchronized block, used CompletableFuture for async I/O, and limited the lock only to the shared state update (~2ms).
Key lesson
Blocking I/O inside a synchronized block is a production killer – it reduces concurrency to 1 for that critical section.
Always profile thread states under load before adding more threads; a BLOCKED pileup means lock contention, not thread starvation.
Use 'jstack <pid>' or 'jcmd <pid> Thread.print' to capture thread dumps – look for the thread stack that holds the lock everyone waits on.
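The shape of that fix can be sketched as follows: the slow call happens with no lock held, and the synchronized block shrinks to the shared-state update. NarrowLockHandler and slowRemoteCall are hypothetical stand-ins (the "remote call" is just a ~20 ms park), and this sketch omits the CompletableFuture part of the original fix.

```java
import java.util.concurrent.locks.LockSupport;

class NarrowLockHandler {
    private final Object lock = new Object();
    private int handled;   // shared state, guarded by `lock`

    // Hypothetical stand-in for a slow external HTTP call.
    private static String slowRemoteCall(int requestId) {
        LockSupport.parkNanos(20_000_000L);   // waits roughly 20 ms (may return early)
        return "response-" + requestId;
    }

    // The slow call runs outside the lock; only the counter update is serialised.
    String handle(int requestId) {
        String response = slowRemoteCall(requestId);
        synchronized (lock) {
            handled++;
        }
        return response;
    }

    int handledCount() {
        synchronized (lock) {
            return handled;
        }
    }

    // Fires `requests` concurrent handlers; returns { requests handled, elapsed ms }.
    static long[] run(int requests) {
        NarrowLockHandler handler = new NarrowLockHandler();
        Thread[] threads = new Thread[requests];
        long start = System.nanoTime();
        for (int i = 0; i < requests; i++) {
            final int id = i;
            threads[i] = new Thread(() -> handler.handle(id));
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        return new long[] { handler.handledCount(), elapsedMs };
    }

    public static void main(String[] args) {
        long[] result = run(20);
        System.out.println(result[0] + " requests handled in ~" + result[1] + " ms");
    }
}
```

Moving slowRemoteCall() inside the synchronized block would make the 20 waits run one after another, which is the serialisation the incident describes.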
Production debug guide
Symptom → Action guide for common process/thread problems
Symptom 01: High CPU usage but requests are slow
Fix: Check for excessive context switching (vmstat 1, look at 'cs' column). If >100k/s, reduce thread count or switch to async I/O.
Symptom 02: Application hangs, no progress
Fix: Take a thread dump (jstack <pid>). Look for threads in BLOCKED state or a 'Found one Java-level deadlock' message.
Symptom 03: Thread dump shows many threads in WAITING state on a Condition
Fix: Find the lock owner thread. If it's stuck in an infinite loop or sleeping with a lock, that's a bug. Use tryLock with timeout to avoid indefinite blocking.
Symptom 04: Child process never exits or zombie process
Fix: Ensure the parent calls waitFor() or handles Process.destroy(). On Linux, check 'ps aux | grep defunct' and kill parent if needed.
Quick Debug Cheat Sheet: Process & Thread Issues
Commands to diagnose deadlocks, thread states, and process hangs
Application hanging (suspected deadlock)
Immediate action
Run jstack <PID> or kill -3 <PID>
Commands
jstack <PID> | grep -A 10 'Found one Java-level deadlock'
jcmd <PID> Thread.print
Fix now
If deadlock found, restart the application and apply consistent lock ordering.
High context switching (cs column in vmstat > 50k/sec)
Immediate action
Check number of active threads (top -H -p <PID>, count threads)
Commands
vmstat 1 5 | tail -4 | awk '{print $12}'
ps -eLf | wc -l
Fix now
Reduce thread pool size or use virtual threads / async I/O.
Thread in BLOCKED state on a specific lock
Immediate action
Get the lock owner from thread dump.
Commands
jstack <PID> | grep -B 1 -A 10 'BLOCKED'
jstack <PID> | grep 'waiting to lock'
Fix now
Refactor to reduce lock hold time or use ReentrantLock.tryLock() with timeout.
Process vs Thread
Aspect | Process | Thread
Memory space | Own private virtual address space | Shared heap with sibling threads
Creation cost | High: OS allocates new address space, PCB, file table | Low: reuses the process's address space; needs only its own stack and register state
Communication | Needs IPC (pipes, sockets) and serialisation | Direct shared memory (fast but needs synchronisation)
Crash isolation | Crash stays contained; other processes unaffected | Unhandled exception can crash the entire process
Context switch cost | High: TLB flush, memory map swap | Lower: same address space, just register state swap
Java creation | ProcessBuilder / Runtime.exec() | new Thread() / Executors / virtual threads
Best for | Fault isolation (microservices, browser tabs) | High-throughput concurrency within one application
Typical overhead | ~1–8 MB per process (OS page tables + stack) | ~512 KB–1 MB stack per OS thread (platform and -Xss dependent); ~1 KB virtual thread (Java 21+)
Key takeaways
1. A process is isolated by design: its own memory space means a crash or bug stays contained. That isolation costs time and memory, so use processes at architectural boundaries (services, browser tabs), not for every concurrent task.
2. Threads share heap memory, which makes communication fast but requires synchronisation discipline. A plain int incremented by two threads without an AtomicInteger or synchronized block WILL produce wrong answers, and not consistently, which is what makes it dangerous.
3. The OS scheduler doesn't run threads in the order you start them. Never write code whose correctness depends on thread execution order. Use join(), CountDownLatch, or CompletableFuture to coordinate, not Thread.sleep() with magic numbers.
4. Java 21 Virtual Threads change the calculus for I/O-bound work: you can now use one-thread-per-request style code without paying the OS thread cost. But for CPU-bound tasks, a fixed thread pool sized to Runtime.getRuntime().availableProcessors() is still the right answer.
5. Context switching is not free: process switches cost roughly 5-10µs, thread switches roughly 1-2µs. Measure before you optimise; always profile under realistic load.
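As an illustration of coordinating without Thread.sleep(), here is a small sketch that runs two lookups concurrently and combines them only when both have completed. The class CoordinateWithoutSleep and the loadPrice/loadStock helpers are hypothetical placeholders for real I/O calls; CompletableFuture.supplyAsync, thenCombine and join are standard API.

```java
import java.util.concurrent.CompletableFuture;

class CoordinateWithoutSleep {
    // Hypothetical lookups; in a real service these would be I/O calls.
    static int loadPrice(String sku) {
        return sku.length() * 10;
    }

    static int loadStock(String sku) {
        return sku.length();
    }

    // Runs both lookups concurrently and combines them once BOTH have finished.
    static int inventoryValue(String sku) {
        CompletableFuture<Integer> price = CompletableFuture.supplyAsync(() -> loadPrice(sku));
        CompletableFuture<Integer> stock = CompletableFuture.supplyAsync(() -> loadStock(sku));
        // thenCombine waits on completion signals, not on a guessed delay.
        return price.thenCombine(stock, (p, s) -> p * s).join();
    }

    public static void main(String[] args) {
        System.out.println("Inventory value: " + inventoryValue("widget"));
    }
}
```

The result is correct regardless of which lookup finishes first, which is exactly the property a sleep-based wait cannot guarantee.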
Common mistakes to avoid
3 patterns
×
Calling thread.run() instead of thread.start()
Symptom
The thread appears to 'work' but actually runs synchronously on the calling thread, no new thread is ever created. Your code runs sequentially and you wonder why there's no parallelism.
Fix
Always call thread.start() — this is what tells the OS to create a new thread and schedule it. run() is just a regular method call.
×
Sharing mutable state between threads without synchronisation
Symptom
You see intermittent wrong values or stale reads in production that you can't reproduce in tests.
Fix
Use volatile for single-variable visibility, AtomicInteger/AtomicReference for single-variable atomic updates, or synchronized blocks for compound operations. Never assume a write in Thread A is immediately visible to Thread B without a memory barrier.
Calling blocking I/O inside a synchronized block
Symptom
You hold a lock while waiting for a network call to return (which may take seconds), blocking every other thread that needs that lock. This turns into a production slowdown under load that looks like deadlock but isn't.
Fix
Do all I/O outside the synchronized block; only lock around the minimal state mutation. Better yet, use java.util.concurrent structures like ConcurrentHashMap that handle their own thread safety.
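One way to structure this is sketched below, assuming a hypothetical PriceCache whose fetchPrice method stands in for a slow remote call (simulated here with a short sleep). None of these names come from a real library.

```java
import java.util.HashMap;
import java.util.Map;

class PriceCache {
    private final Object lock = new Object();
    private final Map<String, Integer> prices = new HashMap<>();

    // Hypothetical slow remote call, simulated with a short sleep.
    private int fetchPrice(String sku) {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return sku.length() * 100;
    }

    int refresh(String sku) {
        int fresh = fetchPrice(sku);      // slow part: no lock held
        synchronized (lock) {             // fast part: lock covers only the map write
            prices.put(sku, fresh);
        }
        return fresh;
    }

    Integer get(String sku) {
        synchronized (lock) {
            return prices.get(sku);
        }
    }

    public static void main(String[] args) {
        PriceCache cache = new PriceCache();
        cache.refresh("abc");
        System.out.println(cache.get("abc")); // prints 300
    }
}
```

The trade-off is that two threads refreshing the same key may both perform the fetch, with the last write winning. That is usually acceptable for a cache; if it is not, the duplicate work has to be prevented some other way than by holding the lock across the call.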
INTERVIEW PREP · PRACTICE MODE
Interview Questions on This Topic
Q01 of 03 · JUNIOR
What is the difference between a process and a thread, and when would you choose one over the other?
ANSWER
A strong answer covers memory isolation, IPC overhead, crash containment, and gives a concrete example: 'I'd use separate processes for a microservice boundary where a crash in the payment service must not bring down the inventory service; I'd use threads within a service to handle concurrent HTTP requests sharing an in-memory cache.'
Q02 of 03 · SENIOR
Explain what a deadlock is and describe a strategy to prevent it without just 'avoiding locks altogether'.
ANSWER
Interviewers want to hear lock ordering (always acquire in a consistent global sequence), tryLock with a timeout (ReentrantLock.tryLock(timeout, unit), backing off and retrying when acquisition fails), and lock-free data structures. Bonus points for mentioning jstack as a diagnostic tool.
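Lock ordering is the easiest of these to show on a whiteboard. The sketch below assumes a hypothetical Account type with a unique id that doubles as the global lock order; the names are illustrative.

```java
class OrderedTransfer {

    static final class Account {
        final int id;          // assumed unique; used as the global lock order
        long balance;
        Account(int id, long balance) { this.id = id; this.balance = balance; }
    }

    // Always locks the lower id first, whichever direction the money moves,
    // so two opposing transfers can never wait on each other in a cycle.
    static void transfer(Account from, Account to, long amount) {
        Account first  = from.id < to.id ? from : to;
        Account second = from.id < to.id ? to : from;
        synchronized (first) {
            synchronized (second) {
                from.balance -= amount;
                to.balance   += amount;
            }
        }
    }

    // Two threads transfer in opposite directions; returns the final balances.
    static long[] runOpposingTransfers(int rounds) {
        Account a = new Account(1, 1_000);
        Account b = new Account(2, 1_000);
        Thread t1 = new Thread(() -> { for (int i = 0; i < rounds; i++) transfer(a, b, 1); });
        Thread t2 = new Thread(() -> { for (int i = 0; i < rounds; i++) transfer(b, a, 1); });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new long[] { a.balance, b.balance };
    }

    public static void main(String[] args) {
        long[] balances = runOpposingTransfers(100_000);
        System.out.println(balances[0] + " / " + balances[1]); // prints "1000 / 1000"
    }
}
```

If transfer locked from and then to instead, the two threads could each hold one account and wait forever for the other, which is exactly the cycle jstack would report.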
Q03 of 03 · SENIOR
What is a race condition, and how is it different from a deadlock? Can you have both at once?
ANSWER
This trips people up. A race condition is non-deterministic incorrect behaviour caused by unsynchronised access; a deadlock is a permanent standstill. You CAN have both: poor synchronisation attempts that partially protect state can create both a window for races AND introduce lock-order cycles. A strong answer shows you understand they have different root causes and different fixes.
FAQ · 5 QUESTIONS
Frequently Asked Questions
01
What happens to child threads when the main thread finishes in Java?
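The JVM exits once every non-daemon thread has finished. If the main thread ends while only daemon threads are still running, the JVM shuts down and abandons those daemon threads mid-task, with no guarantee that their finally blocks run. A thread created with new Thread() inherits its daemon status from the thread that created it, so workers started from main are non-daemon and the JVM waits for them. Virtual threads are the exception: they are always daemon threads, calling setDaemon(false) on one throws IllegalArgumentException, and the JVM will not wait for them unless you join them or close their executor. Call thread.setDaemon(true) before start() to make a platform thread a daemon.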
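The daemon flag rules for platform threads can be checked directly. This is a small sketch; the class DaemonFlag and its two helper methods are names invented for the example.

```java
import java.util.concurrent.CountDownLatch;

class DaemonFlag {

    // A new Thread inherits the daemon flag of the thread that creates it.
    static boolean inheritsCreatorFlag() {
        Thread worker = new Thread(() -> { });
        return worker.isDaemon() == Thread.currentThread().isDaemon();
    }

    // setDaemon must be called before start(); on a live thread it throws.
    static boolean setDaemonAfterStartFails() {
        CountDownLatch release = new CountDownLatch(1);
        Thread background = new Thread(() -> {
            try {
                release.await();          // keep the thread alive for the check
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "background");
        background.setDaemon(true);       // legal: not started yet
        background.start();
        try {
            background.setDaemon(false);  // illegal: thread is already alive
            return false;
        } catch (IllegalThreadStateException expected) {
            return true;
        } finally {
            release.countDown();          // let the background thread finish
        }
    }

    public static void main(String[] args) {
        System.out.println("inherits creator's flag: " + inheritsCreatorFlag());
        System.out.println("late setDaemon rejected: " + setDaemonAfterStartFails());
    }
}
```

Both lines print true. Because main is a non-daemon thread, any thread it creates starts out non-daemon as well.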
02
Is multi-threading always faster than single-threading?
No, and this is one of the most common misconceptions. Multi-threading adds overhead from thread creation, context switching, and synchronisation. For a short task, the cost of spawning and joining a thread can rival or exceed the work itself, giving you a net loss. Multi-threading pays off when tasks are long-running, blocked on I/O, or naturally parallel and large enough that the parallelism gain outweighs the coordination cost. Always benchmark before assuming.
03
What is the difference between synchronized and ReentrantLock in Java?
Both provide mutual exclusion, but ReentrantLock gives you more control. With ReentrantLock you can call tryLock() to attempt acquisition without blocking, or tryLock(timeout, unit) to give up after a bounded wait (critical for deadlock prevention), use lockInterruptibly() so a thread can be interrupted while waiting, and create separate Condition objects for fine-grained wait/notify semantics. The synchronized keyword is simpler and less error-prone for straightforward cases since the lock is always released when the block exits, whereas a ReentrantLock must be released explicitly in a finally block. Prefer synchronized for simple critical sections; reach for ReentrantLock when you need timeouts, interruptibility, or multiple conditions.
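Here is a minimal sketch of the timed tryLock pattern, which synchronized cannot express. TimedLock, tryUpdate, and demo are hypothetical names; only ReentrantLock, CountDownLatch, and TimeUnit are real JDK types.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

class TimedLock {
    private final ReentrantLock lock = new ReentrantLock();
    private int value;

    // Attempts the update, waiting at most waitMillis for the lock.
    // Returns false instead of blocking forever when the lock stays busy.
    boolean tryUpdate(int newValue, long waitMillis) {
        boolean acquired = false;
        try {
            acquired = lock.tryLock(waitMillis, TimeUnit.MILLISECONDS);
            if (!acquired) {
                return false;
            }
            value = newValue;
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            if (acquired) {
                lock.unlock(); // unlike synchronized, releasing is our job
            }
        }
    }

    // Returns { result with the lock free, result while another thread holds it }.
    static boolean[] demo() {
        TimedLock shared = new TimedLock();
        boolean uncontended = shared.tryUpdate(1, 100);

        CountDownLatch held = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        Thread holder = new Thread(() -> {
            shared.lock.lock();
            try {
                held.countDown();
                release.await();       // keep the lock until the demo is done
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                shared.lock.unlock();
            }
        });
        holder.start();
        try {
            held.await();
            boolean contended = shared.tryUpdate(2, 50); // gives up after ~50ms
            release.countDown();
            holder.join();
            return new boolean[] { uncontended, contended };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            release.countDown();       // never leave the holder thread stuck
        }
    }

    public static void main(String[] args) {
        boolean[] r = demo();
        System.out.println("lock free: " + r[0] + ", lock held elsewhere: " + r[1]);
    }
}
```

This prints "lock free: true, lock held elsewhere: false". A caller that gets false back can retry, back off, or report the failure, none of which is possible once a thread is parked waiting on a synchronized block.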
04
How do you choose between platform threads and virtual threads in Java 21+?
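Use platform threads for CPU-bound work: each is backed one-to-one by an OS thread, and a pool sized to the core count keeps every core busy. Use virtual threads for I/O-bound work where threads spend most of their time waiting. Virtual threads are far cheaper because their stacks live on the heap and start small (on the order of a kilobyte), whereas a platform thread typically reserves 512KB to 1MB of stack depending on the platform, so virtual threads can be created in the millions. Watch out for pinning: in Java 21, a virtual thread that blocks inside a synchronized block or a native method cannot unmount from its carrier thread, which defeats the purpose. Guard long blocking calls with ReentrantLock instead (JDK 24 lifts the synchronized limitation).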
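The two choices might look like this side by side. This sketch requires Java 21 or later; the class ExecutorChoice, both method names, and the simulated workloads (a short sleep for I/O, an integer sum for CPU) are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

class ExecutorChoice {

    // I/O-bound: one cheap virtual thread per task (Java 21+).
    static int runBlockingTasks(int tasks) {
        AtomicInteger finished = new AtomicInteger();
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < tasks; i++) {
                executor.submit(() -> {
                    try {
                        Thread.sleep(10);   // stand-in for a blocking network call
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                    finished.incrementAndGet();
                });
            }
        } // close() waits for the submitted tasks to finish
        return finished.get();
    }

    // CPU-bound: fixed pool with one platform thread per available core.
    static long sumTo(int n) {
        int cores = Runtime.getRuntime().availableProcessors();
        try (ExecutorService pool = Executors.newFixedThreadPool(cores)) {
            List<Future<Long>> parts = new ArrayList<>();
            int chunk = (n + cores - 1) / cores;   // ceiling division
            for (int start = 1; start <= n; start += chunk) {
                final int lo = start;
                final int hi = Math.min(n, start + chunk - 1);
                parts.add(pool.submit(() -> {
                    long sum = 0;
                    for (int i = lo; i <= hi; i++) sum += i;
                    return sum;
                }));
            }
            long total = 0;
            for (Future<Long> part : parts) total += part.get();
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("blocking tasks finished: " + runBlockingTasks(1_000));
        System.out.println("sum 1..100 = " + sumTo(100)); // prints "sum 1..100 = 5050"
    }
}
```

A thousand sleeping virtual threads finish in roughly the time of one sleep, while the CPU-bound sum gains nothing from having more threads than cores.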
05
What is the difference between a process and a thread in terms of debugging?
Debugging a process is easier because each process is isolated: you can attach a debugger independently, and a crash in one process doesn't affect others. Threads share memory, so a bug in one thread can corrupt data used by others. Thread dumps (jstack) show you all threads in the process, but you have to correlate their states yourself, for example working out which thread holds the lock that another is BLOCKED on. For processes, you have separate logs, separate file descriptors, and can restart one without restarting others.