Senior 5 min · March 06, 2026

Blocking I/O Inside Sync Blocks — Thread Management Killer

Requests >5s, threads BLOCKED on one lock, CPU <20% → thread pool exhaustion from blocked I/O inside sync blocks.

Naren · Founder
Plain-English first. Then code. Then the interview question.
Quick Answer
  • Process: isolated OS unit with own memory, expensive to create
  • Thread: lightweight, shares heap with siblings, faster context-switch
  • Scheduler preempts threads every 1–10ms; context switch has real cost
  • Java: ProcessBuilder for processes, Thread class for threads, Virtual Threads for scale
  • Pitfall: data race from unsynchronised shared state; use AtomicInteger or synchronized
  • Debugging: jstack finds deadlocks; thread dumps show BLOCKED/WAITING states
Plain-English First

Imagine a restaurant kitchen. Each dish on the menu is a process — it has its own ingredients, its own space on the counter, and its own set of instructions. The chefs actually cooking that dish are threads — multiple chefs can work on the same dish at the same time, sharing the same counter space. The head chef (the OS scheduler) decides who cooks what and when, making sure no one burns anything or starves waiting for the stove.

Every time you open Spotify while your browser streams a video and Slack pings you in the background, your operating system is performing a silent juggling act of extraordinary complexity. It's carving up one physical CPU into dozens of seemingly simultaneous workers, each isolated from the others, each convinced it has the machine to itself. This isn't magic — it's process and thread management, and understanding it is the difference between writing code that works and writing code that performs.

Before multi-processing and multi-threading, programs ran one at a time, start to finish. You launched a program, waited, then launched the next one. That was fine for a 1970s mainframe printing payroll. It's catastrophic for a modern web server that needs to handle ten thousand simultaneous HTTP requests. The OS needed a way to isolate programs from each other (so a crashed browser tab doesn't nuke your entire machine) and simultaneously share CPU time fairly among them. Processes and threads are the solution to both problems.

By the end of this article you'll understand exactly what a process and a thread are at the OS level, why threads exist inside processes rather than as standalone units, how the scheduler decides who runs when, how to create and manage both in Java with real runnable code, and — crucially — what goes wrong when you get this wrong. You'll also be ready for the interview questions that trip up even experienced candidates.

What Is a Process — and Why Does the OS Bother Isolating Them?

A process is a running instance of a program. Not the program itself — the .exe or .class file sitting on disk is just instructions. When the OS loads it into memory and starts executing it, that living, breathing execution environment is a process.

Every process gets its own private sandbox: a dedicated chunk of virtual memory (split into code, stack, heap, and data segments), its own file descriptor table, and its own process ID (PID). That isolation is the entire point. If Chrome's renderer crashes, it doesn't corrupt your terminal session, because they live in completely separate address spaces. The OS enforces that wall at the hardware level using the MMU (Memory Management Unit).

Creating a process is expensive. The OS must allocate a new virtual address space, copy or map the program's code, set up a stack, and register the process in the process control block (PCB) — a kernel data structure that tracks everything about that process: its PID, memory maps, open files, CPU register state, and scheduling priority. That overhead is why threads were invented: they give you concurrency at a fraction of the cost.

ProcessInspector.java (Java)
import java.lang.management.ManagementFactory;
import java.lang.management.RuntimeMXBean;

public class ProcessInspector {

    public static void main(String[] args) {

        // The JVM exposes the current process's info through the Runtime bean
        RuntimeMXBean runtimeBean = ManagementFactory.getRuntimeMXBean();

        // ProcessHandle is available from Java 9+. pid() returns this process's OS-level PID.
        long currentPid = ProcessHandle.current().pid();

        // How long has this JVM process been alive in milliseconds?
        long uptimeMs = runtimeBean.getUptime();

        System.out.println("=== Current JVM Process Info ===");
        System.out.println("Process ID (PID)    : " + currentPid);
        System.out.println("JVM Name            : " + runtimeBean.getVmName());
        System.out.println("Process Uptime (ms) : " + uptimeMs);

        // Now let's SPAWN a child process — a completely separate OS process.
        // We'll run the 'java -version' command as its own isolated process.
        System.out.println("\n=== Spawning a Child Process ===");
        try {
            ProcessBuilder processBuilder = new ProcessBuilder("java", "-version");

            // Redirect stderr to stdout so we can read the version output easily
            processBuilder.redirectErrorStream(true);

            Process childProcess = processBuilder.start();

            // Read the child process's output stream
            String output = new String(childProcess.getInputStream().readAllBytes());

            // waitFor() BLOCKS the current thread until the child process terminates
            int exitCode = childProcess.waitFor();

            System.out.println("Child process output : " + output.strip());
            System.out.println("Child exit code      : " + exitCode);
            // Exit code 0 = success. Non-zero = something went wrong.

            // The child has its own PID, separate memory space, and lifecycle
            System.out.println("Child PID            : " + childProcess.pid());
            System.out.println("Parent PID           : " + currentPid);

        } catch (Exception ex) {
            System.err.println("Failed to spawn child process: " + ex.getMessage());
        }
    }
}
Output
=== Current JVM Process Info ===
Process ID (PID) : 18423
JVM Name : OpenJDK 64-Bit Server VM
Process Uptime (ms) : 142
=== Spawning a Child Process ===
Child process output : openjdk version "21.0.2" 2024-01-16
Child exit code : 0
Child PID : 18431
Parent PID : 18423
Why the PIDs differ by ~8:
Linux usually hands out PIDs sequentially, and thread IDs are drawn from the same number space, so the JVM's own threads and any other new processes consumed a few IDs between the parent starting and the child being spawned. This is perfectly normal; never assume a child PID is parent+1.
Production Insight
Process creation adds roughly 5-10ms of overhead, mostly from setting up a new address space and page tables; the exact figure varies by OS and program size.
If you spawn a process per request, expect latency spikes.
Rule: reuse processes via pools (like Apache prefork) or use threads for concurrency.
Key Takeaway
A process is a heavy, isolated execution unit.
Use it where fault containment is critical.
For concurrency inside an app, prefer threads – they share memory and context-switch faster.
Process vs Thread in Production
If: You need crash isolation between components (e.g., payment & inventory)
Use: Separate processes (microservices).
If: You need high-throughput concurrency within one app (e.g., web server handling requests)
Use: Threads (platform or virtual).
If: You have 1000+ concurrent I/O-bound tasks
Use: Virtual threads or async I/O; platform threads will saturate the scheduler.

Threads — Lightweight Workers That Share the Same Kitchen Counter

A thread is the smallest unit of execution the OS scheduler actually runs. Every process starts with one thread (the main thread). But you can spawn more, and here's the key insight: all threads inside one process share the same heap memory and the same open file handles. They do each get their own stack (for local variables and method call frames) and their own program counter (so each thread knows where it is in the code).

That shared memory is both threads' superpower and their greatest danger. Two threads can communicate by just writing to a shared variable — no sockets, no pipes, no serialisation. But if they both try to modify that variable at the same time without synchronisation, you get a data race, and your program produces wrong answers silently. The OS won't warn you. The compiler won't warn you. It'll just be wrong.

Java makes threading first-class via the Thread class and the Runnable interface, and since Java 21, via Virtual Threads (Project Loom) — lightweight threads managed by the JVM rather than the OS, capable of running millions simultaneously. We'll cover both so you understand the evolution, not just the current API.

ThreadLifecycleDemo.java (Java)
import java.util.concurrent.atomic.AtomicInteger;

public class ThreadLifecycleDemo {

    // AtomicInteger is thread-safe. A plain int here would be a data race.
    // We'll demonstrate BOTH to show the difference.
    private static AtomicInteger safeCounter = new AtomicInteger(0);
    private static int unsafeCounter = 0; // <-- this WILL misbehave under concurrency

    public static void main(String[] args) throws InterruptedException {

        System.out.println("Main thread PID  : " + ProcessHandle.current().pid());
        System.out.println("Main thread ID   : " + Thread.currentThread().threadId());
        System.out.println("Main thread name : " + Thread.currentThread().getName());

        // --- Creating threads via Runnable (preferred over extending Thread) ---
        // Runnable separates the TASK from the execution mechanism.
        Runnable incrementTask = () -> {
            for (int i = 0; i < 1000; i++) {
                safeCounter.incrementAndGet();  // atomic: read-modify-write as one operation
                unsafeCounter++;                // NOT atomic: read, then modify, then write separately
            }
            System.out.println("Thread " + Thread.currentThread().getName()
                + " finished. Safe counter now: " + safeCounter.get());
        };

        // Spawn 5 threads all running the same task
        Thread[] workers = new Thread[5];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(incrementTask, "Worker-" + (i + 1));
        }

        // Start all threads — OS scheduler decides the actual execution order
        System.out.println("\nLaunching 5 worker threads...");
        for (Thread worker : workers) {
            worker.start(); // Moves thread from NEW state to RUNNABLE state
        }

        // join() blocks main thread until each worker finishes.
        // Without join(), main might print results before workers are done.
        for (Thread worker : workers) {
            worker.join();
        }

        System.out.println("\n=== Final Results (5 threads x 1000 increments = 5000 expected) ===");
        System.out.println("Safe counter   : " + safeCounter.get());   // Always 5000
        System.out.println("Unsafe counter : " + unsafeCounter);        // Probably NOT 5000
    }
}
Output
Main thread PID : 19201
Main thread ID : 1
Main thread name : main
Launching 5 worker threads...
Thread Worker-1 finished. Safe counter now: 2000
Thread Worker-3 finished. Safe counter now: 3000
Thread Worker-2 finished. Safe counter now: 4000
Thread Worker-5 finished. Safe counter now: 4891
Thread Worker-4 finished. Safe counter now: 5000
=== Final Results (5 threads x 1000 increments = 5000 expected) ===
Safe counter : 5000
Unsafe counter : 4347
Watch Out: The unsafe counter won't always give the SAME wrong answer
Data races are non-deterministic. On one run you might get 4347, on the next 4891. That unpredictability is what makes them so dangerous in production — they pass your tests and then fail in the wild under load.
Production Insight
A data race in production often appears as 'intermittent wrong values' under load.
It passes unit tests because single-threaded tests don't trigger the race.
Rule: guard shared mutable state with AtomicX or synchronized. volatile only guarantees visibility; it does not make a read-modify-write such as ++ atomic.
Key Takeaway
Threads share heap – communication is free, but synchronisation is mandatory.
A plain int incremented from two threads will produce wrong answers.
Always use thread-safe primitives or locks for shared mutable state.
Safe Concurrent Access in Java
If: Single variable updated by multiple threads
Use: AtomicInteger, AtomicLong, etc. for simplest atomic updates.
If: Multiple variables updated together (compound action)
Use: A synchronized block or ReentrantLock to ensure atomicity.
If: Read-mostly, rare writes
Use: volatile or ReadWriteLock for higher read throughput.
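
The "compound action" row is the one most often missed, so here is a minimal sketch of it. The AccountPair class and its field names are hypothetical, invented for this illustration: two fields must change together, so both writes (and any read that combines them) happen under one ReentrantLock.

```java
import java.util.concurrent.locks.ReentrantLock;

public class AccountPair {

    private final ReentrantLock lock = new ReentrantLock();
    private long checking; // guarded by lock
    private long savings;  // guarded by lock

    public AccountPair(long checking, long savings) {
        this.checking = checking;
        this.savings = savings;
    }

    // Compound action: both fields change under the same lock,
    // so no other thread can observe one write without the other.
    public void moveToSavings(long amount) {
        lock.lock();
        try {
            checking -= amount;
            savings += amount;
        } finally {
            lock.unlock();
        }
    }

    public long savings() {
        lock.lock();
        try {
            return savings;
        } finally {
            lock.unlock();
        }
    }

    // Reads both fields under the lock so the sum is always consistent.
    public long total() {
        lock.lock();
        try {
            return checking + savings;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        AccountPair pair = new AccountPair(10_000, 0);
        Thread[] workers = new Thread[4];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                for (int n = 0; n < 1000; n++) {
                    pair.moveToSavings(1);
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        System.out.println("Savings: " + pair.savings() + ", total: " + pair.total());
    }
}
```

Two separate AtomicLong fields would make each write atomic on its own, but a reader could still see the money missing from both accounts between the two writes. That gap is what the single lock closes.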

The OS Scheduler — Who Runs When, and Why It Matters to You

Having threads is great, but if you have 200 threads and only 8 CPU cores, not everyone can run simultaneously. The OS scheduler is the traffic cop that decides which thread runs on which core at any given millisecond.

Modern schedulers (Linux's CFS and, from kernel 6.6, its successor EEVDF; Windows' multilevel feedback queue) use a combination of priority, fairness, and time-slicing. Each thread gets a small time slice, typically 1–10ms. When the slice expires, the scheduler preempts the thread (saves its register state into its thread control block) and picks the next candidate. This context switch has a real cost: saving and restoring registers, and potentially invalidating CPU cache lines.

This is why spawning thousands of OS threads for a high-throughput server is a bad idea — the scheduler drowns in context switches before your actual work gets done. Java 21's Virtual Threads solve this by using a small pool of OS threads ('carrier threads') to run a huge number of lightweight JVM-managed threads, parking them when they block on I/O instead of consuming an OS thread the whole time.

VirtualThreadDemo.java (Java)
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.Executors;

public class VirtualThreadDemo {

    // Simulates a blocking I/O operation (like a database query or HTTP call)
    private static void simulateDatabaseQuery(int queryId) throws InterruptedException {
        // Thread.sleep() voluntarily yields the thread back to the scheduler.
        // With virtual threads, this PARKS the virtual thread (frees the carrier OS thread)
        // rather than blocking a real OS thread.
        Thread.sleep(50); // pretend this is a 50ms DB round-trip
        System.out.println("Query " + queryId + " complete on: " + Thread.currentThread());
    }

    public static void main(String[] args) throws InterruptedException {

        int numberOfTasks = 500; // try this with platform threads and watch it crawl

        // --- Approach 1: Traditional platform (OS) threads ---
        Instant platformStart = Instant.now();
        try (var platformExecutor = Executors.newFixedThreadPool(50)) {
            // Fixed pool of 50 OS threads handling 500 tasks.
            // Right after submission, up to 450 tasks sit in the queue waiting for a free thread.
            for (int i = 1; i <= numberOfTasks; i++) {
                final int taskId = i;
                platformExecutor.submit(() -> {
                    try { simulateDatabaseQuery(taskId); }
                    catch (InterruptedException e) { Thread.currentThread().interrupt(); }
                });
            }
        } // executor.close() waits for all tasks to finish (Java 19+ AutoCloseable)
        long platformMs = Duration.between(platformStart, Instant.now()).toMillis();

        // --- Approach 2: Virtual threads (Java 21+) ---
        Instant virtualStart = Instant.now();
        try (var virtualExecutor = Executors.newVirtualThreadPerTaskExecutor()) {
            // Creates a NEW virtual thread per task — sounds expensive, but virtual
            // threads are so cheap (~1KB stack) the JVM creates them without hesitation.
            for (int i = 1; i <= numberOfTasks; i++) {
                final int taskId = i;
                virtualExecutor.submit(() -> {
                    try { simulateDatabaseQuery(taskId); }
                    catch (InterruptedException e) { Thread.currentThread().interrupt(); }
                });
            }
        }
        long virtualMs = Duration.between(virtualStart, Instant.now()).toMillis();

        System.out.println("\n=== Throughput Comparison: 500 tasks, each with 50ms I/O ===");
        System.out.println("Platform threads (pool of 50) : " + platformMs + " ms");
        System.out.println("Virtual threads               : " + virtualMs  + " ms");
        System.out.println("Speedup factor                : ~" + (platformMs / Math.max(virtualMs, 1)) + "x");
    }
}
Output
Query 47 complete on: VirtualThread[#52]/runnable@ForkJoinPool-1-worker-3
Query 12 complete on: VirtualThread[#17]/runnable@ForkJoinPool-1-worker-1
... (500 lines of query completions) ...
=== Throughput Comparison: 500 tasks, each with 50ms I/O ===
Platform threads (pool of 50) : 551 ms
Virtual threads : 68 ms
Speedup factor : ~8x
Pro Tip: Virtual threads aren't faster for CPU-bound work
Virtual threads shine when threads spend most of their time waiting (I/O, sleep, locks). If your threads are crunching numbers non-stop, you still want a small pool sized to your CPU core count — more threads than cores means context-switch overhead with no benefit.
Production Insight
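Context switches cost ~1-2µs of direct CPU time per switch. At 100k switches/sec, that's 0.1-0.2 seconds of CPU per second, or 10-20% of one core, before counting the cache misses that follow.
On a 16-core server, wasting 1.6 cores (10% of the machine) on switching alone would take roughly a million switches per second.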
Rule: keep active threads <= 2x CPU cores for CPU-bound; use async I/O or virtual threads for I/O-bound.
Key Takeaway
The scheduler decides thread order; never assume execution order.
Running thousands of platform threads (10k+ as a rule of thumb) risks scheduler thrashing.
Virtual threads are a game-changer for I/O-bound services, but profile first.
Choosing Thread Type Based on Workload
If: Workload is CPU-bound (no I/O waits)
Use: Platform threads with a pool sized to Runtime.getRuntime().availableProcessors().
If: Workload is I/O-bound (HTTP calls, DB queries, file reads)
Use: Virtual threads (Java 21+) or async frameworks (CompletableFuture, reactive).
If: Mixed workload on a legacy Java version (<21)
Use: A larger platform thread pool (e.g., 200 threads for a 16-core machine), but monitor context switching.
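
For the CPU-bound row, a minimal sketch of a pool sized to the core count might look like this. The CpuBoundPool class and the sumRange workload are hypothetical stand-ins for real number crunching; the only point is that the pool size comes from Runtime.getRuntime().availableProcessors() and the work is split into one chunk per core.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class CpuBoundPool {

    // Sums the integers in [from, to); stands in for real CPU-heavy work.
    static long sumRange(long from, long to) {
        long sum = 0;
        for (long i = from; i < to; i++) {
            sum += i;
        }
        return sum;
    }

    // Splits [0, n) into roughly one chunk per core and sums the chunks in parallel.
    static long parallelSum(long n) throws Exception {
        int cores = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(cores);
        try {
            long chunk = Math.max(1, (n + cores - 1) / cores); // ceiling division, never 0
            List<Future<Long>> parts = new ArrayList<>();
            for (long start = 0; start < n; start += chunk) {
                final long from = start;
                final long to = Math.min(n, start + chunk);
                parts.add(pool.submit(() -> sumRange(from, to)));
            }
            long total = 0;
            for (Future<Long> part : parts) {
                total += part.get(); // blocks until that chunk is done
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("Sum of 0..999999 = " + parallelSum(1_000_000));
    }
}
```

Adding more threads than cores here would not finish the sum any sooner; the extra threads would only take turns on the same cores.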

Thread States, Synchronisation, and Avoiding Deadlock

A thread isn't just 'running' or 'not running'. It moves through a state machine: NEW (created but not started), RUNNABLE (eligible to run, may or may not be on a core right now), BLOCKED (waiting to acquire a monitor lock), WAITING (parked via wait() or join() with no timeout), TIMED_WAITING (parked with a timeout, like sleep()), and TERMINATED (finished).

Understanding these states is critical for debugging. If a thread is stuck in BLOCKED for a long time, it's fighting for a lock. If it's in WAITING forever, something forgot to call notify(). Thread dumps — printable via kill -3 on Linux or jstack — show you every thread's state and stack trace at a point in time. That's how you diagnose production hangs.
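
To see the BLOCKED state without waiting for a production hang, you can provoke it on purpose. The sketch below is illustrative only (the BlockedStateProbe class and its method name are made up for this example): the calling thread holds a monitor, a second thread tries to enter it, and Thread.getState() reports what a thread dump would show.

```java
public class BlockedStateProbe {

    // Returns the state observed for a thread that is waiting
    // for a monitor the calling thread currently holds.
    static Thread.State observeContendedState() throws InterruptedException {
        final Object monitor = new Object();

        Thread contender = new Thread(() -> {
            synchronized (monitor) {
                // Entered only after the holder releases the monitor.
            }
        }, "Contender");

        Thread.State observed;
        synchronized (monitor) { // the calling thread is the lock holder
            contender.start();
            // Poll (up to 5 seconds) until the contender has reached the monitor.
            long deadline = System.nanoTime() + 5_000_000_000L;
            while (contender.getState() != Thread.State.BLOCKED
                    && System.nanoTime() < deadline) {
                Thread.sleep(1);
            }
            observed = contender.getState();
        }
        contender.join(); // monitor released above, so the contender can finish
        return observed;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("Contender state while lock was held: " + observeContendedState());
    }
}
```

In a real thread dump the same situation shows up as a thread in BLOCKED with a "waiting to lock" line, plus another thread whose stack shows it holding that lock.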

Deadlock is the most feared concurrency bug: Thread A holds Lock 1 and waits for Lock 2, while Thread B holds Lock 2 and waits for Lock 1. Neither can proceed. The fix is to always acquire multiple locks in a consistent global order across all threads — if everyone agrees 'Lock 1 before Lock 2', the circular dependency is impossible.

DeadlockPreventionDemo.java (Java)
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

public class DeadlockPreventionDemo {

    // Two shared resources — imagine these are bank accounts
    private static final Lock accountAlpha = new ReentrantLock();
    private static final Lock accountBeta  = new ReentrantLock();

    // DEADLOCK-PRONE version: each thread acquires locks in OPPOSITE order
    static void transferDeadlockProne(String threadName, boolean reverseOrder)
            throws InterruptedException {
        Lock firstLock  = reverseOrder ? accountBeta  : accountAlpha;
        Lock secondLock = reverseOrder ? accountAlpha : accountBeta;

        firstLock.lock();
        try {
            System.out.println(threadName + " acquired first lock, waiting for second...");
            Thread.sleep(50); // makes the race window obvious in demos
            secondLock.lock();
            try {
                System.out.println(threadName + " transferred funds (deadlock-prone path)");
            } finally {
                secondLock.unlock();
            }
        } finally {
            firstLock.unlock(); // released even if sleep() is interrupted
        }
    }

    // SAFE version: both threads ALWAYS acquire locks in the same order (alpha → beta)
    static void transferSafe(String threadName) throws InterruptedException {
        // Consistent global ordering: always lock accountAlpha before accountBeta.
        // No matter how many threads call this, circular wait is impossible.
        accountAlpha.lock();
        try {
            System.out.println(threadName + " acquired alpha lock");
            Thread.sleep(20);
            accountBeta.lock();
            try {
                System.out.println(threadName + " acquired beta lock — transfer complete!");
            } finally {
                accountBeta.unlock();
            }
        } finally {
            accountAlpha.unlock(); // always unlock in reverse order of acquisition
        }
    }

    public static void main(String[] args) throws InterruptedException {

        System.out.println("=== Safe Transfer Demo (consistent lock ordering) ===");

        Thread sender   = new Thread(() -> {
            try { transferSafe("Sender");   }
            catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        }, "Sender");

        Thread receiver = new Thread(() -> {
            try { transferSafe("Receiver"); }
            catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        }, "Receiver");

        sender.start();
        receiver.start();
        sender.join();
        receiver.join();

        System.out.println("Both transfers completed. No deadlock.");

        // To observe the current thread state programmatically:
        Thread monitorThread = new Thread(() -> {
            try { Thread.sleep(1000); } // TIMED_WAITING during sleep
            catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        }, "MonitorThread");

        monitorThread.start();
        Thread.sleep(10); // let monitorThread enter sleep before we check
        System.out.println("\nMonitorThread state: " + monitorThread.getState()); // TIMED_WAITING
        monitorThread.join();
        System.out.println("MonitorThread state: " + monitorThread.getState()); // TERMINATED
    }
}
Output
=== Safe Transfer Demo (consistent lock ordering) ===
Sender acquired alpha lock
Sender acquired beta lock — transfer complete!
Receiver acquired alpha lock
Receiver acquired beta lock — transfer complete!
Both transfers completed. No deadlock.
MonitorThread state: TIMED_WAITING
MonitorThread state: TERMINATED
Interview Gold: How do you detect a deadlock in production?
Run 'jstack <PID>' or use JVisualVM to take a thread dump. Look for 'Found one Java-level deadlock' in the output — the JVM actually detects cycles in lock dependency graphs and reports them explicitly. Knowing this command exists will impress interviewers.
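
The same check can be made from inside the JVM. The sketch below deliberately deadlocks two threads and then asks ThreadMXBean.findDeadlockedThreads() for the cycle; the DeadlockDetector class and the grab helper are hypothetical names for this illustration. It uses lockInterruptibly() so the demo can break the deadlock afterwards instead of hanging forever.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

public class DeadlockDetector {

    // Creates a two-thread deadlock, detects it, then interrupts both threads.
    // Returns how many deadlocked threads the JVM reported.
    static int detectDeliberateDeadlock() throws InterruptedException {
        ReentrantLock lockA = new ReentrantLock();
        ReentrantLock lockB = new ReentrantLock();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);

        // Opposite acquisition order is what makes the circular wait possible.
        Thread t1 = new Thread(() -> grab(lockA, lockB, bothHoldFirst), "T1");
        Thread t2 = new Thread(() -> grab(lockB, lockA, bothHoldFirst), "T2");
        t1.start();
        t2.start();

        ThreadMXBean threadBean = ManagementFactory.getThreadMXBean();
        long[] deadlockedIds = null;
        long deadline = System.nanoTime() + 5_000_000_000L; // give up after 5 seconds
        while (deadlockedIds == null && System.nanoTime() < deadline) {
            Thread.sleep(10);
            deadlockedIds = threadBean.findDeadlockedThreads(); // null when no deadlock
        }

        // Break the deadlock so the demo can exit cleanly.
        t1.interrupt();
        t2.interrupt();
        t1.join();
        t2.join();
        return deadlockedIds == null ? 0 : deadlockedIds.length;
    }

    private static void grab(ReentrantLock first, ReentrantLock second, CountDownLatch latch) {
        first.lock();
        try {
            latch.countDown();
            latch.await();              // both threads now hold their first lock
            second.lockInterruptibly(); // waits here until interrupted
            second.unlock();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            first.unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("Deadlocked threads detected: " + detectDeliberateDeadlock());
    }
}
```

A health-check endpoint or watchdog thread could run the same findDeadlockedThreads() call periodically and raise an alert, rather than waiting for someone to run jstack by hand.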
Production Insight
Deadlock symptoms: app freezes, thread dumps show circular wait.
Without jstack, you'd restart and never know the root cause.
Always keep a script to take thread dumps on CPU >80% or hung request alerts.
Key Takeaway
Know thread states: NEW, RUNNABLE, BLOCKED, WAITING, TIMED_WAITING, TERMINATED.
Deadlock is prevented by consistent lock ordering.
jstack is your first tool for diagnosing thread hangs.
Deadlock Prevention Strategies
If: Multiple locks must be acquired
Use: Always acquire them in the same global order across all threads.
If: You cannot guarantee lock order (e.g., calling external library)
Use: ReentrantLock.tryLock() with a timeout, and handle failure gracefully (release all locks, retry).
If: Shared state is read-mostly
Use: Consider ReadWriteLock or StampedLock to allow concurrent reads.
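
When lock order cannot be guaranteed, the tryLock row can be sketched roughly as follows. TryLockTransfer and withBothLocks are hypothetical names, and the timeout value is arbitrary; the idea is that a thread that cannot get the second lock backs out and releases the first, so a circular wait cannot last forever.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

public class TryLockTransfer {

    // Tries to take both locks and run the action.
    // Returns false (holding nothing) instead of waiting forever.
    static boolean withBothLocks(Lock first, Lock second, long timeoutMs, Runnable action)
            throws InterruptedException {
        if (!first.tryLock(timeoutMs, TimeUnit.MILLISECONDS)) {
            return false; // could not even get the first lock
        }
        try {
            if (!second.tryLock(timeoutMs, TimeUnit.MILLISECONDS)) {
                return false; // the outer finally releases 'first'; caller may retry
            }
            try {
                action.run();
                return true;
            } finally {
                second.unlock();
            }
        } finally {
            first.unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Lock accountAlpha = new ReentrantLock();
        Lock accountBeta = new ReentrantLock();
        boolean done = withBothLocks(accountAlpha, accountBeta, 100,
                () -> System.out.println("Both locks held, transferring"));
        System.out.println("Completed: " + done);
    }
}
```

A caller would typically retry a failed attempt after a short randomised pause, so two threads that collided once are unlikely to collide again in lockstep.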

Process States and Context Switching — How the OS Manages the Microscopic Juggle

A process isn't always running either. It moves through states: NEW (being created), READY (waiting for CPU), RUNNING (executing on a core), BLOCKED (waiting for I/O or event), and TERMINATED. The OS scheduler moves processes between READY and RUNNING so many times per second that humans perceive concurrency as parallelism.

But this movement has a price: context switching. When the OS swaps one process out and another in, it must save the entire CPU register set, switch to the new process's memory mappings, and flush the TLB (translation lookaside buffer) or, on CPUs with tagged TLBs, switch address-space tags. That's why process context switches are heavy (~5–10µs). Thread switches within the same process are lighter (~1–2µs) because they share the same address space, so the TLB usually survives.

Understanding this cost changes how you architect. If you have 200 processes all doing 1ms of work, you'll spend more time switching than computing. That's why event-driven architectures (NGINX, Node.js) or virtual threads exist — they minimise expensive context switches by keeping work on the same thread or using lightweight concurrency.

ContextSwitchSimulator.java (Java)
import java.util.concurrent.CountDownLatch;

public class ContextSwitchSimulator {

    private static final int NUM_PROCESSES = 100;
    private static final int WORK_UNITS = 100_000;

    public static void main(String[] args) throws InterruptedException {

        long start = System.nanoTime();
        CountDownLatch latch = new CountDownLatch(NUM_PROCESSES);

        // Simulate many processes by spawning many threads (each thread = one process-like workload)
        for (int i = 0; i < NUM_PROCESSES; i++) {
            final int id = i;
            new Thread(() -> {
                // Simulate CPU work: busy spin
                long sum = 0;
                for (int j = 0; j < WORK_UNITS; j++) {
                    sum += j * (id & 7); // artificial work
                }
                latch.countDown();
            }).start();
        }

        latch.await(); // wait for all threads
        long elapsed = System.nanoTime() - start;
        System.out.println(NUM_PROCESSES + " threads completed " + WORK_UNITS + " units each in "
            + elapsed / 1_000_000 + " ms");
        // This is total wall-clock time divided by thread count, NOT a context switch measurement.
        // Use vmstat or perf to measure actual context switches.
        System.out.println("Average wall-clock time per thread: ~"
            + (elapsed / NUM_PROCESSES / 1000) + " μs (rough estimate)");
        // In reality, the OS schedules threads on available cores; context switch overhead is baked in.
    }
}
Output
100 threads completed 100000 units each in 212 ms
Average wall-clock time per thread: ~2120 μs (rough estimate)
Context Switch Analogy
  • Each recipe has its own ingredients (memory map) and tools (registers).
  • If the chef switches recipes every minute (time slice), the kitchen loses time to cleanup/setup.
  • Switching between two dishes from the same cuisine (threads in same process) is faster than switching from Italian to Chinese (different processes).
  • The scheduler decides the recipe order; too many recipes per second means less cooking, more cleanup.
Production Insight
High context switching (sustained >100k/sec on Linux) is a symptom of oversubscription.
Check with 'vmstat 1' (cs column) or 'perf stat -e context-switches'.
Treat a sustained cs above ~50k/sec as an early warning: compare it against your baseline, then reduce thread count or switch to asynchronous processing.
Key Takeaway
Context switching is not free; it costs microseconds.
Process switches are heavier than thread switches.
Measure context switch rate before tuning thread counts.
Minimising Context Switch Impact
If: Application does small CPU bursts (e.g., 1ms) per request
Use: Batch work or use an event loop (single thread) to avoid switching.
If: Application does I/O waits (sleep, read, write)
Use: Virtual threads or async I/O to block only lightweight entities, not OS threads.
If: You have many long-running CPU tasks
Use: Size the thread pool to the number of cores; don't exceed it unless I/O waits are involved.
● Production incident · POST-MORTEM · severity: high

The Vanishing HTTP Requests – Thread Pool Exhaustion from Blocking I/O Inside Sync Blocks

Symptom
Requests taking >5s, thread dumps showing dozens of threads in BLOCKED state on the same lock, and CPU usage below 20%.
Assumption
The team assumed the database was slow and increased the connection pool size. No improvement.
Root cause
A synchronized block around the entire request handler included a slow external HTTP call. Every thread waited for the lock, effectively serializing all I/O-bound work.
Fix
Refactored the handler: moved the HTTP call outside the synchronized block, used CompletableFuture for async I/O, and limited the lock only to the shared state update (~2ms).
Key lesson
  • Blocking I/O inside a synchronized block is a production killer – it reduces concurrency to 1 for that critical section.
  • Always profile thread states under load before adding more threads; a BLOCKED pileup means lock contention, not thread starvation.
  • Use 'jstack <pid>' or 'jcmd <pid> Thread.print' to capture thread dumps – look for the thread stack that holds the lock everyone waits on.
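
The shape of the fix can be sketched as below. This is not the incident's actual code: NarrowLockHandler, handle, and the IntSupplier standing in for the external HTTP call are all hypothetical. The slow call runs with no lock held, and the synchronized block covers only the shared state update.

```java
import java.util.function.IntSupplier;

public class NarrowLockHandler {

    private final Object stateLock = new Object();
    private long runningTotal = 0; // shared mutable state, guarded by stateLock

    // remoteCall stands in for the slow external HTTP call (hypothetical).
    public void handle(IntSupplier remoteCall) {
        int value = remoteCall.getAsInt(); // slow I/O happens with NO lock held
        synchronized (stateLock) {         // lock covers only the quick state update
            runningTotal += value;
        }
    }

    public long runningTotal() {
        synchronized (stateLock) {
            return runningTotal;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        NarrowLockHandler handler = new NarrowLockHandler();
        IntSupplier fakeRemote = () -> {
            try {
                Thread.sleep(100); // pretend this is a 100ms HTTP round-trip
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return 1;
        };

        Thread[] requests = new Thread[8];
        long start = System.nanoTime();
        for (int i = 0; i < requests.length; i++) {
            requests[i] = new Thread(() -> handler.handle(fakeRemote));
            requests[i].start();
        }
        for (Thread request : requests) {
            request.join();
        }
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("Total: " + handler.runningTotal() + ", elapsed: " + elapsedMs + " ms");
    }
}
```

If the remote call were moved inside the synchronized block, the eight simulated requests would run one after another and the elapsed time would grow with the number of requests, which is the pileup the thread dumps showed.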
Production debug guide
Symptom → Action guide for common process/thread problems (4 entries)
Symptom · 01
High CPU usage but requests are slow
Fix
Check for excessive context switching (vmstat 1, look at 'cs' column). If >100k/s, reduce thread count or switch to async I/O.
Symptom · 02
Application hangs, no progress
Fix
Take a thread dump (jstack <pid>). Look for threads in BLOCKED state or a 'Found one Java-level deadlock' message.
Symptom · 03
Thread dump shows many threads in WAITING state on a Condition
Fix
Find the lock owner thread. If it's stuck in an infinite loop or sleeping with a lock, that's a bug. Use tryLock with timeout to avoid indefinite blocking.
Symptom · 04
Child process never exits or zombie process
Fix
Ensure the parent calls waitFor() or handles Process.destroy(). On Linux, check 'ps aux | grep defunct' and kill parent if needed.
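
One way to make sure a child can never hang the parent is to bound the wait. The sketch below is a hedged illustration (BoundedChildProcess and runWithTimeout are made-up names, and returning -1 on timeout is this sketch's own convention): it discards the child's output so a full pipe buffer cannot stall it, waits with a timeout, and kills the child if the deadline passes. ProcessBuilder.Redirect.DISCARD requires Java 9+.

```java
import java.io.File;
import java.io.IOException;
import java.util.concurrent.TimeUnit;

public class BoundedChildProcess {

    // Runs a command and returns its exit code,
    // or -1 (this sketch's convention) if it had to be killed on timeout.
    static int runWithTimeout(long timeoutSeconds, String... command)
            throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.redirectErrorStream(true);
        // Discard output so the child can never stall on a full pipe buffer.
        builder.redirectOutput(ProcessBuilder.Redirect.DISCARD);

        Process child = builder.start();
        if (!child.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
            child.destroyForcibly();
            child.waitFor(); // wait until the kill has actually taken effect
            return -1;
        }
        return child.exitValue();
    }

    public static void main(String[] args) throws Exception {
        // Use the same java binary that is running this program.
        String javaBinary = System.getProperty("java.home")
                + File.separator + "bin" + File.separator + "java";
        int exitCode = runWithTimeout(10, javaBinary, "-version");
        System.out.println("Child exit code: " + exitCode);
    }
}
```

If you need the child's output, read it on a separate thread or redirect it to a file instead of discarding it; the bounded wait stays the same.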
★ Quick Debug Cheat Sheet – Process & Thread Issues
Commands to diagnose deadlocks, thread states, and process hangs
Application hanging (suspected deadlock)
Immediate action
Run jstack <PID> or kill -3 <PID>
Commands
jstack <PID> | grep -A 10 'Found one Java-level deadlock'
jcmd <PID> Thread.print
Fix now
If deadlock found, restart the application and apply consistent lock ordering.
High context switching (cs column in vmstat > 50k/sec)
Immediate action
Check number of active threads (top -H -p <PID>, count threads)
Commands
vmstat 1 5 | tail -4 | awk '{print $12}'
ps -eLf | wc -l
Fix now
Reduce thread pool size or use virtual threads / async I/O.
Thread in BLOCKED state on a specific lock
Immediate action
Get the lock owner from thread dump.
Commands
jstack <PID> | grep -B 5 'BLOCKED'
jstack <PID> | grep 'waiting to lock'
Fix now
Refactor to reduce lock hold time or use ReentrantLock.tryLock() with timeout.
Process vs Thread
Aspect | Process | Thread
Memory space | Own private virtual address space | Shared heap with sibling threads
Creation cost | High: OS allocates new address space, PCB, file table | Low: shares parent process resources
Communication | IPC: pipes, sockets, shared memory (explicit, slow) | Direct shared memory (fast but needs synchronisation)
Crash isolation | Crash stays contained, other processes unaffected | Unhandled exception can crash the entire process
Context switch cost | High: TLB flush, memory map swap | Lower: same address space, just register state swap
Java creation | ProcessBuilder / Runtime.exec() | new Thread() / Executors / virtual threads
Best for | Fault isolation (microservices, browser tabs) | High-throughput concurrency within one application
Typical overhead | ~1–8 MB per process (OS page tables + stack) | ~512 KB OS thread; ~1 KB virtual thread (Java 21+)

Key takeaways

1. A process is isolated by design: its own memory space means a crash or bug stays contained. That isolation costs time and memory, so use processes at architectural boundaries (services, browser tabs), not for every concurrent task.
2. Threads share heap memory, which makes communication fast but requires synchronisation discipline. A plain int incremented by two threads without an AtomicInteger or synchronized block WILL produce wrong answers, and not consistently, which is what makes it dangerous.
3. The OS scheduler doesn't run threads in the order you start them. Never write code whose correctness depends on thread execution order. Use join(), CountDownLatch, or CompletableFuture to coordinate, not Thread.sleep() with magic numbers.
4. Java 21 Virtual Threads change the calculus for I/O-bound work: you can now use one-thread-per-request style code without paying the OS thread cost. But for CPU-bound tasks, a fixed thread pool sized to Runtime.getRuntime().availableProcessors() is still the right answer.
5. Context switching is not free: process switches cost ~5-10µs, thread switches ~1-2µs. Measure before you optimise; always profile under realistic load.
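Takeaways 3 and 4 can be sketched together: CPU-bound work is split across a fixed pool sized to the core count, and a `CountDownLatch` replaces any `Thread.sleep()` guesswork. The class `LatchCoordination` and the method `parallelSum` are hypothetical names chosen for illustration, and summing integers is only a stand-in for real CPU-bound work.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicLong;

class LatchCoordination {

    // Sums 1..n by splitting the range into `tasks` chunks on a CPU-sized pool.
    static long parallelSum(int n, int tasks) throws InterruptedException {
        int cores = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(cores);
        CountDownLatch done = new CountDownLatch(tasks);
        AtomicLong total = new AtomicLong();
        int chunk = (n + tasks - 1) / tasks; // ceiling division
        try {
            for (int t = 0; t < tasks; t++) {
                int from = t * chunk + 1;
                int to = Math.min(n, (t + 1) * chunk);
                pool.execute(() -> {
                    try {
                        long partial = 0;
                        for (int i = from; i <= to; i++) {
                            partial += i;
                        }
                        total.addAndGet(partial); // atomic merge of partial results
                    } finally {
                        done.countDown();         // count down even if the task fails
                    }
                });
            }
            done.await(); // returns only after every task has counted down
            return total.get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("sum of 1..1000 = " + parallelSum(1000, 8));
    }
}
```

The order in which the eight tasks run is irrelevant to the result, which is the point: correctness comes from the latch and the atomic merge, not from scheduling luck.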

Common mistakes to avoid

3 patterns

Calling thread.run() instead of thread.start()

Symptom
The thread appears to 'work' but actually runs synchronously on the calling thread; no new thread is ever created. Your code runs sequentially and you wonder why there's no parallelism.
Fix
Always call thread.start(). It asks the JVM to create and schedule a new thread, which then invokes run() for you; calling run() directly is just a regular method call.
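A small sketch makes the difference observable by recording which thread actually executes the task body. The names `RunVersusStart`, `executedBy`, and `worker-1` are illustrative assumptions.

```java
class RunVersusStart {

    // Returns the name of the thread that actually executed the task body.
    static String executedBy(boolean useStart) throws InterruptedException {
        String[] seen = new String[1];
        Thread worker = new Thread(
                () -> seen[0] = Thread.currentThread().getName(), "worker-1");
        if (useStart) {
            worker.start(); // a new thread is created and runs the task
            worker.join();  // join() also makes the write to seen[0] visible here
        } else {
            worker.run();   // plain method call on the calling thread
        }
        return seen[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("run():   executed by " + executedBy(false));
        System.out.println("start(): executed by " + executedBy(true));
    }
}
```

With `run()`, the reported name is that of the caller; with `start()`, it is `worker-1`.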

Sharing mutable state between threads without synchronisation

Symptom
You see intermittent wrong values or stale reads in production that you can't reproduce in tests.
Fix
Use volatile for single-variable visibility, AtomicInteger/AtomicReference for single-variable atomic updates, or synchronized blocks for compound operations. Never assume a write in Thread A is immediately visible to Thread B without a memory barrier.
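To see why the fix matters, the sketch below increments a plain `int` and an `AtomicInteger` side by side from several threads. `CounterRace` and `hammer` are hypothetical names; the thread and iteration counts are arbitrary. The plain counter may or may not lose updates on any given run, so no specific wrong value is claimed.

```java
import java.util.concurrent.atomic.AtomicInteger;

class CounterRace {
    static int unsafeCount = 0; // plain int: ++ is a non-atomic read-modify-write
    static final AtomicInteger safeCount = new AtomicInteger();

    // Runs `threads` workers, each incrementing both counters `perThread` times.
    static void hammer(int threads, int perThread) throws InterruptedException {
        unsafeCount = 0;
        safeCount.set(0);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    unsafeCount++;               // racy: updates can be lost
                    safeCount.incrementAndGet(); // atomic: never loses an update
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            w.join(); // wait for every worker before reading the totals
        }
    }

    public static void main(String[] args) throws InterruptedException {
        hammer(4, 100_000);
        System.out.println("expected: " + (4 * 100_000));
        System.out.println("atomic:   " + safeCount.get());
        System.out.println("plain:    " + unsafeCount + " (often lower, varies per run)");
    }
}
```

Note that `volatile` alone would not fix `unsafeCount++`: it guarantees visibility, not atomicity of the read-modify-write.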

Calling blocking I/O inside a synchronized block

Symptom
You hold a lock while waiting for a network call to return (which may take seconds), blocking every other thread that needs that lock. This turns into a production slowdown under load that looks like deadlock but isn't.
Fix
Do all I/O outside the synchronized block; only lock around the minimal state mutation. Better yet, use java.util.concurrent structures like ConcurrentHashMap that handle their own thread safety.
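The two shapes can be compared in a hedged sketch. Everything here is hypothetical scaffolding: `NarrowLockCache`, `slowFetch` (a `Thread.sleep` standing in for a network call), and the timing helper. Only the placement of the `synchronized` block differs between the two load methods.

```java
import java.util.HashMap;
import java.util.Map;

class NarrowLockCache {
    private final Object lock = new Object();
    private final Map<String, String> cache = new HashMap<>();

    // Hypothetical stand-in for a slow network call.
    private static String slowFetch(String key, long latencyMillis) throws InterruptedException {
        Thread.sleep(latencyMillis);
        return "value-for-" + key;
    }

    // Anti-pattern: the lock is held for the full duration of the I/O.
    String loadHoldingLock(String key, long latencyMillis) throws InterruptedException {
        synchronized (lock) {
            String value = slowFetch(key, latencyMillis);
            cache.put(key, value);
            return value;
        }
    }

    // Preferred: I/O happens outside the lock; the lock guards only the mutation.
    String loadNarrowLock(String key, long latencyMillis) throws InterruptedException {
        String value = slowFetch(key, latencyMillis);
        synchronized (lock) {
            cache.put(key, value);
        }
        return value;
    }

    int size() {
        synchronized (lock) {
            return cache.size();
        }
    }

    // Runs `callers` concurrent loads and returns elapsed wall-clock milliseconds.
    static long timeConcurrentLoads(NarrowLockCache c, int callers,
                                    long latencyMillis, boolean narrow) throws InterruptedException {
        Thread[] threads = new Thread[callers];
        long start = System.nanoTime();
        for (int i = 0; i < callers; i++) {
            String key = "k" + i;
            threads[i] = new Thread(() -> {
                try {
                    if (narrow) {
                        c.loadNarrowLock(key, latencyMillis);
                    } else {
                        c.loadHoldingLock(key, latencyMillis);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) throws InterruptedException {
        long wide = timeConcurrentLoads(new NarrowLockCache(), 8, 100, false);
        long narrow = timeConcurrentLoads(new NarrowLockCache(), 8, 100, true);
        System.out.println("I/O inside lock:  " + wide + " ms");
        System.out.println("I/O outside lock: " + narrow + " ms");
    }
}
```

With the I/O inside the lock, the callers queue up and total time grows with the number of callers; with the narrow lock, the simulated calls overlap. One trade-off of the narrow version is that two threads asking for the same key may both fetch it, which is usually acceptable for idempotent reads.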
INTERVIEW PREP · PRACTICE MODE

Interview Questions on This Topic

Q01 · JUNIOR
What is the difference between a process and a thread, and when would yo...
Q02 · SENIOR
Explain what a deadlock is and describe a strategy to prevent it without...
Q03 · SENIOR
What is a race condition, and how is it different from a deadlock? Can y...
Q01 of 03 · JUNIOR

What is the difference between a process and a thread, and when would you choose one over the other?

ANSWER
A strong answer covers memory isolation, IPC overhead, crash containment, and gives a concrete example: 'I'd use separate processes for a microservice boundary where a crash in the payment service must not bring down the inventory service; I'd use threads within a service to handle concurrent HTTP requests sharing an in-memory cache.'
FAQ · 5 QUESTIONS

Frequently Asked Questions

01
What happens to child threads when the main thread finishes in Java?
02
Is multi-threading always faster than single-threading?
03
What is the difference between synchronized and ReentrantLock in Java?
04
How do you choose between platform threads and virtual threads in Java 21+?
05
What is the difference between a process and a thread in terms of debugging?