Intermediate 7 min · March 06, 2026

OS Interview Questions - Oversized Heap Triggers Swap Storm

Q: What are the four necessary conditions for a Deadlock?

For a deadlock to occur, these four (Coffman) conditions must hold true simultaneously: 1. Mutual Exclusion (non-shareable resources), 2. Hold and Wait (holding a resource while waiting for another), 3. No Preemption (resources cannot be forcibly taken), and 4. Circular Wait (a chain of processes waiting for each other).

Q: What is 'Thrashing' in an operating system?

Thrashing occurs when the OS is constantly swapping pages between RAM and Disk because the set of active pages (Working Set) is larger than the available physical memory. This causes CPU utilization to plummet because the system is always waiting for I/O.

Q: What is a Kernel and how is it different from an OS?

The Kernel is the core part of the OS that manages hardware (CPU, Memory, Devices). The OS includes the Kernel plus the system utilities, GUI, and shell that allow a user to actually interact with the machine.

Q: How does a process's memory layout look?

A typical process memory layout from low to high: Text (code), Data (global variables), Heap (dynamically allocated, grows upward), Stack (local variables, grows downward). This layout allows the heap and stack to expand toward each other.

Q: What is the difference between preemptive and non-preemptive scheduling?

Preemptive scheduling: the OS can interrupt a running process (via timer interrupt) and reassign the CPU to another. Non-preemptive: a process yields CPU voluntarily (by I/O, termination, or calling yield). Preemptive provides better responsiveness but requires more complex synchronization.

Latency spikes (2ms to >5s) from swap thrashing: JVM heap exceeded container limit.

Naren Founder & Principal Engineer

20+ years shipping production systems from the metal up. Lessons pulled from things that broke in production.

✓ Production tested · last updated July 19, 2026 · 2,466 articles, all by Naren

Before you start⏱ 25 min

✓Solid grasp of fundamentals
✓Comfortable reading code examples
✓Basic production concepts

⚡Quick Answer

The OS coordinates hardware resource sharing among competing processes and threads.
Processes own isolated memory; threads share heap and code within a process.
Context switching processes flushes TLB; threads switch faster but risk races.
Virtual memory maps pages to frames; page faults trigger disk I/O.
Thrashing occurs when working set exceeds RAM — system spends more time swapping than computing.

✦ Definition~90s read

What Are OS Interview Questions?

This article dissects a classic OS interview question that asks what happens when you allocate a heap larger than physical RAM. The trap is thinking 'it just swaps' — the real answer is that an oversized heap triggers a swap storm, where the OS thrashes between memory and disk, collapsing throughput by orders of magnitude.

It tests whether you understand that virtual memory is not free, that page faults have real latency (a major fault costs roughly 10 ms on a spinning disk or around 100 µs on an SSD, versus about 100 ns for a RAM access), and that under memory pressure the OS's page replacement algorithm (e.g., the LRU approximation in Linux) will evict pages you are actively using to make room for heap pages you rarely touch. The question separates engineers who've debugged production OOM situations from those who only know textbook definitions.

The article then builds the foundation you need to answer this correctly: process vs. thread as the unit of execution (threads share address space, processes don't), CPU scheduling policies (CFS in Linux, how nice values affect time slices), and the four Coffman conditions for deadlock (mutual exclusion, hold-and-wait, no preemption, circular wait) with practical break strategies like lock ordering or trylock. It covers paged virtual memory — how the MMU translates virtual addresses through multi-level page tables (e.g., x86-64's 4-level page walk) and why TLB misses hurt.

Finally, it quantifies context switch cost: the direct cost on modern hardware is on the order of 1-10µs. A switch between threads of the same process saves and restores registers but keeps the address space, so the TLB stays valid; a process switch also changes the page table base register, which invalidates TLB entries unless the CPU tags them (PCID/ASID) and makes the indirect cost of refilling the TLB and caches much higher. It also explains how to measure this with perf stat or getrusage to avoid guessing. When you understand these pieces, you see why a 16GB heap on an 8GB machine doesn't just 'work slower'; it brings the system to its knees.

Plain-English First

Think of your operating system as the manager of a very busy restaurant kitchen. The kitchen (hardware) can only do so much at once — it has limited burners (CPU cores), counter space (RAM), and storage shelves (disk). The OS manager decides who cooks what, when, and how much counter space each chef gets. When two chefs both reach for the same knife at the same time and neither will let go — that's a deadlock. When a chef needs ingredients from the walk-in fridge but it's far away — that's like hitting disk swap instead of RAM. Every OS concept maps back to this one idea: fairly and efficiently sharing limited resources among competing demands.

Operating system questions are the great equaliser in technical interviews. Whether you're going for a backend role, a systems position, or a cloud engineering job, interviewers reach for OS concepts because they reveal whether you actually understand what happens beneath your code — or whether you've just been writing for loops and calling it engineering. A candidate who understands why a context switch is expensive will write fundamentally different (and better) concurrent code than one who doesn't.

The OS bridges the gap between raw hardware and the applications we write every day. It solves an otherwise impossible coordination problem: dozens of programs all want the CPU, all want memory, all want to read files simultaneously — and the OS makes that work without them knowing about each other. Without it, every application would need to implement its own hardware drivers, scheduling logic, and memory allocation — chaos.

By the end of this article you'll be able to answer the most commonly asked OS interview questions with confidence and depth. You'll understand not just what processes, threads, scheduling, deadlocks, and virtual memory are, but why they were designed that way — which is what separates a good answer from a great one in any technical interview.

What an OS Interview Question About Heap Oversizing Actually Tests

An OS interview question about oversized heaps triggering swap storms tests your understanding of how virtual memory and garbage collection interact under memory pressure. The core mechanic: when a Java heap exceeds available physical RAM, the OS starts swapping pages between RAM and disk. The GC must then traverse swapped-out objects, causing major page faults that stall threads on disk I/O, which is orders of magnitude slower than RAM access (microseconds to milliseconds versus roughly 100 ns). This creates a feedback loop where GC time skyrockets, throughput collapses, and the system becomes unresponsive.

Key properties: swap storms are not a GC tuning problem; they are a physical memory provisioning failure. Even with a perfectly tuned GC, if the heap is larger than available RAM minus OS and application overhead, the system will thrash. The JVM's -Xmx flag caps the heap, but the OS does not check that cap against physical RAM; it just swaps. Monitoring tools like vmstat show the si and so columns spiking, while GC logs show long pauses where wall-clock (real) time far exceeds user plus sys CPU time, with no clear GC cause.

When to use this knowledge: in any production sizing decision, especially for latency-sensitive services. The rule is simple: total heap + metaspace + thread stacks + OS cache must fit comfortably in physical RAM. If you oversubscribe, you get swap storms — not gradual degradation, but sudden collapse. This is why teams set -Xmx to 80% of container memory and monitor vmstat proactively.
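The fit check above is simple arithmetic, and a small sketch can make it concrete. The class name HeapBudget, the 4 GiB container, and the metaspace, thread-count, stack-size, and native-memory figures below are illustrative assumptions, not measurements from any real service.

```java
/** Hypothetical helper: checks whether a JVM's memory components fit inside a container limit. */
final class HeapBudget {
    static final long MIB = 1024L * 1024L;

    /** Largest heap (bytes) when the heap may use at most heapFraction of the container. */
    static long maxHeapBytes(long containerBytes, double heapFraction) {
        return (long) (containerBytes * heapFraction);
    }

    /** Total footprint: heap + metaspace + (threads x stack size) + other native memory. */
    static long footprintBytes(long heap, long metaspace, int threads,
                               long stackPerThread, long otherNative) {
        return heap + metaspace + (long) threads * stackPerThread + otherNative;
    }

    /** True when the whole footprint stays within the container limit. */
    static boolean fits(long footprint, long containerBytes) {
        return footprint <= containerBytes;
    }

    public static void main(String[] args) {
        long container = 4096 * MIB;                 // assumed 4 GiB container
        long heap = maxHeapBytes(container, 0.80);   // the 80% rule of thumb from the text
        // Assumed: 256 MiB metaspace, 200 threads with 1 MiB stacks, 128 MiB other native memory
        long total = footprintBytes(heap, 256 * MIB, 200, MIB, 128 * MIB);
        System.out.printf("heap=%d MiB total=%d MiB fits=%b%n",
                heap / MIB, total / MIB, fits(total, container));

        // The incident pattern: an 8 GiB heap on the same 4 GiB container cannot fit
        long oversized = footprintBytes(8192 * MIB, 256 * MIB, 200, MIB, 128 * MIB);
        System.out.println("8 GiB heap fits: " + fits(oversized, container));
    }
}
```

On a real JVM the cap is normally expressed with -Xmx or -XX:MaxRAMPercentage rather than computed by hand; the sketch only shows why the remaining headroom has to cover everything that is not heap.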

⚠ Swap Is Not a Safety Valve

Swap is not a fallback for heap oversizing; it's a performance cliff. Once swapping starts, GC pauses can jump from milliseconds to seconds, and recovery often requires restarting the JVM.

📊 Production Insight

A team set -Xmx8g on a 4GB container, thinking swap would absorb spikes. Within minutes, GC pauses hit 30 seconds, health checks failed, and the entire cluster was restarted.

Symptom: GC logs show long pauses with no GC cause, vmstat shows si/so > 1000 KiB/s, and application latency spikes from 10ms to 10s.

Rule of thumb: never set heap larger than 80% of container memory; monitor si/so in production and alert if sustained > 100 KiB/s.

🎯 Key Takeaway

Swap storms are a memory provisioning failure, not a GC tuning problem.

Total heap + metaspace + thread stacks + OS cache must fit in physical RAM.

Monitor vmstat si/so and set heap to ≤80% of container memory to avoid the cliff.

thecodeforge.io

Os Interview Questions

Process vs. Thread: The Unit of Execution

One of the most frequent senior-level questions is the architectural difference between a Process and a Thread. A Process is an independent program in execution with its own dedicated memory space (Stack, Heap, Data). A Thread is the smallest unit of execution within a process; all threads of a single process share the same Heap and Code segment but have their own separate Stacks.

From an interviewer's perspective, the 'aha!' moment comes when you discuss Context Switching overhead. Switching between processes is expensive because the OS must switch page tables, which invalidates TLB entries (unless the CPU tags them with PCID/ASID) and leaves the incoming process with cold CPU caches. Switching between threads of the same process is 'cheaper' because the address space stays the same, but shared memory introduces the risk of race conditions, requiring careful synchronization using Mutexes or Semaphores.

io/thecodeforge/os/ConcurrencyDemo.java · JAVA

package io.thecodeforge.os;

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/**
 * TheCodeForge — Demonstrating Thread vs Process mental model.
 * Threads share the same 'Heap' (the counter in this example).
 */
public class ConcurrencyDemo {
    private static int sharedCounter = 0;

    public static void main(String[] args) {
        // Using a Thread Pool to manage units of execution.
        // try-with-resources on ExecutorService needs Java 19+; close() waits for submitted tasks.
        try (ExecutorService executor = Executors.newFixedThreadPool(2)) {
            for (int i = 0; i < 1000; i++) {
                executor.submit(() -> {
                    // Critical Section: Without synchronization, this is a Race Condition
                    synchronized (ConcurrencyDemo.class) {
                        sharedCounter++;
                    }
                });
            }
        }
        System.out.println("Final Shared Counter: " + sharedCounter);
    }
}

Output

Final Shared Counter: 1000

🔥Forge Tip: Zombie Processes

A 'Zombie' is a process that has finished execution but still has an entry in the process table. It happens because the parent hasn't read its exit status yet (via wait()). Zombies consume almost no memory, but they do consume a PID, and if the process table fills up, no new processes can start.

📊 Production Insight

Thousands of threads in a JVM cause high context switching, not parallelism.

One heavily-threaded app increased CPU from 30% to 95% just on switching.

Rule: keep thread count <= number of cores for CPU-bound tasks.

🎯 Key Takeaway

Threads share memory, processes don't.

Switching between processes costs more than switching between threads of the same process, because the address space (and with it the TLB contents) changes.

Threads need synchronization; processes use IPC (pipes, sockets).

CPU Scheduling: How the OS Decides Who Runs Next

The CPU scheduler decides which process in the ready queue gets the CPU. Senior interview questions go beyond naming algorithms — they expect you to talk about trade-offs: throughput vs response time, fairness vs efficiency.

Round Robin (quantum = 10-100ms) is the most common time-sharing scheduler. It's fair but can have high switching overhead if the quantum is too small. The Completely Fair Scheduler (CFS) in Linux keeps runnable tasks in a red-black tree ordered by virtual runtime and targets a weighted fair share based on 'nice' values (kernel 6.6 replaced CFS with EEVDF as the default). For deadline-driven real-time tasks, Linux offers SCHED_DEADLINE, which is based on Earliest Deadline First (EDF).

A key production insight: Batch jobs can starve interactive tasks if the scheduler isn't tuned. On CFS kernels, lowering kernel.sched_latency_ns and kernel.sched_min_granularity_ns can reduce tail latency in mixed workloads (kernels from 5.13 onward moved these knobs from sysctl to debugfs).

io/thecodeforge/os/scheduler_tuning.sh · BASH

# TheCodeForge — View and tune Linux CPU scheduler parameters

# Check current scheduler metrics (for a specific PID's task group)
cat /proc/<pid>/sched

# Check overall scheduler statistics
cat /proc/schedstat

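# Tune CFS latency (default 6ms, can reduce for low-latency apps)
# Note: these sysctl files exist on kernels before 5.13; newer kernels expose
# the knobs under /sys/kernel/debug/sched/ instead.
echo 3000000 > /proc/sys/kernel/sched_latency_ns   # 3ms
echo 750000 > /proc/sys/kernel/sched_min_granularity_ns

# Set a real-time scheduling policy for a running process (requires root).
# Caution: a runaway SCHED_FIFO task at priority 99 can starve everything else on its core.
chrt -f -p 99 <pid>   # SCHED_FIFO priority 99
chrt -r -p 50 <pid>   # SCHED_RR priority 50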

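Output (illustrative only; real /proc/schedstat output is a version line followed by per-CPU rows of raw counters)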

cpu0 : 123456789 active / 987654321 idle

cpu1 : 234567890 active / 876543210 idle

Mental Model

Scheduling as Traffic Control

Think of the scheduler as an air traffic controller — it decides which plane gets the runway and for how long.

Processes are flights requesting takeoff (ready queue).
Round Robin = planes take turns for fixed slots; fair but wastes time on taxi.
Priority Scheduling = priority flights go first; lower-priority flights can starve.
CFS = each flight gets a proportional share based on weight (nice value).
Context switch = time to move plane from gate to runway; too many switches jams the airport.
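The proportional share that CFS aims for can be approximated in a few lines. This is a rough sketch, assuming the commonly cited rule that a task at nice 0 has weight 1024 and each nice step changes the weight by a factor of about 1.25; the kernel itself uses a precomputed integer table, and the class and method names here are made up for illustration.

```java
/** Hypothetical sketch of nice-weighted CPU shares for tasks competing on one core. */
final class CfsShare {
    /** Approximate load weight: 1024 at nice 0, scaled by about 1.25x per nice step. */
    static double weight(int nice) {
        return 1024.0 / Math.pow(1.25, nice);
    }

    /** Fraction of CPU time each task would receive if all of them stayed runnable. */
    static double[] shares(int... niceValues) {
        double total = 0;
        for (int n : niceValues) {
            total += weight(n);
        }
        double[] out = new double[niceValues.length];
        for (int i = 0; i < niceValues.length; i++) {
            out[i] = weight(niceValues[i]) / total;
        }
        return out;
    }

    public static void main(String[] args) {
        double[] s = shares(0, 5);   // an interactive task at nice 0 vs a batch job at nice 5
        System.out.printf("nice 0: %.1f%%  nice 5: %.1f%%%n", s[0] * 100, s[1] * 100);
    }
}
```

The useful takeaway is that nice values change relative shares, not absolute time slices: two tasks at the same nice level always split the core evenly, whatever that level is.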

📊 Production Insight

A web server with 500 threads on 8 cores spends 40% CPU on context switching.

Switch to NIO (epoll) and reduce threads to 8 — same throughput, 1/10 the CPU.

Always measure with vmstat -w 1 and pidstat -w.

🎯 Key Takeaway

Scheduling algorithms trade fairness for throughput.

Round Robin is simple but wastes CPU on too-frequent switching.

CFS was the Linux default until kernel 6.6 replaced it with EEVDF; understand the latency targets of whichever scheduler your kernel runs.
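The Round Robin trade-off can be put into numbers with a one-line model. The sketch below assumes every task uses its full quantum and that each quantum is followed by exactly one switch of fixed cost; the 5µs switch cost is an assumed figure, and QuantumOverhead is a hypothetical name.

```java
/** Hypothetical model of how quantum length affects useful CPU time under Round Robin. */
final class QuantumOverhead {
    /** Useful fraction of CPU time: quantum / (quantum + switch cost). */
    static double efficiency(double quantumMicros, double switchCostMicros) {
        return quantumMicros / (quantumMicros + switchCostMicros);
    }

    public static void main(String[] args) {
        double switchCost = 5.0;   // assumed direct cost of one context switch, in microseconds
        for (double quantum : new double[] {10, 100, 1_000, 10_000, 100_000}) {
            System.out.printf("quantum=%8.0f us -> useful CPU %.2f%%%n",
                    quantum, efficiency(quantum, switchCost) * 100);
        }
    }
}
```

The model ignores cache and TLB refill after each switch, so real efficiency at small quanta is lower than it suggests.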

Deadlock: The Four Conditions and How to Break Them

A deadlock happens when two or more processes are each waiting for a resource that the other holds. The Coffman conditions must all hold simultaneously: Mutual Exclusion, Hold and Wait, No Preemption, Circular Wait. In interviews, you need to explain each and then discuss prevention, avoidance, detection, and recovery.

Prevention: Break one condition. For example, allow preemption (force a process to release resource) or require all resources upfront (Hold and Wait broken). Avoidance: Use Banker's Algorithm or resource allocation graph analysis. Detection: Build a wait-for graph and look for cycles. Recovery: Kill one process or preempt resources.

Production example: A Java application using two database connections with mixed ordering caused a deadlock that brought down a payment service. The fix was to always acquire connections in a fixed global order.

io/thecodeforge/os/DeadlockDemo.java · JAVA

package io.thecodeforge.os;

import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

/**
 * TheCodeForge: Demonstrates a classic lock-ordering deadlock.
 * Thread 1 takes lockA then lockB; thread 2 takes them in the opposite order.
 */
public class DeadlockDemo {
    private static final Lock lockA = new ReentrantLock();
    private static final Lock lockB = new ReentrantLock();

    public static void main(String[] args) {
        Thread t1 = new Thread(() -> deadlockSequence1());
        Thread t2 = new Thread(() -> deadlockSequence2());
        t1.start(); t2.start();
    }

    // Holds lockA while waiting for lockB.
    static void deadlockSequence1() {
        lockA.lock();
        try {
            pause(50);   // give the other thread time to take its first lock
            lockB.lock();
            try { System.out.println("Thread1 got both locks"); }
            finally { lockB.unlock(); }
        } finally { lockA.unlock(); }
    }

    // Holds lockB while waiting for lockA: the opposite order completes the circular wait.
    static void deadlockSequence2() {
        lockB.lock();
        try {
            pause(50);
            lockA.lock();
            try { System.out.println("Thread2 got both locks"); }
            finally { lockA.unlock(); }
        } finally { lockB.unlock(); }
    }

    private static void pause(long millis) {
        try { Thread.sleep(millis); }
        catch (InterruptedException e) { Thread.currentThread().interrupt(); }
    }
}

Output

(will hang — no output, threads stuck)
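The fixed-order fix mentioned above can be sketched as follows. This is a minimal illustration, not the production code from the incident: the class name OrderedLockDemo and the completion counter are assumptions added so the result can be checked.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical counterpart to the deadlock demo: every thread locks A before B. */
final class OrderedLockDemo {
    private static final Lock lockA = new ReentrantLock();
    private static final Lock lockB = new ReentrantLock();
    private static final AtomicInteger completed = new AtomicInteger();

    /** Both threads run this method, so the acquisition order is always A then B. */
    static void withBothLocks(String name) {
        lockA.lock();
        try {
            lockB.lock();
            try {
                completed.incrementAndGet();
                System.out.println(name + " got both locks");
            } finally {
                lockB.unlock();
            }
        } finally {
            lockA.unlock();
        }
    }

    /** Starts two competing threads, waits for both, and returns how many finished. */
    static int runOnce() {
        completed.set(0);
        Thread t1 = new Thread(() -> withBothLocks("Thread1"));
        Thread t2 = new Thread(() -> withBothLocks("Thread2"));
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println("Threads completed: " + runOnce());
    }
}
```

Because no thread can hold lockB while waiting for lockA, the circular-wait condition can never form. When a global order is impractical, Lock.tryLock with a timeout is the usual alternative: it breaks hold-and-wait by backing off instead of blocking forever.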

📊 Production Insight

Deadlocks in production are quiet — app just stops processing requests.

We once had a database deadlock caused by two services updating the same table in opposite orders.

Fix: enforce global ordering of table access across all services.

🎯 Key Takeaway

Four conditions required: Mutual Exclusion, Hold and Wait, No Preemption, Circular Wait.

Break any one to prevent deadlock.

Fixed ordering of locks is the simplest prevention — use it.

Choosing a Deadlock Strategy

If the system is safety-critical and cannot tolerate restarts → use avoidance (Banker's Algorithm) or prevention (fixed locking order)

If deadlocks are rare but costly when they happen → use detection + recovery (kill process, preempt resource), typical in DBMS

If resources are shareable (e.g., read-only files) → no deadlock is possible, so skip prevention

If the system can afford preemption and rollback (e.g., transactions) → use transaction timeouts and deadlock detection in the DB
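Detection, as described earlier in this section, comes down to finding a cycle in a wait-for graph. The sketch below is one simple way to do that with a depth-first search; WaitForGraph and its method names are hypothetical, and real database engines use their own, more elaborate detectors.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical wait-for graph: an edge T1 -> T2 means T1 waits for a resource T2 holds. */
final class WaitForGraph {
    private final Map<String, Set<String>> waitsFor = new HashMap<>();

    void addEdge(String waiter, String holder) {
        waitsFor.computeIfAbsent(waiter, k -> new HashSet<>()).add(holder);
    }

    /** Returns true if any cycle exists, which means the tasks on it are deadlocked. */
    boolean hasCycle() {
        Set<String> done = new HashSet<>();     // nodes fully explored, known cycle-free
        Set<String> onPath = new HashSet<>();   // nodes on the current DFS path
        for (String node : waitsFor.keySet()) {
            if (dfs(node, done, onPath)) {
                return true;
            }
        }
        return false;
    }

    private boolean dfs(String node, Set<String> done, Set<String> onPath) {
        if (onPath.contains(node)) return true;   // back edge: we returned to our own path
        if (done.contains(node)) return false;
        onPath.add(node);
        for (String next : waitsFor.getOrDefault(node, Collections.emptySet())) {
            if (dfs(next, done, onPath)) return true;
        }
        onPath.remove(node);
        done.add(node);
        return false;
    }

    public static void main(String[] args) {
        WaitForGraph g = new WaitForGraph();
        g.addEdge("T1", "T2");
        g.addEdge("T2", "T1");   // T1 and T2 wait on each other
        System.out.println("Deadlock detected: " + g.hasCycle());
    }
}
```

Inside a JVM you rarely need to build this yourself: ThreadMXBean.findDeadlockedThreads() reports threads deadlocked on monitors or ownable synchronizers, and a jstack thread dump prints the same information.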

Memory Management: Paging and Virtual Memory

Why doesn't your app crash the moment you run out of physical RAM? The answer is Virtual Memory. The OS gives every process the illusion that it has a large, contiguous block of memory. In reality, this memory is broken into fixed-size 'Pages'. The OS maps these Virtual Pages to physical 'Frames' in RAM using a Page Table.
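The page-to-frame mapping can be illustrated with a toy translation routine. This is a deliberately simplified sketch: it assumes 4 KiB pages and a single-level table held in a Map, whereas real hardware walks multi-level tables; PageTranslation and its methods are hypothetical names.

```java
import java.util.Map;

/** Hypothetical single-level page table lookup with 4 KiB pages. */
final class PageTranslation {
    static final int PAGE_SHIFT = 12;                 // 4 KiB pages: the low 12 bits are the offset
    static final long PAGE_SIZE = 1L << PAGE_SHIFT;

    static long pageNumber(long virtualAddress) {
        return virtualAddress >>> PAGE_SHIFT;
    }

    static long offset(long virtualAddress) {
        return virtualAddress & (PAGE_SIZE - 1);
    }

    /** Returns the physical address, or -1 when the page is not resident (a page fault). */
    static long translate(Map<Long, Long> pageTable, long virtualAddress) {
        Long frame = pageTable.get(pageNumber(virtualAddress));
        if (frame == null) {
            return -1;                                // the OS would now handle the fault
        }
        return (frame << PAGE_SHIFT) | offset(virtualAddress);
    }

    public static void main(String[] args) {
        Map<Long, Long> table = Map.of(0x12L, 0x7L);  // virtual page 0x12 lives in frame 0x7
        System.out.printf("0x12345 -> 0x%x%n", translate(table, 0x12345L));
        System.out.println("0x99000 -> " + translate(table, 0x99000L));
    }
}
```

The offset passes through unchanged; only the page number is replaced by a frame number, which is why contiguous virtual memory does not need contiguous physical frames.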

When a program tries to access a page that isn't currently in RAM, a Page Fault occurs. For a major fault, the OS must fetch the page from disk (swap space or the backing file) before the instruction can resume; a minor fault only needs a page-table update. Senior engineers are expected to know that frequent major page faults lead to Thrashing, where the system spends more time swapping pages than actually executing code.
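The cliff between "fits in RAM" and "thrashing" shows up clearly in a small simulation. The sketch below assumes strict LRU replacement (Linux uses an approximation) and a program that cycles over its pages in a loop; LruPagingSim is a hypothetical name.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical page-fault counter using strict LRU eviction. */
final class LruPagingSim {
    /** Counts faults for a page reference sequence given a fixed number of physical frames. */
    static int countFaults(int[] references, int frames) {
        // accessOrder=true keeps entries ordered from least to most recently used
        LinkedHashMap<Integer, Boolean> resident =
                new LinkedHashMap<Integer, Boolean>(16, 0.75f, true) {
                    @Override
                    protected boolean removeEldestEntry(Map.Entry<Integer, Boolean> eldest) {
                        return size() > frames;   // evict the LRU page once frames are full
                    }
                };
        int faults = 0;
        for (int page : references) {
            if (resident.get(page) == null) {     // not resident: page fault
                faults++;
                resident.put(page, Boolean.TRUE);
            }
        }
        return faults;
    }

    /** Builds a sequence that touches pages 0..pages-1 in order, repeated 'rounds' times. */
    static int[] cyclic(int pages, int rounds) {
        int[] refs = new int[pages * rounds];
        for (int i = 0; i < refs.length; i++) {
            refs[i] = i % pages;
        }
        return refs;
    }

    public static void main(String[] args) {
        System.out.println("4 pages, 4 frames: " + countFaults(cyclic(4, 10), 4) + " faults");
        System.out.println("5 pages, 4 frames: " + countFaults(cyclic(5, 10), 4) + " faults");
    }
}
```

With one page more than there are frames, LRU evicts exactly the page that is needed next, so every access faults. That is the same shape as the production failure: degradation is not proportional to how far the working set exceeds RAM.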

io/thecodeforge/os/check_memory.sh · BASH

# TheCodeForge — Production OS Diagnostics

# 1. Check system virtual memory statistics (Look for 'si' and 'so' - swap in/out)
vmstat 1 5

# 2. View process-specific memory mappings (Stack, Heap, Libraries)
# Replace [PID] with your Java application's process ID
head -n 10 /proc/[PID]/maps

# 3. Check for OOM (Out Of Memory) kills in system logs
dmesg | grep -iE "oom-kill|out of memory"

Output

procs -----------memory---------- ---swap--

r b swpd free buff cache si so

1 0 0 824512 45120 125410 0 0

⚠ Interview Gold: The TLB

When asked how memory mapping is kept fast, mention the Translation Lookaside Buffer (TLB). It's a hardware cache for the Page Table. A TLB miss is the hidden cost behind frequent context switching.

📊 Production Insight

A TLB miss costs ~10-100 cycles; a major page fault costs millions.

Rapid context switching between many processes keeps invalidating TLB entries, so each process restarts with a cold TLB.

Use huge pages (2MB or 1GB) to reduce TLB miss rate for large data sets.
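Why huge pages help is easiest to see as "TLB reach": the amount of memory the TLB can map without a miss. The entry count of 1536 below is an assumed figure for illustration (real sizes vary by CPU and TLB level), and TlbReach is a hypothetical name.

```java
/** Hypothetical TLB reach calculator: entries x page size. */
final class TlbReach {
    static final long KIB = 1024L;
    static final long MIB = 1024L * KIB;

    /** Bytes of address space covered when every TLB entry maps one page of the given size. */
    static long reachBytes(int entries, long pageSizeBytes) {
        return (long) entries * pageSizeBytes;
    }

    public static void main(String[] args) {
        int entries = 1536;   // assumed TLB size, not a measured value
        System.out.println("4 KiB pages: " + reachBytes(entries, 4 * KIB) / MIB + " MiB");
        System.out.println("2 MiB pages: " + reachBytes(entries, 2 * MIB) / MIB + " MiB");
    }
}
```

Under that assumption a multi-gigabyte heap cannot be covered with 4 KiB pages but largely can with 2 MiB pages, which is the reasoning behind enabling huge pages for large JVM heaps.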

🎯 Key Takeaway

Virtual memory = page tables + swap.

Page faults are expensive — avoid thrashing by sizing working set to fit RAM.

TLB misses degrade performance more than most engineers realise.

The Cost of Context Switching and How to Measure It

A context switch is the OS's act of saving the state of one process/thread and loading another. It's not just CPU registers: for a process switch the page tables change and TLB entries are invalidated (unless tagged with PCID/ASID), CPU caches must warm up again, and the kernel scheduler runs. A direct context switch (process) can cost 1-10 microseconds, but the indirect costs (cache misses) can add 100s of microseconds to subsequent instructions.

In production, high context switching often means your thread pool is too large. A typical Java web server with sync I/O and 200 threads on 8 cores can spend a large share of its CPU time switching rather than executing. The fix: tune thread pool size to the number of cores for CPU-bound tasks; for I/O-bound tasks use threads = cores / (1 - blocking coefficient), where the blocking coefficient is the fraction of time a task spends waiting.
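The sizing formula above is easy to turn into a helper. This is a minimal sketch: PoolSizing is a hypothetical name, the blocking coefficient has to be estimated or measured for your own workload, and the result is a starting point for load testing rather than a final answer.

```java
/** Hypothetical helper applying threads = cores / (1 - blocking coefficient). */
final class PoolSizing {
    /**
     * @param cores               CPU cores available to the process
     * @param blockingCoefficient fraction of time a task spends waiting, in [0, 1)
     */
    static int poolSize(int cores, double blockingCoefficient) {
        if (cores < 1 || blockingCoefficient < 0.0 || blockingCoefficient >= 1.0) {
            throw new IllegalArgumentException("cores >= 1 and 0 <= coefficient < 1 required");
        }
        long size = Math.round(cores / (1.0 - blockingCoefficient));
        return (int) Math.max(1L, size);
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("CPU-bound pool:       " + poolSize(cores, 0.0));
        System.out.println("50% blocking pool:    " + poolSize(cores, 0.5));
        System.out.println("90% blocking pool:    " + poolSize(cores, 0.9));
    }
}
```

A coefficient of 0 gives one thread per core; as tasks spend more time blocked, the pool grows so that enough threads are runnable to keep the cores busy.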

Tools: vmstat -w 1 shows context switches per second (cs column). pidstat -w shows per-process voluntary vs involuntary switches. perf stat -e context-switches gives precise counts.

io/thecodeforge/os/measure_ctxt.sh · BASH

# TheCodeForge — Measure context switch overhead

# 1. Watch global context switch rate
vmstat -w 1 | awk '{print $12}' # cs is the 12th column

# 2. Per-process context switches (voluntary and involuntary)
pidstat -w -p <PID> 1 5

# 3. Count context switches and CPU migrations for one run of a program
# (this reports counts, not per-switch cost; taskset pins the run to one core to reduce noise)
taskset -c 0 perf stat -e context-switches,cpu-migrations ./your_app

Output

procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----

r b swpd free buff cache si so bi bo in cs us sy id wa st

1 0 0 124512 45120 125410 0 0 0 0 100 500 12 5 83 0 0
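The same counters can be read from inside a Java process on Linux, since /proc/&lt;pid&gt;/status exposes voluntary_ctxt_switches and nonvoluntary_ctxt_switches. The sketch below is a minimal parser; the class name CtxtSwitches is hypothetical, and the main method simply reports nothing useful on systems without /proc.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical reader for context-switch counters in /proc/<pid>/status text. */
final class CtxtSwitches {
    /** Returns the numeric value of a "name:<tab>value" line, or -1 if the field is absent. */
    static long field(String statusText, String name) {
        for (String line : statusText.split("\n")) {
            if (line.startsWith(name + ":")) {
                return Long.parseLong(line.substring(name.length() + 1).trim());
            }
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        Path status = Path.of("/proc/self/status");
        if (!Files.isReadable(status)) {
            System.out.println("/proc/self/status not available on this system");
            return;
        }
        String text = Files.readString(status);
        // Voluntary: the thread blocked (I/O, lock). Nonvoluntary: the scheduler preempted it.
        System.out.println("voluntary:    " + field(text, "voluntary_ctxt_switches"));
        System.out.println("nonvoluntary: " + field(text, "nonvoluntary_ctxt_switches"));
    }
}
```

Note that /proc/self/status covers only the main thread; per-thread counters live under /proc/&lt;pid&gt;/task/&lt;tid&gt;/status. A high nonvoluntary count relative to voluntary suggests more runnable threads than cores.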

📊 Production Insight

We saw a service with 500 threads on 4 cores — context switches were 250k/s.

Switched to async I/O (Netty) with 8 worker threads — switches dropped to 5k/s, latency halved.

Always correlate cs with us/sy — if sy (system CPU) is high, you're likely switching too much.

🎯 Key Takeaway

Measure before you optimise: vmstat, pidstat -w.

Too many threads = tax on context switching, not productivity.

Async I/O eliminates most context switches for I/O-bound apps.

Multithreading: Why Shared State Is the Root of All Production Evil

Multithreading looks good on paper. You split work across threads, and the CPU finishes faster. That's the promise. The reality is that threads share memory. That shared heap becomes a minefield the moment two threads write to the same variable. You get data races, corrupted state, and bugs that disappear the second you attach a debugger. Interviewers ask about multithreading because they want to know if you've been burned. They want you to explain that threads give you parallelism at the cost of synchronization complexity. The pros are clear: better resource utilization, lower latency on I/O-bound tasks, and simpler modeling for concurrent workflows. But the cons bite you in production. You need locks, semaphores, or lock-free data structures. You need to understand happens-before relationships. Without that, your multithreaded code is just a race condition waiting to deploy.

InventoryService.java · JAVA

// io.thecodeforge.core.inventory
import java.util.concurrent.atomic.AtomicInteger;

public class InventoryService {
    // NEVER use a plain int here. Two threads decrement => lost update.
    private final AtomicInteger stock = new AtomicInteger(100);

    public boolean purchase(int quantity) {
        // A CAS loop never blocks; it usually beats synchronized at low to moderate contention
        while (true) {
            int current = stock.get();
            if (current < quantity) return false;
            if (stock.compareAndSet(current, current - quantity)) return true;
        }
    }
}

Output

// Single-thread correctness: purchase(3) returns true, stock=97

// Without AtomicInteger: two concurrent purchase(3) calls could both return true yet leave stock=97 (lost update)
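A quick way to convince yourself the compare-and-set loop holds up is to hammer it from several threads and count the successes. The harness below is a sketch with assumed numbers (100 units of stock, 8 threads, 1000 attempts); OversellCheck is a hypothetical name, and the purchase logic is inlined so the example stands alone.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stress harness for a CAS-based stock decrement. */
final class OversellCheck {
    /** Same compare-and-set loop as the service above, taking the counter as a parameter. */
    static boolean purchase(AtomicInteger stock, int quantity) {
        while (true) {
            int current = stock.get();
            if (current < quantity) return false;
            if (stock.compareAndSet(current, current - quantity)) return true;
        }
    }

    /** Submits single-unit purchases from a thread pool and returns how many succeeded. */
    static int hammer(int initialStock, int threads, int attempts) {
        AtomicInteger stock = new AtomicInteger(initialStock);
        AtomicInteger sold = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < attempts; i++) {
            pool.submit(() -> {
                if (purchase(stock, 1)) sold.incrementAndGet();
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        if (stock.get() < 0) throw new IllegalStateException("oversold");
        return sold.get();
    }

    public static void main(String[] args) {
        System.out.println("Units sold: " + hammer(100, 8, 1000));
    }
}
```

However the threads interleave, the number of successful purchases equals the initial stock and the counter never goes negative. Swapping the AtomicInteger for a plain int with a check-then-decrement is the experiment that makes lost updates visible.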

⚠ Production Trap:

You tested threading on a 4-core laptop and it passed. On a 64-core server, far more threads truly run in parallel, so a race window only nanoseconds wide gets hit under load. Always stress-test with thread counts that match your production topology.

🎯 Key Takeaway

Threads share heap, not safety. Always use explicit synchronization or lock-free primitives for shared mutable state.

Thrashing: When the OS Eats Itself Alive

Thrashing is what happens when the paging system collapses under its own weight. Your system has too many processes running, and their combined working set exceeds physical RAM. The OS ends up swapping pages in and out almost continuously. CPU utilization plummets because the disk I/O for paging dominates. In the classic textbook model, the scheduler sees low CPU usage and admits more processes, making the thrashing worse. It's a positive feedback loop that kills throughput. Interviewers ask about thrashing to see if you understand virtual memory's failure mode. Buying more RAM treats the symptom; the structural fix is limiting the degree of multiprogramming or applying the working set model, so a process runs only when its working set fits in memory. When memory pressure hits, terminate processes or suspend them to disk entirely. Otherwise, your server becomes a disk thrash machine serving zero useful requests.
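The working set model referred to above can be sketched in a few lines: a process's working set is the set of distinct pages it touched in its last few references, and the system is overcommitted when the working sets together need more frames than exist. The class name, window size, and frame counts below are illustrative assumptions.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical working set calculator over a page reference trace. */
final class WorkingSet {
    /** Number of distinct pages among the last 'window' references. */
    static int size(int[] references, int window) {
        Set<Integer> distinct = new HashSet<>();
        int start = Math.max(0, references.length - window);
        for (int i = start; i < references.length; i++) {
            distinct.add(references[i]);
        }
        return distinct.size();
    }

    /** True when the summed working sets need more frames than physical memory provides. */
    static boolean overcommitted(int[] workingSetSizes, int totalFrames) {
        int demand = 0;
        for (int s : workingSetSizes) {
            demand += s;
        }
        return demand > totalFrames;
    }

    public static void main(String[] args) {
        int[] trace = {1, 2, 3, 1, 2, 1, 1, 2};
        System.out.println("Working set (window 4): " + size(trace, 4) + " pages");
        // Three processes needing 40, 50, and 30 frames on a machine with 100 frames
        System.out.println("Overcommitted: " + overcommitted(new int[] {40, 50, 30}, 100));
    }
}
```

Under this model the remedy is admission control: suspend one process so the remaining working sets fit, rather than letting all of them fault against each other.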

detect_thrashing.sh · BASH

#!/bin/bash
# io.thecodeforge.observability
# Check for thrashing: high swap activity + low CPU utilization

while true; do
  # Note: mpstat reports I/O wait separately from %idle; on a thrashing box also watch %iowait
  CPU_IDLE=$(mpstat 1 1 | awk '/Average/ {print $NF}')
  # Read si and so (KiB/s) from the same vmstat sample so the two values are consistent
  read -r SWAP_IN SWAP_OUT < <(vmstat 1 2 | tail -1 | awk '{print $7, $8}')

  if (( $(echo "$CPU_IDLE > 80" | bc -l) )) && [[ $SWAP_IN -gt 1000 || $SWAP_OUT -gt 1000 ]]; then
    echo "WARNING: Thrashing detected. CPU idle ${CPU_IDLE}%, swap in ${SWAP_IN}, swap out ${SWAP_OUT}"
    echo "Action: kill the memory-hungry process or reduce process count"
    break
  fi
  sleep 5
done

Output

WARNING: Thrashing detected. CPU idle 92.5%, swap in 2450, swap out 3100

Action: kill the memory-hungry process or reduce process count

💡Sysadmin Wisdom:

Monitor 'si' and 'so' columns in vmstat. If both are consistently above 0, you're swapping. If they're above 1000 KiB/s and CPU idle (or I/O wait) is above 80%, you're thrashing. Fix it before the OOM killer picks your DB.

🎯 Key Takeaway

Thrashing means the paging system is working harder than the CPU. Limit concurrent processes or use working set tracking to break the cycle.

Systems Design Interview OS Questions: Containers, Virtualization

Systems design interviews often probe OS knowledge through containers and virtualization. Containers (e.g., Docker) share the host OS kernel but isolate processes via namespaces and cgroups. Virtualization (e.g., KVM) uses a hypervisor to run multiple guest OSes on virtual hardware. A common question: 'How does a container use less memory than a VM?' Answer: Containers avoid duplicating the OS kernel and leverage copy-on-write filesystems. For example, running 10 containers of the same image uses one kernel and shared read-only layers, while 10 VMs each run a full OS. Another question: 'What happens when a container runs out of memory?' With cgroups, the OOM killer terminates processes within the container, not the host. Practical example: In Kubernetes, setting memory limits triggers OOM kills if exceeded. Understanding these mechanisms helps design scalable, resource-efficient systems.

container_memory_limit.yaml · YAML

apiVersion: v1
kind: Pod
metadata:
  name: memory-demo
spec:
  containers:
  - name: mem-container
    image: polinux/stress
    resources:
      limits:
        memory: "200Mi"
      requests:
        memory: "100Mi"
    command: ["stress"]
    args: ["--vm", "1", "--vm-bytes", "250M", "--vm-hang", "1"]
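From inside the container, a JVM can see the limit that the manifest above sets. The sketch below assumes cgroup v2, where the limit is exposed in /sys/fs/cgroup/memory.max as either a byte count or the word "max"; CgroupLimit is a hypothetical name, and on cgroup v1 hosts the file lives elsewhere.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical reader for the cgroup v2 memory limit visible inside a container. */
final class CgroupLimit {
    /** Parses memory.max content: a byte count, or "max" meaning no limit. */
    static long parseMemoryMax(String content) {
        String value = content.trim();
        return value.equals("max") ? Long.MAX_VALUE : Long.parseLong(value);
    }

    public static void main(String[] args) throws IOException {
        Path file = Path.of("/sys/fs/cgroup/memory.max");   // cgroup v2 path (assumed)
        if (Files.isReadable(file)) {
            long limit = parseMemoryMax(Files.readString(file));
            System.out.println("cgroup memory limit: "
                    + (limit == Long.MAX_VALUE ? "unlimited" : limit + " bytes"));
        } else {
            System.out.println("No cgroup v2 memory.max found");
        }
        // With container support enabled, the default max heap is derived from the cgroup limit
        System.out.println("JVM max heap: " + Runtime.getRuntime().maxMemory() + " bytes");
    }
}
```

Comparing the two printed values is a fast sanity check in an incident: if the JVM's max heap is close to or above the cgroup limit, the container is one allocation spike away from an OOM kill or, where swap is enabled, a swap storm.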

📊 Production Insight

In production, set container memory limits to prevent noisy neighbors; use resource quotas in Kubernetes to avoid one container starving others.

🎯 Key Takeaway

Containers achieve efficiency by sharing the host OS kernel and using namespaces/cgroups for isolation, while VMs provide full OS isolation at higher resource cost.

Linux Troubleshooting Interview Scenarios

Interviewers often present real-world Linux issues to test debugging skills. Common scenarios: high CPU usage, memory leaks, disk I/O bottlenecks, and network latency. For high CPU, use 'top' to find the process, then 'strace -p <pid>' to trace system calls, or 'perf top' for kernel-level profiling. Example: A Java app consuming 100% CPU: run 'jstack' to capture thread dumps and identify infinite loops or contention. For memory leaks, use 'valgrind' or 'heaptrack' on C/C++ apps, or 'jmap' for Java. Disk I/O issues: 'iostat -x 1' shows await and %util; 'iotop' identifies processes. Network: 'tcpdump' captures packets, 'ss -tuln' checks listening ports, 'netstat -s' shows statistics. A classic scenario: 'A server is slow; where do you start?' Answer: Check 'uptime' for load average, 'free -h' for memory, 'df -h' for disk space, then drill down with 'top' and 'dmesg'. Practical example: A runaway process filling disk: 'lsof | grep deleted' finds files that were deleted but are still held open; restarting or killing the owning process releases the space.

troubleshoot_high_cpu.sh · BASH

#!/bin/bash
# Find top CPU consumer and trace system calls
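PID=$(ps aux --sort=-%cpu | head -2 | tail -1 | awk '{print $2}')
echo "High CPU process PID: $PID"
# strace -c only prints its summary when it detaches, so stop it after 10 seconds
timeout -s INT 10 strace -c -S time -p "$PID" 2>&1 | head -20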

📊 Production Insight

Set up monitoring with alerts (e.g., Prometheus + Grafana) for CPU, memory, disk, and network to catch issues before they escalate.

🎯 Key Takeaway

Systematic troubleshooting: check system load, memory, disk, and network; then use process-specific tools like strace, perf, and lsof to pinpoint issues.

eBPF and Observability Interview Questions

eBPF (extended Berkeley Packet Filter) is a revolutionary technology for observability, networking, and security. Interviewers ask about its use cases: tracing system calls, profiling CPU, monitoring network packets, and security auditing. eBPF programs run sandboxed in the kernel and are checked by a verifier before loading, allowing safe introspection. Common tools: bcc (BPF Compiler Collection) and bpftrace. Example: 'How to trace file opens system-wide?' Use a bpftrace one-liner: bpftrace -e 'tracepoint:syscalls:sys_enter_openat { printf("%s %s\n", comm, str(args->filename)); }'. Another question: 'How does eBPF improve observability over traditional tools?' eBPF provides low-overhead, dynamic tracing without modifying applications. For performance, eBPF can profile CPU stack traces with the 'profile' tool. Practical example: Debugging a slow database query: use eBPF to trace MySQL query execution time by attaching a uprobe to 'dispatch_command'. eBPF also powers Cilium for Kubernetes networking and security policies. Understanding eBPF's architecture (maps, helpers, verifier) is key for advanced interviews.

trace_open.bpftrace · BPFTRACE

#!/usr/bin/bpftrace
// Trace all openat syscalls with process name and filename
tracepoint:syscalls:sys_enter_openat
{
    printf("%s opened %s\n", comm, str(args->filename));
}

📊 Production Insight

Use eBPF-based tools like Cilium for network observability and security in Kubernetes, and bcc tools for real-time performance troubleshooting.

🎯 Key Takeaway

eBPF enables safe, efficient kernel-level tracing and observability, revolutionizing debugging and performance analysis without application changes.

● Production incident · POST-MORTEM · severity: high

The Silent Swap Storm: How an Oversized JVM Heap Took Down a Trading Platform

Symptom

Latency spikes from 2ms to >5s every few minutes; CPU usage hovered at 100% but processes were not compute-bound; swap usage (si/so) was high in vmstat; no OOM killer activity.

Assumption

The team assumed Java GC was causing the pauses because they saw frequent GC logs. They increased heap size and tuned GC settings, making the problem worse.

Root cause

The Java heap (4GB) was configured larger than the container memory limit (2GB). Pages were constantly swapped in and out because the OS tried to keep the entire heap backed by physical RAM. The working set of the JVM (hot pages) exceeded available frames, causing continuous page faults. The JVM's GC thread also competed for memory, triggering additional faults.

Fix

Reduced Java heap to 1.5GB (below container memory limit), enabled compressed OOPs, pinned critical memory (mlockall for certain native buffers), and set vm.swappiness=1 to avoid swapping unless absolutely necessary. Added monitoring on sar -B page fault rates.

Key lesson

Never size Java heap larger than the container memory limit — the OS will swap and kill performance.
Watch vmstat si/so — if both are non-zero continuously, you're thrashing.
Container memory limits don't protect against swap inside the container; keep -XX:+UseContainerSupport enabled (the default since JDK 10) and respect cgroup limits.

Production debug guide: Symptom → Action guide for common OS-level production failures (4 entries)

Symptom · 01

High context switching (≥ 100k/s per core)

→

Fix

Check /proc/<pid>/status for voluntary vs involuntary switches; review thread count; reduce thread pool size or use I/O multiplexing (epoll, io_uring).

Symptom · 02

Unexpected OOM killer activity

→

Fix

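Run dmesg | grep -i oom; compare /proc/<pid>/oom_score across processes to see which one the kernel would pick next; for strict accounting set vm.overcommit_memory=2 and tune vm.overcommit_ratio, which only takes effect in that mode.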
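Ranking processes by oom_score can be sketched as follows. The function name top_oom_scores and its two parameters are assumptions made for illustration; the proc root is passed in as an argument so the helper can be pointed at a test directory as well as /proc.

```shell
#!/bin/sh
# Hypothetical helper: list "oom_score pid" pairs, highest score first.
# $1 = proc root (normally /proc), $2 = number of entries to show.
top_oom_scores() {
    for f in "$1"/[0-9]*/oom_score; do
        [ -r "$f" ] || continue
        pid=${f%/oom_score}
        pid=${pid##*/}
        # A process can exit between the glob and the read; skip it if so.
        score=$(cat "$f" 2>/dev/null) || continue
        echo "$score $pid"
    done | sort -rn | head -n "$2"
}

# Example: show the five most likely OOM victims on a live system.
if [ -d /proc/1 ]; then
    top_oom_scores /proc 5
fi
```

The process at the top of the list is the one the OOM killer would most likely choose, which is often a more useful starting point than the process that happened to request memory last.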

Symptom · 03

High system CPU but low user CPU

→

Fix

Use strace -c -p <pid> to spot excessive syscalls; check how interrupts are distributed across CPUs in /proc/interrupts; consider a tickless (NO_HZ) kernel if timer interrupts on otherwise idle CPUs dominate.

Symptom · 04

Application response time degrades after adding more threads

→

Fix

Measure context switch rate; calculate CPU spent switching vs actual work; redesign with async I/O or reactor pattern.
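The "CPU spent switching vs actual work" estimate is simple arithmetic, sketched here. The helper name switch_overhead_pct is hypothetical, and the per-switch cost is an assumed input: the 5 microseconds used in the example is only a placeholder, so measure the real cost on your own hardware.

```shell
#!/bin/sh
# Hypothetical helper: percentage of one core-second spent context switching.
# $1 = context switches per second, $2 = assumed cost per switch in microseconds.
switch_overhead_pct() {
    # switches/s * us/switch = microseconds of switching per second;
    # one second is 1,000,000 us, so dividing by 10,000 gives a percentage.
    echo $(( $1 * $2 / 10000 ))
}

# 100k switches/s at an assumed 5 us each costs half a core.
switch_overhead_pct 100000 5   # prints 50
```

Feed it the cs column from vmstat or the per-process rates from pidstat -w. If the result is a large share of a core, adding threads is costing more than it gains.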

★ Quick OS Debug Cheat Sheet

Common symptoms, immediate actions, and canonical commands for OS-level production issues.

Process hangs or is in D state (uninterruptible sleep)

Immediate action

Check `/proc/<pid>/stack` for kernel wait chain.

Commands

cat /proc/<pid>/stack | head -20

dmesg | tail -30 (look for hung_task or IO errors)

Fix now

Identify and fix the underlying I/O issue, which is often a dead NFS mount or a faulty disk. Use echo w > /proc/sysrq-trigger to dump the stacks of all blocked (uninterruptible) tasks to the kernel log.
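Before reading kernel stacks, it helps to list which tasks are actually stuck. A minimal sketch, assuming procps-style ps output with the state in the second column; filter_d_state is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: filter "ps -eo pid,stat,wchan,comm" style lines on
# stdin, keeping the header row and every task whose state starts with D.
filter_d_state() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# Example (not run here): the wchan column hints at what each task waits on.
#   ps -eo pid,stat,wchan:32,comm | filter_d_state
```

Tasks that stay in the list across several runs are the ones worth inspecting through /proc/<pid>/stack.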

High load average but low CPU usage

Sudden out-of-memory, process killed but no OOM entry

Process vs Thread Comparison

Feature	Process	Thread
Memory	Isolated (Own Address Space)	Shared (Common Address Space)
Switching Cost	High (Requires TLB flush)	Low (No memory map change)
Communication	Inter-Process (IPC, Sockets, Pipes)	Shared Variables (Fast, but needs sync)
Resilience	If one crashes, others survive	If one crashes, the whole process might die

⚙ Quick Reference

10 commands from this guide

File	Command / Code	Purpose
iothecodeforgeosConcurrencyDemo.java	/**	Process vs. Thread
iothecodeforgeosscheduler_tuning.sh	cat /proc/<pid>/sched	CPU Scheduling
iothecodeforgeosDeadlockDemo.java	/**	Deadlock
iothecodeforgeoscheck_memory.sh	vmstat 1 5	Memory Management
iothecodeforgeosmeasure_ctxt.sh	vmstat -w 1 | awk '{print $12}' # cs column	The Cost of Context Switching and How to Measure It
InventoryService.java	public class InventoryService {	Multithreading
detect_thrashing.sh	while true; do	Thrashing
container_memory_limit.yaml	apiVersion: v1	Systems Design Interview OS Questions
troubleshoot_high_cpu.sh	PID=$(ps aux --sort=-%cpu | head -2 | tail -1 | awk '{print $2}')	Linux Troubleshooting Interview Scenarios
trace_open.bpftrace	tracepoint:syscalls:sys_enter_openat	eBPF and Observability Interview Questions

Key takeaways

The OS coordinates processes and threads, isolating memory for safety while allowing cheap communication via shared memory.

Context switching processes is expensive (TLB flush); threads are lighter but require synchronization.

Virtual memory with paging enables running larger-than-RAM programs but can thrash if working set exceeds physical memory.

CPU scheduling balances fairness and throughput; for latency-sensitive workloads, tune scheduler parameters such as sched_latency_ns on kernels that still expose it.

Deadlocks require four conditions; break one to prevent. Fixed locking order is the simplest prevention.

Measure production performance with vmstat, pidstat, and perf; don't guess.

INTERVIEW PREP · PRACTICE MODE

Interview Questions on This Topic

Q01 · JUNIOR

What is the difference between a Hard Link and a Soft Link (Symbolic Lin...

Q02 · SENIOR

Explain the 'Starvation' problem in Priority Scheduling. How does 'Aging...

Q03 · SENIOR

What happens during a System Call (Trap)? Describe the transition from U...

Q04 · SENIOR

Describe the 'Dining Philosophers' problem. How would you implement a so...

Q05 · SENIOR

What is 'DMA' (Direct Memory Access) and why is it critical for high-per...

Q01 of 05 · JUNIOR

What is the difference between a Hard Link and a Soft Link (Symbolic Link) at the filesystem level?

ANSWER

A hard link is an additional directory entry pointing to the same inode; both share the same data blocks and inode number. Deleting one hard link does not delete the data until the last link is removed. A soft link (symlink) is a separate file that contains a path to the target. If the target is deleted, the symlink becomes dangling. Hard links cannot cross filesystem boundaries or link to directories by default.
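The behaviour described in this answer can be observed in a throwaway directory. This is a small sketch; the function name link_demo and the file names are illustrative, and it assumes the temporary directory lives on a filesystem that supports both hard links and symlinks.

```shell
#!/bin/sh
# Print the inode number of a path.
inode_of() { ls -i "$1" | awk '{ print $1 }'; }

# Hypothetical demo: prints "<same inode?> <hard link content> <symlink state>".
link_demo() {
    demo_dir=$(mktemp -d) || return 1
    echo "payload" > "$demo_dir/original.txt"

    ln    "$demo_dir/original.txt" "$demo_dir/hard.txt"   # second name, same inode
    ln -s "$demo_dir/original.txt" "$demo_dir/soft.txt"   # separate file storing a path

    if [ "$(inode_of "$demo_dir/original.txt")" = "$(inode_of "$demo_dir/hard.txt")" ]; then
        same_inode=yes
    else
        same_inode=no
    fi

    # Remove the original name: the data survives through the hard link,
    # while the symlink now points at a path that no longer exists.
    rm "$demo_dir/original.txt"
    hard_content=$(cat "$demo_dir/hard.txt")

    if [ -L "$demo_dir/soft.txt" ] && [ ! -e "$demo_dir/soft.txt" ]; then
        soft_state=dangling
    else
        soft_state=ok
    fi

    echo "$same_inode $hard_content $soft_state"
    rm -rf "$demo_dir"
}

link_demo
```

On a typical Linux filesystem this prints yes payload dangling: the two names share one inode, the data outlives the original name, and the symlink breaks as soon as its target path disappears.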

