Memory Management Thrashing — 120GB Working Set on 64GB
A Spark job's 120GB working set on a 64GB server caused thrashing: 85% of CPU time went to handling page faults, and query latency rose from 200ms to over 4 minutes.
- Memory management is the OS subsystem that allocates, tracks, and reclaims memory for running processes
- Physical memory is shared among processes; each process gets a private virtual address space
- Paging maps virtual pages to physical frames using page tables, enabling isolation and sparse use
- Virtual memory extends RAM to disk via swapping, with page replacement algorithms deciding what to evict
- Performance trap: TLB misses on context switch hurt throughput more than page faults in many workloads
- Biggest mistake: assuming virtual addresses match physical addresses — they never do in modern OSes
Imagine your desk is the computer's RAM — it's the space where you actually do your work. Your OS is the office manager who decides which papers (programs) get desk space, where they go, and who gets kicked off when the desk is full. When the desk overflows, the manager quietly moves older papers to a filing cabinet (your hard drive) and brings them back when needed — you barely notice. That swap between desk and cabinet is exactly what virtual memory does.
Every program you run — a browser, a game, a database — needs memory to breathe. Without a fair, structured way to hand out that memory, one misbehaving app could read your bank app's data, a crashed process could corrupt the entire system, and you'd never be able to run more than one program at a time. Memory management is the silent contract that makes modern computing safe and multi-tasking possible.
The problem it solves is deceptively deep. Physical RAM is finite and shared. Process A shouldn't be able to peek into Process B's address space. The OS needs to allocate memory fast, reclaim it when a process exits, and give each program the illusion that it owns all the memory in the world — even when RAM is nearly full. Without a memory manager, none of that is possible.
By the end of this article you'll understand how the OS partitions memory, why paging replaced older schemes, how virtual memory lets your laptop run 40 browser tabs on 8 GB of RAM, and which memory management questions come up in system-design and OS interviews. Let's dig in.
What is Memory Management in OS?
Memory management is the OS subsystem responsible for allocating and deallocating memory to processes, tracking which parts of memory are free or in use, and providing isolation so that one process cannot access another's data. At its core, it solves three problems: sharing of finite physical RAM, protection between processes, and translation from virtual to physical addresses. Every modern OS—Linux, Windows, macOS—implements memory management as part of the kernel, using hardware support from the CPU's MMU (Memory Management Unit). Without it, a bug in a browser could corrupt a password manager in memory, and multiprogramming would be impossible.
Think of physical memory as an apartment building:
- The building has limited rooms (physical frames).
- Each tenant has their own numbering system (virtual addresses).
- The landlord (MMU) translates the tenant's room number (virtual address) into a real room (physical address).
- Many tenants can share a common area (shared memory).
- If too many tenants arrive, the landlord moves some stuff to a storage locker (swap).
Memory Allocation Strategies: Contiguous vs Non-Contiguous
Early operating systems used contiguous allocation: a process gets a single block of physical memory. That led to fragmentation and made it hard to grow a process or to fit a new one into scattered free space. Modern OSes use non-contiguous allocation via paging. Within a process, allocation requests (malloc) are served by a user-space heap allocator, which carves small blocks out of larger virtual regions it obtains from the kernel (for example via brk or mmap). The two main strategies are:
- Contiguous (Fixed Partitioning): Each process gets a fixed-size block; simple, but the unused space inside each block is wasted (internal fragmentation). Variable-size partitions avoid that but leave unusable gaps between blocks (external fragmentation). Not used in general-purpose OSes.
- Non-Contiguous (Paging): Physical memory is split into fixed-size frames; processes get virtual pages that can map to any frame. This eliminates external fragmentation and enables virtual memory.
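Additionally, segmentation allows variable-sized logical chunks (code, data, stack) but suffers from external fragmentation unless combined with paging. Some older designs, such as 32-bit x86, combined the two as paged segmentation, where segments are further broken into pages. On x86_64, Linux and Windows use an essentially flat segment model and rely on paging for isolation and translation.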
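To make the fragmentation difference concrete, here is a minimal sketch. The free list, the sizes, and the `first_fit` helper are all hypothetical and only illustrate the idea; a real allocator tracks far more state.

```python
def first_fit(free_holes, request):
    """Return the index of the first hole that can hold `request` units, or None."""
    for i, (_start, size) in enumerate(free_holes):
        if size >= request:
            return i
    return None


# Hypothetical free list after several processes exited: (start, size) in MB.
holes = [(0, 2), (5, 1), (9, 2)]

total_free = sum(size for _start, size in holes)

# Contiguous allocation: 5 MB are free in total, yet a 3 MB request fails
# because no single hole is large enough (external fragmentation).
contiguous_ok = first_fit(holes, 3) is not None

# Paging: any free frame will do, so only the total number of free frames matters.
FRAME_MB = 1  # assumed frame size, chosen only to keep the arithmetic simple
paged_ok = total_free // FRAME_MB >= 3

print(total_free, contiguous_ok, paged_ok)
```

With paging the same 5 MB of free memory satisfies the request because the three frames do not have to be adjacent.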
Paging: The Mechanism Behind Virtual Memory
Paging divides virtual memory into fixed-size blocks called pages (typically 4KB on x86_64) and physical memory into frames of the same size. Each process has a page table that maps virtual page numbers to physical frame numbers. The MMU uses this page table to translate every memory access from a virtual to a physical address. When a process accesses a page that is not currently in physical memory, a page fault occurs and the OS loads the page from swap or from its backing file. Loading pages only when they are first touched, rather than up front, is called demand paging.
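The translation step can be sketched in a few lines. This is a toy single-level model under stated assumptions: the page table is a plain dictionary, the frame numbers are made up, and `PageFault` is a hypothetical exception standing in for the hardware trap.

```python
PAGE_SIZE = 4096   # 4 KB pages
OFFSET_BITS = 12   # log2(PAGE_SIZE)


class PageFault(Exception):
    """Stand-in for the trap raised when a page is not resident."""


def translate(page_table, vaddr):
    """Translate a virtual address using a {virtual page number: frame number} map."""
    vpn = vaddr >> OFFSET_BITS          # high bits select the page
    offset = vaddr & (PAGE_SIZE - 1)    # low 12 bits pass through unchanged
    if vpn not in page_table:
        raise PageFault(f"virtual page {vpn} is not resident")
    return (page_table[vpn] << OFFSET_BITS) | offset


# Hypothetical mapping: virtual page 0 -> frame 7, virtual page 1 -> frame 3.
page_table = {0: 7, 1: 3}

paddr = translate(page_table, 0x1ABC)   # page 1, offset 0xABC
print(hex(paddr))                        # 0x3abc
```

Only the page number is translated; the offset within the page is identical in the virtual and physical address.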
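Page tables themselves are hierarchical (e.g., 4-level page tables on x86_64) to avoid needing a flat table with billions of entries. The Translation Lookaside Buffer (TLB) caches recent translations to speed up address translation. On a context switch to a different process the old translations are no longer valid, so the TLB is flushed unless the CPU tags entries with an address-space identifier (PCID on x86_64, ASID on ARM). Either way, the incoming process starts with few useful entries, which is one reason frequent context switching hurts performance.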
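As a rough illustration of the hierarchy, the sketch below splits a 48-bit x86_64 virtual address into its four 9-bit table indices and 12-bit page offset. The function name is hypothetical; the bit layout is the standard 4-level one with 4KB pages.

```python
def split_x86_64(vaddr):
    """Split a 48-bit virtual address into four 9-bit indices and a 12-bit offset."""
    offset = vaddr & 0xFFF          # bits 0-11: byte within the 4 KB page
    pt = (vaddr >> 12) & 0x1FF      # bits 12-20: page table index
    pd = (vaddr >> 21) & 0x1FF      # bits 21-29: page directory index
    pdpt = (vaddr >> 30) & 0x1FF    # bits 30-38: page directory pointer index
    pml4 = (vaddr >> 39) & 0x1FF    # bits 39-47: top-level (PML4) index
    return pml4, pdpt, pd, pt, offset


# A flat table covering 2^48 bytes with 4 KB pages would need 2^36 entries.
flat_entries = 2**48 // 4096

# Build an address from known indices and split it again.
example = (1 << 39) | (2 << 30) | (3 << 21) | (4 << 12) | 5
print(split_x86_64(example))   # (1, 2, 3, 4, 5)
print(flat_entries)            # 68719476736
```

Each level holds 512 entries, and lower-level tables only need to exist for regions the process actually uses, which is what keeps the structure small for sparse address spaces.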
Use perf stat -e page-faults to count page faults.
Huge pages (HugeTLB or transparent huge pages) can reduce TLB misses by 10-30% for large working sets.
Rule: For databases and VMs, explicit huge pages (HugeTLB) are more predictable than transparent huge pages (THP).
Virtual Memory: The Illusion of Infinite Memory
Virtual memory extends the concept of paging: each process gets a full virtual address space (e.g., 2^48 bytes, or 256 TiB, with 4-level paging on x86_64), but only the parts actively needed are backed by physical memory. The rest is either not yet allocated, still in its backing file, or swapped out to disk. When a process accesses a virtual address that is not in RAM, the OS loads the corresponding page into a free frame; this is demand paging. If no free frames exist, the OS first evicts a page chosen by a replacement algorithm (LRU, Clock, etc.), writing it to disk if it has been modified. Virtual memory enables:
- Running programs larger than physical RAM.
- Sharing libraries and memory-mapped files.
- Copy-on-write (COW) forking.
The key components: page table, swap space (disk area), page replacement algorithm, and the page fault handler.
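A toy model can show why a huge virtual address space is cheap. In the sketch below, `LazyAddressSpace` is a hypothetical class that allocates a frame only when a page is first touched; real kernels do this in the page fault handler, not in user code.

```python
PAGE_SIZE = 4096


class LazyAddressSpace:
    """Toy model: a frame is allocated only when its page is first touched."""

    def __init__(self):
        self.resident = {}   # virtual page number -> bytearray standing in for a frame
        self.faults = 0

    def _page(self, vaddr):
        vpn = vaddr // PAGE_SIZE
        if vpn not in self.resident:
            # First touch: count a fault and hand out a zero-filled frame.
            self.faults += 1
            self.resident[vpn] = bytearray(PAGE_SIZE)
        return self.resident[vpn]

    def write(self, vaddr, value):
        self._page(vaddr)[vaddr % PAGE_SIZE] = value

    def read(self, vaddr):
        return self._page(vaddr)[vaddr % PAGE_SIZE]


space = LazyAddressSpace()
space.write(0x0000_1000, 42)        # first touch of one page
space.write(0x7FFF_0000_0000, 7)    # a page near the top of the 47-bit user range
space.write(0x0000_1008, 1)         # same page as the first write: no new fault

resident_bytes = len(space.resident) * PAGE_SIZE
print(space.faults, resident_bytes)  # 2 8192
```

The two addresses are terabytes apart, yet only two frames are ever backed by memory.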
Think of RAM as a library's open shelves and swap as its basement:
- The library catalog (page table) tells where each book is.
- A book on the shelf = page in RAM; in the basement = on swap.
- Frequent trips to the basement (thrashing) mean patrons are referencing books that keep getting removed.
- To avoid thrashing, ensure the total working set of active patrons fits on the shelves.
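The "does the working set fit" check is simple arithmetic. The sketch below applies it to the 120GB job on the 64GB server from the incident above; the function names and the job list are hypothetical, and real working-set sizes have to be measured rather than assumed.

```python
def overcommit_ratio(working_sets_gb, ram_gb):
    """Ratio of combined working set to RAM; above 1.0 the set cannot stay resident."""
    return sum(working_sets_gb) / ram_gb


def admit(jobs, ram_gb):
    """Admit jobs in order while their combined working set still fits in RAM."""
    admitted, used = [], 0
    for name, working_set_gb in jobs:
        if used + working_set_gb <= ram_gb:
            admitted.append(name)
            used += working_set_gb
    return admitted


# The incident: one 120 GB working set on a 64 GB machine.
ratio = overcommit_ratio([120], ram_gb=64)
print(ratio)   # 1.875

# Hypothetical mix of jobs: (name, working set in GB).
jobs = [("api", 20), ("batch", 50), ("cache", 30)]
print(admit(jobs, ram_gb=64))   # ['api', 'cache']
```

Holding back the job that does not fit is the same idea as lowering the degree of multiprogramming in the working set model.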
Use sar -B to monitor pgpgin/s and pgpgout/s.
Copy-on-write (COW) after fork can double memory use if the child modifies many pages. When the child only needs to exec a new program, use vfork or posix_spawn to avoid the COW overhead.
Rule: Set vm.max_map_count appropriately; the default (65530) is too low for large-memory processes like Elasticsearch.
Page Replacement Algorithms: Which Pages Get Evicted?
When physical memory is full and a page fault occurs, the OS must evict a victim page to disk. The replacement algorithm determines which page to remove. The goal is to minimize future page faults by evicting pages unlikely to be used soon.
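LRU (Least Recently Used): Evict the page not accessed for the longest time. Exact LRU would need a timestamp or list update on every memory access, which is too expensive, so kernels approximate it using hardware reference bits (e.g., the Clock algorithm).
Clock (Second Chance): Keep pages in a circular list, each with a reference bit. A hand sweeps through the list: a page whose bit is set has it cleared and is skipped; the first page whose bit is already clear is evicted. An efficient approximation of LRU.
Working Set Model: Estimate the set of pages a process is actively using; only keep that set in RAM. Prevents thrashing by adjusting the degree of multiprogramming.
Other algorithms: FIFO (simple, but suffers from Belady's anomaly, where adding frames can increase the number of page faults) and Optimal (evicts the page whose next use is farthest in the future; unimplementable because it needs future knowledge, so it serves as a baseline for comparison).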
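The algorithms above can be compared with a small simulation. This is a sketch, not how a kernel implements them: the function names are hypothetical, the Clock variant sets the reference bit when a page is loaded (other variants leave it clear), and the reference string is the classic textbook example for Belady's anomaly.

```python
from collections import OrderedDict, deque


def fifo_faults(refs, n_frames):
    """Count page faults when the oldest loaded page is always evicted."""
    frames, queue, faults = set(), deque(), 0
    for page in refs:
        if page in frames:
            continue
        faults += 1
        if len(frames) == n_frames:
            frames.discard(queue.popleft())
        frames.add(page)
        queue.append(page)
    return faults


def lru_faults(refs, n_frames):
    """Count page faults with exact LRU (ordered from least to most recently used)."""
    frames, faults = OrderedDict(), 0
    for page in refs:
        if page in frames:
            frames.move_to_end(page)
            continue
        faults += 1
        if len(frames) == n_frames:
            frames.popitem(last=False)
        frames[page] = True
    return faults


def clock_faults(refs, n_frames):
    """Count page faults with the Clock (second chance) approximation of LRU."""
    pages = [None] * n_frames
    ref_bit = [False] * n_frames
    where = {}                      # page -> frame index
    hand = faults = 0
    for page in refs:
        if page in where:
            ref_bit[where[page]] = True
            continue
        faults += 1
        while ref_bit[hand]:        # give referenced pages a second chance
            ref_bit[hand] = False
            hand = (hand + 1) % n_frames
        if pages[hand] is not None:
            del where[pages[hand]]
        pages[hand], where[page] = page, hand
        ref_bit[hand] = True        # assumption: bit is set on load
        hand = (hand + 1) % n_frames
    return faults


refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(fifo_faults(refs, 3), fifo_faults(refs, 4))  # 9 10: more frames, more faults
print(lru_faults(refs, 3), lru_faults(refs, 4))    # 10 8: LRU never shows the anomaly
```

FIFO gets worse when a fourth frame is added, which is Belady's anomaly; LRU is a stack algorithm, so its fault count cannot increase as frames are added.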
The Thrashed Production Server — When the OS Spends More Time Swapping Than Working
Set memory.soft_limit_in_bytes (a cgroup v1 control) in the batch job's cgroup to deprioritize batch jobs before they cause thrashing.
- Thrashing happens when the total working set exceeds physical RAM, not when RAM is 'full'; always monitor page fault rates (ps -eo min_flt,maj_flt)
- Use cgroups and ulimits to prevent one rogue process from starving the system
- Set vm.swappiness low (1-10) on latency-sensitive servers so the kernel reclaims page cache before it swaps out application pages
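One way to monitor fault rates system-wide is to sample the pgmajfault counter in /proc/vmstat twice and take the difference. The sketch below parses text in that format; the two samples and the alert threshold are made-up values for illustration, and a sensible threshold depends on the workload and storage.

```python
def parse_vmstat(text):
    """Parse /proc/vmstat-style 'name value' lines into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def major_faults_per_sec(before, after, interval_s):
    """Major page faults per second between two counter snapshots."""
    return (after["pgmajfault"] - before["pgmajfault"]) / interval_s


# Hypothetical samples taken 10 seconds apart.
sample_t0 = "pgfault 1000000\npgmajfault 5000\n"
sample_t1 = "pgfault 1400000\npgmajfault 25000\n"

rate = major_faults_per_sec(parse_vmstat(sample_t0), parse_vmstat(sample_t1), 10)

ALERT_THRESHOLD = 1000   # assumed value, tune per system
thrashing_suspected = rate > ALERT_THRESHOLD
print(rate, thrashing_suspected)   # 2000.0 True
```

On a Linux host the sample strings would be replaced by the contents of /proc/vmstat read at two points in time.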
Key takeaways
Common mistakes to avoid
- Confusing virtual address space with physical memory usage: check ps -eo pid,rss,vsz and focus on RSS. Virtual memory is cheap; physical memory is expensive.
- Relying on swap to save you from running out of memory: watch sar -S and alert if swap usage > 10% of RAM.
- Forgetting to set per-process memory limits
- Assuming huge pages are always faster
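The gap between virtual size and resident size, the first mistake above, is easy to see from /proc/<pid>/status on Linux. The sketch below parses text in that format; the sample values are invented for illustration, and the function name is hypothetical.

```python
def parse_status_kb(text):
    """Extract VmSize and VmRSS (in kB) from /proc/<pid>/status-style text."""
    fields = {}
    for line in text.splitlines():
        key, _sep, rest = line.partition(":")
        if key in ("VmSize", "VmRSS"):
            fields[key] = int(rest.split()[0])
    return fields


# Hypothetical process: 8 GiB of virtual address space, 512 MiB resident.
sample = "Name:\tjava\nVmSize:\t 8388608 kB\nVmRSS:\t  524288 kB\n"

mem = parse_status_kb(sample)
resident_fraction = mem["VmRSS"] / mem["VmSize"]
print(mem, resident_fraction)   # {'VmSize': 8388608, 'VmRSS': 524288} 0.0625
```

Here only about 6% of the reserved address space occupies RAM, which is why alerting on virtual size produces false alarms.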
Interview Questions on This Topic
Explain how virtual memory works. What happens when a process accesses a page not in RAM?