Senior 4 min · June 25, 2026

Cache Eviction Policies: Stop Guessing, Start Predicting What Gets Kicked Out

Cache eviction policies explained with production trade-offs.

N
Naren Founder & Principal Engineer

20+ years shipping large-scale distributed systems. Written from production experience, not tutorials.

Follow
Production
production tested
June 25, 2026
last updated
1,663
articles · all by Naren
 ● Production Incident 🔎 Debug Guide ⚙ Triage Commands
Quick Answer

The most common cache eviction policies are LRU (Least Recently Used), LFU (Least Frequently Used), FIFO (First In First Out), and TTL-based expiry. Choose LRU for general-purpose caching with temporal locality, LFU for content with skewed popularity, and TTL for data that becomes stale after a fixed time.

✦ Definition~90s read
What is Cache Eviction Policies?

A cache eviction policy is the algorithm that decides which cached entry to remove when the cache is full. It's the bouncer at the club — when capacity's maxed, someone has to leave.

Imagine your desk has room for only 5 sticky notes.
Plain-English First

Imagine your desk has room for only 5 sticky notes. When you need to add a 6th, you have to throw one away. LRU means you toss the note you haven't looked at the longest. LFU means you toss the one you've touched the fewest times. FIFO means you toss the oldest note regardless of use. TTL means you set a timer on each note and it self-destructs when time's up.

Your cache is lying to you. It's full of data nobody asked for in hours, while the hot keys your users actually need are getting evicted every 30 seconds. That's not a cache — that's a memory leak with extra steps. The problem isn't your cache size. It's your eviction policy. Pick the wrong one and your hit rate tanks, your database gets hammered, and your pager goes off at 3 AM. This article gives you the decision framework to choose the right eviction policy for your workload, the gotchas that'll burn you in production, and the exact commands to debug when things go sideways.

Why Your Cache Needs a Bouncer (and Why FIFO Is the Worst Bouncer)

A cache without an eviction policy is a ticking time bomb. The moment you hit capacity, writes fail or the process OOMs. The simplest policy — FIFO — evicts the oldest entry regardless of how popular it is. That's fine for a queue, terrible for a cache. In production, I've seen FIFO cause a 90% cache miss rate because a batch job loaded a million cold keys, pushing out all the hot ones. The database got hammered, latency spiked, and the cache became a liability. The fix: never use FIFO for caching. Use it only for rate-limiting buckets or job queues where order matters.

FIFOEviction.systemdesignSYSTEMDESIGN
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
// io.thecodeforge — System Design tutorial

// Simulating FIFO eviction in a cache
class FIFOCache<K, V>(private val maxSize: Int) {
    private val queue = ArrayDeque<K>()
    private val map = mutableMapOf<K, V>()

    fun put(key: K, value: V) {
        if (map.size >= maxSize) {
            val oldest = queue.removeFirst() // O(1) eviction
            map.remove(oldest)
        }
        queue.addLast(key)
        map[key] = value
    }

    fun get(key: K): V? = map[key] // No promotion — key stays in place
}

fun main() {
    val cache = FIFOCache<String, String>(3)
    cache.put("hot1", "data1")
    cache.put("hot2", "data2")
    cache.put("hot3", "data3")
    cache.get("hot1") // Access hot1 — but FIFO doesn't care
    cache.put("cold1", "data4") // Evicts hot1, not the oldest unused
    println(cache.get("hot1")) // null — evicted despite being recently used
}
Output
null
Production Trap:
FIFO eviction in a cache causes 'cache stampede' — all hot keys evicted simultaneously, hammering the database. Use LRU or LFU instead.
Cache Eviction Policies Overview THECODEFORGE.IO Cache Eviction Policies Overview From FIFO to hybrid approaches for production caches FIFO First-in, first-out eviction LRU Least recently used eviction LFU Least frequently used eviction TTL Time-to-live expiration Hybrid Policies Combine LRU, LFU, TTL ⚠ Concurrency issues with eviction policies Use locking or striping to avoid race conditions THECODEFORGE.IO
thecodeforge.io
Cache Eviction Policies Overview
Cache Eviction Policies

LRU: The Default You Should Actually Use (But Tune)

LRU evicts the entry that hasn't been accessed the longest. It exploits temporal locality — if you accessed something recently, you'll likely access it again soon. This is the default for Redis (allkeys-lru) and memcached. But LRU has a blind spot: a one-hit wonder that was accessed once and never again will sit in the cache until it's the least recently used among all entries. That's fine for most workloads. The real gotcha: LRU requires tracking access order. In a naive implementation (linked list + hashmap), every get() causes a promotion — moving the node to the head. Under high concurrency, that's a mutex bottleneck. Redis uses an approximation (sampling) to avoid this. For your own implementation, consider using an approximate LRU with a clock algorithm or a concurrent skip list.

LRUCache.systemdesignSYSTEMDESIGN
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
// io.thecodeforge — System Design tutorial

// Thread-safe LRU cache using LinkedHashMap
class LRUCache<K, V>(private val maxSize: Int) {
    private val map = object : LinkedHashMap<K, V>(maxSize, 0.75f, true) {
        override fun removeEldestEntry(eldest: MutableMap.MutableEntry<K, V>?): Boolean {
            return size > maxSize // Auto-evict on insert
        }
    }

    @Synchronized
    fun get(key: K): V? = map[key] // Access promotes entry to end

    @Synchronized
    fun put(key: K, value: V) {
        map[key] = value
    }
}

fun main() {
    val cache = LRUCache<String, String>(3)
    cache.put("A", "1")
    cache.put("B", "2")
    cache.put("C", "3")
    cache.get("A") // Promotes A to most recent
    cache.put("D", "4") // Evicts B (least recently used)
    println(cache.get("B")) // null
}
Output
null
Senior Shortcut:
In Redis, use 'maxmemory-policy allkeys-lru' for general caching. For volatile keys with TTL, use 'volatile-lru' — it only evicts keys with TTL set, leaving persistent keys alone.
LRU Eviction in ActionTHECODEFORGE.IOLRU Eviction in ActionTemporal locality drives which entry gets kickedCache FullCapacity reached, eviction triggeredFind Least RecentScan for oldest accessed entryEvict EntryRemove the least recently used keyInsert New KeyAdd new entry, update recency orderPromote on GetMove accessed key to front⚠ One-hit wonders pollute LRU until evictedTHECODEFORGE.IO
thecodeforge.io
LRU Eviction in Action
Cache Eviction Policies

LFU: When Popularity Matters More Than Recency

LFU evicts the entry with the lowest access frequency. It's ideal for workloads where a small set of keys gets the vast majority of requests — think CDN caching, API rate limit counters, or leaderboard data. The problem: LFU has a 'cold start' issue. A new key starts with frequency 1, so it's immediately vulnerable to eviction. You need a frequency aging mechanism — periodically halve all counters to prevent old popular keys from dominating forever. Redis implements LFU via a probabilistic counter (Morris counter) with logarithmic aging. The gotcha: LFU consumes more memory per key (to store frequency) and can be tricked by bursty traffic — a key that gets 100 requests in 1 second then nothing for an hour might be kept while a steadily used key gets evicted.

LFUCache.systemdesignSYSTEMDESIGN
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
// io.thecodeforge — System Design tutorial

// Simplified LFU with min-heap for frequency tracking
class LFUCache<K, V>(private val maxSize: Int) {
    private val map = mutableMapOf<K, V>()
    private val freq = mutableMapOf<K, Int>()
    private val heap = PriorityQueue(compareBy { freq[it] }) // Min-heap by frequency

    fun put(key: K, value: V) {
        if (map.size >= maxSize && key !in map) {
            val evict = heap.poll() // Evict lowest frequency
            map.remove(evict)
            freq.remove(evict)
        }
        map[key] = value
        freq[key] = (freq[key] ?: 0) + 1
        heap.remove(key) // Reinsert to re-sort
        heap.add(key)
    }

    fun get(key: K): V? {
        val v = map[key]
        if (v != null) {
            freq[key] = freq[key]!! + 1
            heap.remove(key)
            heap.add(key)
        }
        return v
    }
}

fun main() {
    val cache = LFUCache<String, String>(3)
    cache.put("hot", "1")
    cache.get("hot")
    cache.get("hot")
    cache.put("cold", "2")
    cache.put("also_cold", "3")
    cache.put("new", "4") // Evicts 'cold' (freq=1) even though 'also_cold' is older
    println(cache.get("cold")) // null
}
Output
null
Interview Gold:
Q: When would you choose LFU over LRU? A: When access frequency is more predictive than recency — e.g., a cache of popular articles where old hits still matter. But beware the cold start problem.
LRU vs LFU: Recency vs PopularityTHECODEFORGE.IOLRU vs LFU: Recency vs PopularityWhich eviction policy fits your workload?LRU (Recency)Evicts least recently accessedBest for temporal localityOne-hit wonders lingerConcurrent promotion is costlyLFU (Frequency)Evicts least frequently accessedBest for skewed popularityCold start: new keys at riskNeeds counter aging to avoid biasChoose LRU for general use, LFU for hot-key workloadsTHECODEFORGE.IO
thecodeforge.io
LRU vs LFU: Recency vs Popularity
Cache Eviction Policies

TTL: The Self-Destruct Mechanism (and Why You Need a Purge Thread)

TTL (Time-To-Live) isn't strictly an eviction policy — it's an expiry mechanism. But combined with any eviction policy, it's essential for data that becomes stale. The classic mistake: relying on lazy expiration (checking TTL on access). If a key is never accessed, it stays in memory forever. Redis has an active expiry mechanism that samples keys periodically, but it's not guaranteed. In production, I've seen a cache grow unbounded because TTL was set but the active expiry couldn't keep up with writes. Fix: run a background purge thread that scans and deletes expired keys. Or use a time-ordered data structure (like a priority queue of expiry times) to evict precisely.

TTLCache.systemdesignSYSTEMDESIGN
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
// io.thecodeforge — System Design tutorial

// TTL cache with background purge thread
class TTLCache<K, V>(private val defaultTTL: Long) {
    private val map = mutableMapOf<K, V>()
    private val expiry = mutableMapOf<K, Long>() // Expiry timestamp
    private val purgeInterval = 1000L // ms

    init {
        thread(isDaemon = true) {
            while (true) {
                Thread.sleep(purgeInterval)
                val now = System.currentTimeMillis()
                val expired = expiry.filter { it.value <= now }.keys
                expired.forEach { map.remove(it); expiry.remove(it) }
            }
        }
    }

    fun put(key: K, value: V, ttl: Long = defaultTTL) {
        map[key] = value
        expiry[key] = System.currentTimeMillis() + ttl
    }

    fun get(key: K): V? {
        val exp = expiry[key] ?: return null
        if (System.currentTimeMillis() > exp) {
            map.remove(key)
            expiry.remove(key)
            return null
        }
        return map[key]
    }
}

fun main() {
    val cache = TTLCache<String, String>(2000)
    cache.put("temp", "data")
    Thread.sleep(2500)
    println(cache.get("temp")) // null — expired
}
Output
null
Never Do This:
Don't rely solely on lazy expiration. If a key is never accessed, it becomes a zombie. Always pair with active expiry or a max-size eviction policy.

When Eviction Policies Collide: The Hybrid Approach

Real production systems often combine policies. For example, Redis allows 'volatile-lru' — evicts among keys with TTL set, leaving persistent keys untouched. Or you can implement a two-tier cache: L1 (in-memory LRU) and L2 (Redis with LFU). The L1 handles hot keys with low latency; L2 handles warm keys with frequency-based retention. The gotcha: consistency between tiers. If L1 evicts a key, L2 might still have it stale. Use write-through or invalidation messages. Another hybrid: use TTL as the primary eviction trigger, but if the cache is full before TTL expires, fall back to LRU. This prevents stale data from lingering while still handling capacity spikes.

HybridCache.systemdesignSYSTEMDESIGN
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
// io.thecodeforge — System Design tutorial

// Hybrid: TTL + LRU fallback
class HybridCache<K, V>(private val maxSize: Int, private val defaultTTL: Long) {
    private val map = LinkedHashMap<K, Pair<V, Long>>(maxSize, 0.75f, true)

    @Synchronized
    fun put(key: K, value: V, ttl: Long = defaultTTL) {
        if (map.size >= maxSize && key !in map) {
            // Evict oldest (LRU order due to accessOrder=true)
            val eldest = map.keys.first()
            map.remove(eldest)
        }
        map[key] = Pair(value, System.currentTimeMillis() + ttl)
    }

    @Synchronized
    fun get(key: K): V? {
        val entry = map[key] ?: return null
        val (value, expiry) = entry
        if (System.currentTimeMillis() > expiry) {
            map.remove(key)
            return null
        }
        return value // Access promotes in LinkedHashMap
    }
}

fun main() {
    val cache = HybridCache<String, String>(3, 5000)
    cache.put("A", "1", 100) // Short TTL
    Thread.sleep(200)
    println(cache.get("A")) // null — expired
}
Output
null
Production Pattern:
Use TTL as the primary eviction for data with known freshness requirements. Use LRU as a safety net to prevent unbounded growth. This is the default in many CDN caches.

The Concurrency Nightmare: Locking, Striping, and Lock-Free Approaches

Eviction policies that require promotion on every get() (like LRU and LFU) are a concurrency nightmare under high throughput. A naive synchronized LRU cache becomes a bottleneck — all threads queue up for the lock. The fix: use concurrent data structures. For LRU, consider a ConcurrentLinkedHashMap (from Caffeine library) or a striped lock where each stripe handles a subset of keys. For LFU, use a concurrent frequency counter with atomic increments and a lock-free min-heap (tricky). Redis avoids this by using a single-threaded event loop — no locks needed. If you're building your own, measure before optimizing. Often, a simple synchronized cache with a hit rate >95% is fast enough because the lock is held briefly.

ConcurrentLRU.systemdesignSYSTEMDESIGN
1
2
3
4
5
6
7
8
9
10
11
12
13
14
// io.thecodeforge — System Design tutorial

// Striped LRU: 16 stripes to reduce lock contention
class StripedLRUCache<K, V>(private val maxSize: Int, private val stripes: Int = 16) {
    private val caches = Array(stripes) { LRUCache<K, V>(maxSize / stripes) }

    private fun stripe(key: K): Int = key.hashCode().mod(stripes).let { if (it < 0) it + stripes else it }

    fun get(key: K): V? = caches[stripe(key)].get(key)
    fun put(key: K, value: V) = caches[stripe(key)].put(key, value)
}

// Note: This can evict more than maxSize if keys hash unevenly.
// Use consistent hashing or a central coordinator for strict limits.
The Classic Bug:
Striped caches can exceed total maxSize if keys are unevenly distributed. Monitor per-stripe size and rebalance if needed.

When Not to Use an Eviction Policy: The Anti-Cache Pattern

Sometimes you don't want eviction at all. If your data is small enough to fit in memory, just use a hash map. If your data has a natural expiry (like session tokens), use TTL-only with no max size — let expiry handle it. Eviction policies add complexity and overhead. I've seen teams implement LRU for a cache of 100 entries — the overhead of tracking order was higher than the cost of recomputing the data. Rule of thumb: if your cache hit rate is >99% with a simple TTL, don't add eviction. If your cache is large (>1GB) and access pattern is unpredictable, then reach for LRU or LFU.

Senior Shortcut:
Start with TTL + no eviction. If memory grows unbounded, add LRU. If hit rate is still low, analyze access pattern and switch to LFU or a custom policy.
● Production incidentPOST-MORTEMseverity: high

The 4GB Container That Kept Dying

Symptom
A microservice serving product recommendations crashed every 45 minutes with OOMKilled. Restart fixed it temporarily.
Assumption
The team assumed a memory leak in the recommendation engine.
Root cause
The cache used FIFO eviction with a 2GB max. Hot products were accessed every few seconds, but cold products loaded during a batch job filled the cache and never got evicted because FIFO only evicts the oldest entry. The hot keys were evicted, causing repeated database calls that ballooned memory with new entries.
Fix
Switched to LRU eviction. Set maxmemory to 1.5GB (leaving headroom). Added maxmemory-policy allkeys-lru in Redis config.
Key lesson
  • FIFO eviction is almost never the right choice for a cache — it doesn't respect access patterns.
Production debug guideSystematic recovery paths for the failure modes engineers actually hit.3 entries
Symptom · 01
Cache miss rate spikes after deployment — from 5% to 60%
Fix
1. Check cache size: redis-cli info memory | grep used_memory_human. 2. Check eviction count: redis-cli info stats | grep evicted_keys. 3. If evictions are high, increase maxmemory or switch to a more efficient policy (e.g., allkeys-lru). 4. Analyze access pattern with redis-cli --hotkeys.
Symptom · 02
Redis OOMKilled in Kubernetes
Fix
1. Check memory limit: kubectl describe pod <redis-pod> | grep Limits. 2. Check Redis maxmemory config: redis-cli config get maxmemory. 3. Ensure maxmemory is set to 80% of container limit. 4. Set maxmemory-policy to allkeys-lru. 5. Add memory request/limit to pod spec.
Symptom · 03
Stale data served despite TTL
Fix
1. Check if TTL is set on write: redis-cli ttl <key>. 2. If TTL is -1 (no expiry), fix code to set TTL. 3. If TTL is set but stale, check for lazy expiration: expired keys linger until accessed. 4. Enable active expiry: redis-cli config set activedefrag yes. 5. Consider using a background purge thread in your app.
★ Cache Eviction Policies Triage Cheat SheetFirst-response commands for when things go wrong — copy-paste ready.
High eviction rate: `evicted_keys` > 100/s
Immediate action
Check if maxmemory is too low or policy is wrong.
Commands
redis-cli info stats | grep evicted_keys
redis-cli config get maxmemory
Fix now
Increase maxmemory by 20% or switch to allkeys-lru: redis-cli config set maxmemory-policy allkeys-lru
Cache miss rate > 50%+
Immediate action
Check if working set fits in cache.
Commands
redis-cli --hotkeys
redis-cli info keyspace
Fix now
Increase maxmemory or switch to LFU if access is skewed: redis-cli config set maxmemory-policy allkeys-lfu
Memory grows unbounded despite TTL+
Immediate action
Check for keys without TTL.
Commands
redis-cli info keyspace
redis-cli --bigkeys
Fix now
Set TTL on all writes. Add maxmemory with eviction policy as safety net.
Write failures: `OOM command not allowed when used memory > 'maxmemory'`+
Immediate action
Eviction policy is set to noeviction.
Commands
redis-cli config get maxmemory-policy
redis-cli config set maxmemory-policy allkeys-lru
Fix now
Change policy to allkeys-lru or volatile-lru. Monitor eviction rate after change.
PolicyEviction CriteriaMemory OverheadConcurrency CostBest For
FIFOOldest insertion timeLow (queue)LowQueues, rate limiters
LRULeast recently accessedMedium (order tracking)High (promotion)General caching, temporal locality
LFULeast frequently accessedHigh (frequency counter)High (frequency update)Skewed popularity, CDN
TTLExpiry timeLow (timestamp)LowStale data, sessions

Key takeaways

1
FIFO is for queues, not caches. Never use it for caching.
2
LRU is the safe default for most workloads
it exploits temporal locality.
3
LFU works when a small set of keys gets most requests, but watch for cold start issues.
4
Always pair TTL with an eviction policy
lazy expiration alone leaks memory.
5
Under high concurrency, avoid naive synchronized LRU
use striped locks or approximate algorithms like Redis sampling.
INTERVIEW PREP · PRACTICE MODE

Interview Questions on This Topic

Q01SENIOR
How does Redis implement LRU without a linked list?
Q02SENIOR
When would you choose LFU over LRU in a production cache?
Q03SENIOR
What happens when you use FIFO eviction in a cache under a batch workloa...
Q04JUNIOR
What is the difference between lazy expiration and active expiration?
Q05SENIOR
You see 'OOM command not allowed when used memory > maxmemory' in Redis ...
Q06SENIOR
Design a cache for a social media feed that shows the most recent 100 po...
Q01 of 06SENIOR

How does Redis implement LRU without a linked list?

ANSWER
Redis uses an approximation: it samples a few keys (default 5) and evicts the one with the oldest idle time. This avoids the concurrency cost of a global order. The sample size is configurable via maxmemory-samples. Larger samples give more accurate LRU but cost more CPU.
FAQ · 4 QUESTIONS

Frequently Asked Questions

01
What is the best cache eviction policy?
02
What's the difference between LRU and LFU?
03
How do I set the eviction policy in Redis?
04
Can a cache eviction policy cause data loss?
N
Naren Founder & Principal Engineer

20+ years shipping large-scale distributed systems. Written from production experience, not tutorials.

Follow
Verified
production tested
June 25, 2026
last updated
1,663
articles · all by Naren
🔥

That's Components. Mark it forged?

4 min read · try the examples if you haven't

Previous
Transactional Outbox Pattern
19 / 23 · Components
Next
Distributed Caching