Redis vs Memcached – No Persistence = 2hr Cache Warm-Up
After an unplanned Redis restart, all cache keys vanished, spiking API latency from 2ms to 8s.
20+ years shipping large-scale distributed systems. Everything here is grounded in real deployments.
- Redis: in-memory data structure server with optional persistence, replication, and pub/sub
- Memcached: pure key-value cache with multithreaded, shared-nothing architecture
- Use Redis when you need data structures (lists, sets, sorted sets, streams), persistence, or replication
- Use Memcached when you need a simple, fast, horizontally scalable cache with minimal memory overhead (roughly 50-60 bytes per item)
- Performance insight: Memcached read latency ~1ms under 100k ops/s; Redis single-threaded can bottleneck at 100k+ ops/s without pipelining
- Production gotcha: Memcached has no authentication by default (SASL is optional); Redis needs an explicit requirepass and TLS for secure access
Imagine your kitchen. Your fridge (the database) holds everything long-term, but every time you want salt you don't walk to the fridge — you keep a salt shaker on the counter. Memcached is that salt shaker: dead simple, holds one thing, lightning fast. Redis is a whole spice rack with labels, a timer that tells you when things expire, and a notepad that remembers what you used even after the lights went out. Same core idea — keep things close so you're not always going to the fridge — but wildly different capabilities.
Every high-traffic system eventually hits the same wall: the database can't keep up. You've added indexes, tuned queries, thrown more read replicas at the problem — and the latency still spikes every time a popular endpoint gets hammered. Caching is the answer almost every time, and for in-memory caching two names dominate every architecture conversation: Redis and Memcached. Getting this choice wrong at the design stage costs you months of painful migration later.
Both tools exist to solve the same fundamental problem — move frequently-read data off your primary datastore and into RAM where access times drop from milliseconds to microseconds. But they solve it with completely different philosophies. Memcached is a pure, stripped-down cache engine optimised for one thing: storing and retrieving string blobs as fast as physically possible across many threads. Redis is a data structure server that happens to be extraordinarily good at caching, but also does pub/sub messaging, server-side scripting, geospatial queries, streams, and durable storage. The overlap is real but the differences are load-bearing.
By the end of this article you'll be able to reason through any production scenario — whether you need a simple object cache for a stateless API, a session store that survives restarts, a leaderboard backed by a sorted set, or a multi-region cluster — and make a defensible, architecture-level decision with the trade-offs clearly understood. We'll go through the internals that actually matter, walk through real configuration and code, and flag the production gotchas that bite teams who only read the marketing page.
Why Redis Survives Restarts and Memcached Doesn't
Redis and Memcached are both in-memory key-value stores, but the critical difference is persistence. Memcached is a pure cache — data lives only in RAM, and a restart means a completely cold cache. Redis offers optional disk persistence (RDB snapshots, AOF logs), meaning it can survive a reboot with data intact. This isn't a minor feature; it changes how you design for failure.
In practice, Memcached gives you a simple, fast cache with O(1) operations and a small memory overhead per item (around 50 bytes). Redis adds data structures (lists, sets, sorted sets), replication, and Lua scripting, at the cost of higher per-key memory (roughly 90 bytes and up, depending on the data type). The persistence trade-off is the real decider: without it, a cluster restart can mean a 2-hour cache warm-up as traffic slams your database. Redis with persistence enabled can serve reads as soon as it has reloaded its RDB or AOF file.
Use Memcached when you need a pure, ephemeral cache with maximum throughput and minimal latency — think session stores or HTML fragment caching that can be regenerated. Use Redis when cache data is expensive to recompute, or when you need atomic operations on complex structures. The rule: if losing the cache means a production incident, you need Redis persistence.
Core Architecture Differences: Single-Threaded Event Loop vs Multithreaded Slabs
Redis executes all commands on a single-threaded event loop (since v6, I/O threads can handle socket reads and writes, but command execution remains single-threaded). This makes Redis predictable: no locks, no race conditions, just sequential processing. The cost: a single slow command blocks everything. Memcached, by contrast, runs a pool of worker threads (set with -t, typically one per CPU core). The workers share one hash table and one slab allocator, guarded by fine-grained locks, so simple gets and sets scale across cores. Its atomic operations (incr, decr, cas) are safe across threads, but each one touches a single key: Memcached has no multi-key transactions.
The slab allocator in Memcached pre-allocates chunks of different sizes (64B, 128B, 512B, etc.) to avoid malloc overhead. Redis uses malloc per key-value pair — flexible but can fragment. In production, Memcached's slab approach gives predictable memory usage, while Redis may waste ~5-10% due to fragmentation. Your choice here determines how you handle memory budgets and performance under load.
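To make the slab idea concrete, here is a minimal sketch of how slab class sizes can be derived and how much space an item wastes inside its chunk. It is a simplified model, not Memcached's allocator: the 96-byte smallest chunk, the 1.25 growth factor, and the 130-byte item are illustrative assumptions, and the function names are hypothetical.

```python
def slab_chunk_sizes(smallest=96, factor=1.25, largest=1024 * 1024):
    """Return the chunk size of each slab class, growing geometrically."""
    sizes = []
    size = smallest
    while size < largest:
        sizes.append(size)
        # Round the next class up to a multiple of 8 bytes to keep chunks aligned.
        size = -(-int(size * factor) // 8) * 8
    sizes.append(largest)
    return sizes


def chunk_for(item_size, sizes):
    """Pick the smallest chunk that fits the item; None if it is too large."""
    for size in sizes:
        if item_size <= size:
            return size
    return None


sizes = slab_chunk_sizes()
item = 130                      # bytes for key + value + item header (illustrative)
chunk = chunk_for(item, sizes)  # smallest class that fits the item
wasted = chunk - item           # internal fragmentation inside that chunk
print(sizes[:4], chunk, wasted)
```

The wasted bytes per item are the "slab internal fragmentation" mentioned elsewhere in this article: an item always occupies a whole chunk, even when it only fills part of it.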
- Predictability: No lock contention, no race conditions, dead simple debugging.
- Slow operations like KEYS, FLUSHALL, or large SORT block all other clients — avoid them.
- Memcached is like a team of workers, each with their own desk. Simple operations scale with cores.
- Atomic operations (INCR, CAS) in Memcached are atomic per key across all threads, but they never span keys. Redis gives true atomicity across keys with MULTI/EXEC and Lua scripts.
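The cost of one slow command on a single-threaded loop is easy to see with a toy model. The sketch below is not Redis code: it simply assumes commands run strictly one after another, and the per-command costs in milliseconds are made up for illustration.

```python
from itertools import accumulate


def finish_times(costs_ms):
    """Completion time of each command when one thread runs them in order."""
    return list(accumulate(costs_ms))


# Hypothetical queue: three cheap GETs and one expensive KEYS-style scan.
queue = [1, 500, 1, 1]
done = finish_times(queue)
print(done)
```

Every command queued behind the 500 ms scan inherits its latency, which is why the advice is to keep slow commands off a shared Redis instance.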
Data Structure Support: When Strings Aren't Enough
Memcached stores string blobs only. Every value is a byte array: you serialize, store, read, deserialize. Redis supports strings, lists, sets, sorted sets, hashes, bitmaps, hyperloglogs, geospatial indexes, and streams. This isn't just a feature list — it changes how you architect. With Memcached, if you need a leaderboard, you build it yourself. With Redis, you use ZADD and ZRANGE.
For session stores, Memcached works fine: just a TTL and a blob. But for a shopping cart, Redis hashes allow atomic field updates without transferring the entire cart object. That reduces network bandwidth and avoids race conditions. For rate limiting, Redis INCR plus EXPIRE can be wrapped in MULTI/EXEC or a Lua script so the counter and its TTL are set atomically. Memcached gets you close with add (which sets the TTL) followed by incr, but that is two calls, and you risk a race if the key expires between them.
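The increment-plus-expiry idea behind a rate limiter can be sketched without a server. The class below is an in-memory stand-in, not a Redis client: FixedWindowLimiter, its parameters, and the injected clock are hypothetical names chosen for this example. The point it illustrates is that the counter and its expiry are updated in one step, so a counter can never exist without a deadline.

```python
import time


class FixedWindowLimiter:
    """In-memory stand-in for a counter with a TTL (fixed-window limiting)."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self._limit = limit
        self._window = window_seconds
        self._clock = clock
        self._counters = {}  # key -> (count, expires_at)

    def allow(self, key):
        now = self._clock()
        count, expires_at = self._counters.get(key, (0, 0.0))
        if now >= expires_at:
            # The window elapsed, so the old counter has "expired".
            count, expires_at = 0, now + self._window
        count += 1  # the increment step
        # Counter and expiry are stored together, in a single update.
        self._counters[key] = (count, expires_at)
        return count <= self._limit


now = [0.0]  # fake clock so the example is deterministic
limiter = FixedWindowLimiter(limit=2, window_seconds=10, clock=lambda: now[0])
results = [limiter.allow("ip:1.2.3.4") for _ in range(3)]
now[0] = 10.0  # move past the window
after_window = limiter.allow("ip:1.2.3.4")
print(results, after_window)
```

Against a real server, the same guarantee has to come from the server side (a transaction or a script), because two separate network calls can be interrupted between the increment and the expiry.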
Each data structure has memory overhead. Redis strings: ~90 bytes per key + value length. A hash uses ~10-20 bytes per field entry. Sorted sets: ~160 bytes per element. Memcached strings: ~59 bytes per key + value (plus slab internal fragmentation).
Persistence & Durability: What Happens When the Lights Go Out
Memcached is designed as a lossy cache. No persistence. No snapshots. No recovery. If it restarts, every key is gone. That's fine for a pure cache — you expect to repopulate from the database. But many teams accidentally store critical data in Memcached (session tokens, rate-limit counters) and get burned when a restart wipes them out.
Redis offers two persistence mechanisms: RDB (point-in-time snapshots) and AOF (append-only log). RDB is efficient: a fork-based dump triggered once a configured number of changes accumulates within a time window. But if Redis crashes between snapshots, you lose every write since the last one, which can be minutes of data. AOF logs every write operation. With appendfsync everysec, you lose at most about 1 second of data. The trade-off: AOF files grow large and rewrite overhead can spike CPU. Use both: RDB for quick restarts, AOF for durability. But remember: persistence costs some of Redis's performance advantage. With appendfsync always, every write waits for the disk; with everysec the fsync runs in the background, but forks and AOF rewrites still add CPU and memory pressure.
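The append-only idea is simple enough to sketch in a few lines. The example below is a toy model of an append-only log, not Redis's AOF format: the JSON-lines layout, the file name, and the append_op and replay helpers are all assumptions made for illustration. It shows why a log of write operations is enough to rebuild the in-memory state after a restart.

```python
import json
import os
import tempfile


def append_op(log_path, op, key, value=None):
    """Append one write operation to the log and force it to disk."""
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps({"op": op, "key": key, "value": value}) + "\n")
        log.flush()
        os.fsync(log.fileno())  # fsync on every write: the slowest, safest mode


def replay(log_path):
    """Rebuild the in-memory state by re-applying every logged operation."""
    state = {}
    with open(log_path, encoding="utf-8") as log:
        for line in log:
            entry = json.loads(line)
            if entry["op"] == "set":
                state[entry["key"]] = entry["value"]
            elif entry["op"] == "del":
                state.pop(entry["key"], None)
    return state


log_path = os.path.join(tempfile.mkdtemp(), "appendonly.log")
append_op(log_path, "set", "session:1", "alice")
append_op(log_path, "set", "session:2", "bob")
append_op(log_path, "del", "session:1")
recovered = replay(log_path)  # what a restarted process would see
print(recovered)
```

The sketch also hints at the two costs discussed above: the log grows with every write until it is compacted, and replay time grows with the log.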
For production caches that must survive restarts, use Redis with AOF and a reasonable RDB schedule. For pure caches where data loss is acceptable, disable persistence or use Memcached.
- Pure cache: tolerance for data loss is high. Use Memcached or Redis with persistence off.
- Session store: loss means logged-out users. Use Redis with AOF everysec.
- Leaderboard/real-time: partial loss acceptable but not complete. Use RDB with hourly snapshots.
- Database cache: loss means DB load spike. Use Redis with RDB every 5 minutes.
- Pro tip: Always have a warm-up script for AOF restarts — replay time can be minutes.
Clustering & High Availability: Scaling Beyond One Node
Memcached has no built-in clustering. You run multiple independent instances and use client-side consistent hashing to distribute keys. This is simple and works well — you just add more nodes. But there's no replication, no failover. If a node goes down, the keys on that node become unavailable until the database repopulates them on the remaining nodes.
Redis gives you two built-in options:
- Redis Sentinel: master-replica setup with automatic failover. You get high availability, but storage is still limited to one master's memory.
- Redis Cluster: data sharded across multiple masters (16384 hash slots divided among them), with replication per shard. Provides horizontal scale and automatic failover. But clients must support the cluster protocol, and multi-key operations across slots require explicit hash tags.
For most production caches, Redis Standalone with Sentinel is sufficient. You get failover in ~5-10 seconds. Redis Cluster is for datasets larger than a single node's memory. The operational complexity jumps: you need to manage slot migrations, handle cross-slot operations carefully, and monitor gossip.
Memcached's simplicity means zero operational overhead for clustering — just add nodes to the client pool. But you lose replication and failover. For stateless caches where data loss is acceptable, that's fine. For stateful caches, you need Redis or a custom replication layer.
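Client-side consistent hashing is what keeps "just add nodes" cheap. The sketch below is a minimal hash ring with virtual nodes, written under assumptions: MD5 as the hash, 100 virtual nodes per server, and hypothetical node names. It is not the algorithm of any specific client library, but it shows the key property that adding a node moves only a fraction of the keys.

```python
import bisect
import hashlib


class HashRing:
    """Minimal consistent-hash ring with virtual nodes."""

    def __init__(self, nodes=(), vnodes=100):
        self._vnodes = vnodes
        self._points = []  # sorted hash positions on the ring
        self._owner = {}   # hash position -> node name
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(value):
        return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

    def add(self, node):
        # Each node is placed on the ring many times to even out the load.
        for i in range(self._vnodes):
            point = self._hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def node_for(self, key):
        # Walk clockwise to the first point after the key, wrapping around.
        idx = bisect.bisect(self._points, self._hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


ring = HashRing(["cache-a", "cache-b", "cache-c"])
keys = [f"user:{i}" for i in range(1000)]
before = {k: ring.node_for(k) for k in keys}
ring.add("cache-d")
moved = [k for k in keys if ring.node_for(k) != before[k]]
print(f"{len(moved)} of {len(keys)} keys moved")
```

With a naive `hash(key) % node_count` scheme, going from three to four nodes would remap most keys and trigger a near-total cache miss storm; with the ring, only the keys claimed by the new node move.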
Client libraries such as spymemcached implement consistent hashing. When you add a node, only a fraction of keys relocate. Use KetamaConnectionFactory for the most popular implementation. It's well tested in production.
Memory Efficiency & Eviction Policies: Every Byte Matters
Memcached uses slab allocation: fixed-size chunks per slab class. You control maximum memory with -m. When memory runs out, it evicts least recently used (LRU) items, but each slab class keeps its own LRU list, so eviction only happens within the class that needs space. This means if you have many small values and one large value, the large value's slab class may evict prematurely while other classes still have room.
Redis uses a global maxmemory limit and one of several eviction policies:
- noeviction: returns errors on writes when memory is full.
- allkeys-lru: evicts the least recently used key, considering all keys.
- volatile-lru: evicts the least recently used key among keys with a TTL.
- allkeys-random: evicts a random key.
- allkeys-lfu (Redis 4+): evicts the least frequently used key.
For production, allkeys-lru is the safest default: it keeps the most recently accessed data. But if you have keys with critical data and keys that are expendable, use volatile-lru and only set a TTL on the expendable ones. Memcached gives you less control: eviction is always LRU within each slab class, and you must manage item sizes carefully to avoid slab waste.
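For readers who want to see LRU eviction in action, here is a minimal sketch of an exact LRU cache. It is an illustration of the policy only: the LRUCache class and the max_items limit are hypothetical, and real servers bound memory in bytes rather than item counts (Redis also approximates LRU by sampling keys rather than tracking an exact order).

```python
from collections import OrderedDict


class LRUCache:
    """Exact LRU cache bounded by item count (a simplification)."""

    def __init__(self, max_items):
        self._max_items = max_items
        self._items = OrderedDict()  # oldest access first, newest last

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # a read counts as a recent use
        return self._items[key]

    def set(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self._max_items:
            self._items.popitem(last=False)  # evict the least recently used


cache = LRUCache(max_items=2)
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")     # "a" is now more recent than "b"
cache.set("c", 3)  # cache is full, so "b" is evicted
print(cache.get("a"), cache.get("b"), cache.get("c"))
```

Note that the read of "a" is what saved it: under LRU, whatever was touched least recently goes first, regardless of how important it is to the application.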
Memory efficiency also differs: Memcached stores keys and values as strings with ~59 bytes overhead per key. Redis keys have ~90 bytes overhead. But Redis data structures can store many values under one key, reducing per-key overhead. For example, a hash with 100 fields under one key costs ~90 + 10 × 100 = ~1,090 bytes, vs 100 individual keys in Memcached costing 100 × 59 = 5,900 bytes (assuming same value size).
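The overhead comparison is plain arithmetic, spelled out below using the approximate per-key and per-field figures quoted in this article. The constant names are made up for the example, and the numbers are rough estimates rather than measurements.

```python
# Approximate overheads quoted in this article (bytes); not measured values.
REDIS_KEY_OVERHEAD = 90
REDIS_HASH_FIELD_OVERHEAD = 10
MEMCACHED_ITEM_OVERHEAD = 59

fields = 100

# One Redis hash holding 100 fields: one key overhead plus a small cost per field.
redis_hash_bytes = REDIS_KEY_OVERHEAD + REDIS_HASH_FIELD_OVERHEAD * fields

# 100 separate Memcached items: the item overhead is paid every time.
memcached_bytes = MEMCACHED_ITEM_OVERHEAD * fields

print(redis_hash_bytes, memcached_bytes)
```

Value sizes are excluded on both sides, so this compares bookkeeping overhead only.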
Run stats items to see eviction counts per slab class. If you see high evictions in the large slab class, adjust your data shapes or pre-split large values.
With noeviction, Redis does not silently drop writes: it rejects them with an error once maxmemory is reached. Pair noeviction with proactive monitoring of used_memory.
Production Trade-offs: A Decision Framework
You've seen the internals. Now here's the decision matrix engineers actually use:
- Pure cache for API responses: Either works. Memcached is simpler, lower overhead, faster for gets/sets. Redis adds no value unless you need atomic invalidation or versioning.
- Session store with persistence: Redis. Memcached loses all sessions on restart.
- Leaderboards, counters, rate limiting: Redis — native sorted sets, INCR with expire, Lua scripts for atomicity.
- Pub/sub messaging: Redis only. Memcached has no pub/sub.
- Large data sets > single node memory: Redis Cluster or Memcached with consistent hashing. Memcached is easier to scale horizontally.
- Multi-region replication: Redis Active-Active with CRDTs (Redis Enterprise only) or client-side routing. Memcached can't do cross-region.
- Budget-constrained: Memcached runs on less memory due to lower overhead. But Redis with persistence can reduce DB load more effectively, potentially saving on database costs.
Senior engineers don't pick one forever — they layer both. Use Memcached in front of Redis: Memcached absorbs the hot read load, Redis handles writes and complex queries. This pattern is common at scale.
- L1 (Memcached): 1-2ms reads, no persistence, ~1GB memory, handles hot keys.
- L2 (Redis): 2-5ms reads, AOF persistence, ~10GB memory, holds all cache keys.
- Writes go to L2 synchronously, L1 asynchronously or via read-through.
- L1 miss loads from L2; L2 miss loads from database.
- If L1 restarts, L2 repopulates it quickly.
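The layered read path above can be sketched with two dictionaries standing in for the two tiers. This is an in-memory model under assumptions, not client code for either system: TieredCache, load_from_db, and the key names are hypothetical, and the write path shown here invalidates L1 rather than updating it asynchronously.

```python
class TieredCache:
    """Two-tier read-through cache: l1 and l2 are plain dicts standing in
    for a small hot tier and a larger persistent tier."""

    def __init__(self, loader):
        self.l1 = {}
        self.l2 = {}
        self._loader = loader  # called only when both tiers miss

    def get(self, key):
        if key in self.l1:
            return self.l1[key]
        if key in self.l2:
            value = self.l2[key]       # L1 miss, L2 hit
        else:
            value = self._loader(key)  # both tiers missed: go to the database
            self.l2[key] = value
        self.l1[key] = value           # refill L1 on the way back
        return value

    def put(self, key, value):
        self.l2[key] = value    # write to L2 synchronously
        self.l1.pop(key, None)  # drop the stale L1 copy; next read refills it


db_reads = []


def load_from_db(key):
    db_reads.append(key)
    return f"row-for-{key}"


cache = TieredCache(load_from_db)
first = cache.get("user:42")   # misses both tiers, hits the database once
cache.l1.clear()               # simulate an L1 restart
second = cache.get("user:42")  # served from L2, no database read
print(first, second, db_reads)
```

The simulated L1 restart is the interesting part: the database sees no extra load because L2 still holds the key.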
Serialization: Why Your Objects Are Killing Performance
Here's what nobody tells you about caching Java objects. Memcached treats everything as opaque bytes. You serialize, it stores. It returns bytes, you deserialize. Simple. But that simplicity hides a tax: you pay serialization cost on every single read and write.
Redis gives you structured data types. When you store a hash, you don't serialize the entire object. You update one field. That single-field update avoids re-serializing and re-sending the whole object. In production, I've seen teams cut 40% latency by migrating from Memcached to Redis hashes for user session data.
The WHY: Memcached forces full-object serialization because it has no concept of data structure. Redis does know structures, so partial updates work without touching the rest of the object. Your database row has 20 columns, but you only need the 'last_seen' timestamp. With Memcached, you rewrite the entire blob. With Redis, you HSET one field.
This matters most when your objects are large but your hot paths touch small slices. Profile your serialization cost before deciding.
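A quick way to get a feel for the difference is to compare the bytes involved in each update. The sketch below uses JSON and a made-up 20-field profile purely for illustration; the field names and sizes are assumptions, and real numbers depend on your serializer and object shape.

```python
import json

# Hypothetical 20-field object: 19 bulky fields plus one small hot field.
profile = {f"field_{i}": "x" * 50 for i in range(19)}
profile["last_seen"] = "2024-01-01T00:00:00Z"

# Blob approach: any change means re-serializing and re-sending everything.
full_blob = json.dumps(profile).encode("utf-8")

# Field approach: only the changed field's value crosses the wire.
single_field = profile["last_seen"].encode("utf-8")

print(len(full_blob), len(single_field))
```

Measuring the same two numbers for your own hot objects is a cheap first step before profiling serialization time itself.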
Eviction: How Memcached Silently Loses Your Data (And You Won't Notice)
Your production system is paging you at 3 AM. Cache hit rate dropped from 90% to 40%. Database is melting. You check Memcached stats and see 'evictions: 2 million'. Memcached evicts data when memory fills. It uses LRU. Simple. But here's the trap: Memcached doesn't tell you what it evicted. Your critical session data? Gone. Your API rate limits? Reset.
Redis gives you choice. LFU for hot data that should rarely leave. LRU for TTL-bound cache. allkeys-lru when you don't care what leaves. And the killer feature: volatile-ttl evicts items with the shortest remaining TTL first. This means your one-hour cache entries survive while your one-second rate limits get cleaned.
The WHY: Memcached treats all keys equally. Production systems don't. Your authentication tokens are worth more than your trending articles. Redis eviction policies let you encode that priority without custom code.
Practical advice: Use maxmemory-policy allkeys-lfu for general caching. Use volatile-lfu when you have key expiry set. Monitor eviction rates. If evictions exceed 0.1% of writes, you need more memory or a different strategy.
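The 0.1% rule of thumb is easy to turn into an alert check. The helper below is a sketch: the function names are hypothetical, and it assumes you already collect an eviction counter and a write counter over the same time window from your monitoring.

```python
def eviction_ratio(evicted, writes):
    """Evictions as a fraction of writes over the same time window."""
    return evicted / writes if writes else 0.0


def needs_attention(evicted, writes, threshold=0.001):
    """True when evictions exceed the threshold (0.1% of writes by default)."""
    return eviction_ratio(evicted, writes) > threshold


# Hypothetical counters sampled over one monitoring interval.
busy = needs_attention(evicted=2_000, writes=1_000_000)  # 0.2% of writes
calm = needs_attention(evicted=500, writes=1_000_000)    # 0.05% of writes
print(busy, calm)
```

Both counters should be deltas over the interval, not lifetime totals, otherwise a burst of evictions gets averaged away by history.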
The Persistence Punch: How Redis Durability Cost Us 2 Hours of Cache Warm-Up
Redis was running without persistence (save ''): RDB snapshots were off, and AOF was not configured. On restart, the in-memory dataset was empty. No keys. No cache. The database couldn't handle the resulting read storm.
The fix: enabled appendonly yes and set auto-aof-rewrite-percentage 100. Also added a health-check that pre-warms critical keys from the database before accepting traffic.
- Always configure at least one persistence method in Redis if cache loss is unacceptable.
- AOF with appendfsync everysec adds ~1ms write latency but prevents total data loss on restart.
- Pre-warm critical keys during deployment health checks to avoid a cold-cache stampede.
- Test what happens when you restart Redis in a staging environment — you'll be surprised.
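A pre-warm step for a health check can be as small as the sketch below. It is a minimal illustration under assumptions: the cache and database are plain dictionaries, and prewarm, the key names, and the readiness rule (every critical key must load) are hypothetical choices rather than a prescribed design.

```python
def prewarm(cache, load_from_db, critical_keys):
    """Load critical keys into the cache; report readiness and any gaps."""
    missing = []
    for key in critical_keys:
        value = load_from_db(key)
        if value is None:
            missing.append(key)  # the source of truth has no such key
        else:
            cache[key] = value
    return not missing, missing


# Plain dicts stand in for the database and the cache in this example.
database = {"config:flags": "on", "user:1": "alice"}
cache = {}

ready, missing = prewarm(cache, database.get, ["config:flags", "user:1"])
print(ready, missing, cache)
```

A health check would report the instance as healthy only once `ready` is true, so traffic never reaches a completely cold cache.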
- Run stats items to see slab info, and stats cachedump <slab> <limit> to list keys. If a key is missing, verify TTL and eviction policy.
- Run INFO memory to see used_memory and used_memory_rss. Compare with maxmemory. If near the limit, check the eviction policy with CONFIG GET maxmemory-policy. Set maxmemory (e.g. maxmemory 2gb) and an appropriate policy (e.g. allkeys-lru).
- Run SLOWLOG GET 10 to see slow commands. Slowness is typically caused by KEYS *, FLUSHALL, or large MGET. Replace KEYS with SCAN, or pipeline smaller batches.
- Run stats conns over telnet. Increase max connections with the -c flag on startup. Ensure client connection pooling is configured to reuse connections.
- Run INFO replication to see master_repl_offset and slave_repl_offset. Lag = master offset - slave offset. If high, check network bandwidth or increase repl-backlog-size to allow larger buffers.
- Redis eviction count: redis-cli INFO stats | grep -i evicted
- Memcached eviction count: echo stats | nc localhost 11211 | grep evictions
volatile-ttl.
Key takeaways
Common mistakes to avoid
- Assuming Memcached can handle session persistence
- Using Redis with no maxmemory and no eviction policy. Set maxmemory to a reasonable value (e.g., 80% of available RAM) and choose an eviction policy. allkeys-lru is safe for most caches.
- Not accounting for serialisation cost in Memcached
- Using KEYS * in production Redis
- Not configuring Redis for data safety when used as a cache for critical data. Use AOF with appendfsync always but accept write latency. Or use Memcached and accept cache-only semantics.
Interview Questions on This Topic
Why would you choose Memcached over Redis in a production system?
Frequently Asked Questions