Junior 4 min · March 06, 2026

allkeys-lru Evicted Rate-Limit — Redis Interview Gotcha

allkeys-lru evicted rate-limit counter keys that had no TTL, so rate-limited endpoints returned 200 instead of 429.

Naren · Founder
Plain-English first. Then code. Then the interview question.
Quick Answer
  • Redis is an in-memory data structure server, not just a cache — it natively supports strings, hashes, lists, sorted sets, streams, and more
  • Single-threaded command execution eliminates lock contention and makes latency predictable; network I/O can optionally use extra threads in v6+
  • Durability needs tuning: the stock config only takes periodic RDB snapshots (compact, minutes of data loss), while AOF logs (near-zero loss, larger) must be enabled explicitly. Use both for production
  • Eviction policies like allkeys-lru silently kick in when maxmemory is hit — wrong policy can evict important non-cache keys
  • Lua scripts (EVAL) provide true atomicity for conditional operations like rate limiting — MULTI/EXEC does NOT roll back on runtime errors
Plain-English First

Imagine your office has a massive filing cabinet (your database) and a sticky-note board right next to your desk (Redis). Every time you need a document, walking to the cabinet takes 30 seconds. But if you stick the most-requested documents on your board, you grab them in 2 seconds. Redis is that sticky-note board — blazing-fast, lives in memory, and holds your hottest data so your app never has to make the slow trip to the filing cabinet. The catch? Your board has limited space, and if the office loses power, the sticky notes are gone unless you back them up.

Redis shows up in almost every modern backend stack — from session management at Netflix scale to real-time leaderboards in gaming apps to rate-limiting at API gateways. If you're interviewing for any backend, full-stack, or DevOps role, expect at least two or three Redis questions. Interviewers don't ask them to trip you up — they ask because Redis is one of those tools where misuse causes production fires, and they want to know you understand the trade-offs.

What Redis Actually Is — And Why It's Not 'Just a Cache'

Redis stands for Remote Dictionary Server. Yes, it's famous as a cache, but calling it 'just a cache' in an interview is a red flag. Redis is an in-memory data structure store. It natively understands strings, lists, sets, sorted sets, hashes, bitmaps, hyperloglogs, and streams. That means it's not a dumb key-value bucket — it can perform operations directly on those structures without you pulling data out, modifying it in application code, and pushing it back.

Why does this matter? Take a leaderboard. In a relational database you'd SELECT all scores, sort them in application memory, and return the top 10. With Redis Sorted Sets, you call ZREVRANGE leaderboard 0 9 and Redis returns the top 10 in O(log N + M) time, where M is the number of elements returned (10 here). It does so atomically and server-side, with no sorting in application code. That's the real power: moving computation closer to the data.

Redis is single-threaded for command execution (since v6, network I/O can optionally use additional threads), which sounds like a weakness but is actually why it's so predictable. No lock contention. No deadlocks. One command finishes before the next starts.

redis_data_structures_demo.sh (Bash)
# Connect to Redis CLI
redis-cli

# ── STRING: Simple key-value with TTL (time-to-live) ──
SET user:1001:session_token "abc123xyz" EX 3600
# EX 3600 means this key auto-deletes after 1 hour
# Perfect for session tokens — no manual cleanup needed

GET user:1001:session_token
# Returns: "abc123xyz"

TTL user:1001:session_token
# Returns: 3598 (seconds remaining — live countdown)

# ── HASH: Store a user object without serializing to JSON ──
HSET user:1001 name "Priya Sharma" email "priya@example.com" plan "pro"
# Redis stores each field separately — you can update ONE field
# without reading and rewriting the whole object

HGET user:1001 plan
# Returns: "pro"

HGETALL user:1001
# Returns all fields: name, email, plan

# ── SORTED SET: Real-time leaderboard ──
ZADD game:leaderboard 9450 "alice"
ZADD game:leaderboard 8820 "bob"
ZADD game:leaderboard 9900 "carol"

# Top 3 players, highest score first (0-indexed range)
ZREVRANGE game:leaderboard 0 2 WITHSCORES
# Returns:
# 1) "carol"
# 2) "9900"
# 3) "alice"
# 4) "9450"
# 5) "bob"
# 6) "8820"

# ── LIST: Message queue pattern ──
# Push to the LEFT (head)
LPUSH email:queue "welcome:user:1002"
LPUSH email:queue "receipt:order:5501"
# Pop from the RIGHT (tail), giving FIFO order
RPOP email:queue
# Returns: "welcome:user:1002"
Output
"abc123xyz"
3598
"pro"
1) "name"
2) "Priya Sharma"
3) "email"
4) "priya@example.com"
5) "plan"
6) "pro"
1) "carol"
2) "9900"
3) "alice"
4) "9450"
5) "bob"
6) "8820"
"welcome:user:1002"
Interview Gold:
When asked 'what data structures does Redis support?', don't just list them — explain ONE use case per structure. Strings → session tokens with TTL, Hashes → user profiles, Sorted Sets → leaderboards, Lists → job queues, Sets → unique visitor tracking. That answer signals real-world experience, not textbook memorisation.
Production Insight
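The biggest production mistake with Redis data structures is treating Sorted Sets as an afterthought. ZADD is O(log N) per element, so a single ZADD carrying a million members blocks the event loop until it finishes. Split large loads into smaller batches and send them through a pipeline.
Also, a Hash with many fields (>1000) is stored in Redis's general hashtable encoding rather than the compact encoding used for small hashes, and can consume more memory than a single serialised JSON string. Prefer Hashes when you need to update individual fields, not for large objects that are always read and written whole.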
Key takeaway: choose the structure that minimises command count per user request — that's how you move computation closer to data.
Key Takeaway
Redis is a data structure server, not just a key-value cache.
The right structure reduces application code complexity.
Single-threading makes latency predictable but commands must be fast — avoid O(N) commands in production.

Connecting from Java: Spring Data Redis & Jedis

In a production Spring Boot environment, you won't be using the CLI. You'll likely use Spring Data Redis with Lettuce or Jedis. The key is using the RedisTemplate or StringRedisTemplate to ensure thread-safe operations and proper serialization. Interviewers often check if you know how to handle connection pooling and why Lettuce (asynchronous/non-blocking) is generally preferred over Jedis (synchronous) in high-concurrency Spring Boot applications.

io/thecodeforge/redis/RedisConfig.java (Java)
package io.thecodeforge.redis;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.redis.connection.RedisConnectionFactory;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.data.redis.serializer.GenericJackson2JsonRedisSerializer;
import org.springframework.data.redis.serializer.StringRedisSerializer;

@Configuration
public class RedisConfig {

    /**
     * Production-grade RedisTemplate configuration with JSON serialization.
     * By default, Spring uses JdkSerializationRedisSerializer (binary),
     * which is unreadable in CLI. JSON is better for debugging.
     */
    @Bean
    public RedisTemplate<String, Object> redisTemplate(RedisConnectionFactory factory) {
        RedisTemplate<String, Object> template = new RedisTemplate<>();
        template.setConnectionFactory(factory);
        
        // Use String serializer for keys
        template.setKeySerializer(new StringRedisSerializer());
        
        // Use JSON serializer for values to store complex Java objects
        template.setValueSerializer(new GenericJackson2JsonRedisSerializer());
        
        return template;
    }
}
Output
RedisTemplate bean configured for io.thecodeforge package.
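Forge Tip: Lettuce vs Jedis
If asked about connection libraries, mention that Lettuce connections are thread-safe, so one connection can be shared across threads for ordinary commands (blocking commands and transactions still need a dedicated connection), whereas a Jedis instance is not thread-safe and requires a connection pool to serve multiple threads. This shows you understand resource management.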
Production Insight
A common production issue: using RedisTemplate with default JdkSerializationRedisSerializer leads to keys like \xac\xed\x00\x05t\x00... in Redis, making CLI debugging impossible. Always override template key/value serializers to use String or JSON.
Another gotcha: a Jedis instance is not thread-safe and holds a single connection, so code that skips the pool typically ends up creating a new Jedis (and a new TCP connection) per operation. Under load, this causes TCP connection storms and TIME_WAIT exhaustion. Always use a connection pool with Jedis, or switch to Lettuce.
Key takeaway: in Spring Boot, prefer Lettuce + StringRedisSerializer + GenericJackson2JsonRedisSerializer for production.
Key Takeaway
Serialization choice affects debuggability.
Lettuce shares one connection, Jedis needs a pool.
Default Spring Data Redis config is not production-ready without custom serializers.

Persistence, Eviction & the Trade-offs That Cause Production Incidents

The single most dangerous misconception about Redis is treating it as a durable store by default. It isn't. Redis serves everything from memory, and the stock configuration only takes periodic RDB snapshots with AOF switched off, so a crash loses every write since the last snapshot (and everything, if snapshots are disabled). Redis gives you two persistence mechanisms: RDB (Redis Database snapshots) and AOF (Append-Only File), and you need to understand both for interviews and for production.

RDB takes point-in-time snapshots — like a photograph of your data every N seconds if M keys changed. It's compact, fast to restore, but you can lose up to the last snapshot window of writes. AOF logs every write operation — like a transaction log. Slower to restore, larger on disk, but much less data loss risk (configurable to fsync every second or every command).

Eviction policy is equally critical. When Redis hits its maxmemory limit, it has to decide what to drop. The allkeys-lru policy evicts the least-recently-used key across all keys — great for a pure cache. The volatile-lru policy only evicts keys that have a TTL set — useful when some keys must survive (like a rate limit counter with no TTL). Picking the wrong eviction policy is a silent killer: you'll see cache misses spike with no obvious error.
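To make the difference between the policies concrete, here is a minimal Java sketch of how the eviction candidate changes with the policy. It is a model, not Redis source code: the class EvictionSketch, its Entry type, and the sample keys are all hypothetical, and it assumes an exact LRU, whereas real Redis approximates LRU by sampling a few keys.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Hypothetical model of eviction candidate selection; not Redis source code. */
final class EvictionSketch {

    enum Policy { ALLKEYS_LRU, VOLATILE_LRU, NOEVICTION }

    /** One key as the eviction logic sees it. */
    static final class Entry {
        final String key;
        final long lastAccessMillis;
        final boolean hasTtl;

        Entry(String key, long lastAccessMillis, boolean hasTtl) {
            this.key = key;
            this.lastAccessMillis = lastAccessMillis;
            this.hasTtl = hasTtl;
        }
    }

    /** Returns the key an exact LRU would evict, or empty when the policy has no candidate. */
    static Optional<String> pickVictim(List<Entry> entries, Policy policy) {
        if (policy == Policy.NOEVICTION) {
            return Optional.empty(); // Redis rejects the write instead of evicting
        }
        return entries.stream()
                // volatile-lru only considers keys that carry a TTL
                .filter(e -> policy == Policy.ALLKEYS_LRU || e.hasTtl)
                .min(Comparator.comparingLong((Entry e) -> e.lastAccessMillis))
                .map(e -> e.key);
    }

    public static void main(String[] args) {
        List<Entry> keys = Arrays.asList(
                new Entry("cache:page:home", 5_000L, true),
                new Entry("ratelimit:user:1001", 1_000L, false), // oldest access, no TTL
                new Entry("session:abc", 3_000L, true));

        System.out.println(pickVictim(keys, Policy.ALLKEYS_LRU));  // Optional[ratelimit:user:1001]
        System.out.println(pickVictim(keys, Policy.VOLATILE_LRU)); // Optional[session:abc]
        System.out.println(pickVictim(keys, Policy.NOEVICTION));   // Optional.empty
    }
}
```

Under allkeys-lru the TTL-less counter is the first to go because it is simply the least recently used key; under volatile-lru it is never a candidate at all.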

redis_persistence_and_eviction.sh (Bash)
# ── Check current persistence config ──
redis-cli CONFIG GET save
# Default output — snapshot triggers:
# 1) "save"
# 2) "3600 1 300 100 60 10000"
# Meaning: snapshot if 1 change in 3600s, OR 100 changes in 300s, OR 10000 changes in 60s

# ── Enable AOF (Append-Only File) for better durability ──
redis-cli CONFIG SET appendonly yes
redis-cli CONFIG SET appendfsync everysec
# everysec = fsync every second — best balance of performance vs durability
# always  = fsync on every write — safest but ~10x slower
# no      = OS decides when to flush — fastest but data loss risk

# ── Set a memory limit and eviction policy ──
redis-cli CONFIG SET maxmemory 256mb
redis-cli CONFIG SET maxmemory-policy allkeys-lru
# allkeys-lru: when full, evict the least-recently-used key from ANY key
# volatile-lru: only evict keys that have a TTL — non-TTL keys are safe
# noeviction: reject new writes when full — returns OOM error (dangerous for a cache!)

# ── Simulate checking what would be evicted ──
redis-cli OBJECT IDLETIME user:1001:session_token
# Returns idle time in seconds — higher = more likely to be evicted under LRU

# ── Persist config changes to redis.conf so they survive restart ──
redis-cli CONFIG REWRITE
# Without this, all CONFIG SET changes are lost on restart!
Output
1) "save"
2) "3600 1 300 100 60 10000"
OK
OK
OK
OK
(integer) 142
OK
Watch Out:
CONFIG SET changes are runtime-only by default. If Redis restarts (crash, deploy, maintenance), every CONFIG SET you applied is gone. Always follow up with CONFIG REWRITE to persist changes to redis.conf — or manage config via your infrastructure-as-code tooling. This catches even experienced engineers off guard.
Production Insight
Many teams set maxmemory but never set maxmemory-policy, defaulting to noeviction. When Redis hits the limit, writes fail with OOM errors — causing cascading failures across dependent services. Always explicitly set a policy.
Another trap: appendfsync always forces an fsync on every write, so on a busy Redis (~10k writes/s) throughput becomes bound by disk fsync latency. everysec is the standard for production.
Key takeaway: RDB + AOF together gives snapshot backups plus near-real-time durability. But AOF rewrite can spike CPU and memory — schedule it during low traffic.
Key Takeaway
Default persistence = periodic RDB snapshots only; AOF is off until you enable it.
allkeys-lru can evict any key; volatile-lru only evicts keys with a TTL, so non-TTL keys are protected.
CONFIG SET without CONFIG REWRITE is lost on restart.

Atomicity, Transactions & the Lua Script Pattern

Redis is single-threaded, so individual commands are always atomic. But what about multi-step operations like 'check a counter, increment it only if it's below 100'? That's where MULTI/EXEC transactions and Lua scripts come in — and this is a favourite interview deep-dive.

MULTI/EXEC queues a batch of commands that execute atomically. No other client can sneak a command in between. But there's a critical gotcha: Redis transactions don't roll back on runtime errors. If one command in the queue fails (e.g. wrong type), the rest still execute. That's intentional and very different from SQL transactions.
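The no-rollback behaviour is easier to remember once you have seen it play out. The sketch below is a plain-Java simulation of the semantics described above, not a Redis client: NoRollbackSketch, its set and incr helpers, and the key names are hypothetical, and a HashMap stands in for the keyspace.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical simulation of EXEC running a queue without rollback. */
final class NoRollbackSketch {

    /** Runs every queued command in order; an error is recorded and execution continues. */
    static List<String> exec(Map<String, String> store,
                             List<Function<Map<String, String>, String>> queue) {
        List<String> replies = new ArrayList<>();
        for (Function<Map<String, String>, String> command : queue) {
            try {
                replies.add(command.apply(store));
            } catch (RuntimeException e) {
                replies.add("ERR " + e.getMessage()); // earlier writes are NOT undone
            }
        }
        return replies;
    }

    static Function<Map<String, String>, String> set(String key, String value) {
        return store -> {
            store.put(key, value);
            return "OK";
        };
    }

    static Function<Map<String, String>, String> incr(String key) {
        return store -> {
            // Long.parseLong throws NumberFormatException for non-numeric values,
            // standing in for Redis's "value is not an integer" runtime error.
            long next = Long.parseLong(store.getOrDefault(key, "0")) + 1;
            store.put(key, Long.toString(next));
            return Long.toString(next);
        };
    }

    public static void main(String[] args) {
        Map<String, String> store = new HashMap<>();
        List<String> replies = exec(store, Arrays.asList(
                set("order:5501:status", "paid"),
                incr("order:5501:status"),          // fails: "paid" is not a number
                set("order:5501:shipped", "yes"))); // still runs
        System.out.println(replies);
        System.out.println(store);
    }
}
```

The middle command fails, yet both SET writes remain in the store. That partially applied state is exactly what a SQL developer would not expect from something called a transaction.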

For complex conditional logic, Lua scripts are the better tool. A Lua script runs entirely on the Redis server as a single atomic unit. This is how rate limiting is implemented correctly — the check-and-increment happens in one atomic server-side script with no race condition possible.

Understanding the difference between MULTI/EXEC (queue and batch) versus Lua (conditional server-side logic) is what separates candidates who've used Redis in production from those who've only read the docs.

redis_atomic_rate_limiter.sh (Bash)
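# ── MULTI/EXEC: Batch commands atomically ──
# MULTI, the queued commands, and EXEC must travel over the SAME connection.
# Separate `redis-cli MULTI` / `redis-cli EXEC` invocations each open a new
# connection, so EXEC would fail with "ERR EXEC without MULTI".
redis-cli <<'EOF'
MULTI
INCR order:5501:item_count
EXPIRE order:5501:item_count 86400
EXEC
EOF
# INCR and EXPIRE (24h TTL) execute atomically: no other client can touch
# order:5501:item_count between them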

# ── Lua Script: Atomic rate limiter ──
# This is the CORRECT way to implement rate limiting.
# Using separate GET + INCR commands creates a race condition.

redis-cli EVAL "
  local key = KEYS[1]          -- e.g. 'ratelimit:user:1001:minute:2024010112'
  local limit = tonumber(ARGV[1])  -- e.g. 100 (requests per minute)
  local current = redis.call('GET', key)

  if current == false then
    -- Key doesn't exist yet — first request in this window
    redis.call('SET', key, 1, 'EX', 60)  -- Set count=1 with 60s TTL
    return {1, limit - 1}  -- {current_count, remaining}
  end

  current = tonumber(current)

  if current >= limit then
    return {current, 0}  -- Rate limit hit — reject the request
  end

  -- Under the limit — increment and return updated values
  local new_count = redis.call('INCR', key)
  return {new_count, limit - new_count}
" 1 "ratelimit:user:1001:minute:2024010112" 100

# Output on first call: 1) (integer) 1   2) (integer) 99
# Output on 100th call: 1) (integer) 100  2) (integer) 0
# Output on 101st call: 1) (integer) 100  2) (integer) 0  (blocked)

# ── WATCH: Optimistic locking (alternative to Lua for simple cases) ──
# WATCH, MULTI and EXEC must share one connection, so feed them to a single redis-cli.
# If user:1001:balance is modified by another client before EXEC,
# the entire transaction aborts and EXEC returns nil; your app retries.
redis-cli <<'EOF'
WATCH user:1001:balance
MULTI
DECRBY user:1001:balance 50
EXEC
EOF
# EXEC returns nil if balance was modified by another client (optimistic lock failed)
# EXEC returns an array of results if successful
Output
1) (integer) 1
2) (integer) 1
1) (integer) 1
2) (integer) 99
OK
OK
1) (integer) 1
Interview Gold:
If asked 'how does Redis handle concurrency?', the layered answer is: (1) individual commands are atomic by nature of single-threading, (2) MULTI/EXEC batches commands atomically but doesn't rollback on error, (3) Lua scripts are the gold standard for conditional atomic operations, (4) WATCH provides optimistic locking for CAS (compare-and-swap) patterns. Mentioning all four layers shows real depth.
Production Insight
A common bug: using MULTI/EXEC for a 'deduct balance if sufficient' flow. Commands inside MULTI are only queued, so the GET result is not available until EXEC and the transaction cannot branch on it. If the GET runs before MULTI instead, another client can change the balance between the check and the DECRBY. Use WATCH + MULTI for optimistic locking, or better, a Lua script that does both check and deduct atomically.
Another pitfall: Lua scripts that call redis.call() inside a loop with many iterations. This blocks the event loop for the entire duration. Keep scripts fast — O(1) or small O(log N) — and avoid time-consuming computations.
Key takeaway: For any 'check then act' pattern, use Lua EVAL, not MULTI/EXEC.
Key Takeaway
MULTI/EXEC batches commands but does not roll back.
Lua scripts provide true conditional atomicity.
WATCH + MULTI for optimistic locking, but Lua is simpler and faster.

Redis Replication, Sentinel & Cluster — Knowing When to Use Each

Single-node Redis is fine for development and small applications, but production requires thinking about high availability and horizontal scaling. Interviewers at mid-to-large companies almost always probe here.

Replication is the foundation: one primary node accepts writes, one or more replicas asynchronously receive those writes. It's not synchronous — there's always a tiny lag, meaning replicas can serve slightly stale reads. This is an intentional trade-off for write throughput.

Redis Sentinel adds automatic failover on top of replication. Sentinel is a separate process (or set of processes) that monitors your primary. If the primary goes down, Sentinel promotes the most up-to-date replica to primary and updates your clients. You need at least 3 Sentinel nodes to avoid split-brain scenarios — a quorum of 2 must agree before a failover triggers.

Redis Cluster is a different beast: it's about horizontal scaling, not just failover. It automatically shards your keyspace across multiple primary nodes using 16,384 hash slots. Each primary can have replicas. The trade-off is that multi-key commands (like MGET or Lua scripts touching multiple keys) only work if all keys hash to the same slot — which you control using hash tags like {user:1001}:session and {user:1001}:profile.
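The slot mapping is small enough to write out by hand. Below is a minimal Java sketch of the key-to-slot calculation as the Redis Cluster specification describes it: CRC16 (the XMODEM variant) of the key, modulo 16384, hashing only the hash tag when the key contains a non-empty {...} section. The class and method names (KeySlotSketch, crc16, hashedPart, slot) are made up for illustration; in practice your client library does this for you.

```java
import java.nio.charset.StandardCharsets;

/** Sketch of Redis Cluster key-to-slot mapping; names here are illustrative only. */
final class KeySlotSketch {

    /** CRC16-CCITT (XMODEM): polynomial 0x1021, initial value 0, no bit reflection. */
    static int crc16(byte[] data) {
        int crc = 0;
        for (byte b : data) {
            crc ^= (b & 0xFF) << 8;
            for (int bit = 0; bit < 8; bit++) {
                crc = (crc & 0x8000) != 0 ? (crc << 1) ^ 0x1021 : crc << 1;
                crc &= 0xFFFF; // keep the register at 16 bits
            }
        }
        return crc;
    }

    /** Returns the hash tag if the key has a non-empty one, otherwise the whole key. */
    static String hashedPart(String key) {
        int open = key.indexOf('{');
        if (open >= 0) {
            int close = key.indexOf('}', open + 1);
            if (close > open + 1) { // at least one character between the braces
                return key.substring(open + 1, close);
            }
        }
        return key;
    }

    static int slot(String key) {
        return crc16(hashedPart(key).getBytes(StandardCharsets.UTF_8)) % 16384;
    }

    public static void main(String[] args) {
        // Same hash tag, so both keys map to the same slot.
        System.out.println(slot("{user:1001}:session") == slot("{user:1001}:profile")); // true
        // Without a tag the whole key is hashed, so these may land on different slots.
        System.out.println(slot("user:1001:session"));
        System.out.println(slot("user:1001:profile"));
    }
}
```

Note the edge case in hashedPart: a key such as foo{}bar has an empty tag, so the whole key is hashed.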

redis_cluster_and_sentinel_concepts.sh (Bash)
# ── Check replication status ──
redis-cli INFO replication
# On primary node output includes:
# role:master
# connected_slaves:2
# slave0:ip=10.0.0.2,port=6379,state=online,offset=1234567,lag=0
# slave1:ip=10.0.0.3,port=6379,state=online,offset=1234560,lag=1
# replication_backlog_size:1048576

# ── Redis Cluster: Hash slot calculation ──
# Redis Cluster uses CRC16(key) % 16384 to decide which shard a key lives on
# Problem: these two keys land on DIFFERENT shards:
redis-cli CLUSTER KEYSLOT "user:1001:session"   # Returns e.g. 11543
redis-cli CLUSTER KEYSLOT "user:1001:profile"   # Returns e.g. 6452
# MGET user:1001:session user:1001:profile would FAIL in cluster mode!

# ── Solution: Hash Tags force keys to the same slot ──
# Wrap the shared part in curly braces — only the part in {} is hashed
redis-cli CLUSTER KEYSLOT "{user:1001}:session"   # Both hash on "user:1001"
redis-cli CLUSTER KEYSLOT "{user:1001}:profile"   # Same slot as above!
# Now MGET {user:1001}:session {user:1001}:profile works correctly

# ── Check cluster node topology ──
redis-cli CLUSTER NODES
# Output shows all nodes, their roles, and which hash slot ranges they own:
# a1b2c3... 10.0.0.1:6379 master - 0 1620000000000 1 connected 0-5460
# d4e5f6... 10.0.0.2:6379 master - 0 1620000000001 2 connected 5461-10922
# g7h8i9... 10.0.0.3:6379 master - 0 1620000000002 3 connected 10923-16383

# ── Check Sentinel state ──
redis-cli -p 26379 SENTINEL MASTERS  # Port 26379 is the default Sentinel port
# Shows the primary being monitored and its current state
# 'num-slaves': 2 — how many replicas exist
# 'num-other-sentinels': 2 — how many other Sentinel nodes are watching
# 'quorum': 2 — how many must agree to trigger failover
Output
# replication INFO
role:master
connected_slaves:2
slave0:ip=10.0.0.2,port=6379,state=online,offset=1234567,lag=0
# hash slots
(integer) 11543
(integer) 6452
(integer) 4847
(integer) 4847
# cluster nodes
a1b2c3 10.0.0.1:6379 master - connected 0-5460
d4e5f6 10.0.0.2:6379 master - connected 5461-10922
Pro Tip:
Interviewers love asking 'what's the difference between Redis Sentinel and Redis Cluster?' The clean answer: Sentinel = high availability for a single dataset (automatic failover), Cluster = horizontal sharding for a dataset too large for one node (plus built-in HA). They solve different problems. Cluster does not rely on Sentinel: each shard's replicas are promoted by Cluster's own failover mechanism, so you choose Sentinel for HA of one dataset or Cluster for scale with HA included.
Production Insight
Replication lag can cause read-after-write inconsistency. A common mitigation: write to the primary, then call WAIT (number of replicas, timeout in milliseconds) to block until that many replicas have acknowledged the write. This adds latency and does not make Redis strongly consistent (a failover can still lose acknowledged writes), but it narrows the window. Only use it for critical operations like payment writes.
In Redis Cluster, resharding puts a slot into 'migrating' or 'importing' state while its keys move, and requests for keys that have already moved receive ASK redirects (MOVED once the slot has changed owner). Cluster-aware clients such as Lettuce or JedisCluster follow these redirects transparently; a client that is not cluster-aware will surface them as errors.
Key takeaway: Sentinel for HA, Cluster for scale. Don't confuse the two.
Key Takeaway
Replication is async — expect stale reads.
Sentinel provides failover, Cluster provides sharding.
Hash tags group related keys on the same shard for multi-key operations.
Production incident · Post-mortem · Severity: high

The Missing Rate Limit: How Redis Eviction Killed Our API Throttling

Symptom
Rate-limited endpoints started returning 200 instead of 429. The upstream database (PostgreSQL) saw connection pool exhaustion and began rejecting legitimate traffic.
Assumption
Redis maxmemory was set high enough (4GB), but the team assumed the allkeys-lru policy would only evict cache keys. They didn't realise rate-limit counters had no TTL and were not protected.
Root cause
During a traffic spike, Redis hit maxmemory (4GB). allkeys-lru evicted the least-recently-used keys, which included the rate-limit counter keys (e.g., ratelimit:user:1001:minute:...) because they had no TTL and were only accessed once per minute per user. Once evicted, the Lua script would find 'key does not exist' and reset the counter to 1, effectively bypassing the limit.
Fix
Changed eviction policy to volatile-lru, which only evicts keys with a TTL. Gave rate-limit counters a TTL equal to the window (e.g., 60 seconds) so stale counters expire on their own; note that a key with a TTL is still an eviction candidate under volatile-lru, so the TTL bounds a counter's lifetime rather than shielding it. Also added monitoring on the Redis eviction rate and increased maxmemory to 8GB to absorb spikes.
Key lesson
  • Eviction policies are not 'free' — each one has a class of keys it will silently destroy. volatile-lru protects non-TTL keys, allkeys-lru does not.
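  • Rate-limit counters are unprotected under allkeys-lru whether or not they have a TTL. Set a TTL matching the window length so stale counters expire on their own, and size maxmemory so live counters are not evicted mid-window.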
  • Watch eviction metrics (evicted_keys in INFO stats) in your monitoring. A spike means Redis is discarding data — investigate immediately.
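The root cause above can be reproduced in miniature. This Java sketch is an in-memory stand-in, not the team's code: EvictedCounterSketch, allow, and evict are hypothetical names, a HashMap plays the role of Redis, and allow mirrors the check-and-increment logic of a Lua rate-limit script (TTL handling is left out to keep the focus on eviction).

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory model of a counter-based rate limiter losing its key. */
final class EvictedCounterSketch {
    private final Map<String, Integer> store = new HashMap<>(); // stand-in for Redis keys
    private final int limit;

    EvictedCounterSketch(int limit) {
        this.limit = limit;
    }

    /** Returns true if the request is allowed, false if it should get a 429. */
    boolean allow(String key) {
        Integer current = store.get(key);
        if (current == null) {      // key missing: treated as the first request in the window
            store.put(key, 1);
            return true;
        }
        if (current >= limit) {
            return false;           // limit reached
        }
        store.put(key, current + 1);
        return true;
    }

    /** Simulates Redis evicting the counter under memory pressure. */
    void evict(String key) {
        store.remove(key);
    }

    public static void main(String[] args) {
        EvictedCounterSketch limiter = new EvictedCounterSketch(3);
        String key = "ratelimit:user:1001";
        for (int i = 0; i < 3; i++) {
            limiter.allow(key);                  // three allowed requests
        }
        System.out.println(limiter.allow(key));  // false: fourth request is blocked
        limiter.evict(key);                      // memory pressure removes the counter
        System.out.println(limiter.allow(key));  // true: the limit has silently reset
    }
}
```

The limiter cannot tell "first request in a new window" apart from "counter was evicted", which is why the failure showed up as 200 responses rather than as an error.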
Production debug guide
Common Redis failure signals and exactly what to check next (5 entries)
Symptom · 01
API responses slow down intermittently, Redis latency spikes from <1ms to >100ms
Fix
Check LATENCY LATEST and LATENCY HISTORY. Also run SLOWLOG GET 100 to identify long-running commands (likely KEYS, SMEMBERS, or high-cardinality SORT). Then use CLIENT LIST to see if many clients are blocking.
Symptom · 02
Redis process uses 100% CPU for extended periods
Fix
Check if AOF rewrite is running (INFO persistence shows aof_rewrite_in_progress=1). If yes, throttle rewrite via auto-aof-rewrite-percentage. If no, check for computationally expensive commands like SORT, ZUNIONSTORE, or EVAL scripts with O(N) operations.
Symptom · 03
Clients get connection refused or timeout
Fix
Verify the maxclients setting (CONFIG GET maxclients). Check system file descriptor limits (ulimit -n). For timeouts, check the timeout setting (CONFIG GET timeout): if it's 0, idle clients are never disconnected, which can lead to fd exhaustion.
Symptom · 04
Redis memory usage is high but maxmemory is not being hit — yet
Fix
Run MEMORY STATS to see breakdown (overhead vs dataset). Check INFO keyspace for key count and average TTL. Use redis-cli --bigkeys to find large keys consuming disproportionate memory. Consider active defragmentation if fragmentation ratio > 1.5.
Symptom · 05
Redis replication lag is growing and never catches up
Fix
Check replication backlog size (repl_backlog_size) — default 1MB may be too small for high-write workloads. Increase to 64MB+ via repl-backlog-size. Also check replica priority and disk I/O on replica (AOF rewrite can stall replication).
Redis Quick Debug Cheat Sheet
Three-command fix for the three most common production Redis issues.
Redis is slow (latency > 10ms per command)
Immediate action
Check LATENCY LATEST and SLOWLOG
Commands
LATENCY LATEST
SLOWLOG GET 50
Fix now
Disconnect the offending client (CLIENT KILL), or stop a runaway Lua script with SCRIPT KILL, then rewrite the code to use SCAN instead of KEYS. If AOF rewrite is the cause, use CONFIG SET auto-aof-rewrite-percentage 200 to reduce frequency.
Redis runs out of memory and starts evicting (evicted_keys > 0)
Immediate action
Check MEMORY STATS and identify large keys
Commands
MEMORY STATS
redis-cli --bigkeys
Fix now
Add more maxmemory (CONFIG SET maxmemory 8gb) temporarily, then fix the root cause: set TTLs on cache keys, reduce key size, or move to volatile-lru if non-TTL keys are important.
Replication is falling behind (replica lag > 10s in INFO replication)
Immediate action
Check repl_backlog_size and disk I/O on replica
Commands
INFO replication
iostat -x 1 5 on replica
Fix now
Increase repl-backlog-size to 64MB (CONFIG SET repl-backlog-size 67108864). If disk I/O is high, disable AOF on replica (CONFIG SET appendonly no) or use faster storage (NVMe).
RDB vs AOF Persistence
Aspect | RDB Persistence | AOF Persistence
Mechanism | Point-in-time snapshots (fork + dump) | Logs every write command sequentially
Data loss risk | Up to last snapshot interval (minutes) | Up to 1 second (with everysec config)
File size | Compact binary format (small) | Grows over time (needs periodic rewrite)
Restart/restore speed | Very fast (single file load) | Slower (replays all commands)
CPU/Memory impact | Fork() spike during snapshot | Continuous small overhead per write
Best for | Acceptable data loss, fast restarts | Near-zero data loss requirement
Use together? | Yes: Redis supports RDB+AOF simultaneously for best of both worlds | AOF used for recovery, RDB for backups

Key takeaways

1. Redis is a data structure server, not just a cache: its native operations on Sorted Sets, Hashes, and Lists let you move computation to the data layer, which is its real superpower.
2. Redis persistence is opt-in: RDB gives compact snapshots with higher data-loss risk, AOF gives near-zero loss at the cost of file size. Use both together in production for defence-in-depth.
3. MULTI/EXEC does NOT roll back on runtime errors: use Lua scripts (EVAL) for conditional atomic operations like rate limiting, where a race condition between separate commands would cause bugs.
4. In Redis Cluster, multi-key commands only work if all keys share the same hash slot: control this with hash tags ({user:1001}:key) to group related keys onto the same shard deliberately.

Common mistakes to avoid


Using KEYS * in production

Symptom
The KEYS command scans the entire keyspace and blocks all other commands while it runs (remember: single-threaded). On a Redis instance with 10 million keys, this can block for seconds, causing cascading timeouts.
Fix
Always use SCAN with a COUNT hint and cursor-based iteration for any key discovery in production code.

Not setting TTLs on cached keys

Symptom
Developers SET thousands of keys during a traffic spike with no expiry. Redis fills to maxmemory, the eviction policy kicks in, and it starts evicting keys according to that policy (or rejecting writes under noeviction), possibly discarding important non-cache data.
Fix
Always pass EX (seconds) or PX (milliseconds) when caching. Make TTL a first-class requirement in your caching strategy, not an afterthought.

Assuming MULTI/EXEC rolls back on error

Symptom
A developer wraps a payment flow in MULTI/EXEC, one command fails due to a wrong data type, and the other commands still execute, leaving data in a partially-updated state. Redis does NOT roll back on runtime errors; only errors detected while queuing (such as syntax errors) make EXEC abort the whole transaction.
Fix
Use Lua scripts for operations that require true all-or-nothing atomicity with conditional logic, or validate all inputs before entering the MULTI block.
INTERVIEW PREP · PRACTICE MODE

Interview Questions on This Topic

Q01 (Senior): Explain how you'd implement a rate limiter using Redis. What commands wo...
Q02 (Senior): Your Redis instance is running out of memory. Walk me through how you'd ...
Q03 (Senior): A developer on your team suggests using Redis Pub/Sub as a reliable mess...
Q04 (Senior): How does Redis handle master-slave replication? Is it synchronous or asy...
Q05 (Junior): Describe the Big O complexity of the most common Redis operations (GET, ...

Q01 of 05 (Senior)

Explain how you'd implement a rate limiter using Redis. What commands would you use and why? What race conditions could occur with a naive implementation?

ANSWER
Use a Lua script with EVAL: get the current count and check it against the limit; if the key does not exist, create it with a TTL equal to the window (e.g., 60s), and if it is under the limit, increment it. Return the current count and the remaining quota. The Lua script runs atomically, so there is no race between the GET and the INCR. A naive implementation using separate GET and INCR commands has a TOCTOU race: two concurrent requests could both read count=99, both pass the check, and both increment, taking the count to 101 and letting one request through over the limit. Always use EVAL or WATCH+MULTI for atomic check-and-increment.
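If the interviewer pushes on the race, it helps to be able to walk through the interleaving step by step. Here is a minimal Java sketch that replays it deterministically. It is a simulation, not Redis client code: CheckThenActSketch and its methods are hypothetical names, with read and actOnObserved standing in for separate GET and INCR round trips and checkAndIncrement standing in for the atomic Lua script.

```java
/** Hypothetical model of a check-then-act race versus an atomic check-and-increment. */
final class CheckThenActSketch {
    private int count;
    private final int limit;

    CheckThenActSketch(int startCount, int limit) {
        this.count = startCount;
        this.limit = limit;
    }

    /** Step 1 of the naive flow: like GET. */
    int read() {
        return count;
    }

    /** Step 2 of the naive flow: decides on a possibly stale value, then increments (like INCR). */
    boolean actOnObserved(int observed) {
        if (observed >= limit) {
            return false;
        }
        count++;
        return true;
    }

    /** Check and increment as one indivisible step, like the Lua script. */
    synchronized boolean checkAndIncrement() {
        if (count >= limit) {
            return false;
        }
        count++;
        return true;
    }

    int count() {
        return count;
    }

    public static void main(String[] args) {
        // Naive flow: both clients read before either one writes.
        CheckThenActSketch naive = new CheckThenActSketch(99, 100);
        int seenByA = naive.read();
        int seenByB = naive.read();
        naive.actOnObserved(seenByA);
        naive.actOnObserved(seenByB);
        System.out.println(naive.count()); // 101: one request over the limit

        // Atomic flow: the second caller sees the updated count and is rejected.
        CheckThenActSketch atomic = new CheckThenActSketch(99, 100);
        System.out.println(atomic.checkAndIncrement()); // true
        System.out.println(atomic.checkAndIncrement()); // false
        System.out.println(atomic.count());             // 100
    }
}
```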
FAQ · 3 QUESTIONS

Frequently Asked Questions

01. Is Redis single-threaded and does that make it slow?
02. What is the difference between Redis cache eviction policies?
03. When should I use Redis Pub/Sub versus Redis Streams?