Redis is an in-memory data structure server, not just a cache — it natively supports strings, hashes, lists, sorted sets, streams, and more
Single-threaded command execution eliminates lock contention and makes latency predictable; network I/O can optionally be multi-threaded in v6+
Durability is opt-in: RDB snapshots (compact, minutes of data loss) vs AOF logs (near-zero loss, larger); use both for production
Eviction policies like allkeys-lru silently kick in when maxmemory is hit — wrong policy can evict important non-cache keys
Lua scripts (EVAL) provide true atomicity for conditional operations like rate limiting — MULTI/EXEC does NOT roll back on runtime errors
Plain-English First
Imagine your office has a massive filing cabinet (your database) and a sticky-note board right next to your desk (Redis). Every time you need a document, walking to the cabinet takes 30 seconds. But if you stick the most-requested documents on your board, you grab them in 2 seconds. Redis is that sticky-note board — blazing-fast, lives in memory, and holds your hottest data so your app never has to make the slow trip to the filing cabinet. The catch? Your board has limited space, and if the office loses power, the sticky notes are gone unless you back them up.
Redis shows up in almost every modern backend stack — from session management at Netflix scale to real-time leaderboards in gaming apps to rate-limiting at API gateways. If you're interviewing for any backend, full-stack, or DevOps role, expect at least two or three Redis questions. Interviewers don't ask them to trip you up — they ask because Redis is one of those tools where misuse causes production fires, and they want to know you understand the trade-offs.
What Redis Actually Is — And Why It's Not 'Just a Cache'
Redis stands for Remote Dictionary Server. Yes, it's famous as a cache, but calling it 'just a cache' in an interview is a red flag. Redis is an in-memory data structure store. It natively understands strings, lists, sets, sorted sets, hashes, bitmaps, hyperloglogs, and streams. That means it's not a dumb key-value bucket — it can perform operations directly on those structures without you pulling data out, modifying it in application code, and pushing it back.
Why does this matter? Take a leaderboard. With a naive approach you'd SELECT all scores from a relational database, sort them in application memory, and return the top 10. With Redis Sorted Sets, you call ZREVRANGE leaderboard 0 9 and Redis returns the top 10 in O(log N + M) time (M is the number of elements returned), atomically and server-side, with no round-trip logic. That's the real power: moving computation closer to the data.
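To make the leaderboard idea concrete without a running server, here is a minimal in-memory sketch in Java that mimics the access pattern of a sorted set: members stay ordered by score, so reading the top N needs no full sort at query time. The class and method names (`LeaderboardSketch`, `add`, `top`) are hypothetical, and this is not how Redis implements sorted sets internally; it only illustrates the idea.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

/** Hypothetical in-memory stand-in for a Redis sorted set (illustration only). */
final class LeaderboardSketch {
    private static final class Entry {
        final String member;
        final double score;
        Entry(String member, double score) { this.member = member; this.score = score; }
    }

    // member -> entry, so re-adding a member replaces its old score (as ZADD does)
    private final Map<String, Entry> byMember = new HashMap<>();

    // Highest score first; ties fall back to reverse member order
    private final TreeSet<Entry> ordered = new TreeSet<Entry>((a, b) -> {
        int byScore = Double.compare(b.score, a.score);
        return byScore != 0 ? byScore : b.member.compareTo(a.member);
    });

    /** Adds a member or updates its score, keeping the ordering intact. */
    void add(String member, double score) {
        Entry old = byMember.remove(member);
        if (old != null) ordered.remove(old);
        Entry entry = new Entry(member, score);
        byMember.put(member, entry);
        ordered.add(entry);
    }

    /** Returns up to n members, highest score first. */
    List<String> top(int n) {
        List<String> result = new ArrayList<>();
        for (Entry e : ordered) {
            if (result.size() == n) break;
            result.add(e.member);
        }
        return result;
    }

    public static void main(String[] args) {
        LeaderboardSketch board = new LeaderboardSketch();
        board.add("alice", 9450);
        board.add("bob", 8820);
        board.add("carol", 9900);
        System.out.println(board.top(2)); // prints [carol, alice]
    }
}
```

The point of the sketch is the shape of the work: the ordering cost is paid on each write, so a top-N read only walks the first N entries.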
Redis is single-threaded for command execution (since v6, network I/O can optionally use multiple threads), which sounds like a weakness but is actually why it's so predictable. No lock contention. No deadlocks. One command finishes before the next starts.
redis_data_structures_demo.sh · BASH
# Connect to the Redis CLI
redis-cli
# ── STRING: Simple key-value with TTL (time-to-live) ──
SET user:1001:session_token "abc123xyz" EX 3600
# EX 3600 means this key auto-deletes after 1 hour
# Perfect for session tokens: no manual cleanup needed
GET user:1001:session_token
# Returns: "abc123xyz"
TTL user:1001:session_token
# Returns: 3598 (seconds remaining, a live countdown)
# ── HASH: Store a user object without serializing to JSON ──
HSET user:1001 name "Priya Sharma" email "priya@example.com" plan "pro"
# Redis stores each field separately, so you can update ONE field
# without reading and rewriting the whole object
HGET user:1001 plan
# Returns: "pro"
HGETALL user:1001
# Returns all fields: name, email, plan
# ── SORTED SET: Real-time leaderboard ──
ZADD game:leaderboard 9450 "alice"
ZADD game:leaderboard 8820 "bob"
ZADD game:leaderboard 9900 "carol"
# Top 3 players, highest score first (0-indexed range)
ZREVRANGE game:leaderboard 0 2 WITHSCORES
# Returns:
# 1) "carol"
# 2) "9900"
# 3) "alice"
# 4) "9450"
# 5) "bob"
# 6) "8820"
# ── LIST: Message queue pattern ──
# Comments sit on their own lines: inside redis-cli, text after a command is sent as arguments
# Push to the LEFT (head)
LPUSH email:queue "welcome:user:1002"
LPUSH email:queue "receipt:order:5501"
# Pop from the RIGHT (tail), which gives a FIFO queue
RPOP email:queue
# Returns: "welcome:user:1002"
Output
"abc123xyz"
3598
"pro"
1) "name"
2) "Priya Sharma"
3) "email"
4) "priya@example.com"
5) "plan"
6) "pro"
1) "carol"
2) "9900"
3) "alice"
4) "9450"
5) "bob"
6) "8820"
"welcome:user:1002"
Interview Gold:
When asked 'what data structures does Redis support?', don't just list them — explain ONE use case per structure. Strings → session tokens with TTL, Hashes → user profiles, Sorted Sets → leaderboards, Lists → job queues, Sets → unique visitor tracking. That answer signals real-world experience, not textbook memorisation.
Production Insight
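The biggest production mistake with Redis data structures is treating Sorted Sets as an afterthought. ZADD is O(log N) per element, so a single ZADD carrying a million members runs as one long command and blocks the event loop while it executes. Split bulk loads into smaller batches (for example, a few thousand members per ZADD) and send them through a pipeline.
Also, Hashes that grow past the compact-encoding threshold (hash-max-listpack-entries, called hash-max-ziplist-entries in older versions) switch to a real hash table and can consume more memory than a single serialised JSON string. Reach for a Hash when you need to update individual fields; for large objects that are always read and written whole, a serialised string may be cheaper.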
Key takeaway: choose the structure that minimises command count per user request — that's how you move computation closer to data.
Key Takeaway
Redis is a data structure server, not just a key-value cache.
The right structure reduces application code complexity.
Single-threading makes latency predictable but commands must be fast — avoid O(N) commands in production.
Connecting from Java: Spring Data Redis & Jedis
In a production Spring Boot environment, you won't be using the CLI. You'll likely use Spring Data Redis with Lettuce or Jedis. The key is using the RedisTemplate or StringRedisTemplate to ensure thread-safe operations and proper serialization. Interviewers often check if you know how to handle connection pooling and why Lettuce (asynchronous/non-blocking) is generally preferred over Jedis (synchronous) in high-concurrency Spring Boot applications.
io/thecodeforge/redis/RedisConfig.java · JAVA
package io.thecodeforge.redis;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.redis.connection.RedisConnectionFactory;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.data.redis.serializer.GenericJackson2JsonRedisSerializer;
import org.springframework.data.redis.serializer.StringRedisSerializer;

@Configuration
public class RedisConfig {
    /**
     * Production-grade RedisTemplate configuration with JSON serialization.
     * By default, Spring uses JdkSerializationRedisSerializer (binary),
     * which is unreadable in CLI. JSON is better for debugging.
     */
    @Bean
    public RedisTemplate<String, Object> redisTemplate(RedisConnectionFactory factory) {
        RedisTemplate<String, Object> template = new RedisTemplate<>();
        template.setConnectionFactory(factory);
        // Use String serializer for keys
        template.setKeySerializer(new StringRedisSerializer());
        // Use JSON serializer for values to store complex Java objects
        template.setValueSerializer(new GenericJackson2JsonRedisSerializer());
        return template;
    }
}
Output
(No console output. Spring registers the RedisTemplate bean when the application context starts.)
Forge Tip: Lettuce vs Jedis
If asked about connection libraries, mention that a Lettuce connection is thread-safe and can be shared across threads for ordinary (non-blocking, non-transactional) operations, whereas a Jedis instance is not thread-safe, so multi-threaded code needs a connection pool. This shows you understand resource management.
Production Insight
A common production issue: using RedisTemplate with the default JdkSerializationRedisSerializer leads to keys like \xac\xed\x00\x05t\x00... in Redis, which makes CLI debugging painful. Always override the template's key/value serializers to use String or JSON.
Another gotcha: a Jedis instance wraps a single connection and is not thread-safe. Code that creates a new Jedis instance per operation opens a new TCP connection each time, and under load this causes connection storms and TIME_WAIT exhaustion. Always use a connection pool with Jedis, or switch to Lettuce.
Key takeaway: in Spring Boot, prefer Lettuce + StringRedisSerializer + GenericJackson2JsonRedisSerializer for production.
Key Takeaway
Serialization choice affects debuggability.
Lettuce shares one connection, Jedis needs a pool.
Default Spring Data Redis config is not production-ready without custom serializers.
Persistence, Eviction & the Trade-offs That Cause Production Incidents
The single most dangerous misconception about Redis is treating it as a durable store by default. It isn't. Out of the box, Redis only takes periodic RDB snapshots (and none at all if the save points are disabled), so a crash loses every write since the last snapshot. Redis gives you two persistence mechanisms: RDB (Redis Database snapshots) and AOF (Append-Only File), and you need to understand both for interviews and for production.
RDB takes point-in-time snapshots — like a photograph of your data every N seconds if M keys changed. It's compact, fast to restore, but you can lose up to the last snapshot window of writes. AOF logs every write operation — like a transaction log. Slower to restore, larger on disk, but much less data loss risk (configurable to fsync every second or every command).
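The "every N seconds if M keys changed" rule can be sketched as a small function. The Java below is a simplified model, not Redis source code: `SavePointSketch`, `parse`, and `shouldSnapshot` are hypothetical names, and the comparison is an approximation of how the server evaluates its save points.

```java
/** Hypothetical model of how RDB save points such as "3600 1 300 100 60 10000" are evaluated. */
final class SavePointSketch {

    /** Parses the config value into {seconds, changes} pairs. */
    static long[][] parse(String saveConfig) {
        String[] parts = saveConfig.trim().split("\\s+");
        long[][] points = new long[parts.length / 2][2];
        for (int i = 0; i + 1 < parts.length; i += 2) {
            points[i / 2][0] = Long.parseLong(parts[i]);     // seconds since the last save
            points[i / 2][1] = Long.parseLong(parts[i + 1]); // minimum number of changes
        }
        return points;
    }

    /** A snapshot is due as soon as ANY save point has both of its conditions met. */
    static boolean shouldSnapshot(long[][] points, long secondsSinceLastSave, long changes) {
        for (long[] point : points) {
            if (secondsSinceLastSave >= point[0] && changes >= point[1]) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        long[][] points = parse("3600 1 300 100 60 10000");
        // 120 seconds and 50 changes: too few changes for the 60s rule, too early for the 300s rule
        System.out.println(shouldSnapshot(points, 120, 50)); // prints false
    }
}
```

Read the other way round, the same rule bounds the loss window: a slow trickle of writes can sit unsaved for up to an hour with these save points.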
Eviction policy is equally critical. When Redis hits its maxmemory limit, it has to decide what to drop. The allkeys-lru policy evicts the least-recently-used key across all keys — great for a pure cache. The volatile-lru policy only evicts keys that have a TTL set — useful when some keys must survive (like a rate limit counter with no TTL). Picking the wrong eviction policy is a silent killer: you'll see cache misses spike with no obvious error.
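The difference between the two policies is easier to see in a toy model. The sketch below is an assumption-heavy simplification: it counts keys instead of bytes, uses an exact LRU order (Redis approximates LRU by sampling), and the names `EvictionSketch`, `set`, `touch`, and `exists` are hypothetical.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model of allkeys-lru vs volatile-lru (exact LRU, capacity counted in keys). */
final class EvictionSketch {
    enum Policy { ALLKEYS_LRU, VOLATILE_LRU }

    private final int maxKeys;   // stand-in for maxmemory
    private final Policy policy;
    // accessOrder=true keeps the least recently used key first; the value records "has a TTL"
    private final LinkedHashMap<String, Boolean> keys = new LinkedHashMap<>(16, 0.75f, true);

    EvictionSketch(int maxKeys, Policy policy) {
        this.maxKeys = maxKeys;
        this.policy = policy;
    }

    /** Writes a key, evicting one first if the store is full. Returns false if the write is rejected. */
    boolean set(String key, boolean hasTtl) {
        if (!keys.containsKey(key) && keys.size() >= maxKeys && !evictOne()) {
            return false; // nothing evictable: Redis would answer with an OOM error
        }
        keys.put(key, hasTtl);
        return true;
    }

    /** Simulates a read, which marks the key as recently used. */
    void touch(String key) { keys.get(key); }

    boolean exists(String key) { return keys.containsKey(key); }

    /** Removes the least recently used key that the policy allows to be evicted. */
    private boolean evictOne() {
        Iterator<Map.Entry<String, Boolean>> it = keys.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Boolean> entry = it.next();
            if (policy == Policy.ALLKEYS_LRU || entry.getValue()) {
                it.remove();
                return true;
            }
        }
        return false;
    }
}
```

Under `ALLKEYS_LRU` a rarely touched counter with no TTL is the first to go; under `VOLATILE_LRU` it survives, and writes are rejected once only non-TTL keys remain.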
redis_persistence_and_eviction.sh · BASH
# ── Check current persistence config ──
redis-cli CONFIG GET save
# Default output (snapshot triggers):
# 1) "save"
# 2) "3600 1 300 100 60 10000"
# Meaning: snapshot if 1 change in 3600s, OR 100 changes in 300s, OR 10000 changes in 60s
# ── Enable AOF (Append-Only File) for better durability ──
redis-cli CONFIG SET appendonly yes
redis-cli CONFIG SET appendfsync everysec
# everysec = fsync every second: best balance of performance vs durability
# always = fsync on every write: safest but much slower
# no = OS decides when to flush: fastest but data loss risk
# ── Set a memory limit and eviction policy ──
redis-cli CONFIG SET maxmemory 256mb
redis-cli CONFIG SET maxmemory-policy allkeys-lru
# allkeys-lru: when full, evict the least-recently-used key from ANY key
# volatile-lru: only evict keys that have a TTL; non-TTL keys are safe
# noeviction: reject new writes when full and return an OOM error (dangerous for a cache!)
# ── Check how likely a key is to be evicted ──
redis-cli OBJECT IDLETIME user:1001:session_token
# Returns idle time in seconds: higher = more likely to be evicted under LRU
# ── Persist config changes to redis.conf so they survive restart ──
redis-cli CONFIG REWRITE
# Without this, all CONFIG SET changes are lost on restart!
Output
1) "save"
2) "3600 1 300 100 60 10000"
OK
OK
OK
OK
(integer) 142
OK
Watch Out:
CONFIG SET changes are runtime-only by default. If Redis restarts (crash, deploy, maintenance), every CONFIG SET you applied is gone. Always follow up with CONFIG REWRITE to persist changes to redis.conf — or manage config via your infrastructure-as-code tooling. This catches even experienced engineers off guard.
Production Insight
Many teams set maxmemory but never set maxmemory-policy, defaulting to noeviction. When Redis hits the limit, writes fail with OOM errors — causing cascading failures across dependent services. Always explicitly set a policy.
Another trap: with appendfsync always, every write waits for an fsync, so on a busy Redis the disk's fsync latency becomes the ceiling on write throughput. everysec is the standard for production.
Key takeaway: RDB + AOF together gives snapshot backups plus near-real-time durability. But AOF rewrite can spike CPU and memory — schedule it during low traffic.
CONFIG SET without CONFIG REWRITE is lost on restart.
Atomicity, Transactions & the Lua Script Pattern
Redis is single-threaded, so individual commands are always atomic. But what about multi-step operations like 'check a counter, increment it only if it's below 100'? That's where MULTI/EXEC transactions and Lua scripts come in — and this is a favourite interview deep-dive.
MULTI/EXEC queues a batch of commands that execute atomically. No other client can sneak a command in between. But there's a critical gotcha: Redis transactions don't roll back on runtime errors. If one command in the queue fails (e.g. wrong type), the rest still execute. That's intentional and very different from SQL transactions.
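The no-rollback behaviour can be modelled in a few lines. This is a hypothetical in-memory sketch (`TransactionSketch` and its methods are invented for illustration, and the replies are simplified to strings), not a Redis client.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of MULTI/EXEC: commands are queued, then run in order with no rollback. */
final class TransactionSketch {
    private final Map<String, String> store = new HashMap<>();
    private final List<Runnable> queued = new ArrayList<>();

    /** Queues a SET; nothing touches the store until exec(). */
    void set(String key, String value) {
        queued.add(() -> store.put(key, value));
    }

    /** Queues an INCR; it fails at run time if the stored value is not an integer. */
    void incr(String key) {
        queued.add(() -> {
            String current = store.getOrDefault(key, "0");
            store.put(key, Long.toString(Long.parseLong(current) + 1));
        });
    }

    /** Runs every queued command; a failure is reported but nothing is undone or skipped. */
    List<String> exec() {
        List<String> replies = new ArrayList<>();
        for (Runnable command : queued) {
            try {
                command.run();
                replies.add("OK");
            } catch (NumberFormatException e) {
                replies.add("ERR value is not an integer or out of range");
            }
        }
        queued.clear();
        return replies;
    }

    String get(String key) { return store.get(key); }
}
```

Queueing `set("name", "alice")`, `incr("name")`, `set("plan", "pro")` and then calling `exec()` leaves both SETs applied even though the INCR in the middle failed, which is exactly the partial-update state the paragraph above warns about.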
For complex conditional logic, Lua scripts are the better tool. A Lua script runs entirely on the Redis server as a single atomic unit. This is how rate limiting is implemented correctly — the check-and-increment happens in one atomic server-side script with no race condition possible.
Understanding the difference between MULTI/EXEC (queue and batch) versus Lua (conditional server-side logic) is what separates candidates who've used Redis in production from those who've only read the docs.
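The check-and-increment logic that a Lua rate limiter performs can be mirrored in plain Java to show why atomicity matters. In this sketch, `ConcurrentHashMap.compute` plays the role of the atomic script; the class name, the key format, and the 60-second window are assumptions, and expired windows are never cleaned up here (in Redis the key's TTL takes care of that).

```java
import java.util.concurrent.ConcurrentHashMap;

/** Fixed-window rate limiter where the check and the increment form one atomic step. */
final class RateLimiterSketch {
    private final ConcurrentHashMap<String, Long> counters = new ConcurrentHashMap<>();
    private final long limit;

    RateLimiterSketch(long limit) {
        this.limit = limit;
    }

    /** Returns true if the request fits in the current one-minute window. */
    boolean allow(String userId, long epochSeconds) {
        String key = "ratelimit:" + userId + ":" + (epochSeconds / 60);
        boolean[] allowed = {false};
        // compute() runs the lambda atomically per key, so no other caller can
        // slip in between reading the count and writing the new value
        counters.compute(key, (k, current) -> {
            long count = current == null ? 0 : current;
            if (count >= limit) {
                return count;      // at the limit: leave the counter unchanged
            }
            allowed[0] = true;
            return count + 1;
        });
        return allowed[0];
    }

    public static void main(String[] args) {
        RateLimiterSketch limiter = new RateLimiterSketch(2);
        System.out.println(limiter.allow("1001", 120)); // prints true
        System.out.println(limiter.allow("1001", 121)); // prints true
        System.out.println(limiter.allow("1001", 122)); // prints false
    }
}
```

Splitting the same logic into a separate read followed by a separate write is what reopens the race: two callers can both read a count below the limit before either writes.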
redis_atomic_rate_limiter.sh · BASH
# ── MULTI/EXEC: Batch commands atomically ──
# Run these inside ONE redis-cli session: transaction state belongs to a connection,
# so separate "redis-cli MULTI" and "redis-cli EXEC" invocations would not share it
redis-cli
MULTI
INCR order:5501:item_count
# Set a TTL inside the transaction
EXPIRE order:5501:item_count 86400
EXEC
# Both commands execute atomically: no other client's command can run
# between INCR and EXPIRE
# ── Lua Script: Atomic rate limiter (run from the shell) ──
# This is the CORRECT way to implement rate limiting.
# Using separate GET + INCR commands creates a race condition.
redis-cli EVAL "
local key = KEYS[1] -- e.g. 'ratelimit:user:1001:minute:2024010112'
local limit = tonumber(ARGV[1]) -- e.g. 100 (requests per minute)
local current = redis.call('GET', key)
if current == false then
  -- Key doesn't exist yet: first request in this window
  redis.call('SET', key, 1, 'EX', 60) -- Set count=1 with 60s TTL
  return {1, limit - 1} -- {current_count, remaining}
end
current = tonumber(current)
if current >= limit then
  return {current, 0} -- Rate limit hit: reject the request
end
-- Under the limit: increment and return updated values
local new_count = redis.call('INCR', key)
return {new_count, limit - new_count}
" 1 "ratelimit:user:1001:minute:2024010112" 100
# Output on first call: 1) (integer) 1  2) (integer) 99
# Output on 100th call: 1) (integer) 100  2) (integer) 0
# Output on 101st call: 1) (integer) 100  2) (integer) 0 (blocked)
# ── WATCH: Optimistic locking (alternative to Lua for simple cases) ──
# Again inside one redis-cli session, because WATCH is also per connection
WATCH user:1001:balance
# If user:1001:balance is modified by another client before EXEC,
# the entire transaction aborts and EXEC returns nil, so your app retries
MULTI
DECRBY user:1001:balance 50
EXEC
# Returns nil if balance was modified by another client (optimistic lock failed)
# Returns array of results if successful
Output
1) (integer) 1
2) (integer) 1
1) (integer) 1
2) (integer) 99
OK
OK
1) (integer) 1
Interview Gold:
If asked 'how does Redis handle concurrency?', the layered answer is: (1) individual commands are atomic by nature of single-threading, (2) MULTI/EXEC batches commands atomically but doesn't rollback on error, (3) Lua scripts are the gold standard for conditional atomic operations, (4) WATCH provides optimistic locking for CAS (compare-and-swap) patterns. Mentioning all four layers shows real depth.
Production Insight
A common bug: using MULTI/EXEC for a 'deduct balance if sufficient' flow. Commands queued inside MULTI return only QUEUED, so you cannot branch on the result of a GET there. If you instead read the balance before MULTI, another client can modify it between your GET and your EXEC. Use WATCH + MULTI for optimistic locking, or better, a Lua script that does both the check and the deduction atomically.
Another pitfall: Lua scripts that call redis.call() inside a loop with many iterations. This blocks the event loop for the entire duration. Keep scripts fast — O(1) or small O(log N) — and avoid time-consuming computations.
Key takeaway: For any 'check then act' pattern, use Lua EVAL, not MULTI/EXEC.
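The WATCH + MULTI retry loop has a close analogue in Java's compare-and-set, which makes the optimistic-locking idea easy to try locally. This sketch is an analogy only: `OptimisticDebitSketch` is a hypothetical class, and `AtomicLong.compareAndSet` stands in for "EXEC succeeds only if the watched key is unchanged".

```java
import java.util.concurrent.atomic.AtomicLong;

/** 'Deduct balance if sufficient' using optimistic retries, analogous to WATCH + MULTI/EXEC. */
final class OptimisticDebitSketch {
    private final AtomicLong balance;

    OptimisticDebitSketch(long initialBalance) {
        this.balance = new AtomicLong(initialBalance);
    }

    /** Deducts amount only if funds are sufficient; retries when another writer got in first. */
    boolean debit(long amount) {
        while (true) {
            long seen = balance.get();           // like WATCH followed by GET
            if (seen < amount) {
                return false;                    // the check fails: nothing is written
            }
            if (balance.compareAndSet(seen, seen - amount)) {
                return true;                     // like EXEC returning results
            }
            // The value changed since we read it (EXEC would return nil): loop and retry
        }
    }

    long balance() { return balance.get(); }

    public static void main(String[] args) {
        OptimisticDebitSketch account = new OptimisticDebitSketch(120);
        System.out.println(account.debit(50)); // prints true
        System.out.println(account.debit(50)); // prints true
        System.out.println(account.debit(50)); // prints false
    }
}
```

The retry loop is the cost of the optimistic approach: under heavy contention callers may loop several times, which is one reason a server-side script is often preferred.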
Key Takeaway
MULTI/EXEC batches commands but does not roll back.
Lua scripts provide true conditional atomicity.
WATCH + MULTI gives optimistic locking, but Lua is usually simpler and avoids retries under contention.
Redis Replication, Sentinel & Cluster — Knowing When to Use Each
Single-node Redis is fine for development and small applications, but production requires thinking about high availability and horizontal scaling. Interviewers at mid-to-large companies almost always probe here.
Replication is the foundation: one primary node accepts writes, one or more replicas asynchronously receive those writes. It's not synchronous — there's always a tiny lag, meaning replicas can serve slightly stale reads. This is an intentional trade-off for write throughput.
Redis Sentinel adds automatic failover on top of replication. Sentinel is a separate process (or set of processes) that monitors your primary. If the primary goes down, Sentinel promotes the most up-to-date replica to primary and updates your clients. You need at least 3 Sentinel nodes to avoid split-brain scenarios — a quorum of 2 must agree before a failover triggers.
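The "at least 3 Sentinels" advice comes down to simple arithmetic, sketched below. The model is deliberately simplified and the names are hypothetical: it assumes the Sentinels that see the primary as down are the same ones available to vote, and it captures two conditions, namely that the quorum must be met to declare the primary down and that a majority of all Sentinels must authorize the failover.

```java
/** Simplified arithmetic behind Sentinel failover decisions (hypothetical helper). */
final class SentinelQuorumSketch {

    /**
     * @param totalSentinels        how many Sentinels are configured in total
     * @param sentinelsAgreeingDown how many can talk to each other and see the primary as down
     * @param quorum                configured quorum for declaring the primary down
     */
    static boolean canFailover(int totalSentinels, int sentinelsAgreeingDown, int quorum) {
        int majority = totalSentinels / 2 + 1;
        boolean failureDetected = sentinelsAgreeingDown >= quorum;
        boolean failoverAuthorized = sentinelsAgreeingDown >= majority;
        return failureDetected && failoverAuthorized;
    }

    public static void main(String[] args) {
        // Three Sentinels, quorum 2: one Sentinel can be lost and failover still works
        System.out.println(canFailover(3, 2, 2)); // prints true
        // Two Sentinels, quorum 1: a lone survivor detects the failure but has no majority
        System.out.println(canFailover(2, 1, 1)); // prints false
    }
}
```

The second case is why two Sentinels are not enough: losing either one leaves the other unable to reach a majority.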
Redis Cluster is a different beast: it's about horizontal scaling, not just failover. It automatically shards your keyspace across multiple primary nodes using 16,384 hash slots. Each primary can have replicas. The trade-off is that multi-key commands (like MGET or Lua scripts touching multiple keys) only work if all keys hash to the same slot — which you control using hash tags like {user:1001}:session and {user:1001}:profile.
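The slot calculation is small enough to reproduce. The sketch below follows the rule described above, CRC16 of the key modulo 16384, with the hash-tag rule applied first. It uses the CRC16 variant named in the Redis Cluster specification (XMODEM, polynomial 0x1021); the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

/** Computes the Redis Cluster hash slot for a key, honouring {hash tags}. */
final class HashSlotSketch {

    /** CRC16-CCITT (XMODEM): polynomial 0x1021, initial value 0, computed bit by bit. */
    static int crc16(byte[] data) {
        int crc = 0;
        for (byte b : data) {
            crc ^= (b & 0xFF) << 8;
            for (int i = 0; i < 8; i++) {
                crc = (crc & 0x8000) != 0 ? (crc << 1) ^ 0x1021 : crc << 1;
                crc &= 0xFFFF;
            }
        }
        return crc;
    }

    /** If the key has a non-empty {...} section, only the text inside the first one is hashed. */
    static String hashedPart(String key) {
        int open = key.indexOf('{');
        if (open >= 0) {
            int close = key.indexOf('}', open + 1);
            if (close > open + 1) {
                return key.substring(open + 1, close);
            }
        }
        return key;
    }

    static int slot(String key) {
        return crc16(hashedPart(key).getBytes(StandardCharsets.UTF_8)) % 16384;
    }

    public static void main(String[] args) {
        System.out.println(slot("foo")); // prints 12182
        // Same hash tag, so both keys map to the same slot
        System.out.println(slot("{user:1001}:session") == slot("{user:1001}:profile")); // prints true
    }
}
```

Because only `user:1001` is hashed for both tagged keys, a multi-key command over them stays within one slot.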
redis_cluster_and_sentinel_concepts.sh · BASH
# ── Check replication status ──
redis-cli INFO replication
# On primary node output includes:
# role:master
# connected_slaves:2
# slave0:ip=10.0.0.2,port=6379,state=online,offset=1234567,lag=0
# slave1:ip=10.0.0.3,port=6379,state=online,offset=1234560,lag=1
# replication_backlog_size:1048576
# ── Redis Cluster: Hash slot calculation ──
# Redis Cluster uses CRC16(key) % 16384 to decide which shard a key lives on
# Problem: these two keys land on DIFFERENT shards:
redis-cli CLUSTER KEYSLOT "user:1001:session" # Returns e.g. 11543
redis-cli CLUSTER KEYSLOT "user:1001:profile" # Returns e.g. 6452
# MGET user:1001:session user:1001:profile would FAIL in cluster mode (CROSSSLOT error)!
# ── Solution: Hash Tags force keys to the same slot ──
# Wrap the shared part in curly braces: only the part in {} is hashed
redis-cli CLUSTER KEYSLOT "{user:1001}:session" # Both hash on "user:1001"
redis-cli CLUSTER KEYSLOT "{user:1001}:profile" # Same slot as above!
# Now MGET {user:1001}:session {user:1001}:profile works correctly
# ── Check cluster node topology ──
redis-cli CLUSTER NODES
# Output shows all nodes, their roles, and which hash slot ranges they own:
# a1b2c3... 10.0.0.1:6379 master - 0 1620000000000 1 connected 0-5460
# d4e5f6... 10.0.0.2:6379 master - 0 1620000000001 2 connected 5461-10922
# g7h8i9... 10.0.0.3:6379 master - 0 1620000000002 3 connected 10923-16383
# ── Check Sentinel state ──
redis-cli -p 26379 SENTINEL MASTERS # Port 26379 is the default Sentinel port
# Shows the primary being monitored and its current state
# 'num-slaves': 2 (how many replicas exist)
# 'num-other-sentinels': 2 (how many other Sentinel nodes are watching)
# 'quorum': 2 (how many must agree to trigger failover)
Interviewers love asking 'what's the difference between Redis Sentinel and Redis Cluster?' The clean answer: Sentinel = high availability for a single dataset (automatic failover), Cluster = horizontal sharding for a dataset too large for one node (plus built-in HA). They solve different problems. Cluster does not need Sentinel: its nodes detect failures and promote a shard's replica themselves, so you choose Sentinel for an unsharded deployment and Cluster when you need to scale out.
Production Insight
Replication lag can cause read-after-write inconsistency. A common mitigation: write to the primary, then call WAIT numreplicas timeout to block until that many replicas have acknowledged the write. This adds latency and does not make Redis strongly consistent, but it shrinks the window in which an acknowledged write exists on the primary only. Use it only for critical operations like payment writes.
In Redis Cluster, a slot that is being moved during resharding is briefly in a 'migrating' or 'importing' state. During that window clients can receive ASK redirects, and once the move completes they receive MOVED redirects. Cluster-aware clients (such as Lettuce's cluster client or JedisCluster) follow these redirects; code that talks to a single node with a plain client sees them as errors.
Key takeaway: Sentinel for HA, Cluster for scale. Don't confuse the two.
Hash tags group related keys on the same shard for multi-key operations.
Production incident · POST-MORTEM · severity: high
The Missing Rate Limit: How Redis Eviction Killed Our API Throttling
Symptom
Rate-limited endpoints started returning 200 instead of 429. The upstream database (PostgreSQL) saw connection pool exhaustion and began rejecting legitimate traffic.
Assumption
Redis maxmemory was set high enough (4GB), and the team assumed the allkeys-lru policy would only evict cache keys. They didn't realise that the rate-limit counters, which had no TTL, were just as evictable as any cache key.
Root cause
During a traffic spike, Redis hit maxmemory (4GB). allkeys-lru evicted the least-recently-used keys, which included the rate-limit counter keys (e.g., ratelimit:user:1001:minute:...) because the policy ignores TTLs and the counters were only accessed about once per minute per user. Once a counter was evicted, the Lua script would find that the key did not exist and reset the counter to 1, effectively bypassing the limit.
Fix
Changed the eviction policy to volatile-lru so that only keys with a TTL can be evicted. Gave rate-limit counters a TTL equal to the window (e.g., 60 seconds) so stale counters expire on their own. Note that a TTL also makes a key an eviction candidate under volatile-lru, which is why the extra headroom matters: also added monitoring on the Redis eviction rate and increased maxmemory to 8GB to absorb spikes.
Key lesson
Eviction policies are not 'free': each one has a class of keys it will silently destroy. volatile-lru protects non-TTL keys, allkeys-lru does not.
Rate-limit counters are unprotected under allkeys-lru whether or not they have a TTL. Set a TTL matching the window length so counters expire on their own, and remember that under volatile-lru the TTL also makes them evictable.
Watch eviction metrics (evicted_keys in INFO stats) in your monitoring. A spike means Redis is discarding data — investigate immediately.
Production debug guide: common Redis failure signals and exactly what to check next (5 entries)
Symptom · 01
API responses slow down intermittently, Redis latency spikes from <1ms to >100ms
→
Fix
Check LATENCY LATEST and LATENCY HISTORY. Also run SLOWLOG GET 100 to identify long-running commands (likely KEYS, SMEMBERS, or high-cardinality SORT). Then use CLIENT LIST to see if many clients are blocking.
Symptom · 02
Redis process uses 100% CPU for extended periods
→
Fix
Check if AOF rewrite is running (INFO persistence shows aof_rewrite_in_progress=1). If yes, throttle rewrite via auto-aof-rewrite-percentage. If no, check for computationally expensive commands like SORT, ZUNIONSTORE, or EVAL scripts with O(N) operations.
Symptom · 03
Clients get connection refused or timeout
→
Fix
Verify the maxclients setting (CONFIG GET maxclients). Check system file descriptor limits (ulimit -n). For timeouts, check the timeout setting (CONFIG GET timeout): if it's 0, idle clients are never disconnected, which can lead to fd exhaustion.
Symptom · 04
Redis memory usage is high but maxmemory is not being hit — yet
→
Fix
Run MEMORY STATS to see breakdown (overhead vs dataset). Check INFO keyspace for key count and average TTL. Use redis-cli --bigkeys to find large keys consuming disproportionate memory. Consider active defragmentation if fragmentation ratio > 1.5.
Symptom · 05
Redis replication lag is growing and never catches up
→
Fix
Check replication backlog size (repl_backlog_size) — default 1MB may be too small for high-write workloads. Increase to 64MB+ via repl-backlog-size. Also check replica priority and disk I/O on replica (AOF rewrite can stall replication).
Redis Quick Debug Cheat Sheet: three-command fix for the three most common production Redis issues.
Redis is slow (latency > 10ms per command)
Immediate action
Check LATENCY LATEST and SLOWLOG
Commands
LATENCY LATEST
SLOWLOG GET 50
Fix now
Kill the offending command (CLIENT KILL) or rewrite it using SCAN instead of KEYS. If AOF rewrite is the cause, use CONFIG SET auto-aof-rewrite-percentage 200 to reduce frequency.
Redis runs out of memory and starts evicting (evicted_keys > 0)
Immediate action
Check MEMORY STATS and identify large keys
Commands
MEMORY STATS
redis-cli --bigkeys
Fix now
Add more maxmemory (CONFIG SET maxmemory 8gb) temporarily, then fix the root cause: set TTLs on cache keys, reduce key size, or move to volatile-lru if non-TTL keys are important.
Replication is falling behind (repl_backlog_histlen < repl_backlog_size, slave lag > 10s)
Immediate action
Check repl_backlog_size and disk I/O on replica
Commands
INFO replication
iostat -x 1 5 on replica
Fix now
Increase repl-backlog-size to 64MB (CONFIG SET repl-backlog-size 67108864). If disk I/O is high, disable AOF on replica (CONFIG SET appendonly no) or use faster storage (NVMe).
RDB vs AOF Persistence
| Aspect | RDB Persistence | AOF Persistence |
| --- | --- | --- |
| Mechanism | Point-in-time snapshots (fork + dump) | Logs every write command sequentially |
| Data loss risk | Up to last snapshot interval (minutes) | Up to 1 second (with everysec config) |
| File size | Compact binary format, small | Grows over time, needs periodic rewrite |
| Restart/restore speed | Very fast: single file load | Slower: replays all commands |
| CPU/Memory impact | Fork() spike during snapshot | Continuous small overhead per write |
| Best for | Acceptable data loss, fast restarts | Near-zero data loss requirement |
| Use together? | Yes: Redis supports RDB+AOF simultaneously for best of both worlds | AOF used for recovery, RDB for backups |
Key takeaways
1. Redis is a data structure server, not just a cache: its native operations on Sorted Sets, Hashes, and Lists let you move computation to the data layer, which is its real superpower.
2. Redis durability is opt-in: RDB gives compact snapshots with higher data-loss risk, AOF gives near-zero loss at the cost of file size. Use both together in production for defence-in-depth.
3. MULTI/EXEC does NOT roll back on runtime errors: use Lua scripts (EVAL) for conditional atomic operations like rate limiting, where a race condition between separate commands would cause bugs.
4. In Redis Cluster, multi-key commands only work if all keys share the same hash slot: control this with hash tags ({user:1001}:key) to group related keys onto the same shard deliberately.
Common mistakes to avoid
3 patterns
×
Using KEYS * in production
Symptom
The KEYS command scans the entire keyspace and blocks all other commands while it runs (remember: single-threaded). On a Redis instance with 10 million keys, this can block for seconds, causing cascading timeouts.
Fix
Always use SCAN with a COUNT hint and cursor-based iteration for any key discovery in production code.
×
Not setting TTLs on cached keys
Symptom
Developers SET thousands of keys during a traffic spike with no expiry. Redis fills to maxmemory, the eviction policy kicks in, and it starts evicting whatever the policy selects, possibly important non-cache data (or, under noeviction, it starts rejecting writes).
Fix
Always pass EX (seconds) or PX (milliseconds) when caching. Make TTL a first-class requirement in your caching strategy, not an afterthought.
×
Assuming MULTI/EXEC rolls back on error
Symptom
A developer wraps a payment flow in MULTI/EXEC, one command fails due to a wrong data type, and the other commands still execute — leaving data in a partially-updated state. Redis does NOT roll back on runtime errors (only on syntax errors before EXEC).
Fix
Use Lua scripts for operations that require true all-or-nothing atomicity with conditional logic, or validate all inputs before entering the MULTI block.
INTERVIEW PREP · PRACTICE MODE
Interview Questions on This Topic
Q01 of 05 · SENIOR
Explain how you'd implement a rate limiter using Redis. What commands would you use and why? What race conditions could occur with a naive implementation?
ANSWER
Use a Lua script with EVAL: get the current count, check it against the limit, and if it is under the limit, increment it, setting a TTL equal to the window (e.g., 60s) when the key is first created. Return the current count and the remaining quota. The Lua script runs atomically, so there is no race condition between GET and INCR. A naive implementation using separate GET and INCR commands has a TOCTOU race: with a limit of 100, two concurrent requests can both read count=99, both pass the check, and both increment, letting 101 requests through. Always use EVAL or WATCH+MULTI for atomic check-and-increment.
Q02 of 05 · SENIOR
Your Redis instance is running out of memory. Walk me through how you'd diagnose what's taking up space, which eviction policy you'd choose and why, and how you'd prevent this from happening again.
ANSWER
Run MEMORY STATS to see the breakdown of overhead vs dataset. Use MEMORY USAGE <key> on suspected large keys, or redis-cli --bigkeys. Check INFO keyspace for key count and average TTL. Choose the eviction policy based on the data: allkeys-lru for a pure cache, volatile-lru for mixed workloads with critical non-TTL keys, noeviction if you'd rather reject writes than lose data (bad for caches). Prevention: set TTLs on every cache write, monitor the evicted_keys metric, and set a maxmemory limit with headroom for spikes.
Q03 of 05 · SENIOR
A developer on your team suggests using Redis Pub/Sub as a reliable message queue for order processing. What would you tell them? What are the limitations of Pub/Sub that make it unsuitable, and what Redis feature would you recommend instead?
ANSWER
Pub/Sub is fire-and-forget: messages are delivered only to connected subscribers. If a subscriber disconnects, it misses messages. There's no persistence, no acknowledgement, no replay. For order processing, use Redis Streams: messages persist until explicitly acknowledged, support consumer groups, allow replay, and handle backpressure. Pub/Sub is appropriate for real-time notifications where delivery guarantee is not critical (e.g., live dashboards).
Q04 of 05 · SENIOR
How does Redis handle master-slave replication? Is it synchronous or asynchronous, and what happens to pending writes during a network partition?
ANSWER
Replication is asynchronous by default. The primary writes to its replication backlog and sends the stream to replicas. Replicas acknowledge the offset asynchronously. During a network partition, the primary continues accepting writes locally (they are buffered in the replication backlog). When reconnected, the replica requests the missing portion from the backlog. If the backlog is too small (default 1MB) and the partition lasts too long, the replica may need a full resync (RDB dump). For stronger guarantees, you can use WAIT with a timeout to block until N replicas acknowledge, but this adds latency.
Q05 of 05 · JUNIOR
Describe the Big O complexity of the most common Redis operations (GET, SET, LPOP, ZADD). Why is ZADD O(log N) while SET is O(1)?
ANSWER
GET: O(1), a direct hash lookup. SET: O(1), an insert into the dictionary. LPOP: O(1), removal from the head of the list. ZADD: O(log N) per element, because Sorted Sets are implemented as a skip list paired with a hash table (small sets use a compact encoding instead). Inserting an element requires finding the correct position in the sorted order, which is O(log N) where N is the number of elements in the set. SET is just a hash table entry, so no ordering is needed. Interviewers often ask this to check if you understand the cost of ordering.
FAQ · 3 QUESTIONS
Frequently Asked Questions
01
Is Redis single-threaded and does that make it slow?
Redis uses a single thread for command execution, which means no lock contention and predictable latency. It's not slow: a single instance can typically serve hundreds of thousands of operations per second (more with pipelining) because memory access is orders of magnitude faster than disk I/O. Since Redis 6.0, network I/O can optionally be handled by multiple threads, but command processing remains single-threaded by design.
02
What is the difference between Redis cache eviction policies?
The key ones: allkeys-lru evicts the least-recently-used key across all keys (best for a pure cache), volatile-lru evicts only keys with a TTL set (safe for mixed workloads where some keys must never be evicted), and noeviction rejects new writes when full (useful when you'd rather fail loudly than silently lose data). Always pair maxmemory-policy with a maxmemory limit, otherwise no eviction ever triggers.
03
When should I use Redis Pub/Sub versus Redis Streams?
Use Pub/Sub when you need fire-and-forget real-time messaging where losing messages is acceptable — live notifications, chat, presence indicators. Use Redis Streams when delivery guarantees matter: Streams persist messages, support consumer groups with acknowledgement, and let offline consumers catch up. For anything business-critical like order processing or payment events, Streams is the right choice.
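The delivery difference can be shown with a toy model. The sketch below is a loose approximation with hypothetical class names: real Streams use entry IDs, consumer groups, and acknowledgements rather than a plain integer offset, and real Pub/Sub works over network connections.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy contrast between fire-and-forget delivery and a persisted, replayable log. */
final class MessagingSketch {

    /** Pub/Sub-like: a message reaches only the subscribers connected at publish time. */
    static final class PubSub {
        private final List<List<String>> connected = new ArrayList<>();

        List<String> subscribe() {
            List<String> inbox = new ArrayList<>();
            connected.add(inbox);
            return inbox;
        }

        void disconnect(List<String> inbox) {
            connected.removeIf(candidate -> candidate == inbox); // compare by identity, not content
        }

        /** Returns how many subscribers received the message. */
        int publish(String message) {
            for (List<String> inbox : connected) {
                inbox.add(message);
            }
            return connected.size();
        }
    }

    /** Stream-like: messages stay in the log, so a consumer can catch up from its last offset. */
    static final class StreamLog {
        private final List<String> log = new ArrayList<>();

        void add(String message) { log.add(message); }

        List<String> readFrom(int offset) {
            int start = Math.min(offset, log.size());
            return new ArrayList<>(log.subList(start, log.size()));
        }
    }
}
```

A subscriber that disconnects from `PubSub` before a publish never sees that message, while a `StreamLog` consumer that was offline can call `readFrom` with its last offset and receive everything it missed.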