Redis Basics Explained — Data Structures, Expiry and Real-World Patterns
Every high-traffic application hits the same wall eventually: the database becomes the bottleneck. A query that takes 40ms feels invisible during development but becomes catastrophic when 10,000 users hit it simultaneously. Twitter, GitHub, Stack Overflow, and Shopify all reached this wall, and Redis is a big part of how they broke through it. Understanding Redis well, not just its command list but the trade-offs behind it, is a large part of what lets an engineer design systems that actually scale.
Redis (Remote Dictionary Server) solves the read-amplification problem. Most web applications read data far more than they write it — a product page might be read 50,000 times a day but updated once. Hammering your relational database with 50,000 identical queries is wasteful and slow. Redis lets you compute the answer once, store it in memory, and serve all 50,000 requests from there in microseconds. But Redis isn't just a cache — it's a full data structure server that can power rate limiters, leaderboards, pub/sub messaging, session stores, and queues.
By the end of this article you'll understand not just what Redis commands look like, but WHY each data structure exists, WHEN to reach for each one, and how to wire Redis into a real application pattern. You'll also learn the subtle mistakes — wrong expiry strategies, cache stampedes, missing persistence configs — that trip up developers who learned Redis from a cheat sheet instead of from first principles.
Why Redis Lives in RAM and Why That Changes Everything
Traditional databases store data on disk. Disk access — even an NVMe SSD — operates in the microseconds-to-milliseconds range. RAM access operates in nanoseconds. That's not a small difference; it's three orders of magnitude. Redis keeps its entire dataset in memory by default, which is the single most important architectural decision behind its speed.
But speed isn't the only trick. Redis is single-threaded for command execution. That sounds like a limitation until you understand what it eliminates: lock contention. In a multi-threaded database, threads fight over the same rows with locks. Redis sidesteps that fight entirely — one command runs to completion before the next starts. This makes Redis operations atomic by default, which matters enormously for things like incrementing a counter or checking-then-setting a value. (Since Redis 6, network I/O can optionally run on multiple threads, but command execution itself remains serialized.)
Redis also supports optional persistence. You can tell it to snapshot its RAM contents to disk every N seconds (RDB snapshotting) or to log every write command to an append-only file (AOF). Most production setups use both. This means Redis isn't just a volatile cache — it can survive a restart and recover its data.
The practical takeaway: use Redis for data that is read far more than it's written, where milliseconds matter, and where you can tolerate the data being slightly stale or reproducible if lost.
```
# --- Step 1: Start the Redis server (run in one terminal) ---
# This launches Redis with the default config on port 6379
redis-server

# --- Step 2: Connect with the Redis CLI (run in another terminal) ---
redis-cli

# --- Step 3: Ping the server to confirm it's alive ---
127.0.0.1:6379> PING
# Redis responds with PONG — the simplest health check you'll ever do

# --- Step 4: Store a string value (the most basic operation) ---
# SET key value
# We're caching a user's display name keyed by their user ID
127.0.0.1:6379> SET user:1001:display_name "Alice Nguyen"

# --- Step 5: Retrieve it ---
127.0.0.1:6379> GET user:1001:display_name

# --- Step 6: Store a value with an expiry (TTL = Time To Live) ---
# EX sets expiry in seconds — this key auto-deletes after 300 seconds (5 minutes)
# This is the pattern for caching: store it, let it expire, recompute if missing
127.0.0.1:6379> SET product:42:price "29.99" EX 300

# --- Step 7: Check how many seconds remain before expiry ---
127.0.0.1:6379> TTL product:42:price

# --- Step 8: Check if a key exists without fetching its value ---
127.0.0.1:6379> EXISTS user:1001:display_name
```
```
# * Ready to accept connections on port 6379

# redis-cli responses:
PONG
OK
"Alice Nguyen"
OK
(integer) 298   # seconds remaining — decreasing in real time
(integer) 1     # 1 = key exists, 0 = it doesn't
```
Redis Data Structures — Picking the Right Tool for Each Problem
Redis isn't just a key-value store in the boring sense. It stores five core data types, and choosing the right one is the difference between an elegant solution and a painful hack.
Strings — the default. Good for counters, cached HTML, serialized JSON blobs, and session tokens. The INCR command atomically increments a string-as-integer, making it perfect for rate limiting and hit counters.
Hashes — think of a Hash as a mini dictionary attached to one key. Instead of storing a user as one giant JSON blob, you store their fields separately. This lets you update a single field without fetching and re-serializing the entire object.
Lists — ordered, duplicates allowed. Implemented internally as a quicklist (a linked list of compact packed nodes), so pushes and pops at either end are O(1). Ideal for queues (push to one end, pop from the other) and activity feeds (push new events to the head, trim the list to keep only the last N).
Sets — unordered, unique members. Perfect for tracking unique visitors, tagging systems, or finding common followers between two users with SINTER.
Sorted Sets — the crown jewel. Every member has a floating-point score. Redis keeps members ordered by score automatically. This is how you build leaderboards, priority queues, and range-based queries without a single SQL ORDER BY.
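The INCR-plus-EXPIRE idea from the Strings entry above can be wired into application code. Below is a minimal fixed-window rate limiter sketch in Python; `FakeRedis` is a hypothetical in-memory stand-in used so the example runs without a server (with redis-py you would call the real client's `incr` and `expire`, which have the same semantics):

```python
import time

class FakeRedis:
    """In-memory stand-in for a real Redis client (an assumption for this
    sketch; note that INCR is only truly atomic on a real server)."""
    def __init__(self):
        self.store = {}

    def incr(self, key):
        self.store[key] = self.store.get(key, 0) + 1
        return self.store[key]

    def expire(self, key, seconds):
        pass  # no-op here; real Redis would auto-delete the key after `seconds`

def allow_request(client, user_id, limit=5, now=None):
    """Fixed-window rate limiter: one counter per user per minute."""
    if now is None:
        now = time.time()
    window = int(now // 60)  # which minute bucket are we in?
    key = f"api_calls:user:{user_id}:minute:{window}"
    count = client.incr(key)  # atomic on a real Redis server
    if count == 1:
        client.expire(key, 60)  # the whole window self-destructs after 60s
    return count <= limit

client = FakeRedis()
# Fixed `now` makes the demo deterministic; omit it in real code
results = [allow_request(client, 2055, now=0) for _ in range(7)]
print(results)  # first 5 allowed, the rest denied
```

Because the counter key embeds the window number, a new minute simply starts a fresh counter, and the old one expires on its own.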
```
# =========================================================
# STRINGS — atomic counter for API rate limiting
# =========================================================
# Track how many API calls user 2055 has made this minute
# INCR is atomic — safe even with concurrent requests
127.0.0.1:6379> INCR api_calls:user:2055:minute:2024061514
127.0.0.1:6379> INCR api_calls:user:2055:minute:2024061514
127.0.0.1:6379> INCR api_calls:user:2055:minute:2024061514

# Set it to expire at the end of the minute (60 seconds)
127.0.0.1:6379> EXPIRE api_calls:user:2055:minute:2024061514 60

# =========================================================
# HASHES — store a user profile without one giant JSON blob
# =========================================================
# HSET sets one or more fields on a hash key
# Updating just the email later only rewrites that one field
127.0.0.1:6379> HSET user:2055 username "bob_the_dev" email "bob@example.com" plan "pro" login_count 0

# Retrieve a single field — no need to deserialize a full JSON object
127.0.0.1:6379> HGET user:2055 email

# Retrieve all fields at once
127.0.0.1:6379> HGETALL user:2055

# Atomically increment just the login counter
127.0.0.1:6379> HINCRBY user:2055 login_count 1

# =========================================================
# SORTED SETS — real-time game leaderboard
# =========================================================
# ZADD leaderboard_key score member
# Score is the player's points — Redis sorts automatically
127.0.0.1:6379> ZADD game:leaderboard 4200 "player:alice"
127.0.0.1:6379> ZADD game:leaderboard 8750 "player:bob"
127.0.0.1:6379> ZADD game:leaderboard 6100 "player:carol"

# Fetch top 3 players, highest score first (WITHSCORES shows the score)
# ZREVRANGE = reverse order = highest to lowest
127.0.0.1:6379> ZREVRANGE game:leaderboard 0 2 WITHSCORES

# Get a specific player's rank (0-indexed, 0 = top)
127.0.0.1:6379> ZREVRANK game:leaderboard "player:alice"

# =========================================================
# LISTS — lightweight task queue
# =========================================================
# RPUSH adds to the RIGHT (tail) of the list
# Workers BLPOP from the LEFT (head) — FIFO queue
127.0.0.1:6379> RPUSH email_queue "{\"to\":\"alice@example.com\",\"subject\":\"Welcome\"}"
127.0.0.1:6379> RPUSH email_queue "{\"to\":\"bob@example.com\",\"subject\":\"Reset\"}"

# BLPOP = blocking pop — worker waits up to 5 seconds for a job
# This is more efficient than polling in a loop
127.0.0.1:6379> BLPOP email_queue 5
```
```
# INCR responses:
(integer) 1
(integer) 2
(integer) 3
# EXPIRE confirmation:
(integer) 1
# HSET response (number of new fields created):
(integer) 4
# HGET response:
"bob@example.com"
# HGETALL response:
1) "username"
2) "bob_the_dev"
3) "email"
4) "bob@example.com"
5) "plan"
6) "pro"
7) "login_count"
8) "0"
# HINCRBY response:
(integer) 1
# ZADD responses (number of new members added):
(integer) 1
(integer) 1
(integer) 1
# ZREVRANGE top 3:
1) "player:bob"
2) "8750"
3) "player:carol"
4) "6100"
5) "player:alice"
6) "4200"
# ZREVRANK alice (0-indexed from top):
(integer) 2   # alice is 3rd place
# RPUSH responses (new list length):
(integer) 1
(integer) 2
# BLPOP response (job dequeued):
1) "email_queue"
2) "{\"to\":\"alice@example.com\",\"subject\":\"Welcome\"}"
```
The Cache-Aside Pattern — Wiring Redis Into a Real Application
Knowing Redis commands is one thing. Knowing how to integrate Redis into your application code without creating subtle bugs is another. The most widely used pattern is Cache-Aside (also called Lazy Loading). The logic is elegantly simple: when your app needs data, check Redis first. If it's there (a cache hit), return it immediately. If it's not (a cache miss), fetch it from the database, store it in Redis with a TTL, then return it. Redis never gets data pushed to it — your application pulls it through.
This pattern is powerful because it's self-healing. If Redis goes down and loses all its data, your app degrades gracefully — everything just goes to the database until Redis is warm again. The cache populates itself organically based on what users actually request, not what you predict they'll request.
The critical detail most tutorials skip: always set a TTL. Without one, your cache grows forever and you'll eventually run out of RAM. More importantly, stale data lives forever. If a product's price changes in your database but the Redis entry never expires, customers see wrong prices indefinitely. Your TTL is your freshness guarantee.
The code below shows this pattern implemented in Python with redis-py, the most widely used Redis client library for Python.
```python
import redis
import json
import time

# --- Connect to Redis ---
# decode_responses=True means Redis returns strings instead of bytes
redis_client = redis.Redis(
    host="localhost",
    port=6379,
    db=0,
    decode_responses=True
)

# --- Simulated database fetch (replace with your real DB query) ---
def fetch_product_from_database(product_id: int) -> dict:
    """
    Simulates a slow database query. In production this would be:
    cursor.execute('SELECT * FROM products WHERE id = %s', [product_id])
    """
    print(f"[DB] Querying database for product {product_id}...")
    time.sleep(0.05)  # simulate 50ms DB query latency
    return {
        "id": product_id,
        "name": "Mechanical Keyboard TKL",
        "price": 129.99,
        "stock": 42
    }

def get_product(product_id: int) -> dict:
    """
    Cache-Aside Pattern implementation.
    Always check Redis first. Fall back to DB on miss. Always set a TTL.
    """
    cache_key = f"product:{product_id}"  # namespaced key, colon convention
    cache_ttl_seconds = 300              # cache is valid for 5 minutes

    # --- Step 1: Try the cache first ---
    cached_value = redis_client.get(cache_key)
    if cached_value is not None:
        # Cache HIT — data found in Redis, no DB query needed
        print(f"[CACHE] Hit for key '{cache_key}'")
        return json.loads(cached_value)  # deserialize JSON string back to dict

    # --- Step 2: Cache MISS — go to the database ---
    print(f"[CACHE] Miss for key '{cache_key}'")
    product_data = fetch_product_from_database(product_id)

    # --- Step 3: Populate the cache for next time ---
    # json.dumps serializes the dict to a JSON string for storage
    # ex=cache_ttl_seconds ensures the key auto-expires — NEVER skip this
    redis_client.set(
        cache_key,
        json.dumps(product_data),
        ex=cache_ttl_seconds
    )
    print(f"[CACHE] Stored '{cache_key}' with TTL={cache_ttl_seconds}s")

    return product_data

def invalidate_product_cache(product_id: int):
    """
    Call this whenever a product is updated in the database.
    Removing the key forces the next request to re-fetch fresh data.
    """
    cache_key = f"product:{product_id}"
    deleted_count = redis_client.delete(cache_key)
    if deleted_count > 0:
        print(f"[CACHE] Invalidated key '{cache_key}'")
    else:
        print(f"[CACHE] Key '{cache_key}' wasn't in cache — nothing to invalidate")

# --- Demo ---
if __name__ == "__main__":
    print("=== First request — cold cache ===")
    product = get_product(product_id=7)
    print(f"Result: {product}\n")

    print("=== Second request — warm cache ===")
    product = get_product(product_id=7)
    print(f"Result: {product}\n")

    print("=== Simulating a product update ===")
    invalidate_product_cache(product_id=7)

    print("\n=== Third request — cache was invalidated ===")
    product = get_product(product_id=7)
    print(f"Result: {product}")
```
```
=== First request — cold cache ===
[CACHE] Miss for key 'product:7'
[DB] Querying database for product 7...
[CACHE] Stored 'product:7' with TTL=300s
Result: {'id': 7, 'name': 'Mechanical Keyboard TKL', 'price': 129.99, 'stock': 42}

=== Second request — warm cache ===
[CACHE] Hit for key 'product:7'
Result: {'id': 7, 'name': 'Mechanical Keyboard TKL', 'price': 129.99, 'stock': 42}

=== Simulating a product update ===
[CACHE] Invalidated key 'product:7'

=== Third request — cache was invalidated ===
[CACHE] Miss for key 'product:7'
[DB] Querying database for product 7...
[CACHE] Stored 'product:7' with TTL=300s
Result: {'id': 7, 'name': 'Mechanical Keyboard TKL', 'price': 129.99, 'stock': 42}
```
Redis Expiry, Eviction and Why Your Cache Will Betray You Without Them
TTLs are your first line of defense against stale data. But what happens when Redis runs out of memory before any keys expire? This is where eviction policies come in, and most developers don't think about them until Redis starts refusing writes in production — which is a very bad day.
Redis has several eviction policies configured via maxmemory-policy in your redis.conf. The default policy is noeviction — Redis refuses new writes when full. That sounds safe but it means your application starts throwing errors. For a cache, you almost always want allkeys-lru (evict the least recently used key across all keys) or volatile-lru (evict the least recently used key that has a TTL set).
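To build intuition for what allkeys-lru means, here is a toy model in Python. It is an illustration only: real Redis approximates LRU by sampling a handful of random keys (tunable via maxmemory-samples) rather than keeping an exact recency ordering, and it evicts on memory pressure rather than key count.

```python
from collections import OrderedDict

class LRUCacheModel:
    """Toy model of allkeys-lru eviction (illustrative, not how Redis
    is implemented internally — Redis uses sampled, approximate LRU)."""
    def __init__(self, max_keys: int):
        self.max_keys = max_keys
        self.data = OrderedDict()  # insertion order = recency order

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)  # reading a key marks it recently used
        return self.data[key]

    def set(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.max_keys:
            evicted, _ = self.data.popitem(last=False)  # drop least recently used
            print(f"evicted: {evicted}")

cache = LRUCacheModel(max_keys=3)
for k in ("a", "b", "c"):
    cache.set(k, k.upper())
cache.get("a")             # touch "a" so it is now the most recently used
cache.set("d", "D")        # cache is full, so the LRU key "b" is evicted
print(sorted(cache.data))  # ['a', 'c', 'd']
```

The key property to internalize: reads count as "use", so hot keys survive eviction while cold ones quietly disappear.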
There's also the cache stampede problem — also called the thundering herd. Imagine 500 concurrent users all request the same popular product page. The cache entry expires at the exact same moment. All 500 requests find a cache miss simultaneously and all fire a database query at once. Your database gets hammered with 500 identical queries in the same millisecond. The fix is probabilistic early expiration or using a mutex lock in your cache-miss path so only one request rebuilds the cache while others wait.
The rule of thumb: if your cache powers any page that gets high traffic, you need to think about stampedes. If your cache serves data with truly random access patterns, you probably don't.
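Here is a sketch of that mutex in application code. `FakeRedis` is a hypothetical in-memory stand-in so the example is self-contained; with redis-py, the lock acquisition would be `client.set(lock_key, token, nx=True, ex=10)`, which succeeds for exactly one caller.

```python
import json
import time

class FakeRedis:
    """Minimal in-memory stand-in (assumption: real code uses redis-py,
    whose set(..., nx=True, ex=...) has the same winner-takes-all behavior)."""
    def __init__(self):
        self.store = {}

    def get(self, key):
        return self.store.get(key)

    def set(self, key, value, nx=False, ex=None):
        if nx and key in self.store:
            return None  # lock already held — this caller lost the race
        self.store[key] = value
        return True

    def delete(self, key):
        self.store.pop(key, None)

db_queries = 0
def slow_db_fetch(product_id):
    global db_queries
    db_queries += 1
    return {"id": product_id, "price": 29.99}

def get_product_stampede_safe(client, product_id, max_wait=1.0):
    """Cache-aside with a rebuild mutex: only one caller hits the DB on a miss."""
    cache_key = f"product:{product_id}"
    lock_key = f"lock:{cache_key}:rebuild"
    deadline = time.monotonic() + max_wait
    while time.monotonic() < deadline:
        cached = client.get(cache_key)
        if cached is not None:
            return json.loads(cached)                # hit: no lock needed
        if client.set(lock_key, "1", nx=True, ex=10):
            try:                                     # we won the lock: rebuild
                data = slow_db_fetch(product_id)
                client.set(cache_key, json.dumps(data), ex=300)
                return data
            finally:
                client.delete(lock_key)
        time.sleep(0.01)                             # lost the race: wait, re-check
    return slow_db_fetch(product_id)                 # fallback: go straight to DB

client = FakeRedis()
results = [get_product_stampede_safe(client, 7) for _ in range(5)]
print(db_queries)  # 1 — only the first request touched the database
```

The timeout fallback matters: if the lock holder crashes, waiters must not spin forever, which is also why the lock itself carries a TTL.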
```
# =========================================================
# CONFIGURING EVICTION POLICY (redis.conf or at runtime)
# =========================================================
# Set max memory to 256MB — Redis will start evicting when this is reached
127.0.0.1:6379> CONFIG SET maxmemory 268435456

# allkeys-lru = evict least-recently-used keys from the ENTIRE keyspace
# Best default for a pure cache where all keys have roughly equal value
127.0.0.1:6379> CONFIG SET maxmemory-policy allkeys-lru

# Confirm the config was applied
127.0.0.1:6379> CONFIG GET maxmemory-policy

# =========================================================
# TTL COMMANDS — managing key lifetime
# =========================================================
# SET with expiry in seconds
127.0.0.1:6379> SET session:user:8821 "eyJhbGciOiJIUzI1NiJ9" EX 3600

# SET with expiry in milliseconds (for sub-second precision)
127.0.0.1:6379> SET rate_check:ip:192.168.1.1 "1" PX 60000

# Check TTL in seconds (-1 = no expiry, -2 = key doesn't exist)
127.0.0.1:6379> TTL session:user:8821

# Check TTL in milliseconds (more precise)
127.0.0.1:6379> PTTL rate_check:ip:192.168.1.1

# PERSIST removes the TTL — key lives forever (use with caution)
127.0.0.1:6379> PERSIST session:user:8821
127.0.0.1:6379> TTL session:user:8821

# =========================================================
# MUTEX PATTERN — prevent cache stampedes
# NX = only set if key does NOT exist (atomic check-and-set)
# This is a distributed lock: only the first caller wins
# =========================================================
# First request tries to acquire the rebuild lock (TTL=10s to auto-release)
127.0.0.1:6379> SET lock:product:7:rebuild "1" NX EX 10

# Second concurrent request tries the same — gets nil (lock is taken)
127.0.0.1:6379> SET lock:product:7:rebuild "1" NX EX 10

# After the first request rebuilds the cache, it releases the lock
# (in production, release via a Lua script that checks a unique lock value,
#  so you never delete a lock another client has since acquired)
127.0.0.1:6379> DEL lock:product:7:rebuild

# =========================================================
# CHECK MEMORY USAGE
# =========================================================
# See overall memory stats
127.0.0.1:6379> INFO memory

# See exactly how much RAM one key is using (in bytes)
127.0.0.1:6379> MEMORY USAGE session:user:8821
```
```
# CONFIG SET responses:
OK
OK
# CONFIG GET maxmemory-policy:
1) "maxmemory-policy"
2) "allkeys-lru"
# SET with EX/PX:
OK
OK
# TTL session:user:8821:
(integer) 3598   # approximately 3600, decreasing
# PTTL rate_check (milliseconds):
(integer) 59847
# PERSIST (1 = a TTL was removed):
(integer) 1
# TTL after PERSIST:
(integer) -1     # -1 means no expiry — key lives forever now
# First SET NX (lock acquired):
OK
# Second SET NX (lock already held):
(nil)            # nil = the SET was rejected — lock is taken
# DEL lock:
(integer) 1      # 1 = key was deleted
# MEMORY USAGE:
(integer) 88     # this specific key uses 88 bytes of RAM
```
| Data Structure | Best Use Case | Key Commands | Stores Duplicates? | Ordered? |
|---|---|---|---|---|
| String | Cached values, counters, session tokens | GET, SET, INCR, EXPIRE | N/A (single value) | N/A |
| Hash | Object/entity fields (user profiles, product data) | HGET, HSET, HGETALL, HINCRBY | N/A (field map) | No |
| List | Queues, activity feeds, job pipelines | RPUSH, LPOP, BLPOP, LRANGE | Yes | Insertion order |
| Set | Unique visitors, tags, friend graphs | SADD, SMEMBERS, SINTER, SUNION | No | No |
| Sorted Set | Leaderboards, priority queues, range queries | ZADD, ZREVRANGE, ZRANK, ZSCORE | No (by member) | By score (float) |
🎯 Key Takeaways
- Redis is fast because it lives in RAM and uses a single-threaded event loop — which makes every command atomic without needing locks.
- Sorted Sets are Redis's most underrated data structure — they let you build real-time leaderboards and priority queues in a single command with no application-side sorting.
- Always set a TTL and always configure maxmemory-policy before production — the default noeviction policy will cause your app to throw errors when Redis fills up.
- Cache-Aside (lazy loading) is the most production-proven caching pattern: check cache first, miss falls back to DB, always store with expiry, invalidate explicitly on writes.
⚠ Common Mistakes to Avoid
- ✕ Mistake 1: Not setting a TTL on cached keys — The symptom is Redis memory growing unboundedly until the server OOMs or Redis starts rejecting writes. The fix: make it a rule in code review that every SET must include EX or PX. Better yet, write a wrapper function that makes TTL mandatory and throws an exception if the caller omits it.
- ✕ Mistake 2: Using KEYS in production to find matching keys — KEYS is O(n) and blocks Redis's single thread while it scans the entire keyspace. On a server with 10 million keys, this freezes Redis for seconds, dropping all other requests. The fix: use SCAN instead — it iterates in small batches without blocking. Example: SCAN 0 MATCH product:* COUNT 100 returns a batch of matching keys (COUNT is a hint for how much work to do per call, not a hard limit) and a cursor to continue from.
- ✕ Mistake 3: Storing large serialized objects as Strings and updating them non-atomically — If two concurrent requests both GET a user JSON blob, update different fields in their own memory, then SET the blob back, one request's write silently overwrites the other's. This is a classic race condition. The fix: use a Hash and HSET to update individual fields atomically, or use a Lua script via EVAL to wrap multi-step read-modify-write operations in a single atomic operation.
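The mandatory-TTL wrapper suggested in Mistake 1 can be as small as the sketch below (`cache_set` is an illustrative name, and `FakeRedis` is a stand-in for a real redis-py client so the example is self-contained):

```python
import json

def cache_set(client, key: str, value, ttl_seconds: int):
    """Wrapper that makes TTL mandatory: forgetting the argument is a
    TypeError at the call site, and a bogus value fails loudly here."""
    if not isinstance(ttl_seconds, int) or ttl_seconds <= 0:
        raise ValueError(f"cache_set requires a positive TTL, got {ttl_seconds!r}")
    return client.set(key, json.dumps(value), ex=ttl_seconds)

class FakeRedis:
    """In-memory stand-in for redis-py's client (an assumption for this sketch)."""
    def __init__(self):
        self.store = {}

    def set(self, key, value, ex=None):
        self.store[key] = (value, ex)  # remember the TTL for inspection
        return True

client = FakeRedis()
cache_set(client, "product:7", {"price": 29.99}, ttl_seconds=300)   # accepted
try:
    cache_set(client, "product:8", {"price": 9.99}, ttl_seconds=0)  # rejected
except ValueError as e:
    print(e)
```

If every code path writes through a function like this, "forgot the TTL" stops being a class of production incident.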
Interview Questions on This Topic
- Q: Redis is single-threaded — how can it handle thousands of concurrent connections without being a bottleneck?
- Q: Explain the difference between Redis RDB snapshotting and AOF persistence. Which would you choose for a session store, and why?
- Q: What is a cache stampede and how would you prevent it in a high-traffic system that uses Redis as a cache?
Frequently Asked Questions
Is Redis a database or a cache?
Redis is both, and that's not a cop-out. It's an in-memory data structure server that can act as a cache (with TTLs and eviction), a primary database (with RDB or AOF persistence), a message broker (with pub/sub or Streams), or a queue (with Lists and BLPOP). Most teams use it alongside a relational database — Redis handles high-frequency reads while the relational DB handles durable writes.
What happens to Redis data when the server restarts?
By default, if you're using Redis purely as an in-memory cache with no persistence configured, all data is lost on restart — which is usually fine for a cache. For durability, enable RDB snapshotting (periodic disk snapshots) or AOF (append-only log of every write). With AOF and appendfsync always, you get near-zero data loss but a small write performance penalty.
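As a rough illustration, a redis.conf fragment enabling both persistence modes might look like this (the save thresholds shown are the commonly cited classic defaults; tune them to your workload):

```
# RDB: snapshot if >=1 key changed in 900s, >=10 in 300s, or >=10000 in 60s
save 900 1
save 300 10
save 60 10000

# AOF: log every write command to appendonly.aof
appendonly yes
appendfsync everysec   # fsync once per second: good durability/throughput balance
```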
When should I use a Hash instead of storing JSON as a String?
Use a Hash whenever you need to update individual fields of an object independently and frequently. If you store a user as a JSON String and need to increment their login_count, you must fetch the entire blob, deserialize it, increment the field, and store it back — all in two round trips with a race condition window. With a Hash, HINCRBY user:1001 login_count 1 does this atomically in one command with no deserialization needed.
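The race window described above can be made concrete with a deterministic interleaving. This self-contained Python sketch uses plain dicts to model the two storage styles; the `hincrby` and `hset` helpers mimic the field-level semantics of the real Redis commands:

```python
import json

# --- String-as-JSON path: two requests interleave a read-modify-write ---
store = {"user:1001": json.dumps({"email": "old@example.com", "login_count": 0})}

blob_a = json.loads(store["user:1001"])   # request A reads the blob
blob_b = json.loads(store["user:1001"])   # request B reads the SAME blob

blob_a["login_count"] += 1                # A increments the counter...
store["user:1001"] = json.dumps(blob_a)   # ...and writes the whole blob back

blob_b["email"] = "new@example.com"       # B edits its stale copy...
store["user:1001"] = json.dumps(blob_b)   # ...and silently erases A's increment

print(json.loads(store["user:1001"]))     # login_count is back to 0 — lost update

# --- Hash path: field-level writes don't clobber each other ---
hash_store = {"user:1001": {"email": "old@example.com", "login_count": 0}}

def hincrby(key, field, amount):   # models HINCRBY (atomic as one Redis command)
    hash_store[key][field] += amount

def hset(key, field, value):       # models HSET on a single field
    hash_store[key][field] = value

hincrby("user:1001", "login_count", 1)         # request A
hset("user:1001", "email", "new@example.com")  # request B
print(hash_store["user:1001"])                 # both writes survive
```

The String path isn't wrong per se, it just forces you to serialize whole-object writes; the Hash path makes the common "touch one field" case safe by construction.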