Junior 5 min · March 06, 2026

Leaderboard Design — Race Conditions & Sharding Trade-offs

Concurrent ZADD calls caused vanishing scores during tournaments.

N
Naren · Founder
Plain-English first. Then code. Then the interview question.
About
 ● Production Incident 🔎 Debug Guide
Quick Answer
  • A production leaderboard needs sorted sets, sharding, and atomic score updates
  • Redis Sorted Sets (ZADD, ZRANGE) are the canonical data structure — O(log n) per update
  • Global leaderboards require consistent hashing or range-based sharding
  • Real-time updates use in-memory stores; periodic snapshots persist to DB
  • Performance: single Redis node handles ~100k updates/s; shard across 10 nodes for 1M+ QPS
  • Production risk: hot keys (e.g., rank 1) cause contention; use local caching or shard by score band
Plain-English First

Imagine a school spelling bee where the teacher writes every student's score on a giant whiteboard, always keeping the list sorted so the best spellers are at the top. Now imagine 50 million students doing that simultaneously — you can't have one teacher with one whiteboard anymore. A leaderboard system is the engineering answer to that problem: how do you always know who's winning, instantly, at any scale, without the whiteboard catching fire.

Leaderboards are deceptively simple on the surface — just a sorted list of scores. But they're one of the most revealing system design questions an interviewer can ask, because a production-grade leaderboard for a game like Fortnite or a platform like Duolingo touches almost every hard problem in distributed systems at once: low-latency reads, high-throughput writes, consistency vs. availability trade-offs, and hot-key contention. The question isn't exotic — it's a lens that exposes how deeply you actually think.

The core problem is that sorting is expensive. Maintaining a globally consistent ranked list across millions of simultaneous score updates, while serving millions of 'what rank am I?' queries in under 10 milliseconds, is not something a simple SQL ORDER BY can handle. Naive approaches collapse under load in ways that are silent and insidious — rankings drift, scores get lost, and the leaderboard becomes a lie nobody notices until a player's 10,000-point climb disappears.

By the end of this article, you'll be able to walk into a system design interview and whiteboard a production-quality leaderboard: choosing the right data structure, explaining exactly why Redis sorted sets exist for this problem, handling the dense-rank vs. competition-rank distinction, sharding for global scale, dealing with score update atomicity, and knowing when to reach for approximate solutions. You'll also know the failure modes that catch senior engineers off guard.

What is Design a Leaderboard System?

Designing a leaderboard system means building a service that ingests score updates, maintains a globally sorted ranking, and serves queries for top N and individual rank — all under low-latency and high-throughput constraints. It's a common system design interview question because it forces you to choose the right data structure (Redis Sorted Sets), plan for sharding, handle atomicity, and decide on caching and consistency.

The naive approach — a relational database with ORDER BY score LIMIT 100 — collapses at thousands of concurrent writes. Each insert triggers a full table sort, locking rows and pegging CPU. Production leaderboards must avoid that entirely by keeping the ranking in-memory and sorted at all times.

LeaderboardService.javaJAVA
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
package io.thecodeforge.leaderboard;

import redis.clients.jedis.Jedis;
import redis.clients.jedis.Tuple;
import java.util.Set;

public class LeaderboardService {
    private final Jedis jedis;
    private static final String LEADERBOARD_KEY = "leaderboard:global";

    public LeaderboardService(Jedis jedis) {
        this.jedis = jedis;
    }

    public void updateScore(String playerId, double points) {
        jedis.zincrby(LEADERBOARD_KEY, points, playerId);
    }

    public Set<Tuple> getTopN(int n) {
        return jedis.zrevrangeWithScores(LEADERBOARD_KEY, 0, n - 1);
    }

    public Long getRank(String playerId) {
        return jedis.zrevrank(LEADERBOARD_KEY, playerId);
    }
}
Forge Tip:
Type this code yourself rather than copy-pasting. The muscle memory of writing it will help it stick.
Production Insight
Leaderboards are read-heavy for top N and write-heavy for individual updates.
The top 0.1% of players generate 90% of the read traffic.
Hot key (rank 1) can saturate a single Redis node — cache it separately.
Key Takeaway
Leaderboard = sorted in-memory store + sharding + atomic updates.
Never use SQL ORDER BY for real-time ranking.
Cache the rank 1 spot — it's the hottest key.

Core Data Structure: Redis Sorted Sets

A Redis Sorted Set stores unique members each with a floating-point score. Internally, it's implemented as a skip list and a hash table, giving O(log n) for add/update/remove operations and O(log n + m) for range queries (like getting the top 100).

Here's a typical interaction pattern
  • When a game round ends, the game server sends ZINCRBY leaderboard:global <score_increment> <player_id>. This atomically increments the score.
  • To get the top N players: ZREVRANGE leaderboard:global 0 N-1 WITHSCORES. ZREVRANGE returns members sorted descending by score.
  • To get a player's rank: ZREVRANK leaderboard:global <player_id>.

This single data structure handles the core ranking logic. But it doesn't solve everything: you need to handle millions of concurrent writes, persistence, and global aggregation.

redis-leaderboard-commands.txtREDIS
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
# Add player score (increment by 100)
ZINCRBY leaderboard:global 100 player:42

# Get top 10
ZREVRANGE leaderboard:global 0 9 WITHSCORES

# Get player rank (0-based)
ZREVRANK leaderboard:global player:42

# Get player's score
ZSCORE leaderboard:global player:42

# Atomic score update with Lua (to prevent race conditions)
EVAL "
  local current = redis.call('ZSCORE', KEYS[1], ARGV[1])
  local increment = tonumber(ARGV[2])
  if current then
    current = tonumber(current) + increment
    redis.call('ZADD', KEYS[1], current, ARGV[1])
  else
    redis.call('ZADD', KEYS[1], increment, ARGV[1])
  end
  return 1
" 1 leaderboard:global player:42 100
Sorted Set as a Sorted Bucket of Marbles
  • The weight is the score; the marble's label is the player ID.
  • When you change a marble's weight, it automatically repositions.
  • You can ask the bucket for the heaviest 100 marbles at any time.
  • The bucket uses a skip list internally, so reordering after a weight change takes O(log n).
Production Insight
Redis Sorted Sets are memory efficient, but a single instance can only handle ~100k operations per second.
If you have 1M+ players updating scores every minute, you'll saturate the Redis thread.
Shard by player ID hash across multiple Redis nodes to scale.
Avoid ZRANGE on very large sorted sets ( >10M members ) as it blocks Redis.
Key Takeaway
Redis Sorted Set is the default leaderboard engine.
Use ZINCRBY for atomic increments; ZREVRANGE for top N.
Shard when you exceed a single node's throughput.
The mental model: a self-sorting bucket of weighted marbles.

Sharding Strategies for Global Leaderboards

When a single Redis sorted set grows beyond a node's capacity or write throughput, you must shard. There are two common approaches:

  1. Fixed shard count with consistent hashing – Partition the player ID space into N shards (e.g., 16). Each shard holds a subset of players. When reading the global top N, you query all shards for their top N, merge results, and return the final top N. This merge is fast if N is small (e.g., 100). You trade off perfect accuracy at the boundaries (players with same score across shards) but it's acceptable.
  2. Range-based sharding by score – Shard 0 holds scores 0-999, shard 1 holds 1000-1999, etc. This is simpler for reads: you only need to query the top few shards to get the global top N. But scores change over time, so players may move across shards. Need a rebalancing mechanism or use a proxy that routes by score range.

A third, less common approach: split by competition (e.g., each tournament ID is its own sorted set). This works when leaderboards are per-event, not global.

leaderboard_shard_merge.pyPYTHON
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
import redis

# Assume 16 shard Redis clients: shard_clients[0..15]
SHARD_COUNT = 16
TOP_N = 100

def get_global_top_n():
    pipeline_results = []
    for client in shard_clients:
        # Get top N from each shard
        pipe = client.pipeline()
        pipe.zrevrange('leaderboard', 0, TOP_N-1, withscores=True)
        results = pipe.execute()
        pipeline_results.append(results[0])
    
    # Merge: sort all entries by score descending, then take TOP_N
    all_entries = []
    for shard_result in pipeline_results:
        for member, score in shard_result:
            all_entries.append((member, float(score)))
    all_entries.sort(key=lambda x: x[1], reverse=True)
    return all_entries[:TOP_N]
Merge overhead at high N
If you need the top 10,000 players, merging from 16 shards means pulling 160,000 entries and sorting them — that's expensive. Use approximation (e.g., bucketed scores) or pre-compute a global top N offline.
Production Insight
Consistent hashing works well but requires resharding logic when adding/removing nodes.
Always monitor the merge latency; it's a common bottleneck when N grows.
Consider a hybrid: keep a separate Redis key for the global top 100, updated asynchronously via a stream processor.
Key Takeaway
Shard by player ID for writes, merge for reads.
Range sharding by score simplifies reads but complicates score changes.
Monitor merge performance — it's the hidden bottleneck.
Choose your sharding strategy
IfGlobal leaderboard, many players, consistent hashing already in use
UseFixed shard by player ID hash — simpler, good read accuracy with merge
IfScores rarely change bands, read-heavy, need fast global top N
UseRange shard by score band — near-perfect reads, but rebalancing is painful
IfLeaderboards are per-competition or per-event
UsePer-competition shard — each competition gets its own sorted set, no merge needed

Handling Score Update Conflicts and Atomicity

When multiple game servers can update the same player's score concurrently, you risk lost updates. The classic read-modify-write pattern (read current score, compute new score, write back) is not atomic. Two servers may read the same value, add their increments, and last write wins — discarding one server's increment.

Solutions
  • Atomic increment: Redis ZINCRBY atomically adds to the existing score. No read needed.
  • Lua scripting: For more complex updates (e.g., add score, then conditionally update metadata), use a Redis Lua script that runs atomically.
  • Optimistic locking: Use WATCH to monitor a key and MULTI/EXEC to conditionally update; but this adds overhead and may fail under contention.

For systems that need external atomicity (e.g., across multiple databases), implement a distributed lock (via Redlock or etcd) around the player's score update.

atomic_score_update.luaLUA
1
2
3
4
5
6
7
8
9
10
11
12
-- KEYS[1] = leaderboard key
-- ARGV[1] = player ID
-- ARGV[2] = score increment

local current = redis.call('ZSCORE', KEYS[1], ARGV[1])
if current then
    local new_score = tonumber(current) + tonumber(ARGV[2])
    redis.call('ZADD', KEYS[1], new_score, ARGV[1])
else
    redis.call('ZADD', KEYS[1], tonumber(ARGV[2]), ARGV[1])
end
return 1
Always use Lua for compound operations
Even ZINCRBY can be insufficient if you need to also update a 'last updated' timestamp. Wrap everything in a Lua script to avoid race conditions.
Production Insight
In a game with 10M daily active players, concurrent score updates are inevitable.
We saw a case where a leaderboard was off by 5% due to lost updates — after switching to Lua scripts, accuracy recovered within 2 hours.
But watch out: long-running Lua scripts block Redis. Keep your scripts under 10 operations.
Key Takeaway
Always use atomic operations for score updates.
ZINCRBY for simple increments; Lua scripts for composite updates.
Avoid read-modify-write — it's a race condition waiting to happen.

Real-Time Updates and Caching

Users expect to see their new rank immediately after finishing a round. But updating the global leaderboard in real-time every second for millions of players is expensive. You need a tiered approach:

  1. Write path: Score update goes to Redis (in-memory, sub-millisecond). No DB write on every update.
  2. Read path: Top N leaderboard cached in a separate Redis string, refreshed every 5–10 seconds or invalidated on threshold changes (e.g., a new score enters the top 100).
  3. Persistence: Periodically (e.g., every minute) snapshot the sorted set to a relational DB for recovery. Use Redis AOF + RDB for safety.
  4. Scalable reads: If you have 100k+ reads per second on the top 100, add a local cache on the API server (e.g., in-memory cache with TTL of 2 seconds). Stale reads are acceptable for most games.

For real-time updates like 'you just dropped to rank 5,000', push notifications via WebSocket or polling every 10 seconds.

cache_top_leaderboard.pyPYTHON
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
import redis
import time

r = redis.Redis()
TOP_N = 100
CACHE_KEY = 'leaderboard:top100'

def get_top_leaderboard():
    cached = r.get(CACHE_KEY)
    if cached:
        return cached
    # fallback: compute from sorted set
    data = r.zrevrange('leaderboard:global', 0, TOP_N-1, withscores=True)
    # serialize and cache with TTL 5 seconds
    serialized = str(data)
    r.setex(CACHE_KEY, 5, serialized)
    return serialized

def invalidate_on_new_top_score(new_score, current_top_100_min):
    if new_score > current_top_100_min:
        r.delete(CACHE_KEY)
Two-tier caching: hot and warm
  • The top 100 is read 100x more than any individual rank — cache it.
  • Individual player rank queries can hit Redis directly with low latency.
  • Update the cached top 100 only when a new top score enters the threshold.
  • Stale cache (a few seconds) is acceptable; stale scores (minutes) cause complaints.
Production Insight
Caching the top N is trivial but invalidating it is not.
If a flood of high scores hits, you'll invalidate rapidly and thrash the cache.
Solution: use a debounce — only invalidate once per second, even if many updates occur.
If multiple API servers, use a centralized cache (Redis) to avoid each server holding a different stale version.
Key Takeaway
Cache the top N leaderboard, not individual scores.
Use short TTL (a few seconds) and lazy invalidation.
Debounce cache invalidations under high write load.

Consistency vs. Availability Trade-offs

In a distributed leaderboard, you must choose how consistent the rankings are. CAP theorem says you can't have both strong consistency and high availability under partitions.

  • Eventual consistency: Score updates propagate asynchronously to all shards/replicas. Two users in different data centers may see different ranks for the same player for a few seconds. This is acceptable for most game leaderboards — the 'exact rank at that instant' isn't critical.
  • Strong consistency: Use Redis cluster with transactions or a single master Redis. Writes must go to the master, reads from the master (no read replicas). This increases latency and reduces availability if the master fails.
  • Causal consistency: Ensure that a player sees their own update immediately, but others may see it later. Implement by reading the player's own score from a local consistent store.

For competition leaderboards with money prizes, you need stronger guarantees. Use a single leaderboard per competition (no sharding), enforce atomic writes, and provide audit logs.

tradeoff_matrix.txtTEXT
1
2
3
4
5
| Consistency Level   | Read freshness | Write availability | Use case                     |
|---------------------|----------------|--------------------|------------------------------|
| Eventual            | Stale (secs)   | High               | Casual game leaderboards     |
| Causal              | Self-fresh     | High               | Social features, user ranks  |
| Strong (monolithic) | Immediate      | Lower              | Prize tournaments, financials|
Don't over-engineer consistency
99% of leaderboards do not need strong consistency. Players expect eventual consistency — they know their rank may lag by a few seconds. Only invest in strong consistency if you're dealing with real money.
Production Insight
We once saw a tournament leaderboard using eventual consistency where two players tied for first place each saw themselves as winner.
The shards had not reconciled the tie-breaking rule.
Rule: for money events, use a single shard and strong consistency, even if it means sacrificing some throughput.
Key Takeaway
Know your consistency requirements upfront.
Casual leaderboards: eventual consistency is fine.
Financial leaderboards: strong consistency on a single shard.
Always document the staleness behavior for support teams.
Choose consistency level
IfCasual game, no money involved, high availability needed
UseEventual consistency — acceptable staleness of a few seconds
IfPlayers need to see their own rank instantly, others can be stale
UseCausal consistency — use local read-your-writes
IfPrize money, audits, or regulatory requirements
UseStrong consistency on a single shard per competition
● Production incidentPOST-MORTEMseverity: high

The Vanishing High Score

Symptom
Players reporting their scores reset to zero or shifted to incorrect ranks after high-traffic events (e.g., weekend tournaments).
Assumption
The engineering team assumed Redis ZADD was atomic and that a single write would safely update the score.
Root cause
Race condition: two concurrent ZADD calls from different game servers for the same player caused one write to overwrite the other's score. Each server read a stale score, added points locally, and wrote back — the last write won, discarding the other's points.
Fix
Use Redis Lua scripts (EVAL) to atomically read and update a player's score in one operation. Alternatively, use Redis 6+ built-in deferring of writes with MULTI/EXEC or WATCH. Also implement a per-player lock (distributed lock via Redlock) for score updates.
Key lesson
  • Never assume read-then-write is atomic — use Lua scripting or transactions.
  • Always protect critical counters against lost updates in distributed systems.
  • Test with concurrent writers to catch race conditions before they reach production.
Production debug guideCommon symptoms and precise actions to resolve them3 entries
Symptom · 01
Player's score is not incrementing after a game round
Fix
Check Redis logs for WRONGTYPE errors — the key may be a different data type. Run TYPE leaderboard:global to verify. Also check if Lua script timed out (SCRIPT KILL can recover).
Symptom · 02
Top 100 leaderboard shows stale data (e.g., missing latest scores)
Fix
Check cache TTL: if you cache the top 100 list, ensure it's invalidated on write. Use Redis Pub/Sub to push invalidation messages. Also verify that shard rebalancing hasn't moved a hot key to a new node without proper routing.
Symptom · 03
Two players with the same score have inconsistent ranks
Fix
Decide on tie-breaking strategy: default Redis sorted set uses lexicographic ordering on member string for equal scores. Ensure your application uses consistent tie-breaking (e.g., earlier timestamp wins). Adjust scores with fractional timestamp if needed.
★ Leaderboard Quick Debug Cheat SheetImmediate actions when leaderboard behaves unexpectedly
Scores not reflecting in top list
Immediate action
Check if the key exists and is a sorted set: `EXISTS leaderboard:global` then `TYPE leaderboard:global`.
Commands
redis-cli ZRANGE leaderboard:global 0 10 WITHSCORES
redis-cli ZCARD leaderboard:global
Fix now
If key missing, restore from DB snapshot. If wrong type, rename and recreate from DB checkpoint.
Concurrent update loss+
Immediate action
Inspect Redis slow log for long-running scripts. Check if application uses read-modify-write without atomicity.
Commands
redis-cli SLOWLOG GET 10
redis-cli SCRIPT EXISTS <script_sha>
Fix now
Rewrite score update as a Lua script that reads and writes atomically. Example: local current = redis.call('ZSCORE', KEYS[1], ARGV[1]) if current then current = tonumber(current) + tonumber(ARGV[2]) redis.call('ZADD', KEYS[1], current, ARGV[1]) end
High latency on leaderboard reads (top 100)+
Immediate action
Check network round trips: if reading from multiple shards and merging, see if that merge is the bottleneck.
Commands
redis-cli --latency -h <shard_host>
Measure client merge time in application logs with a timing trace.
Fix now
Cache the top 100 in a separate Redis string with TTL. Update that cache only when the top 100 changes (e.g., after a high score update).
Leaderboard Sharding Strategies Compared
StrategyWrite ComplexityRead LatencyGlobal Top N AccuracyResharding Pain
Fixed shard by player ID hashO(1) (write to one shard)O(N*logN) mergeHigh (merge sorts)Medium (consistent hashing helps)
Range shard by score bandO(1) (write to one shard)O(k*logM) (few shards)Very high (no merge needed)High (players move bands)
Single Redis instanceO(1)O(log n) (no merge)PerfectNot applicable (scale up instead)
Per-competition leaderboardO(1) per competitionO(log n) per competitionPerfect per competitionLow (each competition independent)

Key takeaways

1
Redis Sorted Sets are the canonical leaderboard data structure
O(log n) for updates and sorted reads.
2
Shard by player ID for write scalability; merge for global reads; use consistent hashing for node changes.
3
Always use atomic operations (ZINCRBY or Lua scripts) for score updates to prevent lost increments.
4
Cache the top N leaderboard with short TTL and debounced invalidation.
5
Choose consistency level based on business need
eventual for casual, strong for financial tournaments.
6
Tie-breaking rules must be explicit and documented
default Redis uses lexicographic order on member strings.

Common mistakes to avoid

5 patterns
×

Using a relational database with ORDER BY LIMIT

Symptom
500ms+ response times under 10k concurrent updates; database CPU pegs at 100% during tournaments.
Fix
Use Redis Sorted Sets for real-time ranking. Only use DB for persistence snapshots.
×

No atomicity on score updates

Symptom
Players randomly lose or gain points; scores don't match actual game results.
Fix
Replace read-modify-write with ZINCRBY or Lua scripts. Always test with concurrent load.
×

Storing whole leaderboard in a single Redis key

Symptom
Redis latency spikes, memory usage grows unbounded, evacuation takes hours.
Fix
Shard by player ID or competition. Set a max size per shard and evict low-scoring inactive players.
×

Forgetting tie-breaking rules

Symptom
Two players with same score swap ranks randomly; user confusion.
Fix
Decide on tie-breaking: earlier timestamp, alphabetical, or include fractional score component. Document the rule.
×

Caching top N without invalidation

Symptom
Leaderboard shows stale top 100 for minutes; players complain their new high score doesn't appear.
Fix
Set short TTL and invalidate cache when a new score enters the top threshold. Use debounce to avoid thrash.
INTERVIEW PREP · PRACTICE MODE

Interview Questions on This Topic

Q01SENIOR
How would you design a leaderboard for a mobile game with 10 million dai...
Q02SENIOR
How do you handle the scenario where two game servers concurrently incre...
Q03JUNIOR
What is the difference between ZRANK and ZREVRANK? When would you use ea...
Q01 of 03SENIOR

How would you design a leaderboard for a mobile game with 10 million daily active users? Walk me through your data structure choices and trade-offs.

ANSWER
I would start with Redis Sorted Sets for the real-time ranking because ZINCRBY provides atomic increment and ZREVRANGE gives O(log n) top N queries. For 10M users, a single Redis instance won't suffice — I'd shard by player ID hash across 16–32 Redis nodes. Each shard holds a subset of players. To get the global top 100, I query all shards for their top 100, merge and sort in the application layer. Individual rank queries go directly to one shard. For persistence, I'd take periodic snapshots to a relational database (e.g., PostgreSQL) for recovery. Top N would be cached in a separate Redis string with TTL of 5 seconds. If we need strong consistency for tournament prizes, I'd use a single shard per competition with Lua scripts for atomic updates.
FAQ · 5 QUESTIONS

Frequently Asked Questions

01
What is a leaderboard system in simple terms?
02
Can I use a SQL database for a leaderboard?
03
How do I handle ties in the leaderboard?
04
What is the biggest production risk with leaderboards?
05
How do I persist a leaderboard from Redis to a database?
🔥

That's System Design Interview. Mark it forged?

5 min read · try the examples if you haven't

Previous
Design a Job Scheduler
7 / 7 · System Design Interview
Next
Top SQL Interview Questions