Database Sharding
Sharding splits data across multiple independent database instances (shards) to scale beyond what a single server can handle. Each shard holds a subset of the data. A shard key determines which shard holds each row. The application (or a proxy) routes queries to the correct shard based on the key.
How Sharding Works
The shard key is the column used to route data to a specific shard. Every query must include the shard key to avoid broadcasting to all shards.
```python
# Simple shard router based on user_id
class ShardRouter:
    def __init__(self, shards):
        self.shards = shards  # list of database connections
        self.n = len(shards)

    def get_shard(self, user_id: int):
        """Hash sharding: user_id % number_of_shards."""
        return self.shards[user_id % self.n]

    def query_user(self, user_id: int, sql: str):
        shard = self.get_shard(user_id)
        return shard.execute(sql, user_id)

# Example: 4 shards
# user_id 1001 → shard 1001 % 4 = 1
# user_id 1002 → shard 1002 % 4 = 2
# user_id 1000 → shard 1000 % 4 = 0

# Cross-shard query: must broadcast to all shards (expensive)
#   SELECT * FROM orders WHERE order_date = '2025-01-01'
# This hits ALL 4 shards and merges results in the application layer
```
Hash vs Range Sharding
```python
# Hash sharding: even distribution, no hot shards
# Weakness: range queries require broadcast
def hash_shard(user_id, n_shards):
    return user_id % n_shards

# Range sharding: range queries efficient, but hot shard risk
# Shard 0: user_ids 1 to 1,000,000
# Shard 1: user_ids 1,000,001 to 2,000,000
# etc.
RANGE_BOUNDARIES = [0, 1_000_000, 2_000_000, 3_000_000]

def range_shard(user_id, boundaries):
    # Each boundary is the lower bound of a shard's range
    for i, boundary in enumerate(boundaries[1:], 1):
        if user_id < boundary:
            return i - 1
    return len(boundaries) - 1  # user_ids past the last boundary

# Hot shard problem with range sharding:
# If new users get sequential IDs, they always land on the LAST shard.
# That shard takes all the write load: a "hot shard".

# Consistent hashing (used by Cassandra and DynamoDB) solves resharding
# without reassigning all data: when adding a shard, only the adjacent
# shard's data needs to move.
```
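The consistent-hashing idea can be sketched as a sorted ring of hash tokens. This is a minimal illustration under simplifying assumptions (one token per shard, MD5 as the hash function, no virtual nodes or replication), not how Cassandra or DynamoDB actually implement it:

```python
import bisect
import hashlib

def ring_hash(key: str) -> int:
    """Map any key to a point on a 2^32 ring."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % (2**32)

class ConsistentHashRing:
    def __init__(self, shard_names):
        # Each shard owns the arc of the ring ending at its token.
        self.ring = sorted((ring_hash(name), name) for name in shard_names)
        self.tokens = [token for token, _ in self.ring]

    def get_shard(self, key: str) -> str:
        """Return the first shard token clockwise from the key's position."""
        i = bisect.bisect_right(self.tokens, ring_hash(key)) % len(self.ring)
        return self.ring[i][1]

# Adding a shard only claims part of ONE neighbour's arc;
# keys elsewhere on the ring keep their old shard.
ring = ConsistentHashRing(["shard-a", "shard-b", "shard-c"])
bigger = ConsistentHashRing(["shard-a", "shard-b", "shard-c", "shard-d"])
keys = [f"user:{i}" for i in range(10_000)]
moved = sum(ring.get_shard(k) != bigger.get_shard(k) for k in keys)
print(f"{moved}/{len(keys)} keys moved after adding shard-d")
```

The property worth noticing: when `shard-d` joins, every key either stays put or moves to `shard-d`; nothing shuffles between the existing shards.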
🎯 Key Takeaways
- Sharding enables horizontal scale — add more servers as data grows.
- The shard key must be included in all queries — otherwise all shards must be scanned.
- Hash sharding distributes evenly; range sharding allows range queries but risks hot shards.
- Cross-shard joins and transactions are hard — design your shard key to minimise them.
- Most applications should NOT shard — optimise indexes, add read replicas, and scale vertically first.
Interview Questions on This Topic
- What is a shard key and how do you choose a good one?
- What is the hot shard problem and how do you prevent it?
- How does consistent hashing help with resharding?
Frequently Asked Questions
How do you handle transactions that span multiple shards?
Two-phase commit (2PC) is the formal approach — a coordinator asks all shards to prepare, then commits if all agree. But 2PC is slow and reduces availability (blocked if any shard fails). Most large-scale systems avoid cross-shard transactions by designing the data model so transactions stay within one shard. Eventual consistency or saga patterns handle cases where cross-shard coordination is unavoidable.
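The prepare/commit flow described above can be sketched with toy participants. Everything here (the `Shard` class and its `prepare`/`commit`/`rollback` methods) is illustrative, not a real database driver API:

```python
class Shard:
    """Toy 2PC participant: votes during prepare, then commits or rolls back."""
    def __init__(self, name: str, healthy: bool = True):
        self.name = name
        self.healthy = healthy
        self.committed = False

    def prepare(self, txn: str) -> bool:
        # Phase 1: durably stage the writes, then vote yes/no.
        return self.healthy

    def commit(self, txn: str):
        # Phase 2: make the staged writes visible.
        self.committed = True

    def rollback(self, txn: str):
        self.committed = False

def two_phase_commit(shards, txn: str) -> bool:
    # Phase 1: coordinator collects a vote from every participant.
    if all(s.prepare(txn) for s in shards):
        for s in shards:
            s.commit(txn)   # Phase 2: unanimous yes, so commit everywhere
        return True
    for s in shards:
        s.rollback(txn)     # Any "no" vote aborts the whole transaction
    return False

shards = [Shard("users-0"), Shard("users-1"), Shard("orders-0", healthy=False)]
print(two_phase_commit(shards, "transfer-42"))  # False: one shard vetoed
```

The blocking weakness is visible even in the sketch: the coordinator must hear from every shard before anything commits, so one slow or failed participant stalls the whole transaction.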
What is resharding and why is it painful?
Resharding is changing the number of shards (e.g., 4 → 8). With simple hash sharding (user_id % n), changing n remaps most keys to a different shard: going 4 → 8 moves exactly half of all keys, and 4 → 5 moves about 80% of them. Consistent hashing minimises this: adding a shard only takes over part of one neighbour's key range. Most cloud databases (DynamoDB, Cassandra) handle resharding automatically.
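The migration cost of naive mod-N hashing is easy to measure with a small simulation that counts how many keys map to a different shard after n changes:

```python
def moved_fraction(old_n: int, new_n: int, n_keys: int = 100_000) -> float:
    """Fraction of keys whose shard changes when going old_n -> new_n shards."""
    moved = sum(1 for uid in range(n_keys) if uid % old_n != uid % new_n)
    return moved / n_keys

print(moved_fraction(4, 5))  # 0.8: four out of five keys must migrate
print(moved_fraction(4, 8))  # 0.5: exactly half the keys must migrate
```

Doubling the shard count is actually the cheapest mod-N change (a key stays put whenever `uid % 8 < 4`), yet it still forces half the data to move; that is the pain consistent hashing avoids.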