Cuckoo Hashing — Correlated Hash Functions and Rehash Loops
Correlated hash functions caused 12-second rehash loops at 55% load.
20+ years shipping performance-critical code where algorithms decide the bill. Everything here is grounded in real deployments.
- Cuckoo hashing uses two hash tables, each key has exactly two candidate slots
- Lookups always check at most two locations — hard worst-case O(1), not amortized, not expected
- Insertions use an eviction chain: kick out the occupant, move them to their alternate slot
- If the eviction chain loops back to the starting key, a cycle is detected and the table must be resized and rehashed with new hash seeds
- Max load factor ~50% per table in theory, 45% in production — exceed it and insertion failure probability rises exponentially
- Biggest mistake: assuming insertion is always O(1) — it degrades to O(n) in the worst case and falls apart sharply past load factor thresholds
- Hash function independence is not optional — correlated functions collapse both tables into one and cause premature cycles
Imagine a hotel with two buildings, and every guest has one reserved room in each building. When a new guest arrives and both their rooms are taken, they bump one occupant out, who then moves to their room in the other building. That evicted guest might bump someone else out, creating a chain reaction until everyone has a room or the hotel manager calls time and opens a new wing.
Cuckoo hashing works exactly like this. Every key has two possible home slots, one in each table. Lookup is trivially fast — check room A, check room B, done. Insertion is where the work happens: evict, relocate, repeat. The payoff is that no matter how full the hotel gets (within reason), finding any guest never requires more than two stops. That hard guarantee is why latency-critical systems reach for cuckoo hashing when other approaches fall short.
Hash tables are the backbone of modern software — they power database indexes, caches, compilers, and networking stacks. But the classic open-addressing and chaining approaches carry a hidden cost: their lookup time degrades under load. A bucket with five chained nodes requires five pointer dereferences. A probing table at 80% occupancy can force dozens of slot comparisons per lookup. For most applications that cost is tolerable. For latency-sensitive systems — packet classification in network routers, real-time game servers, CPU-level cache simulators — tolerable is not a design target.
Cuckoo hashing, introduced by Rasmus Pagh and Flemming Friche Rodler in 2001, attacks this problem with a radical guarantee: lookups always examine at most two memory locations. Worst-case O(1) — not amortized, not expected, but hard. It achieves this by giving every key two candidate slots across two separate hash tables and using an eviction-and-relocation protocol during insertion to ensure no slot ever holds more than one key.
That guarantee comes with strings attached. Insertion is unpredictable and can fail entirely when eviction chains cycle back on themselves. The load factor ceiling is roughly half what chained hashing tolerates. Hash function independence is not a nice-to-have — it is the mechanism the whole structure depends on. And if you deploy this without a max-iteration guard on the eviction chain, you have introduced an infinite loop into your production system.
By the end of this article you will understand exactly how the two-table eviction cycle works, why infinite loops occur and how to detect them before they happen, how to implement a working cuckoo hash map in Java with cycle detection and graceful rehashing, what load factor limits apply and why exceeding them causes exponential rehash overhead, and when cuckoo hashing genuinely beats its alternatives in production. The interview questions at the end go into the nuanced territory that separates candidates who have read about this structure from those who have actually operated it under load.
Cuckoo Hashing — When Two Tables Are Better Than One
Cuckoo hashing is a collision resolution scheme that guarantees O(1) worst-case lookup by using two independent hash functions and two separate tables. When a new key collides at its primary position in table T1, it evicts the existing occupant, which then rehashes to its alternate position in T2. That occupant may evict another key back to T1, creating a chain of displacements until either every key settles or a cycle is detected. This is the core mechanic: constant-time lookups at the cost of unpredictable insertion time.
In practice, cuckoo hashing achieves an average insertion cost of O(1) with high probability, but worst-case insertions can degrade to O(n) when a rehash loop occurs — typically after a threshold of evictions (e.g., 500 iterations). The load factor must stay below 50% per table (total ~90% combined) to keep cycles rare. Memory overhead is roughly 2x a standard hash table because of the two tables, but the payoff is deterministic lookup latency, critical for real-time systems.
Use cuckoo hashing when worst-case lookup latency must be bounded — for example, in packet processing, CPU caches, or database index structures where a single slow hash probe could stall a pipeline. It is not a general-purpose replacement for HashMap; the insertion cost variance and memory doubling make it unsuitable for write-heavy workloads. The real value is in read-dominated paths where every nanosecond matters and you cannot tolerate the occasional O(n) probe from chaining or open addressing.
Core Mechanism: Two Tables, Two Hash Functions
The structure is built on one invariant: no slot in either table ever holds more than one key. Cuckoo hashing maintains two hash tables, T1 and T2, each of size n. Every key k has exactly two candidate positions — h1(k) in T1 and h2(k) in T2. That invariant is what makes lookup trivially fast.
To look up a key, compute h1(k) and h2(k) and check both slots. Two memory accesses. No probing, no chaining, no traversal, no variable-length search. If the key is in the table, it is in one of those two locations. If it is in neither, it is not in the table. That is the entire lookup algorithm.
Insertion is where the complexity lives. To insert key k, first try T1[h1(k)]. If the slot is empty, place k there and finish. If the slot is occupied by some existing key k', evict k' and place k in its slot. k' now needs a home — its only remaining option is T2[h2(k')]. If that slot is also occupied, evict that occupant and continue. The chain zigzags between T1 and T2: each evicted key has exactly one alternate table to move to, and within that table exactly one candidate slot. The chain terminates when an empty slot is found or when it has gone on long enough to indicate a cycle.
The asymmetry between lookup and insertion is intentional. You pay a variable insertion cost — sometimes one step, sometimes dozens — in exchange for a guaranteed, constant, predictable lookup cost. For workloads where lookups vastly outnumber inserts, that is a trade you want to make. For write-heavy workloads it is the wrong tool entirely.
- Two tables equal two buildings; each key has one room in each — that assignment is fixed by the hash functions
- Lookup checks room A then room B — two memory accesses and you are done, regardless of how full the hotel is
- Insertion bumps the occupant of your room, who moves to their alternate room, who might bump someone else
- The eviction chain is the domino sequence of bumped guests finding alternate rooms
- A cycle means guest A bumps B who bumps C who bumps A — infinite without a chain length limit
- Resize opens a new wing and reassigns all rooms under a new numbering scheme (new hash seeds)
put(). Most hash table APIs throw on failure or silently succeed. A false here means the eviction chain hit the ceiling and nothing was inserted — not a slow insert, not a degraded insert, a dropped insert. Your calling code must handle this explicitly and trigger a resize before retrying. If it does not, you have silent data loss.put() method returning false is not an error condition to log and ignore — it is a signal that the table needs to grow before you retry. Handle it explicitly.Cycle Detection and the Infinite Loop Problem
The fundamental failure mode of cuckoo hashing is the eviction cycle. During insertion, the chain zigzags between T1 and T2: evict from T1, try T2, evict from T2, try T1, and so on. If this chain revisits a slot that was already modified earlier in the same insertion, the displaced key has nowhere to go that it has not already displaced someone from. You have a cycle, and without a guard the loop runs forever.
This is not a theoretical edge case. The probability of encountering a cycle is a function of load factor, and it rises steeply. At 40% per-table occupancy the risk is negligible — a properly seeded implementation can run millions of insertions without seeing one. At 50% the expected failure rate per insertion is around 1%. At 55% it climbs toward 10%. At 60% you should expect failure on nearly every insertion attempt. The relationship is exponential, not linear, which is why the 45% production ceiling feels conservative but genuinely matters.
The adversarial key distribution problem compounds this. A 2004 analysis by Devroye and Morin established the theoretical load limit, but that analysis assumes random hash function behaviour. When real key sets interact with fixed hash functions, certain key patterns produce consistently long eviction chains at any load factor. This is what the production incident above captured: a traffic burst introduced a pattern of source IPs that happened to hash to overlapping slot pairs under both functions. The functions were correlated, which made the overlap worse, but even with independent functions an adversarial enough key set can trigger cycles at 35% load.
The practical response is layered. First, bound the chain: if the eviction chain exceeds MAX_EVICTION_CHAIN iterations, abort the insertion and return false — the calling code triggers a resize. Second, use new hash seeds on every rehash — the same functions on a larger table will reproduce the same problematic slot assignments for the same key set. Third, monitor insertion failure rate as a first-class metric alongside load factor. Load factor tells you where you are on the probability curve; insertion failure rate tells you whether the current key distribution is adversarial relative to your hash functions.
Two approaches to cycle detection exist in practice. The iteration count approach is simpler and lower overhead: if the chain runs longer than MAX_EVICTION_CHAIN steps, declare failure. The visited-set approach is more precise: track every (table, slot) pair visited during the chain and stop immediately when a pair repeats. The visited-set detects true cycles early and can terminate in fewer than MAX_EVICTION_CHAIN steps, but it allocates memory per insertion attempt and adds overhead to every step. For most production systems the iteration count is sufficient — the overhead of the visited-set only pays off when adversarial key distributions are a hard requirement.
Hash Function Selection and Independence
Everything in cuckoo hashing depends on one property of the two hash functions: they must distribute keys independently. If h1 and h2 are correlated — and the most common way this happens is someone writes h2(k) = h1(k) + constant because it looks simple and distinct — then the two candidate slots for many keys will cluster in the same region of the table. The structure now behaves like a single table with twice the collision rate. Cycles form at load factors you would consider safe for open addressing.
Independence in hash functions is a stronger requirement than uniformity. A function can distribute keys uniformly across all slots and still be correlated with another function that does the same. What matters is that knowing h1(k) tells you nothing useful about h2(k). In practice you achieve this either by using genuinely different algorithms (SipHash versus MurmurHash3) or by using the same algorithm with seeds chosen to produce statistically independent outputs — and then verifying empirically before deployment.
MurmurHash3's 128-bit output variant is a natural fit here. It produces two independent 64-bit values in a single pass. Use the low 64 bits for h1 and the high 64 bits for h2. The two halves are designed to be independent by construction, and you get both values from one computation — no extra cost per lookup or insertion.
SipHash with different 128-bit key pairs is the stronger choice when the key space may be adversarially controlled — for example, if your cuckoo hash table is part of a network-facing service where an attacker could craft keys to maximize collisions. SipHash's resistance to hash flooding comes from the key material; changing the key pair changes the entire output distribution.
Tabulation hashing is worth knowing about for academic completeness: it achieves strong theoretical independence guarantees and is extremely fast (array lookups only), but it requires per-byte random tables that consume memory proportional to the key size and are awkward to implement correctly for variable-length keys.
The practical check: after choosing your two functions, run estimateIndependence() against a representative sample of your key set. The overlap rate — the fraction of keys for which h1(k) == h2(k) after modulo — should be approximately 1/capacity. If it is significantly higher, your functions are correlated. This test takes seconds and has caught production issues that would otherwise only appear under load.
- Independent hashes distribute keys uniformly and separately across both tables — keys end up in different regions of each table
- Correlated hashes cluster both candidate slots for many keys into the same region — eviction chains immediately run into each other
- The failure mode is silent at low load and catastrophic at moderate load — cycles appear at 30% where independent hashes would be stable to 45%
- Verify with estimateIndependence() — if overlap exceeds 2/capacity, your functions are correlated and must be changed before deployment
- Never derive h2 as h1 + constant, h1 * constant, or any linear transform of h1 — these are all maximally correlated
Load Factor Limits and Rehash Strategy
Cuckoo hashing operates under tighter load factor constraints than any other common hash table variant. The theoretical maximum for two tables of size n is 50% per table — meaning you can fill at most n keys into a pair of tables each holding n slots before cycle probability becomes prohibitive. In practice that ceiling is lower, and the gap between theory and production reality matters.
Below 40% per table, insertion is nearly always fast. Eviction chains are short — typically two to five steps. Cycle probability is negligible for most key distributions and hash function pairs. This is the comfortable operating range where cuckoo hashing delivers its performance guarantee without any insertion drama.
Between 40% and 50%, behaviour depends heavily on hash function quality and key distribution. With good independent hash functions and non-adversarial keys, you can run in this range and see occasional longer chains but rare failures. With correlated functions or adversarial keys, failure rates in this band climb fast. This is the region where monitoring insertion failure rate becomes important — load factor alone does not tell you enough.
Above 50%, insertion failure should be expected rather than tolerated. The probability curve is exponential in this range. Trying to squeeze extra utilisation out of a cuckoo table above 50% is not a capacity optimisation — it is borrowing reliability against a debt that collects with interest during your next traffic spike.
The production-grade rehash strategy has four components. First, double the table capacity — 2x growth is standard and keeps amortised insertion cost O(1) across the table's lifetime. Second, generate new hash function seeds — the same functions on a larger table reproduce the same adversarial slot mappings. Third, re-insert every key from the old tables into the new ones — this is O(n) but happens infrequently, and the new seeds mean the insertion success rate should be high. Fourth, retry rehash with a further capacity increase if the first attempt fails — adversarial key sets occasionally defeat even a freshly seeded pair; cap retries at five and throw on persistent failure rather than looping silently.
For systems where rehash latency is unacceptable — a routing table that cannot pause for 50ms while 2 million entries are re-inserted — consider the stash approach. A small auxiliary array (16 to 64 slots) holds overflow keys that failed eviction. These keys require a third memory access during lookup, but the stash reduces rehash frequency dramatically. A stash of size O(log n) reduces insertion failure probability from O(1/n) to O(1/n²) — the difference between rehashing every few thousand inserts and rehashing every few billion.
Cuckoo Hashing vs Alternatives in Production
Choosing a hash table implementation is a trade-off analysis, and the right answer depends on ratios you should measure rather than assume. Here is how cuckoo hashing compares against the alternatives you are actually likely to consider.
Chained hashing is the baseline: each slot holds a linked list, collisions extend the chain, and load factors of 80% or higher are manageable. Insertion is O(1) amortized. Lookup is O(1) expected but degrades under load — a heavily loaded bucket means traversing several nodes with pointer dereferences at each step. For CPU cache efficiency this is poor, and for latency guarantees it is non-starter territory. Java's HashMap is chained and switches to a tree structure per bucket at chain length 8, which bounds the worst case but adds per-insertion overhead and GC pressure from node allocation.
Linear probing is cache-friendly because it accesses consecutive memory locations, which plays well with prefetchers. But it clusters under load — the probe sequence lengthens quadratically as occupancy rises, and worst-case lookup is O(n). Deletion requires tombstone markers that accumulate and degrade subsequent lookups without periodic compaction.
Robin Hood hashing reduces probe sequence variance by requiring each key to be displaced if the new arrival has searched longer than the current occupant — a richer-gets-displaced policy. This produces a more uniform probe distribution and lower variance in lookup cost, but worst-case is still O(n). Its advantage over cuckoo hashing is at high load factors (70%+ is viable) and in write-heavy workloads where insertion predictability matters more than lookup guarantees.
Hopscotch hashing constrains keys to a small neighbourhood around their home slot, enabling concurrent lock-free implementations with bounded lookup distances. It handles high load well and is cache-friendly. It is more complex to implement correctly than either open addressing or cuckoo hashing, and the lookup bound is neighbourhood size rather than a hard two-location check.
Cuckoo hashing wins when: lookups are the dominant operation by a large margin (95%+), lookup latency must be bounded not just on average but at every percentile, or concurrent lock-free reads are required without complex synchronisation. The two-fixed-location property enables lock-free reads with a single memory fence — no pointer chasing, no variable-length probe sequence, no possibility of a reader observing a partially-written chain.
The benchmark numbers from our routing table comparison are worth stating concretely: cuckoo achieved 38ns p99 lookup and 1,200ns p99 insertion. Robin Hood achieved 52ns p99 lookup and 45ns p99 insertion. For a workload that was 99.7% reads, cuckoo's lookup advantage was decisive. For a workload that was 60% reads, Robin Hood's insertion consistency would have won.
Cuckoo Hashing in Action: A Concrete Walkthrough
Theory is cheap. Let's watch this thing breathe. You've got two tables, each with 10 slots, and two hash functions: h1(x) = x % 10, h2(x) = (x / 10) % 10. Insert 5. h1(5) = 5, slot 5 is empty. Done. Insert 15. h1(15) = 5 — collision. Cuckoo kicks 5 out to its alternate position. h2(5) = 0, slot 0 is empty. 5 moves there, 15 sits in slot 5. Insert 25. h1(25) = 5 — collision with 15. Kick 15 out. h2(15) = 1, slot 1 empty. 15 moves, 25 takes slot 5. Now look what happened: every insertion displaced exactly one element. No linked lists. No probing chains. Worst case lookup is still one hash, one table access, done. That's the guarantee. You can trace this on a whiteboard in under 30 seconds. The predictability matters when you're debugging a production latency spike at 3 AM. Knowing your lookup will never cascade beyond two memory accesses changes how you sleep.
Implementation Blueprint: The Bare Metal Loop
Here's where most tutorials wave their hands. I'm showing you the actual loop structure that makes cuckoo hashing work under load. The core is a while loop that runs for a bounded number of attempts. You maintain a current key, current table index, and a counter. Each iteration: try placing the key in its primary table. If that slot is free, done. If not, swap the occupant into the current key and flip to the other table using its alternate hash. Increment the counter. If counter exceeds a threshold — typically O(log n) or a fixed constant like 20 — you bail out and trigger a rehash. Why? Because you've either hit a cycle or the table is too full. The threshold is your safety net against infinite loops. Production systems tune this based on load factor. At 50% load, you almost never hit the limit. At 90%, you'll trigger rehash constantly. That's correct behavior. The rehash doubles table size, picks new hash functions from a universal family, and inserts everything fresh. It's expensive but rare. The rest of the time, you're paying for two hash computations and maybe two memory swaps. That's the trade: occasional O(n) rehash, everyday O(1) operations.
Why Your First Hash Function Is Already Leaking
Hash function independence is not academic trivia. If your two hash functions produce correlated outputs, you guarantee cycles at absurdly low load factors. Every collision becomes a cluster, every insert becomes O(n) evictions.
Production systems require independent hash seeds per function. Use SipHash with different keys, or split a single 64-bit hash into high and low halves. Never reuse the same algorithm with the same seed.
Real-world failure: A payment backend used SHA-256 with index and index+1 as 'different' functions. Load factor 0.3 triggered infinite rehash loops. Independence means statistical zero correlation — not just different constants.
Load Factor 0.5 Is Not a Suggestion — It's a Law
Cuckoo hashing breaks hard above 50% load factor. Unlike open addressing or chaining where performance degrades gracefully, cuckoo hashing hits a cliff. At 0.51, insertion probability of requiring a rehash jumps from 5% to 95%.
This isn't a tuning parameter. It's a mathematical bound from the bipartite graph matching model behind the data structure. Your hash table has two buckets per key. Fill one more item than buckets available, and the eviction chain becomes a random walk with no guaranteed termination.
Production rule: Trigger rehash at 0.45 load factor. Account for the 5% headroom during peak writes. Monitoring load factor is not optional — it is the primary health metric for any cuckoo implementation.
Cycle Detection: Your Eviction Loop Needs an Escape Hatch
Infinite loops in cuckoo hashing aren't bugs — they're features of the data distribution. When both of a key's target slots are occupied by keys that also can't move, the system deadlocks. Without detection, your insert hangs forever.
The fix: set a maximum eviction chain length. A common threshold is 2 * table size. If a single insert triggers more evictions than that, abort and rehash. This bounds worst-case insertion to O(n), but average case stays O(1).
Do not rely on hash functions to 'never' produce cycles. They will. Every production cuckoo implementation I've seen that skipped cycle detection hit it within the first million inserts. The counter is free. The debugging session without it costs hours.
3.1. Formal Description: Why Two Tables Beat One
Cuckoo hashing guarantees O(1) worst-case lookup by enforcing exactly one candidate location per key. Each key has exactly two possible cells: one in table T1 via hash function h1, and one in table T2 via h2. The formal invariant: every stored key occupies exactly one of its two assigned positions. Insertion follows a deterministic eviction chain — if the target slot is occupied, the current occupant is displaced to its alternate table, continuing until a vacancy is found or a cycle is detected. This is not a heuristic; it's a total ordering constraint. The cycle detection threshold is typically set to a constant multiple of table size. Once reached, the tables are rehashed with new hash functions. The formal model proves that as long as the load factor stays below 0.5, the expected insertion cost remains constant. This mathematical guarantee eliminates the amortized-cost surprises inherent in chaining or linear probing.
4.1. Formal Setup: Tables, Hashes, and the Eviction Contract
Introduction
Cuckoo Hashing is a collision resolution scheme for hash tables that guarantees O(1) worst-case lookup and deletion time, unlike chaining or open addressing which degrade to linear in high-load scenarios. It achieves this by using two independent hash functions and two tables, or a single table with two candidate positions. When a new key collides, it evicts the occupant, which then rehashes to its alternate location — possibly displacing another. This process cascades until a free slot is found or a cycle is detected. Named after the cuckoo bird's nesting behavior, the algorithm trades amortized insertion complexity for deterministic retrieval performance. Why does this matter? In production systems where latency spikes are unacceptable — such as packet routing tables, CPU caches, or real-time databases — constant-time lookups are non-negotiable. However, the simplicity ends with the concept. The implementation demands careful hash function independence, strict load factor limits (typically ≤ 0.5), and robust cycle detection. This article dissects those mechanical truths without sugarcoating the engineering realities.
4.2. Workflow
The workflow of cuckoo hashing is a disciplined procedure designed to maintain constant-time access. Insertion begins by computing the first hash function h1(key) to get a candidate slot in table T1. If that slot is empty, the key is placed immediately — done. If occupied, the existing entry is evicted: the new key takes its place, and the evicted key is rehashed using h2. That displaced key then targets a slot in T2. Again, if free, it stops; if occupied, eviction cascades back to T1, alternating tables each round. This continues until either an empty slot is found or a cycle is detected — when the same key is evicted twice in the same insertion attempt. Cycle detection is critical: a naive loop could run forever. The standard threshold is a maximum number of displacements (often log2(capacity) or fixed like 100). If exceeded, the table is rehashed with new hash functions and grown. Lookups are straightforward: check T1[h1(key)], then T2[h2(key)], or return null. Deletion is simply clearing the slot — no tombstones needed because probing never occurs. This workflow guarantees O(1) worst-case search with amortized O(1) insertion, provided hash functions are independent and load factor stays ≤ 0.5.
Cuckoo table stuck in infinite rehash loop during traffic spike
- Never trust theoretical average-case insertion cost — bound the eviction chain length unconditionally
- Monitor per-table load factor continuously; alert at 45%, not 50% — by the time you hit 50% the probability curve has already turned against you
- Correlated hash functions are the silent killer — verify independence empirically before deployment, not after a production incident
- A rehash must use genuinely new hash seeds, not the same functions applied to a larger table — same seeds reproduce the same adversarial mapping
put() method with no progress on the chain counter. Kill the thread if the service can tolerate it, or kill the process and redeploy with the max-iteration guard in place. After recovery, audit the insertion path: MAX_EVICTION_CHAIN must be set before any production deployment.jstack <pid> | grep -A 20 CuckooHashMapjmap -histo <pid> | head -30put(); resize to 2x capacity with new hash seeds if exceededKey takeaways
Common mistakes to avoid
5 patternsUsing correlated hash functions — most commonly h2 = h1 + constant or h2 = h1 XOR constant
No max eviction chain length guard on the put() method
put() with no progress visible in application logs.put() method. When the loop counter reaches this limit, return false immediately. The caller must handle false explicitly — trigger CuckooResizer.resize() and retry the insertion. Never silently drop the key on false return; the API contract is explicit failure, not silent loss.Operating above 50% per-table load factor without monitoring insertion failure rate
Rehashing with the same hash function seeds on a larger table
Choosing cuckoo hashing for write-heavy workloads based on the O(1) average-case insertion claim
Practice These on LeetCode
Interview Questions on This Topic
What is the worst-case lookup time for cuckoo hashing, and how does it differ from open addressing?
Frequently Asked Questions
20+ years shipping performance-critical code where algorithms decide the bill. Everything here is grounded in real deployments.
That's Hashing. Mark it forged?
18 min read · try the examples if you haven't