Cuckoo Hashing — Correlated Hash Functions and Rehash Loops
Correlated hash functions caused 12-second rehash loops at 55% load.
- Cuckoo hashing uses two hash tables, each key has exactly two candidate slots
- Lookups always check at most two locations — hard worst-case O(1), not amortized, not expected
- Insertions use an eviction chain: kick out the occupant, move them to their alternate slot
- If the eviction chain loops back to the starting key, a cycle is detected and the table must be resized and rehashed with new hash seeds
- Max load factor ~50% per table in theory, 45% in production — exceed it and insertion failure probability rises exponentially
- Biggest mistake: assuming insertion is always O(1) — it degrades to O(n) in the worst case and falls apart sharply past load factor thresholds
- Hash function independence is not optional — correlated functions collapse both tables into one and cause premature cycles
Imagine a hotel with two buildings, and every guest has one reserved room in each building. When a new guest arrives and both their rooms are taken, they bump one occupant out, who then moves to their room in the other building. That evicted guest might bump someone else out, creating a chain reaction until everyone has a room or the hotel manager calls time and opens a new wing.
Cuckoo hashing works exactly like this. Every key has two possible home slots, one in each table. Lookup is trivially fast — check room A, check room B, done. Insertion is where the work happens: evict, relocate, repeat. The payoff is that no matter how full the hotel gets (within reason), finding any guest never requires more than two stops. That hard guarantee is why latency-critical systems reach for cuckoo hashing when other approaches fall short.
Hash tables are the backbone of modern software — they power database indexes, caches, compilers, and networking stacks. But the classic open-addressing and chaining approaches carry a hidden cost: their lookup time degrades under load. A bucket with five chained nodes requires five pointer dereferences. A probing table at 80% occupancy can force dozens of slot comparisons per lookup. For most applications that cost is tolerable. For latency-sensitive systems — packet classification in network routers, real-time game servers, CPU-level cache simulators — tolerable is not a design target.
Cuckoo hashing, introduced by Rasmus Pagh and Flemming Friche Rodler in 2001, attacks this problem with a radical guarantee: lookups always examine at most two memory locations. Worst-case O(1) — not amortized, not expected, but hard. It achieves this by giving every key two candidate slots across two separate hash tables and using an eviction-and-relocation protocol during insertion to ensure no slot ever holds more than one key.
That guarantee comes with strings attached. Insertion is unpredictable and can fail entirely when eviction chains cycle back on themselves. The load factor ceiling is roughly half what chained hashing tolerates. Hash function independence is not a nice-to-have — it is the mechanism the whole structure depends on. And if you deploy this without a max-iteration guard on the eviction chain, you have introduced an infinite loop into your production system.
By the end of this article you will understand exactly how the two-table eviction cycle works, why infinite loops occur and how to detect them before they happen, how to implement a working cuckoo hash map in Java with cycle detection and graceful rehashing, what load factor limits apply and why exceeding them causes exponential rehash overhead, and when cuckoo hashing genuinely beats its alternatives in production. The interview questions at the end go into the nuanced territory that separates candidates who have read about this structure from those who have actually operated it under load.
Core Mechanism: Two Tables, Two Hash Functions
The structure is built on one invariant: no slot in either table ever holds more than one key. Cuckoo hashing maintains two hash tables, T1 and T2, each of size n. Every key k has exactly two candidate positions — h1(k) in T1 and h2(k) in T2. That invariant is what makes lookup trivially fast.
To look up a key, compute h1(k) and h2(k) and check both slots. Two memory accesses. No probing, no chaining, no traversal, no variable-length search. If the key is in the table, it is in one of those two locations. If it is in neither, it is not in the table. That is the entire lookup algorithm.
Insertion is where the complexity lives. To insert key k, first try T1[h1(k)]. If the slot is empty, place k there and finish. If the slot is occupied by some existing key k', evict k' and place k in its slot. k' now needs a home — its only remaining option is T2[h2(k')]. If that slot is also occupied, evict that occupant and continue. The chain zigzags between T1 and T2: each evicted key has exactly one alternate table to move to, and within that table exactly one candidate slot. The chain terminates when an empty slot is found or when it has gone on long enough to indicate a cycle.
The asymmetry between lookup and insertion is intentional. You pay a variable insertion cost — sometimes one step, sometimes dozens — in exchange for a guaranteed, constant, predictable lookup cost. For workloads where lookups vastly outnumber inserts, that is a trade you want to make. For write-heavy workloads it is the wrong tool entirely.
- Two tables equal two buildings; each key has one room in each — that assignment is fixed by the hash functions
- Lookup checks room A then room B — two memory accesses and you are done, regardless of how full the hotel is
- Insertion bumps the occupant of your room, who moves to their alternate room, who might bump someone else
- The eviction chain is the domino sequence of bumped guests finding alternate rooms
- A cycle means guest A bumps B who bumps C who bumps A — infinite without a chain length limit
- Resize opens a new wing and reassigns all rooms under a new numbering scheme (new hash seeds)
put(). Most hash table APIs throw on failure or silently succeed. A false here means the eviction chain hit the ceiling and nothing was inserted — not a slow insert, not a degraded insert, a dropped insert. Your calling code must handle this explicitly and trigger a resize before retrying. If it does not, you have silent data loss.put() method returning false is not an error condition to log and ignore — it is a signal that the table needs to grow before you retry. Handle it explicitly.Cycle Detection and the Infinite Loop Problem
The fundamental failure mode of cuckoo hashing is the eviction cycle. During insertion, the chain zigzags between T1 and T2: evict from T1, try T2, evict from T2, try T1, and so on. If this chain revisits a slot that was already modified earlier in the same insertion, the displaced key has nowhere to go that it has not already displaced someone from. You have a cycle, and without a guard the loop runs forever.
This is not a theoretical edge case. The probability of encountering a cycle is a function of load factor, and it rises steeply. At 40% per-table occupancy the risk is negligible — a properly seeded implementation can run millions of insertions without seeing one. At 50% the expected failure rate per insertion is around 1%. At 55% it climbs toward 10%. At 60% you should expect failure on nearly every insertion attempt. The relationship is exponential, not linear, which is why the 45% production ceiling feels conservative but genuinely matters.
The adversarial key distribution problem compounds this. A 2004 analysis by Devroye and Morin established the theoretical load limit, but that analysis assumes random hash function behaviour. When real key sets interact with fixed hash functions, certain key patterns produce consistently long eviction chains at any load factor. This is what the production incident above captured: a traffic burst introduced a pattern of source IPs that happened to hash to overlapping slot pairs under both functions. The functions were correlated, which made the overlap worse, but even with independent functions an adversarial enough key set can trigger cycles at 35% load.
The practical response is layered. First, bound the chain: if the eviction chain exceeds MAX_EVICTION_CHAIN iterations, abort the insertion and return false — the calling code triggers a resize. Second, use new hash seeds on every rehash — the same functions on a larger table will reproduce the same problematic slot assignments for the same key set. Third, monitor insertion failure rate as a first-class metric alongside load factor. Load factor tells you where you are on the probability curve; insertion failure rate tells you whether the current key distribution is adversarial relative to your hash functions.
Two approaches to cycle detection exist in practice. The iteration count approach is simpler and lower overhead: if the chain runs longer than MAX_EVICTION_CHAIN steps, declare failure. The visited-set approach is more precise: track every (table, slot) pair visited during the chain and stop immediately when a pair repeats. The visited-set detects true cycles early and can terminate in fewer than MAX_EVICTION_CHAIN steps, but it allocates memory per insertion attempt and adds overhead to every step. For most production systems the iteration count is sufficient — the overhead of the visited-set only pays off when adversarial key distributions are a hard requirement.
Hash Function Selection and Independence
Everything in cuckoo hashing depends on one property of the two hash functions: they must distribute keys independently. If h1 and h2 are correlated — and the most common way this happens is someone writes h2(k) = h1(k) + constant because it looks simple and distinct — then the two candidate slots for many keys will cluster in the same region of the table. The structure now behaves like a single table with twice the collision rate. Cycles form at load factors you would consider safe for open addressing.
Independence in hash functions is a stronger requirement than uniformity. A function can distribute keys uniformly across all slots and still be correlated with another function that does the same. What matters is that knowing h1(k) tells you nothing useful about h2(k). In practice you achieve this either by using genuinely different algorithms (SipHash versus MurmurHash3) or by using the same algorithm with seeds chosen to produce statistically independent outputs — and then verifying empirically before deployment.
MurmurHash3's 128-bit output variant is a natural fit here. It produces two independent 64-bit values in a single pass. Use the low 64 bits for h1 and the high 64 bits for h2. The two halves are designed to be independent by construction, and you get both values from one computation — no extra cost per lookup or insertion.
SipHash with different 128-bit key pairs is the stronger choice when the key space may be adversarially controlled — for example, if your cuckoo hash table is part of a network-facing service where an attacker could craft keys to maximize collisions. SipHash's resistance to hash flooding comes from the key material; changing the key pair changes the entire output distribution.
Tabulation hashing is worth knowing about for academic completeness: it achieves strong theoretical independence guarantees and is extremely fast (array lookups only), but it requires per-byte random tables that consume memory proportional to the key size and are awkward to implement correctly for variable-length keys.
The practical check: after choosing your two functions, run estimateIndependence() against a representative sample of your key set. The overlap rate — the fraction of keys for which h1(k) == h2(k) after modulo — should be approximately 1/capacity. If it is significantly higher, your functions are correlated. This test takes seconds and has caught production issues that would otherwise only appear under load.
- Independent hashes distribute keys uniformly and separately across both tables — keys end up in different regions of each table
- Correlated hashes cluster both candidate slots for many keys into the same region — eviction chains immediately run into each other
- The failure mode is silent at low load and catastrophic at moderate load — cycles appear at 30% where independent hashes would be stable to 45%
- Verify with estimateIndependence() — if overlap exceeds 2/capacity, your functions are correlated and must be changed before deployment
- Never derive h2 as h1 + constant, h1 * constant, or any linear transform of h1 — these are all maximally correlated
Load Factor Limits and Rehash Strategy
Cuckoo hashing operates under tighter load factor constraints than any other common hash table variant. The theoretical maximum for two tables of size n is 50% per table — meaning you can fill at most n keys into a pair of tables each holding n slots before cycle probability becomes prohibitive. In practice that ceiling is lower, and the gap between theory and production reality matters.
Below 40% per table, insertion is nearly always fast. Eviction chains are short — typically two to five steps. Cycle probability is negligible for most key distributions and hash function pairs. This is the comfortable operating range where cuckoo hashing delivers its performance guarantee without any insertion drama.
Between 40% and 50%, behaviour depends heavily on hash function quality and key distribution. With good independent hash functions and non-adversarial keys, you can run in this range and see occasional longer chains but rare failures. With correlated functions or adversarial keys, failure rates in this band climb fast. This is the region where monitoring insertion failure rate becomes important — load factor alone does not tell you enough.
Above 50%, insertion failure should be expected rather than tolerated. The probability curve is exponential in this range. Trying to squeeze extra utilisation out of a cuckoo table above 50% is not a capacity optimisation — it is borrowing reliability against a debt that collects with interest during your next traffic spike.
The production-grade rehash strategy has four components. First, double the table capacity — 2x growth is standard and keeps amortised insertion cost O(1) across the table's lifetime. Second, generate new hash function seeds — the same functions on a larger table reproduce the same adversarial slot mappings. Third, re-insert every key from the old tables into the new ones — this is O(n) but happens infrequently, and the new seeds mean the insertion success rate should be high. Fourth, retry rehash with a further capacity increase if the first attempt fails — adversarial key sets occasionally defeat even a freshly seeded pair; cap retries at five and throw on persistent failure rather than looping silently.
For systems where rehash latency is unacceptable — a routing table that cannot pause for 50ms while 2 million entries are re-inserted — consider the stash approach. A small auxiliary array (16 to 64 slots) holds overflow keys that failed eviction. These keys require a third memory access during lookup, but the stash reduces rehash frequency dramatically. A stash of size O(log n) reduces insertion failure probability from O(1/n) to O(1/n²) — the difference between rehashing every few thousand inserts and rehashing every few billion.
Cuckoo Hashing vs Alternatives in Production
Choosing a hash table implementation is a trade-off analysis, and the right answer depends on ratios you should measure rather than assume. Here is how cuckoo hashing compares against the alternatives you are actually likely to consider.
Chained hashing is the baseline: each slot holds a linked list, collisions extend the chain, and load factors of 80% or higher are manageable. Insertion is O(1) amortized. Lookup is O(1) expected but degrades under load — a heavily loaded bucket means traversing several nodes with pointer dereferences at each step. For CPU cache efficiency this is poor, and for latency guarantees it is non-starter territory. Java's HashMap is chained and switches to a tree structure per bucket at chain length 8, which bounds the worst case but adds per-insertion overhead and GC pressure from node allocation.
Linear probing is cache-friendly because it accesses consecutive memory locations, which plays well with prefetchers. But it clusters under load — the probe sequence lengthens quadratically as occupancy rises, and worst-case lookup is O(n). Deletion requires tombstone markers that accumulate and degrade subsequent lookups without periodic compaction.
Robin Hood hashing reduces probe sequence variance by requiring each key to be displaced if the new arrival has searched longer than the current occupant — a richer-gets-displaced policy. This produces a more uniform probe distribution and lower variance in lookup cost, but worst-case is still O(n). Its advantage over cuckoo hashing is at high load factors (70%+ is viable) and in write-heavy workloads where insertion predictability matters more than lookup guarantees.
Hopscotch hashing constrains keys to a small neighbourhood around their home slot, enabling concurrent lock-free implementations with bounded lookup distances. It handles high load well and is cache-friendly. It is more complex to implement correctly than either open addressing or cuckoo hashing, and the lookup bound is neighbourhood size rather than a hard two-location check.
Cuckoo hashing wins when: lookups are the dominant operation by a large margin (95%+), lookup latency must be bounded not just on average but at every percentile, or concurrent lock-free reads are required without complex synchronisation. The two-fixed-location property enables lock-free reads with a single memory fence — no pointer chasing, no variable-length probe sequence, no possibility of a reader observing a partially-written chain.
The benchmark numbers from our routing table comparison are worth stating concretely: cuckoo achieved 38ns p99 lookup and 1,200ns p99 insertion. Robin Hood achieved 52ns p99 lookup and 45ns p99 insertion. For a workload that was 99.7% reads, cuckoo's lookup advantage was decisive. For a workload that was 60% reads, Robin Hood's insertion consistency would have won.
Cuckoo table stuck in infinite rehash loop during traffic spike
- Never trust theoretical average-case insertion cost — bound the eviction chain length unconditionally
- Monitor per-table load factor continuously; alert at 45%, not 50% — by the time you hit 50% the probability curve has already turned against you
- Correlated hash functions are the silent killer — verify independence empirically before deployment, not after a production incident
- A rehash must use genuinely new hash seeds, not the same functions applied to a larger table — same seeds reproduce the same adversarial mapping
put() method with no progress on the chain counter. Kill the thread if the service can tolerate it, or kill the process and redeploy with the max-iteration guard in place. After recovery, audit the insertion path: MAX_EVICTION_CHAIN must be set before any production deployment.put(); resize to 2x capacity with new hash seeds if exceededKey takeaways
Common mistakes to avoid
5 patternsUsing correlated hash functions — most commonly h2 = h1 + constant or h2 = h1 XOR constant
No max eviction chain length guard on the put() method
put() with no progress visible in application logs.put() method. When the loop counter reaches this limit, return false immediately. The caller must handle false explicitly — trigger CuckooResizer.resize() and retry the insertion. Never silently drop the key on false return; the API contract is explicit failure, not silent loss.Operating above 50% per-table load factor without monitoring insertion failure rate
Rehashing with the same hash function seeds on a larger table
Choosing cuckoo hashing for write-heavy workloads based on the O(1) average-case insertion claim
Interview Questions on This Topic
What is the worst-case lookup time for cuckoo hashing, and how does it differ from open addressing?
Frequently Asked Questions
That's Hashing. Mark it forged?
11 min read · try the examples if you haven't