Intermediate 9 min · March 05, 2026

Hash Map Hash-Flooding Attack — 2ms to 30s Latency

A hash-flooding attack caused API latency to spike from 2ms to 30s with 100% CPU.

N
Naren · Founder
Plain-English first. Then code. Then the interview question.
About
Quick Answer
  • Hash tables map keys to array indices using a hash function for average O(1) access
  • Collisions are managed via separate chaining (linked lists/trees) or open addressing
  • Load factor (default 0.75) triggers resize — trades memory for speed
  • Java's HashMap uses separate chaining and auto-treeifies buckets after 8 collisions
  • Production insight: a bad hashCode() can silently degrade performance to O(n) — test with large data
  • Biggest mistake: using mutable objects as keys breaks lookups after modification

Every time you log into a website, your password isn't compared by scanning a list of millions of users one by one — that would take seconds. Instead, it's looked up in a hash table and found in a fraction of a millisecond. Hash tables power database indexes, caches, compilers, and the dictionaries in nearly every programming language you've ever used. They're one of the most practically impactful data structures in existence, yet most developers use them without truly understanding what's happening underneath.

How Hashing Actually Works — The Magic Formula Demystified

At its core, hashing is the process of converting any input — a string, an object, a number — into a fixed integer called a hash code. That hash code is then reduced (usually via the modulo operator) to fit within the array that backs your hash table. The slot in the array where your data lands is called a bucket.

Here's the critical insight: a good hash function is deterministic and uniform. Deterministic means the same input always produces the same output — 'alice@email.com' must always hash to bucket 42, never bucket 17 on Tuesdays. Uniform means the outputs are spread across all available buckets as evenly as possible, so you don't end up with 900 items crammed into bucket 3 while buckets 4 through 100 sit empty.

In Java, every object inherits a hashCode() method from Object. Strings compute their hash by multiplying each character's Unicode value through a polynomial, which gives decent distribution. The HashMap class takes that hash code and uses bit-manipulation tricks to map it to a bucket index. You never see this machinery, but it runs on every single put() and get() call you make.

Visual Diagram: How a Hash Function Maps Keys to Buckets

The diagram below illustrates the process a hash table follows when you insert a key-value pair. The key is first passed through a hash function, which produces an integer (the hash code). That integer is then reduced via modulo to fit within the bucket array. The result is the index where the entry is stored. If two keys produce the same index, a collision occurs — how the table handles that is covered in the next section.

Collision Handling — What Happens When Two Keys Land in the Same Bucket

No matter how brilliant your hash function is, collisions are mathematically inevitable. With enough keys and a finite number of buckets, two different keys will eventually map to the same bucket. 'John' and 'Mary' might both hash to bucket 7. Your hash table must have a strategy for this.

The two dominant approaches are Separate Chaining and Open Addressing. Separate chaining means each bucket holds a linked list (or in Java 8+, a balanced tree once the list grows past 8 nodes). When a collision occurs, the new entry is simply appended to the list at that bucket. Lookup then requires iterating that short list to find the right key. In practice, with a good load factor, these chains stay length 1 or 2.

Open addressing takes a different approach: when a collision occurs, the algorithm probes for the next available bucket using a defined sequence (linear probing, quadratic probing, or double hashing). Everything stays in the same array — no external lists. This is more cache-friendly but degrades faster when the table fills up.

Java's HashMap uses separate chaining. The key takeaway: collisions don't break a hash table — they just slow it down if they pile up too much. The load factor (entries divided by bucket count) is your early warning system.

Visual Diagram: Chaining vs Open Addressing Collision Resolution

The two main strategies for handling collisions differ in how they store multiple keys that land in the same bucket. Separate chaining stores a linked list (or tree) at each bucket, so all colliding entries are kept in the same bucket. Open addressing stores all entries within the array itself — when a collision occurs, it probes forward to find the next empty slot. The diagram below contrasts these two approaches for three keys that all originally map to bucket index 2.

HashMap vs HashTable in Java — The Practical Difference That Matters

Java has two classes that often confuse people: HashMap and HashTable (the older one). They feel interchangeable but they're not, and picking the wrong one can cause subtle bugs in multi-threaded code or hurt performance in single-threaded code.

HashTable is synchronized — every method acquires a lock before executing. This makes it thread-safe but painfully slow in single-threaded applications because you're paying synchronization overhead for nothing. It was the original Java 1.0 implementation, and the API has quirks: it doesn't allow null keys or null values, throwing a NullPointerException instead.

HashMap is unsynchronized, allows one null key and multiple null values, and is significantly faster in single-threaded contexts. It's the right default choice for the vast majority of code. When you do need thread safety, don't reach for HashTable — use ConcurrentHashMap instead. ConcurrentHashMap locks only individual bucket segments rather than the entire table, giving you thread safety without the bottleneck.

LinkedHashMap is worth knowing too: it wraps HashMap with a doubly-linked list that maintains insertion order, making it perfect for LRU caches. TreeMap orders by key but sacrifices O(1) for O(log n) — use it when sorted iteration matters more than raw speed.

Why HashMap is O(1) — And When It Quietly Becomes O(n)

The O(1) average-case performance of a hash map is one of the most cited facts in computer science — and one of the most misunderstood. It's not O(1) because the computer is magic. It's O(1) because the hash function computes the exact memory address of the bucket directly, skipping all comparison. No matter how many items are in the table, we always do the same three steps: hash the key, find the bucket, return the value.

But 'average case' is doing a lot of heavy lifting in that sentence. The worst case is O(n), and it happens when all keys collide into the same bucket, turning your hash table into a linked list. This can occur accidentally with a poor hash function or maliciously in a hash-flooding denial-of-service attack (where an attacker deliberately sends inputs that collide).

Java 8 introduced an important defence: when a bucket's chain exceeds 8 entries, it converts from a linked list to a red-black tree. This caps the worst case at O(log n) per bucket instead of O(n), making hash-flooding far less damaging. This is why your HashMap's performance stays predictable under adversarial conditions — the JDK engineers thought about this so you don't have to.

The practical lesson: keep your load factor below 0.75, ensure your key objects have well-distributed hashCode() implementations, and always override equals() alongside hashCode() — they're a contract in Java, not optional companions.

C++ std::unordered_map Implementation — Same Theory, Different Syntax

While Java's HashMap is the most common example for hash maps in the JVM world, C++ developers use std::unordered_map. The core concepts are identical: a hash function maps keys to buckets, collisions are resolved via separate chaining, and a load factor triggers rehashing. But the C++ standard library gives you more control over the hash function, the bucket count, and even the hash policy.

In C++, std::unordered_map is a hash-table-based associative container. It uses a default hash function from the <functional> header, but you can specialize std::hash for your custom types or provide a custom hash functor. The container also exposes direct bucket-level APIs: bucket_count(), bucket_size(), and bucket(key) let you inspect how items are distributed.

The same pitfalls apply: a poor hash function turns O(1) into O(n). C++ doesn't automatically treeify long chains — if you have many collisions, every lookup scans the entire bucket's linked list. However, you can control the max_load_factor and potentially trigger an early rehash to mitigate the issue.

Resizing and Load Factor — When HashMap Rewrites Itself

Hash tables cannot grow indefinitely within a fixed array. When the number of entries exceeds the load factor threshold (default 0.75 in Java), the HashMap resizes: it doubles the bucket array and rehashes every existing entry into the new table. This operation is O(n) and can cause a latency spike if it happens during a critical request.

The load factor is a trade-off. A lower load factor (e.g., 0.5) means fewer collisions and faster lookups but wastes memory. A higher load factor (e.g., 1.0) saves memory but increases collision probability and slows lookups. Java settled on 0.75 as the empiricial sweet spot — a balance that keeps chains short while not wasting too much space.

You can control resizing by setting the initial capacity in the constructor: new HashMap<>(expectedSize). If you know you'll store 10,000 entries, set the initial capacity to 13,334 (10,000 / 0.75) to avoid resizing entirely. This is a common optimisation in high-throughput services where predictability matters.

Advantages and Disadvantages: Hash Table vs BST vs Array

When choosing a data structure for a key-value lookup or collection, it's important to understand the trade-offs between hash tables, balanced binary search trees, and simple arrays. The table below breaks down the key differences across several dimensions.

FeatureHash TableBalanced BST (e.g., TreeMap)Sorted Array (binary search)
Average search timeO(1)O(log n)O(log n) via binary search
Worst-case searchO(n) (without treeification)O(log n) guaranteedO(log n) guaranteed
Insert time (average)O(1)O(log n)O(n) (shift elements)
Delete time (average)O(1)O(log n)O(n)
Memory overhead per elementModerate (bucket array + node)High (left/right/parent pointers)Low (only the items)
Ordered iterationNoYes (in-order)Yes (sorted)
Range queries (e.g., all keys between 5 and 10)Not supportedEfficientEfficient after binary search
Handles duplicate keysUsually not (value updated)Usually not (value updated)Supports duplicates
Hash function required?YesNo (comparator)No (comparator)

In practice: use a hash table when you need fast single-key lookups and don't need ordering. Use a BST when you need ordered traversal or range queries. Use a sorted array when your data is static and you want minimal memory overhead.

Practice Problems to Master Hash Tables

The best way to internalize hash table usage is to solve problems that require O(1) lookups. Below are a curated set of problems, ranging from easy to hard, that test different aspects of hash map knowledge — from simple frequency counting to two-pointer plus hash strategies.

  1. Two Sum (Easy)
  2. Given an array of integers, return indices of the two numbers that add up to a target.
  3. [LeetCode](https://leetcode.com/problems/two-sum/) — Classic hash map usage: store each element's value and its index, check for complement.
  4. Valid Anagram (Easy)
  5. Given two strings, determine if they are anagrams.
  6. [LeetCode](https://leetcode.com/problems/valid-anagram/) — Use a hash map to count frequencies or a fixed-size array for lowercase letters.
  7. Longest Consecutive Sequence (Medium)
  8. Given an unsorted array of integers, find the length of the longest consecutive elements sequence.
  9. [LeetCode](https://leetcode.com/problems/longest-consecutive-sequence/) — Use a hash set to allow O(1) lookups while building sequences.
  10. Group Anagrams (Medium)
  11. Given an array of strings, group anagrams together.
  12. [LeetCode](https://leetcode.com/problems/group-anagrams/) — Use a hash map with sorted character string as key.
  13. Subarray Sum Equals K (Medium)
  14. Given an array of integers, find the total number of continuous subarrays whose sum equals k.
  15. [LeetCode](https://leetcode.com/problems/subarray-sum-equals-k/) — Use a hash map to store cumulative sum frequencies.
  16. Top K Frequent Elements (Medium)
  17. Given an array of integers, return the k most frequent elements.
  18. [LeetCode](https://leetcode.com/problems/top-k-frequent-elements/) — Use a hash map for frequency counting then a bucket sort or heap.
  19. Encode and Decode TinyURL (Medium)
  20. Design a URL shortening service.
  21. [LeetCode](https://leetcode.com/problems/encode-and-decode-tinyurl/) — Use a hash map to store long-to-short and short-to-long mappings.
  22. LRU Cache (Medium/Hard)
  23. Design a data structure that follows the LRU (Least Recently Used) cache policy.
  24. [LeetCode](https://leetcode.com/problems/lru-cache/) — Combines a hash map with a doubly-linked list for O(1) operations.
FeatureHashMapHashTableConcurrentHashMapLinkedHashMap
Thread Safe?NoYes (full lock)Yes (segment lock)No
Null Keys Allowed1 null keyNot allowedNot allowed1 null key
Null Values AllowedYesNot allowedNot allowedYes
Iteration OrderNo guaranteeNo guaranteeNo guaranteeInsertion order
Average LookupO(1)O(1)O(1)O(1)
Worst-case Lookup (Java 8+)O(log n)O(log n)O(log n)O(log n)
Best ForGeneral purpose, single-threadedLegacy code onlyMulti-threaded accessLRU caches, ordered iteration
Introduced InJava 1.2Java 1.0Java 5Java 1.4

Key Takeaways

  • A hash function converts any key into a fixed integer — the magic is that it's O(1) to compute regardless of the table's size, which is why lookups are constant time.
  • Collisions are inevitable, not failures — Java's HashMap handles them with separate chaining (linked lists that upgrade to red-black trees at 8+ entries), capping worst-case at O(log n) instead of O(n).
  • The hashCode/equals contract is sacred: equal objects must share a hash code, and breaking this causes get() to silently return null for values you definitely stored — one of the trickiest bugs to diagnose.
  • ConcurrentHashMap is the modern answer to thread-safe map access — it locks individual buckets rather than the whole table, giving you safety without the performance cliff that HashTable creates.
  • Pre-sizing a HashMap based on expected size eliminates resizing overhead and can cut insertion time in half for large datasets.

Common Mistakes to Avoid

  • Using mutable objects as HashMap keys
    Symptom: After modifying a key that is already used, get() returns null because the object's hash code changed and the bucket no longer matches.
    Fix: Use immutable key classes (String, Integer, UUID) or ensure all fields used in hashCode() are final and don't change after insertion.
  • Overriding equals() without overriding hashCode()
    Symptom: Two objects that are logically equal (via equals()) are stored as separate entries in a HashSet or as separate keys in a HashMap. contains() returns false for a value that was added.
    Fix: Always override both methods. Use Objects.hash(field1, field2) to generate a reliable hashCode() with minimal boilerplate.
  • Iterating and modifying a HashMap simultaneously
    Symptom: Calling map.remove() inside a for-each loop on entrySet() throws ConcurrentModificationException.
    Fix: Use an explicit Iterator and call iterator.remove(), or collect the keys to remove into a separate list and call map.keySet().removeAll(keysToRemove) after the loop.

Interview Questions on This Topic

  • QWalk me through what happens internally in a Java HashMap when you call put(key, value) — from the moment you pass in the arguments to the moment the value is stored.SeniorReveal
    1. The key's hashCode() is called. 2. HashMap applies bitwise spread (hash ^ (hash >>> 16)) to mix high bits. 3. The result is masked to the current bucket array length -1 to get an index. 4. If that bucket is empty, create a new node and store the entry. 5. If the bucket already contains nodes (collision), iterate the chain. For each node, check if the key matches via equals(). If found, overwrite the value and return old value. If not found, append new node at end (or insert into tree if chain > 8). 6. After insertion, increment size. If size exceeds threshold = capacity * load factor, resize the table (double capacity and rehash all entries).
  • QWhat is the difference between HashMap and ConcurrentHashMap, and why is HashTable considered a poor choice for thread safety even though it's synchronized?Mid-levelReveal
    HashMap is not thread-safe — concurrent modifications can cause infinite loops or data loss. HashTable synchronizes all methods, which creates a single lock contention point — only one thread can access any method at a time, killing scalability. ConcurrentHashMap uses a segmented lock approach (in Java 8, fine-grained CAS + synchronized on individual bin heads) allowing multiple threads to read/write different buckets concurrently. It also uses non-blocking reads. HashTable is rarely the right choice today; use ConcurrentHashMap for shared mutable state.
  • QIf two keys have the same hashCode() but are not equal according to equals(), how does the HashMap handle storing and retrieving both of them? What data structure is used internally at that bucket?SeniorReveal
    They collide to the same bucket. HashMap stores them in a linked list (or a tree if the list exceeds 8 entries). When retrieving, HashMap computes the bucket index via hash, then iterates the list (or searches the tree) using equals() to find the exact match. Both entries remain separate; they are differentiated by equals().

Frequently Asked Questions

What is the difference between a hash table and a hash map?

The terms are often used interchangeably — both describe a data structure that maps keys to values using a hash function for O(1) average access. In Java specifically, HashTable is the legacy synchronized class from Java 1.0, while HashMap is the modern unsynchronized alternative. In general computer science discussions, 'hash table' refers to the underlying data structure, and 'hash map' refers to a key-value mapping built on top of it.

Why does HashMap allow null keys but HashTable doesn't?

HashMap explicitly handles null as a special case — it always places null keys into bucket 0, bypassing the hash function entirely. HashTable was designed to call hashCode() on every key without any null check, so passing null immediately throws a NullPointerException. This is a design inconsistency from Java's early days that was corrected when HashMap was introduced in Java 1.2.

Why is the default load factor for HashMap 0.75 and not 1.0?

At a load factor of 1.0, most buckets would be occupied and collisions would become frequent, degrading average lookup time noticeably. At 0.75, the statistical probability of a collision per insertion is low enough that chains stay very short (usually 0 or 1 entries), keeping lookups genuinely close to O(1). The 0.75 value is an empirically-tested sweet spot between memory efficiency (not too many empty buckets) and collision rate (not too many full ones).

What happens when a HashMap resizes?

When the number of entries exceeds (capacity * load factor), the array doubles in size. All existing entries are rehashed — each key's hashCode is recomputed and remapped to the new array. This is an O(n) operation that can cause a latency spike. Pre-sizing the map with the correct initial capacity avoids unnecessary resizes.

Can I use a custom class as a HashMap key?

Yes, but you must override equals() and hashCode() correctly. The key should be immutable (all fields final) to prevent changes after insertion that would change the hash code and break lookups. If you must use a mutable key, ensure it is never modified once put into the map.

🔥

That's Hashing. Mark it forged?

9 min read · try the examples if you haven't

Previous
DP on Trees
1 / 11 · Hashing
Next
Hash Collisions and Resolution