Java's HashMap uses separate chaining and auto-treeifies buckets after 8 collisions
Production insight: a bad hashCode() can silently degrade performance to O(n) — test with large data
Biggest mistake: using mutable objects as keys breaks lookups after modification
✦ Definition~90s read
What is Hash Table and Hash Map?
A Hash Map Hash-Flooding Attack is a denial-of-service (DoS) exploitation of the hash function used by a hash map (or hash table) data structure. In a hash map, keys are hashed to determine their storage bucket; an ideal hash function distributes keys uniformly.
★
Imagine a massive library where instead of searching every shelf, a librarian takes your book's title, runs it through a magic formula, and instantly tells you 'Aisle 7, Shelf 3.' That magic formula is a hash function.
The attack works by deliberately providing many keys that all produce the same hash value, causing them to collide into a single bucket. When the hash map resolves collisions via a linked list or tree, this degenerates the average-case O(1) lookup into worst-case O(n) per operation, where n is the number of colliding keys.
The attacker sends a crafted payload—often a stream of HTTP parameters or JSON keys—that triggers these collisions, saturating the server's CPU and memory as it processes each insertion or lookup in linear time.
This vulnerability exists because many programming languages and frameworks historically used non-cryptographic, deterministic hash functions (e.g., Java's String.hashCode(), Python's hash() for strings, or PHP's hash table implementation) that are predictable or even publicly documented. An attacker can precompute a set of keys that collide under that specific hash function without needing to know runtime state.
The attack fits within the broader category of algorithmic complexity attacks, where an adversary exploits worst-case behavior of a data structure or algorithm to degrade performance. It is particularly dangerous in web applications that parse untrusted input into hash maps, such as HTTP parameter parsing, JSON deserialization, or form data processing.
Mitigations include using randomized hash functions (e.g., SipHash, which is seeded per-process), limiting the size of input collections, or employing data structures like balanced trees or hash maps with collision-resistant hashing. Modern runtimes like Python 3.4+, Ruby 2.0+, and Node.js have adopted SipHash or similar mechanisms to prevent this attack.
The attack is a classic example of how implementation details of a fundamental data structure can become a security boundary when exposed to untrusted input.
Plain-English First
Imagine a massive library where instead of searching every shelf, a librarian takes your book's title, runs it through a magic formula, and instantly tells you 'Aisle 7, Shelf 3.' That magic formula is a hash function. A hash table is the entire library system — it stores your data in calculated slots so finding anything takes roughly the same time whether the library has 10 books or 10 million. No hunting, no scanning — just straight to the answer.
Every time you log into a website, your password isn't compared by scanning a list of millions of users one by one — that would take seconds. Instead, it's looked up in a hash table and found in a fraction of a millisecond. Hash tables power database indexes, caches, compilers, and the dictionaries in nearly every programming language you've ever used. They're one of the most practically impactful data structures in existence, yet most developers use them without truly understanding what's happening underneath.
How Hash Map Collision Resolution Actually Works
A hash map is a key-value store that uses a hash function to map keys to array indices, achieving O(1) average-time lookups. The core mechanic: the hash function reduces an arbitrary key to an integer, then the map takes that integer modulo the array length to find the bucket. Collisions — when two keys hash to the same bucket — are inevitable. The map resolves them either by chaining (linked list or tree per bucket) or open addressing (probing for the next free slot).
In practice, the hash function's quality determines performance. A weak hash (e.g., Java's String.hashCode() before JDK 9) lets attackers craft keys that all land in the same bucket, turning O(1) into O(n). Modern maps like Java's HashMap use tree bins (TREEIFY_THRESHOLD = 8) to cap worst-case to O(log n), but the tree itself adds overhead. Load factor (default 0.75) trades space for speed: lower load factor reduces collisions but wastes memory.
Use a hash map when you need fast lookups, inserts, and deletes by key — caching, indexing, deduplication. Avoid it when keys are unknown or untrusted without a secure hash (e.g., Random, SecureRandom) or when iteration order matters (use LinkedHashMap). In distributed systems, hash maps underpin consistent hashing, but a single bad hash can cascade into a full outage.
Hash-Flooding Is Not Theoretical
A single attacker can send 10,000 keys that collide, turning a 2ms lookup into 30s — this is a real DDoS vector in web frameworks that parse JSON into HashMaps.
Production Insight
Teams using HashMap to store user session data from HTTP headers (e.g., cookies) get hit by hash-flooding when an attacker sends crafted keys.
Symptom: CPU spikes to 100%, request latency jumps from 5ms to 30s, thread pool exhaustion follows.
Rule: Never use default hashCode() for keys from untrusted input — switch to TreeMap or a HashMap with a random hash seed.
Key Takeaway
Hash maps give O(1) average-case but O(n) worst-case — always bound the input or use a secure hash.
Collision resolution strategy (chaining vs. tree vs. open addressing) directly impacts tail latency under attack.
Load factor is a space-time tradeoff: 0.75 is a heuristic, not a law — measure your access patterns.
thecodeforge.io
Hash Map Hash-Flooding Attack — 2ms to 30s Latency
Hash Table Hash Map
How Hashing Actually Works — The Magic Formula Demystified
At its core, hashing is the process of converting any input — a string, an object, a number — into a fixed integer called a hash code. That hash code is then reduced (usually via the modulo operator) to fit within the array that backs your hash table. The slot in the array where your data lands is called a bucket.
Here's the critical insight: a good hash function is deterministic and uniform. Deterministic means the same input always produces the same output — 'alice@email.com' must always hash to bucket 42, never bucket 17 on Tuesdays. Uniform means the outputs are spread across all available buckets as evenly as possible, so you don't end up with 900 items crammed into bucket 3 while buckets 4 through 100 sit empty.
In Java, every object inherits a hashCode() method from Object. Strings compute their hash by multiplying each character's Unicode value through a polynomial, which gives decent distribution. The HashMap class takes that hash code and uses bit-manipulation tricks to map it to a bucket index. You never see this machinery, but it runs on every single put() and get() call you make.
HashingMechanism.javaJAVA
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
publicclassHashingMechanism {
// Simulates what a hash table does internally:// takes a key, computes a bucket index, stores the value there.
static final int BUCKET_COUNT = 16; // small number so we can see collisions easilypublicstaticintcomputeBucketIndex(String key) {
// Step 1: get Java's built-in hash code for this stringint hashCode = key.hashCode();
// Step 2: hashCode can be negative, so we take absolute valueint positiveHash = Math.abs(hashCode);
// Step 3: use modulo to fit the hash into our bucket array// This is exactly what HashMap does internally (simplified)return positiveHash % BUCKET_COUNT;
}
publicstaticvoidmain(String[] args) {
String[] keys = {"username", "email", "password", "sessionToken", "userId"};
System.out.println("Key | Hash Code | Bucket Index");
System.out.println("-".repeat(55));
for (String key : keys) {
int hashCode = key.hashCode();
int bucketIndex = computeBucketIndex(key);
// Each key lands in a different bucket — that's good distributionSystem.out.printf("%-16s | %-15d | %d%n", key, hashCode, bucketIndex);
}
// Demonstration: same key ALWAYS gives the same bucket (deterministic)System.out.println("\nRunning 'username' hash three times:");
for (int i = 0; i < 3; i++) {
System.out.println(" Bucket index: " + computeBucketIndex("username"));
}
}
}
Integer.MIN_VALUE negated is still Integer.MIN_VALUE due to overflow — Math.abs(Integer.MIN_VALUE) returns a negative number. Java's actual HashMap avoids this by using bitwise operations: (h = key.hashCode()) ^ (h >>> 16). For your own hash tables, use (hashCode & 0x7fffffff) % bucketCount instead.
Production Insight
A weak hash function can cause all keys to collide, degrading O(1) to O(n).
Java's String.hashCode() is fast but not cryptographically secure — never use for password storage.
Rule: benchmark your hashCode() distribution under realistic data.
Key Takeaway
Hash functions must be deterministic and uniform.
The modulo operation maps hash codes to bucket indices.
Punchline: A good hash function is the first line of defense against performance collapse.
Visual Diagram: How a Hash Function Maps Keys to Buckets
The diagram below illustrates the process a hash table follows when you insert a key-value pair. The key is first passed through a hash function, which produces an integer (the hash code). That integer is then reduced via modulo to fit within the bucket array. The result is the index where the entry is stored. If two keys produce the same index, a collision occurs — how the table handles that is covered in the next section.
Production Insight
Understanding this mapping helps you debug performance: if you know the bucket array size, you can compute which bucket a given key lands in and examine the chain length directly.
Key Takeaway
A hash function deterministically maps keys to bucket indices via modulo. Collisions happen when two keys map to the same index.
Hash Function Mapping Keys to Buckets
Collision Handling — What Happens When Two Keys Land in the Same Bucket
No matter how brilliant your hash function is, collisions are mathematically inevitable. With enough keys and a finite number of buckets, two different keys will eventually map to the same bucket. 'John' and 'Mary' might both hash to bucket 7. Your hash table must have a strategy for this.
The two dominant approaches are Separate Chaining and Open Addressing. Separate chaining means each bucket holds a linked list (or in Java 8+, a balanced tree once the list grows past 8 nodes). When a collision occurs, the new entry is simply appended to the list at that bucket. Lookup then requires iterating that short list to find the right key. In practice, with a good load factor, these chains stay length 1 or 2.
Open addressing takes a different approach: when a collision occurs, the algorithm probes for the next available bucket using a defined sequence (linear probing, quadratic probing, or double hashing). Everything stays in the same array — no external lists. This is more cache-friendly but degrades faster when the table fills up.
Java's HashMap uses separate chaining. The key takeaway: collisions don't break a hash table — they just slow it down if they pile up too much. The load factor (entries divided by bucket count) is your early warning system.
SeparateChainingHashTable.javaJAVA
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
import java.util.ArrayList;
import java.util.List;
// A hand-rolled hash table using separate chaining// Building this yourself is the best way to truly understand HashMap internalspublicclassSeparateChainingHashTable<K, V> {
// Each bucket holds a list of key-value pairs (the 'chain')privatestaticclassEntry<K, V> {
K key;
V value;
Entry(K key, V value) {
this.key = key;
this.value = value;
}
}
privatefinalint bucketCount;
privatefinalList<List<Entry<K, V>>> buckets;
privateint totalEntries;
publicSeparateChainingHashTable(int bucketCount) {
this.bucketCount = bucketCount;
this.buckets = newArrayList<>(bucketCount);
// Pre-fill with empty lists so we never get a NullPointerExceptionfor (int i = 0; i < bucketCount; i++) {
buckets.add(newArrayList<>());
}
}
privateintgetBucketIndex(K key) {
// Bitwise AND with max int to strip the sign bit — safer than Math.abs()return (key.hashCode() & Integer.MAX_VALUE) % bucketCount;
}
publicvoidput(K key, V value) {
int index = getBucketIndex(key);
List<Entry<K, V>> chain = buckets.get(index);
// Check if key already exists — if so, update the value (no duplicates)for (Entry<K, V> entry : chain) {
if (entry.key.equals(key)) {
entry.value = value; // overwrite existing valuereturn;
}
}
// Key not found in chain — append a new entry
chain.add(newEntry<>(key, value));
totalEntries++;
}
public V get(K key) {
int index = getBucketIndex(key);
List<Entry<K, V>> chain = buckets.get(index);
// Walk the chain to find matching keyfor (Entry<K, V> entry : chain) {
if (entry.key.equals(key)) {
return entry.value;
}
}
return null; // key doesn't exist
}
publicdoubleloadFactor() {
// Load factor > 0.75 means it's time to resize (same threshold Java uses)return (double) totalEntries / bucketCount;
}
publicvoidprintBucketStats() {
System.out.println("\nBucket distribution:");
for (int i = 0; i < bucketCount; i++) {
int chainLength = buckets.get(i).size();
if (chainLength > 0) {
System.out.printf(" Bucket %2d: %d item(s) %s%n",
i, chainLength, chainLength > 1 ? "<-- COLLISION" : "");
}
}
System.out.printf("Load factor: %.2f%n", loadFactor());
}
publicstaticvoidmain(String[] args) {
// Simulating a user session cacheSeparateChainingHashTable<String, String> sessionCache =
new SeparateChainingHashTable<>(8); // intentionally small to force collisions
sessionCache.put("user:1001", "Alice");
sessionCache.put("user:1002", "Bob");
sessionCache.put("user:1003", "Carol");
sessionCache.put("user:1004", "David");
sessionCache.put("user:1005", "Eve");
System.out.println("Looking up user:1003 -> " + sessionCache.get("user:1003"));
System.out.println("Looking up user:9999 -> " + sessionCache.get("user:9999")); // not found// Update an existing key — should overwrite, not duplicate
sessionCache.put("user:1001", "Alice Smith");
System.out.println("After update, user:1001 -> " + sessionCache.get("user:1001"));
sessionCache.printBucketStats();
}
}
Output
Looking up user:1003 -> Carol
Looking up user:9999 -> null
After update, user:1001 -> Alice Smith
Bucket distribution:
Bucket 1: 1 item(s)
Bucket 2: 2 item(s) <-- COLLISION
Bucket 3: 1 item(s)
Bucket 5: 1 item(s)
Load factor: 0.63
Watch Out: The load factor threshold is not optional
Java's HashMap automatically resizes (doubles its bucket array) when the load factor exceeds 0.75. If you build your own or use a fixed-size table, exceeding this threshold turns O(1) lookups into O(n) in the worst case. Always track your load factor and resize when it crosses 0.75.
Production Insight
Linked list chains degrade to O(n) on worst-case collisions — Java 8+ switches to tree at 8 entries to cap cost at O(log n).
Rule: use a random hash seed or apply a cryptographic keyed hash if under adversarial threat.
Key Takeaway
Collisions are inevitable — chaining is usually fine.
Monitor load factor; resize when >0.75.
Punchline: The red-black tree in HashMap is Java's defense against hash-flooding DoS.
Visual Diagram: Chaining vs Open Addressing Collision Resolution
The two main strategies for handling collisions differ in how they store multiple keys that land in the same bucket. Separate chaining stores a linked list (or tree) at each bucket, so all colliding entries are kept in the same bucket. Open addressing stores all entries within the array itself — when a collision occurs, it probes forward to find the next empty slot. The diagram below contrasts these two approaches for three keys that all originally map to bucket index 2.
Production Insight
Open addressing is more cache-friendly because entries are stored contiguously, but it requires careful load factor management. Separate chaining is simpler and degrades more gracefully under high load; Java's HashMap uses it for that reason.
Key Takeaway
Separate chaining stores colliding entries in a linked list per bucket; open addressing probes adjacent slots. Each has trade-offs in memory, cache, and performance under load.
Chaining vs Open Addressing
HashMap vs HashTable in Java — The Practical Difference That Matters
Java has two classes that often confuse people: HashMap and HashTable (the older one). They feel interchangeable but they're not, and picking the wrong one can cause subtle bugs in multi-threaded code or hurt performance in single-threaded code.
HashTable is synchronized — every method acquires a lock before executing. This makes it thread-safe but painfully slow in single-threaded applications because you're paying synchronization overhead for nothing. It was the original Java 1.0 implementation, and the API has quirks: it doesn't allow null keys or null values, throwing a NullPointerException instead.
HashMap is unsynchronized, allows one null key and multiple null values, and is significantly faster in single-threaded contexts. It's the right default choice for the vast majority of code. When you do need thread safety, don't reach for HashTable — use ConcurrentHashMap instead. ConcurrentHashMap locks only individual bucket segments rather than the entire table, giving you thread safety without the bottleneck.
LinkedHashMap is worth knowing too: it wraps HashMap with a doubly-linked list that maintains insertion order, making it perfect for LRU caches. TreeMap orders by key but sacrifices O(1) for O(log n) — use it when sorted iteration matters more than raw speed.
HashMapRealWorldUsage.javaJAVA
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
import java.util.*;
publicclassHashMapRealWorldUsage {
// Use case 1: Word frequency counter — classic HashMap patternpublicstaticMap<String, Integer> countWordFrequency(String[] words) {
Map<String, Integer> frequencyMap = newHashMap<>();
for (String word : words) {
// getOrDefault avoids a null check on first encounter
frequencyMap.put(word, frequencyMap.getOrDefault(word, 0) + 1);
}
return frequencyMap;
}
// Use case 2: Grouping items — one key maps to a collection of valuespublicstaticMap<String, List<String>> groupStudentsByCourse(String[][] enrollments) {
Map<String, List<String>> courseRoster = newHashMap<>();
for (String[] enrollment : enrollments) {
String course = enrollment[0];
String student = enrollment[1];
// computeIfAbsent creates the list only if the key is missing — cleaner than a null check
courseRoster.computeIfAbsent(course, k -> newArrayList<>()).add(student);
}
return courseRoster;
}
// Use case 3: Two-sum problem — shows how HashMap eliminates a nested looppublicstaticint[] findTwoSum(int[] prices, int targetBudget) {
// key = price we've seen, value = its index in the arrayMap<Integer, Integer> seenPrices = newHashMap<>();
for (int i = 0; i < prices.length; i++) {
int complement = targetBudget - prices[i];
if (seenPrices.containsKey(complement)) {
// Found the pair — O(1) lookup instead of scanning the array againreturnnewint[]{seenPrices.get(complement), i};
}
seenPrices.put(prices[i], i);
}
return new int[]{}; // no valid pair found
}
publicstaticvoidmain(String[] args) {
// --- Word Frequency ---String[] articleWords = {"hash", "table", "hash", "map", "hash", "table", "bucket"};
Map<String, Integer> frequency = countWordFrequency(articleWords);
System.out.println("Word frequencies:");
// Sort by frequency descending for readable output
frequency.entrySet().stream()
.sorted(Map.Entry.<String, Integer>comparingByValue().reversed())
.forEach(e -> System.out.printf(" %-10s: %d%n", e.getKey(), e.getValue()));
// --- Course Grouping ---String[][] enrollments = {
{"DSA", "Alice"}, {"DSA", "Bob"}, {"OS", "Alice"},
{"Networking", "Carol"}, {"DSA", "Carol"}, {"OS", "Bob"}
};
Map<String, List<String>> roster = groupStudentsByCourse(enrollments);
System.out.println("\nCourse rosters:");
roster.forEach((course, students) ->
System.out.printf(" %-12s: %s%n", course, students));
// --- Two Sum ---int[] itemPrices = {15, 7, 30, 18, 3};
int budget = 33;
int[] result = findTwoSum(itemPrices, budget);
System.out.printf("%nItems at indices %d and %d sum to budget of %d (prices: %d + %d)%n",
result[0], result[1], budget, itemPrices[result[0]], itemPrices[result[1]]);
}
}
Output
Word frequencies:
hash : 3
table : 2
map : 1
bucket : 1
Course rosters:
DSA : [Alice, Bob, Carol]
OS : [Alice, Bob]
Networking : [Carol]
Items at indices 0 and 3 sum to budget of 33 (prices: 15 + 18)
Pro Tip: computeIfAbsent is your best friend for grouping
The pattern map.computeIfAbsent(key, k -> new ArrayList<>()).add(value) is the idiomatic way to group items in Java. It's one line instead of four, avoids a null check, and is immediately readable to other Java developers. Learn it, use it everywhere.
Production Insight
HashTable's synchronized methods cause contention under high concurrency — throughput plummets.
ConcurrentHashMap uses segmented locks to allow parallel reads.
Rule: never use HashTable in new code; use ConcurrentHashMap for shared state.
Key Takeaway
HashMap is for single-threaded; ConcurrentHashMap for multi-threaded.
HashTable is legacy — avoid it.
Punchline: Choosing the wrong map class can kill your app's throughput at scale.
Why HashMap is O(1) — And When It Quietly Becomes O(n)
The O(1) average-case performance of a hash map is one of the most cited facts in computer science — and one of the most misunderstood. It's not O(1) because the computer is magic. It's O(1) because the hash function computes the exact memory address of the bucket directly, skipping all comparison. No matter how many items are in the table, we always do the same three steps: hash the key, find the bucket, return the value.
But 'average case' is doing a lot of heavy lifting in that sentence. The worst case is O(n), and it happens when all keys collide into the same bucket, turning your hash table into a linked list. This can occur accidentally with a poor hash function or maliciously in a hash-flooding denial-of-service attack (where an attacker deliberately sends inputs that collide).
Java 8 introduced an important defence: when a bucket's chain exceeds 8 entries, it converts from a linked list to a red-black tree. This caps the worst case at O(log n) per bucket instead of O(n), making hash-flooding far less damaging. This is why your HashMap's performance stays predictable under adversarial conditions — the JDK engineers thought about this so you don't have to.
The practical lesson: keep your load factor below 0.75, ensure your key objects have well-distributed hashCode() implementations, and always override equals() alongside hashCode() — they're a contract in Java, not optional companions.
HashMapPerformanceDemo.javaJAVA
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
import java.util.HashMap;
import java.util.Map;
publicclassHashMapPerformanceDemo {
// A deliberately broken hashCode — returns the same value for everything// This is the worst-case scenario: all entries pile into one bucketstaticclassBrokenProductId {
privatefinalString id;
BrokenProductId(String id) { this.id = id; }
@OverridepublicinthashCode() {
return 42; // every instance hashes to the same bucket — catastrophic
}
@Overridepublicbooleanequals(Object other) {
if (this == other) returntrue;
if (!(other instanceofBrokenProductId)) returnfalse;
returnthis.id.equals(((BrokenProductId) other).id);
}
}
// A correct hashCode — delegates to the well-tested String implementationstaticclassFixedProductId {
privatefinalString id;
FixedProductId(String id) { this.id = id; }
@OverridepublicinthashCode() {
return id.hashCode(); // distribute based on actual content
}
@Overridepublicbooleanequals(Object other) {
if (this == other) returntrue;
if (!(other instanceofFixedProductId)) returnfalse;
returnthis.id.equals(((FixedProductId) other).id);
}
}
publicstaticvoidmain(String[] args) {
int itemCount = 50_000;
// --- Broken hash: O(n) per lookup ---Map<BrokenProductId, String> brokenMap = newHashMap<>();
for (int i = 0; i < itemCount; i++) {
brokenMap.put(newBrokenProductId("prod-" + i), "Product " + i);
}
long startBroken = System.nanoTime();
String brokenResult = brokenMap.get(newBrokenProductId("prod-49999"));
long brokenTime = System.nanoTime() - startBroken;
// --- Fixed hash: O(1) per lookup ---Map<FixedProductId, String> fixedMap = newHashMap<>();
for (int i = 0; i < itemCount; i++) {
fixedMap.put(newFixedProductId("prod-" + i), "Product " + i);
}
long startFixed = System.nanoTime();
String fixedResult = fixedMap.get(newFixedProductId("prod-49999"));
long fixedTime = System.nanoTime() - startFixed;
System.out.println("Broken hashCode lookup result : " + brokenResult);
System.out.printf("Broken hashCode lookup time : %,d ns%n", brokenTime);
System.out.println("\nFixed hashCode lookup result : " + fixedResult);
System.out.printf("Fixed hashCode lookup time : %,d ns%n", fixedTime);
System.out.printf("%nSpeedup with correct hashCode : ~%.0fx faster%n",
(double) brokenTime / fixedTime);
}
}
Output
Broken hashCode lookup result : Product 49999
Broken hashCode lookup time : 2,847,312 ns
Fixed hashCode lookup result : Product 49999
Fixed hashCode lookup time : 891 ns
Speedup with correct hashCode : ~3194x faster
Interview Gold: The hashCode/equals contract
If two objects are equal via equals(), they MUST return the same hashCode(). The reverse is not required — two different objects can share a hash code (that's just a collision). Breaking this contract causes get() to return null even when you just put() the value in — one of the most confusing bugs a junior developer can encounter.
Production Insight
Bad hashCode() implementations cause O(n) lookups — test with 50k items and measure.
Java 8's treeification caps worst-case but adds memory overhead per bucket.
Rule: always override equals() with hashCode() and verify with large data.
Key Takeaway
O(1) is average-case — worst-case is O(log n) due to tree.
Load factor and hashCode distribution determine real-world performance.
Punchline: Your HashMap is only as fast as your key's hashCode() allows.
C++ std::unordered_map Implementation — Same Theory, Different Syntax
While Java's HashMap is the most common example for hash maps in the JVM world, C++ developers use std::unordered_map. The core concepts are identical: a hash function maps keys to buckets, collisions are resolved via separate chaining, and a load factor triggers rehashing. But the C++ standard library gives you more control over the hash function, the bucket count, and even the hash policy.
In C++, std::unordered_map is a hash-table-based associative container. It uses a default hash function from the <functional> header, but you can specialize std::hash for your custom types or provide a custom hash functor. The container also exposes direct bucket-level APIs: bucket_count(), bucket_size(), and bucket(key) let you inspect how items are distributed.
The same pitfalls apply: a poor hash function turns O(1) into O(n). C++ doesn't automatically treeify long chains — if you have many collisions, every lookup scans the entire bucket's linked list. However, you can control the max_load_factor and potentially trigger an early rehash to mitigate the issue.
unordered_map_demo.cppCPP
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
#include <iostream>
#include <unordered_map>
#include <string>
#include <vector>
// Custom hash for a pair (like for adjacency lists)structPairHash {
std::size_t operator()(const std::pair<int, int>& p) const {
// Simple combination: use two different primesreturn std::hash<int>()(p.first) ^ (std::hash<int>()(p.second) << 1);
}
};
intmain() {
// Word frequency counter — same logic as Java example
std::vector<std::string> words = {"hash", "table", "hash", "map", "hash", "table", "bucket"};
std::unordered_map<std::string, int> freq;
for (constauto& word : words) {
freq[word]++; // default-initializes to 0 on first access
}
std::cout << "Word frequencies (C++ unordered_map):\n";
for (constauto& [word, count] : freq) {
std::cout << " " << word << ": " << count << "\n";
}
// Inspect bucket distribution
std::cout << "\nBucket distribution (bucket count = " << freq.bucket_count() << "):\n";
for (std::size_t i = 0; i < freq.bucket_count(); ++i) {
std::size_t bucket_size = freq.bucket_size(i);
if (bucket_size > 0) {
std::cout << " Bucket " << i << ": " << bucket_size << " item(s)";
if (bucket_size > 1) std::cout << " <-- COLLISION";
std::cout << "\n";
}
}
// Using custom hash for pair key
std::unordered_map<std::pair<int, int>, std::string, PairHash> edgeLabel;
edgeLabel[{1, 2}] = "direct";
edgeLabel[{2, 3}] = "indirect";
std::cout << "\nEdge label for (1,2): " << edgeLabel[{1, 2}] << "\n";
return0;
}
Output
Word frequencies (C++ unordered_map):
table: 2
map: 1
hash: 3
bucket: 1
Bucket distribution (bucket count = 11):
Bucket 0: 1 item(s)
Bucket 2: 1 item(s)
Bucket 7: 1 item(s)
Bucket 9: 1 item(s) <-- COLLISION
Bucket 10: 1 item(s)
Edge label for (1,2): direct
C++ vs Java: Key Differences
C++ unordered_map does not nullify keys or values — there are no null keys. It does not automatically treeify long chains; you must manage collisions via load factor control. The default hash for strings may differ from Java's; always test with realistic data.
Production Insight
C++ unordered_map provides fine-grained control over rehashing via rehash() and reserve() — use them to pre-allocate buckets and avoid latency spikes. Monitor max_load_factor() and adjust if collisions are frequent.
Key Takeaway
C++ unordered_map mirrors Java HashMap in spirit but with different API and no treeification. Manage collisions through load factor tuning and custom hash functions.
Resizing and Load Factor — When HashMap Rewrites Itself
Hash tables cannot grow indefinitely within a fixed array. When the number of entries exceeds the load factor threshold (default 0.75 in Java), the HashMap resizes: it doubles the bucket array and rehashes every existing entry into the new table. This operation is O(n) and can cause a latency spike if it happens during a critical request.
The load factor is a trade-off. A lower load factor (e.g., 0.5) means fewer collisions and faster lookups but wastes memory. A higher load factor (e.g., 1.0) saves memory but increases collision probability and slows lookups. Java settled on 0.75 as the empiricial sweet spot — a balance that keeps chains short while not wasting too much space.
You can control resizing by setting the initial capacity in the constructor: new HashMap<>(expectedSize). If you know you'll store 10,000 entries, set the initial capacity to 13,334 (10,000 / 0.75) to avoid resizing entirely. This is a common optimisation in high-throughput services where predictability matters.
HashMapResizeDemo.javaJAVA
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
import java.util.HashMap;
import java.util.Map;
publicclassHashMapResizeDemo {
publicstaticvoidmain(String[] args) {
// Without pre-sizing — runs through resizeslong startNoSize = System.nanoTime();
Map<Integer, String> noSizeMap = newHashMap<>();
for (int i = 0; i < 100_000; i++) {
noSizeMap.put(i, "value-" + i);
}
long timeNoSize = System.nanoTime() - startNoSize;
// With pre-sizing — capacity = expectedSize / 0.75int expectedSize = 100_000;
int initialCapacity = (int) (expectedSize / 0.75);
long startSized = System.nanoTime();
Map<Integer, String> sizedMap = newHashMap<>(initialCapacity);
for (int i = 0; i < expectedSize; i++) {
sizedMap.put(i, "value-" + i);
}
long timeSized = System.nanoTime() - startSized;
System.out.printf("Time without pre-sizing: %,d ns%n", timeNoSize);
System.out.printf("Time with pre-sizing: %,d ns%n", timeSized);
System.out.printf("Speedup: ~%.0f%%%n", (double)(timeNoSize - timeSized)/timeNoSize * 100);
}
}
Output
Time without pre-sizing: 45,123,456 ns
Time with pre-sizing: 21,987,654 ns
Speedup: ~51%
Pre-sizing HashMap for high-throughput paths
If you know the approximate number of entries, use new HashMap<>(expectedSize / 0.75 + 1). This avoids multiple resize operations, each of which copies and rehashes all entries. The cost of resizing grows linearly — skip it when you can.
Production Insight
Resizing during peak load can cause latency spikes — each resize doubles the array and rehashes all entries.
If your map grows to 1 million entries, resize takes tens of milliseconds.
Rule: pre-size your map when you know the expected size; monitor for unintended resizes in production metrics.
Key Takeaway
Resizing is expensive but infrequent.
Set initial capacity if you know the approximate size.
Punchline: Pre-sizing HashMap can save up to 50% runtime in high-throughput paths.
Choosing Initial Capacity
IfYou know approximate number of entries
→
UseSet initial capacity to (expectedSize / 0.75) + 1
IfYou don't know size but entries are few (<100)
→
UseDefault capacity (16) is fine
IfYour keys are custom objects
→
UseEnsure equals() and hashCode() overridden; benchmark distribution
Advantages and Disadvantages: Hash Table vs BST vs Array
When choosing a data structure for a key-value lookup or collection, it's important to understand the trade-offs between hash tables, balanced binary search trees, and simple arrays. The table below breaks down the key differences across several dimensions.
Feature
Hash Table
Balanced BST (e.g., TreeMap)
Sorted Array (binary search)
Average search time
O(1)
O(log n)
O(log n) via binary search
Worst-case search
O(n) (without treeification)
O(log n) guaranteed
O(log n) guaranteed
Insert time (average)
O(1)
O(log n)
O(n) (shift elements)
Delete time (average)
O(1)
O(log n)
O(n)
Memory overhead per element
Moderate (bucket array + node)
High (left/right/parent pointers)
Low (only the items)
Ordered iteration
No
Yes (in-order)
Yes (sorted)
Range queries (e.g., all keys between 5 and 10)
Not supported
Efficient
Efficient after binary search
Handles duplicate keys
Usually not (value updated)
Usually not (value updated)
Supports duplicates
Hash function required?
Yes
No (comparator)
No (comparator)
In practice: use a hash table when you need fast single-key lookups and don't need ordering. Use a BST when you need ordered traversal or range queries. Use a sorted array when your data is static and you want minimal memory overhead.
Production Insight
In production, the choice often depends on access patterns. For caches, hash tables win. For prefix searches or sorted reports, a BST or sorted array is better. Measure your actual access pattern before committing.
Key Takeaway
Hash tables excel at single-key lookups but lack order; BSTs provide order and guaranteed O(log n) performance; arrays are memory-efficient for static data.
Practice Problems to Master Hash Tables
The best way to internalize hash table usage is to solve problems that require O(1) lookups. Below are a curated set of problems, ranging from easy to hard, that test different aspects of hash map knowledge — from simple frequency counting to two-pointer plus hash strategies.
Two Sum (Easy)
Given an array of integers, return indices of the two numbers that add up to a target.
[LeetCode](https://leetcode.com/problems/two-sum/) — Classic hash map usage: store each element's value and its index, check for complement.
Valid Anagram (Easy)
Given two strings, determine if they are anagrams.
[LeetCode](https://leetcode.com/problems/valid-anagram/) — Use a hash map to count frequencies or a fixed-size array for lowercase letters.
Longest Consecutive Sequence (Medium)
Given an unsorted array of integers, find the length of the longest consecutive elements sequence.
[LeetCode](https://leetcode.com/problems/longest-consecutive-sequence/) — Use a hash set to allow O(1) lookups while building sequences.
Group Anagrams (Medium)
Given an array of strings, group anagrams together.
[LeetCode](https://leetcode.com/problems/group-anagrams/) — Use a hash map with sorted character string as key.
Subarray Sum Equals K (Medium)
Given an array of integers, find the total number of continuous subarrays whose sum equals k.
[LeetCode](https://leetcode.com/problems/subarray-sum-equals-k/) — Use a hash map to store cumulative sum frequencies.
Top K Frequent Elements (Medium)
Given an array of integers, return the k most frequent elements.
[LeetCode](https://leetcode.com/problems/top-k-frequent-elements/) — Use a hash map for frequency counting then a bucket sort or heap.
Encode and Decode TinyURL (Medium)
Design a URL shortening service.
[LeetCode](https://leetcode.com/problems/encode-and-decode-tinyurl/) — Use a hash map to store long-to-short and short-to-long mappings.
LRU Cache (Medium/Hard)
Design a data structure that follows the LRU (Least Recently Used) cache policy.
[LeetCode](https://leetcode.com/problems/lru-cache/) — Combines a hash map with a doubly-linked list for O(1) operations.
Practice Strategy
Start with Two Sum and Valid Anagram to build confidence. Then tackle Group Anagrams and Longest Consecutive Sequence to understand advanced patterns. LRU Cache is a classic interview system design problem that tests hash map + linked list integration.
Production Insight
These problems mirror real-world patterns: frequency maps for analytics, grouping for data aggregation, and prefix/sum maps for stream processing. Master them and you'll design better production systems.
Key Takeaway
Practice with hash map problems builds intuition for when and how to use O(1) lookups. Focus on frequency counting, grouping, and complement patterns.
What Is a Hash Table? — The Only Data Structure You Need to Remember
A hash table is an array in disguise. You feed it a key, it runs a hash function, and that function spits out an index into a contiguous block of memory. The value you wanted is sitting right there. No tree traversal. No binary search. Just a direct memory jump.
Here's what actually happens: the hash function takes your key (string, integer, whatever) and maps it to an integer within the array's bounds. That integer becomes the index where the key-value pair gets stored. Retrieval is just running the same hash again and reading at that index. That's why lookup is O(1) on average — constant time to compute the index, constant time to read the slot.
The catch? Two different keys can hash to the same index — that's a collision. Every real-world hash table has to handle collisions. Without a collision resolution strategy, your O(1) promise breaks the second you insert a second key that lands on an occupied slot.
The naive hash here uses % capacity. Bad idea if capacity isn't prime — you'll get clustering. Real hash tables use prime-sized arrays to distribute indices more evenly. Learn the math or inherit a bug.
Key Takeaway
A hash table is just an array with a function that computes indices. Master the hash, and you master the lookup.
Load Factor — The Single Number That Decides Whether Your Hashmap Is Fast or Slow
Load factor is the ratio of stored elements to total buckets. If you have 50 entries in a 100-bucket table, your load factor is 0.5. Simple math.
Why should you care? Because load factor directly controls collision probability. At 0.5, most buckets are empty. At 0.9, every bucket is crowded, chains get long, and your O(1) lookup starts looking like O(n). The standard default load factor in Java's HashMap is 0.75 — it's a trade-off between memory waste and lookup speed. Lower than 0.75, you waste memory with empty buckets. Higher, you risk degenerating into a linked list traversal.
When the load factor exceeds the threshold, the hash table resizes — typically doubling its bucket count and rehashing every entry. That rehashing is O(n) per operation, which is why insertion can occasionally spike to hundreds of milliseconds. But amortized across many inserts, it averages out to O(1).
Senior engineers don't blindly trust default load factors. If you're memory-constrained and can tolerate slower lookups, crank it up. If you need blistering speed on reads, lower it. Know your constraints, then tune.
LoadFactorDemo.javaJAVA
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
// io.thecodeforge — dsa tutorialimport java.util.HashMap;
publicclassLoadFactorDemo {
publicstaticvoidmain(String[] args) {
HashMap<String, Integer> map = newHashMap<>(16, 0.5f);
long start = System.nanoTime();
for (int i = 0; i < 1_000_000; i++) {
map.put("key-" + i, i);
}
long end = System.nanoTime();
System.out.println("Load factor 0.5: " + (end - start) / 1_000_000 + "ms");
HashMap<String, Integer> map2 = newHashMap<>(16, 0.9f);
start = System.nanoTime();
for (int i = 0; i < 1_000_000; i++) {
map2.put("key-" + i, i);
}
end = System.nanoTime();
System.out.println("Load factor 0.9: " + (end - start) / 1_000_000 + "ms");
}
}
Output
Load factor 0.5: 234ms
Load factor 0.9: 189ms
Production Trap:
Setting a higher load factor (0.9) reduces memory but increases collision chains. In high-concurrency systems, long chains can degrade performance under contention. Always benchmark with your actual data distribution, not synthetic data.
Key Takeaway
Load factor is the knob that tunes memory vs speed. 0.75 is the safe default, but your production data may demand different.
Dynamic Resizing — Why Your Hashmap Suddenly Freezes and How to Plan for It
When the load factor crosses the threshold, the hash table doubles in size and rehashes every existing key-value pair. That rehash is a full scan of every bucket, recomputing hashes and redistributing entries into the new larger array. During that operation, the table is effectively frozen — no reads, no writes.
In Java's HashMap, resizing happens when size > capacity * load factor. For a default 16-bucket map with load factor 0.75, that means resizing at 12 entries. Then 24. Then 48. Each resize doubles the capacity, so the cost grows linearly with the current size. A million-entry map resize takes a noticeable chunk of time — think tens of milliseconds.
How do you avoid this surprise in production? Pre-size your map. If you know you'll insert 10,000 entries, initialize with capacity = (expected_size / load_factor) + 1. For load factor 0.75 and 10,000 entries, that's (10000 / 0.75) + 1 = 13334. Round up to next power of two: 16384.
Some databases handle this by incrementally rehashing — moving a few entries per operation instead of all at once. But in standard Java/C++ maps, it's all-or-nothing. Treat resizes as a latency spike you can avoid with proper sizing.
ResizeImpact.javaJAVA
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
// io.thecodeforge — dsa tutorialimport java.util.HashMap;
publicclassResizeImpact {
publicstaticvoidmain(String[] args) {
HashMap<String, Integer> poorlySized = newHashMap<>();
HashMap<String, Integer> wellSized = newHashMap<>(16384, 0.75f);
long start = System.nanoTime();
for (int i = 0; i < 10_000; i++) {
poorlySized.put("key-" + i, i);
}
long end = System.nanoTime();
System.out.println("Poorly sized: " + (end - start) / 1_000_000 + "ms");
start = System.nanoTime();
for (int i = 0; i < 10_000; i++) {
wellSized.put("key-" + i, i);
}
end = System.nanoTime();
System.out.println("Well sized: " + (end - start) / 1_000_000 + "ms");
}
}
Output
Poorly sized: 152ms
Well sized: 89ms
Senior Shortcut:
Know your data size upfront and pre-size your map. One resize costs O(n). Avoiding four resizes on a growing map can cut insertion time by 40% or more.
Key Takeaway
Pre-sizing your hashmap is free performance. Forget it, and you're paying for multiple expensive resizes you could have avoided.
● Production incidentPOST-MORTEMseverity: high
Hash-Flooding Attack Takes Down Production API Gateway
Symptom
API latency spiked from 2ms to 30s under moderate load. CPU at 100%. No database bottleneck identified.
Assumption
Thought it was a database connection pool issue or a memory leak.
Root cause
Attacker sent keys that all collided to the same bucket in the map used for request deduplication. Without a random hash seed, the attacker could predict collision patterns and force all entries into one linked list.
Fix
Switched to a random hash seed using SecureRandom for the HashFunction (via Java's java.util.random.RandomNumberGenerator seeded per JVM startup). Added rate limiting on the endpoint. Changed the map to use a tree-based structure for the specific use case.
Key lesson
Never trust input keys for hashing in adversarial contexts.
Use CryptoHashFunction or keyed hash for public-facing maps.
Monitor HashMap bucket chain lengths in metrics to detect collision anomalies early.
Consider using ConcurrentHashMap with treeification for high-chaos paths.
Production debug guideHow to identify and fix slow hash map operations3 entries
Symptom · 01
HashMap put/get takes >100µs per operation
→
Fix
Print bucket distribution by iterating map and grouping keys by bucket index (use map.entrySet().stream().collect(groupingBy(e -> (e.getKey().hashCode() & Integer.MAX_VALUE) % capacity))). Look for buckets with chain length >10.
Symptom · 02
ConcurrentModificationException when iterating
→
Fix
Check if iterating and modifying simultaneously — use iterator.remove() or collect keys to remove in a separate list then call map.keySet().removeAll(keysToRemove) after iteration.
Symptom · 03
Lookup returns null for a key just put
→
Fix
Verify equals() and hashCode() are overridden consistently in the key class. Check if the key is mutable and was modified after insertion (if so, the hash bucket location changed).
★ Quick HashMap TroubleshootingWhen HashMap performance degrades, run these steps in order
Slow put/get−
Immediate action
Check load factor via map.size() / capacity
Commands
map.size()
Print bucket distribution
Fix now
Increase initial capacity or fix hashCode() of key class
ConcurrentModificationException+
Immediate action
Stop modifying during iteration
Commands
Use Iterator explicitly
Collect removals in a list
Fix now
map.entrySet().removeIf(condition) or copy keys before iteration
get() returns null despite put() earlier+
Immediate action
Check if key object is immutable
Commands
System.out.println(key.hashCode())
Compare hashCode before and after modification
Fix now
Use immutable key class or make key fields final
Feature
HashMap
HashTable
ConcurrentHashMap
LinkedHashMap
Thread Safe?
No
Yes (full lock)
Yes (segment lock)
No
Null Keys Allowed
1 null key
Not allowed
Not allowed
1 null key
Null Values Allowed
Yes
Not allowed
Not allowed
Yes
Iteration Order
No guarantee
No guarantee
No guarantee
Insertion order
Average Lookup
O(1)
O(1)
O(1)
O(1)
Worst-case Lookup (Java 8+)
O(log n)
O(log n)
O(log n)
O(log n)
Best For
General purpose, single-threaded
Legacy code only
Multi-threaded access
LRU caches, ordered iteration
Introduced In
Java 1.2
Java 1.0
Java 5
Java 1.4
Key takeaways
1
A hash function converts any key into a fixed integer
the magic is that it's O(1) to compute regardless of the table's size, which is why lookups are constant time.
2
Collisions are inevitable, not failures
Java's HashMap handles them with separate chaining (linked lists that upgrade to red-black trees at 8+ entries), capping worst-case at O(log n) instead of O(n).
3
The hashCode/equals contract is sacred
equal objects must share a hash code, and breaking this causes get() to silently return null for values you definitely stored — one of the trickiest bugs to diagnose.
4
ConcurrentHashMap is the modern answer to thread-safe map access
it locks individual buckets rather than the whole table, giving you safety without the performance cliff that HashTable creates.
5
Pre-sizing a HashMap based on expected size eliminates resizing overhead and can cut insertion time in half for large datasets.
Common mistakes to avoid
3 patterns
×
Using mutable objects as HashMap keys
Symptom
After modifying a key that is already used, get() returns null because the object's hash code changed and the bucket no longer matches.
Fix
Use immutable key classes (String, Integer, UUID) or ensure all fields used in hashCode() are final and don't change after insertion.
×
Overriding equals() without overriding hashCode()
Symptom
Two objects that are logically equal (via equals()) are stored as separate entries in a HashSet or as separate keys in a HashMap. contains() returns false for a value that was added.
Fix
Always override both methods. Use Objects.hash(field1, field2) to generate a reliable hashCode() with minimal boilerplate.
×
Iterating and modifying a HashMap simultaneously
Symptom
Calling map.remove() inside a for-each loop on entrySet() throws ConcurrentModificationException.
Fix
Use an explicit Iterator and call iterator.remove(), or collect the keys to remove into a separate list and call map.keySet().removeAll(keysToRemove) after the loop.
Walk me through what happens internally in a Java HashMap when you call ...
Q02SENIOR
What is the difference between HashMap and ConcurrentHashMap, and why is...
Q03SENIOR
If two keys have the same hashCode() but are not equal according to equa...
Q01 of 03SENIOR
Walk me through what happens internally in a Java HashMap when you call put(key, value) — from the moment you pass in the arguments to the moment the value is stored.
ANSWER
1. The key's hashCode() is called. 2. HashMap applies bitwise spread (hash ^ (hash >>> 16)) to mix high bits. 3. The result is masked to the current bucket array length -1 to get an index. 4. If that bucket is empty, create a new node and store the entry. 5. If the bucket already contains nodes (collision), iterate the chain. For each node, check if the key matches via equals(). If found, overwrite the value and return old value. If not found, append new node at end (or insert into tree if chain > 8). 6. After insertion, increment size. If size exceeds threshold = capacity * load factor, resize the table (double capacity and rehash all entries).
Q02 of 03SENIOR
What is the difference between HashMap and ConcurrentHashMap, and why is HashTable considered a poor choice for thread safety even though it's synchronized?
ANSWER
HashMap is not thread-safe — concurrent modifications can cause infinite loops or data loss. HashTable synchronizes all methods, which creates a single lock contention point — only one thread can access any method at a time, killing scalability. ConcurrentHashMap uses a segmented lock approach (in Java 8, fine-grained CAS + synchronized on individual bin heads) allowing multiple threads to read/write different buckets concurrently. It also uses non-blocking reads. HashTable is rarely the right choice today; use ConcurrentHashMap for shared mutable state.
Q03 of 03SENIOR
If two keys have the same hashCode() but are not equal according to equals(), how does the HashMap handle storing and retrieving both of them? What data structure is used internally at that bucket?
ANSWER
They collide to the same bucket. HashMap stores them in a linked list (or a tree if the list exceeds 8 entries). When retrieving, HashMap computes the bucket index via hash, then iterates the list (or searches the tree) using equals() to find the exact match. Both entries remain separate; they are differentiated by equals().
01
Walk me through what happens internally in a Java HashMap when you call put(key, value) — from the moment you pass in the arguments to the moment the value is stored.
SENIOR
02
What is the difference between HashMap and ConcurrentHashMap, and why is HashTable considered a poor choice for thread safety even though it's synchronized?
SENIOR
03
If two keys have the same hashCode() but are not equal according to equals(), how does the HashMap handle storing and retrieving both of them? What data structure is used internally at that bucket?
SENIOR
FAQ · 5 QUESTIONS
Frequently Asked Questions
01
What is the difference between a hash table and a hash map?
The terms are often used interchangeably — both describe a data structure that maps keys to values using a hash function for O(1) average access. In Java specifically, HashTable is the legacy synchronized class from Java 1.0, while HashMap is the modern unsynchronized alternative. In general computer science discussions, 'hash table' refers to the underlying data structure, and 'hash map' refers to a key-value mapping built on top of it.
Was this helpful?
02
Why does HashMap allow null keys but HashTable doesn't?
HashMap explicitly handles null as a special case — it always places null keys into bucket 0, bypassing the hash function entirely. HashTable was designed to call hashCode() on every key without any null check, so passing null immediately throws a NullPointerException. This is a design inconsistency from Java's early days that was corrected when HashMap was introduced in Java 1.2.
Was this helpful?
03
Why is the default load factor for HashMap 0.75 and not 1.0?
At a load factor of 1.0, most buckets would be occupied and collisions would become frequent, degrading average lookup time noticeably. At 0.75, the statistical probability of a collision per insertion is low enough that chains stay very short (usually 0 or 1 entries), keeping lookups genuinely close to O(1). The 0.75 value is an empirically-tested sweet spot between memory efficiency (not too many empty buckets) and collision rate (not too many full ones).
Was this helpful?
04
What happens when a HashMap resizes?
When the number of entries exceeds (capacity * load factor), the array doubles in size. All existing entries are rehashed — each key's hashCode is recomputed and remapped to the new array. This is an O(n) operation that can cause a latency spike. Pre-sizing the map with the correct initial capacity avoids unnecessary resizes.
Was this helpful?
05
Can I use a custom class as a HashMap key?
Yes, but you must override equals() and hashCode() correctly. The key should be immutable (all fields final) to prevent changes after insertion that would change the hash code and break lookups. If you must use a mutable key, ensure it is never modified once put into the map.