Two Sum Hashing — HashMap Memory Blowout on 5M Records
HashMap storing 5M entries caused OutOfMemoryError in Java heap space under 256MB.
- Two Sum: find two numbers that sum to target
- Brute force: O(n²) nested loops, works but slows at scale
- Hashing: single pass, O(n) time, O(n) space
- Complement = target - current; check if complement already seen
- Insert after check to prevent self-match
- Production watch: OOM when no solution exists for huge arrays; memory grows with input
Imagine you're at a grocery store with a shopping list, and you need to find two items whose prices add up to exactly $10. The slow way is to pick up every item and compare it to every other item — exhausting. The smart way is to glance at an item priced $3, then immediately check a mental sticky note to see if you've already seen a $7 item. That sticky note is a hash map — it lets you look up any past price in an instant rather than scanning the whole store again. That's the entire trick behind the Two Sum hash solution.
The Two Sum problem is probably the most famous coding interview question in the world. It shows up at Google, Amazon, Meta, and practically every tech company that cares about algorithmic thinking. But its fame isn't just because interviewers are lazy — it's because the problem perfectly exposes whether a candidate knows how to trade memory for speed, which is one of the most important skills in software engineering.
The problem itself sounds almost too simple: given a list of numbers and a target value, find the two numbers in the list that add up to the target. The naive approach — check every pair — works, but it's painfully slow for large inputs. The hashing approach cuts that time dramatically by remembering what you've already seen as you walk through the list just once.
By the end of this article you'll understand exactly why hashing makes this problem fast, be able to write the solution from memory, know the exact edge cases that trip up beginners, and have confident answers ready for the three follow-up questions interviewers love to ask.
The Brute Force Approach — And Why It Falls Apart
Before we reach for the elegant solution, let's fully understand the problem and the obvious first attempt. You're given an array of integers — say [2, 7, 11, 15] — and a target number, say 9. Your job is to find which two numbers sum to 9. In this case it's 2 and 7, sitting at index 0 and index 1.
The most natural thing a beginner does is use two nested loops: pick the first number, then loop through every remaining number to see if they sum to the target. This works correctly — but think about what happens when your array has 10,000 elements. For each of the 10,000 numbers, you scan up to 9,999 others. That's roughly 100 million comparisons. On a large dataset this gets brutally slow.
In computer science we describe this cost as $O(n^2)$ — pronounced 'O of n squared'. It means the work grows with the square of the input size. Double the input, quadruple the work. This is the core pain point that hashing solves, and understanding this pain makes the solution feel like a genuine breakthrough rather than a magic trick.
How Hashing Turns an O(n²) Problem Into O(n)
Here's the key insight: when you're standing at any number in the array, you already know exactly what its partner needs to be. If the current number is 7 and the target is 9, the partner must be 2. You don't need to search for it — you just need to check whether you've already seen it.
A HashMap is the perfect tool for this. Think of it as an instant-lookup notebook. As you walk through the array, you write each number and its index into the notebook. When you land on a new number, before writing it down, you flip open the notebook and ask: 'Is the complement (target minus current number) already in here?' If yes — you're done. If no — write the current number down and move on.
This means you only walk through the array once. One pass. The HashMap lookup happens in $O(1)$ — constant time — regardless of how many entries are in the map. So the total work is $O(n)$: you do a fixed amount of work per element, and there are n elements. That's the trade-off: you use a little extra memory (the HashMap) to eliminate the entire inner loop.
This idea — using a data structure to trade memory for speed — is one of the most reusable patterns in all of computer science.
What's Actually Happening Inside the HashMap — A Visual Walkthrough
Seeing code is one thing; watching it execute step-by-step is how it really sticks. Let's trace through numbers = [2, 7, 11, 15] with target = 9 manually, as if we're the program.
Step 1: currentNumber = 2. Complement = 9 - 2 = 7. Is 7 in seenNumbers? No — map is empty. Store {2 → 0}. Move on.
Step 2: currentNumber = 7. Complement = 9 - 7 = 2. Is 2 in seenNumbers? YES — it's at index 0! Return [0, 1]. Done.
We never even looked at 11 or 15. The algorithm stopped the moment it found the answer. Now contrast that with brute force, which would have checked every single pair in the worst case.
The HashMap is the reason we can ask 'have I seen X?' and get a yes/no answer almost immediately, without scanning anything. Internally, Java's HashMap converts the key (a number like 2) into a bucket position using a hash function, making retrieval constant-time. You don't need to understand the internal math right now — just trust that containsKey() and get() are both $O(1)$ operations.
Edge Cases: What Breaks a Naive Solution
The happy path works with [2,7,11,15] and target 9. But what happens when the array is empty? Or has only one element? Or the target is unreachable? A robust solution must handle these gracefully.
Here are the edge cases you should test:
- Empty array: [] — should return empty array immediately, no exceptions.
- Single element: [5] — no pair, return empty.
- No solution: [1,2,3] target 7 — returns empty, but the HashMap stores all three entries. Fine for small inputs, but for large datasets without a pair, this is the OOM scenario we saw earlier.
- Duplicates: [3,3] target 6 — must return [0,1] not [0,0]. Our insert-after-check pattern handles this.
- Negative numbers: [-3,3] target 0 — complement calculation works.
- Zero: [0,5] target 5 — complement 5, works.
- Large target: target = Integer.MAX_VALUE, current = 10, complement = Integer.MAX_VALUE - 10. No overflow if using long.
Always write test cases for these before you start coding in an interview.
Variations and Interview Follow-Ups
Interviewers rarely stop at the basic solution. Here are the most common follow-up questions and how to handle them:
- Return all pairs that sum to target, not just the first. Instead of returning early, collect each matching pair into a list and keep iterating. Complexity stays O(n) for time, but O(n) for storing all pairs (could be O(n²) pairs in worst case).
- Return boolean if a pair exists, not the indices. This simplifies the return type — just return true the moment you find a complement. You can even use a HashSet instead of HashMap since indices aren't needed.
- Sorted array variant: Use the two-pointer technique. Sort the array first (O(n log n)), then place one pointer at the start and another at the end. If the sum is too small, move the left pointer right; if too large, move the right pointer left. This achieves O(1) space but O(n log n) time due to sorting.
- Multiple queries on the same array: Preprocess by storing all pairs in a HashMap for O(1) lookups later. Building the map costs O(n²), but each query is O(1). Trade-off heavily depends on query-to-array-size ratio.
- Three Sum or K Sum: Extend the complement pattern recursively. For Three Sum, fix one number and solve Two Sum for the rest. Complexity becomes O(n²), but the hash pattern still applies.
- For any current element, the complement is target - current.
- The hash map provides O(1) lookup for that complement.
- This model extends to any pair-finding problem: the key is calculating what you're searching for.
- Two-pointer is a different mental model: narrowing range by sum comparison.
- Interviewers want to see you pivot between models based on constraints.
HashMap Memory Blowout: When Two Sum Lost a Production Job
- Always consider worst-case space: if no solution exists, HashMap stores every element.
- Know your data distribution: if many inputs have no solution, a memory-efficient alternative is safer.
- Set resource limits and have fallback algorithms for large-scale processing.
Key takeaways
get() both run in O(1) average time, which is what makes the single-pass O(n) solution possibleCommon mistakes to avoid
3 patternsInserting into the map before checking the complement
put(). The order is critical.Forgetting that the same number can appear twice in the array
Returning the values instead of the indices
Interview Questions on This Topic
Can you walk me through the time and space complexity of your HashMap solution and explain the trade-off you're making compared to the brute force approach?
Frequently Asked Questions
That's Hashing. Mark it forged?
5 min read · try the examples if you haven't