Longest Consecutive Sequence — O(n²) Bug in Login Streak
A missing guard caused 50M inner steps for a 10k streak — the O(n²) bug that almost killed login streaks.
- The algorithm uses a HashSet to find the longest run of consecutive numbers in O(n) time.
- Only start counting from sequence starts — numbers without a left neighbour in the set.
- Each number is visited at most twice: once as a start candidate, once during a forward walk.
- Total work stays O(n) because the inner while loop never revisits elements already counted.
- In production, forgetting the start guard turns O(n) into O(n²) for large inputs.
Imagine you have a shoebox full of random numbered tickets — 4, 2, 100, 3, 1, 101. You want to find the longest unbroken run of numbers, like 1-2-3-4. Instead of sorting all the tickets first, you tip them into a lookup board (a hash set) so you can instantly check 'is this number here?' Then for each ticket, you only start counting a run if it's the very first in a chain — meaning the number one below it is NOT on the board. That's the entire trick: the hash set turns 'does this number exist?' from a slow search into an instant yes-or-no, letting you count every chain in one pass.
Consecutive sequence problems appear everywhere in the real world — think about detecting unbroken login streaks in a fitness app, finding contiguous ID ranges in a database audit, or identifying runs of sensor readings in IoT pipelines. Whenever you need to find structure inside a pile of unsorted numbers, this problem and its pattern keep showing up. It's also one of the most frequently asked medium-level questions at FAANG-tier interviews because it forces you to think about data access patterns, not just loops.
The naive approach — sort the array and scan for breaks — costs O(n log n) and tempts almost everyone at first glance. But sorting is expensive, and it destroys the original order. The real insight is that you don't need sorted order at all. You just need to answer one question in O(1): 'Does the number X exist in the input?' A hash set is purpose-built for exactly that. With it, you can scan the array once and trace every consecutive chain from its true starting point, bringing the total cost down to O(n).
By the end of this article you'll understand exactly why the hash set approach works and not just mechanically how to code it, you'll be able to trace through any example by hand, spot the two most common implementation bugs before they bite you, and confidently explain your reasoning in an interview when the follow-up questions start flying.
Why Sorting Is the Wrong First Instinct
When most developers see 'find the longest run of consecutive numbers', their brain jumps straight to sorting. Sort the array, walk through it, count runs, track the max. Clean, simple, done. And it works — but it's costing you more than you need to pay.
Sorting is O(n log n). For an array of 10 million elements, that's a meaningful difference from O(n). More importantly, the sorting instinct reveals a deeper misunderstanding: you're paying to impose order on the entire dataset when you only care about local neighbourhood relationships — 'is num+1 in the array?' and 'is num-1 in the array?'.
This is the mental shift that separates intermediate from advanced thinking. Instead of restructuring the data, change how you query it. A HashSet gives you O(1) membership checks. Once every number is in the set, you can ask 'does 5 exist?' or 'does 6 exist?' for free, without touching any sorted structure.
The sorted approach also doesn't handle duplicates gracefully without extra logic — you have to skip them manually. The hash set absorbs duplicates automatically because a set stores only unique values. Two problems solved with one data structure choice.
The Core Insight — Only Count From Sequence Starts
Here's the question that unlocks everything: what would happen if you started counting a chain from every single number in the set? You'd recount the same sequence multiple times. Starting from 1 you'd count 1→2→3→4 (length 4). Starting from 2 you'd count 2→3→4 (length 3). Starting from 3, length 2. You'd do redundant work proportional to the square of the sequence length in the worst case.
The genius of the algorithm is the guard: only begin a chain-count when the current number has no left-neighbour in the set. If num - 1 is absent, then num must be the true beginning of a consecutive run. You can safely count forward from there, knowing you'll never double-count that sequence.
This single condition — !numberSet.contains(num - 1) — is what keeps the whole algorithm at O(n). Every number is visited at most twice: once when you check it as a potential sequence start, and once when it's stepped over during a forward count from an earlier start. Those two passes across n elements together are still O(n).
Think of it like painting. You only dip your brush at the leftmost edge of each white stripe. You never start mid-stripe. The result: every stripe gets painted exactly once.
numberSet, not the original numbers array. This means duplicates are never visited a second time during the outer loop, tightening the constant factor of your O(n) runtime. It's a small detail that signals precision in interviews.!numberSet.contains(num - 1) is the entire reason for O(n).Proving O(n) — Walking Through the Work Count
When interviewers push back with 'but isn't there a nested while loop inside your for loop — doesn't that make it O(n²)?', this is the moment that separates candidates who understand their own code from those who memorised it.
The answer lies in amortised analysis. Ask: how many total times can any single number be visited by the while loop? Exactly once. The while loop only runs when we're at a confirmed sequence start. Every number that gets consumed by a while-loop forward-walk is never itself a sequence start (because its predecessor exists in the set), so it will hit the continue in the outer for-loop and never trigger its own while-walk.
Total outer loop iterations: at most n (one per unique number). Total inner while-loop steps across ALL iterations: also at most n, because each of the n numbers is stepped over by a while loop at most once. That's 2n operations total — O(n).
This is the same amortised reasoning used to prove that a dynamic array's push operation is O(1) amortised. The expensive step is rare and its cost is spread evenly across all operations. Knowing this explanation cold will visibly impress an interviewer.
Real-World Application — Detecting Login Streak Gaps in a User System
Here's a scenario you'll actually encounter. Your app tracks which calendar days (as day-numbers since epoch) a user logged in. You want to find the user's longest unbroken login streak. The login days come from a database query — unsorted, possibly containing duplicates if the schema logs multiple events per day.
This is the Longest Consecutive Sequence problem in production clothing. The 'array of integers' is your list of day numbers. The 'consecutive sequence' is unbroken daily logins. The hash set approach handles duplicates (multiple logins on one day) without special casing, works in O(n) time even for users with years of history, and doesn't require you to sort a potentially large dataset.
The same pattern applies to: finding contiguous free blocks in a memory allocator, detecting unbroken sensor uptime windows in an IoT dashboard, finding runs of consecutive student attendance days, and identifying sequential order IDs to spot gaps in an audit trail.
Whenever your domain has 'find the longest run of something that can be modelled as consecutive integers', this algorithm is your friend.
bestStreakStart alongside longestStreak — it's a one-line addition and immediately makes the result actionable in a UI (e.g., 'Your best streak ran from Jan 5 to Jan 9').Edge Cases, Variants, and Follow-Up Questions
The basic algorithm is elegant, but interviewers love to push on the edges. Let's cover the most common follow-ups.
Integer underflow at MIN_VALUE: num - 1 when num = Integer.MIN_VALUE wraps to Integer.MAX_VALUE. The guard will think MAX_VALUE exists and skip a legitimate start. Fix: cast to long before subtraction.
Returning the actual sequence: Track bestStart and bestLength. After the loop, reconstruct: IntStream.range(bestStart, bestStart + bestLength).toArray(). That's a common extension.
Distributed variant: What if the data is sharded across thousands of servers? You can't load everything into a single hash set. The pattern shifts to MapReduce: map each number to its bucket (num / segmentSize), find sequences per bucket, then merge across bucket boundaries. This appears in real-time streaming systems.
Memory-constrained environment: If you can't afford O(n) memory, you're back to sorting (O(1) extra space) or use a Bloom filter as a probabilistic membership test if exactness isn't required. That's a rare but advanced trade-off.
num - 1 overflows to Integer.MAX_VALUE. Always cast to long when doing arithmetic on array values in this algorithm.The O(n²) That Almost Killed a Login Streak Feature
if (numberSet.contains(num - 1)) continue;, the inner while loop runs for every number, not just sequence starts. For a sequence of length L, it's executed L times from each element, turning the algorithm into O(n²). Here, a single streak of 10,000 consecutive logins caused 10,000 * 10,000 / 2 ≈ 50 million inner steps.- Always validate O(n) claims with data from both worst-case (one long chain) and worst-case-looking (many short chains) inputs.
- Amortised analysis isn't just theory — it's the difference between a 200ms response and a failed deployment.
- When performance degrades unexpectedly in prod, suspect hidden quadratic behaviour from missing loop guards.
for (int num : numberSet).!numberSet.contains((long) num - 1). Java's int underflows MIN_VALUE - 1 to MAX_VALUE, breaking the guard.for (int num : numbers) with for (int num : numberSet).Key takeaways
!numberSet.contains(num - 1)) is the entire reason the nested loop doesn't blow up to O(n²)Common mistakes to avoid
4 patternsMissing the sequence-start guard
if (numberSet.contains(num - 1)) continue; before the inner while loop. This single line keeps the algorithm linear.Iterating the original array instead of the set in the outer loop
for (int num : numbers) with for (int num : numberSet). Iterating the set guarantees each unique value is processed exactly once.Forgetting null or empty input handling
if (numbers == null || numbers.length == 0) return 0;.Not accounting for integer overflow with MIN_VALUE
!numberSet.contains(num - 1) incorrectly checks for Integer.MAX_VALUE (because of wraparound), potentially skipping a legitimate sequence start. The result is a missed streak.contains() calls. Or use a Set<Long>.Interview Questions on This Topic
Can you walk me through why your solution is O(n) even though there's a while loop nested inside a for loop?
Frequently Asked Questions
That's Hashing. Mark it forged?
5 min read · try the examples if you haven't