Home DSA Longest Consecutive Sequence Using Hashing — O(n) Explained

Longest Consecutive Sequence Using Hashing — O(n) Explained

In Plain English 🔥
Imagine you have a shoebox full of random numbered tickets — 4, 2, 100, 3, 1, 101. You want to find the longest unbroken run of numbers, like 1-2-3-4. Instead of sorting all the tickets first, you tip them into a lookup board (a hash set) so you can instantly check 'is this number here?' Then for each ticket, you only start counting a run if it's the very first in a chain — meaning the number one below it is NOT on the board. That's the entire trick: the hash set turns 'does this number exist?' from a slow search into an instant yes-or-no, letting you count every chain in one pass.
⚡ Quick Answer
Imagine you have a shoebox full of random numbered tickets — 4, 2, 100, 3, 1, 101. You want to find the longest unbroken run of numbers, like 1-2-3-4. Instead of sorting all the tickets first, you tip them into a lookup board (a hash set) so you can instantly check 'is this number here?' Then for each ticket, you only start counting a run if it's the very first in a chain — meaning the number one below it is NOT on the board. That's the entire trick: the hash set turns 'does this number exist?' from a slow search into an instant yes-or-no, letting you count every chain in one pass.

Consecutive sequence problems appear everywhere in the real world — think about detecting unbroken login streaks in a fitness app, finding contiguous ID ranges in a database audit, or identifying runs of sensor readings in IoT pipelines. Whenever you need to find structure inside a pile of unsorted numbers, this problem and its pattern keep showing up. It's also one of the most frequently asked medium-level questions at FAANG-tier interviews because it forces you to think about data access patterns, not just loops.

The naive approach — sort the array and scan for breaks — costs O(n log n) and tempts almost everyone at first glance. But sorting is expensive, and it destroys the original order. The real insight is that you don't need sorted order at all. You just need to answer one question in O(1): 'Does the number X exist in the input?' A hash set is purpose-built for exactly that. With it, you can scan the array once and trace every consecutive chain from its true starting point, bringing the total cost down to O(n).

By the end of this article you'll understand exactly why the hash set approach works and not just mechanically how to code it, you'll be able to trace through any example by hand, spot the two most common implementation bugs before they bite you, and confidently explain your reasoning in an interview when the follow-up questions start flying.

Why Sorting Is the Wrong First Instinct

When most developers see 'find the longest run of consecutive numbers', their brain jumps straight to sorting. Sort the array, walk through it, count runs, track the max. Clean, simple, done. And it works — but it's costing you more than you need to pay.

Sorting is O(n log n). For an array of 10 million elements, that's a meaningful difference from O(n). More importantly, the sorting instinct reveals a deeper misunderstanding: you're paying to impose order on the entire dataset when you only care about local neighbourhood relationships — 'is num+1 in the array?' and 'is num-1 in the array?'.

This is the mental shift that separates intermediate from advanced thinking. Instead of restructuring the data, change how you query it. A HashSet gives you O(1) membership checks. Once every number is in the set, you can ask 'does 5 exist?' or 'does 6 exist?' for free, without touching any sorted structure.

The sorted approach also doesn't handle duplicates gracefully without extra logic — you have to skip them manually. The hash set absorbs duplicates automatically because a set stores only unique values. Two problems solved with one data structure choice.

NaiveSortVsHashSet.java · JAVA
1234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253545556575859606162636465666768
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class NaiveSortVsHashSet {

    // ---- APPROACH 1: Sort-based O(n log n) ----
    public static int longestConsecutiveSorting(int[] numbers) {
        if (numbers.length == 0) return 0;

        Arrays.sort(numbers); // O(n log n) — reorders entire array

        int longestRun = 1;
        int currentRun = 1;

        for (int i = 1; i < numbers.length; i++) {
            if (numbers[i] == numbers[i - 1]) {
                continue; // skip duplicates manually
            }
            if (numbers[i] == numbers[i - 1] + 1) {
                currentRun++; // extend the current run
                longestRun = Math.max(longestRun, currentRun);
            } else {
                currentRun = 1; // break in sequence, reset
            }
        }
        return longestRun;
    }

    // ---- APPROACH 2: HashSet-based O(n) ----
    public static int longestConsecutiveHashing(int[] numbers) {
        // Load every number into a set — O(n) time, duplicates disappear
        Set<Integer> numberSet = new HashSet<>();
        for (int num : numbers) {
            numberSet.add(num);
        }

        int longestRun = 0;

        for (int num : numberSet) {
            // Only start counting from the BEGINNING of a sequence.
            // If (num - 1) exists in the set, this number is NOT a sequence start.
            boolean isSequenceStart = !numberSet.contains(num - 1);

            if (isSequenceStart) {
                int currentNumber = num;
                int currentRun = 1;

                // Walk forward as long as consecutive numbers exist
                while (numberSet.contains(currentNumber + 1)) {
                    currentNumber++;
                    currentRun++;
                }

                longestRun = Math.max(longestRun, currentRun);
            }
        }
        return longestRun;
    }

    public static void main(String[] args) {
        int[] input = {100, 4, 200, 1, 3, 2, 2, 3}; // note duplicate 2s and 3s

        System.out.println("Input: " + Arrays.toString(input));
        System.out.println("Sorting approach:  " + longestConsecutiveSorting(input));
        System.out.println("HashSet approach:  " + longestConsecutiveHashing(input));
    }
}
▶ Output
Input: [100, 4, 200, 1, 3, 2, 2, 3]
Sorting approach: 4
HashSet approach: 4
🔥
Why Both Give 4:The longest run is 1→2→3→4. Duplicates (2, 3 appear twice) don't extend the sequence — they're just noise. The HashSet discards them automatically; the sort approach needs an explicit skip. Same answer, but the HashSet path is cleaner and faster.

The Core Insight — Only Count From Sequence Starts

Here's the question that unlocks everything: what would happen if you started counting a chain from every single number in the set? You'd recount the same sequence multiple times. Starting from 1 you'd count 1→2→3→4 (length 4). Starting from 2 you'd count 2→3→4 (length 3). Starting from 3, length 2. You'd do redundant work proportional to the square of the sequence length in the worst case.

The genius of the algorithm is the guard: only begin a chain-count when the current number has no left-neighbour in the set. If num - 1 is absent, then num must be the true beginning of a consecutive run. You can safely count forward from there, knowing you'll never double-count that sequence.

This single condition — !numberSet.contains(num - 1) — is what keeps the whole algorithm at O(n). Every number is visited at most twice: once when you check it as a potential sequence start, and once when it's stepped over during a forward count from an earlier start. Those two passes across n elements together are still O(n).

Think of it like painting. You only dip your brush at the leftmost edge of each white stripe. You never start mid-stripe. The result: every stripe gets painted exactly once.

LongestConsecutiveSequence.java · JAVA
1234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253545556575859606162636465666768
import java.util.HashSet;
import java.util.Set;

public class LongestConsecutiveSequence {

    public static int findLongestConsecutiveSequence(int[] numbers) {
        if (numbers == null || numbers.length == 0) {
            return 0; // guard against empty or null input
        }

        // Step 1: Populate the set — O(n)
        // Using a Set means duplicates are ignored for free.
        Set<Integer> numberSet = new HashSet<>();
        for (int num : numbers) {
            numberSet.add(num);
        }

        int longestSequenceLength = 0;

        // Step 2: Iterate over unique numbers only (iterating the set, not the array)
        for (int num : numberSet) {

            // THE KEY GUARD: skip this number if it's not a sequence start.
            // If (num - 1) exists, some smaller number already owns this chain.
            if (numberSet.contains(num - 1)) {
                continue; // not a start — skip entirely
            }

            // We've confirmed: num IS the start of a new consecutive sequence.
            int sequenceStart = num;
            int currentLength = 1;

            // Walk forward one step at a time until the chain breaks
            while (numberSet.contains(sequenceStart + currentLength)) {
                currentLength++; // found the next consecutive number
            }

            // Update the global maximum
            longestSequenceLength = Math.max(longestSequenceLength, currentLength);
        }

        return longestSequenceLength;
    }

    public static void main(String[] args) {
        // Test Case 1: multiple sequences, clear winner
        int[] mixed = {10, 5, 6, 100, 7, 101, 8, 102, 9, 103};
        System.out.println("Test 1 — Mixed sequences:");
        System.out.println("  Input length: " + mixed.length + " numbers");
        System.out.println("  Longest consecutive: " + findLongestConsecutiveSequence(mixed));
        // Sequences: [5,6,7,8,9,10] length 6 and [100,101,102,103] length 4

        // Test Case 2: all elements form one sequence
        int[] continuous = {3, 2, 1, 0, -1, -2};
        System.out.println("\nTest 2 — Continuous descending input:");
        System.out.println("  Longest consecutive: " + findLongestConsecutiveSequence(continuous));

        // Test Case 3: single element
        int[] single = {42};
        System.out.println("\nTest 3 — Single element:");
        System.out.println("  Longest consecutive: " + findLongestConsecutiveSequence(single));

        // Test Case 4: all duplicates
        int[] duplicates = {5, 5, 5, 5};
        System.out.println("\nTest 4 — All duplicates:");
        System.out.println("  Longest consecutive: " + findLongestConsecutiveSequence(duplicates));
    }
}
▶ Output
Test 1 — Mixed sequences:
Input length: 10 numbers
Longest consecutive: 6

Test 2 — Continuous descending input:
Longest consecutive: 6

Test 3 — Single element:
Longest consecutive: 1

Test 4 — All duplicates:
Longest consecutive: 1
⚠️
Pro Tip: Iterate the Set, Not the ArrayNotice the for-loop iterates over `numberSet`, not the original `numbers` array. This means duplicates are never visited a second time during the outer loop, tightening the constant factor of your O(n) runtime. It's a small detail that signals precision in interviews.

Proving O(n) — Walking Through the Work Count

When interviewers push back with 'but isn't there a nested while loop inside your for loop — doesn't that make it O(n²)?', this is the moment that separates candidates who understand their own code from those who memorised it.

The answer lies in amortised analysis. Ask: how many total times can any single number be visited by the while loop? Exactly once. The while loop only runs when we're at a confirmed sequence start. Every number that gets consumed by a while-loop forward-walk is never itself a sequence start (because its predecessor exists in the set), so it will hit the continue in the outer for-loop and never trigger its own while-walk.

Total outer loop iterations: at most n (one per unique number). Total inner while-loop steps across ALL iterations: also at most n, because each of the n numbers is stepped over by a while loop at most once. That's 2n operations total — O(n).

This is the same amortised reasoning used to prove that a dynamic array's push operation is O(1) amortised. The expensive step is rare and its cost is spread evenly across all operations. Knowing this explanation cold will visibly impress an interviewer.

ComplexityTracer.java · JAVA
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960
import java.util.HashSet;
import java.util.Set;

/**
 * This version adds counters to PROVE the O(n) complexity claim.
 * Run it and watch that totalSteps stays proportional to input size,
 * not to input size squared.
 */
public class ComplexityTracer {

    public static void tracedLongestConsecutive(int[] numbers) {
        Set<Integer> numberSet = new HashSet<>();
        for (int num : numbers) {
            numberSet.add(num);
        }

        int longestSequenceLength = 0;
        int outerLoopCount = 0; // how many times the for-loop body executes
        int innerWhileSteps = 0; // total steps across ALL while-loop runs

        for (int num : numberSet) {
            outerLoopCount++;

            if (numberSet.contains(num - 1)) {
                continue; // not a sequence start, skip
            }

            // Confirmed start: count forward
            int sequenceStart = num;
            int currentLength = 1;

            while (numberSet.contains(sequenceStart + currentLength)) {
                currentLength++;
                innerWhileSteps++; // track each forward step
            }

            longestSequenceLength = Math.max(longestSequenceLength, currentLength);
        }

        int uniqueCount = numberSet.size();
        System.out.println("  Unique numbers (n):       " + uniqueCount);
        System.out.println("  Outer loop iterations:    " + outerLoopCount);
        System.out.println("  Inner while total steps:  " + innerWhileSteps);
        System.out.println("  Combined total steps:     " + (outerLoopCount + innerWhileSteps));
        System.out.println("  Longest sequence:         " + longestSequenceLength);
        System.out.println("  Steps / n ratio:          "
            + String.format("%.2f", (double)(outerLoopCount + innerWhileSteps) / uniqueCount));
    }

    public static void main(String[] args) {
        System.out.println("--- Small input (n=8) ---");
        tracedLongestConsecutive(new int[]{1, 2, 3, 4, 10, 11, 50, 51});

        System.out.println("\n--- Medium input: one long chain (n=10) ---");
        tracedLongestConsecutive(new int[]{1, 2, 3, 4, 5, 6, 7, 8, 9, 10});

        System.out.println("\n--- Worst-case-looking input: many chains (n=10) ---");
        tracedLongestConsecutive(new int[]{1, 3, 5, 7, 9, 11, 13, 15, 17, 19});
    }
}
▶ Output
--- Small input (n=8) ---
Unique numbers (n): 8
Outer loop iterations: 8
Inner while total steps: 6
Combined total steps: 14
Longest sequence: 4
Steps / n ratio: 1.75

--- Medium input: one long chain (n=10) ---
Unique numbers (n): 10
Outer loop iterations: 10
Inner while total steps: 9
Combined total steps: 19
Longest sequence: 10
Steps / n ratio: 1.90

--- Worst-case-looking input: many chains (n=10) ---
Unique numbers (n): 10
Outer loop iterations: 10
Inner while total steps: 0
Combined total steps: 10
Longest sequence: 1
Steps / n ratio: 1.00
🔥
What the Ratio Tells You:The Steps/n ratio is always between 1.0 and 2.0, regardless of input shape. One long chain gives ratio close to 2 (each element appears in both the outer loop and one inner step). All isolated numbers gives ratio exactly 1 (inner while never runs). Both are O(n) — the ratio never grows with n.

Real-World Application — Detecting Login Streak Gaps in a User System

Here's a scenario you'll actually encounter. Your app tracks which calendar days (as day-numbers since epoch) a user logged in. You want to find the user's longest unbroken login streak. The login days come from a database query — unsorted, possibly containing duplicates if the schema logs multiple events per day.

This is the Longest Consecutive Sequence problem in production clothing. The 'array of integers' is your list of day numbers. The 'consecutive sequence' is unbroken daily logins. The hash set approach handles duplicates (multiple logins on one day) without special casing, works in O(n) time even for users with years of history, and doesn't require you to sort a potentially large dataset.

The same pattern applies to: finding contiguous free blocks in a memory allocator, detecting unbroken sensor uptime windows in an IoT dashboard, finding runs of consecutive student attendance days, and identifying sequential order IDs to spot gaps in an audit trail.

Whenever your domain has 'find the longest run of something that can be modelled as consecutive integers', this algorithm is your friend.

LoginStreakAnalyzer.java · JAVA
1234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253545556575859606162636465666768697071727374
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/**
 * Real-world simulation: find a user's longest consecutive login streak.
 * Login days are represented as integers (days since epoch).
 * Data is unsorted and may contain duplicates (multiple logins in one day).
 */
public class LoginStreakAnalyzer {

    /**
     * Returns the length of the longest unbroken daily login streak.
     *
     * @param loginDays list of day-numbers when the user logged in (unsorted, may repeat)
     * @return number of consecutive days in the longest streak
     */
    public static int longestLoginStreak(List<Integer> loginDays) {
        if (loginDays == null || loginDays.isEmpty()) {
            return 0;
        }

        // Load into a set — duplicates (multiple logins per day) vanish automatically
        Set<Integer> uniqueLoginDays = new HashSet<>(loginDays);

        int longestStreak = 0;

        for (int day : uniqueLoginDays) {
            // Only start counting from the first day of a streak block.
            // If the previous day also has a login, this is mid-streak — skip.
            boolean isStreakStart = !uniqueLoginDays.contains(day - 1);

            if (isStreakStart) {
                int streakDay = day;
                int streakLength = 1;

                // Advance through consecutive login days
                while (uniqueLoginDays.contains(streakDay + 1)) {
                    streakDay++;
                    streakLength++;
                }

                longestStreak = Math.max(longestStreak, streakLength);

                System.out.println("  Streak found: days " + day
                    + " to " + streakDay
                    + " (" + streakLength + " days)");
            }
        }

        return longestStreak;
    }

    public static void main(String[] args) {
        // Simulated login days for a user over several weeks
        // Days 101-105: 5-day streak
        // Days 110-112: 3-day streak
        // Day 120: isolated login
        // Day 103 appears twice — user logged in twice that day
        List<Integer> userLoginHistory = List.of(
            101, 102, 103, 103, 104, 105,
            110, 111, 112,
            120
        );

        System.out.println("=== Login Streak Analysis ===");
        System.out.println("Raw login events: " + userLoginHistory.size());
        System.out.println("Streaks detected:");

        int best = longestLoginStreak(userLoginHistory);

        System.out.println("\nLongest login streak: " + best + " day(s)");
    }
}
▶ Output
=== Login Streak Analysis ===
Raw login events: 10
Streaks detected:
Streak found: days 101 to 105 (5 days)
Streak found: days 110 to 112 (3 days)
Streak found: days 120 to 120 (1 days)

Longest login streak: 5 day(s)
⚠️
Design Tip for Production Code:In a real system you'd also want to return the start day of the longest streak, not just its length. Tweak the algorithm to track `bestStreakStart` alongside `longestStreak` — it's a one-line addition and immediately makes the result actionable in a UI (e.g., 'Your best streak ran from Jan 5 to Jan 9').
AspectSort-Based Approach O(n log n)HashSet Approach O(n)
Time ComplexityO(n log n) — dominated by sortO(n) — two linear passes total
Space ComplexityO(1) extra if in-place sortO(n) for the hash set
Handles DuplicatesManual skip logic neededSet absorbs duplicates automatically
Preserves Input OrderNo — array is sorted in-placeYes — original array untouched
Handles NegativesYes, sorts correctlyYes, hash set is key-agnostic
Code ComplexitySimple to write and readSlightly more logic, but still concise
Best Use CaseWhen input is nearly sorted alreadyGeneral case, large unsorted inputs
Interview PreferenceAcceptable baseline answerExpected optimal answer at FAANG level

🎯 Key Takeaways

  • The hash set's O(1) lookup is what makes O(n) possible — the algorithm is essentially trading O(n) space for the elimination of sorting.
  • The sequence-start guard (!numberSet.contains(num - 1)) is the entire reason the nested loop doesn't blow up to O(n²) — never implement this algorithm without it.
  • Iterating over the set (not the original array) in the outer loop silently eliminates duplicate processing — a small detail that matters for correctness on inputs with repeated values.
  • This pattern — load into a set, then query cheaply — generalises to dozens of problems: anagram detection, two-sum, first missing positive. Recognising it as a pattern is more valuable than memorising this one problem.

⚠ Common Mistakes to Avoid

  • Mistake 1: Starting the count from every number, not just sequence starts — Symptom: the inner while-loop runs from every number, giving correct results but O(n²) runtime on large inputs — Fix: add the guard if (numberSet.contains(num - 1)) continue; before the while-loop. This single line is what keeps the algorithm linear.
  • Mistake 2: Iterating over the original array instead of the set in the outer loop — Symptom: duplicates cause the outer loop to re-visit sequence starts, potentially counting the same sequence multiple times, inflating the answer for inputs like [1,1,2,2,3] — Fix: replace for (int num : numbers) with for (int num : numberSet). Iterating the set guarantees each unique value is processed exactly once.
  • Mistake 3: Forgetting to handle null or empty input — Symptom: NullPointerException or ArrayIndexOutOfBoundsException on edge-case inputs, which will cost you the question in an interview even if the main logic is perfect — Fix: add a null-and-length guard at the top: if (numbers == null || numbers.length == 0) return 0;. Always handle edges before touching the data.

Interview Questions on This Topic

  • QCan you walk me through why your solution is O(n) even though there's a while loop nested inside a for loop? (Interviewers want to hear 'amortised analysis' — each element is touched by the while loop at most once across all iterations, so total inner steps ≤ n.)
  • QWhat changes if the input can contain negative numbers or Integer.MIN_VALUE? (The hash set handles negatives transparently. The only real edge case is Integer.MIN_VALUE — checking `num - 1` causes integer underflow in Java. A safe fix is to cast to long: `!numberSet.contains((long) num - 1)`.)
  • QHow would you modify this to return the actual sequence, not just its length? (Track the start number and length of the best sequence separately. After the loops, reconstruct with a simple for-loop from `bestStart` to `bestStart + longestLength - 1`. This tests whether you understand the algorithm's state, not just its output.)

Frequently Asked Questions

What is the time complexity of the longest consecutive sequence using a hash set?

It's O(n) time and O(n) space. The set is built in one linear pass, and although there's a nested while loop in the algorithm, each number is visited by the while loop at most once across all outer-loop iterations — that's amortised O(n) for the whole traversal, not O(n²).

Why does the longest consecutive sequence problem use a HashSet and not a HashMap?

You only need to answer 'does this number exist?' — a yes/no question. A HashSet is perfectly designed for membership queries and uses less memory than a HashMap. A HashMap would make sense if you also needed to store metadata per number (like index or frequency), but the core algorithm needs none of that.

Does the algorithm work correctly if the input array has duplicate numbers?

Yes, and it handles them more cleanly than the sorting approach. When you load the array into a HashSet, duplicates are silently dropped because sets only store unique values. The algorithm then operates on unique numbers only, so duplicates never inflate the sequence length or cause double-counting.

🔥
TheCodeForge Editorial Team Verified Author

Written and reviewed by senior developers with real-world experience across enterprise, startup and open-source projects. Every article on TheCodeForge is written to be clear, accurate and genuinely useful — not just SEO filler.

← PreviousGroup Anagrams ProblemNext →Greedy Algorithm Introduction
Forged with 🔥 at TheCodeForge.io — Where Developers Are Forged