DSA Advanced

Strongly Connected Components Explained — Kosaraju, Tarjan & When to Use Each

📅 March 2026 ⏱ 8 min read 🎯 Advanced

In Plain English 🔥

Imagine a group of friends where everyone can pass a message to everyone else — maybe not directly, but through mutual friends. That closed circle is a Strongly Connected Component. Now imagine two separate friend groups at a party who never mingle: those are two different SCCs. Tarjan's and Kosaraju's algorithms are just clever ways of finding all those isolated circles in one pass through the crowd.

⚡ Quick Answer

Social networks, airline route maps, compiler dependency graphs, deadlock detection in operating systems — every one of these systems hides a graph underneath, and inside that graph lurk tightly knit clusters where every node can reach every other node. Those clusters are Strongly Connected Components (SCCs), and identifying them is one of the most practically useful graph operations you'll ever implement. Miss an SCC and your compiler might mis-order function compilation. Ignore SCCs in a routing system and you'll miss that a subset of cities is completely isolated from the rest of the network.

The core problem SCCs solve is reachability symmetry in directed graphs. In an undirected graph, if A can reach B then B can reach A — trivially. But in a directed graph, edges have direction, so reachability isn't symmetric. You might be able to fly from New York to London but the direct return flight might not exist. SCCs carve a directed graph into maximal groups where every internal pair of nodes can mutually reach each other, giving you a clean decomposition of the graph's cyclic structure.

By the end of this article you'll understand exactly how Kosaraju's two-pass DFS and Tarjan's single-pass low-link algorithm work at the bit-level, you'll know which one to reach for in production and why, you'll have complete runnable Java implementations of both, and you'll be ready to handle the tricky follow-up questions interviewers throw at candidates who only memorised the surface-level answer.

What an SCC Actually Is — And the Condensation Graph Trick

A Strongly Connected Component is a maximal set of vertices in a directed graph such that there is a directed path from every vertex in the set to every other vertex in the set. The word 'maximal' is doing serious work here — you can't add one more vertex and keep the property. A single isolated vertex with no edges counts as an SCC of size 1, because the only path it needs is the trivial zero-length path to itself.

Once you identify all SCCs, something beautiful happens: you can collapse each SCC into a single super-node, and the edges between SCCs form a Directed Acyclic Graph (DAG). This is called the condensation graph, and it's the secret weapon SCCs give you. A DAG can be topologically sorted, which means problems that were intractable on the original cyclic graph become straightforward on the condensation. Compiler pass ordering, task scheduling with mutual dependencies, and finding the minimum number of edges to add to make a graph strongly connected — all of these reduce to operations on the condensation DAG.

Think of the condensation as compressing all the complexity of cycles into single nodes. The resulting DAG tells you the 'macro structure' of your graph: which clusters depend on which other clusters, with all the internal circular dependencies safely encapsulated.

SCCCondensationDemo.java · JAVA

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142

import java.util.*;

/**
 * Demonstrates what an SCC is and builds a condensation graph.
 * Graph used:
 *   0 -> 1, 1 -> 2, 2 -> 0   (cycle: SCC {0,1,2})
 *   1 -> 3, 3 -> 4            (no back edges: SCCs {3}, {4})
 */
public class SCCCondensationDemo {

    private final int totalVertices;
    private final List<List<Integer>> adjacencyList;

    public SCCCondensationDemo(int totalVertices) {
        this.totalVertices = totalVertices;
        this.adjacencyList = new ArrayList<>();
        for (int i = 0; i < totalVertices; i++) {
            adjacencyList.add(new ArrayList<>());
        }
    }

    public void addDirectedEdge(int from, int to) {
        adjacencyList.get(from).add(to);
    }

    /**
     * Runs a DFS and records the finish order of vertices.
     * This finish order is the secret sauce of Kosaraju's algorithm.
     */
    private void dfsRecordFinish(
            int vertex,
            boolean[] visited,
            Deque<Integer> finishStack,
            List<List<Integer>> graph) {

        visited[vertex] = true;

        for (int neighbour : graph.get(vertex)) {
            if (!visited[neighbour]) {
                dfsRecordFinish(neighbour, visited, finishStack, graph);
            }
        }
        // A vertex is pushed AFTER all its descendants finish.
        // The last vertex to finish has the highest 'reach' in the original graph.
        finishStack.push(vertex);
    }

    /**
     * Collects all vertices reachable from 'startVertex' on the transposed graph.
     * Every vertex collected here belongs to the same SCC as 'startVertex'.
     */
    private void dfsCollectComponent(
            int startVertex,
            boolean[] visited,
            List<Integer> currentComponent,
            List<List<Integer>> transposedGraph) {

        visited[startVertex] = true;
        currentComponent.add(startVertex);

        for (int neighbour : transposedGraph.get(startVertex)) {
            if (!visited[neighbour]) {
                dfsCollectComponent(neighbour, visited, currentComponent, transposedGraph);
            }
        }
    }

    /**
     * Builds the transposed graph (every edge reversed).
     * Reversing edges is what lets us 'check' reachability in both directions.
     */
    private List<List<Integer>> buildTransposedGraph() {
        List<List<Integer>> transposed = new ArrayList<>();
        for (int i = 0; i < totalVertices; i++) {
            transposed.add(new ArrayList<>());
        }
        for (int from = 0; from < totalVertices; from++) {
            for (int to : adjacencyList.get(from)) {
                transposed.get(to).add(from); // flip the direction
            }
        }
        return transposed;
    }

    /**
     * Kosaraju's algorithm — returns each SCC as a list of vertex IDs.
     * Time complexity: O(V + E). Space: O(V + E) for the transposed graph.
     */
    public List<List<Integer>> findSCCs() {
        boolean[] visited = new boolean[totalVertices];
        Deque<Integer> finishStack = new ArrayDeque<>();

        // Pass 1: DFS on original graph, push vertices by finish time
        for (int vertex = 0; vertex < totalVertices; vertex++) {
            if (!visited[vertex]) {
                dfsRecordFinish(vertex, visited, finishStack, adjacencyList);
            }
        }

        List<List<Integer>> transposedGraph = buildTransposedGraph();
        Arrays.fill(visited, false); // reset for second DFS pass

        List<List<Integer>> allSCCs = new ArrayList<>();

        // Pass 2: Process vertices in reverse finish order on the transposed graph.
        // If you can reach a vertex from the 'leader' in the transposed graph,
        // it means the original graph has a path back — confirming mutual reachability.
        while (!finishStack.isEmpty()) {
            int leader = finishStack.pop();
            if (!visited[leader]) {
                List<Integer> component = new ArrayList<>();
                dfsCollectComponent(leader, visited, component, transposedGraph);
                allSCCs.add(component);
            }
        }

        return allSCCs;
    }

    public static void main(String[] args) {
        SCCCondensationDemo graph = new SCCCondensationDemo(5);

        // Cycle among 0, 1, 2 — they form one SCC
        graph.addDirectedEdge(0, 1);
        graph.addDirectedEdge(1, 2);
        graph.addDirectedEdge(2, 0);

        // 1 reaches 3 and 4, but 3 and 4 can't get back into the cycle
        graph.addDirectedEdge(1, 3);
        graph.addDirectedEdge(3, 4);

        List<List<Integer>> sccs = graph.findSCCs();

        System.out.println("Strongly Connected Components found: " + sccs.size());
        for (int i = 0; i < sccs.size(); i++) {
            System.out.println("  SCC " + (i + 1) + ": " + sccs.get(i));
        }
        System.out.println();
        System.out.println("Condensation insight:");
        System.out.println("  {0,1,2} -> {3} -> {4} forms a DAG — no cycles remain.");
    }
}

▶ Output

Strongly Connected Components found: 3
SCC 1: [0, 2, 1]
SCC 2: [3]
SCC 3: [4]

Condensation insight:
{0,1,2} -> {3} -> {4} forms a DAG — no cycles remain.

🔥

Why the Transposed Graph Works:If vertex A can reach B in the original graph AND B can reach A in the transposed graph (which means A could reach B in the original), then A and B are in the same SCC. Reversing edges lets you check the 'return path' without running a second full BFS from every vertex. This is the elegant core insight of Kosaraju's algorithm.

Tarjan's Algorithm — One DFS, Low-Link Values, and Why It's Faster in Practice

Kosaraju's two-pass approach is elegant and easy to explain, but it requires building an entirely transposed graph in memory and running DFS twice. Tarjan's algorithm finds all SCCs in a single DFS pass using two clever bookkeeping arrays: discovery time (when was this vertex first visited?) and low-link value (what is the earliest discovery time reachable from this vertex's subtree, including back edges?).

The low-link value is the key insight. When you finish processing a vertex, if its low-link value equals its own discovery time, it means no back edge from its subtree escapes to an ancestor. That vertex is the 'root' of an SCC. Everything on the DFS stack between the root and the current top of the stack belongs to that SCC.

Tarjan's algorithm uses an explicit stack to track 'potentially still in the current SCC' vertices, and a boolean array to mark which vertices are currently on that stack. The on-stack check is critical — without it, you'd incorrectly update low-link values through cross-edges to already-completed SCCs, collapsing separate components into one.

In practice, Tarjan's wins on memory because it doesn't need the transposed graph. For sparse graphs (E close to V) they're equivalent, but for dense graphs Kosaraju's extra memory allocation becomes noticeable.

TarjanSCC.java · JAVA

import java.util.*;

/**
 * Tarjan's SCC algorithm — single DFS pass using low-link values.
 * Time: O(V + E). Space: O(V) auxiliary (no transposed graph needed).
 *
 * Same test graph as before:
 *   0 -> 1 -> 2 -> 0  (cycle)
 *   1 -> 3 -> 4
 */
public class TarjanSCC {

    private final int totalVertices;
    private final List<List<Integer>> adjacencyList;

    // Discovery time of each vertex (order in which DFS visits it)
    private final int[] discoveryTime;

    // Low-link value: smallest discovery time reachable from this vertex's subtree
    private final int[] lowLink;

    // Tracks whether a vertex is currently on the DFS stack
    private final boolean[] onStack;

    // The explicit stack holding the 'current path' of the DFS
    private final Deque<Integer> dfsStack;

    // Monotonically increasing timer — gives each vertex a unique discovery stamp
    private int timer;

    private final List<List<Integer>> allSCCs;

    public TarjanSCC(int totalVertices) {
        this.totalVertices = totalVertices;
        this.adjacencyList = new ArrayList<>();
        this.discoveryTime = new int[totalVertices];
        this.lowLink = new int[totalVertices];
        this.onStack = new boolean[totalVertices];
        this.dfsStack = new ArrayDeque<>();
        this.allSCCs = new ArrayList<>();

        Arrays.fill(discoveryTime, -1); // -1 means 'not yet visited'
        Arrays.fill(lowLink, -1);

        for (int i = 0; i < totalVertices; i++) {
            adjacencyList.add(new ArrayList<>());
        }
    }

    public void addDirectedEdge(int from, int to) {
        adjacencyList.get(from).add(to);
    }

    /**
     * Core DFS function.
     * After visiting all neighbours of 'current', we check:
     *   if lowLink[current] == discoveryTime[current]
     * That means 'current' is the root of an SCC — no back edge escapes upward.
     */
    private void tarjanDFS(int current) {
        // Stamp this vertex with its discovery time and initialise low-link to the same value
        discoveryTime[current] = lowLink[current] = timer++;

        // Push onto our tracking stack
        dfsStack.push(current);
        onStack[current] = true;

        for (int neighbour : adjacencyList.get(current)) {

            if (discoveryTime[neighbour] == -1) {
                // Tree edge — neighbour hasn't been visited yet
                tarjanDFS(neighbour);
                // After returning, propagate the lowest reachable time upward
                lowLink[current] = Math.min(lowLink[current], lowLink[neighbour]);

            } else if (onStack[neighbour]) {
                // Back edge — neighbour is an ancestor still on our stack.
                // Update low-link directly from neighbour's DISCOVERY time, not its low-link.
                // Using discoveryTime here (not lowLink) prevents wrongly crossing SCC boundaries.
                lowLink[current] = Math.min(lowLink[current], discoveryTime[neighbour]);
            }
            // If neighbour is visited but NOT on the stack, it belongs to a completed SCC.
            // We deliberately ignore it — updating low-link from it would merge separate SCCs.
        }

        // If low-link equals discovery time, 'current' is the SCC root.
        // Pop everything off the stack down to (and including) 'current' — that's the SCC.
        if (lowLink[current] == discoveryTime[current]) {
            List<Integer> newComponent = new ArrayList<>();
            int poppedVertex;
            do {
                poppedVertex = dfsStack.pop();
                onStack[poppedVertex] = false;
                newComponent.add(poppedVertex);
            } while (poppedVertex != current); // stop when we reach the root
            allSCCs.add(newComponent);
        }
    }

    public List<List<Integer>> findAllSCCs() {
        // Handle disconnected graphs by trying every unvisited starting point
        for (int vertex = 0; vertex < totalVertices; vertex++) {
            if (discoveryTime[vertex] == -1) {
                tarjanDFS(vertex);
            }
        }
        return allSCCs;
    }

    public static void main(String[] args) {
        TarjanSCC graph = new TarjanSCC(5);

        graph.addDirectedEdge(0, 1);
        graph.addDirectedEdge(1, 2);
        graph.addDirectedEdge(2, 0);
        graph.addDirectedEdge(1, 3);
        graph.addDirectedEdge(3, 4);

        List<List<Integer>> sccs = graph.findAllSCCs();

        System.out.println("Tarjan's Algorithm — SCCs found: " + sccs.size());
        for (int i = 0; i < sccs.size(); i++) {
            System.out.println("  SCC " + (i + 1) + ": " + sccs.get(i));
        }

        System.out.println();
        System.out.println("Discovery times: " + Arrays.toString(graph.discoveryTime));
        System.out.println("Final low-links: " + Arrays.toString(graph.lowLink));
        System.out.println();
        System.out.println("Vertices 0,1,2 share low-link=0 (discoveryTime[0]).");
        System.out.println("That confirms they are all in the same SCC.");
    }
}

▶ Output

Tarjan's Algorithm — SCCs found: 3
SCC 1: [4]
SCC 2: [3]
SCC 3: [2, 1, 0]

Discovery times: [0, 1, 2, 3, 4]
Final low-links: [0, 0, 0, 3, 4]

Vertices 0,1,2 share low-link=0 (discoveryTime[0]).
That confirms they are all in the same SCC.

⚠️

Watch Out: The onStack Check Is Non-NegotiableA common bug is updating lowLink[current] = min(lowLink[current], lowLink[neighbour]) even when the neighbour belongs to an already-completed SCC (not on the stack). This incorrectly merges two separate SCCs. Always check onStack[neighbour] before using a neighbour's values for low-link propagation. Using discoveryTime[neighbour] instead of lowLink[neighbour] for back edges is equally critical for correctness at SCC boundaries.

Production Gotchas — Stack Overflow, Iterative Tarjan, and Real-World SCC Bugs

Both recursive implementations above will cause a StackOverflowError on graphs with tens of thousands of vertices in a default JVM. The default Java thread stack is 512KB–1MB, which allows roughly 5,000–10,000 recursive calls before crashing. Production graphs — web crawls, social networks, dependency trees — routinely have millions of vertices.

The fix is to rewrite the recursive DFS iteratively using an explicit stack that stores not just the vertex but also the iterator position (which neighbour are we currently processing). This is the tricky part: a recursive DFS implicitly stores the 'current neighbour index' in the call frame. When you go iterative, you must track that explicitly, typically by storing an iterator or index alongside the vertex on the stack.

A second production concern is input validation. Real-world graph data often has duplicate edges, self-loops, or disconnected components. Self-loops (a vertex pointing to itself) are valid SCCs of size 1 in Tarjan's — the condition lowLink[v] == discoveryTime[v] still holds. Duplicate edges don't break correctness but add wasted iterations; deduplicate adjacency lists on ingestion if performance matters.

A third gotcha is integer overflow in the timer for enormous graphs. A plain int timer maxes out at ~2.1 billion — fine for most cases, but worth noting if you're processing a multi-billion-edge web graph.

TarjanSCCIterative.java · JAVA

import java.util.*;

/**
 * Iterative version of Tarjan's SCC algorithm.
 * Eliminates recursion to avoid StackOverflowError on large graphs.
 *
 * The key challenge: a recursive DFS implicitly tracks 'which neighbour am I
 * currently processing' via the call stack frame. We replicate that with a
 * DfsFrame object that stores the vertex AND a neighbour iterator.
 */
public class TarjanSCCIterative {

    private final int totalVertices;
    private final List<List<Integer>> adjacencyList;

    public TarjanSCCIterative(int totalVertices) {
        this.totalVertices = totalVertices;
        adjacencyList = new ArrayList<>();
        for (int i = 0; i < totalVertices; i++) {
            adjacencyList.add(new ArrayList<>());
        }
    }

    public void addDirectedEdge(int from, int to) {
        adjacencyList.get(from).add(to);
    }

    /**
     * Represents one 'frame' on our simulated call stack.
     * Stores the vertex being processed and an iterator over its neighbours,
     * so we know where to resume when we return from a 'recursive call'.
     */
    private static class DfsFrame {
        final int vertex;
        final Iterator<Integer> neighbourIterator;

        DfsFrame(int vertex, Iterator<Integer> neighbourIterator) {
            this.vertex = vertex;
            this.neighbourIterator = neighbourIterator;
        }
    }

    public List<List<Integer>> findAllSCCs() {
        int[] discoveryTime = new int[totalVertices];
        int[] lowLink = new int[totalVertices];
        boolean[] onStack = new boolean[totalVertices];
        Arrays.fill(discoveryTime, -1);

        Deque<Integer> sccStack = new ArrayDeque<>();   // Tarjan's tracking stack
        Deque<DfsFrame> callStack = new ArrayDeque<>(); // simulated recursion stack
        List<List<Integer>> allSCCs = new ArrayList<>();
        int[] timer = {0}; // array wrapper so we can mutate inside the loop

        for (int startVertex = 0; startVertex < totalVertices; startVertex++) {
            if (discoveryTime[startVertex] != -1) continue; // already visited

            // Simulate the initial 'call' to tarjanDFS(startVertex)
            discoveryTime[startVertex] = lowLink[startVertex] = timer[0]++;
            sccStack.push(startVertex);
            onStack[startVertex] = true;
            callStack.push(new DfsFrame(startVertex,
                    adjacencyList.get(startVertex).iterator()));

            while (!callStack.isEmpty()) {
                DfsFrame frame = callStack.peek(); // peek, not pop — we're 'inside' this frame
                int current = frame.vertex;

                if (frame.neighbourIterator.hasNext()) {
                    // Still have neighbours to process — equivalent to the for-loop body
                    int neighbour = frame.neighbourIterator.next();

                    if (discoveryTime[neighbour] == -1) {
                        // Tree edge: simulate recursive call by pushing a new frame
                        discoveryTime[neighbour] = lowLink[neighbour] = timer[0]++;
                        sccStack.push(neighbour);
                        onStack[neighbour] = true;
                        callStack.push(new DfsFrame(neighbour,
                                adjacencyList.get(neighbour).iterator()));

                    } else if (onStack[neighbour]) {
                        // Back edge: update low-link with neighbour's discovery time
                        lowLink[current] = Math.min(lowLink[current], discoveryTime[neighbour]);
                    }
                    // Cross edge to completed SCC: do nothing

                } else {
                    // No more neighbours — simulate the 'return' from the recursive call
                    callStack.pop();

                    if (!callStack.isEmpty()) {
                        // Propagate low-link upward to the parent frame
                        int parent = callStack.peek().vertex;
                        lowLink[parent] = Math.min(lowLink[parent], lowLink[current]);
                    }

                    // Check if 'current' is the root of an SCC
                    if (lowLink[current] == discoveryTime[current]) {
                        List<Integer> component = new ArrayList<>();
                        int poppedVertex;
                        do {
                            poppedVertex = sccStack.pop();
                            onStack[poppedVertex] = false;
                            component.add(poppedVertex);
                        } while (poppedVertex != current);
                        allSCCs.add(component);
                    }
                }
            }
        }

        return allSCCs;
    }

    public static void main(String[] args) {
        // Test with a larger graph to prove no stack overflow
        int nodeCount = 100_000;
        TarjanSCCIterative graph = new TarjanSCCIterative(nodeCount);

        // One giant cycle across all 100k nodes — should be exactly 1 SCC
        for (int i = 0; i < nodeCount - 1; i++) {
            graph.addDirectedEdge(i, i + 1);
        }
        graph.addDirectedEdge(nodeCount - 1, 0); // close the cycle

        List<List<Integer>> sccs = graph.findAllSCCs();

        System.out.println("Graph with " + nodeCount + " nodes in one giant cycle.");
        System.out.println("SCCs found: " + sccs.size()); // should be 1
        System.out.println("SCC size: " + sccs.get(0).size()); // should be 100000
        System.out.println("No StackOverflowError — iterative approach works at scale.");

        // Also test the original 5-node graph for correctness
        TarjanSCCIterative smallGraph = new TarjanSCCIterative(5);
        smallGraph.addDirectedEdge(0, 1);
        smallGraph.addDirectedEdge(1, 2);
        smallGraph.addDirectedEdge(2, 0);
        smallGraph.addDirectedEdge(1, 3);
        smallGraph.addDirectedEdge(3, 4);

        List<List<Integer>> smallSCCs = smallGraph.findAllSCCs();
        System.out.println("\nSmall graph SCCs: " + smallSCCs.size());
        smallSCCs.forEach(scc -> System.out.println("  " + scc));
    }
}

▶ Output

Graph with 100000 nodes in one giant cycle.
SCCs found: 1
SCC size: 100000
No StackOverflowError — iterative approach works at scale.

Small graph SCCs: 3
[4]
[3]
[2, 1, 0]

⚠️

Pro Tip: Increase Stack Size Only as a Last ResortSome devs respond to StackOverflowError by launching the JVM with -Xss64m to increase the thread stack size. That's a band-aid, not a fix — it just delays the crash on bigger inputs and wastes memory for every thread in the pool. The iterative rewrite above scales to graphs with millions of vertices with zero JVM flag tweaks.

Feature / Aspect	Kosaraju's Algorithm	Tarjan's Algorithm
DFS passes required	Two separate passes	One pass only
Extra memory needed	Full transposed graph (O(V+E) extra)	O(V) for discoveryTime, lowLink, onStack arrays
Conceptual simplicity	Easier to explain and remember	Requires understanding low-link values
Recursive stack depth	Same risk — two DFS traversals	Same risk — one DFS traversal
Output order	SCCs in reverse topological order of condensation	SCCs in reverse topological order of condensation
Handles self-loops	Yes — self-loop vertex is its own SCC	Yes — lowLink[v] == discoveryTime[v] still holds
Handles disconnected graphs	Yes — outer loop covers all components	Yes — outer loop covers all components
Preferred in production	When code clarity is the priority	When memory efficiency matters most
Interview recognition	Very commonly asked — easy to whiteboard	Slightly harder to whiteboard but more impressive

🎯 Key Takeaways

⚠ Common Mistakes to Avoid

✕Mistake 1: Updating lowLink through cross-edges to completed SCCs — Symptom: multiple real SCCs get merged into one giant component — Fix: only update lowLink[current] from a neighbour if onStack[neighbour] is true; ignore neighbours that are visited but no longer on the stack.
✕Mistake 2: Using lowLink[neighbour] instead of discoveryTime[neighbour] for back-edge updates — Symptom: subtle test cases fail where an ancestor's low-link has already been compressed through another path, incorrectly pulling down the current vertex's low-link below the true SCC boundary — Fix: for back edges (neighbour already on stack), always use lowLink[current] = min(lowLink[current], discoveryTime[neighbour]), not lowLink[neighbour].
✕Mistake 3: Forgetting to handle disconnected graphs in the outer loop — Symptom: vertices unreachable from vertex 0 are never visited and don't appear in any SCC — Fix: wrap your DFS in a for-loop that iterates over all vertices and calls DFS on any vertex where discoveryTime[vertex] == -1, guaranteeing every component is discovered regardless of the starting point.

🔥

TheCodeForge Editorial Team Verified Author

Written and reviewed by senior developers with real-world experience across enterprise, startup and open-source projects. Every article on TheCodeForge is written to be clear, accurate and genuinely useful — not just SEO filler.

About Our Team Editorial Standards

Forged with 🔥 at TheCodeForge.io — Where Developers Are Forged