
Strongly Connected Components — Kosaraju's and Tarjan's Algorithms

Learn Strongly Connected Components (SCCs) in directed graphs using Kosaraju's two-pass DFS and Tarjan's single-pass algorithm with low-link values.
🔥 Advanced — solid DSA foundation required
In this tutorial, you'll learn
  • SCCs partition a directed graph into maximal groups where every node can reach every other.
  • Kosaraju's uses two DFS passes: the first computes finish order, the second processes the transposed graph.
  • Both Kosaraju's and Tarjan's algorithms run in O(V+E) time.
Quick Answer

Imagine a group of friends where everyone can pass a message to everyone else — maybe not directly, but through mutual friends. That closed circle is a Strongly Connected Component. Now imagine two separate friend groups at a party who never mingle: those are two different SCCs. Tarjan's and Kosaraju's algorithms are just clever, systematic ways of finding all of those separate circles in the crowd.

Social networks, airline route maps, compiler dependency graphs, deadlock detection in operating systems — every one of these systems hides a graph underneath, and inside that graph lurk tightly knit clusters where every node can reach every other node. Those clusters are Strongly Connected Components (SCCs), and identifying them is one of the most practically useful graph operations you'll ever implement. Miss an SCC and your compiler might mis-order function compilation. Ignore SCCs in a routing system and you'll miss that a subset of cities is completely isolated from the rest of the network.

The core problem SCCs solve is reachability symmetry in directed graphs. In an undirected graph, if A can reach B then B can reach A — trivially. But in a directed graph, edges have direction, so reachability isn't symmetric. You might be able to fly from New York to London but the direct return flight might not exist. SCCs carve a directed graph into maximal groups where every internal pair of nodes can mutually reach each other, giving you a clean decomposition of the graph's cyclic structure.

By the end of this article you'll understand exactly how Kosaraju's two-pass DFS and Tarjan's single-pass low-link algorithm work step by step, you'll know which one to reach for in production and why, you'll have complete runnable Python code, and you'll be ready to handle the tricky follow-up questions interviewers throw at candidates who only memorised the surface-level answer.

What are Strongly Connected Components? — Plain English

A Strongly Connected Component (SCC) is a maximal group of nodes in a directed graph where every node can reach every other node in the group. Think of it like social circles where everyone in the circle can directly or indirectly message everyone else. For example, in a graph where A→B→C→A (a triangle), all three form one SCC. If D→A but A cannot reach D, then D is its own SCC. SCCs partition the graph into groups. Contracting each SCC to a single super-node produces a DAG (Directed Acyclic Graph) — the condensation graph. Applications: compiler analysis (detecting circular dependencies), social network analysis (finding tightly-knit communities), and internet page ranking.
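Before running any algorithm, the definition itself can be checked directly: two nodes share an SCC exactly when each can reach the other. A brute-force sketch of that check (the helper names `reaches` and `same_scc` are mine, not a library API), using the same 5-node graph as the worked example later in this article:

```python
from collections import defaultdict

def reaches(adj, src, dst):
    """Iterative DFS reachability test: can src reach dst?"""
    seen, stack = set(), [src]
    while stack:
        v = stack.pop()
        if v == dst:
            return True
        if v in seen:
            continue
        seen.add(v)
        stack.extend(adj[v])
    return False

def same_scc(adj, a, b):
    """a and b are in the same SCC iff reachability holds both ways."""
    return reaches(adj, a, b) and reaches(adj, b, a)

adj = defaultdict(list)
for u, v in [(0, 1), (1, 2), (2, 0), (1, 3), (3, 4)]:
    adj[u].append(v)

print(same_scc(adj, 0, 2))  # True:  0 and 2 sit on the cycle 0→1→2→0
print(same_scc(adj, 1, 3))  # False: 1 reaches 3, but 3 cannot get back
```

This pairwise check is O(V+E) per pair, so finding all SCCs this way would be far slower than Kosaraju's or Tarjan's, which decompose the whole graph in a single O(V+E) budget.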

How Kosaraju's Algorithm Works — Step by Step

Pass 1 — Find finish order: 1. Run DFS on the original graph; when a node fully finishes (all its descendants explored), push it onto a stack. 2. When this pass ends, the node on top of the stack is the one that finished last, and it always belongs to a source SCC of the original graph.

Pass 2 — Process in reverse finish order on reversed graph: 3. Reverse all edges in the graph (transpose). 4. Pop nodes from the stack one by one. 5. If the node is unvisited in the reversed graph, run DFS from it on the reversed graph. 6. All nodes visited in this DFS form one SCC. 7. Repeat until the stack is empty.

Why it works: popping in reverse finish order means each pass-2 DFS starts inside a source SCC of the original graph, which is a sink SCC of the transposed graph. A sink SCC has no edges leaving it in the transposed graph, so the DFS reaches exactly the nodes in that SCC and no others; marking them visited effectively removes that SCC and exposes the next sink.

Worked Example — Kosaraju's on a 5-Node Graph

Directed graph: 0→1, 1→2, 2→0, 1→3, 3→4. SCCs should be: {0,1,2}, {3}, {4}.

Pass 1 — DFS from node 0 (recursing only into unvisited nodes)
  • DFS(0): go to 1, go to 2, try 0 (already visited), finish 2 (push 2), back at 1 go to 3, go to 4, finish 4 (push 4), finish 3 (push 3), finish 1 (push 1), finish 0 (push 0).
  • Stack (bottom to top): [2, 4, 3, 1, 0].

Pass 2 on the reversed graph (edges: 1→0, 2→1, 0→2, 3→1, 4→3):
  • Pop 0: DFS on reversed graph. 0→2→1→0 (visited). Visits {0,1,2} → SCC #1.
  • Pop 1: already visited.
  • Pop 3: DFS on reversed graph. 3→1 (visited). Visits {3} → SCC #2.
  • Pop 4: DFS on reversed graph. 4→3 (visited). Visits {4} → SCC #3.
  • Result: [{0,1,2}, {3}, {4}].

Implementation

Kosaraju's runs two DFS passes: the first on the original graph to collect finish order, the second on the transposed graph in reverse finish order. Each SCC is the set of nodes visited in one DFS of pass 2. Tarjan's algorithm finds SCCs in a single DFS pass using a stack and 'low-link' values — more complex but only one traversal. Both algorithms run in O(V+E) time.

kosaraju_scc.py · PYTHON
from collections import defaultdict

def kosaraju_scc(num_v, adj):
    """Returns list of SCCs (each SCC is a list of node indices)."""
    # Pass 1: DFS to get finish order
    visited = [False] * num_v
    finish_stack = []

    def dfs1(v):
        visited[v] = True
        for nb in adj[v]:
            if not visited[nb]:
                dfs1(nb)
        finish_stack.append(v)

    for v in range(num_v):
        if not visited[v]:
            dfs1(v)

    # Build transposed graph
    radj = defaultdict(list)
    for v in range(num_v):
        for nb in adj[v]:
            radj[nb].append(v)

    # Pass 2: DFS on transposed in reverse finish order
    visited2 = [False] * num_v
    sccs = []

    def dfs2(v, component):
        visited2[v] = True
        component.append(v)
        for nb in radj[v]:
            if not visited2[nb]:
                dfs2(nb, component)

    while finish_stack:
        v = finish_stack.pop()
        if not visited2[v]:
            comp = []
            dfs2(v, comp)
            sccs.append(comp)

    return sccs

# Graph: 0->1->2->0 (SCC1), 1->3->4 (separate SCCs)
adj = defaultdict(list)
for u, v in [(0,1),(1,2),(2,0),(1,3),(3,4)]:
    adj[u].append(v)

result = kosaraju_scc(5, adj)
for scc in result:
    print(sorted(scc))
▶ Output
[0, 1, 2]
[3]
[4]
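The listing above covers Kosaraju's; the article describes Tarjan's single-pass algorithm but doesn't list it, so here is one possible Python sketch following the low-link rules discussed in this article (variable names are mine):

```python
from collections import defaultdict

def tarjan_scc(num_v, adj):
    """Single-pass Tarjan's SCC. Returns list of SCCs (lists of node indices)."""
    UNVISITED = -1
    disc = [UNVISITED] * num_v   # discovery time of each vertex
    low = [0] * num_v            # low-link values
    on_stack = [False] * num_v
    stack, sccs, timer = [], [], [0]

    def dfs(v):
        disc[v] = low[v] = timer[0]
        timer[0] += 1
        stack.append(v)
        on_stack[v] = True
        for nb in adj[v]:
            if disc[nb] == UNVISITED:
                dfs(nb)                          # tree edge: take child's low-link
                low[v] = min(low[v], low[nb])
            elif on_stack[nb]:
                low[v] = min(low[v], disc[nb])   # back edge: use discovery time
            # neighbours visited but off the stack belong to finished SCCs: ignore
        if low[v] == disc[v]:                    # v is the root of an SCC
            comp = []
            while True:
                w = stack.pop()
                on_stack[w] = False
                comp.append(w)
                if w == v:
                    break
            sccs.append(comp)

    for v in range(num_v):                       # outer loop handles disconnected graphs
        if disc[v] == UNVISITED:
            dfs(v)
    return sccs

# Same example graph as the Kosaraju listing
adj = defaultdict(list)
for u, v in [(0, 1), (1, 2), (2, 0), (1, 3), (3, 4)]:
    adj[u].append(v)

for scc in tarjan_scc(5, adj):
    print(sorted(scc))
```

On this graph it prints [4], then [3], then [0, 1, 2]: Tarjan's emits sink SCCs first (reverse topological order of the condensation), the opposite order from the Kosaraju listing above.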
Feature / Aspect | Kosaraju's Algorithm | Tarjan's Algorithm
DFS passes required | Two separate passes | One pass only
Extra memory needed | Full transposed graph (O(V+E) extra) | O(V) for discoveryTime, lowLink, onStack arrays
Conceptual simplicity | Easier to explain and remember | Requires understanding low-link values
Recursive stack depth | Same risk — two DFS traversals | Same risk — one DFS traversal
Output order | SCCs in topological order of the condensation | SCCs in reverse topological order of the condensation
Handles self-loops | Yes — a self-loop vertex is its own SCC | Yes — lowLink[v] == discoveryTime[v] still holds
Handles disconnected graphs | Yes — outer loop covers all components | Yes — outer loop covers all components
Preferred in production | When code clarity is the priority | When memory efficiency matters most
Interview recognition | Very commonly asked — easy to whiteboard | Slightly harder to whiteboard but more impressive

🎯 Key Takeaways

  • SCCs partition a directed graph into maximal groups where every node can reach every other.
  • Kosaraju's uses two DFS passes: the first computes finish order, the second processes the transposed graph.
  • Both Kosaraju's and Tarjan's algorithms run in O(V+E) time.

⚠ Common Mistakes to Avoid

  • Updating lowLink through cross-edges into completed SCCs.
    Symptom: multiple real SCCs get merged into one giant component.
    Fix: only update lowLink[current] from a neighbour if onStack[neighbour] is true; ignore neighbours that are visited but no longer on the stack.

  • Using lowLink[neighbour] instead of discoveryTime[neighbour] for back-edge updates.
    Symptom: the low-link values stop matching their textbook definition. For plain SCC detection the components usually still come out correct, but the same habit silently breaks related algorithms (bridge and articulation-point finding) that reuse this pattern.
    Fix: for back edges (neighbour already on the stack), use lowLink[current] = min(lowLink[current], discoveryTime[neighbour]), not lowLink[neighbour].

  • Forgetting to handle disconnected graphs in the outer loop.
    Symptom: vertices unreachable from vertex 0 are never visited and don't appear in any SCC.
    Fix: wrap your DFS in a for-loop that iterates over all vertices and calls DFS on any vertex where discoveryTime[vertex] == -1, guaranteeing every component is discovered regardless of the starting point.

Interview Questions on This Topic

  • Q: What is a strongly connected component?
  • Q: Describe Kosaraju's algorithm.
  • Q: What is the condensation graph?

Frequently Asked Questions

Why does Kosaraju's algorithm need two DFS passes?

The first pass computes the finish order, which reveals the SCC structure. The node that finishes last must be in a 'source SCC' (one with no incoming edges from other SCCs). Processing the transposed graph in reverse finish order starts from sink SCCs, and each DFS reaches exactly one SCC because edges into that SCC from other SCCs are now outgoing in the transposed graph.
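One way to convince yourself the finish order matters: rerun pass 2 with the stack read in the wrong direction and watch separate SCCs merge. A sketch on the worked example's graph (the helper names `collect` and `run` are mine):

```python
from collections import defaultdict

edges = [(0, 1), (1, 2), (2, 0), (1, 3), (3, 4)]
radj = defaultdict(list)                 # transposed graph
for u, v in edges:
    radj[v].append(u)

def collect(start, visited):
    """DFS on the transposed graph; returns all newly visited nodes."""
    comp, stack = [], [start]
    while stack:
        v = stack.pop()
        if v in visited:
            continue
        visited.add(v)
        comp.append(v)
        stack.extend(radj[v])
    return comp

def run(order):
    """Simulate pass 2 with a given processing order."""
    visited, sccs = set(), []
    for v in order:
        if v not in visited:
            sccs.append(sorted(collect(v, visited)))
    return sccs

# Finish stack from the worked example, bottom to top: [2, 4, 3, 1, 0]
print(run([0, 1, 3, 4, 2]))   # correct: stack popped top-down
print(run([2, 4, 3, 1, 0]))   # wrong:  same stack read bottom-up
```

The correct order recovers [[0, 1, 2], [3], [4]], but the wrong order produces [[0, 1, 2], [3, 4]]: starting from 4's side of the transposed graph spills into 3, wrongly fusing two SCCs.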

What is the difference between Kosaraju's and Tarjan's algorithms?

Kosaraju's uses two passes and is easier to understand and implement. Tarjan's uses a single DFS pass with a stack and 'low-link' values (the lowest discovery time reachable from a subtree). Tarjan's is more efficient in practice (one pass) but harder to implement correctly. Both are O(V+E).

What is the condensation graph?

After finding all SCCs, contract each SCC into a single super-node. Edges between super-nodes represent edges between different SCCs in the original graph. The condensation graph is always a DAG — no cycles, because a cycle would merge those SCCs into one. The condensation reveals the high-level structure of the directed graph.
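Given the SCC list from either algorithm, building the condensation takes only a few lines. A sketch (variable names are mine) using the worked example's components:

```python
from collections import defaultdict

edges = [(0, 1), (1, 2), (2, 0), (1, 3), (3, 4)]
# SCCs for this graph, as found earlier in the article
sccs = [[0, 1, 2], [3], [4]]

comp_id = {}                     # node -> index of its SCC (the super-node id)
for i, scc in enumerate(sccs):
    for v in scc:
        comp_id[v] = i

cond = defaultdict(set)          # condensation: super-node -> set of super-nodes
for u, v in edges:
    if comp_id[u] != comp_id[v]:             # skip edges inside a single SCC
        cond[comp_id[u]].add(comp_id[v])

print(dict(cond))                # {0: {1}, 1: {2}}: a DAG, SCC0 -> SCC1 -> SCC2
```

Deduplicating with a set matters because several original edges can run between the same pair of SCCs; the result is guaranteed acyclic, since any cycle between super-nodes would have merged them into one SCC.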

Naren, Founder & Author

Developer and founder of TheCodeForge. I built this site because I was tired of tutorials that explain what to type without explaining why it works. Every article here is written to make concepts actually click.
