Mid-level 7 min · March 05, 2026

Union-Find - Recursion Causes StackOverflow in Production

Naive Union-Find recursion causes StackOverflowError on deep chains.

N
Naren · Founder
Plain-English first. Then code. Then the interview question.
About
 ● Production Incident 🔎 Debug Guide
Quick Answer
  • Union-Find tracks disjoint connected components with near-constant operations
  • Two operations: find(x) returns root, union(x,y) merges components
  • Path compression flattens trees during find
  • Union by rank keeps tree heights O(log n)
  • Combined: amortized O(alpha(n)) — effectively O(1) for all practical inputs
  • Real use: graph connectivity, Kruskal's MST, percolation, compiler symbol resolution
Plain-English First

Imagine a school with 1,000 students. On day one, everyone is a stranger — each student is their own 'friend group'. As the year goes on, students become friends and their groups merge. Union-Find is a data structure that answers two questions blazing fast: 'Are these two students in the same friend group?' and 'Merge these two groups into one.' That's it. Every graph connectivity problem is secretly this same question at scale.

Distributed systems crash, social networks grow to billions of edges, and compilers resolve symbol dependencies — all of these lean on one elegant primitive: the ability to track which elements belong to the same connected component and merge components together efficiently. Union-Find (also called Disjoint Set Union, or DSU) is the structure that makes this possible in near-constant time, and it shows up everywhere from Kruskal's minimum spanning tree algorithm to LeetCode hard problems to production network topology managers.

The naive solution to connectivity is a hash map or a boolean matrix — mark which nodes are connected and do a BFS every time you're asked a query. That works for toy problems. The moment your graph has millions of nodes and you're doing hundreds of thousands of union and find operations, that approach collapses. Union-Find solves the problem by maintaining a forest of trees where each tree represents one connected component, and two killer optimisations — path compression and union by rank — bring the amortised cost of each operation to effectively O(α(n)), where α is the inverse Ackermann function. For any input you'll encounter in the real world, α(n) ≤ 5. It's as close to O(1) as a non-trivial algorithm gets.

By the end of this article you'll be able to implement a production-grade Union-Find from scratch, explain exactly why path compression and union by rank work together (not just that they do), detect cycles in undirected graphs, build the skeleton of Kruskal's MST, and dodge the three most common bugs that trip up experienced engineers in interviews and on the job.

What is Union-Find (Disjoint Set Union)? — Plain English

Union-Find (also called Disjoint Set Union, DSU) is a data structure that tracks a set of elements partitioned into disjoint groups. It supports two core operations: find (which group does element x belong to?) and union (merge the groups of x and y).

Real-world uses: Kruskal's MST algorithm, detecting cycles in a graph, network connectivity problems (are computers A and B in the same network?), and clustering algorithms.

Naive implementation uses an array where parent[i] = i's direct parent. Find walks up the tree to the root. Two optimizations make it nearly O(1): path compression (flatten the tree during find) and union by rank (attach the smaller tree under the larger).

Production Insight
Without path compression, repeated finds on chain unions create O(n) trees.
A 50M-edge graph with naive DSU can cause minutes of GC pauses and stack overflows.
Rule: always pair both optimisations in any code path handling >10K elements.
Key Takeaway
Union-Find is the hammer for connectivity questions.
Two optimisations — path compression and union by rank — turn linear operations into amortised constant.
Never ship DSU without both; you'll regret it when the graph grows.

How Union-Find (Disjoint Set Union) Works — Step by Step

Initialization: 1. parent[i] = i for all i (each element is its own root). 2. rank[i] = 0 for all i.

Find with path compression (amortized O(alpha(n)) ≈ O(1)): 1. If parent[x] == x: return x (x is the root). 2. parent[x] = find(parent[x]) (path compression: directly connect x to root). 3. Return parent[x].

Union by rank (O(1)): 1. Find root of x: rx = find(x). 2. Find root of y: ry = find(y). 3. If rx == ry: already in same set, return False. 4. Attach smaller rank tree under larger: if rank[rx] < rank[ry]: swap rx,ry. 5. parent[ry] = rx. If rank[rx]==rank[ry]: rank[rx]++. 6. Return True.

With both optimizations: effectively O(alpha(n)) per operation where alpha is the inverse Ackermann function — practically constant.

Production Insight
Recursive find works but risks stack overflow on deep chains (depth > 1000).
Iterative find with while loop avoids stack entirely and is faster.
Union by rank doesn't need to be perfect after path compression — size is more practical for extra queries.
Key Takeaway
Path compression flattens: each node visited becomes a direct child of the root.
Union by rank bounds height: log n even without path compression.
Together they make DSU the most practically constant data structure.

Before and After: Path Compression Effect — Visual Diagram

Path compression is the most impactful optimisation in Union-Find. Without it, a series of unions can create a deep chain — finding the root of the last element means traversing every ancestor. With path compression, each find operation flattens the tree so that subsequent finds on any of those nodes become O(1). The diagram below shows the parent array before and after a single find(3) on a chain of four nodes. Note how nodes 1, 2, and 3 all point directly to root 0 after compression.

Production Insight
Even after partial path compression, the iterative find in production code gradually flattens the tree. The worst-case chain can be flattened in a single find. Always measure parent array depth before and after processing large batches.
Key Takeaway
Path compression turns worst-case O(n) finds into amortised O(α(n)). The diagram proves that one find(3) on a chain of length 3 reduces depth to 1 for all nodes on the path.

Worked Example — Tracing the Algorithm

Elements: 0, 1, 2, 3, 4. Initially each is its own group. parent = [0, 1, 2, 3, 4]

union(0, 1): find(0)=0, find(1)=1. Different groups. parent[1]=0. rank[0]=1. parent = [0, 0, 2, 3, 4]

union(2, 3): find(2)=2, find(3)=3. parent[3]=2. rank[2]=1. parent = [0, 0, 2, 2, 4]

union(0, 2): find(0)=0, find(2)=2. rank[0]=rank[2]=1. Attach 2 under 0. rank[0]=2. parent = [0, 0, 0, 2, 4]

find(3): parent[3]=2. parent[2]=0. Path compress: parent[3]=0. Return 0. parent = [0, 0, 0, 0, 4] (3 now points directly to root 0)

find(1): parent[1]=0. Return 0. find(3)=0. Same group! Group containing {0,1,2,3}: root=0. Group {4}: root=4.

Production Insight
Path compression in find(3) made parent[3] point root 0 directly — next find is O(1).
Without union by rank, union(0,2) could have attached 0 under 2, creating height 2 instead of 1.
Production lesson: trace union order; worst-case chain unions happen when elements are added in increasing index order.
Key Takeaway
Trace one find with path compression to see the flattening effect.
Union by rank prevents tall trees even on adversarial input.
After enough operations, almost all nodes point directly to root.

Implementation

find uses recursive path compression: each node on the path from x to its root is made to point directly to the root. union attaches the lower-rank root under the higher-rank root; equal ranks increment the winner's rank. components tracks the number of disjoint sets and decrements on each successful union. connected simply compares the roots of two nodes.

union_find.pyPYTHON
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank   = [0] * n
        self.components = n

    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])  # path compression
        return self.parent[x]

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry: return False  # already connected
        if self.rank[rx] < self.rank[ry]: rx, ry = ry, rx
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]: self.rank[rx] += 1
        self.components -= 1
        return True

    def connected(self, x, y):
        return self.find(x) == self.find(y)

uf = UnionFind(5)
uf.union(0, 1); uf.union(2, 3); uf.union(0, 2)
print('0 and 3 connected:', uf.connected(0, 3))  # True
print('0 and 4 connected:', uf.connected(0, 4))  # False
print('Components:', uf.components)
Output
0 and 3 connected: True
0 and 4 connected: False
Components: 2
Production Insight
Recursive find is clean but unsafe for large graphs — use iterative version in production.
Track components count: it's a free indicator of partition count; useful for monitoring.
If using union by size instead of rank, componentSize() query becomes free and accurate after path compression.
Key Takeaway
The code is short but powerful.
Always expose connected() and component count for debugging.
Consider iterative find if recursion depth is a concern.

C++ Implementation with Iterative Find

In production C++ code, iterative find is strongly preferred over recursion to avoid stack overflow. The implementation below uses a while loop for path compression and an integer array for rank. Note the use of std::vector for dynamic sizing.

union_find.cppCPP
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
#include <vector>

class UnionFind {
public:
    UnionFind(int n) : parent(n), rank(n, 0), components(n) {
        for (int i = 0; i < n; ++i) parent[i] = i;
    }

    int find(int x) {
        // iterative find with path compression
        int root = x;
        while (parent[root] != root)
            root = parent[root];
        // compress path
        while (x != root) {
            int next = parent[x];
            parent[x] = root;
            x = next;
        }
        return root;
    }

    bool union_sets(int x, int y) {
        int rx = find(x);
        int ry = find(y);
        if (rx == ry) return false;
        if (rank[rx] < rank[ry]) std::swap(rx, ry);
        parent[ry] = rx;
        if (rank[rx] == rank[ry]) rank[rx]++;
        components--;
        return true;
    }

    bool connected(int x, int y) { return find(x) == find(y); }
    int get_components() const { return components; }

private:
    std::vector<int> parent;
    std::vector<int> rank;
    int components;
};
Output
// Example usage
UnionFind uf(5);
uf.union_sets(0,1);
uf.union_sets(2,3);
uf.union_sets(0,2);
std::cout << uf.connected(0,3); // 1 (true)
std::cout << uf.connected(0,4); // 0 (false)
std::cout << uf.get_components(); // 2
Production Insight
Iterative find is mandatory in C++ for large graphs due to limited stack per thread (default ~8MB on Linux, but deep recursion still risks overflow). The two‑pass iterative compression is slightly slower than the recursive one‑pass but completely safe.
Key Takeaway
Always use iterative find in production C++. The two‑path technique (find root, then compress) is straightforward and avoids any recursion risk.

Weighted DSU — Tracking Edge Weights in a Disjoint Set

Standard Union-Find tracks connectivity but not extra information. A weighted DSU variant stores additional data per component, such as sum of edge weights, number of nodes, or aggregated statistics. This is useful when you need to answer queries like “What is the total weight of edges in the component containing node v?” or “Maximum value in the component?”.

The core idea: store an array weight or value per root, and when merging components, combine the values. For example, to track total weight of edges added to each component, increment the root's weight by the edge weight on each union. The query component_weight(x) simply returns weight[find(x)].

weighted_dsu.pyPYTHON
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
class WeightedDSU:
    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n          # component size (union by size)
        self.total = [0] * n         # total weight per component (root only)
        self.components = n

    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])
        return self.parent[x]

    def union(self, x, y, weight=0):
        """Merge components of x and y, adding 'weight' to the merged component's total."""
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            # already connected, but still add weight (if tracking cumulative edge weight)
            self.total[rx] += weight
            return False
        if self.size[rx] < self.size[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx
        self.size[rx] += self.size[ry]
        # combine totals: the old total of ry is already in self.total[ry], add to rx
        self.total[rx] += self.total[ry] + weight
        self.components -= 1
        return True

    def component_total(self, x):
        return self.total[self.find(x)]

# Example: building a weighted graph, track total weight per component
dsu = WeightedDSU(5)
dsu.union(0,1,10)   # edge weight 10
print(dsu.component_total(0))  # 10
dsu.union(2,3,20)
dsu.union(0,2,5)    # merge components with an edge weight 5
print(dsu.component_total(0))  # 10 + 20 + 5 = 35
Output
10
35
Production Insight
Weighted DSU is useful for clustering algorithms (e.g., aggregate similarity scores) and for monitoring total resource consumption per connected group. Ensure the combine operation is commutative and the weight update logic is correct for directed vs undirected edges.
Key Takeaway
With a simple extra array, Union-Find can track sums, max/min, or any aggregatable attribute per component. This extends its utility beyond pure connectivity.

Cycle Detection in an Undirected Graph Using Union-Find

Union-Find provides a clean linear-time algorithm for cycle detection in undirected graphs. Process each edge (u, v): find roots of u and v. If they are the same, a cycle exists (u and v are already connected). Otherwise, union them. If no edge produces same-root after all edges, the graph is acyclic.

Complexity: O(E * α(V)) — near-linear. This is used inside Kruskal's MST to skip edges that would create cycles.

Mental Model: Cycle Detection as Blue vs Red Teams
  • Each node starts on its own team.
  • Union merges two teams into one.
  • When processing an edge, if both nodes are already on the same team, the edge is redundant — cycle found.
  • This is exactly how Kruskal's MST builds a tree: it adds edges only if they connect two different teams.
Production Insight
This algorithm works only for undirected graphs. For directed graphs, use DFS with back-edge detection.
In production, watch for self-loops (edge u->u): they always produce a cycle; your code should handle them.
Large graphs with >10M edges may need iterative find to avoid stack overflow.
Key Takeaway
Cycle detection with Union-Find is elegant and fast.
Use for undirected graphs only.
The same mechanism powers Kruskal's MST — keep it in mind for graph problems.
When to use Union-Find vs DFS for Cycle Detection
IfGraph is undirected, edges are streaming or dynamic
UseUnion-Find — supports incremental edge additions in near-constant time
IfGraph is directed
UseDFS with back-edge detection (Union-Find doesn't work for directed cycles)
IfNeed to detect cycle in a static graph once
UseUnion-Find or DFS — both O(V+E); Union-Find often simpler to implement

Kruskal's Minimum Spanning Tree Algorithm

Kruskal's algorithm finds a minimum spanning tree for a weighted undirected graph: 1. Sort all edges by weight. 2. Initialise Union-Find with V components. 3. For each edge (u, v, w) in sorted order: - If find(u) != find(v): add edge to MST, union(u, v). - If MST size reaches V-1, stop.

Complexity: O(E log E) for sorting + O(E α(V)) for Union-Find. Union-Find makes the cycle detection O(1) per edge.

kruskal_mst.pyPYTHON
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
def kruskal_mst(vertices, edges):
    """
    edges: list of (weight, u, v)
    returns: list of edges in MST, total weight
    """
    edges.sort()  # sort by weight
    uf = UnionFind(vertices)
    mst = []
    total_weight = 0
    for weight, u, v in edges:
        if uf.union(u, v):
            mst.append((u, v, weight))
            total_weight += weight
            if len(mst) == vertices - 1:
                break
    return mst, total_weight
Output
edges = [(1,0,1), (2,1,2), (3,0,2), (4,2,3)]
mst, total = kruskal_mst(4, edges)
print(mst) # [(0,1,1), (1,2,2), (2,3,4)]
print(total) # 7
Production Insight
Sorting edges is the bottleneck — for dense graphs, Prim's algorithm with heap may be faster.
Union-Find ensures the algorithm runs in near-linear time after sorting.
In production, use indexed edges (list of tuples) rather than objects to reduce memory overhead for large graphs.
Key Takeaway
Kruskal + Union-Find = production-grade MST.
Union-Find eliminates the need for visited arrays in cycle detection.
For extremely dense graphs, consider Prim's instead.

Understanding O(α(n)) — The Inverse Ackermann Complexity

When both path compression and union by rank are used, the amortised time per operation is O(α(n)), where α is the inverse Ackermann function. The Ackermann function A(m, n) grows extremely fast — far faster than exponential or tower functions. Its inverse α(n) is the number of times you must apply the logarithm function (base 2) to reduce n to a constant. For all practical values of n, α(n) ≤ 5. In fact, α(n) exceeds 5 only when n > 2^65536, a number astronomically larger than the number of atoms in the observable universe. Thus, for any real‑world dataset, Union‑Find operations run in effectively constant time.

Production Insight
When engineers say ‘DSU is O(1)’, they implicitly mean ‘amortised O(α(n)) with α(n) ≤ 5’. This bound holds only because both optimizations work together. Skipping either one raises the worst‑case to O(log n) or O(n) per operation — a dramatic difference at scale.
Key Takeaway
O(α(n)) is practically constant for any real dataset. The bound relies on the combination of path compression and union by rank.

Practice Problems for Union-Find

Sharpen your Union-Find skills with these carefully chosen coding problems. They cover fundamental connectivity, cycle detection, and advanced applications.

ProblemPlatformDifficultyCore Concept
[Redundant Connection (LeetCode 684)](https://leetcode.com/problems/redundant-connection/)LeetCodeMediumCycle detection in an undirected graph (edge that creates a cycle)
[Number of Islands (LeetCode 200)](https://leetcode.com/problems/number-of-islands/)LeetCodeMediumCounting connected components in a grid (2D DSU)
[Accounts Merge (LeetCode 721)](https://leetcode.com/problems/accounts-merge/)LeetCodeMediumMerging accounts with common emails
[Minimize Malware Spread (LeetCode 924)](https://leetcode.com/problems/minimize-malware-spread/)LeetCodeHardFind the node whose removal minimises infected component size
[The Earliest Moment When Everyone Become Friends (LeetCode 1101)](https://leetcode.com/problems/the-earliest-moment-when-everyone-become-friends/)LeetCodeMediumTime‑based connectivity using DSU
[Graph Connectivity With Threshold (LeetCode 1627)](https://leetcode.com/problems/graph-connectivity-with-threshold/)LeetCodeHardDSU with divisors — connect nodes that share a common divisor above a threshold
[Checking Existence of Edge Length Limited Paths (LeetCode 1697)](https://leetcode.com/problems/checking-existence-of-edge-length-limited-paths/)LeetCodeHardOffline queries with DSU (process edges in sorted order)

Start with LeetCode 684 and 200 to solidify the basics, then move to the hard problems for advanced DSU patterns.

Production Insight
These problems map directly to real scenarios: connectivity in distributed systems (Accounts Merge), network failure impact (Minimize Malware Spread), and dynamic threshold queries (Graph Connectivity With Threshold). Practicing them prepares you for both interviews and production challenges.
Key Takeaway
The 7 problems above cover every DSU pattern from simple connectivity to offline queries. Work through them in order of difficulty for deep mastery.

Advantages and Disadvantages of Union-Find

Union-Find is a powerful and elegant data structure, but it's not a silver bullet. Below we summarize its strengths and limitations.

AdvantageDisadvantage
Near‑constant amortised time per operation (O(α(n)))Only works for undirected graphs — cannot detect cycles in directed graphs
Incremental — supports dynamic edge additionsNo support for edge deletions (requires offline techniques like rollback DSU)
Extremely memory efficient (two arrays of ints)Recursive implementations risk stack overflow on large inputs
Simple to implement and debug (roughly 20 lines)Not suitable for queries like “what is the path between two nodes?” — only connectivity
Ideal for Kruskal's MST, network connectivity, and percolationCannot track most‑recent union time without extra data structures
Easily extended (weighted DSU, union by size, etc.)Slower than adjacency‑list DFS for single static connectivity queries

In summary, Union-Find excels at dynamic connectivity problems in undirected graphs where the primary operations are union and find. For static graphs, a simple BFS/DFS might be faster. For directed graphs or queries involving paths, other data structures are needed.

Production Insight
When evaluating DSU for a production system, check if your graph is undirected, if edges are only added (never removed), and if the main query is connectivity. If yes, DSU is unbeatable. Otherwise, consider alternative approaches like adjacency lists or DFS.
Key Takeaway
Union-Find is a specialized tool: fast, memory‑light, but limited to undirected incremental connectivity. Know when to use it and when to choose something else.
● Production incidentPOST-MORTEMseverity: high

The 40-Second Graph Build That Never Finished

Symptom
Graph connectivity job hangs for 40+ minutes then crashes with StackOverflowError.
Assumption
The Union-Find library from a third-party had both optimisations enabled by default.
Root cause
Naive find used recursion without path compression. On a deep chain of unions, find traversed O(n) nodes, and recursion depth exceeded JVM default stack (typically 1024).
Fix
Switched to iterative find with path compression: while(parent[x] != x) { int p = parent[x]; parent[x] = parent[parent[x]]; x = p; }. Also added union by rank to keep trees flat.
Key lesson
  • Always implement Union-Find with both optimisations in any production code path handling >10K elements.
  • Prefer iterative find when recursion depth could exceed 1,000 — JVM stack limits bite hard.
  • Profile your union-find under realistic data shapes: chain unions (worst case) should be in your test suite.
Production debug guideSymptom → Action for the three most common DSU failures3 entries
Symptom · 01
find(x) returns a node that is not the root (wrong component id)
Fix
Check if path compression is properly implemented. In recursive version, ensure parent[x] = find(parent[x]) is called. In iterative, verify the while loop updates ancestor links. Also check parent array initialisation: all elements must be self-parent at start.
Symptom · 02
union(x,y) returns True but the component count doesn't decrease
Fix
Verify that union returns False only when roots are equal. If ranks are used, ensure the rank increment only happens when both ranks equal. Common bug: rank[rx]++ inside else block instead of post-attach.
Symptom · 03
StackOverflowError during find on large datasets
Fix
Switch from recursive to iterative find. Alternatively increase JVM stack size (-Xss), but that masks the problem. The real fix is path compression and avoiding deep trees via union by rank.
★ Union-Find Quick Debug Cheat SheetUse these commands to diagnose the most common DSU production failures. Assumes a Python implementation. Adapt for other languages.
find returns wrong root after many unions
Immediate action
Print parent array and trace path from suspect node to root manually.
Commands
def debug_find(uf, x): path = []; while uf.parent[x] != x: path.append(x); x = uf.parent[x]; print('Path:', '->'.join(map(str, path))); return x
uf = UnionFind(1000); uf.union(0,1); uf.union(2,3); print([uf.find(i) for i in range(5)])
Fix now
Ensure parent array initialised with list(range(n)). No element should have parent pointing outside [0,n-1].
component count not decrementing+
Immediate action
Log every union call: print(f'union({x},{y}) -> roots {uf.find(x)}, {uf.find(y)}')
Commands
uf = UnionFind(10); log_union = lambda x,y: print(f'Union {x}-{y}: roots {uf.find(x)} and {uf.find(y)} -> {uf.union(x,y)}')
uf = UnionFind(10); log_union(0,1); log_union(1,2); print('Components:', uf.components)
Fix now
Check that union only decrements components when roots differ: if rx != ry: self.components -= 1
Union by Rank vs Union by Size in Path Compression DSU
AspectUnion by RankUnion by Size
What's trackedUpper bound on tree heightExact number of nodes in component
Merge ruleAttach lower-rank root under higher-rank rootAttach smaller component under larger component
Rank/size updateIncrement only when ranks are equalAlways add smaller size to larger
After path compressionRank becomes a proxy (no longer exact height)Size remains accurate — always exact
Extra utilityNone beyond merge decisionscomponentSize() query is free and always correct
Asymptotic boundO(α(n)) amortisedO(α(n)) amortised — identical
Which to preferTheoretical purists, textbooksPractical production code — size is queryable

Key takeaways

1
Union-Find tracks disjoint sets with near O(1) find and union operations.
2
Path compression flattens the tree during find
future finds on same nodes are O(1).
3
Union by rank attaches the shorter tree under the taller
prevents O(n) tree height.
4
Combined, the amortized complexity per operation is O(alpha(n))
the inverse Ackermann function.
5
Used in Kruskal's MST, cycle detection in undirected graphs, and network connectivity.
6
Iterative find avoids stack overflow
use for production systems with large datasets.
7
Union by size gives free componentSize() queries
prefer it for practical applications.

Common mistakes to avoid

4 patterns
×

Using recursion for find without stack limit awareness

Symptom
StackOverflowError on large inputs (e.g., 2 million nodes in a chain). JVM default stack ~1024 frames, and recursive find can exceed that even with path compression if the tree was deep before optimisation.
Fix
Switch to iterative find with while loop. Example: while parent[x] != x: parent[x] = parent[parent[x]]; x = parent[x]; return x. This uses O(1) stack space.
×

Forgetting to initialise parent[i] = i for all i

Symptom
find returns stale or uninitialised values, leading to incorrect connectivity. Some elements may never be found in a component, or union thinks they're already connected when they're not.
Fix
Always initialise parent array with list(range(n)). Never rely on zero-initialisation except for 0-indexed elements. For 1-indexed, set parent[0] unused or adjust indexing.
×

Implementing union without returning boolean (or decrementing component count)

Symptom
Component count becomes out of sync — often shows fewer components than actual partitions. This breaks monitoring and any logic that depends on accurate count.
Fix
Return True/False from union: True if merged (roots different), False if already connected. When True, decrement self.components. Test with a small example to verify count.
×

Using union by rank incorrectly (not checking equal ranks for increment)

Symptom
Tree heights grow faster than needed; worst-case height becomes O(log n) is lost. But more common: rank values become incorrect, causing suboptimal merges.
Fix
Only increment rank[rx] when rank[rx] == rank[ry] after attaching ry under rx. If you always increment, ranks become artificially high and lose their height-bound guarantee.
INTERVIEW PREP · PRACTICE MODE

Interview Questions on This Topic

Q01SENIOR
What are path compression and union by rank, and why do they matter?
Q02SENIOR
How does Union-Find detect cycles in an undirected graph?
Q03SENIOR
What is the amortized time complexity of Union-Find with both optimizati...
Q04SENIOR
Why does union by rank guarantee O(log n) tree height even without path ...
Q05SENIOR
When would you choose union by size over union by rank?
Q01 of 05SENIOR

What are path compression and union by rank, and why do they matter?

ANSWER
Path compression makes each node on the find path point directly to the root during the find operation. Union by rank always attaches the shorter tree under the taller tree, bounding tree height to O(log n). Together they bring the amortised time per operation to O(α(n)), the inverse Ackermann function — practically constant. Without them, Union-Find can degrade to O(n) per operation on chain-like unions.
FAQ · 6 QUESTIONS

Frequently Asked Questions

01
What is path compression and why does it matter?
02
What is union by rank and why not just always attach root A under root B?
03
How is Union-Find used to detect cycles in an undirected graph?
04
What is the inverse Ackermann function alpha(n)?
05
Can Union-Find be used for directed graph cycle detection?
06
Should I use arrays or objects for parent and rank?
🔥

That's Graphs. Mark it forged?

7 min read · try the examples if you haven't

Previous
Topological Sort
8 / 17 · Graphs
Next
Minimum Spanning Tree — Kruskal and Prim