Mid-level 8 min · March 05, 2026

Binary Lifting LCA Depth Overflow Breaks Friend Suggestions

32-bit depth overflow in LCA binary lifting made all friend suggestions root user.

N
Naren · Founder
Plain-English first. Then code. Then the interview question.
About
 ● Production Incident 🔎 Debug Guide
Quick Answer
  • LCA finds the deepest node that is an ancestor of both u and v in a rooted tree.
  • Naive approach compares root-to-node paths: O(n) per query in skewed trees.
  • Binary lifting precomputes 2^j-th ancestors: O(n log n) prep, O(log n) per query.
  • Euler tour + RMQ gives O(1) per query for static trees with O(n log n) prep.
  • Performance trade-off: binary lifting supports tree changes (parent updates) trivially; Euler tour does not.
  • Production pitfall: stack overflow from recursive DFS in deep trees – use iterative BFS.
Plain-English First

Imagine a big family reunion. You and your cousin are trying to figure out which ancestor you both share who is closest to both of you — not your great-great-grandparent from 1820, but the most recent one. That shared grandparent is the Lowest Common Ancestor. In a tree of nodes, it's simply the deepest node that sits above both of the nodes you're asking about. That's it.

Every time a file system resolves a relative path, every time Git finds the merge-base between two branches, and every time a compiler walks an abstract syntax tree to find scope boundaries — something is quietly computing a Lowest Common Ancestor. LCA isn't an academic curiosity. It's a foundational primitive that powers real systems at scale, and understanding it at depth separates engineers who can solve tree problems on a whiteboard from those who can actually ship the code in production.

The core problem is deceptively simple: given a rooted tree and two nodes u and v, find the deepest node that is an ancestor of both. The naive solution — walk up from both nodes until the paths converge — is O(n) per query and falls apart the moment you have a million-node tree with ten thousand queries per second. That's where the real engineering begins: how do you preprocess the tree so each query becomes O(log n) or even O(1)?

By the end of this article you'll understand three battle-tested approaches to LCA — recursive naive, Euler tour with Sparse Table RMQ, and Binary Lifting — with full runnable Java code for each. You'll know exactly when to reach for each approach, what the preprocessing costs are, and the edge cases that trip up even experienced engineers. Let's dig in.

What is Lowest Common Ancestor (LCA)? — Plain English

The Lowest Common Ancestor (LCA) of two nodes u and v in a tree is the deepest node that is an ancestor of both u and v. Every node is an ancestor of itself, so if u is an ancestor of v, then LCA(u,v) = u.

Real-world use: in a company org chart, LCA of two employees is their common manager. In a file system, LCA of two file paths is their deepest common directory. In computational biology, LCA in an evolutionary tree represents the most recent common ancestor of two species.

The naive approach (walk both nodes up to the root, find the first shared ancestor) takes O(depth) per query. For a balanced tree that is O(log n), but O(n) for a skewed tree. Binary lifting precomputes ancestor pointers so each query runs in O(log n) regardless of tree shape.

Production Insight
In production, don't use recursion for DFS – it blows the stack on deep trees.
Use iterative BFS to compute depths and parents – it's O(n) and only uses heap memory.
Rule: recursion is for balanced trees only; for production trees of unknown depth, always default to iterative.
Key Takeaway
LCA = deepest common ancestor.
Naive is O(n) per query – acceptable only for small trees.
Real systems need O(log n) or O(1) per query.

How Lowest Common Ancestor (LCA) Works — Step by Step

Naive approach (DFS / path comparison): 1. Find the path from root to u. Find the path from root to v. 2. Walk both paths from the root until they diverge. The last common node is the LCA.

Binary lifting (O(n log n) preprocessing, O(log n) per query): 1. Root the tree. Run a DFS to compute depth[v] and parent[v][0] (direct parent) for all nodes. 2. Compute parent[v][j] = parent[parent[v][j-1]][j-1]: the 2^j-th ancestor of v. 3. To find LCA(u, v): a. If depth[u] < depth[v], swap them so u is deeper. b. Lift u up by (depth[u] - depth[v]) steps to equalize depths. c. If u == v, return u (v is an ancestor of u or vice versa). d. Binary-lift both u and v simultaneously: for j from LOG-1 down to 0, if parent[u][j] != parent[v][j], set u = parent[u][j], v = parent[v][j]. e. Return parent[u][0] (the direct parent of both).

Think of binary lifting as a 'binary elevator'
  • Each entry parent[j][v] is the 2^j-th ancestor – like a lift button that jumps in powers of two.
  • To level up: decompose the depth difference into binary bits and apply each jump.
  • To find the meeting point: climb together from the highest power downwards, but only when your next jumps don't land in the same node – you want to stop right below the LCA.
  • The final step: after those parallel jumps, both nodes share the same direct parent – that's the LCA.
Production Insight
The binary lifting table is expensive to rebuild after tree changes – O(n log n).
If your tree changes frequently (e.g., dynamic tree in game physics), use link-cut trees instead.
Rule: binary lifting is for static or infrequently mutated trees. For dynamic trees, look at heavy-light decomposition.
Key Takeaway
Naive = O(n) per query, binary lifting = O(log n) per query after prep.
The 'elevator mental model' helps design the algorithm from scratch.
Always use iterative BFS for depth computation in production.
Which LCA algorithm should you use?
IfTree is small (< 1000 nodes), few queries (< 10)
UseUse naive path comparison – it's simple and fast enough.
IfTree is large, many queries, tree is static
UsePrefer binary lifting for simplicity (code) or Euler tour + RMQ for speed (O(1) queries).
IfTree is large but changes frequently
UseUse binary lifting only if you can afford O(n log n) rebuild per change; otherwise consider heavy-light decomposition or link-cut trees.
IfNeed O(1) queries and preprocessing is acceptable
UseEuler tour + Sparse Table RMQ – but can't handle tree modifications.

Binary Lifting Deep Dive — The 2^j-th Ancestor Table

The core idea of binary lifting is to precompute, for every node v, its ancestors at distances that are powers of two: parent[v][0] (parent), parent[v][1] (grandparent), parent[v][2] (great-great-grandparent), etc. This is done by dynamic programming: once you know parent[v][k], then parent[v][k+1] = parent[ parent[v][k] ][k].

Why does this work? Because ancestors are transitive: the 2^(k+1)-th ancestor of v is the 2^k-th ancestor of the 2^k-th ancestor of v. So you can fill the table in O(n log n) using sparse DP.

In the query, you first lift the deeper node to the same depth by decomposing the difference into binary bits. Then, from the highest power down to zero, you check if the 2^j-th ancestors of u and v are different. If they are, you know the LCA is deeper than those nodes, so you climb both up. Finally, the LCA is the direct parent of either node.

This technique is similar to binary search on powers – you're essentially jumping up the tree in exponentially shrinking steps until you land just below the common ancestor.

io/thecodeforge/tree/LcaBinaryLifting.javaJAVA
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
package io.thecodeforge.tree;

import java.util.*;

public class LcaBinaryLifting {
    private int n;
    private int LOG;
    private int[][] parent; // parent[j][v] = 2^j-th ancestor
    private int[] depth;

    public LcaBinaryLifting(int n, List<int[]> edges, int root) {
        this.n = n;
        this.LOG = (int)(Math.log(n) / Math.log(2)) + 1;
        parent = new int[LOG][n + 1];
        depth = new int[n + 1];
        build(edges, root);
    }

    private void build(List<int[]> edges, int root) {
        // Build adjacency
        List<Integer>[] graph = new ArrayList[n + 1];
        for (int i = 1; i <= n; i++) graph[i] = new ArrayList<>();
        for (int[] e : edges) {
            int u = e[0], v = e[1];
            graph[u].add(v);
            graph[v].add(u);
        }

        // BFS from root to fill depth and parent[0]
        Queue<Integer> q = new ArrayDeque<>();
        boolean[] visited = new boolean[n + 1];
        q.offer(root);
        visited[root] = true;
        parent[0][root] = 0; // root's parent is 0 (sentinel)
        depth[root] = 0;
        while (!q.isEmpty()) {
            int u = q.poll();
            for (int v : graph[u]) {
                if (!visited[v]) {
                    visited[v] = true;
                    depth[v] = depth[u] + 1;
                    parent[0][v] = u;
                    q.offer(v);
                }
            }
        }

        // Build binary lifting table: parent[j][v] = 2^j-th ancestor
        for (int j = 1; j < LOG; j++) {
            for (int v = 1; v <= n; v++) {
                int mid = parent[j - 1][v];
                parent[j][v] = (mid == 0) ? 0 : parent[j - 1][mid];
            }
        }
    }

    public int query(int u, int v) {
        if (depth[u] < depth[v]) { int tmp = u; u = v; v = tmp; }
        // Lift u to same depth as v
        int diff = depth[u] - depth[v];
        for (int j = 0; j < LOG; j++) {
            if ((diff >> j & 1) == 1) {
                u = parent[j][u];
            }
        }
        if (u == v) return u;
        // Lift both until parents meet
        for (int j = LOG - 1; j >= 0; j--) {
            if (parent[j][u] != parent[j][v]) {
                u = parent[j][u];
                v = parent[j][v];
            }
        }
        return parent[0][u];
    }
}
Output
// Example usage:
// LcaBinaryLifting lca = new LcaBinaryLifting(n, edges, 1);
// System.out.println(lca.query(4, 7)); // prints 2
Production Insight
The table's memory is O(n log n) – for n=10^6 and LOG=20, it's ~80 MB (int[20][10^6+1]).
On memory-constrained systems (e.g., mobile, embedded), consider the Euler tour approach which uses O(n) memory.
Rule: if you have <10^5 nodes, binary lifting fits comfortably; above that, profile memory usage.
Key Takeaway
parent[k][v] = 2^k-th ancestor, built via DP: parent[k+1][v] = parent[k][ parent[k][v] ].
Query lifts deeper node by binary digits, then parallel climbs from high to low.
Memory O(n log n) – ensure it fits before deploying.

C++ Implementation of Binary Lifting LCA

While Java is common in enterprise backends, C++ dominates competitive programming and systems programming where LCA queries are frequent. The logic is identical: build an adjacency list, run BFS (or iterative DFS) to fill depths and parent[0], then populate the binary lifting table. The query uses the same bit decomposition and parallel climbing. Here's a complete, production-ready C++ implementation that uses iterative BFS for safety on deep trees.

lca_binary_lifting.cppCPP
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
#include <bits/stdc++.h>
using namespace std;

class LCA {
    int n, LOG;
    vector<vector<int>> parent;
    vector<int> depth;
public:
    LCA(int n, vector<vector<int>>& adj, int root) : n(n) {
        LOG = ceil(log2(n)) + 1;
        parent.assign(LOG, vector<int>(n + 1, 0));
        depth.assign(n + 1, 0);
        build(adj, root);
    }

    void build(vector<vector<int>>& adj, int root) {
        queue<int> q;
        vector<bool> visited(n + 1, false);
        q.push(root);
        visited[root] = true;
        parent[0][root] = 0;
        depth[root] = 0;
        while (!q.empty()) {
            int u = q.front(); q.pop();
            for (int v : adj[u]) {
                if (!visited[v]) {
                    visited[v] = true;
                    depth[v] = depth[u] + 1;
                    parent[0][v] = u;
                    q.push(v);
                }
            }
        }
        for (int j = 1; j < LOG; ++j) {
            for (int v = 1; v <= n; ++v) {
                int mid = parent[j - 1][v];
                parent[j][v] = (mid == 0) ? 0 : parent[j - 1][mid];
            }
        }
    }

    int query(int u, int v) {
        if (depth[u] < depth[v]) swap(u, v);
        int diff = depth[u] - depth[v];
        for (int j = 0; j < LOG; ++j) {
            if (diff & (1 << j)) u = parent[j][u];
        }
        if (u == v) return u;
        for (int j = LOG - 1; j >= 0; --j) {
            if (parent[j][u] != parent[j][v]) {
                u = parent[j][u];
                v = parent[j][v];
            }
        }
        return parent[0][u];
    }
};
Output
// Usage:
// LCA lca(n, adj, root);
// cout << lca.query(4, 7); // prints 2
C++ integer types
In C++, depth is an int. If depth can exceed 2^31, use long long for depth and parent arrays. The rest of the algorithm stays the same.
Production Insight
C++ code often runs in resource-constrained environments (e.g., embedded, game engines). The vector of vectors for parent might cause memory fragmentation; consider a single flat array (int parent = new int[LOG (n+1)]) for better cache locality.
Key Takeaway
C++ implementation mirrors Java but uses vector<vector<int>>. Same BFS and DP. Always use 64-bit depth if tree depth is unknown.

Worked Example — Tracing the Algorithm

Tree: 1 / \ 2 3 / \ 4 5 / \ 6 7

Depths: 1->0, 2->1, 3->1, 4->2, 5->2, 6->3, 7->3

Query: LCA(4, 7) using naive path method: Path to 4: [1, 2, 4] Path to 7: [1, 2, 5, 7] Walk together: 1 matches 1. 2 matches 2. 4 != 5. Last match = 2. LCA(4, 7) = 2.

Query: LCA(6, 3): Path to 6: [1, 2, 5, 6]. Path to 3: [1, 3]. Step 1: depth[6]=3 > depth[3]=1. Lift 6 by 2 steps: 6->5->2. Step 2: depth equalized at 1. 2 != 3. Step 3: lift both by 1: 2->1, 3->1. Now equal. LCA = 1. LCA(6, 3) = 1.

Production Insight
Always test with asymmetric queries – one node is ancestor of another – to catch off-by-one errors.
Rule: if lifting the deeper node results in u == v, you've found the LCA immediately.
Key Takeaway
Trace with a simple tree to internalize the lifting steps.
The depth difference decomposition bit by bit is a common pattern – also used in Fenwick tree updates.

Using tin and tout Times for O(1) Ancestor Checks

During a DFS traversal, we can assign entry time (tin) and exit time (tout) to each node. A node u is an ancestor of node v if and only if tin[u] <= tin[v] and tout[u] >= tout[v]. This is an O(1) check after a simple O(n) DFS.

This technique is useful not only for LCA but also for subtree queries. For binary lifting LCA, we can use is_ancestor() to handle the case where one node is an ancestor of the other quickly, avoiding the depth-equalization steps.

Here's how to augment the binary lifting preprocessing with tin/tout:

  • During the BFS (or DFS) that fills depth and parent[0], also record tin and tout using a timer that increments on entry and exit of each node. For BFS, we need a DFS-like traversal to get correct tin/tout ordering; so we use an iterative DFS (stack) to avoid recursion. Alternatively, we can run a separate DFS.
  • The is_ancestor(u, v) function returns true if u is ancestor of v.
  • In the LCA query, first check if u is ancestor of v (return u) or v is ancestor of u (return v). This is a fast short-circuit.

This pattern is commonly used in competitive programming to optimize queries when one node is frequently an ancestor of the other.

io/thecodeforge/tree/LcaWithTinTout.javaJAVA
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
package io.thecodeforge.tree;

import java.util.*;

public class LcaWithTinTout {
    private int n, LOG, timer;
    private int[][] parent;
    private int[] depth, tin, tout;

    public LcaWithTinTout(int n, List<int[]> edges, int root) {
        this.n = n;
        this.LOG = (int)(Math.log(n) / Math.log(2)) + 1;
        parent = new int[LOG][n + 1];
        depth = new int[n + 1];
        tin = new int[n + 1];
        tout = new int[n + 1];
        timer = 0;
        build(edges, root);
    }

    private void build(List<int[]> edges, int root) {
        List<Integer>[] graph = new ArrayList[n + 1];
        for (int i = 1; i <= n; i++) graph[i] = new ArrayList<>();
        for (int[] e : edges) {
            int u = e[0], v = e[1];
            graph[u].add(v);
            graph[v].add(u);
        }

        // Iterative DFS for tin/tout
        Deque<int[]> stack = new ArrayDeque<>(); // [node, state, parent]
        stack.push(new int[]{root, 0, 0});
        while (!stack.isEmpty()) {
            int[] curr = stack.pop();
            int u = curr[0], state = curr[1], par = curr[2];
            if (state == 0) { // entering
                tin[u] = ++timer;
                depth[u] = (par == 0) ? 0 : depth[par] + 1;
                parent[0][u] = par;
                stack.push(new int[]{u, 1, par});
                for (int v : graph[u]) {
                    if (v == par) continue;
                    stack.push(new int[]{v, 0, u});
                }
            } else { // exiting
                tout[u] = timer;
            }
        }

        // Build binary lifting table
        for (int j = 1; j < LOG; j++) {
            for (int v = 1; v <= n; v++) {
                int mid = parent[j - 1][v];
                parent[j][v] = (mid == 0) ? 0 : parent[j - 1][mid];
            }
        }
    }

    public boolean isAncestor(int u, int v) {
        return tin[u] <= tin[v] && tout[u] >= tout[v];
    }

    public int query(int u, int v) {
        if (isAncestor(u, v)) return u;
        if (isAncestor(v, u)) return v;
        if (depth[u] < depth[v]) { int tmp = u; u = v; v = tmp; }
        int diff = depth[u] - depth[v];
        for (int j = 0; j < LOG; j++) {
            if ((diff >> j & 1) == 1) u = parent[j][u];
        }
        if (u == v) return u;
        for (int j = LOG - 1; j >= 0; j--) {
            if (parent[j][u] != parent[j][v]) {
                u = parent[j][u];
                v = parent[j][v];
            }
        }
        return parent[0][u];
    }
}
Output
// LcaWithTinTout lca = new LcaWithTinTout(n, edges, 1);
// System.out.println(lca.query(4, 7)); // 2
tin/tout without recursion
The iterative DFS using a stack with state preserves ordering. Each node is pushed twice: once on entry (state=0) and once on exit (state=1). This avoids recursion stack overflow.
Production Insight
Using tin/tout for is_ancestor() is a cheap optimization. In production, if your query stream often includes ancestor-descendant pairs, this saves the depth equalization steps. However, be mindful that the DFS for tin/tout must traverse every edge, same as BFS – the overhead is negligible.
Key Takeaway
tin[u] <= tin[v] && tout[u] >= tout[v] is an O(1) ancestor check. Combine with binary lifting for a clean, fast LCA.

Euler Tour + RMQ: O(1) LCA Queries

The Euler tour technique transforms LCA into a Range Minimum Query (RMQ) problem on depths. Perform a DFS (iterative to avoid recursion) and record each node every time we visit it (entering and exiting). This produces an array euler[] of length 2n-1. Also record the index of the first occurrence of each node in euler. To query LCA(u, v), find the positions first[u] and first[v], then query the node with minimum depth in the segment between them (inclusive). Using a Sparse Table for RMQ gives O(1) query time after O(n log n) preprocessing.

This is the fastest known static LCA method. It does not support tree modifications, but for a large static tree with millions of queries, it's the best choice.

Here's a complete Java implementation using iterative DFS and a Sparse Table for RMQ.

io/thecodeforge/tree/LcaEulerRMQ.javaJAVA
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
package io.thecodeforge.tree;

import java.util.*;

public class LcaEulerRMQ {
    private int n, LOG, timer;
    private int[] euler, depthArr, first;
    private int[][] st; // sparse table for RMQ

    public LcaEulerRMQ(int n, List<int[]> edges, int root) {
        this.n = n;
        euler = new int[2 * n - 1];
        depthArr = new int[2 * n - 1];
        first = new int[n + 1];
        Arrays.fill(first, -1);
        timer = 0;
        buildEuler(edges, root);
        buildSparseTable();
    }

    private void buildEuler(List<int[]> edges, int root) {
        List<Integer>[] graph = new ArrayList[n + 1];
        for (int i = 1; i <= n; i++) graph[i] = new ArrayList<>();
        for (int[] e : edges) {
            int u = e[0], v = e[1];
            graph[u].add(v);
            graph[v].add(u);
        }

        // Iterative DFS
        Deque<int[]> stack = new ArrayDeque<>(); // [node, state, depth, parent]
        stack.push(new int[]{root, 0, 0, 0});
        while (!stack.isEmpty()) {
            int[] top = stack.pop();
            int u = top[0], state = top[1], d = top[2], parent = top[3];
            if (state == 0) {
                euler[timer] = u;
                depthArr[timer] = d;
                if (first[u] == -1) first[u] = timer;
                timer++;
                stack.push(new int[]{u, 1, d, parent});
                for (int v : graph[u]) {
                    if (v == parent) continue;
                    stack.push(new int[]{v, 0, d + 1, u});
                }
            } else {
                if (tf != 0) { // not the root after last child
                    euler[timer] = parent;
                    depthArr[timer] = d - 1;
                    timer++;
                }
            }
        }
    }

    private void buildSparseTable() {
        int m = euler.length;
        LOG = (int)(Math.log(m) / Math.log(2)) + 1;
        st = new int[LOG][m];
        for (int i = 0; i < m; i++) st[0][i] = i;
        for (int j = 1; j < LOG; j++) {
            for (int i = 0; i + (1 << j) <= m; i++) {
                int left = st[j - 1][i];
                int right = st[j - 1][i + (1 << (j - 1))];
                st[j][i] = (depthArr[left] < depthArr[right]) ? left : right;
            }
        }
    }

    private int rmq(int l, int r) {
        int len = r - l + 1;
        int k = (int)(Math.log(len) / Math.log(2));
        int left = st[k][l];
        int right = st[k][r - (1 << k) + 1];
        return (depthArr[left] < depthArr[right]) ? left : right;
    }

    public int query(int u, int v) {
        int l = first[u], r = first[v];
        if (l > r) { int tmp = l; l = r; r = tmp; }
        return euler[rmq(l, r)];
    }
}
Output
// LcaEulerRMQ lca = new LcaEulerRMQ(n, edges, 1);
// System.out.println(lca.query(4, 7)); // prints 2
Memory vs Speed
Euler tour + Sparse Table uses O(n log n) memory, similar to binary lifting, but queries are O(1). For 10^6 nodes, how many nodes is 2n? Actually 2n-1 ~ 2 million. Sparse Table on 2 million elements with log2(2M)=21 gives about 42 million integers ~ 160 MB. Consider that.
Production Insight
Euler tour + RMQ is ideal for static trees with extremely high query throughput. It does not support tree changes. If you need both O(1) queries and dynamic updates, you'd need a more advanced structure (e.g., link-cut trees).
Key Takeaway
Euler tour transforms LCA into RMQ. Sparse Table gives O(1) queries after O(n log n) preprocessing. Best for static trees with many queries.

Edge Cases and Production Pitfalls

Real-world LCA problems rarely fit textbook examples. Here are the edge cases that cause silent failures in production:

  1. Forest (disconnected components): If the tree is actually a forest, LCA is undefined for nodes in different components. Either add a virtual root or check component IDs before querying.
  2. Large depth with recursive DFS: Recursion stack overflows in Java around 10k–20k depth depending on JVM settings. Use iterative BFS.
  3. Integer overflow in depth: As seen in the production incident, depth can exceed 2^31 – 1 in trees that are in reality linked lists (e.g., file path hierarchy). Use long for depth or clamp after a certain depth.
  4. Root not connected to all nodes: Make sure the tree is rooted and the root is correctly passed – a node's parent must be 0 for root, but ensure that other nodes with missing edges don't slip through with parent[0] = 0.
  5. Multiple edges between same nodes: The tree definition assumes no cycles and at most one edge between any pair. If there are duplicates, BFS may mark nodes visited early, missing correct parent relationships.
  6. Memory exhaustion for very large n: If n is 10^7, LOG ~ 24, memory for int table ~ 800 MB. Consider Euler tour + RMQ (O(n) memory) or a sparse table on depth (O(n log n) but smaller because one array).
Watch out for forest inputs
Many coding problems assume a single connected tree. In real applications, your input may be a forest. Always validate: run a BFS from root and count visited nodes. If visited < n, treat as forest and handle LCA queries accordingly (return null or sentinel).
Production Insight
Forests are not rare – think of disconnected clusters in a social network graph.
Production code should always include a root validation phase.
Rule: after BFS, if not all nodes visited, log a warning and treat the graph as a forest – add a virtual root or reject queries across components.
Key Takeaway
Forest, deep depth, overflow, and disconnected root – these four edge cases cause 90% of LCA failures in production.
Always validate tree connectivity, use iterative traversal, and store depth in 64-bit.

Complexity Hierarchy of LCA Approaches

Different LCA methods trade off preprocessing time, memory, and query speed. The following table summarizes the major approaches, including the lesser-known sqrt-decomposition method which provides an intermediate trade-off. For most production systems, the choice is between binary lifting (simple, works with incremental updates) and Euler tour + RMQ (fastest queries).

MethodPreprocessingMemoryQuery TimeSupports Updates?Notes
Naive path comparisonO(n)O(n)O(depth) up to O(n)Yes, trivialUse only for very small trees or one-off queries
Sqrt decompositionO(n)O(n)O(sqrt(n))Yes, O(sqrt(n)) per updateRarely used; see [cp-algorithms]
Binary liftingO(n log n)O(n log n)O(log n)Only with full rebuildBest balance of simplicity and speed
Euler tour + Sparse Table RMQO(n log n)O(n log n)O(1)NoFastest queries; memory heavy
Euler tour + Segment Tree RMQO(n)O(n)O(log n)Yes, O(log n) per updateGood if updates are few and queries moderate

There is also the Farach-Colton and Bender approach which achieves O(n) preprocessing and O(1) queries using ±1 RMQ, but it's complex and rarely implemented in practice.

For most practical purposes, start with binary lifting for simplicity. If you need maximum query throughput on a static tree, upgrade to Euler tour + Sparse Table.

Why sqrt decomposition?
The sqrt decomposition for LCA divides nodes into blocks of size sqrt(n). Each block stores its highest node (the one closest to root). To query, lift nodes to the block boundary, then within blocks use naive climb. It's a balanced trade-off but rarely used because binary lifting is both simpler and faster.
Production Insight
When designing an LCA system, start with binary lifting and profile query throughput. If it's insufficient (say, millions of queries per second with <1μs budget), switch to Euler tour + RMQ. In most cases, the bottleneck is not LCA but other parts of the application.
Key Takeaway
Choose the method based on tree size, query volume, and need for updates. Complexity ranges from O(n) naive to O(1) Euler+RMQ. Sqrt decomposition sits in the middle as a historical curiosity.
● Production incidentPOST-MORTEMseverity: high

Depth Overflow Takes Down Friend Recommendation System

Symptom
Friend suggestions became useless – all suggested friends were the top-level root user instead of relevant common connections.
Assumption
Tree depth would never exceed 2^31 – 1 because the tree was 'balanced' (user tree is actually highly skewed; depth can reach billions).
Root cause
Depth stored as a 32-bit signed integer overflowed when computing depth for nodes far from root. The binary lifting table then used incorrect depth to lift nodes, causing wrong LCA.
Fix
Switched depth to a 64-bit long integer and added a sanity check: if depth exceeds 2^31 – 1, fall back to an alternative approach (Euler tour + RMQ) that doesn't rely on depth comparison.
Key lesson
  • Never assume tree depth fits in 32 bits – real-world trees (e.g., folder hierarchies, org charts) can be arbitrarily deep.
  • Always validate depth values after DFS/BFS: if any depth exceeds MAX_INT, log a warning and consider alternative algorithms.
  • Use iterative DFS (stack) instead of recursion to avoid stack overflow even for moderate depths.
Production debug guideSymptoms and immediate actions when LCA returns wrong results4 entries
Symptom · 01
LCA returns root node for every query
Fix
Check that BFS/DFS actually visited all nodes and set parent[0] correctly. Print depth[node] for a few nodes – if all are 0, visited was never set.
Symptom · 02
LCA returns wrong node for deep nodes (depth > 1000)
Fix
Verify depth type is not overflowing – print depth for a leaf deep in the tree. If negative, switch to long.
Symptom · 03
StackOverflowError during preprocessing
Fix
Recursive DFS cannot handle deep trees. Replace with iterative stack or BFS using a queue. Set JVM thread stack size (-Xss) as temporary workaround.
Symptom · 04
Binary lifting query returns same node regardless of u or v
Fix
Your parent table is all zeros – you probably forgot to fill parent[j] for j>0. Run a sanity check: print(parent[1][some node]) should be its grandparent.
★ Quick Debug Cheat Sheet – Binary Lifting LCAOne-liner commands and fixes for the most common LCA bugs during development or after a production incident.
LCA always returns root (even for direct parent-child)
Immediate action
Print parent[0] for a child node to verify it is set.
Commands
echo "parent of node 5: ${parent[0][5]}"
if parent[0][5] == 0, check BFS loop: are all edges processed?
Fix now
Add a visited check after BFS: for i=1..n visited[i] must be true. If not, graph is disconnected or BFS missed some edges.
Binary lifting query takes O(n) instead of O(log n)+
Immediate action
Check if LOG value is too small (should be ~ceil(log2(n)) + 1).
Commands
print("LOG:", lca.LOG); should be > log2(n).
verify that parent table is built for all j up to LOG-1.
Fix now
set LOG = (int)(Math.log(n) / Math.log(2)) + 1; and rebuild table.
StackOverflowError during DFS preprocessing+
Immediate action
Switch to BFS (queue) or iterative DFS (explicit stack).
Commands
Replace recursive DFS function with iterative stack: while(!stack.isEmpty()) { node = stack.pop(); ... }
If must use recursion, increase thread stack size: java -Xss256m LCA
Fix now
Use BFS – simpler and avoids stack issues entirely.
LCA Approach Comparison
MethodPreprocessing TimeMemoryQuery TimeHandles Tree Changes?Best For
Naive Path ComparisonO(n) per path (lazy)O(n)O(depth) ~ O(n) worst-caseNo (but trivial to recompute)Small trees, one-off queries
Binary LiftingO(n log n)O(n log n)O(log n)Only if table rebuilt (slow)Large static trees, many queries
Euler Tour + RMQ (Sparse Table)O(n log n)O(n log n) for Sparse TableO(1)No (static)Maximum query throughput on a large tree
Euler Tour + RMQ (Segment Tree)O(n)O(n)O(log n)Yes, with O(log n) change costDynamic trees but slower queries

Key takeaways

1
LCA finds the deepest common ancestor of two nodes in a rooted tree.
2
Naive approach
O(depth) per query. Binary lifting: O(n log n) setup, O(log n) per query.
3
Binary lifting builds a parent table where parent[j][v] = 2^j-th ancestor of v.
4
To query
equalize depths, then binary-lift both nodes until their parents meet.
5
Euler tour + RMQ gives O(1) per query, useful when the number of queries is very large.
6
Production pitfalls
recursion stack overflow, depth integer overflow, forest inputs, and table fill order.

Common mistakes to avoid

4 patterns
×

Using recursion for DFS to compute depths

Symptom
StackOverflowError when tree is deep (depth > ~10k). The app crashes silently during preprocessing; no LCA queries can be answered.
Fix
Replace recursive DFS with iterative BFS (queue) or iterative DFS (explicit stack). BFS also computes depths correctly and uses heap memory only.
×

Assuming parent[0][v] is non-zero for all nodes outside the root

Symptom
Binary lifting table fills with zeros because parent[0][v] remains 0 for nodes that were never visited. LCA always returns 0 (root) for queries involving those nodes.
Fix
Always run BFS from the root and verify that every node except the root gets a non-zero parent. If a node has parent[0] == 0 and is not root, either the tree is disconnected or the root is not reaching all nodes.
×

Using 32-bit int for depth without checking overflow

Symptom
After many depth increments, depth becomes negative due to integer overflow. The depth difference logic breaks, and LCA returns wrong nodes (often root). The bug may go undetected for months until a very deep node is queried.
Fix
Use long (64-bit) for depth. In Java, cap depth at Integer.MAX_VALUE – 1 if using int as array index, but use long for comparison. Alternatively, implement a check: if depth > 2^30, fall back to Euler tour LCA (which doesn't use depth).
×

Incorrect binary lifting table filling order

Symptom
Query returns weird values, e.g., LCA of two siblings returns root when it should be their parent. The problem is that parent[j][v] was filled before parent[j-1][v] for all v was ready.
Fix
Fill the table row by row: for j from 1 to LOG-1, for v from 1 to n, compute parent[j][v] = parent[j-1][parent[j-1][v]]. This is correct because parent[j-1][*] is fully computed before using it.
INTERVIEW PREP · PRACTICE MODE

Interview Questions on This Topic

Q01JUNIOR
What is the LCA problem and what are two approaches to solve it?
Q02SENIOR
How does binary lifting achieve O(log n) LCA queries?
Q03SENIOR
How would you find the LCA of two nodes in a binary search tree specific...
Q01 of 03JUNIOR

What is the LCA problem and what are two approaches to solve it?

ANSWER
The Lowest Common Ancestor problem asks: given a rooted tree and two nodes u and v, find the deepest node that is an ancestor of both. Two approaches: 1. Naive path comparison: find paths from root to each node, then walk both paths until they diverge. O(depth) per query. 2. Binary lifting: precompute an ancestor table where parent[j][v] is the 2^j-th ancestor of v. To query, lift the deeper node to the same depth, then lift both simultaneously until just before their ancestors meet. O(n log n) preprocessing, O(log n) per query. Other methods include Euler tour + RMQ (O(1) queries) and Tarjan's offline algorithm.
FAQ · 5 QUESTIONS

Frequently Asked Questions

01
What is the time complexity of binary lifting LCA?
02
What is Euler tour LCA and how does it relate to range minimum query?
03
Does LCA work on forests (multiple trees)?
04
Can binary lifting handle tree updates (adding/removing nodes)?
05
How do I handle LCA queries in a weighted tree where I also need distance between nodes?
🔥

That's Trees. Mark it forged?

8 min read · try the examples if you haven't

Previous
Trie Data Structure
11 / 15 · Trees
Next
Diameter of Binary Tree