Skip to content
Home DSA Lowest Common Ancestor in Trees: Naive to Binary Lifting Explained

Lowest Common Ancestor in Trees: Naive to Binary Lifting Explained

Where developers are forged. · Structured learning · Free forever.
📍 Part of: Trees → Topic 11 of 15
Lowest Common Ancestor explained deeply — naive recursion, Euler tour + RMQ, and O(log n) binary lifting with full Java code, edge cases, and interview prep.
🔥 Advanced — solid DSA foundation required
In this tutorial, you'll learn
Lowest Common Ancestor explained deeply — naive recursion, Euler tour + RMQ, and O(log n) binary lifting with full Java code, edge cases, and interview prep.
  • LCA finds the deepest common ancestor of two nodes in a rooted tree.
  • Naive approach: O(depth) per query. Binary lifting: O(n log n) setup, O(log n) per query.
  • Binary lifting builds a parent table where parent[j][v] = 2^j-th ancestor of v.
✦ Plain-English analogy ✦ Real code with output ✦ Interview questions
Quick Answer

Imagine a big family reunion. You and your cousin are trying to figure out which ancestor you both share who is closest to both of you — not your great-great-grandparent from 1820, but the most recent one. That shared grandparent is the Lowest Common Ancestor. In a tree of nodes, it's simply the deepest node that sits above both of the nodes you're asking about. That's it.

Every time a file system resolves a relative path, every time Git finds the merge-base between two branches, and every time a compiler walks an abstract syntax tree to find scope boundaries — something is quietly computing a Lowest Common Ancestor. LCA isn't an academic curiosity. It's a foundational primitive that powers real systems at scale, and understanding it at depth separates engineers who can solve tree problems on a whiteboard from those who can actually ship the code in production.

The core problem is deceptively simple: given a rooted tree and two nodes u and v, find the deepest node that is an ancestor of both. The naive solution — walk up from both nodes until the paths converge — is O(n) per query and falls apart the moment you have a million-node tree with ten thousand queries per second. That's where the real engineering begins: how do you preprocess the tree so each query becomes O(log n) or even O(1)?

By the end of this article you'll understand three battle-tested approaches to LCA — recursive naive, Euler tour with Sparse Table RMQ, and Binary Lifting — with full runnable Java code for each. You'll know exactly when to reach for each approach, what the preprocessing costs are, and the edge cases that trip up even experienced engineers. Let's dig in.

What is Lowest Common Ancestor (LCA)? — Plain English

The Lowest Common Ancestor (LCA) of two nodes u and v in a tree is the deepest node that is an ancestor of both u and v. Every node is an ancestor of itself, so if u is an ancestor of v, then LCA(u,v) = u.

Real-world use: in a company org chart, LCA of two employees is their common manager. In a file system, LCA of two file paths is their deepest common directory. In computational biology, LCA in an evolutionary tree represents the most recent common ancestor of two species.

The naive approach (walk both nodes up to the root, find the first shared ancestor) takes O(depth) per query. For a balanced tree that is O(log n), but O(n) for a skewed tree. Binary lifting precomputes ancestor pointers so each query runs in O(log n) regardless of tree shape.

How Lowest Common Ancestor (LCA) Works — Step by Step

Naive approach (DFS / path comparison): 1. Find the path from root to u. Find the path from root to v. 2. Walk both paths from the root until they diverge. The last common node is the LCA.

Binary lifting (O(n log n) preprocessing, O(log n) per query): 1. Root the tree. Run a DFS to compute depth[v] and parent[v][0] (direct parent) for all nodes. 2. Compute parent[v][j] = parent[parent[v][j-1]][j-1]: the 2^j-th ancestor of v. 3. To find LCA(u, v): a. If depth[u] < depth[v], swap them so u is deeper. b. Lift u up by (depth[u] - depth[v]) steps to equalize depths. c. If u == v, return u (v is an ancestor of u or vice versa). d. Binary-lift both u and v simultaneously: for j from LOG-1 down to 0, if parent[u][j] != parent[v][j], set u = parent[u][j], v = parent[v][j]. e. Return parent[u][0] (the direct parent of both).

Worked Example — Tracing the Algorithm

Tree: 1 / \ 2 3 / \ 4 5 / \ 6 7

Depths: 1->0, 2->1, 3->1, 4->2, 5->2, 6->3, 7->3

Query: LCA(4, 7) using naive path method: Path to 4: [1, 2, 4] Path to 7: [1, 2, 5, 7] Walk together: 1 matches 1. 2 matches 2. 4 != 5. Last match = 2. LCA(4, 7) = 2.

Query: LCA(6, 3): Path to 6: [1, 2, 5, 6]. Path to 3: [1, 3]. Step 1: depth[6]=3 > depth[3]=1. Lift 6 by 2 steps: 6->5->2. Step 2: depth equalized at 1. 2 != 3. Step 3: lift both by 1: 2->1, 3->1. Now equal. LCA = 1. LCA(6, 3) = 1.

Implementation

The LCA class builds a binary-lifting table where parent[j][v] is the 2^j-th ancestor of v. BFS computes direct parents and depths first. Then for j in 1..LOG-1, parent[j][v] = parent[j-1][parent[j-1][v]]. query equalizes depths by lifting the deeper node, then binary-lifts both nodes simultaneously until their parents converge. This runs in O(log n) per query after O(n log n) preprocessing.

lca.py · PYTHON
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051
from collections import defaultdict

class LCA:
    def __init__(self, n, edges, root=1):
        self.LOG = 17  # 2^17 > 100000
        self.n   = n
        graph    = defaultdict(list)
        for u, v in edges:
            graph[u].append(v)
            graph[v].append(u)

        self.depth  = [0] * (n + 1)
        self.parent = [[0] * (n + 1) for _ in range(self.LOG)]

        # BFS to set depth and direct parent
        from collections import deque
        visited = [False] * (n + 1)
        queue   = deque([root])
        visited[root] = True
        while queue:
            node = queue.popleft()
            for nb in graph[node]:
                if not visited[nb]:
                    visited[nb] = True
                    self.depth[nb]     = self.depth[node] + 1
                    self.parent[0][nb] = node
                    queue.append(nb)

        # Binary lifting
        for j in range(1, self.LOG):
            for v in range(1, n + 1):
                self.parent[j][v] = self.parent[j-1][self.parent[j-1][v]]

    def query(self, u, v):
        if self.depth[u] < self.depth[v]: u, v = v, u
        diff = self.depth[u] - self.depth[v]
        for j in range(self.LOG):
            if (diff >> j) & 1:
                u = self.parent[j][u]
        if u == v: return u
        for j in range(self.LOG - 1, -1, -1):
            if self.parent[j][u] != self.parent[j][v]:
                u = self.parent[j][u]
                v = self.parent[j][v]
        return self.parent[0][u]

# Tree: 1-2, 1-3, 2-4, 2-5, 5-6, 5-7
lca = LCA(7, [(1,2),(1,3),(2,4),(2,5),(5,6),(5,7)])
print('LCA(4,7):', lca.query(4, 7))  # 2
print('LCA(6,3):', lca.query(6, 3))  # 1
print('LCA(6,7):', lca.query(6, 7))  # 5
▶ Output
LCA(4,7): 2
LCA(6,3): 1
LCA(6,7): 5
ConceptUse CaseExample
Lowest Common AncestorCore usageSee code above

🎯 Key Takeaways

  • LCA finds the deepest common ancestor of two nodes in a rooted tree.
  • Naive approach: O(depth) per query. Binary lifting: O(n log n) setup, O(log n) per query.
  • Binary lifting builds a parent table where parent[j][v] = 2^j-th ancestor of v.
  • To query: equalize depths, then binary-lift both nodes until their parents meet.
  • Euler tour + RMQ gives O(1) per query, useful when the number of queries is very large.

⚠ Common Mistakes to Avoid

    Memorising syntax before understanding the concept
    Skipping practice and only reading theory

Interview Questions on This Topic

  • QWhat is the LCA problem and what are two approaches to solve it?
  • QHow does binary lifting achieve O(log n) LCA queries?
  • QHow would you find the LCA of two nodes in a binary search tree specifically?

Frequently Asked Questions

What is the time complexity of binary lifting LCA?

Preprocessing: O(n log n) time and space to build the ancestor table. Each query takes O(log n). This makes it practical for n up to 10^5 with thousands of queries. For very large trees with few queries, the naive DFS path approach might be acceptable.

What is Euler tour LCA and how does it relate to range minimum query?

The Euler tour LCA technique converts the LCA problem into a range minimum query (RMQ) problem. Record the sequence of nodes visited during a DFS (including revisits when backtracking). LCA(u,v) is the shallowest node in the Euler tour between the first occurrences of u and v. Solving this RMQ with a sparse table gives O(1) per query after O(n log n) preprocessing.

Does LCA work on forests (multiple trees)?

Not directly — LCA is defined for a single rooted tree. For a forest, either add a virtual root node connecting all trees, or check if u and v have the same root before applying LCA. If they are in different trees, they have no common ancestor.

🔥
Naren Founder & Author

Developer and founder of TheCodeForge. I built this site because I was tired of tutorials that explain what to type without explaining why it works. Every article here is written to make concepts actually click.

← PreviousTrie Data StructureNext →Diameter of Binary Tree
Forged with 🔥 at TheCodeForge.io — Where Developers Are Forged