Kruskal MST — Duplicate Edges Cause V-2 Edge Count
Duplicate edges in Kruskal's MST cause V-2 edge count, hiding disconnected nodes.
- MST connects all nodes in a weighted graph with minimum total weight, using exactly V-1 edges
- Kruskal: sort edges, add cheapest non-cycle via Union-Find; O(E log E) time
- Prim: grow tree from start using a min-heap; O(E log V) with a binary heap, O(E + V log V) with Fibonacci heap
- Both rely on the cut property: the lightest edge crossing any cut belongs to some MST
- In practice: Kruskal wins for sparse graphs, Prim for dense; Union-Find with path compression is the hidden performance difference
Imagine you're a city planner who needs to connect 10 towns with roads, but every road costs money. You don't need every possible road — you just need the cheapest set of roads so that every town is reachable from every other town. That cheapest connected skeleton is a Minimum Spanning Tree. Kruskal's approach is: sort all possible roads by cost and greedily pick the cheapest one that doesn't create a loop. Prim's approach is: start at one town and always extend to the nearest unvisited town. Same destination, totally different journeys.
Network infrastructure engineers don't debate Kruskal vs Prim at a whiteboard for fun — they debate it because the wrong choice can tank performance on a graph with a million edges. MST algorithms power cable laying, circuit board routing, cluster analysis in machine learning, image segmentation, and approximation algorithms for NP-hard problems like TSP. If you've ever wondered how Spotify groups similar songs or how Cisco routes OSPF traffic, MSTs are lurking in the background.
The core problem MST solves is deceptively simple: given a connected, weighted, undirected graph, find the subset of edges that keeps the graph connected, contains no cycle, and has the minimum possible total edge weight. That subset is always a tree (N nodes, N-1 edges). The real challenge is doing this efficiently at scale — a naive approach checking all spanning trees is factorial in complexity, which is completely unusable in production.
By the end of this article you'll understand not just how both algorithms work, but why they're correct (the cut property and cycle property of MSTs), when each one dominates the other in practice, how Union-Find with path compression makes Kruskal fly, how a priority queue shapes Prim's performance, and exactly what breaks in each algorithm when your graph has duplicate edge weights, disconnected components, or negative weights.
What is Minimum Spanning Tree? — Plain English
A Minimum Spanning Tree (MST) is a subset of edges in a connected, undirected, weighted graph that: (1) connects all nodes, (2) contains no cycles, and (3) has the minimum possible total edge weight.
Real-world analogy: a telephone company wants to lay cables connecting 10 cities. They want every city reachable from every other city, and they want to minimize the total cable length. The MST gives the exact set of cables to lay. Two classic algorithms solve MST: Kruskal's (sort edges, add cheapest that doesn't form a cycle) and Prim's (grow a tree greedily from one node, always adding the cheapest edge connecting the tree to a new node).
How Minimum Spanning Tree Works — Step by Step
Kruskal's Algorithm: 1. Sort all edges by weight in ascending order. 2. Initialize a Union-Find data structure where each node is its own component. 3. For each edge (u, v, w) in sorted order: a. If u and v are in different components (find(u) != find(v)), add this edge to the MST. b. Union the two components (union(u, v)). c. Stop when the MST has V-1 edges.
Prim's Algorithm (min-heap version): 1. Start from any node. Mark it visited. Push all its edges into a min-heap keyed by weight. 2. While the heap is non-empty and the MST has fewer than V-1 edges: a. Pop the minimum-weight edge (w, u, v). b. If v is already in the MST, skip (this edge would form a cycle). c. Add v to the MST. Add edge (u,v,w) to results. Push all edges from v to unvisited nodes.
Both algorithms produce the same MST (or an equally-weighted one). Time: O(E log E) for Kruskal's, O(E log V) for Prim's with a heap.
Worked Example — Tracing the Algorithm
Graph with 5 nodes (A-E) and edges: A-B:2, A-C:3, B-C:1, B-D:4, C-E:5, D-E:1
Kruskal's trace (edges sorted by weight): Sorted edges: B-C:1, D-E:1, A-B:2, A-C:3, B-D:4, C-E:5
Step 1: B-C:1 — B and C are in different components. Add to MST. Union(B,C). MST={B-C} Step 2: D-E:1 — D and E are in different components. Add to MST. Union(D,E). MST={B-C, D-E} Step 3: A-B:2 — A and B are in different components. Add to MST. Union(A,B,C). MST={B-C, D-E, A-B} Step 4: A-C:3 — A and C are in the SAME component. SKIP (would form cycle A-B-C). Step 5: B-D:4 — B is in {A,B,C}, D is in {D,E}. Add to MST. Union all. MST={B-C, D-E, A-B, B-D} MST has 4 edges = V-1 = 5-1. Done.
Total MST weight: 1+1+2+4 = 8 The edge C-E:5 was not needed — the path C-B-D-E achieves connectivity for less total cost.
Visual Step-by-Step of Kruskal Edge Selection
The following diagram visualizes the edge selection process for the 5-node graph from the worked example. Each row shows the current state of edges (processed and added) during a step. Highlighted indices indicate the edge under consideration, and the note explains the decision.
Real-World Applications of Minimum Spanning Trees
MST algorithms are not just academic; they power real infrastructure and data analysis pipelines:
- Network Design: Laying fiber optic cables, electrical grids, or water pipelines — MST gives the cheapest layout that connects all endpoints.
- Cluster Analysis: In machine learning, MST-based clustering (e.g., single-linkage hierarchical clustering) builds a spanning tree of data points and cuts the heaviest edges to form clusters. Spotify's song similarity groups use MST-based techniques.
- Image Segmentation: In computer vision, MST segments an image by treating pixels as nodes and edge weights as color/intensity differences. The MST groups similar pixels while separating dissimilar ones — widely used in medical imaging.
- Approximation Algorithms: MST is the backbone of the 2-approximation for the Traveling Salesman Problem (TSP) and helps solve Steiner tree problems.
- Circuit Design: Routing traces on a PCB—MST minimizes total track length while ensuring connectivity.
- Social Network Analysis: Finding communities in graphs by building the MST and removing low-correlation edges.
Implementation — Kruskal's Algorithm in Python
kruskal sorts all edges by weight then processes them with Union-Find. find uses iterative path halving for efficiency. union attaches by rank. An edge is added to the MST only when it connects two different components. The loop stops early once the MST has n-1 edges, avoiding unnecessary edge processing.
Implementation — Kruskal's Algorithm in C++
Below is a production-grade C++ implementation of Kruskal's algorithm using a custom struct for edges, std::sort, and an inline Disjoint Set Union (DSU) class with path compression and union by size. The code reads from a vector of edges, sorts by weight, and processes them with the DSU. The MST edges are collected in a result vector. The main function demonstrates the same 5-node graph.
Implementation — Prim's Algorithm in Java
Prim's algorithm uses a min-priority queue of edges (or vertices). The adjacency list is built as a map from node to list of (neighbor, weight). We start from node 0, mark it visited, and push all its edges. While the heap has edges and MST not complete, we pop the cheapest edge. If the target node is already visited, skip. Otherwise, add to MST and push its unvisited neighbors.
Java implementation uses a custom Edge class and PriorityQueue with comparator.
When to Use Kruskal vs Prim — Decision Framework
Choosing between Kruskal and Prim is a trade-off based on graph density and edge distribution:
- Sparse graphs (E ≈ V): Kruskal wins because sorting O(E log E) is dominated by E, and Union-Find operations are near-constant. Prim's heap overhead (O(E log V)) has a log factor on V but E is small.
- Dense graphs (E ≈ V^2): Prim with a Fibonacci heap outruns Kruskal because heap operations are O(E + V log V) vs O(E log E) for Kruskal.
- Very large graphs (millions of edges): Kruskal's sorting memory can be a bottleneck. Prim's adjacency list also memory-heavy, but you can stream edges from disk for Kruskal using external sorting.
- Graph with near-duplicate edge weights: Kruskal handles ties naturally (any order works). Prim's heap might need tie-breaking stabilisation.
- Sparse: avg degree < 10-20. Kruskal's sorting overhead is low.
- Dense: avg degree close to V. Prim's heap size stays small relative to edge count.
- If graph is weighted and you need to process edge weights dynamically (add/remove), Kruskal's sorting cost makes it impractical — use Prim with a dynamic priority queue.
- For distributed / out-of-core graphs: Kruskal with external sort and parallel Union-Find is battle-tested.
Advantages and Disadvantages of MST Algorithms
Both Kruskal and Prim have strengths and weaknesses. The table below helps you decide which suits your constraints.
Borůvka's Algorithm — The Third Classic MST
Borůvka's algorithm is the oldest MST algorithm (1926) and the only one that naturally parallelizes across graph partitions. It works in phases: 1. Each node selects the cheapest edge incident to it. 2. All selected edges are added to the MST (they are safe by cut property). 3. The graph is contracted by merging components connected by added edges. 4. Repeat until only one component remains (V-1 edges).
Each phase runs in O(E) time, and the number of phases is O(log V) because the number of components at least halves. Total time O(E log V) without needing a fancy priority queue. Borůvka's is less common in textbooks but shines in distributed systems (e.g., MapReduce) because each node's cheapest edge can be computed independently. Modern implementations for massive graphs (like Google's Pregel) use Borůvka-style approaches. If you need to compute MST on a graph too large to fit in memory, Borůvka's is your best bet.
Production Pitfalls and Edge Cases
MST algorithms seem simple but have subtle edge cases that break in production:
- Disconnected graph: Both algorithms will produce a spanning forest (one tree per component). Kruskal will simply not add enough edges. Prim will only explore the component containing the start node. Always check final edge count == V-1, or run a connectivity check beforehand.
- Negative edge weights: Both algorithms handle negative weights correctly — the cut property still holds. However, if you use a variant of Prim that assumes non-negative (like Dijkstra-based), you might get wrong results. Standard Prim works fine.
- Duplicate edge weights: Multiple MSTs possible. Kruskal's sorting order among equal-weight edges is implementation-defined — different sort stability produces different trees. For deterministic output, tie-break by edge index or node IDs.
- Integer overflow: Edge weights summed can exceed 32-bit int. Use 64-bit (long in Java, int64 in C++) for total weight. JS and Python handle big ints natively.
- Memory explosion in Kruskal: Sorting E edges may require O(E) auxiliary memory. For graphs with tens of millions of edges, use external sort or switch to Prim.
Practice Problems to Master MST
The best way to solidify MST concepts is to solve real problems. Below are curated practice problems from popular judges, ranging from basic implementation to advanced variations. Start with the easy ones (simple Kruskal) and move to harder ones (MST with constraints, second-best MST, dynamic MST).
The 500ms Query That Missed Edges — A Kruskal Bug in Production
- An MST implementation must handle duplicate edges explicitly — they can mask disconnected graphs.
- Never rely solely on edge count for termination; always validate final tree connectivity.
- Log total edges processed vs total edges added — the difference reveals duplicates.
Key takeaways
Common mistakes to avoid
4 patternsUsing sort() without specifying stability for equal weights
Forgetting to mark start node visited in Prim
Using Prim with a simple queue instead of priority queue
Not deduplicating edges when there are multiple same-weight edges between two vertices
Interview Questions on This Topic
Why does the cut property guarantee MST correctness?
Frequently Asked Questions
That's Graphs. Mark it forged?
7 min read · try the examples if you haven't