Skip to content
Home DSA Floyd-Warshall Algorithm Explained — All-Pairs Shortest Paths, Negative Cycles & Production Pitfalls

Floyd-Warshall Algorithm Explained — All-Pairs Shortest Paths, Negative Cycles & Production Pitfalls

Where developers are forged. · Structured learning · Free forever.
📍 Part of: Graphs → Topic 6 of 17
Floyd-Warshall algorithm deep dive: how dynamic programming finds all-pairs shortest paths, detects negative cycles, handles dense graphs, and where it fails in production.
🔥 Advanced — solid DSA foundation required
In this tutorial, you'll learn
Floyd-Warshall algorithm deep dive: how dynamic programming finds all-pairs shortest paths, detects negative cycles, handles dense graphs, and where it fails in production.
  • Floyd-Warshall finds shortest paths between ALL pairs of nodes in O(V^3) time.
  • It uses dynamic programming: dist[i][j] is updated by considering every node k as a potential intermediate.
  • It handles negative edge weights but not negative cycles. Check dist[i][i] < 0 to detect cycles.
✦ Plain-English analogy ✦ Real code with output ✦ Interview questions
Quick Answer

Imagine you're a travel agent with a map of every city and every flight route between them. A customer asks: 'What's the cheapest way to get from ANY city to ANY other city?' Instead of asking a pilot to fly every possible route one at a time, Floyd-Warshall says: 'Let me check — what if we're allowed to stop in City 1? Does that make any pair cheaper? Now what if we can also stop in City 2?' It asks that question for every city, one by one, until it's considered every possible layover — and at the end, you have the cheapest route between every pair of cities on the entire map.

Navigation apps, network routers, ride-sharing platforms, and game AI all share one brutal requirement: they don't just need the shortest path from A to B — they need the shortest path between every pair of nodes in the graph. Dijkstra's algorithm is brilliant for single-source shortest paths, but running it once per node in a dense graph gets expensive fast. Floyd-Warshall was built for exactly this problem, and it powers real infrastructure you use every day.

The core problem Floyd-Warshall solves is called the All-Pairs Shortest Path (APSP) problem. Given a weighted graph with V vertices, find the shortest distance between every possible (source, destination) pair — including graphs with negative edge weights, as long as there are no negative-weight cycles. It achieves this through an elegant three-nested-loop dynamic programming recurrence that's deceptively simple to implement yet surprisingly deep to understand correctly.

By the end of this article you'll understand exactly why the recurrence works (not just that it does), how to implement it with proper negative-cycle detection, when Floyd-Warshall beats alternatives like repeated Dijkstra or Bellman-Ford, what the real memory and time costs look like at scale, and the production-level gotchas that will catch you out in an interview or on the job.

What is Floyd-Warshall? — Plain English

Floyd-Warshall solves the All-Pairs Shortest Path (APSP) problem: given a weighted graph, find the shortest distance between every pair of nodes, not just from one source. Dijkstra solves single-source shortest paths; Floyd-Warshall solves ALL pairs in one pass. It handles negative edge weights (but not negative cycles).

The real-world analogy: imagine a travel agent with a complete distance table between every city. Instead of running a GPS route for each origin city one at a time, Floyd-Warshall fills the entire table simultaneously by asking: 'Would going through city k make the trip from city i to city j cheaper?' It asks this question for every possible intermediate city k, and the table is complete when every city has been considered as a potential stopover.

How Floyd-Warshall Works — Step by Step

Floyd-Warshall is a dynamic programming algorithm. Define dist[i][j] as the shortest distance from node i to node j.

Initialization: 1. Set dist[i][i] = 0 for all nodes (distance from a node to itself is zero). 2. For each edge (i, j) with weight w, set dist[i][j] = w. 3. For all other pairs (no direct edge), set dist[i][j] = infinity.

Main loop — consider each node k as a potential intermediate stop: 4. For k from 0 to V-1: For i from 0 to V-1: For j from 0 to V-1: If dist[i][k] + dist[k][j] < dist[i][j]: Update dist[i][j] = dist[i][k] + dist[k][j]

After the triple loop, dist[i][j] holds the shortest path between every pair.

Negative cycle detection: after the main loop, if dist[i][i] < 0 for any node i, a negative cycle exists. Walking that cycle repeatedly makes costs arbitrarily small, so shortest paths are undefined.

Time complexity: O(V^3). Space: O(V^2) for the distance matrix.

Worked Example — Tracing the Algorithm

Consider a graph with 4 nodes (0-3) and these directed edges: 0->1 weight 3, 0->3 weight 7, 1->0 weight 8, 1->2 weight 2, 2->0 weight 5, 2->3 weight 1, 3->0 weight 2

Initial dist matrix (INF = infinity): 0 1 2 3 0 [ 0, 3, INF, 7 ] 1 [ 8, 0, 2, INF] 2 [ 5, INF, 0, 1 ] 3 [ 2, INF, INF, 0 ]

After k=0 (using node 0 as intermediate): dist[1][3]: 8+7=15 < INF -> update to 15 dist[2][1]: 5+3=8 < INF -> update to 8 dist[2][3]: 5+7=12 > 1 -> no update dist[3][1]: 2+3=5 < INF -> update to 5 dist[3][2]: 2+INF -> no update

After k=1 (node 1 as intermediate): dist[0][2]: 3+2=5 < INF -> update to 5 dist[0][3]: 3+15=18> 7 -> no update dist[3][2]: 5+2=7 < INF -> update to 7

After k=2 (node 2 as intermediate): dist[0][3]: 5+1=6 < 7 -> update to 6 dist[1][3]: 2+1=3 < 15 -> update to 3 dist[3][3]: 7+1=8 > 0 -> no update

After k=3 (node 3 as intermediate): dist[0][0]: 6+2=8 > 0 -> no update dist[1][0]: 3+2=5 < 8 -> update to 5 dist[2][0]: 1+2=3 < 5 -> update to 3

Final dist matrix: 0 1 2 3 0 [ 0, 3, 5, 6 ] 1 [ 5, 0, 2, 3 ] 2 [ 3, 6, 0, 1 ] 3 [ 2, 5, 7, 0 ]

For example, shortest path 1->0 is 5 (via 1->2->3->0: weight 2+1+2=5).

Implementation

The dist matrix is initialized with edge weights and infinity for non-edges. The triple loop iterates: for each intermediate node k, for each source i, for each destination j, update dist[i][j] = min(dist[i][j], dist[i][k] + dist[k][j]). After the loop, dist[i][j] holds the shortest path considering all possible intermediates. Negative cycles are detected by checking dist[i][i] < 0 for any i.

floyd_warshall.py · PYTHON
12345678910111213141516171819202122232425262728293031
INF = float('inf')

def floyd_warshall(n, edges):
    # Initialize distance matrix
    dist = [[INF] * n for _ in range(n)]
    for i in range(n):
        dist[i][i] = 0
    for u, v, w in edges:
        dist[u][v] = w

    # Relax through every intermediate node k
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]

    # Detect negative cycles
    for i in range(n):
        if dist[i][i] < 0:
            return None  # negative cycle found

    return dist

edges = [
    (0,1,3),(0,3,7),(1,0,8),(1,2,2),
    (2,0,5),(2,3,1),(3,0,2)
]
result = floyd_warshall(4, edges)
for row in result:
    print([x if x != float('inf') else 'INF' for x in row])
▶ Output
[0, 3, 5, 6]
[5, 0, 2, 3]
[3, 6, 0, 1]
[2, 5, 7, 0]
ConceptUse CaseExample
Floyd-Warshall AlgorithmCore usageSee code above

🎯 Key Takeaways

  • Floyd-Warshall finds shortest paths between ALL pairs of nodes in O(V^3) time.
  • It uses dynamic programming: dist[i][j] is updated by considering every node k as a potential intermediate.
  • It handles negative edge weights but not negative cycles. Check dist[i][i] < 0 to detect cycles.
  • Space complexity is O(V^2) for the distance matrix.
  • For sparse graphs, repeated Dijkstra is faster; Floyd-Warshall shines on dense graphs or when negative weights are present.

⚠ Common Mistakes to Avoid

    Memorising syntax before understanding the concept
    Skipping practice and only reading theory

Interview Questions on This Topic

  • QWhat is the time and space complexity of Floyd-Warshall?
  • QHow does Floyd-Warshall detect negative cycles?
  • QWhen would you choose Floyd-Warshall over Dijkstra?

Frequently Asked Questions

When should I use Floyd-Warshall instead of running Dijkstra for every node?

Use Floyd-Warshall when the graph is dense (E close to V^2) or when you need all-pairs results and the graph is small (V <= 500 is practical). For sparse graphs with many nodes, running Dijkstra once per source is faster: O(V * E log V) vs O(V^3). Floyd-Warshall also handles negative edge weights, which Dijkstra cannot.

Does Floyd-Warshall work on undirected graphs?

Yes. Represent each undirected edge (u,v,w) as two directed edges: u->v and v->u both with weight w. The algorithm then runs identically.

How do I detect a negative cycle with Floyd-Warshall?

After the main triple loop, check the diagonal of the distance matrix. If dist[i][i] < 0 for any node i, that node lies on a negative-weight cycle. Any path going through that node has undefined (negative-infinity) shortest distance.

How do I reconstruct the actual path, not just the distance?

Maintain a 'next' matrix alongside the distance matrix. Initialize next[i][j] = j for every direct edge. When updating dist[i][j] through k, also set next[i][j] = next[i][k]. To reconstruct the path from i to j, follow i -> next[i][j] -> next[next[i][j]][j] -> ... until you reach j.

🔥
Naren Founder & Author

Developer and founder of TheCodeForge. I built this site because I was tired of tutorials that explain what to type without explaining why it works. Every article here is written to make concepts actually click.

← PreviousBellman-Ford AlgorithmNext →Topological Sort
Forged with 🔥 at TheCodeForge.io — Where Developers Are Forged