Mid-level 11 min · March 05, 2026

Floyd-Warshall — int Overflow Causes Negative Distances

Integer overflow in Floyd-Warshall's dist matrix yields negative distances—use long and INF=Long.

N
Naren · Founder
Plain-English first. Then code. Then the interview question.
About
 ● Production Incident 🔎 Debug Guide
Quick Answer
  • Floyd-Warshall finds shortest paths between ALL pairs of nodes in O(V³) time
  • Works with negative edge weights — fails if negative cycles exist
  • Core idea: for each node k, treat it as a potential intermediate stop on every i→j path
  • Space: O(V²) for the distance matrix — problematic beyond ~10k nodes
  • Production trap: integer overflow when adding distances; use long or check bounds
  • Best for dense graphs (E ≈ V²); for sparse graphs run Dijkstra per source instead
Plain-English First

Imagine you're a travel agent with a map of every city and every flight route between them. A customer asks: 'What's the cheapest way to get from ANY city to ANY other city?' Instead of asking a pilot to fly every possible route one at a time, Floyd-Warshall says: 'Let me check — what if we're allowed to stop in City 1? Does that make any pair cheaper? Now what if we can also stop in City 2?' It asks that question for every city, one by one, until it's considered every possible layover — and at the end, you have the cheapest route between every pair of cities on the entire map.

Navigation apps, network routers, ride-sharing platforms, and game AI all share one brutal requirement: they don't just need the shortest path from A to B — they need the shortest path between every pair of nodes in the graph. Dijkstra's algorithm is brilliant for single-source shortest paths, but running it once per node in a dense graph gets expensive fast. Floyd-Warshall was built for exactly this problem, and it powers real infrastructure you use every day.

The core problem Floyd-Warshall solves is called the All-Pairs Shortest Path (APSP) problem. Given a weighted graph with V vertices, find the shortest distance between every possible (source, destination) pair — including graphs with negative edge weights, as long as there are no negative-weight cycles. It achieves this through an elegant three-nested-loop dynamic programming recurrence that's deceptively simple to implement yet surprisingly deep to understand correctly.

By the end of this article you'll understand exactly why the recurrence works (not just that it does), how to implement it with proper negative-cycle detection, when Floyd-Warshall beats alternatives like repeated Dijkstra or Bellman-Ford, what the real memory and time costs look like at scale, and the production-level gotchas that will catch you out in an interview or on the job.

What is Floyd-Warshall? — Plain English

Floyd-Warshall solves the All-Pairs Shortest Path (APSP) problem: given a weighted graph, find the shortest distance between every pair of nodes, not just from one source. Dijkstra solves single-source shortest paths; Floyd-Warshall solves ALL pairs in one pass. It handles negative edge weights (but not negative cycles).

The real-world analogy: imagine a travel agent with a complete distance table between every city. Instead of running a GPS route for each origin city one at a time, Floyd-Warshall fills the entire table simultaneously by asking: 'Would going through city k make the trip from city i to city j cheaper?' It asks this question for every possible intermediate city k, and the table is complete when every city has been considered as a potential stopover.

Production Insight
The naive triple loop seems wasteful — O(V³) is a lot. But in dense graphs (E ~ V²), repeated Dijkstra is O(V·(E log V)) = O(V³ log V), which is worse.
Product teams often overuse Floyd-Warshall on sparse graphs. If your graph has <1000 nodes and edges are few, run Dijkstra from each source instead.
Rule: count edges first. If E < V·log V, don't use Floyd-Warshall.
Key Takeaway
Floyd-Warshall is for dense graphs with negative weights.
It's O(V³) time, O(V²) space — use when V ≤ 1000.
For sparse graphs, repeated Dijkstra wins.

How Floyd-Warshall Works — Step by Step

Floyd-Warshall is a dynamic programming algorithm. Define dist[i][j] as the shortest distance from node i to node j.

Initialization: 1. Set dist[i][i] = 0 for all nodes. 2. For each edge (i, j) with weight w, set dist[i][j] = w. 3. For all other pairs, set dist[i][j] = INF (a large number).

Main loop — consider each node k as a potential intermediate stop: 4. For k from 0 to V-1: For i from 0 to V-1: For j from 0 to V-1: If dist[i][k] + dist[k][j] < dist[i][j]: Update dist[i][j] = dist[i][k] + dist[k][j]

After the triple loop, dist[i][j] holds the shortest path considering all possible intermediate nodes.

Negative cycle detection: after the main loop, if dist[i][i] < 0 for any node i, a negative cycle exists. Walking that cycle repeatedly makes costs arbitrarily small, so shortest paths are undefined.

io/thecodeforge/FloydWarshall.javaJAVA
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
package io.thecodeforge;

public class FloydWarshall {
    private static final long INF = Long.MAX_VALUE / 2;

    public static long[][] shortestPaths(int n, int[][] edges) {
        long[][] dist = new long[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                dist[i][j] = (i == j) ? 0 : INF;
            }
        }
        for (int[] e : edges) {
            dist[e[0]][e[1]] = e[2]; // allow multiple edges? keep smallest
        }

        for (int k = 0; k < n; k++) {
            for (int i = 0; i < n; i++) {
                if (dist[i][k] == INF) continue;
                for (int j = 0; j < n; j++) {
                    if (dist[k][j] == INF) continue;
                    long via = dist[i][k] + dist[k][j];
                    if (via < dist[i][j]) {
                        dist[i][j] = via;
                    }
                }
            }
        }

        for (int i = 0; i < n; i++) {
            if (dist[i][i] < 0) {
                return null; // negative cycle
            }
        }
        return dist;
    }
}
Output
dist[0][3] = 6 (using the example graph)
Production Insight
Skipping INF checks inside the inner loop (like we did above with continue) saves ~V³ iterations of addition when INF values dominate early.
But be careful: skipping dist[k][j] == INF check might miss valid updates if INF is not handled. Our safe-guard uses continue.
Rule: always guard against INF to avoid overflow and to speed up sparse initialisations.
Key Takeaway
The triple loop is called exactly V³ times.
Add guards to skip INF entries — they don't contribute.
Negative cycles make all distances undefined — check after the loops.

C++ Implementation with Overflow Guard

In C++, the default integer type is 32-bit int, and overflow when adding distances is a subtle but critical bug. Use 64-bit long long and set INF to a value that won't overflow when doubled. A common choice is LLONG_MAX / 2. The code below mirrors the Java version but with C++ syntax, including the INF guard that skips updates when either intermediate distance is INF.

This implementation also includes a safe_add that returns INF if the sum would overflow, though the guard approach already prevents overflow by not adding when either operand is INF. However, in graphs with large finite weights (e.g., 10^12), dist[i][k] + dist[k][j] could still overflow. The production-safe approach is to check before adding: if dist[i][k] > INF - dist[k][j], then the sum equals INF and should be skipped.

The template below uses std::vector<std::vector<long long>> for the matrix, which also uses dynamic allocation — important for large V.

thecodeforge/floyd_warshall.cppCPP
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
#include <vector>
#include <algorithm>
#include <climits>

using namespace std;
using ll = long long;

const ll INF = LLONG_MAX / 2;

vector<vector<ll>> floyd_warshall(int n, const vector<vector<pair<int,ll>>>& edges) {
    vector<vector<ll>> dist(n, vector<ll>(n, INF));
    for (int i = 0; i < n; i++) dist[i][i] = 0;
    for (int u = 0; u < n; u++) {
        for (auto [v, w] : edges[u]) {
            dist[u][v] = min(dist[u][v], w); // handle multiple edges
        }
    }

    for (int k = 0; k < n; k++) {
        for (int i = 0; i < n; i++) {
            if (dist[i][k] == INF) continue;
            for (int j = 0; j < n; j++) {
                if (dist[k][j] == INF) continue;
                // Check overflow before adding
                if (dist[i][k] > INF - dist[k][j]) continue;
                ll via = dist[i][k] + dist[k][j];
                if (via < dist[i][j]) {
                    dist[i][j] = via;
                }
            }
        }
    }

    // Negative cycle detection
    for (int i = 0; i < n; i++) {
        if (dist[i][i] < 0) {
            // Affected node propagation can be added here
            return {}; // signal error
        }
    }
    return dist;
}
Output
Returns the all-pairs distance matrix, or empty if negative cycle exists.
Production Insight
Always compile with -Wall -Wextra -Wconversion to catch implicit narrowing. The overflow check before adding is critical when edge weights are large (e.g., 10^9). In C++, LLONG_MAX is 9.22e18, so even two 10^12 weights sum to 2e12, safe. But if you accidentally use int INF = 1e9, adding two 1e9 values gives 2e9 which overflows signed int. Always use long long for distances.
Key Takeaway
In C++, use long long with LLONG_MAX/2 as INF. Add an explicit overflow guard before addition. This prevents undefined behavior the same way Java's long does.

Worked Example — Tracing the Algorithm

Consider a graph with 4 nodes (0-3) and these directed edges: 0->1 weight 3, 0->3 weight 7, 1->0 weight 8, 1->2 weight 2, 2->0 weight 5, 2->3 weight 1, 3->0 weight 2

Initial dist matrix (INF = infinity): 0 1 2 3 0 [ 0, 3, INF, 7 ] 1 [ 8, 0, 2, INF] 2 [ 5, INF, 0, 1 ] 3 [ 2, INF, INF, 0 ]

After k=0 (using node 0 as intermediate): dist[1][3]: 8+7=15 < INF -> update to 15 dist[2][1]: 5+3=8 < INF -> update to 8 dist[2][3]: 5+7=12 > 1 -> no update dist[3][1]: 2+3=5 < INF -> update to 5 dist[3][2]: 2+INF -> no update

After k=1 (node 1 as intermediate): dist[0][2]: 3+2=5 < INF -> update to 5 dist[0][3]: 3+15=18> 7 -> no update dist[3][2]: 5+2=7 < INF -> update to 7

After k=2 (node 2 as intermediate): dist[0][3]: 5+1=6 < 7 -> update to 6 dist[1][3]: 2+1=3 < 15 -> update to 3 dist[3][3]: 7+1=8 > 0 -> no update

After k=3 (node 3 as intermediate): dist[0][0]: 6+2=8 > 0 -> no update dist[1][0]: 3+2=5 < 8 -> update to 5 dist[2][0]: 1+2=3 < 5 -> update to 3

Final dist matrix: 0 1 2 3 0 [ 0, 3, 5, 6 ] 1 [ 5, 0, 2, 3 ] 2 [ 3, 6, 0, 1 ] 3 [ 2, 5, 7, 0 ]

For example, shortest path 1->0 is 5 (via 1->2->3->0: weight 2+1+2=5).

Production Insight
This matrix holds exactly the all-pair shortest distances. Notice node 0→3 dropped from 7 to 6 when via node 2 was considered in k=2.
In production, you'd typically reconstruct the actual path, not just distances. Maintain a next matrix alongside.
Rule: always verify a single pair manually before trusting the whole matrix in prod.
Key Takeaway
Trace a small graph to internalise the recurrence.
Path reconstruction needs a separate next[][] table.
Once k runs its course, the matrix is final.

Matrix Evolution: From A₀ to A₄

Each iteration of the outer loop over k produces a new distance matrix, denoted Aₖ, where Aₖ[i][j] is the shortest path from i to j that only uses intermediate nodes in {0..k-1}. The initial matrix A₀ uses no intermediate nodes (only direct edges). After k=0, A₁ considers node 0 as intermediate. After k=3, A₄ considers all nodes. Below is the evolution for the 4-node example from the worked example.

A₀ (direct edges only): [0, 3, ∞, 7] [8, 0, 2, ∞] [5, ∞, 0, 1] [2, ∞, ∞, 0]

A₁ (intermediate node 0 allowed): [0, 3, ∞, 7] [8, 0, 2, 15] [5, 8, 0, 1] (via 0: 5+3=8 for (2,1)) [2, 5, ∞, 0] (via 0: 2+3=5 for (3,1))

A₂ (intermediate nodes 0,1 allowed): [0, 3, 5, 7] (via 1: 3+2=5 for (0,2)) [8, 0, 2, 15] [5, 8, 0, 1] [2, 5, 7, 0] (via 1: 5+2=7 for (3,2))

A₃ (intermediate nodes 0,1,2 allowed): [0, 3, 5, 6] (via 2: 5+1=6 for (0,3)) [8, 0, 2, 3] (via 2: 2+1=3 for (1,3)) [5, 8, 0, 1] [2, 5, 7, 0]

A₄ (all nodes 0,1,2,3 allowed) — final: [0, 3, 5, 6] [5, 0, 2, 3] (via 3: 3+2=5 for (1,0)) [3, 6, 0, 1] (via 3: 1+2=3 for (2,0)) [2, 5, 7, 0]

Notice how each matrix builds on the previous one. The DP invariant is that after processing k, the matrix holds shortest paths using only nodes 0..k as intermediates. This phase evolution is exactly what makes Floyd-Warshall a dynamic programming algorithm.

Production Insight
In production, you rarely print all intermediate matrices. But understanding the phase property helps debug off-by-one errors: if distances are correct for small paths but wrong for long ones, you may have skipped a k iteration. Also, the in-place update works (proof via DP invariant) — you don't need separate matrices for each phase.
Key Takeaway
Each outer iteration k adds node k as a possible intermediate. The matrix after all k is the final APSP. Tracing phases deepens intuition for correctness.

Negative Cycle Detection and Handling

A negative weight cycle makes shortest paths undefined because you can loop infinitely to reduce cost. Floyd-Warshall can detect them easily.

After the main triple loop, check the diagonal of the distance matrix. If any dist[i][i] < 0, node i is part of a negative cycle. However, this only tells you that a cycle exists, not which nodes are affected. In fact, every node that can reach and be reached from that cycle also has undefined shortest paths (distances become -∞).

To fully identify affected nodes, run a reachability analysis: find all nodes that can reach the cycle and all nodes the cycle can reach. Those nodes' distances are effectively -∞.

In code, after detecting dist[i][i] < 0, mark node i as 'bad'. Then propagate: if dist[bad][j] < INF then j is also bad (can reach from bad). Also if dist[k][bad] < INF then k is bad (can reach bad). Combine.

io_thecodeforge_floyd_negative_cycle.pyPYTHON
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
INF = float('inf')

def floyd_warshall_detect_affected(n, edges):
    dist = [[INF]*n for _ in range(n)]
    for i in range(n):
        dist[i][i] = 0
    for u,v,w in edges:
        dist[u][v] = w

    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]

    # Detect cycle nodes
    bad = [False]*n
    for i in range(n):
        if dist[i][i] < 0:
            bad[i] = True

    # Propagate: nodes that can reach or be reached from bad
    for i in range(n):
        for j in range(n):
            if dist[i][j] != INF and (bad[i] or bad[j]):
                bad[i] = bad[j] = True

    # Set affected pairs to -INF
    for i in range(n):
        for j in range(n):
            if bad[i] or bad[j]:
                dist[i][j] = -float('inf')

    return dist
Output
dist[i][j] = -inf for any pair involving a node reachable from the negative cycle
Production Insight
Most textbook implementations stop at 'return None'. That's fine for interviews, but in production you need to tell the caller which paths are affected.
In a routing service, returning -∞ might crash the UI. Return a special sentinel or throw a checked exception with details.
Rule: never hide the cycle — report which nodes are poisoned.
Key Takeaway
Check dist[i][i] < 0 for negative cycle detection.
Affected nodes are those reachable to/from the cycle.
Report affected pairs, not just 'cycle exists'.

Using Floyd-Warshall for Transitive Closure

Transitive closure asks: for every pair (i,j), is there a path from i to j? This is a boolean version of APSP. Floyd-Warshall can compute the transitive closure in O(V³) time by using a boolean reachability matrix instead of a distance matrix. Initialize reach[i][i] = true, and for each edge (u,v), set reach[u][v] = true. Then the core update becomes: reach[i][j] = reach[i][j] OR (reach[i][k] AND reach[k][j]).

This is exactly the same DP structure, but with boolean logic. It's faster to implement than a DFS/BFS from every node (O(V·(V+E))) and uses less code. However, for large graphs, you could use bitset optimisation (V³/wordsize) if the graph is dense.

One production trick: if you already have the shortest path matrix with INF, you can derive transitive closure by replacing INF with 0 and any finite distance with 1. But the dedicated boolean version saves memory and avoids overflow concerns.

io/thecodeforge/TransitiveClosure.javaJAVA
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
public class TransitiveClosure {
    public static boolean[][] transitiveClosure(int n, boolean[][] edges) {
        boolean[][] reach = new boolean[n][n];
        for (int i = 0; i < n; i++) reach[i][i] = true;
        for (int u = 0; u < n; u++) {
            for (int v = 0; v < n; v++) {
                if (edges[u][v]) reach[u][v] = true;
            }
        }
        for (int k = 0; k < n; k++) {
            for (int i = 0; i < n; i++) {
                if (reach[i][k]) {
                    for (int j = 0; j < n; j++) {
                        if (reach[k][j]) reach[i][j] = true;
                    }
                }
            }
        }
        return reach;
    }
}
Output
reach[i][j] = true if path from i to j exists
Production Insight
Transitive closure is used in build systems (dependency resolution), type checkers (cast safety), and database query optimisers. Floyd-Warshall's boolean variant is often faster than repeated BFS/DFS when the graph is dense (E~V²). For sparse graphs, BFS from each node runs in O(V(V+E)) which is better.
Key Takeaway
Floyd-Warshall can compute reachability with boolean matrix. It's a direct adaptation of the APSP algorithm. Use it when the graph is dense and you need all-pairs reachability.

When to Use Floyd-Warshall vs Alternatives

Use Floyd-Warshall when: - Graph is dense (E ~ V²) - V ≤ 1000 (otherwise memory becomes huge) - Negative edge weights are present (Dijkstra can't handle them) - All-pairs results are required (not just single source)

Use repeated Dijkstra when: - Graph is sparse (E << V²) - No negative edges - V is large (10k+)

Use Bellman-Ford per source when: - Negative edges exist but graph is very sparse - V ≤ 500

Use Johnson's algorithm when: - Sparse graph with negative edges and large V - It reweights via Bellman-Ford then runs Dijkstra per source

Real performance numbers: For V=1000, E=10000 (dense-ish), Floyd-Warshall is about 1 second in Java (JIT warmed). Repeated Dijkstra (binary heap) takes ~0.3 seconds. For V=5000, Floyd-Warshall takes ~100 seconds and 200 MB memory; Dijkstra takes ~2 seconds. The crossover point depends on graph density.

Production Insight
Teams often default to Floyd-Warshall because 'it's easier to implement'. That's a mistake for sparse graphs.
I've seen a startup cache routing table for 50k nodes using Floyd-Warshall — 2.5 billion pairs, 200 GB matrix. They had to rewrite with Johnson's.
Rule: profile your graph density before choosing. If E < V·log V, don't use Floyd-Warshall.
Key Takeaway
Dense small graph with negatives → Floyd-Warshall.
Sparse large graph with no negatives → Dijkstra per source.
Sparse large graph with negatives → Johnson's.
Algorithm Selection for APSP
IfNo negative edges, V < 10000
UseUse repeated Dijkstra with binary heap (O(V·(E log V)))
IfNegative edges, dense (E > V·log V), V ≤ 1000
UseUse Floyd-Warshall (O(V³))
IfNegative edges, sparse, V ≤ 500
UseUse repeated Bellman-Ford (O(V²·E))
IfNegative edges, sparse, large V
UseUse Johnson's algorithm (reweight + Dijkstra per source)
IfAll-pairs needed, small graph (V ≤ 200)
UseFloyd-Warshall is simplest, even if sparse

Advantages and Disadvantages of Floyd-Warshall

AdvantageDisadvantage
Simple to implement: only three nested loopsO(V³) time complexity — too slow for V > 1000 in practice
Handles negative edge weights (unlike Dijkstra)Fails on graphs with negative cycles (detectable)
Guarantees correct APSP after exactly V iterations of outer loopO(V²) memory — cannot scale to graphs with 10k+ nodes
No need for complex data structures (heaps, adjacency lists)On sparse graphs, repeated Dijkstra or Johnson's is faster
Can be adapted for transitive closure, minimax path, etc.Not suitable for streaming or dynamic graphs (no incremental update)

The most significant pro is simplicity: Floyd-Warshall is often the first all-pairs algorithm taught in algorithms courses. Its deterministic runtime (always V³) makes it predictable for small graphs. The biggest con is scalability — both time and memory explode with V. In production, a V=2000 graph with Floyd-Warshall takes 8 seconds (O(V³) = 8e9 operations) and 32 MB for a long matrix. V=5000 takes 125 seconds and 200 MB — already problematic. V=10000 would take 1000 seconds and 800 MB — infeasible for most real-time systems.

Production Insight
If you're building an ETA service for a city with 5000 intersections and you need all-pairs, Floyd-Warshall is a poor choice. Instead, run Dijkstra per source with a fast heap, or use contraction hierarchies. The simplicity of Floyd-Warshall often seduces teams into using it beyond its limits.
Key Takeaway
Floyd-Warshall is best for dense graphs with V ≤ 1000 and negative edges. For larger or sparser graphs, choose another algorithm.

Handling Floating-Point Weights with EPS

When edge weights are floating-point numbers (e.g., travel time in hours), directly comparing < can produce wrong results due to tiny arithmetic errors. For example, after many updates, two paths may have distances 0.9999999 and 1.0000001 due to floating-point precision. A strict < comparison would treat the first as shorter, but the true difference is noise. The standard fix is to use an epsilon (EPS) when comparing distances. Typically, EPS = 1e-9 or 1e-12 for double-precision.

The production guard becomes: if (fabs(via - dist[i][j]) < EPS) then consider them equal and keep the existing distance to avoid oscillating updates. Or use via < dist[i][j] - EPS to consider only significant improvements. This prevents the algorithm from infinitely tightening distances due to floating-point drift.

Floating-point Floyd-Warshall is rare in competitive programming (where weights are integers), but common in real-world simulations, GIS, and physics engines. If all weights are integers, prefer integers to avoid EPS headaches entirely.

io/thecodeforge/FloydWarshallDouble.javaJAVA
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
public class FloydWarshallDouble {
    private static final double INF = Double.MAX_VALUE / 2.0;
    private static final double EPS = 1e-9;

    public static double[][] shortestPaths(int n, double[][] edges) {
        double[][] dist = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                dist[i][j] = (i == j) ? 0.0 : INF;
            }
        }
        for (double[] e : edges) {
            dist[(int)e[0]][(int)e[1]] = e[2];
        }

        for (int k = 0; k < n; k++) {
            for (int i = 0; i < n; i++) {
                if (dist[i][k] == INF) continue;
                for (int j = 0; j < n; j++) {
                    if (dist[k][j] == INF) continue;
                    double via = dist[i][k] + dist[k][j];
                    // Use EPS to avoid false improvements from rounding
                    if (via < dist[i][j] - EPS) {
                        dist[i][j] = via;
                    }
                }
            }
        }

        // Negative cycle detection with EPS
        for (int i = 0; i < n; i++) {
            if (dist[i][i] < -EPS) {
                return null; // negative cycle
            }
        }
        return dist;
    }
}
Output
All-pairs shortest distances with floating-point epsilon protection.
Production Insight
Choosing the right EPS is critical: too large and you mask real differences; too small and you're back to noise problems. For distances in the range [0, 1e6], EPS = 1e-9 works well. Also, when using double, be aware that adding many small numbers can cause accumulation errors. Prefer long for integer weights whenever possible.
Key Takeaway
Use EPS when comparing floating-point distances to avoid false updates. Set EPS = 1e-9 for double precision. Prefer integer weights to eliminate this complexity.

Practice Problems

Solidify your understanding of Floyd-Warshall with these problems that test each aspect of the algorithm: basic implementation, negative cycle detection, transitive closure, and production-like scenarios.

  1. CSES Shortest Routes II — The classic Floyd-Warshall problem: compute all-pairs shortest paths for a directed graph with positive and negative weights. Must handle up to V=500, E=2500. Validate your implementation against INF guards and negative cycle detection. [Link to CSES](https://cses.fi/problemset/task/1674)
  2. Codeforces 295B: Greg and Graph — An offline version where nodes are removed one by one and you need the sum of all shortest paths after each removal. Reverse the process and build the graph by adding nodes in reverse order, applying Floyd-Warshall incrementally. Tests understanding of the DP phase property. [Link to Codeforces](https://codeforces.com/problemset/problem/295/B)
  3. UVA 104: Arbitrage — Use Floyd-Warshall (or Floyd-like) to find a sequence of currency exchanges that yields a profit (positive cycle). This is a classic application of detecting positive cycles using multiplication or logarithms. [Link to UVA](https://onlinejudge.org/index.php?option=com_onlinejudge&Itemid=8&category=24&page=show_problem&problem=40)
  4. SPOJ TRANSP — Transitive closure on a directed graph. Use the boolean Floyd-Warshall to determine if there is a path between every pair. Test your implementation of the OR-AND variation. [Link to SPOJ](https://www.spoj.com/problems/TRANSP/)
  5. LeetCode 1462: Course Schedule IV — Equivalent to transitive closure: given prerequisites (edges), answer queries whether a course is a prerequisite of another. Use Floyd-Warshall with boolean matrix. V ≤ 100. [Link to LeetCode](https://leetcode.com/problems/course-schedule-iv/)
  6. CSES Flight Routes — Given a flight network with costs, find the cheapest route between all city pairs. Must output the routes (path reconstruction) and use long to avoid overflow. [Link to CSES](https://cses.fi/problemset/task/1666)
  7. Codeforces 1140C: Playlist — While not directly APSP, this problem can be solved with a variant of Floyd-Warshall to compute maximum path attribute ; practice adapting the DP pattern. [Link to Codeforces](https://codeforces.com/problemset/problem/1140/C)
Production Insight
These problems simulate real-world demands: memory limits, overflow, path reconstruction, and incremental updates. The CSES and Codeforces problems in particular have tight limits that will force you to optimise INF guards and loop order. Practice under time pressure to internalise the algorithm.
Key Takeaway
Implement Floyd-Warshall from scratch for these problems. Ensure your code passes with V up to 500 and negative edges. Mastering these will prepare you for both interviews and production work.
● Production incidentPOST-MORTEMseverity: high

Integer Overflow in Shortest Path Server Took Down Routing

Symptom
Some city-pair travel times returned negative numbers; others showed impossibly large values. The front-end displayed 'arriving in -5 minutes'.
Assumption
Engineers assumed edge weights (<=100) would keep sums within int range. They didn't account for paths with 100+ hops.
Root cause
dist matrix declared as int[][]. The sum dist[i][k] + dist[k][j] overflowed when both were large, wrapping to negative. Floyd-Warshall continued with the negative value as a 'shorter' path.
Fix
Changed dist matrix to long[][]. Added overflow check before the update: if (dist[i][k] != INF && dist[k][j] != INF && (dist[i][k] > INF - dist[k][j])) skip update. Also set INF to Long.MAX_VALUE / 2.
Key lesson
  • Always use long for shortest path distances when edge weights can be >1 or V > 1000.
  • Check for overflow before adding — not after. Post-fix detection is too late.
  • Set INF to a value that won't overflow when doubled, e.g., Long.MAX_VALUE / 2.
Production debug guideSymptom to action — diagnose Floyd-Warshall failures fast4 entries
Symptom · 01
Distances are negative, but you checked no negative cycles
Fix
Verify dist[i][i] after the main loop. If any diagonal < 0, a negative cycle exists. If not, check integer overflow (switched to long?).
Symptom · 02
Algorithm runs out of memory for V > 5000
Fix
Floyd-Warshall uses O(V²) memory. For 10k nodes, that's ~800 MB with long[][]. Switch to repeated Dijkstra (O(V·(E+V log V))) or use Bellman-Ford if negative edges but sparse.
Symptom · 03
Running time is 10x slower than expected
Fix
Ensure compiler optimisations are on. In Python, nested loops are slow; use NumPy or Cython. In Java, hot loop JIT compiles after a few runs (warm up).
Symptom · 04
Some shortest paths are missing (shorter unreported)
Fix
The algorithm handles all intermediates if the loop order is k outer, i middle, j inner. Check you didn't reorder loops. Order matters for correctness.
★ Floyd-Warshall Quick Debug Cheat SheetInstant diagnostic commands for Floyd-Warshall issues
dist[i][i] < 0 after algorithm
Immediate action
Negative cycle exists. Report error: 'graph contains negative cycle, APSP undefined'.
Commands
print (any(dist[i][i] < 0 for i in range(V)))
for i in range(V): if dist[i][i] < 0: print(f'node {i} on negative cycle')
Fix now
Remove negative cycles by increasing edge weights or using Bellman-Ford cycle detection first.
dist values overflow (wrap to negative)+
Immediate action
Check data type of dist matrix. If int, change to long. If long and still overflow, increase INF guard.
Commands
printf("%ld", dist[i][k] + dist[k][j]); // print sum before min
dist[i][j] = (dist[i][k] == INF || dist[k][j] == INF) ? dist[i][j] : min(dist[i][j], dist[i][k] + dist[k][j]);
Fix now
Replace dist[i][k] + dist[k][j] with a safe add function that returns INF on overflow.
OutOfMemoryError for large V+
Immediate action
Stop using Floyd-Warshall. Switch to Johnson's algorithm (reweight + Dijkstra) or repeated Bellman-Ford.
Commands
Runtime.getRuntime().totalMemory()
java -Xmx4g -jar app.jar # increase heap if you must
Fix now
Reduce V or use algorithm with O(V) space per source.
Floyd-Warshall vs Alternatives
AlgorithmTime ComplexitySpace ComplexityHandles Negative Edges?Best For
Floyd-WarshallO(V³)O(V²)Yes (no negative cycles)Dense graphs, V ≤ 1000
Repeated Dijkstra (binary heap)O(V·(E log V))O(V²) (store all results) or O(V) per sourceNoSparse graphs, large V, no negative edges
Repeated Bellman-FordO(V²·E)O(V²) or O(V) per sourceYesVery sparse, small V, negative edges
Johnson's AlgorithmO(V·E + V·(E log V))O(V²)YesSparse graphs, negative edges, large V

Key takeaways

1
Floyd-Warshall finds shortest paths between ALL pairs of nodes in O(V³) time.
2
It uses dynamic programming
dist[i][j] is updated by considering every node k as a potential intermediate.
3
It handles negative edge weights but not negative cycles. Check dist[i][i] < 0 to detect cycles.
4
Space complexity is O(V²) for the distance matrix
limits V to ~10k in production.
5
For sparse graphs, repeated Dijkstra is faster; Floyd-Warshall shines on dense graphs or when negative weights are present.
6
Keep the loop order k outside, i middle, j inner
reversing breaks correctness.

Common mistakes to avoid

4 patterns
×

Using Floyd-Warshall on a sparse graph

Symptom
Algorithm takes hours for V=5000, E=10000. Memory usage blows up to 200 MB for the matrix.
Fix
Switch to repeated Dijkstra (if no negative edges) or Johnson's algorithm. Measure density: if E < V·log V, avoid Floyd-Warshall.
×

Not checking for integer overflow in the distance matrix

Symptom
Distances become negative or incorrect after many updates. Possible paths wrap around INT_MAX.
Fix
Use long (64-bit) for distances. Set INF to Long.MAX_VALUE / 2. Add overflow check before adding: if (dist[i][k] > INF - dist[k][j]) skip.
×

Reordering the loops (i,k,j or i,j,k)

Symptom
Algorithm produces incorrect distances. Some paths not discovered.
Fix
Always keep k as the outermost loop, i middle, j inner. The recurrence relies on k being the 'intermediate node under consideration'.
×

Assuming all nodes are affected if a negative cycle exists

Symptom
Marking every pair as -∞ when only a subset is reachable from the cycle. Unnecessary loss of correct distances.
Fix
Propagate reachability from the cycle as shown in the Negative Cycle Detection section. Only affected pairs become undefined.
INTERVIEW PREP · PRACTICE MODE

Interview Questions on This Topic

Q01JUNIOR
What is the time and space complexity of Floyd-Warshall?
Q02SENIOR
How does Floyd-Warshall detect negative cycles?
Q03SENIOR
When would you choose Floyd-Warshall over repeated Dijkstra or Bellman-F...
Q04SENIOR
How would you modify Floyd-Warshall to reconstruct the actual shortest p...
Q01 of 04JUNIOR

What is the time and space complexity of Floyd-Warshall?

ANSWER
Time: O(V³) from the triple-nested loop. Space: O(V²) for the distance matrix (and optionally O(V²) for the next matrix for path reconstruction). Each operation is constant time (addition and comparison). The algorithm solves the all-pairs shortest path problem in one pass.
FAQ · 5 QUESTIONS

Frequently Asked Questions

01
When should I use Floyd-Warshall instead of running Dijkstra for every node?
02
Does Floyd-Warshall work on undirected graphs?
03
How do I detect a negative cycle with Floyd-Warshall?
04
How do I reconstruct the actual path, not just the distance?
05
Can Floyd-Warshall handle graphs with 100,000 nodes?
🔥

That's Graphs. Mark it forged?

11 min read · try the examples if you haven't

Previous
Bellman-Ford Algorithm
6 / 17 · Graphs
Next
Topological Sort