Senior 10 min · March 05, 2026
Graph Representation

45-Minute BFS: Adjacency List vs Matrix Memory Mistake

A 2M-node, 10M-edge real graph using adjacency matrix required 4TB memory, slowing BFS to 45 minutes.

N
Naren Founder & Principal Engineer

20+ years shipping performance-critical code where algorithms decide the bill. Lessons pulled from things that broke in production.

Follow
Production
production tested
May 24, 2026
last updated
1,554
articles · all by Naren
 ● Production Incident 🔎 Debug Guide ⚙ Triage Commands
Quick Answer
  • Graph representation decides how edges and nodes are stored in memory
  • Adjacency matrix: O(V²) space, O(1) edge lookup
  • Adjacency list: O(V+E) space, O(degree) neighbor access
  • Matrix wastes memory on sparse graphs; list dominates real-world apps
  • Performance insight: matrix BFS on 10K nodes scans 100M cells; list BFS traverses only edges
  • Production insight: wrong representation turns a seconds-long algorithm into hours
  • Biggest mistake: using matrix for sparse graph without checking E vs V²
✦ Definition~90s read
What is Graph Representation?

Graph representation is the foundational decision in graph algorithms—choosing between an adjacency matrix and an adjacency list determines whether your BFS finishes in 45 minutes or 45 seconds. An adjacency matrix stores edges in a V×V grid, giving O(1) edge lookups but O(V²) memory, which kills you at 10,000+ nodes (100 million entries, ~800 MB for ints).

Imagine a city map.

An adjacency list stores edges per vertex as linked lists or arrays, using O(V+E) memory—typically 10-100x less for sparse graphs—but costs O(degree) per neighbor iteration. This tradeoff is why production graph databases (Neo4j, Amazon Neptune) and frameworks (NetworkX, igraph) default to adjacency lists for real-world graphs, which are almost always sparse (E << V²).

The mistake that costs 45 minutes is using a matrix for a sparse graph with 50,000 nodes: you allocate 2.5 billion cells, thrash cache, and turn a linear BFS into a quadratic memory disaster. Conversely, for dense graphs (E ≈ V²/2) like social networks with 10,000 users where everyone knows everyone, the matrix wins on iteration speed.

The decision framework boils down to: if your graph fits in RAM as a matrix and you need constant-time edge checks, use it; otherwise, the list is your only scalable option.

Plain-English First

Imagine a city map. Every city is a dot, and every road connecting two cities is a line. A graph is exactly that — dots (called nodes) connected by lines (called edges). Now, how do you store that map inside a computer's memory? That's the whole game of graph representation. You're essentially choosing the best filing system for your map so that lookups, additions, and traversals are as fast as possible.

Graphs power more of your daily software than you'd guess. Google Maps finds your fastest route using graph traversal. Facebook's friend suggestions use graph connectivity. LinkedIn's 'second connection' feature? Graph theory. Every time you see a network of relationships — people, cities, servers, web pages — there's a graph underneath. Understanding how to store that graph isn't a minor detail; it's the decision that determines whether your algorithm runs in milliseconds or minutes.

The core problem is this: a graph is an abstract concept, but your computer only understands memory — arrays, pointers, contiguous blocks of data. You need a concrete data structure that faithfully represents which nodes exist and which nodes are connected, without wasting memory or making lookups unnecessarily slow. Two approaches dominate: the adjacency matrix and the adjacency list. Each is a completely different philosophy about the trade-off between memory usage and access speed, and choosing the wrong one for your problem is a classic, costly mistake.

By the end of this article you'll be able to build both representations from scratch in Java, explain in plain English why each one exists, look at a problem description and immediately know which structure to reach for, and walk into an interview and answer the follow-up questions that trip most candidates up.

Graph Representation: The Memory vs Speed Tradeoff You Can't Ignore

Graph representation is how you store relationships between nodes in memory. The two dominant strategies are adjacency matrices and adjacency lists. A matrix uses a V×V boolean or integer array where cell [i][j] is 1 if edge i→j exists. A list stores for each vertex a collection of its neighbors — typically an ArrayList or LinkedList. That's it. The choice determines whether your BFS runs in O(V²) memory or O(V+E) memory, and whether edge existence checks are O(1) or O(degree).

In practice, adjacency matrices waste space on sparse graphs: a social network with 10⁶ users and ~100 friends each would need 10¹² entries (terabytes) vs 10⁸ entries (megabytes) for lists. But matrices give constant-time edge queries, which matters for dense graphs like fully connected layers in neural networks or all-pairs shortest paths on small graphs (V < 1000). Lists dominate in real-world systems because most graphs are sparse — roads, web links, social connections.

Use adjacency lists for BFS/DFS traversal, shortest path on sparse graphs, and any algorithm that iterates neighbors. Use matrices only when V is small (under a few thousand) or you need O(1) edge existence checks and can afford the memory. The wrong choice here is the most common memory mistake in graph algorithms — picking a matrix for a sparse graph will crash your process or exhaust heap.

Sparse vs Dense: The 5% Rule
If your graph has fewer than 5% of possible edges, an adjacency matrix wastes at least 95% of its allocated memory — and that's before considering cache misses.
Production Insight
Teams building a recommendation engine stored user-item interactions in a 500k×500k adjacency matrix — 250 billion entries, 1 TB of memory. The symptom was repeated OutOfMemoryError during BFS-based collaborative filtering. The rule: if E < V²/10, use adjacency lists; if E > V²/2, consider a matrix.
Key Takeaway
Adjacency lists are the default for sparse graphs (most real-world graphs).
Adjacency matrices are only justified for dense graphs or when O(1) edge lookup is critical.
Memory footprint scales as O(V²) for matrix vs O(V+E) for list — choose based on edge density, not convenience.
Graph Repr: Adjacency Matrix vs List Memory Mistake THECODEFORGE.IO Graph Repr: Adjacency Matrix vs List Memory Mistake Memory vs speed tradeoff in graph representation choices Adjacency Matrix O(V^2) memory, O(1) edge lookup Adjacency List O(V+E) memory, O(deg) edge lookup Edge List O(E) memory, O(E) edge lookup Directed & Weighted Graphs Edges have direction and weight Decision Framework Choose based on density and operations ⚠ Common trap: assuming matrix is always faster For sparse graphs, adjacency list uses less memory and is faster THECODEFORGE.IO
thecodeforge.io
Graph Repr: Adjacency Matrix vs List Memory Mistake
Graph Representation

Worked Example — Building and Comparing Graph Representations

Graph: 4 nodes (0,1,2,3), edges: 0-1, 0-2, 1-2, 2-3.

Adjacency Matrix (4x4, undirected): 0 1 2 3 0 [0, 1, 1, 0] 1 [1, 0, 1, 0] 2 [1, 1, 0, 1] 3 [0, 0, 1, 0]

Check if 0-3 are connected: matrix[0][3] = 0 → No. O(1). Space: O(V^2) = O(16). For dense graphs this is fine; for 1000 nodes with 10 edges, this wastes 999,990 cells.

Adjacency List: 0: [1, 2] 1: [0, 2] 2: [0, 1, 3] 3: [2]

Check if 0-3 connected: scan 0's list [1,2] → not found. O(degree(0)). Space: O(V + E) = O(4 + 4) = O(8). Much better for sparse graphs.

Edge List: [(0,1),(0,2),(1,2),(2,3)]. Space: O(E). Used in Kruskal's MST algorithm where edges are sorted by weight.

Production Insight
The space difference in this tiny example is 2x (16 vs 8).
For a graph with 50K nodes and 100K edges, the difference is 2.5 billion cells vs 100K entries.
Rule: always estimate E before choosing — the matrix is a memory bomb for sparse graphs.
Key Takeaway
Small examples hide the real cost.
Always scale the space calculation to your actual problem size.
If V is large and E is small, adjacency list is the only viable option.

Adjacency Matrix — The 'Spreadsheet' Approach to Storing Edges

An adjacency matrix is a 2D grid where both the rows and columns represent nodes. If there's an edge between node A and node B, you put a 1 (or the edge weight) at position [A][B]. If there's no edge, you store 0. That's it.

Why would you choose a grid? Because checking whether two specific nodes are connected is O(1) — instantaneous. You just index directly into the array like matrix[3][7]. There's no searching, no traversal. That property is invaluable when your core operation is 'does this edge exist?'

The cost you pay is memory. If your graph has N nodes, you always allocate N × N space — even if your graph only has a handful of edges. A social network with 100 million users storing an adjacency matrix would need 10 quadrillion cells. That's why adjacency matrices are reserved for dense graphs — graphs where most possible edges actually exist — and for smaller graphs where the O(1) edge lookup is worth the memory price.

For weighted graphs (like road distances), you simply store the weight instead of 1. For unweighted undirected graphs, the matrix is always symmetric — matrix[i][j] always equals matrix[j][i].

AdjacencyMatrix.javaJAVA
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
public class AdjacencyMatrix {

    private final int[][] matrix;   // The actual grid storing edge data
    private final int nodeCount;    // Total number of nodes in the graph

    public AdjacencyMatrix(int nodeCount) {
        this.nodeCount = nodeCount;
        // Allocate a nodeCount x nodeCount grid, all initialised to 0 (no edges)
        this.matrix = new int[nodeCount][nodeCount];
    }

    // Add an undirected edge between two nodes (both directions matter)
    public void addEdge(int fromNode, int toNode) {\n        // Validate to avoid silent index-out-of-bounds bugs in production\n        if (fromNode < 0 || toNode < 0 || fromNode >= nodeCount || toNode >= nodeCount) {\n            throw new IllegalArgumentException(\n                \"Node index out of range: \" + fromNode + \", \" + toNode\n            );\n        }\n        matrix[fromNode][toNode] = 1;  // Mark edge in one direction\n        matrix[toNode][fromNode] = 1;  // Mark edge in the other (undirected)\n    }\n\n    // O(1) check — this is the matrix's superpower\n    public boolean hasEdge(int fromNode, int toNode) {\n        return matrix[fromNode][toNode] == 1;\n    }\n\n    // Print the full grid so we can visualise the structure\n    public void printMatrix() {\n        System.out.print(\"    \");\n        for (int col = 0; col < nodeCount; col++) {\n            System.out.printf(\"%3d\", col);  // Column headers\n        }\n        System.out.println();\n\n        for (int row = 0; row < nodeCount; row++) {\n            System.out.printf(\"%3d |\", row);  // Row header\n            for (int col = 0; col < nodeCount; col++) {\n                System.out.printf(\"%3d\", matrix[row][col]);\n            }\n            System.out.println();\n        }\n    }\n\n    public static void main(String[] args) {\n        // Modelling a small city road network with 5 intersections (nodes 0-4)\n        AdjacencyMatrix cityRoads = new AdjacencyMatrix(5);\n\n        cityRoads.addEdge(0, 1);  // Road between intersection 0 and 1\n        cityRoads.addEdge(0, 4);  // Road between intersection 0 and 4\n        cityRoads.addEdge(1, 2);  // Road between intersection 1 and 2\n        cityRoads.addEdge(1, 3);  // Road between intersection 1 and 3\n        cityRoads.addEdge(3, 4);  // Road between intersection 3 and 4\n\n        cityRoads.printMatrix();\n\n        System.out.println();\n        // O(1) lookup — is there a direct road from 0 to 3?\n        System.out.println(\"Direct road from 0 to 3? \" + cityRoads.hasEdge(0, 3));\n        System.out.println(\"Direct road from 1 to 3? \" + cityRoads.hasEdge(1, 3));\n    }\n}",
        "output": "      0  1  2  3  4\n  0 |  0  1  0  0  1\n  1 |  1  0  1  1  0\n  2 |  0  1  0  0  0\n  3 |  0  1  0  0  1\n  4 |  1  0  0  1  0\n\nDirect road from 0 to 3? false\nDirect road from 1 to 3? true"
      }

Adjacency List — The 'Contacts App' Approach That Scales

An adjacency list stores a graph differently: instead of a giant grid, each node gets its own personal list of the nodes it's directly connected to. If node 3 connects to nodes 1, 5, and 7, then node 3's list contains [1, 5, 7]. Nothing more.

Think of it like a contacts app. Each person (node) has a list of their direct friends (neighbours). You don't store a slot for every person on the planet that says 'not friends'. You only store the actual connections.

This is why adjacency lists dominate in the real world: most graphs are sparse — they have far fewer edges than the theoretical maximum. A social network with a billion users might have each user averaging 200 friends. An adjacency matrix would waste an astronomical amount of memory on zeros. An adjacency list stores only the real edges, giving you O(V + E) space where V is vertices and E is edges.

The trade-off: checking if a specific edge exists is now O(degree of node) — you have to scan the neighbour list. But for most graph algorithms like BFS and DFS, you're iterating over neighbours anyway, which is exactly what adjacency lists are optimised for.

AdjacencyList.javaJAVA
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class AdjacencyList {

    // Each node maps to a list of its direct neighbours
    // Using a Map allows nodes to have any label (String, Integer, etc.)
    private final Map<Integer, List<Integer>> adjacencyList;

    public AdjacencyList() {
        this.adjacencyList = new HashMap<>();
    }

    // Register a node so it exists in the graph even with no edges
    public void addNode(int node) {
        adjacencyList.putIfAbsent(node, new ArrayList<>());
    }

    // Add an undirected edge — each side adds the other to its neighbour list
    public void addEdge(int fromNode, int toNode) {\n        addNode(fromNode);  // Ensure both nodes exist before adding edges\n        addNode(toNode);\n\n        adjacencyList.get(fromNode).add(toNode);  // fromNode now knows about toNode\n        adjacencyList.get(toNode).add(fromNode);  // toNode now knows about fromNode\n    }

    // O(degree) check — scan the neighbour list of fromNode
    public boolean hasEdge(int fromNode, int toNode) {\n        List<Integer> neighbours = adjacencyList.getOrDefault(fromNode, new ArrayList<>());\n        return neighbours.contains(toNode);\n    }

    // Get all direct neighbours — this is the operation BFS/DFS uses constantly
    public List<Integer> getNeighbours(int node) {
        return adjacencyList.getOrDefault(node, new ArrayList<>());
    }

    // Print each node alongside its neighbour list
    public void printList() {
        for (Map.Entry<Integer, List<Integer>> entry : adjacencyList.entrySet()) {
            System.out.println("Node " + entry.getKey() + " --> " + entry.getValue());
        }
    }

    public static void main(String[] args) {
        // Modelling a simplified version of a flight route network
        AdjacencyList flightRoutes = new AdjacencyList();

        flightRoutes.addEdge(0, 1);  // London -- Paris
        flightRoutes.addEdge(0, 4);  // London -- Dubai
        flightRoutes.addEdge(1, 2);  // Paris -- New York
        flightRoutes.addEdge(1, 3);  // Paris -- Tokyo
        flightRoutes.addEdge(3, 4);  // Tokyo -- Dubai

        System.out.println("=== Flight Route Adjacency List ===");
        flightRoutes.printList();

        System.out.println();
        System.out.println("Neighbours of node 1 (Paris): " + flightRoutes.getNeighbours(1));
        System.out.println("Direct flight from 0 to 3? " + flightRoutes.hasEdge(0, 3));
        System.out.println("Direct flight from 1 to 4? " + flightRoutes.hasEdge(1, 4));
    }
}
Output
=== Flight Route Adjacency List ===
Node 0 --> [1, 4]
Node 1 --> [0, 2, 3]
Node 2 --> [1]
Node 3 --> [1, 4]
Node 4 --> [0, 3]
Neighbours of node 1 (Paris): [0, 2, 3]
Direct flight from 0 to 3? false
Direct flight from 1 to 4? false
Why BFS and DFS Love Adjacency Lists:
Both BFS and DFS need to repeatedly ask 'give me all neighbours of this node'. With an adjacency list, that operation is instant — you just return the pre-built list. With a matrix, you'd scan an entire row of N cells just to find the non-zero entries. For traversal-heavy algorithms, the list wins every time.
Production Insight
Adjacency list iteration is the bottleneck for graph algorithms.
If you store neighbours in an ArrayList, contains() is O(degree) and may be slow.
For frequent edge existence checks, use HashSet per node instead — O(1) but more memory.
Rule: pick the inner collection based on your dominant operation: list for iteration, set for existence.
Key Takeaway
Adjacency list = O(V+E) space, O(degree) neighbor access.
The default choice for 99% of real-world graphs.
Optimise the inner collection based on whether you traverse or check existence.

Directed and Weighted Graphs — When Edges Have Direction and Cost

So far the edges in our examples have been undirected — a road from A to B means you can also go from B to A. But real-world graphs often have direction. Twitter follows are directed: you following someone doesn't mean they follow you back. One-way streets are directed. Dependencies in a build system are directed.

A directed graph (digraph) changes one thing in the adjacency list: when you add an edge from A to B, you only add B to A's neighbour list — not the reverse. In the matrix, you only set matrix[A][B] = 1, not matrix[B][A].

Weighted graphs add a cost to each edge. Google Maps doesn't just care that roads exist — it cares how long each road takes. For an adjacency list, you swap the simple integer neighbour for an Edge object that holds both the destination and the weight. For a matrix, you store the weight value instead of 1.

The code below combines both concepts — a directed, weighted graph — which is the most realistic structure you'll implement in real interview problems and production systems.

DirectedWeightedGraph.javaJAVA
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DirectedWeightedGraph {

    // Inner class representing a directed edge with a weight (travel cost)
    static class Edge {
        int destination;  // Where this edge points TO
        int weight;       // The cost to traverse this edge (e.g. distance in km)

        Edge(int destination, int weight) {\n            this.destination = destination;\n            this.weight = weight;\n        }

        @Override
        public String toString() {
            return "->" + destination + "(cost:" + weight + ")";
        }
    }

    // Each node has a list of outgoing weighted edges
    private final Map<Integer, List<Edge>> graph;

    public DirectedWeightedGraph() {
        this.graph = new HashMap<>();
    }

    public void addNode(int node) {
        graph.putIfAbsent(node, new ArrayList<>());
    }

    // Directed: ONLY adds toNode to fromNode's list, not the reverse
    public void addDirectedEdge(int fromNode, int toNode, int weight) {
        addNode(fromNode);
        addNode(toNode);
        graph.get(fromNode).add(new Edge(toNode, weight));
        // Notice: we do NOT add fromNode to toNode's list here
    }

    public List<Edge> getOutgoingEdges(int node) {
        return graph.getOrDefault(node, new ArrayList<>());
    }

    public void printGraph() {
        for (Map.Entry<Integer, List<Edge>> entry : graph.entrySet()) {
            System.out.print("Node " + entry.getKey() + " : ");
            for (Edge edge : entry.getValue()) {
                System.out.print(edge + "  ");
            }
            System.out.println();
        }
    }

    public static void main(String[] args) {
        // Modelling a package delivery network where roads are one-directional
        // and weight represents delivery time in minutes
        DirectedWeightedGraph deliveryNetwork = new DirectedWeightedGraph();

        deliveryNetwork.addDirectedEdge(0, 1, 10);  // Warehouse -> Depot A (10 mins)
        deliveryNetwork.addDirectedEdge(0, 2, 30);  // Warehouse -> Depot B (30 mins)
        deliveryNetwork.addDirectedEdge(1, 3, 15);  // Depot A   -> Customer X (15 mins)
        deliveryNetwork.addDirectedEdge(2, 3, 5);   // Depot B   -> Customer X (5 mins)
        deliveryNetwork.addDirectedEdge(1, 2, 8);   // Depot A   -> Depot B (8 mins)

        System.out.println("=== Delivery Network (Directed + Weighted) ===");
        deliveryNetwork.printGraph();

        System.out.println();
        System.out.println("Routes out of Depot A (node 1): "
            + deliveryNetwork.getOutgoingEdges(1));

        // Customer X (node 3) has no outgoing edges — it's a terminal node
        System.out.println("Routes out of Customer X (node 3): "
            + deliveryNetwork.getOutgoingEdges(3));
    }
}
Output
=== Delivery Network (Directed + Weighted) ===
Node 0 : ->1(cost:10) ->2(cost:30)
Node 1 : ->3(cost:15) ->2(cost:8)
Node 2 : ->3(cost:5)
Node 3 :
Routes out of Depot A (node 1): [->3(cost:15), ->2(cost:8)]
Routes out of Customer X (node 3): []
Watch Out — Directed vs Undirected Bugs Are Silent:
If you accidentally add both directions when implementing a directed graph, your algorithm won't crash — it'll just give wrong answers. Dijkstra's on a directed graph with accidental reverse edges will find paths that don't actually exist. Always verify directionality before running any shortest-path algorithm.
Production Insight
Directed graph bugs are silent and deadly.
A missing reverse edge in an undirected graph is a bug; a missing reverse edge in a directed graph is correct.
The real production pitfall: accidentally adding both directions when the graph should be directed.
Rule: make addEdge() enforce directionality explicitly — don't rely on the caller to remember.
Key Takeaway
Directed graphs: addEdge(A,B) adds only A→B.
Always verify direction before running shortest-path algorithms.
Use Edge objects for weights; never store weight in a separate map — it breaks encapsulation.

Choosing the Right Representation: A Decision Framework

You don't need to memorise every trade-off — you need a mental checklist. Here's the decision framework used by senior engineers when starting a new graph problem:

  1. Count V and E: If you don't know them, estimate. For most real problems, E is much less than V².
  2. Check density: density = E / (V*(V-1)/2). If density > 20%, consider matrix. If density < 5%, definitely list.
  3. Identify dominant operations:
  4. - Edge existence checks: matrix wins.
  5. - Iterating neighbors: list wins.
  6. - Adding new nodes dynamically: list wins (matrix resizing is O(V²)).
  7. Consider algorithm: Floyd-Warshall needs matrix. BFS/DFS/Dijkstra need neighbor iteration — list.
  8. Memory budget: If you can't allocate V² * 4 bytes, don't use matrix.

This framework applies to both interviews and production design decisions.

Production Insight
The decision is not permanent — you can convert between representations.
Matrix to list conversion: O(V²) to scan all cells. List to matrix: O(V²) to allocate + O(E) to populate.
Rule: always prototype with list first. Only switch to matrix if profiling shows edge existence check is the bottleneck.
Key Takeaway
Count V and E first — that's 80% of the decision.
Sparse + traversal = adjacency list.
Dense + edge checks = adjacency matrix.
When in doubt, start with list.
Graph Representation Decision Tree
IfE >> V (dense graph) and V < 10,000
UseUse adjacency matrix. O(1) edge checks justify memory cost.
IfE << V² (sparse graph) or V > 10,000
UseUse adjacency list. Default choice.
IfAlgorithm requires all-pairs shortest paths (Floyd-Warshall)
UseUse adjacency matrix – the algorithm already accesses every pair.
IfFrequent dynamic node addition/deletion
UseUse adjacency list. Matrix resizes are expensive.
IfYou need to sort edges by weight (Kruskal's MST)
UseUse edge list. Neither matrix nor list gives you edges as a flat array.

Real-World Performance Benchmark: Matrix vs List on a 10,000 Node Graph

Let's compare both representations on a realistic scenario: a graph with 10,000 nodes.

Sparse case: 15,000 edges (average degree 3). - Matrix: 100 million cells (10,000 × 10,000). At 4 bytes per int = 400 MB. BFS visits each node, scanning 10,000 cells per visit = 100 million checks. - List: 15,000 edge entries plus 10,000 list headers ≈ 25,000 references ≈ 200 KB (plus node objects). BFS visits each neighbor directly, total 15,000 iterations.

Dense case: 50 million edges (density ~100%). - Matrix: 400 MB, same as before. BFS still scans 10,000 per node. - List: 50 million entries ≈ 400 MB (each entry a reference to node object). BFS does 50 million iterations. Here the memory is similar, but list still gives faster neighbor iteration for traversal algorithms.

Key insight: For traversal-heavy algorithms, adjacency list is faster regardless of density because you only touch actual neighbors. The matrix forces you to scan zeros.

For edge-existence-heavy algorithms on dense graphs, matrix wins — but only if you can fit it in memory.

GraphBenchmark.javaJAVA
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
// Snippet: Time to find all neighbors of every node
// Matrix approach
long start = System.nanoTime();
for (int i = 0; i < V; i++) {
    List<Integer> neighbors = new ArrayList<>();
    for (int j = 0; j < V; j++) {
        if (matrix[i][j] != 0) neighbors.add(j);
    }
}
long matrixTime = System.nanoTime() - start;

// List approach
start = System.nanoTime();
for (int i = 0; i < V; i++) {
    List<Integer> neighbors = adjList.get(i);  // O(1)
}
long listTime = System.nanoTime() - start;

System.out.println("Matrix neighbor collection: " + matrixTime + " ns");
System.out.println("List neighbor collection: " + listTime + " ns");

Edge List — The Representation Nobody Talks About (Until Query Performance Bites You)

You've seen the matrix and the list. There's a third player that senior engineers keep in their back pocket: the edge list. It's exactly what it sounds like — a flat list of edges, each entry containing (source, destination, weight).

Why would you ever use something so primitive? Two reasons: serialization and graph construction. When you're building a graph from a CSV file, a database query, or a network stream, the data arrives as edges. You don't need a matrix or adjacency list yet. The edge list is your staging area.

More importantly, the edge list dominates when you need to iterate over every edge in the graph. In the matrix, that's O(V²) — you're checking cells that don't exist. In the adjacency list, you're still traversing per-vertex overhead. The edge list gives you O(E) iteration with the tightest possible memory footprint: 3E integers (or 2E for unweighted). For static analysis, batch processing, or graph algorithms that need edge-level parallelism, this is the default.

The catch? Checking if an edge exists between two vertices is O(E). That's prohibitive for interactive queries. But for offline computation? Edge list is faster than both alternatives.

EdgeListBuild.javaJAVA
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
// io.thecodeforge — dsa tutorial

import java.util.*;

public class EdgeListBuild {
    static class Edge {
        int src, dest, weight;
        Edge(int s, int d, int w) { src = s; dest = d; weight = w; }
    }

    public static void main(String[] args) {
        // Build from CSV: nodeA,nodeB,weight
        String[] raw = {"2,5,3", "7,1,8", "3,9,2"};
        List edges = new ArrayList<>();
        
        for (String line : raw) {
            String[] parts = line.split(",");
            edges.add(new Edge(
                Integer.parseInt(parts[0]),
                Integer.parseInt(parts[1]),
                Integer.parseInt(parts[2])
            ));
        }

        // Iterate all edges — O(E) and no overhead
        int totalWeight = 0;
        for (Edge e : edges) totalWeight += e.weight;
        
        System.out.println("Edges parsed: " + edges.size());
        System.out.println("Total weight: " + totalWeight);
    }
}
Output
Edges parsed: 3
Total weight: 13
Senior Shortcut:
Edge lists serialize to JSON/CSV in one pass. Adjacency lists need nested loops. For any system that exchanges graph data, the edge list is your wire format — convert to matrix or list only when the query pattern demands it.
Key Takeaway
Use edge lists for construction and batch processing; convert to matrix or adjacency list only when queries need random access.

Compressed Sparse Row (CSR) — The Memory Monster's Worst Nightmare

Adjacency lists waste memory on pointer overhead. Each edge in a linked-list-based adjacency list needs a node object with two references and the destination. On a 64-bit JVM, that's 24-32 bytes per edge — just for the node, before you store the actual vertex ID.

Compressed Sparse Row (CSR) kills that overhead dead. It stores graph data in three flat arrays:

  • rowIndex[V+1] — the start and end offsets into columnIndex for each vertex
  • columnIndex[E] — the destination vertices, sorted by source vertex
  • value[E] — edge weights (optional for unweighted)

Lookup is O(log degree) thanks to sorted neighbors. Total memory for an unweighted graph: V+1 + E integers. That's it. No objects, no pointers, no cache misses from chasing references. CSR is the standard in high-performance graph frameworks like GraphX, SuiteSparse, and every production graph database worth its salt.

Building CSR from an edge list is a 3-step sort-and-compress. The construction cost is O(E log E), but you pay it once. Every traversal after that hits contiguous memory — your L1 cache thanks you.

For graphs with millions of edges, adjacency lists will page-fault you to death. CSR maps to a single malloc call. Your cloud bill will notice the difference.

CSRBuild.javaJAVA
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
// io.thecodeforge — dsa tutorial

import java.util.*;

public class CSRBuild {
    public static void main(String[] args) {
        // Raw edges: (src, dest)
        int[][] edges = {{0,2}, {1,2}, {1,3}, {2,3}, {4,0}};
        int V = 5, E = edges.length;

        // Determine degree of each vertex
        int[] degree = new int[V];
        for (int[] e : edges) degree[e[0]]++;

        // Build rowIndex from prefix sum of degrees
        int[] rowIndex = new int[V + 1];
        for (int i = 0; i < V; i++)
            rowIndex[i + 1] = rowIndex[i] + degree[i];

        // Fill columnIndex
        int[] columnIndex = new int[E];
        int[] cursor = Arrays.copyOf(rowIndex, V); // temp pointers
        for (int[] e : edges) {
            int src = e[0], dest = e[1];
            columnIndex[cursor[src]++] = dest;
        }

        // Query neighbors of vertex 1
        int start = rowIndex[1], end = rowIndex[2];
        System.out.print("Neighbors of 1: ");
        for (int i = start; i < end; i++)
            System.out.print(columnIndex[i] + " ");
        System.out.println();
        
        long memory = (V + 1 + E) * 4L; // ints = 4 bytes
        System.out.println("CSR memory: " + memory + " bytes");
    }
}
Output
Neighbors of 1: 2 3
CSR memory: 44 bytes
Production Trap:
CSR is read-optimized. Inserting edges requires rebuilding the entire structure. If your graph mutates at runtime, stick with adjacency lists. CSR is for static graphs that you query like crazy — social network analysis, road networks, static call graphs.
Key Takeaway
CSR gives you adjacency list speed with array-level memory density. Use it when your graph is static and query-heavy.

Topological Sorting — Dependency Hell, Solved in Linear Time

If your graph represents dependencies — build systems, course prerequisites, deployment pipelines — you need topological ordering. It's a linear sequence of vertices where every edge goes from earlier to later. No cycles allowed. If there's a cycle, your system deadlocks. Full stop.

Topological sort runs in O(V+E) using Kahn's algorithm. You count indegrees, push zero-in-degree nodes into a queue, process them, decrement neighbors. If the queue empties before you've processed all nodes, you've got a cycle. That's your alarm bell.

Production teams use this for scheduling job DAGs. Airflow, Makefile dependencies, Spark stages — all rely on topological ordering. If your graph doesn't have a valid topological order, your dependency graph is broken. Fix the cycle first, then deploy.

TopologicalSort.javaJAVA
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
// io.thecodeforge — dsa tutorial

import java.util.*;

public class TopologicalSort {
    public static List<Integer> sort(int n, int[][] edges) {
        List<Integer>[] g = new ArrayList[n];
        int[] indeg = new int[n];
        for (int i = 0; i < n; i++) g[i] = new ArrayList<>();
        for (int[] e : edges) {
            g[e[0]].add(e[1]);
            indeg[e[1]]++;
        }
        Queue<Integer> q = new LinkedList<>();
        for (int i = 0; i < n; i++) if (indeg[i] == 0) q.add(i);
        List<Integer> order = new ArrayList<>();
        while (!q.isEmpty()) {
            int u = q.poll();
            order.add(u);
            for (int v : g[u]) if (--indeg[v] == 0) q.add(v);
        }
        return order.size() == n ? order : List.of(); // cycle
    }

    public static void main(String[] args) {
        int[][] edges = {{5,0},{4,0},{5,2},{2,3},{3,1},{4,1}};
        System.out.println(sort(6, edges));
    }
}
Output
[4, 5, 0, 2, 3, 1]
Production Trap:
Never trust a topological sort that doesn't return all nodes. An empty result means you ignored a cycle. Always validate the output length against vertex count before scheduling downstream tasks.
Key Takeaway
Topological order exists only in DAGs. If you can't order it, you've got a cycle — fix that before anything else.

Cycles — The Silent Killer of Every Graph-Based System

A cycle is a path of edges where the first and last vertex are the same. Sounds academic. In production, cycles mean deadlocks. Circular dependencies in your microservices. Infinite loops in state machines. Garbage collection root leaks.

Detecting cycles is cheap. For an undirected graph, run DFS and check if you hit a visited neighbor that isn't your parent. For a directed graph, track a recursion stack — gray set in DFS. If you back-edge to a node still in the recursion stack, you've found a cycle. That's O(V+E). No excuses.

Most teams bake cycle detection into their build pipeline. They refuse to deploy if the dependency graph has a cycle. Why? Because if your service depends on A, which depends on B, which depends on A, you can't start either one. Detection is free. Downtime is not.

CycleDetection.javaJAVA
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
// io.thecodeforge — dsa tutorial

import java.util.*;

public class CycleDetection {
    static boolean hasCycle(int n, int[][] edges) {
        List<Integer>[] g = new ArrayList[n];
        for (int i = 0; i < n; i++) g[i] = new ArrayList<>();
        for (int[] e : edges) g[e[0]].add(e[1]);
        int[] state = new int[n]; // 0=unvisited,1=visiting,2=done
        for (int i = 0; i < n; i++) if (dfs(i, g, state)) return true;
        return false;
    }

    static boolean dfs(int u, List<Integer>[] g, int[] state) {
        if (state[u] == 1) return true;
        if (state[u] == 2) return false;
        state[u] = 1;
        for (int v : g[u]) if (dfs(v, g, state)) return true;
        state[u] = 2;
        return false;
    }

    public static void main(String[] args) {
        int[][] cyclic = {{0,1},{1,2},{2,0}};
        int[][] acyclic = {{0,1},{1,2}};
        System.out.println("Cyclic: " + hasCycle(3, cyclic));
        System.out.println("Acyclic: " + hasCycle(3, acyclic));
    }
}
Output
Cyclic: true
Acyclic: false
Senior Shortcut:
Use Union-Find for undirected cycle detection — simpler than DFS, no recursion depth issues. For directed graphs, stick with DFS + state array. Don't confuse the two; they're fundamentally different problems.
Key Takeaway
A cycle in production means a deadlock waiting to happen. Detect it at compile time, not at 3 AM during an outage.
● Production incidentPOST-MORTEMseverity: high

The Slow Graph Traversal Incident

Symptom
BFS traversal that should complete in seconds was taking over 45 minutes. Memory usage was at 99% of heap.
Assumption
The team assumed adjacency matrix would give fast edge lookups for shortest path calculations.
Root cause
The graph had 2 million nodes but only ~10 million edges (average degree 5). Matrix size: 4 trillion cells (2M x 2M). Even though each cell was a byte, that's 4 TB of memory. JVM couldn't allocate it, so they used a sparse matrix library that still caused O(n²) traversal cost.
Fix
Switched to adjacency list. Space dropped to ~80 MB. BFS completed in under 3 seconds.
Key lesson
  • Always estimate E vs V² before choosing representation. If E << V², adjacency list wins.
  • Never assume O(1) edge lookup is worth the memory cost until you compute the actual matrix size.
  • When memory is tight, adjacency list is the default. Only use matrix when density exceeds ~20%.
Production debug guideSymptom → Action for common graph representation problems in production5 entries
Symptom · 01
OutOfMemoryError when creating graph
Fix
Check if you're using adjacency matrix for a large graph. Calculate V² in bytes. If V > 10,000 and E << V², switch to adjacency list immediately.
Symptom · 02
BFS/DFS misses nodes that should be reachable
Fix
Verify undirected edges are added both ways. Print adjacency list for suspect nodes to see missing reverse edges.
Symptom · 03
Edge existence check returns false for existing edge
Fix
For adjacency list, check if you're using contains() on a list – that's O(n). Ensure equals() is correctly implemented if storing custom Edge objects.
Symptom · 04
Algorithms run significantly slower on 'small' graph
Fix
Count the actual number of edges. If matrix is used, BFS scans entire row per node – O(V²). Compare actual neighbors per node.
Symptom · 05
Adding nodes dynamically throws ArrayIndexOutOfBoundsException
Fix
For adjacency matrix, you cannot resize easily. Pre-allocate or use adjacency list for dynamic node additions.
★ Graph Rep Debug Quick SheetFive graph representation problems and the exact commands to diagnose them.
Graph traversal is extremely slow
Immediate action
Stop the algorithm and check representation type.
Commands
System.out.println("Using matrix: " + (matrix != null));
long edgeCount = adjacencyList.values().stream().mapToLong(List::size).sum(); System.out.println("Edge count: " + edgeCount);
Fix now
Switch to adjacency list if edgeCount << V*V/2.
Missing edges in undirected graph+
Immediate action
Print adjacency lists of both endpoints to verify symmetry.
Commands
System.out.println("Neighbors of A: " + adjList.get(A));
System.out.println("Neighbors of B: " + adjList.get(B));
Fix now
Add the reverse edge in addEdge() method – never rely on caller to remember.
Node index out of bounds+
Immediate action
Check if nodes start at 0 or 1.
Commands
int maxNode = graph.keySet().stream().max(Integer::compare).orElse(-1); System.out.println("Max node index: " + maxNode);
System.out.println("Matrix size: " + matrix.length);
Fix now
If nodes start at 1, either allocate matrix of size N+1 or subtract 1 from each index.
Weighted graph returns wrong path cost+
Immediate action
Verify that edge weights are being read correctly from input.
Commands
for (Edge e : getOutgoingEdges(node)) { System.out.println("dest=" + e.destination + " weight=" + e.weight); }
Check if weight is stored as int – if your graph uses floating-point distances, int truncation will silently corrupt costs.
Fix now
Use double for weight fields and parse input with nextDouble().

Key takeaways

1
Adjacency matrix
O(V²) space, O(1) edge check — use only for dense graphs or when V is small.
2
Adjacency list
O(V+E) space, O(degree) neighbor access — default for production and most interviews.
3
Always estimate V and E before choosing; density = E / (V*(V-1)/2). Below 20% density, use list.
4
For traversal-heavy algorithms (BFS, DFS), adjacency list is always faster because you only scan actual neighbors.
5
Weighted graphs need an Edge object; directed graphs add edges in one direction only.
6
The wrong representation turns milliseconds into hours
prototype with list first, optimise later.

Common mistakes to avoid

5 patterns
×

Using adjacency matrix for a sparse graph without checking density

Symptom
OutOfMemoryError or extreme slowness because V² space is huge and BFS scans entire rows.
Fix
Always estimate E vs V² first. If density < 20%, use adjacency list. In Java, switch to a HashMap-based adjacency list.
×

Forgetting to add both directions in an undirected graph

Symptom
BFS/DFS from one node doesn't reach nodes that are connected via the other direction; graph appears disconnected.
Fix
In addEdge(), always add toNode fromNode's list AND fromNode to toNode's list. Never rely on caller to remember.
×

Using int for edge weights when distances are floating point

Symptom
Shortest path calculations return slightly wrong costs due to truncation, especially large graphs.
Fix
Use double for weight fields. Parse input with nextDouble() instead of nextInt().
×

Storing edges as a single map of edge existence instead of using objects for weighted edges

Symptom
Cannot easily access weight when traversing. Code becomes cluttered with parallel maps.
Fix
Create an Edge class with destination and weight fields. Store Edge objects in the adjacency list.
×

Using contains() on an ArrayList for frequent edge existence checks

Symptom
hasEdge() becomes O(degree) and becomes a bottleneck for dense neighbor lists or frequent checks.
Fix
If edge existence is dominant operation, store neighbors in a HashSet instead of ArrayList. Trade memory for speed.
INTERVIEW PREP · PRACTICE MODE

Interview Questions on This Topic

Q01SENIOR
When would you choose an adjacency matrix over an adjacency list?
Q02JUNIOR
How do you implement a weighted adjacency list in Java?
Q03SENIOR
What is the space complexity of an adjacency matrix? What happens if the...
Q04SENIOR
How would you find all neighbours of a node using an adjacency matrix vs...
Q05SENIOR
Can you convert an adjacency list to a matrix efficiently? What's the ti...
Q01 of 05SENIOR

When would you choose an adjacency matrix over an adjacency list?

ANSWER
When the graph is dense (E close to V²), when the number of nodes is small enough to fit in memory (e.g., V < 10,000), and when the dominant operation is checking edge existence (O(1) vs O(degree)). Also for algorithms like Floyd-Warshall that inherently operate on the full matrix. But in real-world production systems, adjacency list is almost always the safer default because most graphs are sparse. The matrix's O(V²) memory cost grows quadratically, and it's easy to underestimate.
FAQ · 5 QUESTIONS

Frequently Asked Questions

01
What is the main difference between adjacency matrix and adjacency list?
02
When should I use an edge list instead of adjacency list?
03
What happens if I use an adjacency matrix for a graph with 100,000 nodes and 150,000 edges?
04
Is it possible to store an adjacency matrix using less memory for undirected graphs?
05
How do I decide which representation to use in an interview?
N
Naren Founder & Principal Engineer

20+ years shipping performance-critical code where algorithms decide the bill. Lessons pulled from things that broke in production.

Follow
Verified
production tested
May 24, 2026
last updated
1,554
articles · all by Naren
🔥

That's Graphs. Mark it forged?

10 min read · try the examples if you haven't

Previous
Morris Traversal of Binary Tree
1 / 17 · Graphs
Next
BFS — Breadth First Search