Matrix Chain Multiplication with Dynamic Programming — Deep Dive
Imagine you're wrapping five birthday presents and you can combine them in any order using tape. Some orders waste almost no tape; others waste rolls of it — same presents, wildly different effort. Matrix Chain Multiplication is exactly that: you have a chain of matrices to multiply together, and the ORDER in which you pair them up changes the total number of arithmetic operations dramatically. DP finds the pairing order that does the least work.
Every time a graphics engine renders a scene, a robotics system computes joint transforms, or a machine learning framework chains layer activations, it multiplies a sequence of matrices. The math gives you the same answer no matter what order you parenthesize the chain — but the COST in floating-point operations can differ by orders of magnitude. A chain of just six matrices can have 42 different parenthesizations, and the worst choice can be thousands of times more expensive than the best. At scale, this is the difference between a real-time application and one that lags.
The brute-force approach — try every parenthesization and pick the cheapest — has a runtime that grows faster than exponential (it follows the Catalan number sequence). Matrix Chain Multiplication solved with Dynamic Programming collapses that to O(n³) time and O(n²) space by recognizing a crucial truth: the optimal solution to the full chain is built from optimal solutions to sub-chains. That overlapping-subproblem structure is the heartbeat of DP.
By the end of this article you'll be able to derive the recurrence relation from scratch, implement both a top-down memoized and a bottom-up tabulated solution in Python, read the parenthesization table to reconstruct the actual split points, reason about the space–time tradeoffs, and handle the edge cases that trip people up in interviews and production code.
What is Matrix Chain Multiplication? — Plain English
Matrix Chain Multiplication asks: given a sequence of matrices to multiply, what order of parenthesization minimizes the total number of scalar multiplications?
Matrix multiplication is associative: (AB)C = A(BC). But the number of operations differs! Suppose A is 10x30, B is 30x5, and C is 5x60. (AB)C: AB costs 10 x 30 x 5 = 1500 ops, then multiplying the 10x5 result by C costs 10 x 5 x 60 = 3000. Total: 4500. A(BC): BC costs 30 x 5 x 60 = 9000 ops, then multiplying A by the 30x60 result costs 10 x 30 x 60 = 18000. Total: 27000. The first ordering is 6x cheaper. With many matrices, the difference can be enormous.
This is a classic interval DP problem where we decide, for each pair (i,j), where to split the chain.
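The cost comparison above is easy to check directly. This short sketch (the helper name `cost` is just for illustration) computes both orderings of the A, B, C example:

```python
# Multiplying an (r x c) matrix by a (c x c2) matrix takes r*c*c2 scalar
# multiplications. Dimensions from the example: A: 10x30, B: 30x5, C: 5x60.

def cost(r, c, c2):
    """Scalar multiplications for an (r x c) times (c x c2) product."""
    return r * c * c2

# (AB)C: compute AB first (10x30 times 30x5 -> 10x5), then times C (5x60)
ab_first = cost(10, 30, 5) + cost(10, 5, 60)   # 1500 + 3000 = 4500

# A(BC): compute BC first (30x5 times 5x60 -> 30x60), then A times it
bc_first = cost(30, 5, 60) + cost(10, 30, 60)  # 9000 + 18000 = 27000

print(ab_first, bc_first)  # 4500 27000
```

Same three matrices, same result matrix, a 6x difference in work — purely from the order of pairing.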
How Matrix Chain Multiplication Works — Step by Step
Let matrices M_1, M_2, ..., M_n have dimensions p[0] x p[1], p[1] x p[2], ..., p[n-1] x p[n]. Define dp[i][j] = minimum multiplications to compute M_i M_{i+1} ... M_j.
Base case: dp[i][i] = 0 (single matrix, no multiplication needed).
For chain length l from 2 to n:
  For each starting index i from 1 to n - l + 1:
    j = i + l - 1  (end of the chain)
    dp[i][j] = infinity
    For each split point k from i to j - 1:
      cost = dp[i][k] + dp[k+1][j] + p[i-1] x p[k] x p[j]
      dp[i][j] = min(dp[i][j], cost)
Answer: dp[1][n]. Time: O(n^3). Space: O(n^2).
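The loop structure above is bottom-up, but the same recurrence reads naturally top-down with memoization. A minimal sketch (function names are illustrative, not from the original):

```python
from functools import lru_cache

def matrix_chain_memo(p):
    """Top-down version of the recurrence; p holds the n+1 dimensions."""
    n = len(p) - 1  # number of matrices

    @lru_cache(maxsize=None)
    def solve(i, j):
        # Base case dp[i][i] = 0: a single matrix needs no multiplication
        if i == j:
            return 0
        # Try every split point k, exactly as in the recurrence
        return min(solve(i, k) + solve(k + 1, j) + p[i - 1] * p[k] * p[j]
                   for k in range(i, j))

    return solve(1, n)

print(matrix_chain_memo([10, 30, 5, 60, 15]))  # 6750
```

Memoization caches each (i, j) pair, so despite the recursion the work is the same O(n^3) as the tabulated version.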
Worked Example — Tracing the Algorithm
4 matrices: A(10x30), B(30x5), C(5x60), D(60x15). Dimensions array p = [10, 30, 5, 60, 15]. Matrices indexed 1-4.
Length 2 chains:
- dp[1][2]: split at k=1. cost = dp[1][1] + dp[2][2] + p[0] x p[1] x p[2] = 0 + 0 + 10 x 30 x 5 = 1500. dp[1][2] = 1500.
- dp[2][3]: cost = 0 + 0 + 30 x 5 x 60 = 9000. dp[2][3] = 9000.
- dp[3][4]: cost = 0 + 0 + 5 x 60 x 15 = 4500. dp[3][4] = 4500.
Length 3 chains:
- dp[1][3]: k=1: dp[1][1] + dp[2][3] + p[0] x p[1] x p[3] = 0 + 9000 + 18000 = 27000. k=2: dp[1][2] + dp[3][3] + p[0] x p[2] x p[3] = 1500 + 0 + 3000 = 4500. dp[1][3] = min(27000, 4500) = 4500.
- dp[2][4]: k=2: dp[2][2] + dp[3][4] + p[1] x p[2] x p[4] = 0 + 4500 + 2250 = 6750. k=3: dp[2][3] + dp[4][4] + p[1] x p[3] x p[4] = 9000 + 0 + 27000 = 36000. dp[2][4] = min(6750, 36000) = 6750.
Length 4 chain:
- dp[1][4]: k=1: dp[1][1] + dp[2][4] + p[0] x p[1] x p[4] = 0 + 6750 + 4500 = 11250. k=2: dp[1][2] + dp[3][4] + p[0] x p[2] x p[4] = 1500 + 4500 + 750 = 6750. k=3: dp[1][3] + dp[4][4] + p[0] x p[3] x p[4] = 4500 + 0 + 9000 = 13500. dp[1][4] = min(11250, 6750, 13500) = 6750.
Minimum multiplications: 6750. Optimal split: ((AB)(CD)) — split at k=2.
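As a sanity check on the trace, an exhaustive recursion over every parenthesization (no memoization, so exponential — fine for 4 matrices) should agree with the table's answer:

```python
def brute_force(p, i, j):
    """Try every parenthesization of M_i..M_j and return the cheapest cost."""
    if i == j:
        return 0  # a single matrix costs nothing
    # Split at every k and recurse on both halves
    return min(brute_force(p, i, k) + brute_force(p, k + 1, j)
               + p[i - 1] * p[k] * p[j]
               for k in range(i, j))

p = [10, 30, 5, 60, 15]  # A: 10x30, B: 30x5, C: 5x60, D: 60x15
print(brute_force(p, 1, 4))  # 6750 — matches the DP table
```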
Implementation
The following implementation demonstrates Matrix Chain Multiplication with clear comments.
```python
def matrix_chain(p):
    """p[i-1] x p[i] are the dimensions of the i-th matrix (1-indexed)."""
    n = len(p) - 1  # number of matrices
    dp = [[0] * (n + 1) for _ in range(n + 1)]
    # l = chain length
    for l in range(2, n + 1):
        for i in range(1, n - l + 2):
            j = i + l - 1
            dp[i][j] = float('inf')
            for k in range(i, j):
                cost = dp[i][k] + dp[k + 1][j] + p[i - 1] * p[k] * p[j]
                if cost < dp[i][j]:
                    dp[i][j] = cost
    return dp[1][n]

p = [10, 30, 5, 60, 15]  # A: 10x30, B: 30x5, C: 5x60, D: 60x15
print('Min multiplications:', matrix_chain(p))  # 6750
```
| Aspect | Bottom-Up Tabulation | Top-Down Memoization |
|---|---|---|
| Time Complexity | O(n³) — always fills all cells | O(n³) worst case, often less |
| Space Complexity | O(n²) for the DP table | O(n²) table + O(n) call stack |
| Sub-problems computed | All n² cells, even unused ones | Only reachable sub-problems |
| Stack overflow risk | None — purely iterative | Yes, for very large n (n > ~5000) |
| Cache friendliness | High — sequential row access | Lower — unpredictable access order |
| Code clarity | Trickier loop ordering to get right | Closer to the recursive definition |
| Best for | Interviews, dense chains, all n | Sparse chains, structured inputs |
| Reconstruction | Read split[][] after loop | Read split[][] after solve() returns |
🎯 Key Takeaways
- Matrix chain multiplication finds the optimal parenthesization to minimize scalar operations.
- It's an interval DP problem: dp[i][j] = min cost to multiply matrices M_i through M_j.
- For each interval [i,j], try all split points k and take the minimum.
- Time: O(n^3). Space: O(n^2). Reconstruction requires a separate split table.
- Matrix multiplication is associative but not commutative — you can reorder parentheses but not matrix order.
Interview Questions on This Topic
- Q: What is the recurrence for matrix chain multiplication DP?
- Q: Why does the order of parenthesization matter for matrix multiplication?
- Q: What is the time complexity of the matrix chain DP algorithm?
Frequently Asked Questions
Why can't we just multiply matrices left to right greedily?
Left-to-right multiplication is one valid ordering but not necessarily optimal. The cost depends on the dimensions of the sub-result matrices at each step. Greedy choices that minimize immediate cost can produce large intermediate matrices that make subsequent multiplications expensive. Only exhaustive search with memoization (DP) guarantees optimality.
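On the article's running example this gap is concrete. A sketch of the always-left-to-right strategy (helper name is illustrative):

```python
def left_to_right_cost(p):
    """Cost of always multiplying the running product by the next matrix."""
    n = len(p) - 1
    total = 0
    for m in range(1, n):
        # The running product M_1..M_m has shape p[0] x p[m];
        # multiplying it by M_{m+1} (p[m] x p[m+1]) costs p[0]*p[m]*p[m+1].
        total += p[0] * p[m] * p[m + 1]
    return total

p = [10, 30, 5, 60, 15]  # A: 10x30, B: 30x5, C: 5x60, D: 60x15
print(left_to_right_cost(p))  # 13500 — twice the DP optimum of 6750
```

Here left-to-right is 2x worse than optimal; with more skewed dimensions the gap grows far larger.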
How do you reconstruct the optimal parenthesization?
Track a 'split' table alongside the dp table. When dp[i][j] is updated with split point k, record split[i][j] = k. To reconstruct: if split[i][j] = k, the optimal split is (M_i...M_k) * (M_{k+1}...M_j). Recursively apply to both halves to get the full parenthesization.
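A sketch of that reconstruction in Python (function names are illustrative; this extends the bottom-up implementation with the split table described above):

```python
def matrix_chain_with_splits(p):
    """Bottom-up DP that also records the best split point per interval."""
    n = len(p) - 1
    dp = [[0] * (n + 1) for _ in range(n + 1)]
    split = [[0] * (n + 1) for _ in range(n + 1)]
    for l in range(2, n + 1):
        for i in range(1, n - l + 2):
            j = i + l - 1
            dp[i][j] = float('inf')
            for k in range(i, j):
                cost = dp[i][k] + dp[k + 1][j] + p[i - 1] * p[k] * p[j]
                if cost < dp[i][j]:
                    dp[i][j] = cost
                    split[i][j] = k  # remember where the best split was
    return dp[1][n], split

def parenthesize(split, i, j):
    """Turn the split table into a string like ((AB)(CD))."""
    if i == j:
        return chr(ord('A') + i - 1)  # name matrices A, B, C, ...
    k = split[i][j]
    return '(' + parenthesize(split, i, k) + parenthesize(split, k + 1, j) + ')'

best, split = matrix_chain_with_splits([10, 30, 5, 60, 15])
print(best, parenthesize(split, 1, 4))  # 6750 ((AB)(CD))
```

The split table costs no extra asymptotic space — it is another O(n^2) table filled during the same loops.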
What is the time complexity and can it be improved?
O(n^3) time and O(n^2) space. This is standard for the classic DP formulation. With the Hu–Shing algorithm, matrix chain multiplication can be solved in O(n log n), but it is significantly more complex to implement and rarely needed in practice or interviews.
Developer and founder of TheCodeForge. I built this site because I was tired of tutorials that explain what to type without explaining why it works. Every article here is written to make concepts actually click.