Senior 5 min · March 05, 2026

Edit Distance Problem: Dynamic Programming Deep Dive with Java

Edit Distance (Levenshtein Distance) explained deeply — DP table construction, space optimization, backtracking, and production gotchas.

N
Naren · Founder
Plain-English first. Then code. Then the interview question.
About
 ● Production Incident 🔎 Debug Guide
Quick Answer
  • Edit Distance (Levenshtein) measures minimum single-character operations (insert, delete, substitute) to transform one string into another.
  • DP recurrence: if characters match, take diagonal value; else 1 + min(delete, insert, substitute).
  • Time O(mn), space O(mn) — reducible to O(min(m,n)) with two-row rolling DP.
  • Base cases: dp[i][0] = i (i deletions), dp[0][j] = j (j insertions).
  • Backtracking through the table reconstructs the actual operation sequence — critical for diff tools.
  • Production trap: charAt() fails for emoji/multi-byte — use codePointAt() with correct indexing.
Plain-English First

Imagine your phone's autocorrect sees you typed 'teh'. It needs to figure out how many single fixes — swap a letter, add one, or delete one — turn 'teh' into 'the'. The smallest number of fixes needed is the edit distance. That's it. Spell-checkers, DNA sequencers, and Git's diff tool all use this same counting trick under the hood.

Every time you mistype a word and your IDE quietly corrects it, or you run git diff and see which lines changed, or a bioinformatics pipeline aligns two DNA strands — the edit distance algorithm is likely doing the heavy lifting. It's one of those rare algorithms that quietly powers a dozen industries simultaneously. Knowing it deeply, not just being able to recite the recurrence relation, is what separates candidates who get offers from those who don't.

The edit distance problem (formally called Levenshtein distance) asks: given two strings, what is the minimum number of single-character operations — insert, delete, or substitute — required to transform one string into the other? The naive recursive solution recomputes the same subproblems exponentially. Dynamic programming collapses that exponential tree into an O(m×n) table where every cell is computed exactly once. The trick is recognising that the optimal answer for transforming any prefix of string A into any prefix of string B depends only on three smaller subproblems — and that insight is the whole game.

By the end of this article you'll be able to: build the full DP table from scratch and explain every cell intuitively, reconstruct the actual sequence of edit operations (not just the count), optimize the solution down to O(min(m,n)) space, handle Unicode and multi-byte strings correctly, and answer every interview follow-up an interviewer can throw at you. Let's build it piece by piece.

What is Edit Distance (Levenshtein Distance)? — Plain English

Edit Distance (also called Levenshtein Distance) measures how many single-character operations — insertions, deletions, or substitutions — are needed to transform one string into another. For example, transforming 'kitten' into 'sitting' requires 3 operations: substitute k->s, substitute e->i, insert g at the end.

Real-world uses: spell checkers compute edit distance to suggest corrections. DNA sequence alignment measures how genetically similar two organisms are. Git merge tools detect which lines were changed, inserted, or deleted.

But here's the thing: the algorithm doesn't care about semantics — it's purely a counting problem. The cost of each operation is assumed to be 1. That assumption breaks in some production systems where substitution cost differs (e.g., keyboard adjacency weighting). We'll get to that later.

Production Insight
Most implementations ignore Unicode normalization, causing false mismatches for accented characters.
Spell-check systems often add keyboard-distance weighting to substitution costs – standard Levenshtein doesn't handle that.
Rule: always benchmark with your actual input domain; default costs rarely match real-world requirements.
Key Takeaway
Edit distance = minimal number of insert, delete, substitute operations.
Standard costs are 1 per operation; custom costs are common in production.
Always normalize strings before comparison to avoid locale-dependent errors.

How Edit Distance (Levenshtein Distance) Works — Step by Step

Define dp[i][j] = minimum edit distance between the first i characters of word1 and the first j characters of word2.

Base cases: dp[0][j] = j (transform empty string to word2[0..j-1] requires j insertions). dp[i][0] = i (transform word1[0..i-1] to empty string requires i deletions).

Recurrence: If word1[i-1] == word2[j-1]: dp[i][j] = dp[i-1][j-1] (characters match — no operation needed). Else: dp[i][j] = 1 + min( dp[i-1][j], # delete word1[i-1] dp[i][j-1], # insert word2[j-1] into word1 dp[i-1][j-1] # substitute word1[i-1] with word2[j-1] )

Time: O(mn). Space: O(mn) for full table, O(min(m,n)) with rolling rows.

The Three Options Mental Model
  • Delete: remove A[i-1] (cost 1) + solve for prefixes (i-1, j).
  • Insert: add B[j-1] into A (cost 1) + solve for (i, j-1).
  • Substitute: replace A[i-1] with B[j-1] (cost 1) + solve for (i-1, j-1).
  • If the last chars are equal, you can skip – no cost, just carry over the diagonal value.
Production Insight
The recurrence assumes all operations cost the same – that's rarely true in practice.
For keyboard auto-correct, substituting 's' and 'd' costs less than 's' and 'k'.
Rule: turn costs into a matrix when domain-specific weighting matters.
Key Takeaway
Recurrence: equal char -> diagonal; else 1 + min(up, left, diagonal).
Base cases are easy to forget – verify them first when debugging.
If you need domain-specific costs, swap the constant '1' with a lookup.
Choose Between Standard and Weighted Edit Distance
IfSpell-checking for keyboard typos
UseUse weighted edit distance with keyboard adjacency costs.
IfDNA sequence alignment
UseUse weighted costs (affine gap penalties) – standard Levenshtein doesn't capture gap penalties.
IfSimple string comparison (e.g., CAS
UseStandard Levenshtein with all costs = 1 is sufficient.

Worked Example — Tracing the Algorithm

Transform 'horse' into 'ros'.

dp table (rows=horse, cols=ros): '' r o s '' [ 0, 1, 2, 3 ] h [ 1, 1, 2, 3 ] o [ 2, 2, 1, 2 ] r [ 3, 2, 2, 2 ] s [ 4, 3, 3, 2 ] e [ 5, 4, 4, 3 ]

Fill example at dp[2][2] (o vs o): word1[1]='o' == word2[1]='o'. dp[2][2] = dp[1][1] = 1. Fill at dp[5][3] (e vs s): 'e' != 's'. min(dp[4][3], dp[5][2], dp[4][2]) = min(2,4,3) = 2. Add 1 = 3.

Answer: dp[5][3] = 3. Operations: delete h, delete r, substitute e->s. (or: substitute h->r, delete o, delete r... multiple valid paths)

Production Insight
Multiple optimal paths exist – the DP table captures the minimum count, not a unique sequence.
When displaying diffs, you must pick one path; simple backtracking chooses the first found.
Rule: if you need a specific diff style (e.g., minimal substitutions), bias backtracking accordingly.
Key Takeaway
Work through a small example to cement the recurrence.
The bottom-right cell holds the answer.
Multiple optimal operation sequences exist – backtracking is an arbitrary choice.

Implementation

The following Java implementation demonstrates Edit Distance (Levenshtein Distance) with clear comments. It uses the full DP table for clarity; the two-row optimization is shown in the next section.

EditDistance.javaJAVA
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
package io.thecodeforge.dp;

public class EditDistance {
    public static int minDistance(String word1, String word2) {
        int m = word1.length();
        int n = word2.length();
        int[][] dp = new int[m + 1][n + 1];

        // Base cases
        for (int i = 0; i <= m; i++) dp[i][0] = i;
        for (int j = 0; j <= n; j++) dp[0][j] = j;

        for (int i = 1; i <= m; i++) {
            for (int j = 1; j <= n; j++) {
                if (word1.charAt(i - 1) == word2.charAt(j - 1)) {
                    dp[i][j] = dp[i - 1][j - 1];
                } else {
                    dp[i][j] = 1 + Math.min(
                        dp[i - 1][j],    // delete
                        Math.min(
                            dp[i][j - 1],    // insert
                            dp[i - 1][j - 1] // substitute
                        )
                    );
                }
            }
        }
        return dp[m][n];
    }

    public static void main(String[] args) {
        System.out.println(minDistance("horse", "ros"));    // 3
        System.out.println(minDistance("kitten", "sitting")); // 3
        System.out.println(minDistance("", "abc"));          // 3
    }
}
Output
3
3
3
Production Insight
Using charAt() fails for Unicode supplementary characters (emoji, some CJK). Replace with codePointAt and adjust iteration.
Full 2D array leads to OOM for large inputs – consider two-row DP immediately.
Rule: for production code, always use codePointAt and prefer rolling arrays.
Key Takeaway
Full DP table code is straightforward – base cases and recurrence.
Java's charAt is safe only for BMP characters.
Start with full table for testing, but ship with rolling rows for production.

Space Optimization: Two-Row DP

The full DP table uses O(m*n) memory, which can be a problem for large strings (e.g., DNA sequences of length 10^5). Notice that each row dp[i] depends only on dp[i-1] (the previous row). We can store just two rows and swap after each iteration. This reduces space to O(min(m,n)) if we always orient the shorter string as the columns.

```java package io.thecodeforge.dp;

public class EditDistanceOptimized { public static int minDistance(String word1, String word2) { if (word1.length() < word2.length()) { String tmp = word1; word1 = word2; word2 = tmp; } int m = word1.length(), n = word2.length(); int[] prev = new int[n + 1]; int[] curr = new int[n + 1]; for (int j = 0; j <= n; j++) prev[j] = j; for (int i = 1; i <= m; i++) { curr[0] = i; for (int j = 1; j <= n; j++) { if (word1.charAt(i - 1) == word2.charAt(j - 1)) { curr[j] = prev[j - 1]; } else { curr[j] = 1 + Math.min(prev[j], Math.min(curr[j - 1], prev[j - 1])); } } int[] temp = prev; prev = curr; curr = temp; } return prev[n]; } } ```

Time remains O(m*n), space drops to O(n). This is the version you'd use when handling 10K+ character strings in production.

Lost Backtracking Capability
Two-row DP only stores the last row – you cannot reconstruct the edit path from the optimized version alone. If you need backtracking, either keep the full table or store traceback pointers in a separate 2D byte array (costs O(m*n) as well).
Production Insight
Two-row DP halves memory per row but still O(n) – for very long strings (10^7) even O(n) is ~40MB for int[].
For 100k x 100k, two-row DP uses 400KB – trivial. Full table would be 40GB.
Rule: always orient the shorter string as columns to minimize the row array size.
Key Takeaway
Two-row DP reduces space from O(m*n) to O(n) with no time penalty.
Orient shorter string as columns for minimum array size.
If you need the edit sequence, you cannot skip storing extra data.
Choose Memory Strategy
IfNeed backtracking and memory permits
UseUse full DP table with traceback pointers.
IfNeed backtracking but memory tight
UseStore traceback as byte array alongside two-row DP (still O(m*n) for traceback).
IfOnly need distance count (no diff)
UseUse two-row DP – best space/time trade-off.

Backtracking to Reconstruct Operations

Knowing the edit distance number is useful, but often you need the actual sequence of operations. For example, when displaying a diff to a user, you want to show which characters were inserted, deleted, or substituted.

To reconstruct operations, you start from dp[m][n] and walk backwards to dp[0][0]. At each step: - If characters match (word1[i-1] == word2[j-1]), move diagonally upward (no operation). - If not, check which adjacent cell gave the minimum: dp[i-1][j-1] + 1 → substitution of word1[i-1] with word2[j-1] dp[i-1][j] + 1 → deletion of word1[i-1] * dp[i][j-1] + 1 → insertion of word2[j-1]

Record operations in reverse order, then reverse them at the end.

This is O(m+n) additional space and time. Production systems that need to show user-friendly diffs (e.g., Git) use a similar backtracking approach but optimise with traceback pointers stored during the forward pass.

Production Insight
When multiple optimal paths exist, backtracking decision order matters – you get a different diff each time.
Git's diff uses a variant that biases towards minimal substitutions for more human-readable output.
Rule: if producing diffs for human consumption, prefer insert over delete over substitute when scores tie.
Key Takeaway
Backtracking reconstructs the operation sequence from the DP table.
Walk from bottom-right to top-left, following minimum-cost moves.
Ties produce different diffs – decide tie-breaking order based on use case.

Production Considerations and Pitfalls

Edit distance seems simple until you put it in production. Here are the real-world issues:

  1. Unicode and multi-byte characters: charAt(i) in Java gives a UTF-16 code unit, not a code point. Comparing code units can break for emoji or characters outside the Basic Multilingual Plane. Use codePointAt and codePointCount.
  2. Operation costs aren't always 1: In some applications (e.g., DNA alignment), substitution might cost 2, or insertion/deletion might have different weights. The algorithm generalises easily by replacing '1' with a cost matrix.
  3. Large strings: O(m*n) time is fine for 1000 chars, but for 100,000 chars it's 10^10 operations — you need to switch to approximate algorithms like Wu-Manber.
  4. Memory fragmentation: Using a full 2D array of ints for long strings can cause OOM. Prefer the two-row optimisation, or use a single row with manual diagonal caching.
  5. Negative costs or cycles: If any operation cost is negative, dynamic programming fails (since the problem is no longer optimal substructure with non-negative weights). Use Bellman-Ford or similar for edit distance with negative costs (rare).
Approximate Algorithms for Very Large Strings
When your strings exceed ~10K characters, consider BK-trees or simhash for approximate matching. Exact DP with O(m*n) is too slow for production at that scale. For DNA alignment, specialized algorithms like Smith-Waterman (local alignment) can be accelerated with SIMD or GPU.
Production Insight
Unicode bugs are silent – they don't throw exceptions, just produce wrong distances.
Negative costs break the DP recurrence – they create cycles that violate optimal substructure.
Rule: for strings > 10K, always have a fallback to an approximate algorithm with a threshold.
Key Takeaway
Production pitfalls: Unicode, variable costs, large inputs, memory, negative weights.
Always use codePointAt for Unicode-aware comparisons.
Profile your input sizes and choose the right algorithm – DP is not always the answer.
● Production incidentPOST-MORTEMseverity: high

OOM When Comparing DNA Sequences of Length 100k

Symptom
Java heap space error thrown when calling editDistance on two long strings (each ~100k characters).
Assumption
The full DP table is acceptable for all inputs — it worked on small test data.
Root cause
O(m*n) memory allocation: for 100k x 100k, the int[][] table allocates ~40GB (10^10 ints × 4 bytes).
Fix
Switch to two-row rolling DP (O(n) space) or use an approximate algorithm (BK-tree, simhash) for very large strings.
Key lesson
  • Always profile with worst-case input sizes before deploying to production.
  • Use rolling DP for any production string comparison that might exceed a few thousand characters.
Production debug guideCommon production issues and how to fix them4 entries
Symptom · 01
Edit distance is 0 for obviously different strings
Fix
Check that input strings are not identical; verify case sensitivity and Unicode normalization (NFD/NFKC) are applied consistently.
Symptom · 02
OutOfMemoryError in production
Fix
Profile input lengths; switch to two-row DP or use approximate algorithm (e.g., BK-tree) when lengths exceed 10,000.
Symptom · 03
Incorrect edit distance for accented characters or emoji
Fix
Replace charAt(i) with codePointAt(i) and adjust iteration with Character.charCount() or codePointCount(). Normalize with java.text.Normalizer.
Symptom · 04
DP table construction is slow for 10k+ strings
Fix
Verify algorithm is O(m*n) not exponential; consider using weighted edit distance with early termination if distance exceeds a threshold.
★ Quick Edit Distance Debug Cheat SheetRapid checks for common edit distance implementation errors.
Distance too large for identical strings?
Immediate action
Check base case initialization: dp[0][j] = j, dp[i][0] = i
Commands
print dp[1][1] when word1[0]==word2[0] -> should be 0
print dp[m][n] for word1='a', word2='a' -> should be 0
Fix now
Ensure dp[i][j] = dp[i-1][j-1] when chars equal, not 1+min(...)
OOM with large input?+
Immediate action
Check input lengths and whether using full 2D array
Commands
System.out.println(m + " x " + n);
long estimatedMemory = (long)(m+1)*(n+1)*4; // bytes
Fix now
Replace 2D array with two rolling rows (prev/curr) to reduce space from O(m*n) to O(n)
Edit Distance Variants Comparison
VariantOperation CostsUse Case
Levenshtein (standard)Insert=1, Delete=1, Substitute=1Spell check, DNA alignment
Weighted edit distanceCustom costs per operationKeyboard layout corrections, bioinformatics (affine gaps)
LCS (Longest Common Subsequence)Insert=1, Delete=1, Substitute=∞ (or not allowed)File diff, plagiarism detection

Key takeaways

1
Edit distance measures the minimum insertions, deletions, and substitutions to transform one string to another.
2
DP recurrence
if characters match, copy diagonal; else 1 + min(delete, insert, substitute).
3
Time O(mn), Space O(mn) reducible to O(min(m,n)) with rolling rows.
4
Base cases
dp[i][0]=i (i deletions), dp[0][j]=j (j insertions).
5
Used in spell checkers, DNA alignment, diff tools, and fuzzy string matching.
6
Always handle Unicode with code points, not charAt.
7
Backtracking reveals the operation sequence
critical for diff displays.
8
Profile input sizes
for >10K characters, consider approximate algorithms or two-row DP.
9
Custom operation costs are common in production
generalize the recurrence with a cost matrix.

Common mistakes to avoid

4 patterns
×

Using charAt() for Unicode strings

Symptom
Emoji or accented characters cause incorrect edit distance values because charAt returns UTF-16 code units, not code points.
Fix
Normalise strings with java.text.Normalizer and use codePointAt / offsetByCodePoints for iteration.
×

Forgetting base cases

Symptom
Returned edit distance is too low or too high when one string is empty.
Fix
Always initialise dp[0][j]=j and dp[i][0]=i.
×

Assuming all operations cost 1

Symptom
Algorithm produces a distance that doesn't match desired behaviour (e.g., substitution should cost 2 in DNA alignment).
Fix
Generalise recurrence to use a cost table: dp[i][j] = min(dp[i-1][j-1]+costSub, dp[i-1][j]+costDel, dp[i][j-1]+costIns).
×

Not handling large strings efficiently

Symptom
OOM error or hours of computation when comparing documents with >10K characters.
Fix
Use two-row DP or switch to an approximate algorithm (e.g., BK-tree, simhash) for very large strings.
INTERVIEW PREP · PRACTICE MODE

Interview Questions on This Topic

Q01JUNIOR
What are the three operations in Levenshtein distance and what DP state ...
Q02JUNIOR
What is the base case for the edit distance DP table?
Q03SENIOR
How is edit distance related to LCS?
Q04SENIOR
How can you reduce space complexity from O(m*n) to O(min(m,n))?
Q05SENIOR
How do you reconstruct the actual sequence of edit operations?
Q06SENIOR
What happens if you have negative operation costs?
Q07SENIOR
How do you handle Unicode supplementary characters in edit distance?
Q08SENIOR
In a production spell-checker, how would you incorporate word frequency ...
Q01 of 08JUNIOR

What are the three operations in Levenshtein distance and what DP state captures each?

ANSWER
Insertion: adding a character to word1 (dp[i][j-1]). Deletion: removing a character from word1 (dp[i-1][j]). Substitution: replacing a character in word1 with one from word2 (dp[i-1][j-1]). The recurrence chooses the minimum of these three plus 1 when characters differ.
FAQ · 6 QUESTIONS

Frequently Asked Questions

01
What is the difference between edit distance and LCS?
02
How do you reconstruct the actual sequence of operations?
03
How can edit distance be computed in O(min(m,n)) space?
04
Is edit distance case-sensitive?
05
Can edit distance handle weighted operations?
06
What is the best way to handle very long strings (100k+ characters)?
🔥

That's Dynamic Programming. Mark it forged?

5 min read · try the examples if you haven't

Previous
Matrix Chain Multiplication
8 / 15 · Dynamic Programming
Next
Rod Cutting Problem