Senior 7 min · March 05, 2026

Edit Distance Problem: Dynamic Programming Deep Dive with Java

Q: What is the difference between edit distance and LCS?

LCS counts matching characters and is used to find similarity. Edit distance counts the operations needed for transformation and measures difference. They are related: if only insertions and deletions are allowed (no substitutions), edit distance = m + n - 2*LCS(word1, word2).

Q: How do you reconstruct the actual sequence of operations?

Backtrack through the dp table from dp[m][n] to dp[0][0]. At each cell: if characters matched, move diagonally (no operation). If you came from dp[i-1][j-1], it was a substitution. From dp[i-1][j] it was a deletion. From dp[i][j-1] it was an insertion. Record each operation in reverse, then reverse the list.

Q: How can edit distance be computed in O(min(m,n)) space?

Each row of the dp table only depends on the previous row and the current row. Maintain two 1D arrays (previous row and current row) and swap them after each row. This reduces space from O(m*n) to O(n). Alternatively, orient word2 as the shorter string to minimize array size.

Q: Is edit distance case-sensitive?

By default, yes — 'A' and 'a' are different characters. In many applications (e.g., spell checking), you normalize case or use case-insensitive comparison by converting both strings to lowercase before running the algorithm.

Q: Can edit distance handle weighted operations?

Yes. The recurrence generalises to: dp[i][j] = min(dp[i-1][j]+costDel, dp[i][j-1]+costIns, dp[i-1][j-1]+costSub). The weights can be any non-negative numbers, including different costs for different character pairs.

Q: What is the best way to handle very long strings (100k+ characters)?

Use two-row DP to save memory, but even O(m*n) time might be too slow. Consider approximate algorithms: BK-trees, simhash, or MinHash. For DNA sequences, specialized tools like BLAST use heuristics.

Edit Distance (Levenshtein Distance) explained deeply — DP table construction, space optimization, backtracking, and production gotchas.

Naren Founder & Principal Engineer

20+ years shipping performance-critical code where algorithms decide the bill. Lessons pulled from things that broke in production.

✓ Production

production tested

May 24, 2026

last updated

1,554

articles · all by Naren

● Production Incident 🔎 Debug Guide ⚙ Triage Commands

⚡Quick Answer

Edit Distance (Levenshtein) measures minimum single-character operations (insert, delete, substitute) to transform one string into another.
DP recurrence: if characters match, take diagonal value; else 1 + min(delete, insert, substitute).
Time O(mn), space O(mn) — reducible to O(min(m,n)) with two-row rolling DP.
Base cases: dp[i][0] = i (i deletions), dp[0][j] = j (j insertions).
Backtracking through the table reconstructs the actual operation sequence — critical for diff tools.
Production trap: charAt() fails for emoji/multi-byte — use codePointAt() with correct indexing.

✦ Definition~90s read

What is Edit Distance Problem?

Edit Distance (also called Levenshtein Distance) measures how many single-character operations — insertions, deletions, or substitutions — are needed to transform one string into another. For example, transforming 'kitten' into 'sitting' requires 3 operations: substitute k->s, substitute e->i, insert g at the end.

★

Imagine your phone's autocorrect sees you typed 'teh'.

Real-world uses: spell checkers compute edit distance to suggest corrections. DNA sequence alignment measures how genetically similar two organisms are. Git merge tools detect which lines were changed, inserted, or deleted.

But here's the thing: the algorithm doesn't care about semantics — it's purely a counting problem. The cost of each operation is assumed to be 1. That assumption breaks in some production systems where substitution cost differs (e.g., keyboard adjacency weighting). We'll get to that later.

Plain-English First

Imagine your phone's autocorrect sees you typed 'teh'. It needs to figure out how many single fixes — swap a letter, add one, or delete one — turn 'teh' into 'the'. The smallest number of fixes needed is the edit distance. That's it. Spell-checkers, DNA sequencers, and Git's diff tool all use this same counting trick under the hood.

Every time you mistype a word and your IDE quietly corrects it, or you run git diff and see which lines changed, or a bioinformatics pipeline aligns two DNA strands — the edit distance algorithm is likely doing the heavy lifting. It's one of those rare algorithms that quietly powers a dozen industries simultaneously. Knowing it deeply, not just being able to recite the recurrence relation, is what separates candidates who get offers from those who don't.

The edit distance problem (formally called Levenshtein distance) asks: given two strings, what is the minimum number of single-character operations — insert, delete, or substitute — required to transform one string into the other? The naive recursive solution recomputes the same subproblems exponentially. Dynamic programming collapses that exponential tree into an O(m×n) table where every cell is computed exactly once. The trick is recognising that the optimal answer for transforming any prefix of string A into any prefix of string B depends only on three smaller subproblems — and that insight is the whole game.

By the end of this article you'll be able to: build the full DP table from scratch and explain every cell intuitively, reconstruct the actual sequence of edit operations (not just the count), optimize the solution down to O(min(m,n)) space, handle Unicode and multi-byte strings correctly, and answer every interview follow-up an interviewer can throw at you. Let's build it piece by piece.

What is Edit Distance (Levenshtein Distance)? — Plain English

Production Insight

Most implementations ignore Unicode normalization, causing false mismatches for accented characters.

Spell-check systems often add keyboard-distance weighting to substitution costs – standard Levenshtein doesn't handle that.

Rule: always benchmark with your actual input domain; default costs rarely match real-world requirements.

Key Takeaway

Edit distance = minimal number of insert, delete, substitute operations.

Standard costs are 1 per operation; custom costs are common in production.

Always normalize strings before comparison to avoid locale-dependent errors.

thecodeforge.io

Edit Distance DP Deep Dive

Edit Distance Problem

How Edit Distance (Levenshtein Distance) Works — Step by Step

Define dp[i][j] = minimum edit distance between the first i characters of word1 and the first j characters of word2.

Base cases: dp[0][j] = j (transform empty string to word2[0..j-1] requires j insertions). dp[i][0] = i (transform word1[0..i-1] to empty string requires i deletions).

Recurrence: If word1[i-1] == word2[j-1]: dp[i][j] = dp[i-1][j-1] (characters match — no operation needed). Else: dp[i][j] = 1 + min( dp[i-1][j], # delete word1[i-1] dp[i][j-1], # insert word2[j-1] into word1 dp[i-1][j-1] # substitute word1[i-1] with word2[j-1] )

Time: O(mn). Space: O(mn) for full table, O(min(m,n)) with rolling rows.

The Three Options Mental Model

Delete: remove A[i-1] (cost 1) + solve for prefixes (i-1, j).
Insert: add B[j-1] into A (cost 1) + solve for (i, j-1).
Substitute: replace A[i-1] with B[j-1] (cost 1) + solve for (i-1, j-1).
If the last chars are equal, you can skip – no cost, just carry over the diagonal value.

Production Insight

The recurrence assumes all operations cost the same – that's rarely true in practice.

For keyboard auto-correct, substituting 's' and 'd' costs less than 's' and 'k'.

Rule: turn costs into a matrix when domain-specific weighting matters.

Key Takeaway

Recurrence: equal char -> diagonal; else 1 + min(up, left, diagonal).

Base cases are easy to forget – verify them first when debugging.

If you need domain-specific costs, swap the constant '1' with a lookup.

Choose Between Standard and Weighted Edit Distance

IfSpell-checking for keyboard typos

→

UseUse weighted edit distance with keyboard adjacency costs.

IfDNA sequence alignment

→

UseUse weighted costs (affine gap penalties) – standard Levenshtein doesn't capture gap penalties.

IfSimple string comparison (e.g., CAS

→

UseStandard Levenshtein with all costs = 1 is sufficient.

Worked Example — Tracing the Algorithm

Transform 'horse' into 'ros'.

dp table (rows=horse, cols=ros): '' r o s '' [ 0, 1, 2, 3 ] h [ 1, 1, 2, 3 ] o [ 2, 2, 1, 2 ] r [ 3, 2, 2, 2 ] s [ 4, 3, 3, 2 ] e [ 5, 4, 4, 3 ]

Fill example at dp[2][2] (o vs o): word1[1]='o' == word2[1]='o'. dp[2][2] = dp[1][1] = 1. Fill at dp[5][3] (e vs s): 'e' != 's'. min(dp[4][3], dp[5][2], dp[4][2]) = min(2,4,3) = 2. Add 1 = 3.

Answer: dp[5][3] = 3. Operations: delete h, delete r, substitute e->s. (or: substitute h->r, delete o, delete r... multiple valid paths)

Production Insight

Multiple optimal paths exist – the DP table captures the minimum count, not a unique sequence.

When displaying diffs, you must pick one path; simple backtracking chooses the first found.

Rule: if you need a specific diff style (e.g., minimal substitutions), bias backtracking accordingly.

Key Takeaway

Work through a small example to cement the recurrence.

The bottom-right cell holds the answer.

Multiple optimal operation sequences exist – backtracking is an arbitrary choice.

Implementation

The following Java implementation demonstrates Edit Distance (Levenshtein Distance) with clear comments. It uses the full DP table for clarity; the two-row optimization is shown in the next section.

EditDistance.javaJAVA

package io.thecodeforge.dp;

public class EditDistance {
    public static int minDistance(String word1, String word2) {
        int m = word1.length();
        int n = word2.length();
        int[][] dp = new int[m + 1][n + 1];

        // Base cases
        for (int i = 0; i <= m; i++) dp[i][0] = i;
        for (int j = 0; j <= n; j++) dp[0][j] = j;

        for (int i = 1; i <= m; i++) {
            for (int j = 1; j <= n; j++) {
                if (word1.charAt(i - 1) == word2.charAt(j - 1)) {
                    dp[i][j] = dp[i - 1][j - 1];
                } else {
                    dp[i][j] = 1 + Math.min(
                        dp[i - 1][j],    // delete
                        Math.min(
                            dp[i][j - 1],    // insert
                            dp[i - 1][j - 1] // substitute
                        )
                    );
                }
            }
        }
        return dp[m][n];
    }

    public static void main(String[] args) {
        System.out.println(minDistance("horse", "ros"));    // 3
        System.out.println(minDistance("kitten", "sitting")); // 3
        System.out.println(minDistance("", "abc"));          // 3
    }
}

Output

Production Insight

Using charAt() fails for Unicode supplementary characters (emoji, some CJK). Replace with codePointAt and adjust iteration.

Full 2D array leads to OOM for large inputs – consider two-row DP immediately.

Rule: for production code, always use codePointAt and prefer rolling arrays.

Key Takeaway

Full DP table code is straightforward – base cases and recurrence.

Java's charAt is safe only for BMP characters.

Start with full table for testing, but ship with rolling rows for production.

Space Optimization: Two-Row DP

The full DP table uses O(m*n) memory, which can be a problem for large strings (e.g., DNA sequences of length 10^5). Notice that each row dp[i] depends only on dp[i-1] (the previous row). We can store just two rows and swap after each iteration. This reduces space to O(min(m,n)) if we always orient the shorter string as the columns.

Implementation in Java:

```java package io.thecodeforge.dp;

public class EditDistanceOptimized { public static int minDistance(String word1, String word2) { if (word1.length() < word2.length()) { String tmp = word1; word1 = word2; word2 = tmp; } int m = word1.length(), n = word2.length(); int[] prev = new int[n + 1]; int[] curr = new int[n + 1]; for (int j = 0; j <= n; j++) prev[j] = j; for (int i = 1; i <= m; i++) { curr[0] = i; for (int j = 1; j <= n; j++) { if (word1.charAt(i - 1) == word2.charAt(j - 1)) { curr[j] = prev[j - 1]; } else { curr[j] = 1 + Math.min(prev[j], Math.min(curr[j - 1], prev[j - 1])); } } int[] temp = prev; prev = curr; curr = temp; } return prev[n]; } } ```

Time remains O(m*n), space drops to O(n). This is the version you'd use when handling 10K+ character strings in production.

Lost Backtracking Capability

Two-row DP only stores the last row – you cannot reconstruct the edit path from the optimized version alone. If you need backtracking, either keep the full table or store traceback pointers in a separate 2D byte array (costs O(m*n) as well).

Production Insight

Two-row DP halves memory per row but still O(n) – for very long strings (10^7) even O(n) is ~40MB for int[].

For 100k x 100k, two-row DP uses 400KB – trivial. Full table would be 40GB.

Rule: always orient the shorter string as columns to minimize the row array size.

Key Takeaway

Two-row DP reduces space from O(m*n) to O(n) with no time penalty.

Orient shorter string as columns for minimum array size.

If you need the edit sequence, you cannot skip storing extra data.

Choose Memory Strategy

IfNeed backtracking and memory permits

→

UseUse full DP table with traceback pointers.

IfNeed backtracking but memory tight

→

UseStore traceback as byte array alongside two-row DP (still O(m*n) for traceback).

IfOnly need distance count (no diff)

→

UseUse two-row DP – best space/time trade-off.

Backtracking to Reconstruct Operations

Knowing the edit distance number is useful, but often you need the actual sequence of operations. For example, when displaying a diff to a user, you want to show which characters were inserted, deleted, or substituted.

To reconstruct operations, you start from dp[m][n] and walk backwards to dp[0][0]. At each step: - If characters match (word1[i-1] == word2[j-1]), move diagonally upward (no operation). - If not, check which adjacent cell gave the minimum: dp[i-1][j-1] + 1 → substitution of word1[i-1] with word2[j-1] dp[i-1][j] + 1 → deletion of word1[i-1] * dp[i][j-1] + 1 → insertion of word2[j-1]

Record operations in reverse order, then reverse them at the end.

This is O(m+n) additional space and time. Production systems that need to show user-friendly diffs (e.g., Git) use a similar backtracking approach but optimise with traceback pointers stored during the forward pass.

Production Insight

When multiple optimal paths exist, backtracking decision order matters – you get a different diff each time.

Git's diff uses a variant that biases towards minimal substitutions for more human-readable output.

Rule: if producing diffs for human consumption, prefer insert over delete over substitute when scores tie.

Key Takeaway

Backtracking reconstructs the operation sequence from the DP table.

Walk from bottom-right to top-left, following minimum-cost moves.

Ties produce different diffs – decide tie-breaking order based on use case.

Production Considerations and Pitfalls

Edit distance seems simple until you put it in production. Here are the real-world issues:

Unicode and multi-byte characters: charAt(i) in Java gives a UTF-16 code unit, not a code point. Comparing code units can break for emoji or characters outside the Basic Multilingual Plane. Use codePointAt and codePointCount.
Operation costs aren't always 1: In some applications (e.g., DNA alignment), substitution might cost 2, or insertion/deletion might have different weights. The algorithm generalises easily by replacing '1' with a cost matrix.
Large strings: O(m*n) time is fine for 1000 chars, but for 100,000 chars it's 10^10 operations — you need to switch to approximate algorithms like Wu-Manber.
Memory fragmentation: Using a full 2D array of ints for long strings can cause OOM. Prefer the two-row optimisation, or use a single row with manual diagonal caching.
Negative costs or cycles: If any operation cost is negative, dynamic programming fails (since the problem is no longer optimal substructure with non-negative weights). Use Bellman-Ford or similar for edit distance with negative costs (rare).

Approximate Algorithms for Very Large Strings

When your strings exceed ~10K characters, consider BK-trees or simhash for approximate matching. Exact DP with O(m*n) is too slow for production at that scale. For DNA alignment, specialized algorithms like Smith-Waterman (local alignment) can be accelerated with SIMD or GPU.

Production Insight

Unicode bugs are silent – they don't throw exceptions, just produce wrong distances.

Negative costs break the DP recurrence – they create cycles that violate optimal substructure.

Rule: for strings > 10K, always have a fallback to an approximate algorithm with a threshold.

Key Takeaway

Production pitfalls: Unicode, variable costs, large inputs, memory, negative weights.

Always use codePointAt for Unicode-aware comparisons.

Profile your input sizes and choose the right algorithm – DP is not always the answer.

Problem Statement — Stop Guessing, Start Reading

You're given two strings: source and target. You want to turn source into target using three operations — insert a character, delete a character, or substitute one character for another. Each operation costs 1 unit. The edit distance is the minimum number of operations required.

That's it. No magic. No hidden rules. You can't reorder characters, you can't copy-paste a chunk. You're working character by character, paying the toll each time.

Why does this matter? Because every autocomplete, spell-check, DNA sequence aligner, and plagiarism detector uses some form of this. Levenshtein distance is the foundation. If you don't understand the problem statement cold, you'll waste hours on implementation bugs that stem from misinterpreting what "edit" actually means.

Here's the gotcha: substitutions cost 1. Some variants (like Damerau–Levenshtein) also allow transpositions. But for the classic version, three operations, each a flat fee. Memorize that.

EditDistanceDefinition.javaJAVA

// io.thecodeforge — dsa tutorial

public class EditDistanceDefinition {
    // Returns edit distance between source and target
    public static int editDistance(String source, String target) {
        int m = source.length();
        int n = target.length();
        int[][] dp = new int[m + 1][n + 1];
        
        // Base cases: empty string to another
        for (int i = 0; i <= m; i++) dp[i][0] = i;
        for (int j = 0; j <= n; j++) dp[0][j] = j;
        
        for (int i = 1; i <= m; i++) {
            for (int j = 1; j <= n; j++) {
                if (source.charAt(i - 1) == target.charAt(j - 1)) {
                    dp[i][j] = dp[i - 1][j - 1]; // no cost
                } else {
                    dp[i][j] = 1 + Math.min(
                        dp[i - 1][j],    // delete
                        Math.min(
                            dp[i][j - 1],    // insert
                            dp[i - 1][j - 1] // substitute
                        )
                    );
                }
            }
        }
        return dp[m][n];
    }

    public static void main(String[] args) {
        System.out.println(editDistance("kitten", "sitting")); // Output: 3
    }
}

Output

Production Trap:

Most bugs come from confusing indices. dp[i][j] corresponds to the first i chars of source and first j chars of target. Off-by-one errors are the #1 cause of silent failures in production spell-check pipelines.

Key Takeaway

Edit distance is the minimum cost to transform one string into another using insert, delete, and substitute operations — each costing 1 unit.

Brute-Force Recursion — Why You Need It Before DP

Before you write a single DP table, you need to understand the raw recursion. DP is just caching. The core logic is the same.

At each position in both strings, you have two possibilities: the characters match (proceed) or they don't (three operations). Recursively try all three, take the minimum. It's O(3^(m+n)) — and yes, that's catastrophically slow for anything beyond a toy example. But it's the truth.

Why bother? Because every optimization — memoization, bottom-up DP, space-optimized DP — is just a variant of this. If you can't write the recursive version in your sleep, you don't understand the problem. Full stop.

Here's the skeleton: base cases are when one string is empty (return the length of the other). Otherwise, compare chars. Match? Skip. No match? Min of (delete, insert, substitute) plus 1. That's it. The DP version is the same logic, just bottom-up.

Don't skip this step. I've seen senior devs jump straight to DP, get the wrong answer, then spend hours debugging because they never validated the recursion logic first.

BruteForceRecursion.javaJAVA

// io.thecodeforge — dsa tutorial

public class BruteForceRecursion {
    public static int editDistance(String s, String t, int i, int j) {
        // Base cases
        if (i == s.length()) return t.length() - j; // insert remaining
        if (j == t.length()) return s.length() - i; // delete remaining
        
        // Match — no cost, move both pointers
        if (s.charAt(i) == t.charAt(j)) {
            return editDistance(s, t, i + 1, j + 1);
        }
        
        // Mismatch — try all three operations
        int delete = editDistance(s, t, i + 1, j);
        int insert = editDistance(s, t, i, j + 1);
        int substitute = editDistance(s, t, i + 1, j + 1);
        
        return 1 + Math.min(delete, Math.min(insert, substitute));
    }
    
    public static void main(String[] args) {
        System.out.println(editDistance("cat", "car", 0, 0)); // Output: 1
    }
}

Output

Senior Shortcut:

Always start with recursion + memoization before bottom-up DP. It's easier to reason about correctness. Then refactor to bottom-up when you're sure. Premature optimization is the root of all evil — and all silly bugs.

Key Takeaway

Brute-force recursion is O(3^n). Don't use it in production, but master it to understand the DP derivation.

● Production incidentPOST-MORTEMseverity: high

OOM When Comparing DNA Sequences of Length 100k

Symptom

Java heap space error thrown when calling editDistance on two long strings (each ~100k characters).

Assumption

The full DP table is acceptable for all inputs — it worked on small test data.

Root cause

O(m*n) memory allocation: for 100k x 100k, the int[][] table allocates ~40GB (10^10 ints × 4 bytes).

Fix

Switch to two-row rolling DP (O(n) space) or use an approximate algorithm (BK-tree, simhash) for very large strings.

Key lesson

Always profile with worst-case input sizes before deploying to production.
Use rolling DP for any production string comparison that might exceed a few thousand characters.

Production debug guideCommon production issues and how to fix them4 entries

Symptom · 01

Edit distance is 0 for obviously different strings

→

Fix

Check that input strings are not identical; verify case sensitivity and Unicode normalization (NFD/NFKC) are applied consistently.

Symptom · 02

OutOfMemoryError in production

→

Fix

Profile input lengths; switch to two-row DP or use approximate algorithm (e.g., BK-tree) when lengths exceed 10,000.

Symptom · 03

Incorrect edit distance for accented characters or emoji

→

Fix

Replace charAt(i) with codePointAt(i) and adjust iteration with Character.charCount() or codePointCount(). Normalize with java.text.Normalizer.

Symptom · 04

DP table construction is slow for 10k+ strings

→

Fix

Verify algorithm is O(m*n) not exponential; consider using weighted edit distance with early termination if distance exceeds a threshold.

★ Quick Edit Distance Debug Cheat SheetRapid checks for common edit distance implementation errors.

Distance too large for identical strings?−

Immediate action

Check base case initialization: dp[0][j] = j, dp[i][0] = i

Commands

print dp[1][1] when word1[0]==word2[0] -> should be 0

print dp[m][n] for word1='a', word2='a' -> should be 0

Fix now

Ensure dp[i][j] = dp[i-1][j-1] when chars equal, not 1+min(...)

OOM with large input?+

Edit Distance Variants Comparison

Variant	Operation Costs	Use Case
Levenshtein (standard)	Insert=1, Delete=1, Substitute=1	Spell check, DNA alignment
Weighted edit distance	Custom costs per operation	Keyboard layout corrections, bioinformatics (affine gaps)
LCS (Longest Common Subsequence)	Insert=1, Delete=1, Substitute=∞ (or not allowed)	File diff, plagiarism detection

Key takeaways

Edit distance measures the minimum insertions, deletions, and substitutions to transform one string to another.

DP recurrence

if characters match, copy diagonal; else 1 + min(delete, insert, substitute).

Time O(mn), Space O(mn) reducible to O(min(m,n)) with rolling rows.

Base cases

dp[i][0]=i (i deletions), dp[0][j]=j (j insertions).

Used in spell checkers, DNA alignment, diff tools, and fuzzy string matching.

Always handle Unicode with code points, not charAt.

Backtracking reveals the operation sequence

critical for diff displays.

Profile input sizes

for >10K characters, consider approximate algorithms or two-row DP.

Custom operation costs are common in production

generalize the recurrence with a cost matrix.

Common mistakes to avoid

4 patterns

Using charAt() for Unicode strings

Symptom

Emoji or accented characters cause incorrect edit distance values because charAt returns UTF-16 code units, not code points.

Fix

Normalise strings with java.text.Normalizer and use codePointAt / offsetByCodePoints for iteration.

Forgetting base cases

Symptom

Returned edit distance is too low or too high when one string is empty.

Fix

Always initialise dp[0][j]=j and dp[i][0]=i.

Assuming all operations cost 1

Symptom

Algorithm produces a distance that doesn't match desired behaviour (e.g., substitution should cost 2 in DNA alignment).

Fix

Generalise recurrence to use a cost table: dp[i][j] = min(dp[i-1][j-1]+costSub, dp[i-1][j]+costDel, dp[i][j-1]+costIns).

Not handling large strings efficiently

Symptom

OOM error or hours of computation when comparing documents with >10K characters.

Fix

Use two-row DP or switch to an approximate algorithm (e.g., BK-tree, simhash) for very large strings.

LEETCODE PRACTICE · 5 PROBLEMS

Practice These on LeetCode

Open LeetCode

#392Easy

Is Subsequence

Practices the core concept of comparing two strings by checking if one can be derived from the other via deletions, a foundational operation in edit distance.

Solve on LeetCode

#583Medium

Delete Operation for Two Strings

Directly applies the edit distance DP formulation to find the minimum number of deletions needed to make two strings equal.

Solve on LeetCode

#72Medium

Edit Distance

The classic problem that defines the edit distance metric using dynamic programming with insert, delete, and replace operations.

Solve on LeetCode

#10Hard

Regular Expression Matching

Extends edit distance concepts to handle pattern matching with wildcards and repetition using DP over string indices.

Solve on LeetCode

#44Hard

Wildcard Matching

Applies a DP approach similar to edit distance to match strings with wildcard characters that can represent sequences of characters.

Solve on LeetCode

INTERVIEW PREP · PRACTICE MODE

Interview Questions on This Topic

Q01JUNIOR

What are the three operations in Levenshtein distance and what DP state ...

Q02JUNIOR

What is the base case for the edit distance DP table?

Q03SENIOR

How is edit distance related to LCS?

Q04SENIOR

How can you reduce space complexity from O(m*n) to O(min(m,n))?

Q05SENIOR

How do you reconstruct the actual sequence of edit operations?

Q06SENIOR

What happens if you have negative operation costs?

Q07SENIOR

How do you handle Unicode supplementary characters in edit distance?

Q08SENIOR

In a production spell-checker, how would you incorporate word frequency ...

Q01 of 08JUNIOR

What are the three operations in Levenshtein distance and what DP state captures each?

ANSWER

Insertion: adding a character to word1 (dp[i][j-1]). Deletion: removing a character from word1 (dp[i-1][j]). Substitution: replacing a character in word1 with one from word2 (dp[i-1][j-1]). The recurrence chooses the minimum of these three plus 1 when characters differ.

FAQ · 6 QUESTIONS

Frequently Asked Questions

What is the difference between edit distance and LCS?

How do you reconstruct the actual sequence of operations?

How can edit distance be computed in O(min(m,n)) space?

Is edit distance case-sensitive?

Can edit distance handle weighted operations?

What is the best way to handle very long strings (100k+ characters)?

Naren Founder & Principal Engineer

20+ years shipping performance-critical code where algorithms decide the bill. Lessons pulled from things that broke in production.

✓ Verified

production tested

May 24, 2026

last updated

1,554

articles · all by Naren

🔥

That's Dynamic Programming. Mark it forged?

7 min read · try the examples if you haven't