KMP String Matching — LPS Off-by-One Bug in Production
- KMP avoids redundant comparisons by precomputing an LPS (failure function) array for the pattern.
- LPS[i] = length of the longest proper prefix of pattern[0..i] that is also a suffix.
- On mismatch at pattern[j], set j = lps[j-1] instead of restarting from 0.
- Text pointer never moves backward — guaranteed O(n+m) runtime
- Build LPS in O(m) using a two-pointer backtracking loop
- Biggest mistake: using lps[j] instead of lps[j-1] — causes off-by-one silent failures
KMP Quick Debug Cheat Sheet
No matches found (but pattern exists)
- print(build_lps('ABCAB'))  # Expected: [0, 0, 0, 1, 2]
- print(kmp_search('ABABCABCABAB', 'ABCAB'))  # Expected: [2, 5]
Infinite loop or very slow on small text
- Add a breakpoint at 'while i < n:' and step through with text='AAAB', pattern='AAB'.
- Verify that i increments at least once per cycle — if not, you're missing the else branch.
Matches found but at wrong positions
- Add print(f'Match at {i-j}') just before appending — confirm against manual tracing.
- Test with pattern 'AA' and text 'AAA' — expected matches at 0 and 1.
Production Incident
The fix: change j = lps[j] to j = lps[j-1]. Added a randomised fuzz test that generates random patterns and texts and compares against a naive search — catches exactly these edge cases.
KMP is one of those algorithms that every senior engineer has encountered in a 1am incident. You're processing a 500MB log file looking for a pattern — maybe a specific error signature or a token format — and your naive substring search is burning through CPU. KMP is the fix, and understanding why it works is what separates engineers who reach for the right tool from engineers who reach for regex and hope for the best.
Published in 1977 by Knuth, Morris, and Pratt, KMP solves string pattern matching in O(n+m) time — guaranteed, not amortised, not average-case. It works by never re-examining a character in the text. Once the text pointer moves right, it never goes left. This single invariant is why the algorithm runs in linear time regardless of what adversarial pattern you throw at it. The mechanism that enforces this invariant is the LPS array — and once you truly understand LPS, you'll see it hidden inside Z-algorithm, Aho-Corasick, and even suffix arrays.
How KMP Works — Plain English and Algorithm
The naive string search checks every position in the text and compares the pattern character by character — O(n*m) worst case. KMP (Knuth-Morris-Pratt) is faster because it never re-examines characters it has already matched. When a mismatch occurs, KMP uses a precomputed table to skip ahead intelligently instead of restarting from scratch.
The key insight: when a mismatch occurs at pattern[j], the substring pattern[0..j-1] was already matched. The 'failure function' (or LPS array — Longest Proper Prefix which is also Suffix) tells us the longest proper prefix of pattern[0..j-1] that is also a suffix. We can restart matching from that prefix instead of from the beginning.
Step 1 — Build the LPS (failure function) array for the pattern:
1. lps[0] = 0 always.
2. Use two pointers: len=0 (length of the previous longest prefix-suffix) and i=1.
3. If pattern[i] == pattern[len]: lps[i] = len+1, increment both i and len.
4. Else if len > 0: len = lps[len-1] (try a shorter prefix — do NOT increment i).
5. Else: lps[i] = 0, increment i.
Step 2 — Search the text using the LPS array:
1. Use two pointers: i (text index) and j (pattern index), both starting at 0.
2. If text[i] == pattern[j]: increment both i and j.
3. If j == len(pattern): match found at index i-j. Set j = lps[j-1].
4. Else if i < len(text) and text[i] != pattern[j]:
   - If j > 0: j = lps[j-1] (skip via the failure function, do NOT move i).
   - Else: increment i.
Time complexity: O(n + m) where n = text length, m = pattern length. The LPS build is O(m) and the search is O(n).
- Think of the pattern as a rope sliding along the text: the LPS array measures the 'slack' in the rope — how far it can recoil.
- If the pattern's prefix of length k matches the suffix of the text you just matched, you can keep those k characters and resume comparing at pattern[k], re-reading nothing.
- The text pointer never goes backward — it's a unidirectional scan with smart jumps on the pattern side.
- KMP works because the pattern knows its own symmetry — it doesn't need to re-examine what it already learned.
Visual LPS Table Construction
The LPS array is built by walking through the pattern and determining the longest proper prefix that is also a suffix for each prefix of the pattern. The algorithm maintains a length variable that tracks the length of the previous longest prefix-suffix. At each step, the table shows the current state of the LPS array and the positions of pattern[i] and pattern[len]. This visual breaks down the construction for the pattern 'ABABCAB' (length 7) step by step.
- i=0: lps[0]=0 by definition. len=0.
- i=1: pattern[1]='B' != pattern[0]='A' and len=0 → lps[1]=0.
- i=2: pattern[2]='A' == pattern[0]='A' → len=1, lps[2]=1.
- i=3: pattern[3]='B' == pattern[1]='B' → len=2, lps[3]=2.
- i=4: pattern[4]='C' != pattern[2]='A' and len>0 → len=lps[1]=0; then pattern[4]='C' != pattern[0]='A' → lps[4]=0.
- i=5: pattern[5]='A' == pattern[0]='A' → len=1, lps[5]=1.
- i=6: pattern[6]='B' == pattern[1]='B' → len=2, lps[6]=2.
The diagram below shows the LPS array evolving after each comparison.
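The walkthrough above can be reproduced with a small instrumented build. This is a sketch for tracing only (the function and trace format are mine); the article's reference implementation appears later.

```python
def build_lps_traced(pattern):
    # LPS construction with a trace print after each settled index,
    # mirroring the step-by-step walk above.
    lps = [0] * len(pattern)
    length, i = 0, 1
    while i < len(pattern):
        if pattern[i] == pattern[length]:
            length += 1
            lps[i] = length
            print(f"i={i}: match '{pattern[i]}' -> lps[{i}]={length}")
            i += 1
        elif length > 0:
            length = lps[length - 1]  # fall back to a shorter prefix; do not advance i
        else:
            print(f"i={i}: no prefix reuse -> lps[{i}]=0")
            i += 1
    return lps

print(build_lps_traced('ABABCAB'))  # last line printed: [0, 0, 1, 2, 0, 1, 2]
```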
Worked Example — KMP Search for 'ABCAB' in 'ABABCABCABAB'
Pattern: ABCAB Text: ABABCABCABAB
Step 1 — Build LPS for pattern 'ABCAB':
- i=0: lps[0]=0 (always).
- i=1: pattern[1]='B' vs pattern[0]='A'. No match. lps[1]=0.
- i=2: pattern[2]='C' vs pattern[0]='A'. No match. lps[2]=0.
- i=3: pattern[3]='A' vs pattern[0]='A'. Match! lps[3]=1, len=1.
- i=4: pattern[4]='B' vs pattern[1]='B'. Match! lps[4]=2, len=2.
LPS = [0, 0, 0, 1, 2]
Step 2 — Search (i = text index, j = pattern index):
- i=0,j=0: text[0]='A' == pattern[0]='A' → i=1, j=1.
- i=1,j=1: text[1]='B' == pattern[1]='B' → i=2, j=2.
- i=2,j=2: text[2]='A' != pattern[2]='C' → j=lps[1]=0.
- i=2,j=0: text[2]='A' == pattern[0]='A' → i=3, j=1.
- i=3,j=1: text[3]='B' == pattern[1]='B' → i=4, j=2.
- i=4,j=2: text[4]='C' == pattern[2]='C' → i=5, j=3.
- i=5,j=3: text[5]='A' == pattern[3]='A' → i=6, j=4.
- i=6,j=4: text[6]='B' == pattern[4]='B' → i=7, j=5. Now j == 5 == len(pattern): MATCH at index i-j = 7-5 = 2. Set j = lps[4] = 2 and continue from i=7, j=2.
- i=7,j=2: text[7]='C' == pattern[2]='C' → i=8, j=3.
- i=8,j=3: text[8]='A' == pattern[3]='A' → i=9, j=4.
- i=9,j=4: text[9]='B' == pattern[4]='B' → i=10, j=5. MATCH at index i-j = 10-5 = 5. Set j = lps[4] = 2.
Matches found at indices 2 and 5 in the text.
Notice that after finding a match, KMP uses lps[j-1] to slide the pattern forward by the right amount — it does not restart from the beginning of the pattern.
KMP Implementation
The two-function structure (build_lps and kmp_search) matches the two-phase algorithm exactly. Study build_lps carefully — the len backtrack using lps[len-1] is the subtlest part.
```python
def build_lps(pattern):
    m = len(pattern)
    lps = [0] * m
    length = 0  # length of previous longest prefix-suffix
    i = 1
    while i < m:
        if pattern[i] == pattern[length]:
            length += 1
            lps[i] = length
            i += 1
        elif length > 0:
            length = lps[length - 1]  # try shorter prefix
            # do NOT increment i here
        else:
            lps[i] = 0
            i += 1
    return lps

def kmp_search(text, pattern):
    n, m = len(text), len(pattern)
    if m == 0:
        return []
    lps = build_lps(pattern)
    matches = []
    i = j = 0
    while i < n:
        if text[i] == pattern[j]:
            i += 1; j += 1
        if j == m:
            matches.append(i - j)  # match at this index
            j = lps[j - 1]
        elif i < n and text[i] != pattern[j]:
            if j > 0:
                j = lps[j - 1]
            else:
                i += 1
    return matches

text = 'ABABCABCABAB'
pattern = 'ABCAB'
print('LPS:', build_lps(pattern))
print('Matches at indices:', kmp_search(text, pattern))
```
LPS: [0, 0, 0, 1, 2]
Matches at indices: [2, 5]
m==0 must be handled separately — an empty pattern should return no matches, but many implementations incorrectly treat it as a match at every position.
C++ Implementation
The C++ implementation of KMP mirrors the Python version but requires careful handling of indices and the std::vector container. The LPS construction and search functions are split the same way. Key differences: std::string::length() returns size_t (converted to int here, which is safe for typical input sizes), and the LPS array is built with a while loop that does not increment i in every iteration. The search function returns a vector of match positions. Below is a complete C++ implementation that follows the same logic as the Python reference.
```cpp
#include <vector>
#include <string>
#include <iostream>

std::vector<int> build_lps(const std::string& pattern) {
    int m = pattern.length();
    std::vector<int> lps(m, 0);
    int len = 0;  // length of previous longest prefix-suffix
    int i = 1;
    while (i < m) {
        if (pattern[i] == pattern[len]) {
            len++;
            lps[i] = len;
            i++;
        } else if (len > 0) {
            len = lps[len - 1];  // backtrack without incrementing i
        } else {
            lps[i] = 0;
            i++;
        }
    }
    return lps;
}

std::vector<int> kmp_search(const std::string& text, const std::string& pattern) {
    int n = text.length();
    int m = pattern.length();
    if (m == 0) return {};
    std::vector<int> lps = build_lps(pattern);
    std::vector<int> matches;
    int i = 0, j = 0;
    while (i < n) {
        if (text[i] == pattern[j]) { i++; j++; }
        if (j == m) {
            matches.push_back(i - j);
            j = lps[j - 1];
        } else if (i < n && text[i] != pattern[j]) {
            if (j > 0) j = lps[j - 1];
            else i++;
        }
    }
    return matches;
}

int main() {
    std::string text = "ABABCABCABAB";
    std::string pattern = "ABCAB";
    std::vector<int> lps = build_lps(pattern);
    std::cout << "LPS: ";
    for (int v : lps) std::cout << v << " ";
    std::cout << std::endl;
    std::vector<int> matches = kmp_search(text, pattern);
    std::cout << "Matches at indices: ";
    for (int idx : matches) std::cout << idx << " ";
    std::cout << std::endl;
    return 0;
}
```
LPS: 0 0 0 1 2
Matches at indices: 2 5
The code uses std::vector<int> for dynamic arrays rather than raw int* allocations. The same off-by-one bugs apply: double-check the fallback index lps[j-1]. Also note that C++ string indices are size_t, so casting to int for the LPS vector is safe for typical problem sizes, but avoid signed/unsigned mismatches in larger loops.
Common KMP Mistakes and How to Debug Them
Even experienced engineers slip up on KMP's fallback logic. The algorithm has three high-risk areas: the LPS construction loop, the mismatch fallback in search, and the match-continuation handler.
- Off-by-one in LPS fallback: When len is not equal to 0 and pattern[i] != pattern[len], you must set len = lps[len-1]. Using lps[len] or lps[len-2] produces wrong values that cascade into missed matches.
- Confusing j and i in search: After a mismatch, you must NOT increment i when j > 0. This is the most common infinite-loop trigger. Only increment i when j == 0.
- After a match, setting j = 0 instead of j = lps[j-1]: This kills overlap detection. If your pattern is 'AA' and text is 'AAA', you'll miss the match at index 1.
- Forgetting that LPS lengths are always < j: lps[j-1] is always less than j. If your fallback index ever equals or exceeds j, you have a bug in LPS construction.
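The overlap bug in the third bullet is easy to demonstrate. Below is a sketch of the article's search with an extra flag (the flag name is mine) that reproduces the faulty j = 0 reset after a match:

```python
def build_lps(pattern):
    # Standard LPS (failure function) construction.
    lps, length, i = [0] * len(pattern), 0, 1
    while i < len(pattern):
        if pattern[i] == pattern[length]:
            length += 1; lps[i] = length; i += 1
        elif length > 0:
            length = lps[length - 1]  # fall back to a shorter prefix
        else:
            i += 1
    return lps

def kmp_search(text, pattern, buggy_reset=False):
    # buggy_reset=True reproduces the mistake: j = 0 after a match
    # instead of j = lps[j-1], which drops overlapping matches.
    lps, matches, i, j = build_lps(pattern), [], 0, 0
    while i < len(text):
        if text[i] == pattern[j]:
            i += 1; j += 1
        if j == len(pattern):
            matches.append(i - j)
            j = 0 if buggy_reset else lps[j - 1]
        elif i < len(text) and text[i] != pattern[j]:
            if j > 0: j = lps[j - 1]
            else: i += 1
    return matches

print(kmp_search('AAA', 'AA'))                    # [0, 1]
print(kmp_search('AAA', 'AA', buggy_reset=True))  # [0] -- the overlap at 1 is lost
```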
KMP Variants, Complexity, and When to Use Alternatives
KMP is not always the fastest string matching algorithm for every scenario. Here's how it compares:
- Boyer-Moore: Usually faster for large alphabets because it can skip more characters per mismatch (bad character and good suffix heuristics). Worst-case O(n*m) but rarely hits it.
- Rabin-Karp: Uses rolling hash. Good for multiple pattern search (plagiarism detection), but O(n*m) worst-case from hash collisions.
- Aho-Corasick: Builds a finite automaton from multiple patterns simultaneously. Ideal for virus signature detection or filtering many keywords.
- Z-algorithm: Computes an array Z[i] = longest substring starting at i that matches prefix. Can be used to find all matches in O(n+m) with simpler code than KMP.
KMP remains the best choice when you need guaranteed linear time and low overhead for a single pattern search. It's also the foundation for understanding more complex string algorithms.
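For comparison, here is one common way to sketch the Z-algorithm variant mentioned above. The sentinel character is an assumption of this sketch: any character that occurs in neither string works.

```python
def z_array(s):
    # Z[i] = length of the longest substring starting at i that matches a prefix of s.
    n = len(s)
    z = [0] * n
    z[0] = n  # convention: the whole string matches itself
    l = r = 0  # [l, r) is the rightmost prefix-match box found so far
    for i in range(1, n):
        if i < r:
            z[i] = min(r - i, z[i - l])  # reuse previously computed overlap
        while i + z[i] < n and s[z[i]] == s[i + z[i]]:
            z[i] += 1  # extend by direct comparison
        if i + z[i] > r:
            l, r = i, i + z[i]
    return z

def z_search(text, pattern, sep='\x00'):
    # Compute Z over pattern + sep + text; any Z value equal to len(pattern)
    # marks a match. Assumes sep occurs in neither string.
    m = len(pattern)
    z = z_array(pattern + sep + text)
    return [i - m - 1 for i in range(m + 1, len(z)) if z[i] == m]

print(z_search('ABABCABCABAB', 'ABCAB'))  # [2, 5]
```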
Formal O(n) Complexity Proof
The KMP search algorithm guarantees O(n) time because the outer while loop runs at most 2n times, where n is the text length. The key invariant: the text index i never decreases; it either increments (when a character matches, or when j=0 and a mismatch occurs) or stays the same (when j>0 and a mismatch occurs, causing j to fall back). Each time j falls back, j decreases by at least 1, and j can only increase when i increments. Since j is bounded by m (the pattern length), the total number of fallbacks across the entire search is bounded by the total number of i increments, which is at most n. Therefore the total number of operations in the search loop is O(n).
More formally, classify each iteration of the search loop:
- Type A: i increases by 1 (and j may increase by at most 1). This can happen at most n times.
- Type B: j decreases by at least 1 (via lps[j-1]). Since j starts at 0 and never exceeds m, and each increase of j (in Type A) is paired with an increase in i, the total number of Type B operations is at most the total number of Type A operations, i.e., at most n.
Hence total operations ≤ 2n = O(n).
The LPS construction has an analogous proof: the pointer i only moves forward (at most m times), and the variable len only decreases when backtracking. Each backtrack reduces len by at least 1, and each increase of len is paired with an increment of i. So total backtrack steps ≤ m, giving O(m) for the build phase. Thus overall O(n+m).
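The 2n bound can be checked empirically by instrumenting the search loop. In this sketch (the counter and helper names are mine), ops counts every iteration, i.e. every Type A advance or Type B fallback:

```python
def build_lps(pattern):
    # Standard LPS (failure function) construction.
    lps, length, i = [0] * len(pattern), 0, 1
    while i < len(pattern):
        if pattern[i] == pattern[length]:
            length += 1; lps[i] = length; i += 1
        elif length > 0:
            length = lps[length - 1]
        else:
            i += 1
    return lps

def kmp_search_counted(text, pattern):
    # Returns (matches, ops); ops counts search-loop iterations to check
    # the <= 2n bound from the proof above.
    lps, matches, i, j, ops = build_lps(pattern), [], 0, 0, 0
    while i < len(text):
        ops += 1
        if text[i] == pattern[j]:
            i += 1; j += 1
        if j == len(pattern):
            matches.append(i - j); j = lps[j - 1]
        elif i < len(text) and text[i] != pattern[j]:
            if j > 0: j = lps[j - 1]
            else: i += 1
    return matches, ops

text, pattern = 'A' * 30 + 'B', 'AAAB'  # lots of fallbacks, still linear
matches, ops = kmp_search_counted(text, pattern)
print(matches, ops, 2 * len(text))  # ops never exceeds 2n
```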
String Period and Compression with KMP
One powerful application of the LPS (prefix function) is computing the smallest period of a string. The period of a string s is the smallest integer p such that s[i] = s[i+p] for all i where i+p < |s|; equivalently, s consists of repetitions of a block of length p (possibly with a truncated final copy). Using the LPS array, the smallest period of s is |s| - LPS[|s|-1] when |s| is divisible by |s| - LPS[|s|-1]; otherwise s is not a whole number of repetitions of any shorter block.
More concretely: let π denote the prefix function (the standard name for the LPS array, with π[i] computed for every prefix of s). The length of the smallest period is n - π[n-1], provided n % (n - π[n-1]) == 0. For example, for s = "ABABAB", n=6 and the last LPS value is 4 (since "ABAB" is both a prefix and a suffix), so the period is 6-4=2: indeed "AB" repeated 3 times.
This property allows string compression: a periodic string can be stored as a base substring and a repeat count, reducing memory. It also helps in finding repeated substrings in DNA sequences or log patterns. In competitive programming, you can solve problems like "Find the smallest period of a string" or "Check if a string can be formed by repeating a substring" using this technique.
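The period formula above translates directly into code. A sketch (the function name is mine; when the string is not a whole number of repeats, it returns n, the trivial period):

```python
def smallest_period(s):
    # Smallest period via the prefix function: p = n - lps[n-1] when p
    # divides n; otherwise the string is not built from whole repeats of
    # a shorter block, and the answer is n itself.
    n = len(s)
    if n == 0:
        return 0
    lps, length, i = [0] * n, 0, 1
    while i < n:
        if s[i] == s[length]:
            length += 1; lps[i] = length; i += 1
        elif length > 0:
            length = lps[length - 1]
        else:
            i += 1
    p = n - lps[-1]
    return p if n % p == 0 else n

print(smallest_period('ABABAB'))  # 2
print(smallest_period('ABCAB'))   # 5 (not a repetition of a shorter block)
```

The same check answers "can this string be formed by repeating a substring?": yes exactly when smallest_period(s) < len(s).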
Practice Problems
Solidify your understanding of KMP by practising problems that leverage the algorithm and its variants, from direct application (find all occurrences of a pattern) to more creative uses of the prefix function (smallest period, rotation checks). Test against edge cases like overlapping patterns. The table below summarises how the algorithms compare.
| Algorithm | Time Complexity | Space | Best For | Worst-Case |
|---|---|---|---|---|
| KMP | O(n+m) | O(m) | Single pattern, guaranteed linear | Always O(n+m) |
| Boyer-Moore | O(n+m) average, O(n*m) worst | O(σ+m) | Large alphabet, long text | Adversarial patterns |
| Rabin-Karp | O(n+m) average, O(n*m) worst | O(1) | Multiple patterns, rolling hash | Hash collisions |
| Aho-Corasick | O(n+m+total matches) | O(σ × total pattern length) | Multiple patterns simultaneously | Very large automaton |
| Z-algorithm | O(n+m) | O(n+m) | Simple implementation, pattern matching | Same as KMP |
🎯 Key Takeaways
- Time complexity: O(n+m). Space: O(m) for the LPS array.
- KMP never moves the text pointer i backward — it only advances, guaranteeing linear time.
Interview Questions on This Topic
- Explain what the LPS array stores and how it is computed. (Mid-level)
- Why does KMP run in O(n+m) rather than O(n*m)? (Senior)
- How would you use KMP to check if one string is a rotation of another? (Mid-level)
- What is the LPS array for the pattern 'ABACABA'? (Senior)
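The rotation question has a classic answer: t is a rotation of s exactly when the two strings have the same length and t occurs somewhere in s + s. A sketch using KMP for the containment check (the helper names are mine):

```python
def build_lps(pattern):
    # Standard LPS (failure function) construction.
    lps, length, i = [0] * len(pattern), 0, 1
    while i < len(pattern):
        if pattern[i] == pattern[length]:
            length += 1; lps[i] = length; i += 1
        elif length > 0:
            length = lps[length - 1]
        else:
            i += 1
    return lps

def kmp_search(text, pattern):
    if not pattern:
        return []
    lps, matches, i, j = build_lps(pattern), [], 0, 0
    while i < len(text):
        if text[i] == pattern[j]:
            i += 1; j += 1
        if j == len(pattern):
            matches.append(i - j); j = lps[j - 1]
        elif i < len(text) and text[i] != pattern[j]:
            if j > 0: j = lps[j - 1]
            else: i += 1
    return matches

def is_rotation(s, t):
    # t is a rotation of s iff lengths match and t appears in s + s.
    if len(s) != len(t):
        return False
    if not s:
        return True  # the empty string is trivially a rotation of itself
    return len(kmp_search(s + s, t)) > 0

print(is_rotation('ABCDE', 'CDEAB'))  # True
print(is_rotation('ABCDE', 'CDEBA'))  # False
```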
Frequently Asked Questions
What does the LPS array represent?
LPS[i] is the length of the longest proper prefix of pattern[0..i] that is also a suffix of pattern[0..i]. 'Proper' means it cannot be the whole string itself. For pattern 'ABCAB', lps[4]=2 because 'AB' (length 2) is both a prefix and a suffix of 'ABCAB'. This tells KMP: after a mismatch at position j, restart matching from position lps[j-1] instead of 0.
Why is KMP O(n+m) when it seems like the inner loop could run multiple times?
The variable j can only be decremented via lps lookups, and it can only be incremented when a character matches (which also increments i). Since i only ever increases and is bounded by n, and j's total number of decrements is bounded by its total number of increments, the total work across the entire search is O(n). The LPS build is similarly O(m).
When should I use KMP vs other string matching algorithms?
Use KMP when searching for a single pattern in a long text — it is simple to implement and O(n+m). Use Aho-Corasick when searching for multiple patterns simultaneously. Use Rabin-Karp when you need rolling hash (e.g., plagiarism detection over sliding windows). Use Boyer-Moore when the alphabet is large — it skips more characters per mismatch.
How do I verify my KMP implementation is correct?
Create a fuzz test that generates random patterns and texts, then compare KMP's match results with a naive substring search. Test edge cases: empty pattern, single character pattern, pattern with all same characters (e.g., 'AAA'), pattern with overlapping prefixes (e.g., 'ABAB'), and text that is shorter than the pattern.
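A fuzz harness along those lines might look like this sketch (the naive oracle, alphabet, and trial counts are illustrative choices, not prescriptions; a two-letter alphabet maximises overlapping matches):

```python
import random

def naive_search(text, pattern):
    # Reference oracle: check every alignment directly, O(n*m).
    m = len(pattern)
    if m == 0:
        return []
    return [i for i in range(len(text) - m + 1) if text[i:i + m] == pattern]

def build_lps(pattern):
    # Standard LPS (failure function) construction.
    lps, length, i = [0] * len(pattern), 0, 1
    while i < len(pattern):
        if pattern[i] == pattern[length]:
            length += 1; lps[i] = length; i += 1
        elif length > 0:
            length = lps[length - 1]
        else:
            i += 1
    return lps

def kmp_search(text, pattern):
    if not pattern:
        return []
    lps, matches, i, j = build_lps(pattern), [], 0, 0
    while i < len(text):
        if text[i] == pattern[j]:
            i += 1; j += 1
        if j == len(pattern):
            matches.append(i - j); j = lps[j - 1]
        elif i < len(text) and text[i] != pattern[j]:
            if j > 0: j = lps[j - 1]
            else: i += 1
    return matches

random.seed(0)  # deterministic runs make failures reproducible
for _ in range(500):
    text = ''.join(random.choice('AB') for _ in range(random.randint(0, 30)))
    pattern = ''.join(random.choice('AB') for _ in range(random.randint(1, 5)))
    assert kmp_search(text, pattern) == naive_search(text, pattern), (text, pattern)
print('fuzz passed')
```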
Developer and founder of TheCodeForge. I built this site because I was tired of tutorials that explain what to type without explaining why it works. Every article here is written to make concepts actually click.