Intermediate 3 min · March 24, 2026

Z-Algorithm — Direct Match Lengths vs KMP

Replace KMP's failure function with Z-algorithm's Z-array for exact match lengths — see O(n+m) pattern matching in Python with clear examples.

N
Naren · Founder
Plain-English first. Then code. Then the interview question.
About
Plain-English First

Imagine a string like 'aabxaabaabx'. The Z-algorithm asks: at each position i, how long is the longest substring starting at i that matches the very beginning of the string? Position 4 gives 'aabx' which matches the start — so Z[4]=4. This single idea, computed efficiently for every position, is enough to find any pattern in any text.

The Z-algorithm is competitive programming's go-to string tool. It solves the same problem as KMP in O(n+m), but the Z-array is directly interpretable — you see match lengths explicitly rather than reasoning about failure functions. Engineers who learned through competitive programming reach for Z-algorithm first because it is memorisable under interview pressure.

The Z-box insight that makes it linear is the exact same structural trick as KMP's failure function: maintain a rightmost matched boundary, initialise from the mirror position, only expand what has not yet been proven. Once you see this pattern, you recognise it in KMP, Manacher, and suffix array construction.

How the Z-Algorithm Works — Plain English and Steps

The Z-algorithm computes a Z-array for a string S where Z[i] is the length of the longest substring starting at S[i] that is also a prefix of S. For example, in S='aabxaa', Z[4]=2 because S[4..5]='aa' matches the prefix 'aa'.

For pattern matching, concatenate pattern + '$' + text (the '$' is a separator not in either string). Compute the Z-array. Any position i in the text portion where Z[i] == len(pattern) is a match.

Algorithm to build the Z-array in O(n): 1. Z[0] is undefined (or set to n by convention). Maintain a window [l, r] — the rightmost Z-box found so far. 2. For each i from 1 to n-1: a. If i > r: compute Z[i] naively by matching S[i..] against S[0..]. Update l=i, r=i+Z[i]-1. b. Else (i is inside the current Z-box [l,r]): - Let k = i - l (position of i within the box, mirrored in prefix). - If Z[k] < r - i + 1: Z[i] = Z[k] (the match is entirely inside the box). - Else: extend naively from r+1 and update l and r.

The Z-box [l,r] allows us to reuse previous match results — this is what gives O(n) time instead of O(n^2).

Worked Example — Z-Array for 'aabcaab'

String S = 'aabcaab' (indices 0-6)

Z[0] = 7 by convention (entire string matches itself).

i=1: S[1..]='abcaab'. Match against prefix 'aabcaab'. S[1]='a'==S[0]='a', S[2]='b'==S[1]='a'? No. Z[1]=1. l=1,r=1. i=2: S[2..]='bcaab'. S[2]='b'!=S[0]='a'. Z[2]=0. i=3: S[3..]='caab'. S[3]='c'!=S[0]='a'. Z[3]=0. i=4: i>r(=1). S[4..]='aab'. S[4]='a'==S[0], S[5]='a'==S[1], S[6]='b'==S[2]? Yes! Z[4]=3. l=4,r=6. i=5: i=5 is inside [4,6]. k=5-4=1. Z[k]=Z[1]=1. Check: r-i+1=6-5+1=2. Z[1]=1 < 2. Z[5]=Z[1]=1. i=6: i=6 is inside [4,6]. k=6-4=2. Z[k]=Z[2]=0. Z[2]=0 < r-i+1=1. Z[6]=0.

Z-array: [7, 1, 0, 0, 3, 1, 0]

Pattern matching example: find 'aa' in text 'aabcaab'. Concatenated: 'aa$aabcaab' (length 10) Z-array: [10,1,0,2,1,0,0,3,1,0] Pattern length = 2. Look for Z[i] == 2 in the text portion (indices >= 3). Z[3]=2 -> match at text index 3-3=0. Z[7]=3 >= 2 -> match at text index 7-3=4. Matches at indices 0 and 4 in the original text.

Z-Algorithm Implementation

The implementation maintains the rightmost Z-box [l, r] to avoid redundant comparisons.

z_algorithm.pyPYTHON
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
def build_z_array(s):
    n = len(s)
    z = [0] * n
    z[0] = n
    l = r = 0
    for i in range(1, n):
        if i < r:
            z[i] = min(r - i, z[i - l])
        while i + z[i] < n and s[z[i]] == s[i + z[i]]:
            z[i] += 1
        if i + z[i] > r:
            l, r = i, i + z[i]
    return z

def z_search(text, pattern):
    if not pattern: return []
    combined = pattern + '$' + text
    z = build_z_array(combined)
    m = len(pattern)
    return [i - m - 1 for i in range(m + 1, len(combined)) if z[i] == m]

text    = 'aabcaab'
pattern = 'aa'
print('Z-array:', build_z_array('aabcaab'))
print('Matches at:', z_search(text, pattern))
Output
Z-array: [7, 1, 0, 0, 3, 1, 0]
Matches at: [0, 4]

Key takeaways

1
Z[i] = length of the longest prefix of S that matches S starting at position i.
2
For pattern matching
concatenate pattern + '$' + text. Any position with Z[i] == m is a match.
3
The Z-box [l,r] allows O(n) construction by reusing previously computed matches.
4
Time complexity
O(n+m). Space: O(n+m) for the combined string and Z-array.
5
Conceptually simpler than KMP
one array instead of two phases.
INTERVIEW PREP · PRACTICE MODE

Interview Questions on This Topic

FAQ · 3 QUESTIONS

Frequently Asked Questions

01
What is the Z-array used for besides pattern matching?
02
How does the Z-algorithm compare to KMP?
03
Why use '$' as a separator in pattern matching?
🔥

That's Strings. Mark it forged?

3 min read · try the examples if you haven't

Previous
KMP Algorithm — Knuth-Morris-Pratt String Matching
3 / 8 · Strings
Next
Boyer-Moore String Search Algorithm