Home DSA String Manipulation Patterns in DSA — The Definitive Guide

String Manipulation Patterns in DSA — The Definitive Guide

In Plain English 🔥
Imagine you're sorting through a long receipt from a grocery store, looking for every time you bought milk. You don't reread the entire receipt from scratch each time — you scan once, maybe use your finger as a pointer, and keep a tally. String manipulation patterns work exactly the same way: they're clever scanning strategies that let your code process text efficiently without doing unnecessary repetitive work. Once you learn the patterns, you stop reinventing the wheel every time a string problem shows up.
⚡ Quick Answer
Imagine you're sorting through a long receipt from a grocery store, looking for every time you bought milk. You don't reread the entire receipt from scratch each time — you scan once, maybe use your finger as a pointer, and keep a tally. String manipulation patterns work exactly the same way: they're clever scanning strategies that let your code process text efficiently without doing unnecessary repetitive work. Once you learn the patterns, you stop reinventing the wheel every time a string problem shows up.

Strings are everywhere in software. Search engines scan billions of words per second. Your bank checks that your account number matches a pattern. Autocomplete predicts the next word you're about to type. Every one of those features is built on a small set of string manipulation patterns that engineers reach for again and again. Understanding them doesn't just make you better at LeetCode — it makes you a sharper programmer in production code.

The core problem with naive string processing is that it's deceptively slow. It's tempting to write nested loops that check every possible substring, but that approach scales terribly. A string with 10,000 characters processed naively can require 100 million comparisons. The patterns covered here — sliding window, two pointers, frequency maps, and hashing — reduce most of those problems to a single linear pass, or at worst O(n log n).

By the end of this article you'll be able to identify which pattern fits a given string problem before you write a single line of code, implement each one cleanly in Java, dodge the subtle bugs that trip up even experienced developers, and walk into an interview knowing exactly why your solution is optimal — not just that it works.

The Sliding Window Pattern — Scan Once, Answer Fast

The sliding window pattern is your go-to whenever a problem asks about a contiguous substring — 'find the longest substring with no repeating characters', 'find the smallest window containing all target characters', or 'find every anagram of a pattern in a string'. The insight is simple: instead of restarting your scan from scratch every time, you maintain a window defined by two boundary pointers and slide it forward, updating your state incrementally.

Think of it like reading through a book with a sticky-note bookmark that stretches. When the text inside the bookmark violates your rule, you shrink the left edge. When it's fine, you extend the right edge. You never go backward.

There are two flavours: fixed-size windows (the window length never changes — useful for anagram detection) and variable-size windows (the window grows and shrinks based on a condition — useful for 'longest/shortest substring that satisfies X'). Knowing which flavour to use is half the battle. Fixed window: the problem gives you a target length. Variable window: the problem asks you to find a length.

LongestUniqueSubstring.java · JAVA
1234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950
import java.util.HashMap;
import java.util.Map;

public class LongestUniqueSubstring {

    /**
     * Finds the length of the longest substring without repeating characters.
     * Classic variable-size sliding window problem.
     *
     * Time:  O(n) — each character is visited at most twice (once by right, once by left)
     * Space: O(min(n, alphabet)) — the map holds at most one entry per unique character
     */
    public static int findLongestUniqueSubstring(String input) {
        // Maps each character to the index AFTER its last known position.
        // Storing index+1 lets us jump the left pointer forward in one step.
        Map<Character, Integer> lastSeenAt = new HashMap<>();

        int maxLength = 0;
        int windowStart = 0; // left edge of our sliding window

        for (int windowEnd = 0; windowEnd < input.length(); windowEnd++) {
            char currentChar = input.charAt(windowEnd);

            // If this character already exists inside our current window,
            // move the left edge just past its previous position so the
            // window no longer contains the duplicate.
            if (lastSeenAt.containsKey(currentChar)) {
                // Math.max prevents the window from moving BACKWARD
                // if the duplicate was outside the current window.
                windowStart = Math.max(windowStart, lastSeenAt.get(currentChar));
            }

            // Record this character's "next safe position" for the left pointer
            lastSeenAt.put(currentChar, windowEnd + 1);

            // Current window length = windowEnd - windowStart + 1
            maxLength = Math.max(maxLength, windowEnd - windowStart + 1);
        }

        return maxLength;
    }

    public static void main(String[] args) {
        System.out.println(findLongestUniqueSubstring("abcabcbb"));  // 3  → "abc"
        System.out.println(findLongestUniqueSubstring("bbbbb"));     // 1  → "b"
        System.out.println(findLongestUniqueSubstring("pwwkew"));    // 3  → "wke"
        System.out.println(findLongestUniqueSubstring(""));          // 0  → empty string
        System.out.println(findLongestUniqueSubstring("abcdefg"));   // 7  → whole string
    }
}
▶ Output
3
1
3
0
7
⚠️
Watch Out: The Missing Math.maxOmitting Math.max(windowStart, lastSeenAt.get(currentChar)) is the #1 sliding window bug. Without it, a duplicate character found BEFORE the current window drags your left pointer backwards, making the window artificially shrink. Always guard the left pointer — it must only ever move forward.

Frequency Maps — When You Care About Character Counts, Not Positions

A frequency map (also called a character count array or histogram) is a data structure that counts how many times each character appears. It transforms a string into a numerical fingerprint. Two strings with identical fingerprints are anagrams of each other. That's powerful.

For strings restricted to lowercase English letters, you can use an int[26] array instead of a HashMap. Array access is O(1) with no hashing overhead, and comparing two int[26] arrays takes exactly 26 comparisons — constant time regardless of string length. This is a significant practical speedup that interviewers love to hear you mention.

Frequency maps unlock the fixed-size sliding window approach for anagram problems. You precompute the frequency map of the pattern, then slide a same-length window across the text. Each slide, you add one character and remove one character from your running frequency map. When the running map matches the pattern map, you've found an anagram. One pass. O(n) time.

AnagramFinder.java · JAVA
12345678910111213141516171819202122232425262728293031323334353637383940414243444546474849505152535455565758596061626364
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class AnagramFinder {

    /**
     * Returns all starting indices where an anagram of 'pattern' begins in 'text'.
     * Uses a fixed-size sliding window with an int[26] frequency map.
     *
     * Time:  O(n) where n = text.length()
     * Space: O(1) — the two int[26] arrays are constant size regardless of input
     */
    public static List<Integer> findAnagramStartIndices(String text, String pattern) {
        List<Integer> resultIndices = new ArrayList<>();

        // Edge case: pattern can't fit inside text
        if (pattern.length() > text.length()) {
            return resultIndices;
        }

        int windowSize = pattern.length();

        // Build the frequency fingerprint for the target pattern
        int[] patternFrequency = new int[26];
        for (char ch : pattern.toCharArray()) {
            patternFrequency[ch - 'a']++; // 'a'=0, 'b'=1, ..., 'z'=25
        }

        // Build the frequency fingerprint for the first window in text
        int[] windowFrequency = new int[26];
        for (int i = 0; i < windowSize; i++) {
            windowFrequency[text.charAt(i) - 'a']++;
        }

        // Check the first window before we start sliding
        if (Arrays.equals(patternFrequency, windowFrequency)) {
            resultIndices.add(0);
        }

        // Slide the window one character at a time
        for (int windowEnd = windowSize; windowEnd < text.length(); windowEnd++) {
            // Add the new character entering the right side of the window
            windowFrequency[text.charAt(windowEnd) - 'a']++;

            // Remove the character leaving the left side of the window
            int windowStart = windowEnd - windowSize;
            windowFrequency[text.charAt(windowStart) - 'a']--;

            // Arrays.equals on two int[26] arrays is 26 comparisons — O(1)
            if (Arrays.equals(patternFrequency, windowFrequency)) {
                resultIndices.add(windowStart + 1); // +1 because new window starts one ahead
            }
        }

        return resultIndices;
    }

    public static void main(String[] args) {
        System.out.println(findAnagramStartIndices("cbaebabacd", "abc")); // [0, 6]
        System.out.println(findAnagramStartIndices("abab", "ab"));        // [0, 1, 2]
        System.out.println(findAnagramStartIndices("af", "be"));          // []
    }
}
▶ Output
[0, 6]
[0, 1, 2]
[]
⚠️
Pro Tip: int[26] vs HashMapWhen the problem guarantees lowercase English letters only, always use int[26] over HashMap. The array version avoids autoboxing, hash collisions, and resize operations. In benchmarks it runs 3-5x faster for this specific use case. Mention this trade-off explicitly in an interview — it signals you think about real-world performance, not just algorithmic complexity.

Two Pointers on Strings — The Palindrome and Reverse Toolkit

Two pointers on a string means placing one pointer at the start and one at the end, then marching them toward each other. It's perfect for problems involving symmetry (palindromes), reversals, or comparisons from both ends simultaneously.

The reason this pattern exists is that many string properties are inherently symmetric. A palindrome reads the same forwards and backwards. The minimum number of deletions to make a string a palindrome depends on mismatches at mirrored positions. Both of these are naturally expressed as comparisons between a left and right pointer moving inward.

Two pointers also pairs beautifully with other patterns. You can use two pointers on the result of a frequency map to reconstruct strings. You can use them inside a sliding window as the window boundaries themselves. Once you're comfortable with each pattern in isolation, start noticing when problems require you to combine two of them — that's where intermediate-level solutions start to look elegant rather than brute-force.

PalindromeChecker.java · JAVA
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100
public class PalindromeChecker {

    /**
     * Checks if a string is a valid palindrome, considering only
     * alphanumeric characters and ignoring case.
     * Real-world variant of the classic problem — matches how
     * "A man, a plan, a canal: Panama" is considered a palindrome.
     *
     * Time:  O(n)
     * Space: O(1) — no extra data structures, just two integer pointers
     */
    public static boolean isValidPalindrome(String sentence) {
        int leftPointer = 0;
        int rightPointer = sentence.length() - 1;

        while (leftPointer < rightPointer) {
            // Skip non-alphanumeric characters from the left
            while (leftPointer < rightPointer
                    && !Character.isLetterOrDigit(sentence.charAt(leftPointer))) {
                leftPointer++;
            }

            // Skip non-alphanumeric characters from the right
            while (leftPointer < rightPointer
                    && !Character.isLetterOrDigit(sentence.charAt(rightPointer))) {
                rightPointer--;
            }

            // Compare characters at both pointers, case-insensitive
            char leftChar  = Character.toLowerCase(sentence.charAt(leftPointer));
            char rightChar = Character.toLowerCase(sentence.charAt(rightPointer));

            if (leftChar != rightChar) {
                return false; // Mismatch — definitely not a palindrome
            }

            // Both characters matched — move both pointers inward
            leftPointer++;
            rightPointer--;
        }

        // All mirrored pairs matched
        return true;
    }

    /**
     * Finds the length of the longest palindromic substring using
     * the "expand around center" technique — a two-pointer expansion.
     * Handles both odd-length ("racecar") and even-length ("abba") palindromes.
     *
     * Time:  O(n²) — each of the 2n-1 centers expands up to n/2 times
     * Space: O(1)
     */
    public static String longestPalindromicSubstring(String word) {
        if (word == null || word.isEmpty()) return "";

        int bestStart  = 0;
        int bestLength = 1;

        for (int centerIndex = 0; centerIndex < word.length(); centerIndex++) {
            // Expand for ODD-length palindromes (single character center)
            int oddLength = expandFromCenter(word, centerIndex, centerIndex);

            // Expand for EVEN-length palindromes (gap between two characters)
            int evenLength = expandFromCenter(word, centerIndex, centerIndex + 1);

            int longerExpansion = Math.max(oddLength, evenLength);

            if (longerExpansion > bestLength) {
                bestLength = longerExpansion;
                // Reverse-calculate where this palindrome starts
                bestStart = centerIndex - (longerExpansion - 1) / 2;
            }
        }

        return word.substring(bestStart, bestStart + bestLength);
    }

    private static int expandFromCenter(String word, int left, int right) {
        // Keep expanding outward while characters match and we're in bounds
        while (left >= 0 && right < word.length()
                && word.charAt(left) == word.charAt(right)) {
            left--;
            right++;
        }
        // When the loop exits, left and right have gone one step too far
        // Length formula: right - left - 1
        return right - left - 1;
    }

    public static void main(String[] args) {
        System.out.println(isValidPalindrome("A man, a plan, a canal: Panama")); // true
        System.out.println(isValidPalindrome("race a car"));                      // false
        System.out.println(isValidPalindrome(" "));                               // true

        System.out.println(longestPalindromicSubstring("babad"));   // "bab" or "aba"
        System.out.println(longestPalindromicSubstring("cbbd"));    // "bb"
        System.out.println(longestPalindromicSubstring("racecar")); // "racecar"
    }
}
▶ Output
true
false
true
bab
bb
racecar
🔥
Interview Gold: Expand Around Center vs Dynamic ProgrammingThe expand-around-center approach for longest palindromic substring runs in O(n²) time and O(1) space. Dynamic programming also gives O(n²) time but uses O(n²) space for its table. Unless you need to reconstruct all palindromic substrings, expand-around-center is strictly better. Manacher's algorithm gets it to O(n) but is rarely expected in interviews — knowing it exists and why is enough.

String Hashing — Catching Patterns You Can't See by Eye

Hashing a string means converting it into a number so you can compare strings in O(1) instead of O(n). The canonical application is the Rabin-Karp rolling hash algorithm for substring search, but the concept shows up any time you need to detect duplicate substrings, group anagrams, or find repeated patterns at scale.

The key idea is a rolling hash: when your window slides one character to the right, you don't recompute the entire hash from scratch. You mathematically remove the contribution of the outgoing character and add the incoming character. This keeps each slide at O(1), making the full scan O(n) regardless of pattern length.

Grouping anagrams is a softer but very common application. The trick: sort each word's characters to produce a canonical key. All anagrams of 'eat' sort to 'aet'. Store them in a HashMap> keyed by the sorted form. This is O(n * k log k) where k is the average word length — entirely practical for real word lists and comes up frequently in interview problems involving dictionaries.

AnagramGrouper.java · JAVA
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778
import java.util.*;

public class AnagramGrouper {

    /**
     * Groups a list of words so that anagrams appear together.
     * Uses a sorted-character string as a canonical hash key.
     *
     * Real-world use: autocorrect dictionaries, scrabble solvers,
     * duplicate content detection in CMS platforms.
     *
     * Time:  O(n * k log k) where n = number of words, k = average word length
     * Space: O(n * k) to store all words in the result map
     */
    public static List<List<String>> groupAnagrams(String[] words) {
        // Key: sorted characters of a word (its anagram fingerprint)
        // Value: all words that share that fingerprint
        Map<String, List<String>> anagramBuckets = new HashMap<>();

        for (String word : words) {
            // Sort the characters to create a canonical key
            // "eat", "tea", "ate" all produce "aet"
            char[] wordChars = word.toCharArray();
            Arrays.sort(wordChars);
            String canonicalKey = new String(wordChars);

            // computeIfAbsent creates a new list only when the key is new
            anagramBuckets.computeIfAbsent(canonicalKey, k -> new ArrayList<>()).add(word);
        }

        // Return all the groups as a flat list of lists
        return new ArrayList<>(anagramBuckets.values());
    }

    /**
     * Alternative fingerprint approach using a frequency-count key.
     * Avoids sorting entirely — O(n * k) overall.
     * Useful when k is large (long words) and sorting becomes the bottleneck.
     */
    public static List<List<String>> groupAnagramsLinear(String[] words) {
        Map<String, List<String>> anagramBuckets = new HashMap<>();

        for (String word : words) {
            int[] charCounts = new int[26];
            for (char ch : word.toCharArray()) {
                charCounts[ch - 'a']++;
            }

            // Build a string like "#2#0#1#..." to represent the frequency map
            // This string uniquely identifies an anagram family
            StringBuilder keyBuilder = new StringBuilder();
            for (int count : charCounts) {
                keyBuilder.append('#').append(count);
            }
            String frequencyKey = keyBuilder.toString();

            anagramBuckets.computeIfAbsent(frequencyKey, k -> new ArrayList<>()).add(word);
        }

        return new ArrayList<>(anagramBuckets.values());
    }

    public static void main(String[] args) {
        String[] wordList = {"eat", "tea", "tan", "ate", "nat", "bat"};

        List<List<String>> grouped = groupAnagrams(wordList);
        System.out.println("Sorted-key approach:");
        for (List<String> group : grouped) {
            System.out.println("  " + group);
        }

        System.out.println("\nFrequency-key approach:");
        List<List<String>> groupedLinear = groupAnagramsLinear(wordList);
        for (List<String> group : groupedLinear) {
            System.out.println("  " + group);
        }
    }
}
▶ Output
Sorted-key approach:
[eat, tea, ate]
[tan, nat]
[bat]

Frequency-key approach:
[eat, tea, ate]
[tan, nat]
[bat]
⚠️
Pro Tip: The Frequency-Key Trick Beats Sorting for Long WordsSorting each word is O(k log k). Building a frequency key is O(k). For short words (k ≤ 10) the difference is negligible. But when processing a genome sequence where 'words' can be thousands of characters long, the frequency-key approach cuts your runtime meaningfully. Always ask: 'How long can the strings be?' before choosing your approach.
PatternBest ForTime ComplexitySpace ComplexityKey Signal in Problem
Sliding Window (variable)Longest/shortest substring with constraintO(n)O(alphabet size)Problem says 'longest/shortest/minimum window'
Sliding Window (fixed)Anagram detection, fixed-length pattern matchO(n)O(1) with int[26]Problem gives a fixed pattern length
Two Pointers (inward)Palindrome check, symmetry comparisonO(n)O(1)Problem involves mirroring or reading both ends
Frequency MapCharacter count comparison, anagram groupingO(n)O(alphabet size)Problem asks 'same characters?' or 'rearrangement?'
String Hashing / Sorted KeyGrouping anagrams, duplicate substring detectionO(n * k log k)O(n * k)Problem asks to group or deduplicate by content

🎯 Key Takeaways

  • Pattern recognition before coding: identify whether the problem needs a fixed or variable window, symmetric comparison, or frequency fingerprint before touching the keyboard — the right pattern choice eliminates 80% of implementation bugs.
  • The sliding window left pointer must only ever move forward — the missing Math.max guard is the single most common sliding window bug in interview settings and production code alike.
  • int[26] is strictly better than HashMap for lowercase-only string problems: no autoboxing, no collision handling, constant-time comparison via Arrays.equals — always state this trade-off in an interview.
  • Sorting characters is the easiest anagram key but costs O(k log k) per word — the frequency-count key approach cuts it to O(k) and matters significantly when processing long strings at scale.

⚠ Common Mistakes to Avoid

  • Mistake 1: Not guarding the left pointer in variable sliding window — Symptom: the window appears to shrink correctly but then suddenly gives wrong answers for strings with distant duplicate characters (e.g. 'abba' returns 3 instead of 2) — Fix: always use windowStart = Math.max(windowStart, lastSeenAt.get(currentChar)) when updating the left pointer; the max prevents the pointer from jumping backward.
  • Mistake 2: Using String concatenation inside a loop to build results — Symptom: code is logically correct but TLEs (Time Limit Exceeded) on inputs with 10,000+ characters because each '+' creates a new String object in O(n) — Fix: always use StringBuilder for any string assembly inside a loop; call toString() once at the very end.
  • Mistake 3: Assuming int[26] works for all string problems — Symptom: ArrayIndexOutOfBoundsException or silent wrong answers when the input contains uppercase letters, digits, spaces, or Unicode characters — Fix: read the constraints before choosing your data structure; use int[128] for full ASCII or HashMap for Unicode; add Character.toLowerCase() if the problem is case-insensitive.

Interview Questions on This Topic

  • QGiven a string s and a pattern p, find the minimum window substring of s that contains all characters of p, including duplicates. Walk me through your approach before writing any code.
  • QHow would you determine if two strings are anagrams of each other in O(n) time and O(1) space? What constraint on the input makes O(1) space possible, and when does that assumption break?
  • QYou've solved 'longest substring without repeating characters' using a HashMap. Your interviewer asks you to do it with O(1) space. Is that possible? What trade-off would you make, and what constraint on the input would you need?

Frequently Asked Questions

What is the sliding window technique in string problems?

The sliding window technique maintains a contiguous substring defined by two pointers (left and right). Instead of restarting from scratch for each position, you slide the window forward by adding one character on the right and optionally removing one on the left, updating your state incrementally. This reduces most O(n²) substring problems to O(n).

How do I check if two strings are anagrams in Java?

The fastest O(n) approach for lowercase-only strings is to build an int[26] frequency array for each string — increment for the first string, decrement for the second — then check that all 26 values are zero. If any entry is non-zero, the strings aren't anagrams. Alternatively, sort both strings and compare with equals(), but that's O(n log n).

When should I use two pointers vs sliding window for string problems?

Use two pointers moving inward from both ends when the problem involves symmetry, palindromes, or comparing characters from opposite sides of the string. Use a sliding window when the problem involves a contiguous segment (substring) and asks you to find the longest, shortest, or all such segments satisfying a constraint. Many problems can be phrased either way — the key is whether you're comparing ends (two pointers) or tracking a region (sliding window).

🔥
TheCodeForge Editorial Team Verified Author

Written and reviewed by senior developers with real-world experience across enterprise, startup and open-source projects. Every article on TheCodeForge is written to be clear, accurate and genuinely useful — not just SEO filler.

← PreviousDutch National Flag AlgorithmNext →Anagram and Palindrome Problems
Forged with 🔥 at TheCodeForge.io — Where Developers Are Forged