String Manipulation Patterns in DSA — The Definitive Guide
Strings are everywhere in software. Search engines scan billions of words per second. Your bank checks that your account number matches a pattern. Autocomplete predicts the next word you're about to type. Every one of those features is built on a small set of string manipulation patterns that engineers reach for again and again. Understanding them doesn't just make you better at LeetCode — it makes you a sharper programmer in production code.
The core problem with naive string processing is that it's deceptively slow. It's tempting to write nested loops that check every possible substring, but that approach scales terribly. A string with 10,000 characters processed naively can require 100 million comparisons. The patterns covered here — sliding window, two pointers, frequency maps, and hashing — reduce most of those problems to a single linear pass, or at worst O(n log n).
By the end of this article you'll be able to identify which pattern fits a given string problem before you write a single line of code, implement each one cleanly in Java, dodge the subtle bugs that trip up even experienced developers, and walk into an interview knowing exactly why your solution is optimal — not just that it works.
The Sliding Window Pattern — Scan Once, Answer Fast
The sliding window pattern is your go-to whenever a problem asks about a contiguous substring — 'find the longest substring with no repeating characters', 'find the smallest window containing all target characters', or 'find every anagram of a pattern in a string'. The insight is simple: instead of restarting your scan from scratch every time, you maintain a window defined by two boundary pointers and slide it forward, updating your state incrementally.
Think of it like reading through a book with a sticky-note bookmark that stretches. When the text inside the bookmark violates your rule, you shrink the left edge. When it's fine, you extend the right edge. You never go backward.
There are two flavours: fixed-size windows (the window length never changes — useful for anagram detection) and variable-size windows (the window grows and shrinks based on a condition — useful for 'longest/shortest substring that satisfies X'). Knowing which flavour to use is half the battle. Fixed window: the problem gives you a target length. Variable window: the problem asks you to find a length.
import java.util.HashMap; import java.util.Map; public class LongestUniqueSubstring { /** * Finds the length of the longest substring without repeating characters. * Classic variable-size sliding window problem. * * Time: O(n) — each character is visited at most twice (once by right, once by left) * Space: O(min(n, alphabet)) — the map holds at most one entry per unique character */ public static int findLongestUniqueSubstring(String input) { // Maps each character to the index AFTER its last known position. // Storing index+1 lets us jump the left pointer forward in one step. Map<Character, Integer> lastSeenAt = new HashMap<>(); int maxLength = 0; int windowStart = 0; // left edge of our sliding window for (int windowEnd = 0; windowEnd < input.length(); windowEnd++) { char currentChar = input.charAt(windowEnd); // If this character already exists inside our current window, // move the left edge just past its previous position so the // window no longer contains the duplicate. if (lastSeenAt.containsKey(currentChar)) { // Math.max prevents the window from moving BACKWARD // if the duplicate was outside the current window. windowStart = Math.max(windowStart, lastSeenAt.get(currentChar)); } // Record this character's "next safe position" for the left pointer lastSeenAt.put(currentChar, windowEnd + 1); // Current window length = windowEnd - windowStart + 1 maxLength = Math.max(maxLength, windowEnd - windowStart + 1); } return maxLength; } public static void main(String[] args) { System.out.println(findLongestUniqueSubstring("abcabcbb")); // 3 → "abc" System.out.println(findLongestUniqueSubstring("bbbbb")); // 1 → "b" System.out.println(findLongestUniqueSubstring("pwwkew")); // 3 → "wke" System.out.println(findLongestUniqueSubstring("")); // 0 → empty string System.out.println(findLongestUniqueSubstring("abcdefg")); // 7 → whole string } }
1
3
0
7
Frequency Maps — When You Care About Character Counts, Not Positions
A frequency map (also called a character count array or histogram) is a data structure that counts how many times each character appears. It transforms a string into a numerical fingerprint. Two strings with identical fingerprints are anagrams of each other. That's powerful.
For strings restricted to lowercase English letters, you can use an int[26] array instead of a HashMap. Array access is O(1) with no hashing overhead, and comparing two int[26] arrays takes exactly 26 comparisons — constant time regardless of string length. This is a significant practical speedup that interviewers love to hear you mention.
Frequency maps unlock the fixed-size sliding window approach for anagram problems. You precompute the frequency map of the pattern, then slide a same-length window across the text. Each slide, you add one character and remove one character from your running frequency map. When the running map matches the pattern map, you've found an anagram. One pass. O(n) time.
import java.util.ArrayList; import java.util.Arrays; import java.util.List; public class AnagramFinder { /** * Returns all starting indices where an anagram of 'pattern' begins in 'text'. * Uses a fixed-size sliding window with an int[26] frequency map. * * Time: O(n) where n = text.length() * Space: O(1) — the two int[26] arrays are constant size regardless of input */ public static List<Integer> findAnagramStartIndices(String text, String pattern) { List<Integer> resultIndices = new ArrayList<>(); // Edge case: pattern can't fit inside text if (pattern.length() > text.length()) { return resultIndices; } int windowSize = pattern.length(); // Build the frequency fingerprint for the target pattern int[] patternFrequency = new int[26]; for (char ch : pattern.toCharArray()) { patternFrequency[ch - 'a']++; // 'a'=0, 'b'=1, ..., 'z'=25 } // Build the frequency fingerprint for the first window in text int[] windowFrequency = new int[26]; for (int i = 0; i < windowSize; i++) { windowFrequency[text.charAt(i) - 'a']++; } // Check the first window before we start sliding if (Arrays.equals(patternFrequency, windowFrequency)) { resultIndices.add(0); } // Slide the window one character at a time for (int windowEnd = windowSize; windowEnd < text.length(); windowEnd++) { // Add the new character entering the right side of the window windowFrequency[text.charAt(windowEnd) - 'a']++; // Remove the character leaving the left side of the window int windowStart = windowEnd - windowSize; windowFrequency[text.charAt(windowStart) - 'a']--; // Arrays.equals on two int[26] arrays is 26 comparisons — O(1) if (Arrays.equals(patternFrequency, windowFrequency)) { resultIndices.add(windowStart + 1); // +1 because new window starts one ahead } } return resultIndices; } public static void main(String[] args) { System.out.println(findAnagramStartIndices("cbaebabacd", "abc")); // [0, 6] System.out.println(findAnagramStartIndices("abab", "ab")); // [0, 1, 2] System.out.println(findAnagramStartIndices("af", "be")); // [] } }
[0, 1, 2]
[]
Two Pointers on Strings — The Palindrome and Reverse Toolkit
Two pointers on a string means placing one pointer at the start and one at the end, then marching them toward each other. It's perfect for problems involving symmetry (palindromes), reversals, or comparisons from both ends simultaneously.
The reason this pattern exists is that many string properties are inherently symmetric. A palindrome reads the same forwards and backwards. The minimum number of deletions to make a string a palindrome depends on mismatches at mirrored positions. Both of these are naturally expressed as comparisons between a left and right pointer moving inward.
Two pointers also pairs beautifully with other patterns. You can use two pointers on the result of a frequency map to reconstruct strings. You can use them inside a sliding window as the window boundaries themselves. Once you're comfortable with each pattern in isolation, start noticing when problems require you to combine two of them — that's where intermediate-level solutions start to look elegant rather than brute-force.
public class PalindromeChecker { /** * Checks if a string is a valid palindrome, considering only * alphanumeric characters and ignoring case. * Real-world variant of the classic problem — matches how * "A man, a plan, a canal: Panama" is considered a palindrome. * * Time: O(n) * Space: O(1) — no extra data structures, just two integer pointers */ public static boolean isValidPalindrome(String sentence) { int leftPointer = 0; int rightPointer = sentence.length() - 1; while (leftPointer < rightPointer) { // Skip non-alphanumeric characters from the left while (leftPointer < rightPointer && !Character.isLetterOrDigit(sentence.charAt(leftPointer))) { leftPointer++; } // Skip non-alphanumeric characters from the right while (leftPointer < rightPointer && !Character.isLetterOrDigit(sentence.charAt(rightPointer))) { rightPointer--; } // Compare characters at both pointers, case-insensitive char leftChar = Character.toLowerCase(sentence.charAt(leftPointer)); char rightChar = Character.toLowerCase(sentence.charAt(rightPointer)); if (leftChar != rightChar) { return false; // Mismatch — definitely not a palindrome } // Both characters matched — move both pointers inward leftPointer++; rightPointer--; } // All mirrored pairs matched return true; } /** * Finds the length of the longest palindromic substring using * the "expand around center" technique — a two-pointer expansion. * Handles both odd-length ("racecar") and even-length ("abba") palindromes. * * Time: O(n²) — each of the 2n-1 centers expands up to n/2 times * Space: O(1) */ public static String longestPalindromicSubstring(String word) { if (word == null || word.isEmpty()) return ""; int bestStart = 0; int bestLength = 1; for (int centerIndex = 0; centerIndex < word.length(); centerIndex++) { // Expand for ODD-length palindromes (single character center) int oddLength = expandFromCenter(word, centerIndex, centerIndex); // Expand for EVEN-length palindromes (gap between two characters) int evenLength = expandFromCenter(word, centerIndex, centerIndex + 1); int longerExpansion = Math.max(oddLength, evenLength); if (longerExpansion > bestLength) { bestLength = longerExpansion; // Reverse-calculate where this palindrome starts bestStart = centerIndex - (longerExpansion - 1) / 2; } } return word.substring(bestStart, bestStart + bestLength); } private static int expandFromCenter(String word, int left, int right) { // Keep expanding outward while characters match and we're in bounds while (left >= 0 && right < word.length() && word.charAt(left) == word.charAt(right)) { left--; right++; } // When the loop exits, left and right have gone one step too far // Length formula: right - left - 1 return right - left - 1; } public static void main(String[] args) { System.out.println(isValidPalindrome("A man, a plan, a canal: Panama")); // true System.out.println(isValidPalindrome("race a car")); // false System.out.println(isValidPalindrome(" ")); // true System.out.println(longestPalindromicSubstring("babad")); // "bab" or "aba" System.out.println(longestPalindromicSubstring("cbbd")); // "bb" System.out.println(longestPalindromicSubstring("racecar")); // "racecar" } }
false
true
bab
bb
racecar
String Hashing — Catching Patterns You Can't See by Eye
Hashing a string means converting it into a number so you can compare strings in O(1) instead of O(n). The canonical application is the Rabin-Karp rolling hash algorithm for substring search, but the concept shows up any time you need to detect duplicate substrings, group anagrams, or find repeated patterns at scale.
The key idea is a rolling hash: when your window slides one character to the right, you don't recompute the entire hash from scratch. You mathematically remove the contribution of the outgoing character and add the incoming character. This keeps each slide at O(1), making the full scan O(n) regardless of pattern length.
Grouping anagrams is a softer but very common application. The trick: sort each word's characters to produce a canonical key. All anagrams of 'eat' sort to 'aet'. Store them in a HashMap
import java.util.*; public class AnagramGrouper { /** * Groups a list of words so that anagrams appear together. * Uses a sorted-character string as a canonical hash key. * * Real-world use: autocorrect dictionaries, scrabble solvers, * duplicate content detection in CMS platforms. * * Time: O(n * k log k) where n = number of words, k = average word length * Space: O(n * k) to store all words in the result map */ public static List<List<String>> groupAnagrams(String[] words) { // Key: sorted characters of a word (its anagram fingerprint) // Value: all words that share that fingerprint Map<String, List<String>> anagramBuckets = new HashMap<>(); for (String word : words) { // Sort the characters to create a canonical key // "eat", "tea", "ate" all produce "aet" char[] wordChars = word.toCharArray(); Arrays.sort(wordChars); String canonicalKey = new String(wordChars); // computeIfAbsent creates a new list only when the key is new anagramBuckets.computeIfAbsent(canonicalKey, k -> new ArrayList<>()).add(word); } // Return all the groups as a flat list of lists return new ArrayList<>(anagramBuckets.values()); } /** * Alternative fingerprint approach using a frequency-count key. * Avoids sorting entirely — O(n * k) overall. * Useful when k is large (long words) and sorting becomes the bottleneck. */ public static List<List<String>> groupAnagramsLinear(String[] words) { Map<String, List<String>> anagramBuckets = new HashMap<>(); for (String word : words) { int[] charCounts = new int[26]; for (char ch : word.toCharArray()) { charCounts[ch - 'a']++; } // Build a string like "#2#0#1#..." to represent the frequency map // This string uniquely identifies an anagram family StringBuilder keyBuilder = new StringBuilder(); for (int count : charCounts) { keyBuilder.append('#').append(count); } String frequencyKey = keyBuilder.toString(); anagramBuckets.computeIfAbsent(frequencyKey, k -> new ArrayList<>()).add(word); } return new ArrayList<>(anagramBuckets.values()); } public static void main(String[] args) { String[] wordList = {"eat", "tea", "tan", "ate", "nat", "bat"}; List<List<String>> grouped = groupAnagrams(wordList); System.out.println("Sorted-key approach:"); for (List<String> group : grouped) { System.out.println(" " + group); } System.out.println("\nFrequency-key approach:"); List<List<String>> groupedLinear = groupAnagramsLinear(wordList); for (List<String> group : groupedLinear) { System.out.println(" " + group); } } }
[eat, tea, ate]
[tan, nat]
[bat]
Frequency-key approach:
[eat, tea, ate]
[tan, nat]
[bat]
| Pattern | Best For | Time Complexity | Space Complexity | Key Signal in Problem |
|---|---|---|---|---|
| Sliding Window (variable) | Longest/shortest substring with constraint | O(n) | O(alphabet size) | Problem says 'longest/shortest/minimum window' |
| Sliding Window (fixed) | Anagram detection, fixed-length pattern match | O(n) | O(1) with int[26] | Problem gives a fixed pattern length |
| Two Pointers (inward) | Palindrome check, symmetry comparison | O(n) | O(1) | Problem involves mirroring or reading both ends |
| Frequency Map | Character count comparison, anagram grouping | O(n) | O(alphabet size) | Problem asks 'same characters?' or 'rearrangement?' |
| String Hashing / Sorted Key | Grouping anagrams, duplicate substring detection | O(n * k log k) | O(n * k) | Problem asks to group or deduplicate by content |
🎯 Key Takeaways
- Pattern recognition before coding: identify whether the problem needs a fixed or variable window, symmetric comparison, or frequency fingerprint before touching the keyboard — the right pattern choice eliminates 80% of implementation bugs.
- The sliding window left pointer must only ever move forward — the missing Math.max guard is the single most common sliding window bug in interview settings and production code alike.
- int[26] is strictly better than HashMap for lowercase-only string problems: no autoboxing, no collision handling, constant-time comparison via Arrays.equals — always state this trade-off in an interview.
- Sorting characters is the easiest anagram key but costs O(k log k) per word — the frequency-count key approach cuts it to O(k) and matters significantly when processing long strings at scale.
⚠ Common Mistakes to Avoid
- ✕Mistake 1: Not guarding the left pointer in variable sliding window — Symptom: the window appears to shrink correctly but then suddenly gives wrong answers for strings with distant duplicate characters (e.g. 'abba' returns 3 instead of 2) — Fix: always use windowStart = Math.max(windowStart, lastSeenAt.get(currentChar)) when updating the left pointer; the max prevents the pointer from jumping backward.
- ✕Mistake 2: Using String concatenation inside a loop to build results — Symptom: code is logically correct but TLEs (Time Limit Exceeded) on inputs with 10,000+ characters because each '+' creates a new String object in O(n) — Fix: always use StringBuilder for any string assembly inside a loop; call toString() once at the very end.
- ✕Mistake 3: Assuming int[26] works for all string problems — Symptom: ArrayIndexOutOfBoundsException or silent wrong answers when the input contains uppercase letters, digits, spaces, or Unicode characters — Fix: read the constraints before choosing your data structure; use int[128] for full ASCII or HashMap
for Unicode; add Character.toLowerCase() if the problem is case-insensitive.
Interview Questions on This Topic
- QGiven a string s and a pattern p, find the minimum window substring of s that contains all characters of p, including duplicates. Walk me through your approach before writing any code.
- QHow would you determine if two strings are anagrams of each other in O(n) time and O(1) space? What constraint on the input makes O(1) space possible, and when does that assumption break?
- QYou've solved 'longest substring without repeating characters' using a HashMap. Your interviewer asks you to do it with O(1) space. Is that possible? What trade-off would you make, and what constraint on the input would you need?
Frequently Asked Questions
What is the sliding window technique in string problems?
The sliding window technique maintains a contiguous substring defined by two pointers (left and right). Instead of restarting from scratch for each position, you slide the window forward by adding one character on the right and optionally removing one on the left, updating your state incrementally. This reduces most O(n²) substring problems to O(n).
How do I check if two strings are anagrams in Java?
The fastest O(n) approach for lowercase-only strings is to build an int[26] frequency array for each string — increment for the first string, decrement for the second — then check that all 26 values are zero. If any entry is non-zero, the strings aren't anagrams. Alternatively, sort both strings and compare with equals(), but that's O(n log n).
When should I use two pointers vs sliding window for string problems?
Use two pointers moving inward from both ends when the problem involves symmetry, palindromes, or comparing characters from opposite sides of the string. Use a sliding window when the problem involves a contiguous segment (substring) and asks you to find the longest, shortest, or all such segments satisfying a constraint. Many problems can be phrased either way — the key is whether you're comparing ends (two pointers) or tracking a region (sliding window).
Written and reviewed by senior developers with real-world experience across enterprise, startup and open-source projects. Every article on TheCodeForge is written to be clear, accurate and genuinely useful — not just SEO filler.