Trie Data Structure — Efficient Prefix Search and Autocomplete
- Tries share prefixes among stored strings, giving O(m) insert/search where m is word length, independent of the number of words.
- The startsWith operation makes tries ideal for autocomplete, prefix search, and IP routing.
- A trie with n words and average length m uses O(n*m) space in the worst case but benefits from prefix sharing.
Imagine a giant library where every book is stored not by its full title, but letter by letter on branching shelves. The shelf 'A' branches into 'AP', 'AR', 'AT', and so on. To find 'APPLE' you walk A → P → P → L → E — you never check any shelf that doesn't start with 'A'. That branching shelf system is a Trie. It's built for one thing: blazing-fast prefix lookups, the same trick your phone uses when it suggests the next word as you type.
Every time you type a search query into Google and it completes your thought before you finish, or your IDE suggests a method name after two keystrokes, a prefix tree — a Trie — is almost certainly doing the heavy lifting underneath. It's one of those data structures that feels like a party trick until you actually need it, at which point nothing else comes close. At scale, the difference between a HashMap-based word lookup and a Trie isn't academic — it's the difference between searching a million-word dictionary in proportional time versus constant time relative to the word length alone.
The core problem Tries solve is that hash-based structures are terrible at prefix queries. A HashMap will tell you in O(1) whether 'apple' exists, but ask it 'give me every word that starts with app' and it has to scan every key. A sorted array lets you binary-search for a prefix range, but insertion is O(n). A Trie sidesteps both problems by structuring data so that sharing a prefix literally means sharing nodes in memory — the words 'apple', 'application', and 'apply' all travel through the same A→P→P path before diverging.
By the end of this article you'll understand exactly how a Trie node is laid out in memory, why the O(L) complexity claim deserves some scepticism in practice, how to implement a production-quality Trie with autocomplete in Java, how to compress a Trie into a Radix Tree when memory is tight, and the exact mistakes that burn engineers in interviews and on the job. Let's build it from scratch.
What is a Trie? — Plain English
A Trie (pronounced 'try', from re-TRIE-val) is a tree-shaped data structure designed for storing strings where each edge represents a single character. Unlike a hash map that stores entire strings, a trie shares prefixes among all strings that start with them. For example, 'apple', 'app', and 'apply' all share the path A→P→P. This makes prefix searches extremely fast. Real-world uses include: autocomplete (type 'app' and get all completions), spell checkers, IP routing tables (CIDR prefix matching), and dictionary implementations. A node in a trie has up to 26 children (for lowercase letters), a flag indicating whether this node ends a complete word, and optionally a count for how many words pass through it.
How Trie Works — Step by Step
Operations on a trie containing words 'apple', 'app', 'apply':
Insert 'app': 1. Start at root. Create/follow child 'a'. Create/follow child 'p'. Create/follow child 'p'. Mark this node as end_of_word = True.
Insert 'apple': 2. Start at root. Follow 'a'→'p'→'p' (existing nodes). Create child 'l'. Create child 'e'. Mark 'e' node as end_of_word = True.
Search 'apple': 3. Follow a→p→p→l→e. Reach 'e' node. end_of_word = True → return True.
Search 'ap': 4. Follow a→p. Reach 'p' node. end_of_word = False → word 'ap' not in trie → return False.
startsWith 'app': 5. Follow a→p→p. Reach node. It exists → return True (there is a word starting with 'app').
Each operation is O(m) where m is the word length, regardless of how many words are stored.
Worked Example — Autocomplete
Trie contains: ['car','card','care','careful','cars','cat'].
startsWith 'car' → follow c→a→r. Node exists. To list all completions, DFS from this node: - At 'r' node: end_of_word=True → yield 'car'. - Go to child 'd': end_of_word=True → yield 'card'. - Backtrack. Go to child 'e': end_of_word=True → yield 'care'. Go to child 'f'→'u'→'l': yield 'careful'. - Backtrack to 'r'. Go to child 's': yield 'cars'.
All completions: ['car','card','care','careful','cars'].
startsWith 'cat' → c→a→t exists. DFS: yields 'cat'. Completions: ['cat'].
startsWith 'dog' → root has no 'd' child → return False immediately. No completions.
Time: O(p + n) where p is the prefix length and n is the number of completions found.
Implementation
A TrieNode stores a dictionary of children (char → TrieNode) and a boolean is_end. Insertion walks the trie creating nodes as needed and marks the final node. Search walks and returns True only if the path exists and the final node is marked. The startsWith operation is the same walk but ignores is_end. For autocomplete, DFS from the prefix endpoint collecting all paths that reach is_end nodes.
class TrieNode: def __init__(self): self.children = {} # char -> TrieNode self.is_end = False class Trie: def __init__(self): self.root = TrieNode() def insert(self, word): node = self.root for ch in word: if ch not in node.children: node.children[ch] = TrieNode() node = node.children[ch] node.is_end = True def search(self, word): node = self.root for ch in word: if ch not in node.children: return False node = node.children[ch] return node.is_end def starts_with(self, prefix): node = self.root for ch in prefix: if ch not in node.children: return False node = node.children[ch] return True def autocomplete(self, prefix): node = self.root for ch in prefix: if ch not in node.children: return [] node = node.children[ch] results = [] self._dfs(node, prefix, results) return results def _dfs(self, node, current, results): if node.is_end: results.append(current) for ch, child in node.children.items(): self._dfs(child, current + ch, results) # Example t = Trie() for w in ['car','card','care','careful','cars','cat']: t.insert(w) print(t.search('care')) # True print(t.search('ca')) # False print(t.starts_with('car')) # True print(t.autocomplete('car')) # ['car','card','care','careful','cars']
False
True
['car', 'card', 'care', 'careful', 'cars']
| Feature | Trie (Prefix Tree) | HashMap | Balanced BST |
|---|---|---|---|
| Search Complexity | O(L) - Word Length | O(1) - Average | O(L * log N) |
| Prefix Search | O(L + K) - Highly Efficient | O(N) - Must scan all keys | O(L * log N) |
| Memory Usage | High (Multiple pointers) | Moderate | Low |
| Best Use Case | Autocomplete, Spell Check | Exact point lookups | General-purpose sorting |
🎯 Key Takeaways
- Tries share prefixes among stored strings, giving O(m) insert/search where m is word length, independent of the number of words.
- The startsWith operation makes tries ideal for autocomplete, prefix search, and IP routing.
- A trie with n words and average length m uses O(n*m) space in the worst case but benefits from prefix sharing.
⚠ Common Mistakes to Avoid
Interview Questions on This Topic
- QWhat is the time complexity of trie insert?
- QHow does startsWith differ from search in a trie?
- QWhen would you use a trie over a hash map?
Frequently Asked Questions
What is the time and space complexity of a trie?
Time: O(m) for insert, search, and startsWith where m is the word length. Space: O(ALPHABET_SIZE m n) in the worst case where n is number of words and m is average word length. In practice, shared prefixes reduce space significantly. A compressed trie (Patricia trie) merges single-child chains to save space.
How does a trie compare to a hash map for string storage?
Hash map offers O(m) insert/search with O(1) average lookup but cannot answer prefix queries. A trie also gives O(m) operations and additionally supports prefix search, auto-complete, and ordered traversal (alphabetical order via DFS). Hash maps win on memory for random strings with no shared prefixes.
What is a compressed trie (Patricia tree)?
A compressed trie merges chains of single-child nodes into a single edge labeled with the entire substring. This reduces the number of nodes from O(sum of word lengths) to O(number of words). Suffix trees are compressed tries of all suffixes of a string, enabling O(m) substring search in a text of length n.
Developer and founder of TheCodeForge. I built this site because I was tired of tutorials that explain what to type without explaining why it works. Every article here is written to make concepts actually click.