
Trie Data Structure — Efficient Prefix Search and Autocomplete

📍 Part of: Trees → Topic 10 of 15
Master the Trie (prefix tree) data structure: learn insert, search, startsWith operations with step-by-step examples and Python implementation.
🔥 Advanced — solid DSA foundation required
In this tutorial, you'll learn
  • Tries share prefixes among stored strings, giving O(m) insert/search where m is word length, independent of the number of words.
  • The startsWith operation makes tries ideal for autocomplete, prefix search, and IP routing.
  • A trie with n words and average length m uses O(n*m) space in the worst case but benefits from prefix sharing.
Quick Answer

Imagine a giant library where every book is stored not by its full title, but letter by letter on branching shelves. The shelf 'A' branches into 'AP', 'AR', 'AT', and so on. To find 'APPLE' you walk A → P → P → L → E — you never check any shelf that doesn't start with 'A'. That branching shelf system is a Trie. It's built for one thing: blazing-fast prefix lookups, the same trick your phone uses when it suggests the next word as you type.

Every time you type a search query into Google and it completes your thought before you finish, or your IDE suggests a method name after two keystrokes, a prefix tree — a Trie — is almost certainly doing the heavy lifting underneath. It's one of those data structures that feels like a party trick until you actually need it, at which point nothing else comes close. At scale the difference isn't academic: to answer a prefix query, a HashMap-based lookup must scan every key in a million-word dictionary, while a Trie does work proportional to the prefix length alone.

The core problem Tries solve is that hash-based structures are terrible at prefix queries. A HashMap will tell you in O(1) whether 'apple' exists, but ask it 'give me every word that starts with app' and it has to scan every key. A sorted array lets you binary-search for a prefix range, but insertion is O(n). A Trie sidesteps both problems by structuring data so that sharing a prefix literally means sharing nodes in memory — the words 'apple', 'application', and 'apply' all travel through the same A→P→P path before diverging.
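To make that contrast concrete, here is the scan a hash-based structure is forced into, sketched with Python's built-in set (the function name is an illustrative choice):

```python
# Hash-based prefix query: no choice but to check every stored key.
words = {"apple", "application", "apply", "banana", "band"}

def prefix_scan(words, prefix):
    """O(n * m): tests each of the n stored words against the prefix."""
    return sorted(w for w in words if w.startswith(prefix))

print(prefix_scan(words, "app"))  # ['apple', 'application', 'apply']
print("apple" in words)           # True -- the exact lookup sets excel at
```

The exact-membership check stays cheap, but the prefix query degrades linearly with the dictionary size — precisely the gap a trie closes.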

By the end of this article you'll understand exactly how a Trie node is laid out in memory, why the O(L) complexity claim deserves some scepticism in practice, how to implement a production-quality Trie with autocomplete in Python, how to compress a Trie into a Radix Tree when memory is tight, and the exact mistakes that burn engineers in interviews and on the job. Let's build it from scratch.

What is a Trie? — Plain English

A Trie (pronounced 'try', from re-TRIE-val) is a tree-shaped data structure designed for storing strings where each edge represents a single character. Unlike a hash map that stores entire strings, a trie shares prefixes among all strings that start with them. For example, 'apple', 'app', and 'apply' all share the path A→P→P. This makes prefix searches extremely fast. Real-world uses include: autocomplete (type 'app' and get all completions), spell checkers, IP routing tables (CIDR prefix matching), and dictionary implementations. A node in a trie has up to 26 children (for lowercase letters), a flag indicating whether this node ends a complete word, and optionally a count for how many words pass through it.
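That node layout can be sketched directly, assuming a lowercase-only alphabet; the class name and the optional pass_count field are illustrative choices, not a standard API:

```python
class ArrayTrieNode:
    """Fixed-alphabet trie node: one child slot per lowercase letter."""
    def __init__(self):
        self.children = [None] * 26   # index 0 = 'a', ..., 25 = 'z'
        self.is_end = False           # does a complete word end here?
        self.pass_count = 0           # optional: words passing through this node

    def child(self, ch):
        """Return the child for ch, or None if absent."""
        return self.children[ord(ch) - ord('a')]

    def ensure_child(self, ch):
        """Create the child for ch if needed, then return it."""
        i = ord(ch) - ord('a')
        if self.children[i] is None:
            self.children[i] = ArrayTrieNode()
        return self.children[i]
```

The array gives O(1) child access at the cost of 26 pointers per node even when most are empty; the dict-based node used later in this article trades a little speed for memory proportional to the children that actually exist.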

How Trie Works — Step by Step

1. Insert 'app': start at root. Create/follow child 'a', then 'p', then 'p'. Mark this node as end_of_word = True.

2. Insert 'apple': start at root. Follow 'a'→'p'→'p' (existing nodes). Create child 'l', then child 'e'. Mark the 'e' node as end_of_word = True.

3. Search 'apple': follow a→p→p→l→e. Reach the 'e' node. end_of_word = True → return True.

4. Search 'ap': follow a→p. Reach the 'p' node. end_of_word = False → word 'ap' not in trie → return False.

5. startsWith 'app': follow a→p→p. The node exists → return True (some word starts with 'app').

Each operation is O(m) where m is the word length, regardless of how many words are stored.
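The five steps can be traced with a throwaway sketch built from plain nested dicts; the '$' end-of-word marker is a convention chosen for this sketch, not anything standard:

```python
# Minimal nested-dict trie to trace the steps above.
# '$' marks end-of-word (a convention of this sketch only).
root = {}

def insert(word):
    node = root
    for ch in word:
        node = node.setdefault(ch, {})  # create/follow child
    node['$'] = True                    # mark end_of_word

def search(word):
    node = root
    for ch in word:
        if ch not in node:
            return False
        node = node[ch]
    return '$' in node                  # path exists AND is a complete word

def starts_with(prefix):
    node = root
    for ch in prefix:
        if ch not in node:
            return False
        node = node[ch]
    return True                         # path exists; end marker irrelevant

insert('app')              # step 1: creates a -> p -> p, marks end
insert('apple')            # step 2: reuses a -> p -> p, adds l -> e
print(search('apple'))     # True  (step 3)
print(search('ap'))        # False (step 4: node exists but not end-of-word)
print(starts_with('app'))  # True  (step 5)
```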

Worked Example — Autocomplete

Trie contains: ['car','card','care','careful','cars','cat'].

startsWith 'car' → follow c→a→r. The node exists. To list all completions, DFS from this node:
  • At the 'r' node: end_of_word=True → yield 'car'.
  • Go to child 'd': end_of_word=True → yield 'card'.
  • Backtrack; go to child 'e': end_of_word=True → yield 'care'. Continue to child 'f'→'u'→'l': yield 'careful'.
  • Backtrack to 'r'; go to child 's': yield 'cars'.

All completions: ['car','card','care','careful','cars'].

startsWith 'cat' → c→a→t exists. DFS: yields 'cat'. Completions: ['cat'].

startsWith 'dog' → root has no 'd' child → return False immediately. No completions.

Time: O(p + k), where p is the prefix length and k is the size of the subtree under the prefix node (at least the number of completions found, since the DFS visits every node below the prefix).

Implementation

A TrieNode stores a dictionary of children (char → TrieNode) and a boolean is_end. Insertion walks the trie creating nodes as needed and marks the final node. Search walks and returns True only if the path exists and the final node is marked. The startsWith operation is the same walk but ignores is_end. For autocomplete, DFS from the prefix endpoint collecting all paths that reach is_end nodes.

trie.py · PYTHON
class TrieNode:
    def __init__(self):
        self.children = {}   # char -> TrieNode
        self.is_end = False

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for ch in word:
            if ch not in node.children:
                node.children[ch] = TrieNode()
            node = node.children[ch]
        node.is_end = True

    def search(self, word):
        node = self.root
        for ch in word:
            if ch not in node.children:
                return False
            node = node.children[ch]
        return node.is_end

    def starts_with(self, prefix):
        node = self.root
        for ch in prefix:
            if ch not in node.children:
                return False
            node = node.children[ch]
        return True

    def autocomplete(self, prefix):
        node = self.root
        for ch in prefix:
            if ch not in node.children:
                return []
            node = node.children[ch]
        results = []
        self._dfs(node, prefix, results)
        return results

    def _dfs(self, node, current, results):
        if node.is_end:
            results.append(current)
        for ch, child in node.children.items():
            self._dfs(child, current + ch, results)

# Example
t = Trie()
for w in ['car','card','care','careful','cars','cat']:
    t.insert(w)
print(t.search('care'))       # True
print(t.search('ca'))         # False
print(t.starts_with('car'))   # True
print(t.autocomplete('car'))  # ['car','card','care','careful','cars']
▶ Output
True
False
True
['car', 'card', 'care', 'careful', 'cars']
| Feature | Trie (Prefix Tree) | HashMap | Balanced BST |
|---|---|---|---|
| Search complexity | O(L), L = word length | O(L) average (hashing the key; often quoted as O(1)) | O(L log N) |
| Prefix search | O(L + K), K = matches (highly efficient) | O(N), must scan all keys | O(L log N + K) |
| Memory usage | High (many child pointers) | Moderate | Low |
| Best use case | Autocomplete, spell check | Exact point lookups | Ordered data, range queries |

🎯 Key Takeaways

  • Tries share prefixes among stored strings, giving O(m) insert/search where m is word length, independent of the number of words.
  • The startsWith operation makes tries ideal for autocomplete, prefix search, and IP routing.
  • A trie with n words and average length m uses O(n*m) space in the worst case but benefits from prefix sharing.

⚠ Common Mistakes to Avoid

  • Ignoring character set constraints — assuming an array of 26 is enough when the input contains uppercase, digits, or symbols will lead to IndexOutOfBoundsExceptions. Always normalize input or use a Map.

  • Forgetting the 'isEndOfWord' flag — if you don't mark the end of a word, searching for 'app' will return true if 'apple' exists, even if 'app' was never explicitly added.

  • Recursive deletions without cleanup — when deleting a word, you must backtrack and delete nodes that are no longer part of any other word to prevent 'memory leaks' within the tree structure.
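One way to sketch that deletion cleanup, using a small self-contained dict-based trie (the standalone function names are illustrative, not part of the article's class above):

```python
class TrieNode:
    def __init__(self):
        self.children = {}   # char -> TrieNode
        self.is_end = False

def insert(root, word):
    node = root
    for ch in word:
        node = node.children.setdefault(ch, TrieNode())
    node.is_end = True

def search(root, word):
    node = root
    for ch in word:
        node = node.children.get(ch)
        if node is None:
            return False
    return node.is_end

def delete(root, word):
    """Unmark word; prune nodes no other word still needs."""
    def _delete(node, depth):
        if depth == len(word):
            if not node.is_end:
                return False               # word absent: nothing to prune
            node.is_end = False
            return not node.children       # leaf -> tell parent to drop us
        child = node.children.get(word[depth])
        if child is None:
            return False                   # word absent
        if _delete(child, depth + 1):
            del node.children[word[depth]]  # the cleanup step
            # prune this node too, unless it ends another word or branches
            return not node.is_end and not node.children
        return False
    _delete(root, 0)

root = TrieNode()
for w in ('app', 'apple'):
    insert(root, w)
delete(root, 'apple')
print(search(root, 'apple'))  # False
print(search(root, 'app'))    # True -- still intact
```

After the delete, the 'l' and 'e' nodes are gone, but the a→p→p chain survives because 'app' still ends there.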

Interview Questions on This Topic

  • Q: What is the time complexity of trie insert?
  • Q: How does startsWith differ from search in a trie?
  • Q: When would you use a trie over a hash map?

Frequently Asked Questions

What is the time and space complexity of a trie?

Time: O(m) for insert, search, and startsWith where m is the word length. Space: O(ALPHABET_SIZE × m × n) in the worst case where n is the number of words and m is average word length. In practice, shared prefixes reduce space significantly. A compressed trie (Patricia trie) merges single-child chains to save space.
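A quick way to see the prefix sharing concretely is to count nodes versus characters for three words with a common prefix (a throwaway sketch):

```python
# Shared prefixes mean far fewer trie nodes than characters stored.
class Node:
    def __init__(self):
        self.children = {}   # char -> Node

def insert(root, word):
    node = root
    for ch in word:
        node = node.children.setdefault(ch, Node())

def count_nodes(node):
    """Total nodes in the subtree rooted here, including this node."""
    return 1 + sum(count_nodes(c) for c in node.children.values())

root = Node()
words = ['app', 'apple', 'apply']
for w in words:
    insert(root, w)

print(sum(len(w) for w in words))  # 13 characters across the three words
print(count_nodes(root) - 1)       # 6 nodes (excluding root): a,p,p,l,e,y
```

Thirteen characters collapse into six nodes because 'app' is stored once and 'appl' is shared by 'apple' and 'apply'.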

How does a trie compare to a hash map for string storage?

A hash map offers O(m) insert/search, since hashing a key of length m costs O(m) (this is what the usual 'O(1) average' quote hides), but it cannot answer prefix queries. A trie also gives O(m) operations and additionally supports prefix search, auto-complete, and ordered traversal (alphabetical order via DFS). Hash maps win on memory for random strings with no shared prefixes.

What is a compressed trie (Patricia tree)?

A compressed trie merges chains of single-child nodes into a single edge labeled with the entire substring. This reduces the number of nodes from O(sum of word lengths) to O(number of words). Suffix trees are compressed tries of all suffixes of a string, enabling O(m) substring search in a text of length n.
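The chain-merging idea can be sketched on a plain dict-of-dicts trie; the '$' sentinel and the compress function are illustrative choices for this sketch, not a standard API:

```python
# Collapse single-child chains of a dict-of-dicts trie into edges
# labeled with whole substrings -- the core idea of a radix tree.
END = '$'   # end-of-word sentinel (a convention of this sketch)

def compress(node):
    out = {}
    for ch, child in node.items():
        if ch == END:
            out[END] = True
            continue
        label, cur = ch, child
        # Extend the edge label while each node has exactly one
        # child and does not itself end a word.
        while len(cur) == 1 and END not in cur:
            (nxt_ch, nxt_child), = cur.items()
            label += nxt_ch
            cur = nxt_child
        out[label] = compress(cur)
    return out

# 'roman' and 'romulus' share the chain r->o->m before diverging.
trie = {'r': {'o': {'m': {'a': {'n': {END: True}},
                          'u': {'l': {'u': {'s': {END: True}}}}}}}}
print(compress(trie))
# {'rom': {'an': {'$': True}, 'ulus': {'$': True}}}
```

The five nodes of the r→o→m chain plus each diverging tail become three edges, which is why a radix tree's node count tracks the number of words rather than the number of characters.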

Naren — Founder & Author

Developer and founder of TheCodeForge. I built this site because I was tired of tutorials that explain what to type without explaining why it works. Every article here is written to make concepts actually click.
