Skip to content
Home DSA Serialize and Deserialize Binary Tree — BFS and DFS Approaches

Serialize and Deserialize Binary Tree — BFS and DFS Approaches

Where developers are forged. · Structured learning · Free forever.
📍 Part of: Trees → Topic 14 of 15
Learn two approaches to serialize a binary tree to a string and deserialize it back: level-order BFS and preorder DFS, both in O(n) time.
🔥 Advanced — solid DSA foundation required
In this tutorial, you'll learn
Learn two approaches to serialize a binary tree to a string and deserialize it back: level-order BFS and preorder DFS, both in O(n) time.
  • Null markers are essential to distinguish tree structure during deserialization.
  • Preorder DFS serialization uses O(n) tokens where null leaves account for n+1 of them (for n internal nodes).
  • Using a deque for deserialization tokens gives O(1) popleft vs O(n) for list.pop(0).
✦ Plain-English analogy ✦ Real code with output ✦ Interview questions
Quick Answer

Imagine you built an incredible LEGO castle and you need to mail it to a friend. You can't ship it assembled, so you write down every brick's shape, color, and position in a numbered list. Your friend reads that list and rebuilds the exact same castle on their end. Serializing a binary tree is exactly that — turning a tree structure into a flat string or list so it can be stored, sent over a network, or saved to disk. Deserializing is your friend rebuilding the castle from your instructions.

Binary trees live in memory as a web of pointers. The moment your process ends, that web vanishes. Every production system that needs to persist a decision tree, store a parsed expression, cache a DOM snapshot, or ship a hierarchical config over a network needs a way to flatten that tree into bytes and later reconstruct it perfectly. This is not a toy problem — it sits at the heart of databases (B-tree page serialization), compilers (AST persistence), and distributed systems (task-graph checkpointing).

The core challenge is that a flat sequence loses structure. An array of values like [1, 2, 3, 4, 5] could represent dozens of different tree shapes. Serialization must encode both the values AND the structural relationships — specifically, which nodes are absent — so that deserialization can reconstruct the unique tree that produced that sequence, not just any tree containing those values.

By the end of this article you'll have two complete, runnable implementations (BFS-level-order and DFS-preorder), a crystal-clear mental model for why each delimiter and null-marker exists, the performance trade-offs between approaches, and the exact edge cases that silently break naive solutions in production. You'll also be ready to answer every variant of this question an interviewer can throw at you.

What is Serialize and Deserialize Binary Tree? — Plain English

Serialization converts a binary tree into a string (or array) that can be stored in a file or sent over a network. Deserialization reconstructs the exact same tree from that string. The challenge is capturing null children explicitly — without them, you cannot distinguish a node with one child from a complete internal node. For example, trees [1,2] and [1,null,2] have nodes 1 and 2 but completely different shapes. The standard approach marks null children with a sentinel like 'null' and separates values with commas. Two common methods are: BFS (level-order), which mirrors how LeetCode displays trees; and DFS (preorder), which is simpler to implement recursively.

How Serialize/Deserialize Works — Step by Step

Preorder DFS serialization: 1. If node is None, append 'null' to the output and return. 2. Append node.val to the output. 3. Recursively serialize node.left. 4. Recursively serialize node.right. 5. Join with commas: '1,2,null,null,3,null,null'.

Preorder DFS deserialization: 1. Split the string on commas into a queue of tokens. 2. Pop the first token. If it is 'null', return None. 3. Create a node with value = int(token). 4. node.left = recursively deserialize (pops next token). 5. node.right = recursively deserialize (pops next token). 6. Return node.

The key: each recursive call consumes exactly one token (either a value or 'null'). The preorder structure means left and right subtrees are self-contained subsequences.

Worked Example — Serializing and Restoring a Tree

Tree: 1 / \ 2 3 / \ 4 5

Preorder serialization trace: 1. Visit 1: output=['1']. 2. Visit 2 (left of 1): output=['1','2']. 3. Visit null (left of 2): output=['1','2','null']. 4. Visit null (right of 2): output=['1','2','null','null']. 5. Visit 3 (right of 1): output=['1','2','null','null','3']. 6. Visit 4 (left of 3): output=[...,'3','4']. 7. Visit null,null for 4's children: output=[...,'4','null','null']. 8. Visit 5 (right of 3), then nulls: output=[...,'5','null','null']. Final: '1,2,null,null,3,4,null,null,5,null,null'.

Deserialization consumes tokens left to right, building the same tree back. Each 'null' terminates a branch.

Implementation

Preorder serialization uses recursive DFS. Deserialization uses a deque (collections.deque) as a token iterator — popleft() consumes the next token in O(1). The entire tree is processed in O(n) time and O(n) space for the serialized string. BFS serialization using a queue produces the level-order format (useful for display), but DFS is simpler to implement and uses less intermediate state.

serialize_deserialize.py · PYTHON
1234567891011121314151617181920212223242526272829303132333435363738394041424344
from collections import deque

class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val; self.left = left; self.right = right

class Codec:
    def serialize(self, root):
        """Preorder DFS → comma-separated string."""
        parts = []
        def dfs(node):
            if node is None:
                parts.append('null')
                return
            parts.append(str(node.val))
            dfs(node.left)
            dfs(node.right)
        dfs(root)
        return ','.join(parts)

    def deserialize(self, data):
        """Reconstruct tree from serialized string."""
        tokens = deque(data.split(','))
        def build():
            tok = tokens.popleft()
            if tok == 'null':
                return None
            node = TreeNode(int(tok))
            node.left = build()
            node.right = build()
            return node
        return build()

# Build tree: 1 -> 2, 3 -> 4,5
root = TreeNode(1)
root.left = TreeNode(2)
root.right = TreeNode(3, TreeNode(4), TreeNode(5))

codec = Codec()
s = codec.serialize(root)
print(s)  # 1,2,null,null,3,4,null,null,5,null,null

restored = codec.deserialize(s)
print(codec.serialize(restored))  # same string
▶ Output
1,2,null,null,3,4,null,null,5,null,null
1,2,null,null,3,4,null,null,5,null,null
ConceptUse CaseExample
Serialize and Deserialize Binary TreeCore usageSee code above

🎯 Key Takeaways

  • Null markers are essential to distinguish tree structure during deserialization.
  • Preorder DFS serialization uses O(n) tokens where null leaves account for n+1 of them (for n internal nodes).
  • Using a deque for deserialization tokens gives O(1) popleft vs O(n) for list.pop(0).

⚠ Common Mistakes to Avoid

    Memorising syntax before understanding the concept
    Skipping practice and only reading theory

Interview Questions on This Topic

  • QWhy are null markers needed in serialization?
  • QExplain preorder DFS serialization.
  • QHow does deserialization consume the token stream?

Frequently Asked Questions

Why must null children be explicitly included in the serialization?

Without null markers, many different trees produce the same sequence of values. For example, preorder of [1,2] (root=1, left=2) and [1,null,2] (root=1, right=2) both produce '1,2' without nulls. The null markers uniquely identify the tree structure.

What is the difference between the DFS and BFS serialization formats?

DFS (preorder) serializes root-left-right recursively. Each subtree is a contiguous block in the output, making deserialization a natural recursive decomposition. BFS (level-order) serializes level by level, which produces the same format LeetCode uses to display trees, but deserialization requires a queue tracking parent-child relationships.

Is it possible to serialize without storing null markers if we know the tree is a full binary tree?

Yes — for a full binary tree (every node has 0 or 2 children), preorder alone is sufficient because the number of leaves equals internal_nodes + 1. But for general trees with nodes having one child, null markers are required. Most interview problems assume general trees.

🔥
Naren Founder & Author

Developer and founder of TheCodeForge. I built this site because I was tired of tutorials that explain what to type without explaining why it works. Every article here is written to make concepts actually click.

← PreviousConvert BST to Sorted ArrayNext →Morris Traversal of Binary Tree
Forged with 🔥 at TheCodeForge.io — Where Developers Are Forged