📖 6 min read
Last updated on

Serialize and Deserialize Binary Tree — Coding Interview Walkthrough


You’re in a coding interview. The interviewer says: “Design an algorithm to serialize a binary tree to a string and deserialize it back to the tree.” The challenge isn’t the traversal itself; it’s choosing an encoding scheme that unambiguously represents null nodes so reconstruction is deterministic. Serialize and Deserialize Binary Tree is LeetCode #297, a Hard problem where clarity of design matters more than algorithmic complexity.

This walkthrough covers both BFS and DFS approaches, explains why encoding nulls is the key insight, and shows the DFS preorder solution that produces the cleanest code.


Already know how to solve Serialize And Deserialize Binary Tree? Try a Serialize And Deserialize Binary Tree mock interview on Intervu →

The Problem

Design a Codec class with:

  • serialize(root) -> str: encodes a tree to a string.
  • deserialize(data: str) -> TreeNode: decodes a string back to the original tree.

The encoding format is your choice. There’s no right answer as long as serialization and deserialization are inverses of each other. (LeetCode #297)

Example

Input root = [1,2,3,null,null,4,5]

Serialized "1,2,null,null,3,4,null,null,5,null,null" (DFS preorder with null markers)

Deserialized Original tree reconstructed correctly.

Binary tree [1,2,3,null,null,4,5] with DFS preorder serialization showing null markers.

What Interviewers Are Actually Testing

This is a design problem. Interviewers want to see you make a clear design decision (BFS vs. DFS, delimiter choice, null representation), justify it, and implement both halves correctly.

SkillWhat Interviewers Observe
Null node encodingDo you explicitly encode nulls so the tree can be reconstructed?
BFS or DFS choiceCan you implement either and explain the trade-offs?
Delimiter + format clarityDo you use a consistent delimiter and parse it correctly?
Deserialize pointer managementDo you advance through the token list correctly?
Correctness on edge casesEmpty tree, single node, skewed tree?

Step 1: Clarifying Questions

  1. Can node values be negative? Yes. This means the delimiter can’t be -. Use , as the separator between tokens.

  2. Is the tree a BST? No. It’s a general binary tree. Without BST ordering, you need null markers to determine structure.

  3. What should serialize(None) return? A string (e.g., "null"). deserialize("null") should return None.

  4. Is there a size constraint? Up to 10^4 nodes. Both O(n) approaches work fine.


Step 2: The Key Insight

Encode null nodes explicitly. Without representing nulls, you can’t distinguish a node with one child from a node with two children. With nulls encoded, the tree structure is fully determined by a single traversal.

A binary tree serialized as just its values (e.g., "1,2,3") is ambiguous: is 3 the right child of 1, or the right child of 2? With null markers ("1,2,null,null,3,null,null"), the structure is unambiguous.


Step 3: Approaches

DFS (preorder): Serialize root, then left, then right, using a sentinel for nulls. Reconstruct with a recursive function consuming one token at a time. Clean, compact, no queue.

BFS (level-order): Serialize level by level, the same format LeetCode uses. Reconstruct with a queue. Slightly more intuitive but more code.

Both are O(n) time and O(n) space. DFS is usually the better interview choice because the code is shorter.


class Codec:
    def serialize(self, root) -> str:
        tokens = []
        def dfs(node):
            if not node:
                tokens.append("null")
                return
            tokens.append(str(node.val))
            dfs(node.left)
            dfs(node.right)
        dfs(root)
        return ",".join(tokens)

    def deserialize(self, data: str):
        tokens = iter(data.split(","))
        def dfs():
            val = next(tokens)
            if val == "null":
                return None
            node = TreeNode(int(val))
            node.left = dfs()
            node.right = dfs()
            return node
        return dfs()

Why this is interview-friendly:

  • iter() makes token consumption clean. next(tokens) advances the pointer automatically. No index tracking needed.
  • Symmetry between serialize and deserialize. Both follow the same preorder pattern: root, left, right.
  • Null sentinels eliminate ambiguity. Every leaf has exactly two "null" children in the serialized string.

Python Implementation: BFS (Alternative)

from collections import deque

class Codec:
    def serialize(self, root) -> str:
        if not root:
            return "null"
        result = []
        queue = deque([root])
        while queue:
            node = queue.popleft()
            if node:
                result.append(str(node.val))
                queue.append(node.left)
                queue.append(node.right)
            else:
                result.append("null")
        return ",".join(result)

    def deserialize(self, data: str):
        if data == "null":
            return None
        tokens = data.split(",")
        root = TreeNode(int(tokens[0]))
        queue = deque([root])
        i = 1
        while queue:
            node = queue.popleft()
            if tokens[i] != "null":
                node.left = TreeNode(int(tokens[i]))
                queue.append(node.left)
            i += 1
            if tokens[i] != "null":
                node.right = TreeNode(int(tokens[i]))
                queue.append(node.right)
            i += 1
        return root

Time & Space Complexity

TimeSpace
serializeO(n)O(n)
deserializeO(n)O(n)

Every node is visited once. The serialized string and recursion stack (or queue) are both O(n).


Common Interview Mistakes

  1. Not encoding nulls. Without null markers, the tree can’t be uniquely reconstructed. This is the single most important design decision.

  2. Choosing a delimiter that appears in values. Node values can be negative, so - is a bad delimiter. Use , consistently.

  3. BFS deserialize index tracking. Off-by-one in the token index is a common bug. Each dequeued node consumes exactly two tokens (left child, right child). Tracking i carefully is essential.

  4. Not handling the empty tree. serialize(None) should return "null", and deserialize("null") should return None. Missing this edge case is a silent bug.

  5. Using split() without considering empty tokens. "1,,2".split(",") produces ["1", "", "2"]. Ensure your serialization never produces consecutive delimiters.


What a Strong Candidate Sounds Like

“I’ll use DFS preorder serialization with null markers. The preorder structure, root then left subtree then right subtree, combined with explicit nulls, uniquely encodes any binary tree. Deserialization mirrors the same traversal: consume one token, create a node or return null, then recurse left and right. An iterator over the token list keeps the implementation clean. O(n) time, O(n) space.”


Example Interview Dialogue

Interviewer: Why DFS over BFS?

Candidate: Both work, but DFS preorder produces cleaner code. The serialize and deserialize functions are symmetric: both follow root, left, right order. With BFS, the deserialize needs a queue and explicit index tracking, which is more code and more room for off-by-one errors.

Interviewer: Can you reconstruct a tree from just preorder without null markers?

Candidate: Not for a general binary tree. Without nulls, you can’t tell where the left subtree ends and the right subtree begins. For a BST, you could use the ordering property to determine the split, but for a general tree, null markers are required.



Frequently Asked Questions

What traversal perfectly handles Binary Tree serialization?

A classic Depth-First Search (DFS) or Preorder traversal perfectly serializes tree coordinates. By universally pushing the current node data before digging down strictly into left and right branches consecutively, you guarantee identical sequence recreation during later reconstruction phases.

How do you serialize null/empty nodes to ensure tree precision?

Capturing structural gaps accurately mandates recording null pointers distinctly (usually with a placeholder like “N”). Without saving exactly where child endpoints terminate, any future string parsing attempts would fail to differentiate identical values resting on completely different tree depths or branch structures.

Practice This in a Mock Interview

Serialize and Deserialize Binary Tree rewards candidates who pick a clean encoding scheme upfront and explain it clearly before coding. The DFS preorder + null sentinel approach produces the most concise and elegant implementation. Practice choosing and justifying your approach before writing the first line.

Start a mock interview for Serialize and Deserialize Binary Tree on Intervu.

Further reading:

Practice Like It's the Real Interview

Get instant feedback on your approach, communication, and code — powered by AI.

Start a Mock Interview →