Fibonacci — Recursion vs Iteration vs Memoization
Fibonacci is the canonical demonstration of recursion's cost because the pathology is so clear. The naive recursive version recomputes the same subproblems an exponential number of times — fib(5) calls fib(4) and fib(3), fib(4) calls fib(3) and fib(2), and fib(3) gets called again from fib(4)'s branch. The call tree doubles in size with each increment of n. By fib(40), you're making roughly a billion function calls to compute a number that a loop would find in 40 iterations.
Memoization fixes the exponential redundancy while keeping the recursive structure intact. @lru_cache stores the result of each unique (n,) argument tuple, so fib(3) is computed once and every subsequent call to fib(3) returns the cached result immediately. This collapses the O(2^n) call tree into a O(n) chain — you still recurse, but each unique subproblem executes exactly once.
Iteration beats both on space. The memoized version caches n results — O(n) space. The iterative version uses two variables that get updated in place — O(1) space. For computing fib(1_000_000), the iterative version uses negligible memory; the memoized recursive version would require thousands of cached entries and stack frames.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
import sys
import time
from functools import lru_cache
# --- Naive recursion: O(2^n) time, O(n) space ---
# DO NOT use this in production for n > 35
def fib_recursive(n):
if n <= 1:
return n
return fib_recursive(n - 1) + fib_recursive(n - 2)
# --- Iterative: O(n) time, O(1) space ---
# This is what production code should use
def fib_iterative(n):
if n <= 1:
return n
a, b = 0, 1
for _ in range(2, n + 1):
a, b = b, a + b
return b
# --- Memoized recursion: O(n) time, O(n) space ---
# Useful when you need recursive structure for clarity
# and can afford the O(n) cache memory
@lru_cache(maxsize=None)
def fib_memo(n):
if n <= 1:
return n
return fib_memo(n - 1) + fib_memo(n - 2)
# Timing comparison — run on your machine to feel the difference
def time_it(fn, n, label):
start = time.perf_counter()
result = fn(n)
elapsed = time.perf_counter() - start
print(f"{label:30s} fib({n}) = {result:>20} in {elapsed*1000:.3f}ms")
time_it(fib_iterative, 50, "Iterative")
time_it(fib_memo, 50, "Memoized recursive")
# fib_recursive(50) would take several minutes — do not run it
# Python's recursion limit — check yours
print(f"\nRecursion limit: {sys.getrecursionlimit()} frames")
# fib_memo(990) works; fib_memo(1001) raises RecursionError
# because the memoized version still recurses to compute uncached values
try:
fib_memo(995)
print("fib_memo(995): OK")
except RecursionError:
print("fib_memo(995): RecursionError — cache not warm")
# Iterative has no such limit
print(f"fib_iterative(10000): {fib_iterative(10000)} (truncated)"[:60] + '...')Output
Iterative fib(50) = 12586269025 in 0.004ms
Memoized recursive fib(50) = 12586269025 in 0.021ms
Recursion limit: 1000 frames
fib_memo(995): OK
fib_iterative(10000): 33644764876431783695163498604...(truncated)
The Call Stack as a Stack of Plates
- Each plate = one function call's local variables, arguments, and return address
- Base case = the point where you stop adding plates and start removing them from the top
- Stack overflow = the stack grew taller than the OS allows — typically 1-8MB per thread
- Memoization = writing the result on each plate before putting it down so you never have to redo that calculation
- Iteration = using a single plate that you erase and rewrite in place — constant stack height regardless of n
Production Insight
Naive recursive fib(50) makes approximately 2^50 function calls — over one quadrillion. In CPython each stack frame costs roughly 100-200 bytes. This is not a depth problem (the maximum depth for fib(50) is only 50), it is a redundancy problem: the same subproblem fib(2) is computed roughly 2^48 times. The fix is memoization, not iteration — but know that even memoized recursion can hit the stack limit if you call fib(1500) cold (no cached values), because the first call recurses 1500 levels deep before the cache is warm. For large n with a cold cache, iterative is the only safe choice.
Key Takeaway
Naive recursion is exponential — it is a teaching example, not production code. Memoization preserves the recursive structure and readability while reducing time complexity to O(n). Iteration wins on space (O(1) vs O(n)) and avoids the cold-cache recursion depth problem entirely. For anything beyond small bounded n, iterate.
Ifn is small and bounded (< 30), code clarity matters most
→
UseMemoized recursion — the structure mirrors the mathematical definition and is easy to verify
Ifn can be large (> 100) or is determined by user input
→
UseIterative — O(1) space, no recursion limit risk, 3-5x faster than memoized
IfComputing many Fibonacci values repeatedly in a long-running process
→
UseMemoized recursion — lru_cache persists across calls, so each value is computed only once total
Ifn is unbounded or controlled by external input
→
UseIterative only — never recurse when depth is controlled by untrusted input
Converting Recursive DFS to Iterative
The pattern for converting any recursive algorithm to iterative is mechanical once you understand it: the call stack that the language manages implicitly becomes an explicit data structure you manage yourself. Where the recursive version calls itself with a smaller problem, the iterative version pushes that smaller problem onto a list or deque. Where the recursive version returns and picks up where it left off, the iterative version pops from the list and continues the loop.
DFS is the cleanest example because the conversion is direct. The recursive DFS works by calling itself for each unvisited neighbour — the call stack naturally implements LIFO ordering, visiting the deepest unvisited node first. The iterative version replaces the call stack with a Python list used as a stack (append for push, pop for pop), producing identical traversal order.
The one subtlety that trips people up: push order. The recursive version processes neighbours left-to-right because it calls itself for the first neighbour, recurses all the way down that branch, then comes back and calls itself for the second neighbour. To match this order iteratively, you push neighbours in reverse — the last neighbour goes on the stack first, so the first neighbour gets popped and processed first.
This matters in interview settings and in production when the traversal order is semantically meaningful — for example, when processing rules in priority order or when the graph represents a dependency chain where the order of resolution matters.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
from collections import deque
# --- Recursive DFS ---
# Clean, readable, mirrors the algorithm definition
# Dangerous on graphs with unknown depth or high node count
def dfs_recursive(graph, node, visited=None):
if visited is None:
visited = set() # IMPORTANT: never use mutable default arg
visited.add(node)
print(node, end=' ')
for neighbour in graph.get(node, []):
if neighbour not in visited:
dfs_recursive(graph, neighbour, visited)
return visited
# --- Iterative DFS ---
# Same traversal order, no stack overflow risk
# Uses heap memory (Python list) instead of the C call stack
def dfs_iterative(graph, start):
visited = set()
stack = [start] # explicit stack — append is push, pop is pop
while stack:
node = stack.pop()
if node in visited:
continue
visited.add(node)
print(node, end=' ')
# Push in reverse to match recursive left-to-right processing order
for neighbour in reversed(graph.get(node, [])):
if neighbour not in visited:
stack.append(neighbour)
return visited
# For contrast: BFS uses a deque instead of a list
def bfs_iterative(graph, start):
visited = set([start])
queue = deque([start]) # FIFO — popleft instead of pop
while queue:
node = queue.popleft()
print(node, end=' ')
for neighbour in graph.get(node, []):
if neighbour not in visited:
visited.add(neighbour)
queue.append(neighbour)
return visited
# Test graph — directed, acyclic
graph = {
'A': ['B', 'C'],
'B': ['D', 'E'],
'C': ['F'],
'D': [],
'E': ['F'],
'F': []
}
print('Recursive DFS: ', end=''); dfs_recursive(graph, 'A'); print()
print('Iterative DFS: ', end=''); dfs_iterative(graph, 'A'); print()
print('Iterative BFS: ', end=''); bfs_iterative(graph, 'A'); print()
# Demonstrate the depth advantage
# Simulate a chain graph 5000 nodes long — recursive would crash
chain = {i: [i + 1] for i in range(5000)}
chain[5000] = []
visited = dfs_iterative(chain, 0)
print(f"\nIterative DFS visited {len(visited)} nodes in 5000-node chain without stack overflow")Output
Recursive DFS: A B D E F C
Iterative DFS: A B D E F C
Iterative BFS: A B C D E F
Iterative DFS visited 5001 nodes in 5000-node chain without stack overflow
Order Matters When Converting — and Mutable Default Arguments Will Ruin Your Day
Two things to get right when converting recursive DFS to iterative: (1) Push neighbours in reverse order to match the LIFO behaviour of the call stack in the recursive version — without this, you get a valid DFS but in a different node order, which may or may not matter for your use case. (2) Never use a mutable object (set, list, dict) as a default argument in Python — def dfs(graph, node, visited=set()) creates the set once at function definition time and reuses it across all calls. The visited set from one traversal will pollute the next. Always use visited=None and create a fresh set inside the function.
Production Insight
A 5000-node chain graph is not unusual in production — think a deeply nested comment thread, a long dependency resolution chain, or a file system directory structure. Recursive DFS on that chain would push 5000 frames onto the C stack and crash with a RecursionError at frame 1000. The iterative version uses a Python list on the heap to track the same state — and heap memory is measured in gigabytes, not megabytes. The conversion is mechanical and the result is functionally identical. There is no reason to use recursive DFS on a graph whose depth you cannot bound at design time.
Key Takeaway
The conversion pattern from recursive to iterative DFS is universal: replace the call stack with an explicit list, push instead of recurse, pop and continue in the loop. Neighbours must be pushed in reverse order to match recursive traversal order. The iterative version uses heap memory instead of the C stack — heap is gigabytes, the C stack is megabytes. Always use iterative for graphs whose depth is controlled by external input.
IfGraph depth is bounded and small (e.g., balanced tree, bounded AST depth)
→
UseRecursive is fine — clean code, easy to verify, depth bounded by problem structure
IfGraph represents user-uploaded content, external data, or a file system
→
UseIterative mandatory — depth is controlled by external input and cannot be bounded safely
IfNeed BFS (level-order) traversal
→
UseAlways iterative — BFS has no natural recursive formulation; use a deque
IfGraph may contain cycles
→
UseBoth work with a visited set, but iterative is safer for large or unknown depth
IfProcessing order of nodes matters for correctness
→
UseVerify the push order in your iterative version — use the reverse-neighbours trick to match recursive order
Tail Recursion — Why Python Refuses to Optimize It
Tail recursion is a specific structural property of a recursive function: the recursive call is the very last operation executed before the function returns, with no pending computation waiting for the result. In a tail-recursive function, the current stack frame is no longer needed once the recursive call is made — all the information needed for the rest of the computation has been passed as arguments to the next call.
Languages like Scheme, Haskell, and Scala exploit this property through tail call optimization (TCO): the compiler recognizes that the current frame is no longer needed and reuses it for the next call instead of pushing a new one. The recursive call is effectively compiled into a jump instruction rather than a function call, making tail-recursive code consume constant stack space regardless of recursion depth. Tail recursion in these languages is genuinely equivalent to a while loop at the hardware level.
Python deliberately does not implement TCO. Guido van Rossum's reasoning has been documented publicly: stack traces are the primary debugging tool for Python developers, and TCO would make stack traces useless — instead of seeing the full chain of recursive calls, you would see a single frame. He also argued that Python is not a functional language and that the idiomatic solution for iteration is a loop, not compiler magic. This decision has never been reversed.
Java is in a similar position, though for a different reason. The JVM specification does not require TCO, and the HotSpot JVM does not implement it for general recursive calls. There is ongoing work in Project Loom and through invokedynamic to support functional-style TCO in some contexts, but as of 2026 you cannot rely on TCO in standard Java code.
The practical implication is stark: in Python and Java, tail-recursive code is exactly as expensive as non-tail-recursive code. Writing a function in tail-recursive style in Python provides zero performance benefit and no stack depth advantage. If you want iteration performance, write a loop.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
import sys
# --- Tail-recursive factorial ---
# The recursive call is the LAST operation — nothing pending after it
# accumulator carries the result forward so no computation needed after return
def factorial_tail(n, accumulator=1):
if n <= 1:
return accumulator
return factorial_tail(n - 1, n * accumulator) # tail position — accumulator updated
# --- Non-tail-recursive factorial ---
# Multiplication happens AFTER the recursive call returns
# The stack frame must be retained to perform n * (result of recursive call)
def factorial_non_tail(n):
if n <= 1:
return 1
return n * factorial_non_tail(n - 1) # NOT tail position — n must be remembered
# --- Iterative factorial — what you should actually write ---
def factorial_iterative(n):
result = 1
for i in range(2, n + 1):
result *= i
return result
# Demonstrate that Python treats both recursive versions identically
# Both will fail at the same recursion depth
for fn_name, fn in [("tail", factorial_tail), ("non_tail", factorial_non_tail)]:
sys.setrecursionlimit(200) # low limit to make failure obvious
try:
result = fn(150)
print(f"factorial_{fn_name}(150) = {str(result)[:20]}...")
except RecursionError:
print(f"factorial_{fn_name}(150) -> RecursionError at depth 150")
# Iterative has no such limit
sys.setrecursionlimit(1000) # restore
print(f"factorial_iterative(150) = {str(factorial_iterative(150))[:20]}...")
print(f"factorial_iterative(10000): computed successfully, len={len(str(factorial_iterative(10000)))} digits")
# Show the tail-recursive -> iterative mechanical conversion
# Replace: return factorial_tail(n-1, n*acc) with loop variable reassignment
def factorial_tail_to_iterative(n):
accumulator = 1
while n > 1: # was: if n <= 1: return accumulator
accumulator = n * accumulator # was: argument to recursive call
n = n - 1 # was: first argument to recursive call
return accumulator
print(f"\nTail-to-iterative conversion: factorial(20) = {factorial_tail_to_iterative(20)}")Output
factorial_tail(150) -> RecursionError at depth 150
factorial_non_tail(150) -> RecursionError at depth 150
factorial_iterative(150) = 57133839564458545227...
factorial_iterative(10000): computed successfully, len=35660 digits
Tail-to-iterative conversion: factorial(20) = 2432902008176640000
Guido's Decision — and Why It Has Held for 30 Years
Guido van Rossum's rejection of tail call optimization is not a limitation waiting to be fixed — it is a deliberate design philosophy that has been reaffirmed multiple times. His argument: Python prioritizes explicit, debuggable code over compiler magic. TCO would silently transform recursive functions into iterative ones, making stack traces misleading and debugging harder. If you want iteration, write a loop — that is explicit, readable, and does not depend on the optimizer recognizing a structural pattern in your code. This philosophy ('explicit is better than implicit') is the fourth line of the Zen of Python and governs far more than just TCO.
Production Insight
I have reviewed code from engineers who came from functional language backgrounds (Haskell, Elixir, Scala) and wrote elegant tail-recursive Python because that style was natural to them. The code was correct and readable, but it had no performance advantage over the non-tail-recursive equivalent and crashed at the same 1000-frame depth. The review comment was always the same: convert the tail-recursive pattern to a while loop — the mechanical conversion takes five minutes and the result is idiomatic Python that anyone on the team can maintain without knowing what tail recursion means.
Key Takeaway
Tail recursion is a theoretical optimization — Python and Java deliberately do not implement it. Both tail-recursive and non-tail-recursive functions consume the same stack depth per call in Python and standard Java. The conversion from tail recursion to a while loop is mechanical: replace the recursive call with argument reassignment and wrap in a while loop. Write loops for iteration — do not expect the runtime to do the conversion for you.
IfFunction is tail-recursive and depth is bounded and small (< 500)
→
UseLeave as-is if clarity is the priority — Python won't optimize it but it won't crash either
IfFunction is tail-recursive and depth could exceed 1000
→
UseConvert to while loop — the conversion is mechanical and the result is faster and safer
IfFunctional language developer wrote tail-recursive code expecting TCO
→
UseExplain TCO is not supported; convert to iterative during review
IfNeed recursion depth beyond 1000 in Java
→
UseConvert to iterative or use a thread with a larger stack — never rely on JVM TCO support
When Recursion Wins — Tree Traversal and Divide and Conquer
Despite its costs, recursion is the right choice for problems whose structure is inherently recursive. Tree traversal is the clearest example: a tree is defined recursively (a node with left and right subtrees that are themselves trees), so a recursive traversal directly mirrors the definition. You can read the three-line recursive inorder traversal and immediately verify it is correct because the code and the definition are the same thing. The iterative equivalent requires managing an explicit stack, tracking multiple state variables, and reasoning about edge cases that simply do not exist in the recursive version.
Divide and conquer algorithms (merge sort, quicksort, binary search) have the same property. The recursive version of merge sort — split the array in half, sort each half, merge — is directly expressible in code. The iterative version requires managing multiple pointers and merge widths through nested loops, and is substantially harder to understand and verify.
The important constraint that makes recursion safe for these problems: the recursion depth is bounded by the structure of the data, not by user-controlled input. A balanced binary tree with one billion nodes has a depth of approximately 30. Even a perfectly degenerate tree (a linked list) has a depth equal to its node count — but if you know the tree is balanced (because you built it that way), depth 30 is well within any platform's stack limit.
The practical rule: use recursion when the problem structure is recursive AND you can bound the depth at design time. The moment depth is controlled by external input — user-uploaded files, API responses, user-constructed data structures — you must switch to iteration.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
from __future__ import annotations
from dataclasses import dataclass
from typing import Optional
@dataclass
class TreeNode:
val: int
left: Optional['TreeNode'] = None
right: Optional['TreeNode'] = None
# --- Recursive inorder: left -> root -> right ---
# 3 lines, directly mirrors the definition
# Depth = tree height — O(log n) for balanced, O(n) for degenerate
def inorder_recursive(node: Optional[TreeNode]) -> list[int]:
if node is None:
return []
return inorder_recursive(node.left) + [node.val] + inorder_recursive(node.right)
# --- Iterative inorder: same result, harder to read ---
# Required when tree height is unbounded or user-controlled
def inorder_iterative(root: Optional[TreeNode]) -> list[int]:
result, stack, current = [], [], root
while current or stack:
# Go as far left as possible
while current:
stack.append(current)
current = current.left
# Process the leftmost unprocessed node
current = stack.pop()
result.append(current.val)
# Move to right subtree
current = current.right
return result
# --- Recursive merge sort: clean divide-and-conquer ---
def merge_sort(arr: list[int]) -> list[int]:
if len(arr) <= 1:
return arr
mid = len(arr) // 2
left = merge_sort(arr[:mid])
right = merge_sort(arr[mid:])
return merge(left, right)
def merge(left: list[int], right: list[int]) -> list[int]:
result = []
i = j = 0
while i < len(left) and j < len(right):
if left[i] <= right[j]:
result.append(left[i]); i += 1
else:
result.append(right[j]); j += 1
return result + left[i:] + right[j:]
# Build a balanced BST for testing
# 4
# / \
# 2 6
# / \ / \
# 1 3 5 7
root = TreeNode(4,
TreeNode(2, TreeNode(1), TreeNode(3)),
TreeNode(6, TreeNode(5), TreeNode(7))
)
recursive_result = inorder_recursive(root)
iterative_result = inorder_iterative(root)
print(f"Recursive inorder: {recursive_result}")
print(f"Iterative inorder: {iterative_result}")
print(f"Results match: {recursive_result == iterative_result}")
# Merge sort demo
unsorted = [38, 27, 43, 3, 9, 82, 10]
print(f"\nMerge sort: {unsorted} -> {merge_sort(unsorted)}")
# Depth analysis: balanced tree with 1M nodes
import math
nodes = 1_000_000
balanced_depth = math.ceil(math.log2(nodes + 1))
print(f"\nBalanced tree with {nodes:,} nodes: max depth = {balanced_depth}")
print(f"Python's recursion limit: 1000 frames")
print(f"Safe for recursion? {'Yes' if balanced_depth < 900 else 'No'}")Output
Recursive inorder: [1, 2, 3, 4, 5, 6, 7]
Iterative inorder: [1, 2, 3, 4, 5, 6, 7]
Results match: True
Merge sort: [38, 27, 43, 3, 9, 82, 10] -> [3, 9, 10, 27, 38, 43, 82]
Balanced tree with 1,000,000 nodes: max depth = 20
Python's recursion limit: 1000 frames
Safe for recursion? Yes
The Rule for Safe Recursive Tree Processing
Recursion on trees is safe when you built the tree yourself with a known balancing property — O(log n) depth is well within any platform's stack limit. Recursion on trees is dangerous when a user, an API, or a file built the tree — a degenerate tree (effectively a linked list) has depth equal to its node count. The question to ask at design time is: who controls the depth of this data structure? If the answer is 'the user' or 'an external system', use iteration.
Production Insight
The iterative inorder traversal in the code above has 8 meaningful lines — the recursive version has 3. Both are correct. For a BST you built and maintain with known balance guarantees, the recursive version is the right choice: shorter, easier to review, and the depth is bounded by your own invariants. For a tree that users construct (a file system directory, a user-uploaded JSON tree, a dependency graph from user-defined packages), the iterative version is the right choice: the depth is externally controlled and you cannot safely bound it at design time. Make this decision once per tree type and document it in the code.
Key Takeaway
Recursion wins when the problem structure IS recursive and you control the depth. A balanced binary tree with 1 million nodes has a maximum recursion depth of 20 — comfortably safe. Recursive tree traversal code is 3x shorter and directly verifiable against the problem definition. The line between safe recursive and mandatory iterative is simple: who controls the depth? If the answer includes any external party, iterate.
IfTree is balanced and built by your own code (BST, heap, segment tree)
→
UseRecursion is safe — depth is O(log n) and bounded by your invariants
IfTree is user-provided, loaded from external data, or has no balance guarantee
→
UseUse iterative traversal — depth is externally controlled and cannot be bounded safely
IfImplementing divide-and-conquer (merge sort, quicksort, binary search)
→
UseRecursion is natural and safe — depth is O(log n) for standard implementations
IfImplementing backtracking (N-queens, Sudoku solver, permutation generation)
→
UseRecursion is idiomatic — depth is bounded by the problem size, not input depth
IfParsing a file format, processing nested configuration, traversing an AST
→
UseDepends on who produced the file — if external, use iterative; if your own toolchain, recursion is acceptable with a depth guard