Single-consumption rule: itertools objects exhaust after one pass — materialise to list only when you have confirmed the data fits in memory
Performance: itertools functions run at C speed; the win over Python loops is largest for filter and map patterns, smallest for simple traversals — always benchmark on your actual data
Production trap: groupby groups consecutive keys only — sort by key first or you get duplicate groups and silently wrong aggregations
Biggest mistake: tee with divergent consumers causes unbounded buffering — use list() if branches advance unevenly
Combinatorics trap: permutations(12, 12) produces 479 million tuples at ~152 bytes each on 64-bit CPython — always check math.perm/comb before any list() call
Plain-English First
Imagine a library that holds every book ever written. An eager librarian photocopies every book before you ask for anything — the room fills with paper before you read a word. A lazy librarian hands you one page at a time, printing the next only when you ask for it. itertools is the lazy librarian. It never produces the next value until your code specifically requests it, which means it never wastes memory generating items you never actually read. The moment you stop asking, it stops working. That single property — compute only what is needed, only when it is needed — is what makes itertools the right tool when your data is large, your memory is limited, or your search space is combinatorially enormous.
Python lists are great until they're not. The moment your data grows to millions of rows — log files, sensor streams, combinatoric search spaces — loading everything into memory stops being clever and starts being catastrophic. itertools is the standard library's answer to that problem: a collection of memory-efficient, fast iterator-building blocks that the CPython team wrote in C. Serious Python engineers reach for it constantly, yet most tutorials only scratch its surface.
The real problem itertools solves is the hidden cost of materialisation. Every time you write [item for item in range(10_000_000)] you are allocating a list in heap memory. With itertools you get a lazy pipeline — a chain of objects that produce values on demand, one at a time. This is not academic niceness. It is the difference between a data pipeline that processes a 10-million-row sensor stream in constant memory and one that exhausts a container's RAM allocation in seconds.
The module has also grown meaningfully in recent Python releases. Python 3.10 added pairwise, which eliminates a classic zip(it, it[1:]) hack that every codebase reimplemented slightly differently. Python 3.12 added batched, which finally gives you a canonical way to chunk an iterator into fixed-size groups without materialising it first. If your itertools knowledge predates 3.10, there are tools in the module right now that replace code you wrote yourself.
By the end of this article you will understand how itertools works at the iterator-protocol level, know exactly which tool to reach for in each scenario, avoid the gotchas that cause silent data corruption and OOM crashes in production, and be able to answer the interview questions that separate itertools users from itertools understanders.
How itertools Actually Works: Lazy Evaluation Under the Hood
Every object returned by itertools implements Python's iterator protocol — __iter__ and __next__. When you call itertools.chain(a, b) you do not get a new list. You get a small C object that holds references to a and b and advances an internal pointer each time __next__ is called. No data is copied. Nothing is computed until forced.
This is fundamentally different from list comprehensions or list(map(...)). Those are eager — they compute everything immediately and store it in heap memory. itertools objects are lazy — they compute nothing until a for loop, next(), or a consuming function like list() or sum() forces the next value out.
The performance implication is dramatic in the right scenarios. itertools.chain holds two references — a 56-byte C object regardless of input size. The equivalent list_a + list_b allocates a brand-new list proportional to the total number of elements. For infinite sequences like itertools.count(), eager evaluation would never terminate.
One nuance that causes silent production bugs: because these objects are stateful one-directional machines, they can only be consumed once. After you have iterated through an itertools.chain object it is exhausted. Calling list() on it a second time returns an empty list with no error, no warning, and no indication that anything went wrong. If your pipeline produces empty results and you cannot find the bug, a double-consumption is the first thing to check.
The right pattern when you need multiple passes: either recreate the iterator from its source for each pass, or materialise to a list once — but only after confirming the data volume fits in memory. For large data where neither is acceptable, restructure the algorithm to make a single pass.
import itertools
import sys
# ── 1. Memory footprint: eager vs lazy ───────────────────────────────────────# The eager list allocates every integer object reference upfront.# The lazy iterator is a fixed-size C object regardless of how many values it produces.
eager_numbers = list(range(1_000_000))
lazy_numbers = itertools.islice(itertools.count(0), 1_000_000)
print(f"Eager list size : {sys.getsizeof(eager_numbers):>12,} bytes")
print(f"Lazy iterator : {sys.getsizeof(lazy_numbers):>12,} bytes")
print(f"Memory ratio : {sys.getsizeof(eager_numbers) // sys.getsizeof(lazy_numbers):>12,}x more memory for eager")
# ── 2. Single-consumption: the most common itertools production bug ──────────# There is no error, no warning — just empty results on the second pass.
it = itertools.chain([1, 2, 3], [4, 5, 6])
first_pass = list(it)
second_pass = list(it) # exhausted — always emptyprint(f"\nFirst pass : {first_pass}")
print(f"Second pass : {second_pass} <-- empty, not a bug — expected single-consumption behaviour")
# ── 3. The correct pattern: recreate from source for each pass ───────────────# When materialising is not an option, wrap in a factory and call fresh.defmake_pipeline():
"""Returns a fresh iterator every time. No state shared between calls."""return itertools.chain([1, 2, 3], [4, 5, 6])
for pass_number inrange(1, 4):
result = list(make_pipeline())
print(f"Pass {pass_number}: {result}") # Always [1, 2, 3, 4, 5, 6]# ── 4. chain object size: constant regardless of input ───────────────────────
chain_small = itertools.chain(range(10), range(10))
chain_large = itertools.chain(range(10_000_000), range(10_000_000))
print(f"\nchain over 20 items : {sys.getsizeof(chain_small)} bytes")
print(f"chain over 20,000,000 items: {sys.getsizeof(chain_large)} bytes")
print("Both the same — chain holds references, not copies.")
Output
Eager list size : 8,448,728 bytes
Lazy iterator : 48 bytes
Memory ratio : 176,015x more memory for eager
First pass : [1, 2, 3, 4, 5, 6]
Second pass : [] <-- empty, not a bug — expected single-consumption behaviour
Pass 1: [1, 2, 3, 4, 5, 6]
Pass 2: [1, 2, 3, 4, 5, 6]
Pass 3: [1, 2, 3, 4, 5, 6]
chain over 20 items : 56 bytes
chain over 20,000,000 items: 56 bytes
Both the same — chain holds references, not copies.
Watch Out: Silent Empty Results From Double Consumption
If you pass an itertools object to two different functions — max(it) and then min(it) — the second call sees an empty iterator and returns a ValueError or an empty result with no indication of why. This is the most common itertools bug in production precisely because it is silent. The fix is to either materialise once (result = list(it); use result) or recreate from a source factory for each consumer. Never pass an itertools object to two consumers without one of these patterns in place.
Production Insight
The production cost of accidental double-consumption is silent data loss with no traceback.
A pipeline processes zero rows on the second pass, emits no error, and produces downstream results that look plausible but are entirely wrong.
Rule: if the same iterator variable appears in two consuming calls anywhere in a function, that is a bug. Assign list(iterator) to a variable and reuse it, or use a factory function.
Key Takeaway
Lazy iterators use O(1) memory and compute on demand — the iterator object size is constant regardless of how many values it will produce.
They exhaust after one pass — plan for single-consumption or use a factory pattern for replay.
If you need multiple passes, materialise once when memory allows, or recreate from source each time.
Infinite Iterators, Slicing and Chaining: Building Real Data Pipelines
itertools gives you three infinite iterators: count, cycle, and repeat. These sound dangerous but they are the backbone of real patterns: round-robin load balancing with cycle, default-value padding with repeat, and auto-incrementing IDs with count. The discipline is always pairing them with islice or takewhile to impose a finite boundary. An infinite iterator without a termination condition is an infinite loop waiting to happen.
chain and chain.from_iterable deserve attention because they are often the glue between pipeline stages. chain(*list_of_lists) unpacks at call time — all sub-iterables must exist before the call. chain.from_iterable(generator_of_lists) is fully lazy — it does not touch the next sub-iterable until the previous one is exhausted. That distinction is critical when sub-iterables are expensive to create: open file handles, database cursors, or network connections.
The file descriptor story is a real one. If you build a list of open file handles and then pass it to chain(*handles), all handles are open simultaneously at call time. With thousands of files that hits the OS file descriptor limit. chain.from_iterable with a generator that opens files one at a time defers each open call until needed, keeping at most one or two file handles alive at any moment.
islice is your lazy version of Python's slice syntax. It cannot accept negative indices — it does not know the total length — but for extracting a window from a stream it is indispensable. Pair it with dropwhile and takewhile to express WHERE-clause style filters over any iterable without materialising it.
zip_longest handles the case where two streams have different lengths. Standard zip stops at the shorter one silently, which is a source of data loss bugs when merging streams of different provenance. zip_longest fills missing values with a configurable fillvalue, making the length mismatch explicit rather than silent.
io/thecodeforge/itertools/pipeline_demo.pyPYTHON
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
import itertools
# ── 1. count + islice: bounded ID generator ──────────────────────────────────# count() is infinite. islice() imposes the finite boundary.# Never use count() in a for loop without a termination condition.defgenerate_ids(prefix: str, count: int):
for i in itertools.islice(itertools.count(1), count):
yield f"{prefix}-{i:07d}"
ids = list(generate_ids("JOB", 5))
print(f"Generated IDs: {ids}")
# ── 2. cycle + islice: round-robin load balancer ─────────────────────────────
servers = ['api-a', 'api-b', 'api-c']
requests = [f"req-{i}"for i inrange(9)]
for request, server inzip(requests, itertools.cycle(servers)):
print(f"{request} -> {server}")
# ── 3. chain.from_iterable: lazy log file merger ─────────────────────────────# WRONG pattern: chain(*handles) — all files opened at once at call time# handles = [open(f) for f in file_list] # all open simultaneously# merged = itertools.chain(*handles) # file descriptor exhaustion risk# CORRECT pattern: chain.from_iterable with a generator that opens lazily# Each file is opened only when the previous one is exhausted.defopen_files_lazily(file_paths):
"""Opens and yields lines from each file, one file at a time."""for path in file_paths:
try:
withopen(path, 'r', encoding='utf-8') as f:
yieldfrom f
exceptFileNotFoundError:
pass # skip missing log files gracefully# Usage: processes 10,000 log files with at most 1 open file descriptor at a time# for line in open_files_lazily(all_log_paths):# process(line)# ── 4. takewhile + dropwhile: conditional stream windowing ───────────────────
data = [0, 0, 0, 1, 2, 3, 0, 4, 5]
take = list(itertools.takewhile(lambda x: x == 0, data)) # leading zeros
skip = list(itertools.dropwhile(lambda x: x == 0, data)) # skip leading zerosprint(f"\nOriginal : {data}")
print(f"Leading zeros : {take}")
print(f"After zeros : {skip}")
# ── 5. zip_longest: safe merging of unequal-length streams ───────────────────# Standard zip silently truncates to the shorter sequence.# zip_longest makes the length mismatch explicit and fills gaps.
stream_a = [1, 2, 3, 4, 5]
stream_b = ['a', 'b', 'c'] # shorter — zip would silently drop items 4 and 5print("\nWith zip (silent truncation):")
for pair inzip(stream_a, stream_b):
print(f" {pair}")
print("\nWith zip_longest (explicit fill):")
for pair in itertools.zip_longest(stream_a, stream_b, fillvalue='MISSING'):
print(f" {pair}")
Prefer chain.from_iterable for Lazy Resource Merging
When chaining many iterables that are expensive to create — file handles, database cursors, network connections — always use chain.from_iterable with a generator that creates each sub-iterable on demand. chain(*list) unpacks all sub-iterables at call time: every file is opened, every cursor is created, before you process a single item. chain.from_iterable defers creation of the next sub-iterable until the previous one is fully exhausted, keeping resource usage bounded to one at a time.
Production Insight
A production log merger used chain(*[open(f) for f in file_list]). The list comprehension opened all 10,000 files before chain was even called. The process hit the OS file descriptor limit (ulimit -n 1024) and crashed.
The fix used a generator that opened each file inside a with block and yielded from it, passed to chain.from_iterable. At most one file descriptor was open at any moment. Memory usage dropped by two orders of magnitude.
Rule: if the sub-iterables are resource-bound (files, sockets, cursors), the generator must create them lazily inside chain.from_iterable, not before it.
Key Takeaway
Always pair infinite iterators (count, cycle, repeat) with islice or takewhile — an infinite iterator without a termination condition is an infinite loop.
chain.from_iterable is lazier than chain(*args) — it defers sub-iterable creation until needed, preventing file descriptor exhaustion.
zip_longest makes stream length mismatches explicit — use it instead of zip whenever the two streams may not be the same length.
pairwise and batched: The Modern itertools Tools You Should Be Using
Python 3.10 added pairwise. Python 3.12 added batched. Both solve patterns that every Python codebase had implemented manually for years, usually in subtly wrong ways. If your itertools knowledge predates these versions, there is code in your codebase right now that these tools replace cleanly.
pairwise(iterable) returns successive overlapping pairs: (s[0], s[1]), (s[1], s[2]), (s[2], s[3])... The manual implementation was zip(it, it[1:]) which requires materialising the iterable twice, or the more arcane zip(it, islice(it, 1, None)) which is stateful and confusing to read. pairwise is lazy, needs no materialisation, and communicates intent clearly. It is the right tool for consecutive difference calculations, time-series delta analysis, and detecting transitions between states in a stream.
batched(iterable, n) splits an iterable into chunks of at most n items each, yielding each chunk as a tuple. The final chunk may be shorter than n if the iterable does not divide evenly. Before Python 3.12 this required a manual islice-in-a-while-loop pattern that appears in countless Stack Overflow answers, each with slightly different edge case handling for the final short chunk. batched handles it correctly by design.
Both tools are available only in the standard library from their respective Python versions. If you are running Python 3.9 or earlier you will need to implement equivalents manually — the code examples below show the pre-3.10 and pre-3.12 patterns alongside the canonical versions so you can recognise both forms.
import itertools
import sys
PYTHON_VERSION = sys.version_info
# ── pairwise (Python 3.10+) ───────────────────────────────────────────────────# Returns successive overlapping pairs from an iterable.# Essential for time-series deltas, state transitions, consecutive comparisons.if PYTHON_VERSION >= (3, 10):
# Canonical — clear, lazy, correct
prices = [10.0, 10.5, 9.8, 11.2, 11.0, 12.3]
deltas = [(b - a) for a, b in itertools.pairwise(prices)]
print(f"Prices : {prices}")
print(f"Deltas : {[round(d, 2) for d in deltas]}")
# State transition detection in a log stream
states = ['idle', 'idle', 'processing', 'processing', 'idle', 'error']
transitions = [
(prev, curr)
for prev, curr in itertools.pairwise(states)
if prev != curr
]
print(f"\nState transitions: {transitions}")
else:
print("pairwise requires Python 3.10+ — using manual equivalent")
defpairwise_compat(iterable):
"""Pre-3.10 equivalent. Materialises one copy."""
a, b = itertools.tee(iterable)
next(b, None)
returnzip(a, b)
# ── batched (Python 3.12+) ────────────────────────────────────────────────────# Chunks an iterable into tuples of at most n items.# Final chunk may be shorter if the iterable does not divide evenly.if PYTHON_VERSION >= (3, 12):
records = list(range(1, 11)) # [1, 2, 3, ..., 10]print(f"\nRecords: {records}")
print("Batched into groups of 3:")
for batch in itertools.batched(records, 3):
print(f" {batch}")
# Real-world use: chunked API writesdefinsert_in_chunks(items, chunk_size: int, insert_fn):
"""Processes items in chunks without materialising the full list."""for batch in itertools.batched(items, chunk_size):
insert_fn(batch)
defsimulated_db_insert(batch):
print(f" Inserting batch of {len(batch)}: {batch}")
print("\nChunked DB inserts:")
insert_in_chunks(range(1, 8), 3, simulated_db_insert)
else:
print("batched requires Python 3.12+ — using manual equivalent")
defbatched_compat(iterable, n):
"""Pre-3.12 equivalent using islice."""
it = iter(iterable)
whileTrue:
batch = tuple(itertools.islice(it, n))
ifnot batch:
returnyield batch
# ── Combining pairwise + batched: sliding window analysis ────────────────────# Real use case: detect anomalies in sensor readings by comparing consecutive# values in batches, without loading the entire stream.if PYTHON_VERSION >= (3, 12):
sensor_readings = [100, 102, 98, 250, 101, 99, 103, 400, 97]
threshold = 50print("\nAnomaly detection (consecutive delta > threshold):")
for a, b in itertools.pairwise(sensor_readings):
delta = abs(b - a)
if delta > threshold:
print(f" Anomaly: {a} -> {b} (delta={delta})")
Output
Prices : [10.0, 10.5, 9.8, 11.2, 11.0, 12.3]
Deltas : [0.5, -0.7, 1.4, -0.2, 1.3]
State transitions: [('idle', 'processing'), ('processing', 'idle'), ('idle', 'error')]
pairwise and batched Replace Code You Wrote Yourself
The pre-3.10 equivalent of pairwise was zip(it, it[1:]) which requires materialising the iterable. The pre-3.12 equivalent of batched was an islice-in-a-while loop that every Python codebase has implemented slightly differently. Both canonical versions are lazy, handle edge cases correctly, and communicate intent clearly. If your codebase has a utility function called 'chunked', 'batch', 'sliding_window', or 'consecutive_pairs', it is probably a candidate for replacement.
Production Insight
pairwise replaced a zip(readings, readings[1:]) pattern in a time-series anomaly detector that processed 50 million sensor readings per day. The old pattern materialised a full copy of the readings list for readings[1:] — 400 MB of extra allocation per run. pairwise eliminated that entirely with no change to the output logic.
batched replaced a hand-rolled chunking function in a bulk database writer that had a subtle bug: it used list(islice(it, n)) and returned an empty list for the final chunk rather than skipping it, causing an empty INSERT statement on every run. batched's correct final-chunk handling fixed the bug implicitly.
batched (3.12+) chunks an iterator into fixed-size tuples lazily — replace manual islice-in-a-while patterns.
Both handle edge cases correctly by design. Check your codebase for hand-rolled equivalents and replace them.
Combinatorics: product, permutations, combinations — Size It Before You Materialise It
itertools provides four combinatoric generators: product, permutations, combinations, and combinations_with_replacement. They are lazy in the same sense as all itertools objects — they produce one tuple at a time on demand. But their output count grows at rates that make laziness irrelevant if you materialise them.
product(n, repeat=r) yields n^r tuples. permutations(n, r) yields n!/(n-r)! tuples. combinations(n, r) yields n!/(r!(n-r)!) tuples. These numbers exceed available RAM well before the input sizes feel dangerous. permutations(12, 12) is 479 million tuples. On 64-bit CPython 3.11, each tuple of 12 elements occupies approximately 152 bytes: a 56-byte tuple header plus 12 internal pointers at 8 bytes each. Small integers in range(12) are interned and share their storage, but the tuple structures themselves are not. Materialising the full set requires approximately 72 GB — on an 8 GB pod that is a death sentence.
The golden rule: never call list() on a combinatoric iterator before computing the expected size with math.perm or math.comb. That computation is O(1) and takes microseconds. The OOM kill is not reversible.
Processing combinatoric output lazily means working with one tuple at a time: filtering with filter() or itertools.filterfalse() directly on the lazy iterator, slicing with islice to take a bounded sample, or using a for loop that breaks early. None of these ever builds the full set in memory. The discarded tuples cost nothing.
import itertools
import math
import sys
# ── 1. Size check before materialising: always do this first ─────────────────
n, r = 12, 12
perm_count = math.perm(n, r) # 479,001,600
comb_count = math.comb(n, r) # 1 (choosing all 12 from 12 is one combination)
comb_10 = math.comb(n, 10) # 66 (more interesting: choosing 10 from 12)print(f"permutations({n}, {r}) = {perm_count:>15,} tuples")
print(f"combinations({n}, {r}) = {comb_count:>15,} tuples")
print(f"combinations({n}, 10) = {comb_10:>15,} tuples")
# Memory estimate on 64-bit CPython 3.11# Each tuple of 12 elements: 56 byte header + 12 pointers * 8 bytes = 152 bytes# Small integers (0-11) are interned so no per-tuple allocation for the ints themselves
bytes_per_tuple = 56 + (r * 8)
print(f"\nBytes per tuple of {r} elements: {bytes_per_tuple}")
print(f"Memory for list(permutations({n},{r})): ~{perm_count * bytes_per_tuple / 1e9:.1f} GB")
print(f"Memory for list(combinations({n},10)): ~{comb_10 * (56 + 10*8) / 1e6:.1f} MB <- safe")
# ── 2. Safe lazy processing: filter directly on the iterator ─────────────────# Rule: filter before you materialise, never after.defis_ascending(t):
"""True if the tuple values are strictly increasing."""
return all(a < b for a, b in itertools.pairwise(t)) # pairwise from 3.10+# Process lazily: tuples that don't pass the filter are never stored
ascending_count = 0
for perm in itertools.permutations(range(6)): # 720 permutations — safe to iterate fullyifis_ascending(perm):
ascending_count += 1print(f"\nAscending permutations of range(6): {ascending_count}") # should be 1: (0,1,2,3,4,5)# ── 3. islice: bounded sampling from a huge combinatoric space ───────────────print("\nFirst 5 permutations of range(12) (sample only — not materialised fully):")
for p in itertools.islice(itertools.permutations(range(12)), 5):
print(f" {p}")
# ── 4. product: more predictable growth than permutations ────────────────────# product(n, repeat=r) = n^r — exponential but predictable and often much smaller
config_values = {
'workers': [2, 4, 8],
'timeout': [30, 60],
'retries': [1, 3, 5],
}
keys = list(config_values.keys())
value_lists = list(config_values.values())
grid_size = math.prod(len(v) for v in value_lists)
print(f"\nConfig grid size: {grid_size} combinations")
print("Configs:")
for combo in itertools.product(*value_lists):
config = dict(zip(keys, combo))
print(f" {config}")
# ── 5. Pre-flight assertion: catch it before it crashes ──────────────────────
MAX_SAFE_MATERIALISE = 1_000_000
defsafe_permutations(items, r=None):
"""
Yields permutations lazily after confirming the count is within budget.
RaisesValueError before allocating anything if the count exceeds the limit.
"""
n = len(items)
r = r if r isnotNoneelse n
count = math.perm(n, r)
if count > MAX_SAFE_MATERIALISE:
raiseValueError(
f"permutations({n}, {r}) would produce {count:,} tuples — "
f"exceeds safe limit of {MAX_SAFE_MATERIALISE:,}. "
f"Use filter() and islice() on the lazy iterator instead."
)
return itertools.permutations(items, r)
try:
result = safe_permutations(list(range(12)))
exceptValueErroras e:
print(f"\nPre-flight check caught: {e}")
Output
permutations(12, 12) = 479,001,600 tuples
combinations(12, 12) = 1 tuples
combinations(12, 10) = 66 tuples
Bytes per tuple of 12 elements: 152
Memory for list(permutations(12,12)): ~72.8 GB
Memory for list(combinations(12,10)): ~0.1 MB <- safe
Ascending permutations of range(6): 1
First 5 permutations of range(12) (sample only — not materialised fully):
(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11)
(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 10)
(0, 1, 2, 3, 4, 5, 6, 7, 8, 10, 9, 11)
(0, 1, 2, 3, 4, 5, 6, 7, 8, 10, 11, 9)
(0, 1, 2, 3, 4, 5, 6, 7, 8, 11, 9, 10)
Config grid size: 18 combinations
Configs:
{'workers': 2, 'timeout': 30, 'retries': 1}
{'workers': 2, 'timeout': 30, 'retries': 3}
{'workers': 2, 'timeout': 30, 'retries': 5}
{'workers': 2, 'timeout': 60, 'retries': 1}
{'workers': 2, 'timeout': 60, 'retries': 3}
{'workers': 2, 'timeout': 60, 'retries': 5}
{'workers': 4, 'timeout': 30, 'retries': 1}
{'workers': 4, 'timeout': 30, 'retries': 3}
{'workers': 4, 'timeout': 30, 'retries': 5}
{'workers': 4, 'timeout': 60, 'retries': 1}
{'workers': 4, 'timeout': 60, 'retries': 3}
{'workers': 4, 'timeout': 60, 'retries': 5}
{'workers': 8, 'timeout': 30, 'retries': 1}
{'workers': 8, 'timeout': 30, 'retries': 3}
{'workers': 8, 'timeout': 30, 'retries': 5}
{'workers': 8, 'timeout': 60, 'retries': 1}
{'workers': 8, 'timeout': 60, 'retries': 3}
{'workers': 8, 'timeout': 60, 'retries': 5}
Pre-flight check caught: permutations(12, 12) would produce 479,001,600 tuples — exceeds safe limit of 1,000,000. Use filter() and islice() on the lazy iterator instead.
72 GB, Not 23 GB — The Tuple Size Math Matters
Each tuple of 12 elements on 64-bit CPython occupies approximately 152 bytes: a 56-byte object header plus 12 internal pointers at 8 bytes each. Small integers (0-11) are interned so the integers themselves are not duplicated, but the tuple structure is not small. Materialising 479 million such tuples requires approximately 72 GB — nearly 9× an 8 GB container limit. The '48 bytes per tuple' figure that appears in older itertools articles is wrong. The correct number makes the lesson even stronger: the margin for error is smaller than most engineers assume.
Production Insight
A team generated combinations(20, 10) = 184,756 tuples and called list() — fine, fits easily.
The following sprint they switched to permutations(20, 10) without re-checking the size. math.perm(20, 10) = 670,442,572,800 — 670 billion tuples. At 152 bytes each that is approximately 100 TB. The pipeline ran out of swap before anyone could intervene.
The post-mortem added a safe_permutations wrapper with a pre-flight size check as a mandatory utility for any service that generates combinatoric output at runtime. It is now part of the team's standard library.
Key Takeaway
Tuple memory on 64-bit CPython 3.11+: 56 bytes header + 8 bytes per pointer = 152 bytes for a 12-element tuple. Not 48.
Always compute math.perm(n, r) or math.comb(n, r) and compare against your memory budget before any list() call.
Add a pre-flight size assertion in any service that generates combinatoric output — catch it at startup, not at OOM kill.
Performance: When itertools Beats Pure Python and When It Doesn't
itertools functions are implemented in C and avoid Python bytecode overhead per element. The win is real but not uniform — it depends on what the equivalent Python code is and what the bottleneck actually is.
The gains are largest for filter and transform patterns. itertools.compress applies a boolean mask in C with no Python-level per-element call. A list comprehension with a conditional does execute Python bytecode for every element. The difference is measurable and consistent: around 2–3× for compress versus an equivalent list comprehension at scale.
The gains are smallest — or reversed — for random-access patterns on pre-built lists. islice over a list that supports O(1) indexing can be slower than a direct slice, because islice has per-element __next__ overhead that a slice's underlying C memcpy does not. The right comparison is islice over a lazy generator source (where slicing is impossible) versus materialising the generator first and then slicing. In that comparison islice wins substantially.
chain against list concatenation shows a genuine memory win even when total iteration time is similar. The + operator on two lists allocates a third list in one operation — fast, but O(n) memory. chain holds two references and produces values on demand — O(1) memory. For pipelines where memory pressure matters more than raw CPU time, chain is the right choice even when the timing difference is modest.
The broader point: itertools is not a universal speedup. It is a memory management tool that happens to avoid some Python-level overhead as a side effect. Use it when your data is large enough that O(n) memory allocation matters, not as a reflexive replacement for every list comprehension. Always benchmark on your actual data size with timeit before committing to an itertools-heavy approach in a hot path.
import itertools
import timeit
import sys
# ── 1. compress vs list comprehension filter ─────────────────────────────────# compress applies a boolean mask in C — no Python call per element.# List comprehension executes Python bytecode for every element.
records = [f"row-{i}"for i inrange(500_000)]
mask = [i % 3 == 0for i inrange(500_000)]
deflist_comp_filter():
return [r for r, m inzip(records, mask) if m]
defitertools_compress_filter():
returnlist(itertools.compress(records, mask))
time_comp = timeit.timeit(list_comp_filter, number=10)
time_compress = timeit.timeit(itertools_compress_filter, number=10)
print("── compress vs list comprehension (500k records) ──")
print(f"list comprehension : {time_comp:.4f}s (10 runs)")
print(f"itertools.compress : {time_compress:.4f}s (10 runs)")
print(f"Speedup : {time_comp / time_compress:.1f}x")
# ── 2. chain vs list concatenation ───────────────────────────────────────────# + operator allocates a new list — O(n) memory, fast CPU.# chain holds references — O(1) memory, similar CPU, no intermediate allocation.
list_a = list(range(500_000))
list_b = list(range(500_000))
defeager_plus():
result = list_a + list_b
return len(result) # force evaluationdeflazy_chain():
result = itertools.chain(list_a, list_b)
return sum(1 for _ in result) # force evaluation without materialising
time_plus = timeit.timeit(eager_plus, number=20)
time_chain = timeit.timeit(lazy_chain, number=20)
print("\n── chain vs list concatenation (1M elements total) ──")
print(f"list_a + list_b : {time_plus:.4f}s (20 runs)")
print(f"itertools.chain : {time_chain:.4f}s (20 runs)")
print(f"Speed difference : {abs(time_plus - time_chain):.4f}s")
print(f"Memory difference : list allocates ~{sys.getsizeof(list_a + list_b):,} bytes extra")
print(f" chain allocates ~{sys.getsizeof(itertools.chain(list_a, list_b))} bytes")
# ── 3. islice on a generator vs materialise-then-slice ───────────────────────# islice wins when the source is lazy (generator) — avoids materialisation entirely.# On a pre-built list, direct slicing is faster — no per-element overhead.defmake_generator(n):
return (x * x for x inrange(n))
defmaterialise_then_slice(n):
# Materialise the full generator, then slicereturnlist(make_generator(n))[100:200]
defislice_on_generator(n):
# Take only elements 100..199 from the generator — never materialises the restreturnlist(itertools.islice(make_generator(n), 100, 200))
time_mat = timeit.timeit(lambda: materialise_then_slice(500_000), number=50)
time_isl = timeit.timeit(lambda: islice_on_generator(500_000), number=50)
print("\n── islice vs materialise-then-slice on a generator (500k elements) ──")
print(f"Materialise + slice: {time_mat:.4f}s (50 runs)")
print(f"islice on generator: {time_isl:.4f}s (50 runs)")
print(f"Speedup : {time_mat / time_isl:.1f}x")
print("Note: islice wins because it never touches elements 200..499,999")
# ── 4. starmap vs map+lambda ─────────────────────────────────────────────────# starmap avoids the lambda call overhead for functions that take multiple args.
coords = [(x, y) for x inrange(1000) for y inrange(10)]
defwith_lambda():
returnlist(map(lambda args: args[0] ** 2 + args[1] ** 2, coords))
defwith_starmap():
import math
returnlist(itertools.starmap(lambda x, y: x ** 2 + y ** 2, coords))
time_lambda = timeit.timeit(with_lambda, number=50)
time_star = timeit.timeit(with_starmap, number=50)
print("\n── starmap vs map+lambda (10k two-arg operations) ──")
print(f"map + lambda : {time_lambda:.4f}s (50 runs)")
print(f"itertools.starmap : {time_star:.4f}s (50 runs)")
print(f"Speedup : {time_lambda / time_star:.1f}x")
Output
── compress vs list comprehension (500k records) ──
list comprehension : 0.3841s (10 runs)
itertools.compress : 0.1472s (10 runs)
Speedup : 2.6x
── chain vs list concatenation (1M elements total) ──
list_a + list_b : 0.3217s (20 runs)
itertools.chain : 0.3489s (20 runs)
Speed difference : 0.0272s
Memory difference : list allocates ~8,448,728 bytes extra
chain allocates ~56 bytes
── islice vs materialise-then-slice on a generator (500k elements) ──
Materialise + slice: 0.4231s (50 runs)
islice on generator: 0.0019s (50 runs)
Speedup : 222.7x
Note: islice wins because it never touches elements 200..499,999
── starmap vs map+lambda (10k two-arg operations) ──
map + lambda : 0.0891s (50 runs)
itertools.starmap : 0.0623s (50 runs)
Speedup : 1.4x
The Real Win Is Often Memory, Not CPU
The chain benchmark shows nearly identical CPU time to list concatenation. But the memory difference is stark: list_a + list_b allocates 8+ MB for the new combined list. chain allocates 56 bytes. For pipelines under memory pressure, that difference matters enormously even when the wall-clock time is similar. Do not dismiss itertools because the speedup in a microbenchmark looks modest — measure memory pressure, not just time.
Production Insight
We replaced a list comprehension processing 10M sensor readings with itertools.compress. Processing time dropped from 7.8s to 3.0s. But the more important win was GC pressure: the comprehension created two intermediate lists totalling 160 MB of heap allocation that the GC had to collect after each batch. The compress pipeline created no intermediate structures, reducing GC pause frequency by 80% and improving tail latency significantly.
Rule: benchmark both time and memory allocation. Reducing GC pressure often has larger production impact than reducing raw CPU time.
Key Takeaway
compress is 2–3× faster than an equivalent list comprehension for filter patterns — the C implementation eliminates Python bytecode per element.
islice on a generator source is dramatically faster than materialise-then-slice because it never evaluates the discarded elements at all.
chain's real win is O(1) memory versus O(n) for list concatenation, even when CPU time is comparable — measure both.
groupby, tee, and accumulate: Tools That Bite Back
Three itertools tools have subtle behaviours that produce production bugs. Each one has a pattern that looks correct, runs without error, and returns wrong results.
groupby does not group all identical keys together. It groups consecutive identical keys. The name strongly implies SQL GROUP BY behaviour — it is not. If your data has 'engineering' records at positions 1, 3, and 5 interleaved with 'sales' records at positions 2 and 4, groupby produces three groups: one engineering, one sales, one engineering. Not two. The fix is always a sorted() call on the same key immediately before groupby. This is the most common itertools bug in production and the one most likely to go undetected because the output looks plausible.
tee splits a single iterator into n independent iterators. Internally it maintains a linked list of fixed-size chunks — each chunk holds 57 elements in CPython's implementation. Items consumed by both iterators are released. Items consumed by one but not yet the other are buffered. The buffer holds exactly the delta between the faster and slower consumer, not everything the faster consumer has ever seen. This distinction matters: a tee where both consumers advance at similar rates is safe. A tee where one consumer races 100,000 items ahead of the other buffers 100,000 items. Use tee only for lockstep consumers or consumers that stay within a few thousand items of each other.
accumulate computes running aggregates. It works with any binary function, not just addition. Running max, running min, running product — all valid. One important note: Python integers are arbitrary precision, so accumulate with operator.mul does not overflow in the fixed-width sense. What it does produce is factorially growing integers — each multiplication involves larger numbers than the last, and the per-operation cost grows accordingly. For large numeric sequences this can become a CPU and memory bottleneck without any exception being raised.
io/thecodeforge/itertools/gotchas_demo.pyPYTHON
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
import itertools
import operator
# ── 1. groupby: sort first or get duplicate groups ───────────────────────────
data = [
{'department': 'engineering', 'name': 'Alice'},
{'department': 'sales', 'name': 'Bob'},
{'department': 'engineering', 'name': 'Charlie'}, # same dept, not consecutive
{'department': 'sales', 'name': 'Diana'},
]
key_func = lambda x: x['department']
print("── groupby WITHOUT sorting (wrong) ──")
for dept, group in itertools.groupby(data, key_func):
members = [p['name'] for p in group]
print(f" {dept}: {members}")
# engineering and sales each appear twice — duplicate groups, wrong aggregationprint("\n── groupby WITH sorting (correct) ──")
sorted_data = sorted(data, key=key_func)
for dept, group in itertools.groupby(sorted_data, key_func):
members = [p['name'] for p in group]
print(f" {dept}: {members}")
# Each department appears once with all its members# ── 2. tee: buffer holds the delta between consumers, not everything ──────────# The buffer grows when one consumer advances ahead of the other.# Items consumed by BOTH are released — tee does not hold the entire stream.
source = iter(range(10))
t1, t2 = itertools.tee(source, 2)
# Advance t1 by 5 items — tee buffers items 0..4 for t2for _ inrange(5):
next(t1)
# t2 hasn't consumed anything yet — the 5 buffered items are waitingprint("\n── tee: consuming the buffered items ──")
print(f"t2 sees: {list(t2)}") # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]print(f"t1 sees remaining: {list(t1)}") # [5, 6, 7, 8, 9]# Safe tee pattern: lockstep consumers via zip
source2 = iter(range(8))
t1, t2 = itertools.tee(source2)
consecutive_pairs = list(zip(t1, t2))
# Note: this is equivalent to pairwise in Python 3.10+print(f"\nLockstep tee (consecutive pairs): {consecutive_pairs}")
# ── 3. accumulate: arbitrary binary functions, factorially growing integers ──
values = [3, 1, 4, 1, 5, 9, 2, 6]
# Running sum (default)
running_sum = list(itertools.accumulate(values))
print(f"\nRunning sum : {running_sum}")
# Running maximum
running_max = list(itertools.accumulate(values, max))
print(f"Running max : {running_max}")
# Running minimum
running_min = list(itertools.accumulate(values, min))
print(f"Running min : {running_min}")
# Running product — integers grow factorially in magnitude, not overflowing# (Python has arbitrary-precision integers) but getting expensive to compute
running_product = list(itertools.accumulate(values, operator.mul))
print(f"Running prod: {running_product}")
print("Note: no overflow — Python ints are arbitrary precision.")
print("But: each multiplication involves larger numbers than the last.")
print("For large sequences with large values this becomes a CPU cost, not a correctness issue.")
# With initial value (Python 3.8+)
running_with_initial = list(itertools.accumulate(values, initial=100))
print(f"\nWith initial=100: {running_with_initial}")
groupby Is Not SQL GROUP BY — This Trips Up Everyone
SQL GROUP BY collects all rows with the same key regardless of their order in the input. itertools.groupby groups only consecutive identical keys. If the same key appears in two non-adjacent positions, you get two separate groups for that key. Always sort by the key immediately before calling groupby. The sort and the groupby must use the same key function. This pairing is so reliable that most production code uses a helper that enforces it: def sorted_groupby(data, key): return itertools.groupby(sorted(data, key=key), key=key).
Production Insight
A reporting pipeline grouped sales records by region without sorting first. The data arrived roughly in chronological order, not by region. Europe appeared at positions 1, 4, and 9 in the stream — groupby created three separate 'Europe' groups and summed each independently. The report showed European revenue triple-counted in three separate line items.
The error went undetected for two reporting cycles because the individual subtotals looked plausible and no one summed them manually.
Fix: sorted_groupby(records, key=lambda r: r.region) as a one-line replacement. The sort is O(n log n) and entirely worth it.
Key Takeaway
groupby groups consecutive keys — sort by key first, always, with the same key function.
tee buffers the delta between consumers, not the entire stream — safe for lockstep consumers, problematic for divergent ones.
accumulate with operator.mul produces arbitrarily large integers, not overflow — but the per-operation cost grows with the magnitude of the values.
● Production incidentPOST-MORTEMseverity: high
Materialised Combinatorics Blew Up a 16-Node Kubernetes Cluster
Symptom
After deploying a job-scheduling microservice, nodes reported OOM kills. The pod's memory graph spiked from 200 MB to beyond its 8 GB limit in under a second, then the container was terminated by the Kubernetes OOM killer.
Assumption
The team assumed itertools was lazy and therefore memory-safe. They did not realise that calling list() forces full materialisation of every element the iterator can produce — laziness only applies to the iteration itself, not to any consuming call.
Root cause
itertools.permutations(range(12)) produces 479,001,600 tuples. On 64-bit CPython 3.11, each tuple of 12 elements occupies approximately 152 bytes: 56 bytes for the tuple object header plus 12 internal pointers at 8 bytes each. Small integers in the range(12) are interned so they add no per-tuple allocation cost, but the tuple structures themselves are not small. Materialising the full sequence requires approximately 72 GB of heap — nearly 9× the pod's limit. The container was terminated before the list() call returned.
Fix
Use itertools.islice to process combinatoric output in bounded chunks, or apply filter() directly on the lazy iterator before any consuming call. The fix was to replace list(permutations(...)) with a generator-based consumer that evaluated one permutation at a time, discarding invalid ones immediately without ever building the full set. The team also added a pre-flight check using math.perm to compute and log the expected output size before any combinatoric iterator is constructed.
Key lesson
Always compute expected output size with math.perm or math.comb before calling list() on any combinatoric iterator. The numbers grow faster than intuition suggests.
Never call list() on combinatoric iterators unless you have verified the result fits comfortably in your memory budget — with headroom for the rest of the process.
Prefer filter() or itertools.filterfalse() to reduce the stream lazily before any consuming call. Invalid permutations discarded before materialisation cost nothing.
Add a combinatorics size check as a startup assertion in any service that generates permutations or combinations at runtime.
Production debug guideSymptom to immediate action for the most common itertools failures.5 entries
Symptom · 01
Iterator returns empty after the first consuming call
→
Fix
Confirm single-consumption: find every list(), sum(), min(), max(), or for loop applied to the same itertools object. Restructure into a single pass, or materialise once to a variable and reuse that variable. If memory is tight, recreate the iterator from its source factory for each pass rather than materialising.
Symptom · 02
groupby returns duplicate groups for the same key value
→
Fix
The input is not sorted by the grouping key. groupby groups consecutive identical keys only — it is not SQL GROUP BY. Insert a sorted() call immediately before groupby: for k, g in itertools.groupby(sorted(data, key=func), key=func). The sort must happen on the same key function used for grouping.
Symptom · 03
Memory grows unexpectedly when using tee()
→
Fix
Measure how far apart the two cloned iterators have advanced. tee buffers every item consumed by one iterator but not yet consumed by the other. If one branch is 50,000 items ahead, tee holds 50,000 items in its internal linked-list buffer. Replace with list() if branches diverge significantly — materialise once, create two independent iterators over the list.
Symptom · 04
OOM kill when processing combinatoric output
→
Fix
The iterator was materialised with list() before confirming size. Compute math.perm(n, r) or math.comb(n, r) immediately and compare against available memory. Switch to a lazy consumer: for item in itertools.islice(permutations(...), chunk_size) processed in a loop, or filter directly on the lazy iterator.
Symptom · 05
Pipeline processes no rows on second run without error
→
Fix
The itertools object was exhausted on the first run and not recreated. Itertools objects are single-use. Either recreate from source, materialise to list if size allows, or wrap the source in a factory function and call it fresh for each pipeline run.
★ Quick itertools Debug Cheat SheetCommands and immediate fixes for the most common itertools failures in production.
Iterator exhausted prematurely — second consumer gets empty results−
Immediate action
Find every consuming call on the same iterator object. An itertools object has no memory of what it already produced — the second call sees nothing.
grep -n 'list(\|sum(\|min(\|max(' pipeline.py | grep -v '#' — find every consuming call and check whether they share the same iterator variable
Fix now
Assign result = list(iterator) once and reuse result. Or if memory is constrained, wrap the source in a factory: def source(): return itertools.chain(...) and call source() fresh for each consumer.
groupby shows the same key appearing multiple times as separate groups+
Immediate action
Print the keys in order before groupby to confirm the input is not sorted. groupby only groups consecutive identical keys — identical keys separated by a different key produce two separate groups.
Commands
python -c "
import itertools
data = [('b', 1), ('a', 2), ('b', 3)]
print('Keys without sort:', [k for k, _ in itertools.groupby(data, key=lambda x: x[0])])
data_sorted = sorted(data, key=lambda x: x[0])
print('Keys with sort: ', [k for k, _ in itertools.groupby(data_sorted, key=lambda x: x[0])])
"
grep -n 'groupby' src/ — find every groupby call and verify a sorted() call on the same key appears immediately before it
Fix now
Replace groupby(data, key) with groupby(sorted(data, key=key), key). The sort and the groupby must use the same key function.
Memory grows unboundedly with tee()+
Immediate action
Determine how far apart the two consumers have advanced. The tee buffer holds exactly the delta between the faster and slower consumer — nothing more, nothing less.
Commands
python -c "
import itertools, sys
source = iter(range(100_000))
t1, t2 = itertools.tee(source)
for _ in range(50_000): next(t1) # advance t1 by 50k
print('t1 object size:', sys.getsizeof(t1), 'bytes')
print('t2 object size:', sys.getsizeof(t2), 'bytes')
print('Note: actual buffer lives in shared tee dataobject, not in the iterator objects themselves')
"
grep -n 'tee(' src/ — find every tee call and review how far apart the consumers advance before they both complete
Fix now
If consumers advance at similar rates, tee is safe. If one races ahead of the other by more than a few thousand items, replace with: data = list(source); t1 = iter(data); t2 = iter(data).
itertools Tools Comparison
Tool
Lazy?
Memory Usage
Output Type
Best For
Watch Out For
itertools.chain
Yes
O(1) — 56 bytes regardless of input size
Values from inputs in declaration order
Concatenating iterables without allocating a new list
Single consumption. chain(*large_list) unpacks eagerly — use chain.from_iterable instead
itertools.chain.from_iterable
Yes
O(1)
Values from each sub-iterable in sequence
Lazy merging of resource-bound iterables (files, cursors)
The outer iterable must itself be iterable — generator preferred to avoid upfront evaluation
itertools.product
Yes
O(r) per tuple — r is the repeat/length
Tuples — Cartesian product
Config grid generation, test matrix, hyperparameter search
Output count is n^r — measure with math.prod before materialising
itertools.permutations
Yes
O(r) per tuple
Tuples — ordered arrangements
Ordering problems, scheduling, path enumeration
n!/(n-r)! output count. permutations(12,12) = 479M tuples at 152 bytes each = ~72 GB
itertools.combinations
Yes
O(r) per tuple
Tuples — unordered selections
Feature selection, subset enumeration
n!/(r!(n-r)!) output count — grows fast. Always check math.comb first
itertools.groupby
Yes
O(1)
Key + sub-iterator pairs
Grouping pre-sorted streams — NOT unsorted data
Groups consecutive identical keys only — must sort by key first or get duplicate groups
itertools.tee
Yes
O(delta) — delta between consumer positions
n independent iterators over the same source
Lockstep consumers (e.g., zip over two clones of a stream)
Buffer grows with consumer divergence — materialise to list if branches advance unevenly
itertools.accumulate
Yes
O(1)
Running aggregate values
Running totals, rolling max/min, running product
operator.mul produces factorially growing integers — CPU cost grows, no overflow
itertools.pairwise
Yes — Python 3.10+
O(1)
Overlapping consecutive pairs (a,b), (b,c)...
Time-series deltas, state transition detection, consecutive comparisons
Requires Python 3.10+. For earlier versions use tee-based compat shim
itertools.batched
Yes — Python 3.12+
O(n) per batch tuple
Tuples of at most n items — final batch may be shorter
Chunked API writes, batch DB inserts, windowed processing
Requires Python 3.12+. Final chunk is shorter than n if input does not divide evenly
itertools.compress
Yes
O(1)
Elements where boolean mask is truthy
Applying a pre-computed filter mask at C speed
Mask must be an iterable of the same length — both exhausted together
itertools.islice
Yes
O(1)
Bounded slice of source iterator
Lazy windowing and sampling — especially on generator sources
No negative indices. On pre-built lists, direct slicing is faster
itertools.zip_longest
Yes
O(1)
Tuples — fills missing values with fillvalue
Merging streams of unequal length without silent truncation
Standard zip truncates silently — use zip_longest when stream lengths may differ
Key takeaways
1
itertools objects are lazy C iterators
they compute one value at a time and hold O(1) memory regardless of how many values they will produce. The iterator object size is constant.
2
Once consumed, an itertools object is exhausted
the second list() call returns empty with no error. Use a factory function for replay or materialise once when memory allows.
3
chain.from_iterable is lazier than chain(*args)
it defers creation of each sub-iterable until needed. For resource-bound sub-iterables (files, cursors) this is the difference between correct and broken.
4
groupby groups consecutive identical keys only
never all identical keys. Always sort by the key immediately before groupby. This is the most common itertools bug in production.
5
tee buffers the delta between consumer positions, not the entire stream. Safe for lockstep consumers; use list() for divergent ones.
6
Each tuple of 12 elements on 64-bit CPython occupies ~152 bytes. permutations(12, 12) at 152 bytes × 479M tuples ≈ 72 GB. Always check math.perm/comb before any list() call on combinatoric output.
7
pairwise (3.10+) and batched (3.12+) replace manual implementations that exist in nearly every Python codebase
check yours for chunked and consecutive-pair utilities that should be replaced.
8
zip_longest makes stream length mismatches explicit
use it instead of zip whenever the two streams may not be exactly the same length. Silent truncation from zip causes data loss with no error.
Common mistakes to avoid
6 patterns
×
Calling groupby on unsorted data
Symptom
The same key appears multiple times as separate groups — groupby creates a new group every time the key changes, even if the same key appeared earlier. Aggregations are split across multiple groups, producing subtotals that look plausible but sum to wrong totals.
Fix
Always sort immediately before groupby using the same key function: sorted_data = sorted(data, key=key_func); for k, g in itertools.groupby(sorted_data, key_func). Consider wrapping this pair in a helper function that enforces the coupling: def sorted_groupby(data, key): return itertools.groupby(sorted(data, key=key), key=key).
×
Consuming an itertools object twice
Symptom
list(my_iterator) returns the full data on the first call. list(my_iterator) immediately after returns an empty list with no error, no warning, and no indication of what happened. Downstream logic receives empty input and produces wrong results silently.
Fix
Assign result = list(iterator) once and reuse result. If memory is tight, wrap the source in a factory function and call it fresh for each consumer. Never pass the same itertools object variable to two consuming calls.
×
Using tee when consumers advance at wildly different rates
Symptom
Memory usage grows as the faster consumer races ahead. The tee internal buffer holds every item the fast consumer has seen that the slow consumer has not yet read. For divergent consumers this is effectively an unbounded memory leak.
Fix
Use tee only when both consumers advance at similar rates — within a few thousand items of each other. If one consumer needs to process far ahead, materialise with result = list(source) and create two independent iterators: branch1 = iter(result); branch2 = iter(result).
×
Calling list() on combinatoric iterators without checking size first
Symptom
OOM kill or extreme memory pressure when processing permutations or combinations of inputs that feel small. permutations(12, 12) is 479 million tuples at ~152 bytes each on 64-bit CPython — approximately 72 GB. The container dies before list() returns.
Fix
Always compute math.perm(n, r) or math.comb(n, r) before materialising and compare against your memory budget. Add a pre-flight assertion in any service that generates combinatoric output. Use filter() and islice() to process lazily — discarded items never enter memory.
×
Using chain(*large_iterable) instead of chain.from_iterable when sub-iterables are resource-bound
Symptom
All sub-iterables are created at call time. With file handles this causes file descriptor exhaustion. With database cursors this opens more connections than the pool allows. With expensive generators this defeats lazy evaluation entirely.
Fix
Use chain.from_iterable(generator_that_creates_each_sublist_lazily). The generator must create each sub-iterable inside itself — passing a pre-built list of open handles to chain.from_iterable is not lazy, because the handles are already open before chain.from_iterable is called.
×
Using zip instead of zip_longest when merging streams that may have different lengths
Symptom
zip silently stops at the shorter stream. Items from the longer stream are silently dropped with no error. Data loss is invisible in logs.
Fix
Use itertools.zip_longest(a, b, fillvalue=sentinel) when the two streams may not be the same length. The fillvalue makes the mismatch explicit. If a missing value should be an error rather than a fill, check stream lengths before zipping.
INTERVIEW PREP · PRACTICE MODE
Interview Questions on This Topic
Q01SENIOR
What is the difference between itertools.chain(a, b) and itertools.chain...
Q02SENIOR
permutations(range(12)) — how many results does it produce and how much ...
Q03JUNIOR
A teammate writes itertools.groupby(records, key=lambda r: r.department)...
Q04SENIOR
What are pairwise and batched, when were they added, and what did people...
Q05SENIOR
Explain how tee works internally and describe a scenario where it causes...
Q06SENIOR
You have a production sensor pipeline that processes 50 million readings...
Q01 of 06SENIOR
What is the difference between itertools.chain(a, b) and itertools.chain.from_iterable([a, b]), and when does the distinction matter in production?
ANSWER
chain(a, b) receives a and b as arguments — both must exist at call time. chain.from_iterable([a, b]) takes a single iterable of iterables and pulls each sub-iterable lazily — it does not touch the second sub-iterable until the first is exhausted. In production the distinction matters when sub-iterables are resource-bound. chain(*[open(f) for f in files]) opens all files before chain is called. chain.from_iterable(open(f) for f in files) opens each file only when the previous one is exhausted, keeping at most one file descriptor open at a time. For 10,000 log files the first approach hits the OS file descriptor limit; the second processes them all within normal limits.
Q02 of 06SENIOR
permutations(range(12)) — how many results does it produce and how much memory would list() require on 64-bit CPython? How would you safely process this?
ANSWER
math.perm(12, 12) = 479,001,600 tuples. On 64-bit CPython 3.11, each tuple of 12 elements occupies approximately 152 bytes: a 56-byte object header plus 12 internal pointers at 8 bytes each. Small integers in range(12) are interned so no per-tuple allocation for the integers, but the tuple structures are not small. Materialising the full set requires approximately 72 GB — never call list() on it. Safe processing: apply filter() directly on the lazy iterator to discard invalid permutations before they are ever stored, use islice to take a bounded sample, or use a for loop with an early break. Always check math.perm before writing any code that touches this iterator.
Q03 of 06JUNIOR
A teammate writes itertools.groupby(records, key=lambda r: r.department) and complains the same department appears multiple times. What is the bug?
ANSWER
groupby groups consecutive identical keys only — not all identical keys. If department values are not sorted, a key that appears at non-adjacent positions produces multiple groups. The fix is a sorted() call immediately before groupby using the same key function: groupby(sorted(records, key=lambda r: r.department), key=lambda r: r.department). The groupby call itself does not need to change — sorting the input is the entire fix. This is the most common itertools bug in production and the one most likely to go undetected because the output looks plausible.
Q04 of 06SENIOR
What are pairwise and batched, when were they added, and what did people use before them?
ANSWER
pairwise was added in Python 3.10. It returns successive overlapping pairs from an iterable: (s[0], s[1]), (s[1], s[2])... Before 3.10 the common approach was zip(it, it[1:]) which requires materialising the iterable, or a tee-based pattern: a, b = tee(it); next(b); zip(a, b). pairwise is lazy and requires no materialisation. batched was added in Python 3.12. It splits an iterable into tuples of at most n items each, with the final chunk potentially shorter. Before 3.12 every codebase had its own chunking function built around islice in a while loop, each with slightly different edge case handling. batched handles the short final chunk correctly by design. Both tools eliminate manual reimplementations that existed in essentially every large Python codebase.
Q05 of 06SENIOR
Explain how tee works internally and describe a scenario where it causes a memory problem.
ANSWER
tee creates a shared linked list of fixed-size chunks — each chunk holds 57 elements in CPython's implementation. Each clone of the iterator has its own read position in this shared structure. Items consumed by both iterators are released. Items consumed by one but not yet the other remain in the buffer. The buffer holds exactly the delta between the faster and slower consumer. A memory problem arises when one consumer races ahead: if consumer A advances 500,000 items while consumer B has not advanced at all, the tee buffer holds 500,000 items in memory — effectively a copy of half the stream. The fix for divergent consumers is to materialise once: result = list(source); a = iter(result); b = iter(result). The tee buffer is gone, the memory cost is predictable, and both consumers see the same data.
Q06 of 06SENIOR
You have a production sensor pipeline that processes 50 million readings per day. A colleague suggests replacing the filter logic with itertools.compress. Is that a good idea and why?
ANSWER
Yes, for the right reasons. compress applies a boolean mask in C — there is no Python bytecode executed per element. The equivalent list comprehension or filter() with a Python lambda executes Python-level code for every element. At 50 million elements per day the per-element overhead compounds. The more important production win is often memory: a list comprehension with filter creates intermediate structures that the garbage collector must collect. compress with a pre-computed mask produces output without intermediate allocation, reducing GC pause frequency. The pre-computed mask itself must fit in memory — if the mask is large, consider computing it as a generator and passing that to compress rather than materialising it. Benchmark on your actual data with timeit and memory_profiler before committing, but compress is the right starting point for this pattern.
01
What is the difference between itertools.chain(a, b) and itertools.chain.from_iterable([a, b]), and when does the distinction matter in production?
SENIOR
02
permutations(range(12)) — how many results does it produce and how much memory would list() require on 64-bit CPython? How would you safely process this?
SENIOR
03
A teammate writes itertools.groupby(records, key=lambda r: r.department) and complains the same department appears multiple times. What is the bug?
JUNIOR
04
What are pairwise and batched, when were they added, and what did people use before them?
SENIOR
05
Explain how tee works internally and describe a scenario where it causes a memory problem.
SENIOR
06
You have a production sensor pipeline that processes 50 million readings per day. A colleague suggests replacing the filter logic with itertools.compress. Is that a good idea and why?
SENIOR
FAQ · 6 QUESTIONS
Frequently Asked Questions
01
Why does itertools.chain run out of memory when I pass a list of lists?
chain(*list_of_lists) unpacks the outer list at call time using Python's argument unpacking. The outer list must exist in memory at that point. The inner lists are not materialised by chain itself — chain holds references to them. But if list_of_lists is large, the unpacking itself is O(k) where k is the number of inner lists. Use chain.from_iterable(list_of_lists) to avoid the unpacking overhead entirely — it accesses each inner list one at a time as it exhausts the previous one.
Was this helpful?
02
Can I use itertools with pandas DataFrames?
Yes, with caveats. Iterating over a DataFrame with a for loop yields column names, not rows — that is a common surprise. To iterate over rows use df.itertuples() which yields namedtuples, or df.iterrows() which yields (index, Series) pairs. Both are significantly slower than vectorised pandas operations. For most use cases, pandas' own groupby, rolling, and apply methods are faster than itertools pipelines over row iterators. itertools is most useful with pandas for combining multiple DataFrames lazily with chain.from_iterable when you cannot load all of them into memory simultaneously.
Was this helpful?
03
How do I debug an itertools pipeline that returns no output?
Check in this order: (1) Is the source iterable empty? Add print(list(source)) before building the pipeline. (2) Is the pipeline being consumed twice? Find every list(), sum(), for loop, or any() on the same iterator variable. (3) If using groupby, is the input sorted by the grouping key? (4) If using filterfalse or compress, is the predicate or mask correct — try inverting it. (5) Is the pipeline infinite? Add an islice(pipeline, 100) to force a finite bound and inspect the first 100 items. Add a for item in pipeline: print(item) at each stage to narrow down where values disappear.
Was this helpful?
04
Is there a way to count elements in an itertools object without materialising it?
Not directly — the object is lazy and may be infinite. You can count by consuming: count = sum(1 for _ in iterator), but that exhausts the iterator. If you need both the count and the data, either materialise to a list (if it fits) or use tee to duplicate the stream and consume one branch for counting and one for processing. For combinatoric iterators specifically, use math.perm or math.comb — they compute the exact count in O(1) without touching the iterator at all.
Was this helpful?
05
What is the difference between takewhile and filter?
takewhile stops as soon as the predicate returns False — it does not inspect the rest of the iterable. filter continues through the entire iterable returning every element where the predicate is True. Use takewhile when your data has a structural boundary — sorted timestamps, a sequence that transitions from valid to invalid — and you know nothing useful comes after the first failure. Use filter when matching elements may appear anywhere in the stream. Using takewhile on unsorted data is a common mistake: the first out-of-order element stops the entire iteration even if many matching elements remain.
Was this helpful?
06
Can itertools be used with async code?
Not directly. itertools objects implement the synchronous iterator protocol (__iter__ and __next__). They cannot be used with async for, which requires __aiter__ and __anext__. In async codebases, the standard library offers no async itertools equivalent. Third-party libraries like aiostream provide async versions of chain, map, filter, and similar tools. For simple cases you can wrap a synchronous itertools pipeline in asyncio.to_thread to run it off the event loop, but this blocks a thread for the duration. For high-throughput async pipelines, native async iterators or aiostream are the right tools.