Python List Comprehensions — The 500K Email OOM Crash
Server OOM from a list comprehension building 500K emails — understand the memory trap and use for loops instead.
- List comprehensions build new lists by applying an expression to each item in an iterable, optionally filtered.
- Syntax: [expression for item in iterable if condition]. Read it left to right: "give me expression, for each item in iterable, but only if condition".
- Filtering uses a trailing if; transforming uses a ternary inside the expression. Mix them up and you'll get wrong results silently.
- Performance: ~10–35% faster than a for loop with .append(), though the gain is only noticeable for large lists. The speedup comes from avoiding method lookup overhead.
- Production trap: using a list comprehension for side effects (printing, DB writes) builds a thrown-away list and confuses the next dev. Use a plain loop instead.
Imagine you're at a fruit stall and you want to pick only the ripe apples, wash each one, and put them in a bag — all in one smooth motion. A list comprehension is exactly that: a single instruction that loops through a collection, optionally filters items, and transforms each one into a new list. Instead of writing three separate steps (loop, check, append), you describe the whole operation in plain English-like code on one line. It's not magic — it's just a more natural way to say 'give me this, from that, if this condition is true'.
Every Python program that works with data — scraping websites, processing CSVs, filtering API responses — spends a huge amount of time building new lists from old ones. How you do that has a direct impact on how readable your code is and, to a meaningful degree, how fast it runs. List comprehensions are Python's built-in answer to this everyday problem, and they're one of the first things experienced Python developers reach for when they see a for-loop that builds a list.
Before list comprehensions existed, building a filtered or transformed list meant writing a loop, declaring an empty list, and calling .append() on every iteration — four to six lines of boilerplate just to express a single idea. That ceremony buries the actual intent of the code under a pile of scaffolding. List comprehensions collapse all of that into one expression that reads almost like a sentence, making your intent immediately obvious to the next developer — or to yourself six months later.
By the end of this article you'll know exactly how list comprehensions work under the hood, when they're the right tool and when they're not, how to layer in filtering and nesting without creating unreadable one-liners, and the two or three mistakes that trip up almost every developer the first time they use them in a real project. You'll also walk away with concrete answers to the interview questions that come up every time this topic surfaces.
The Anatomy of a List Comprehension — Reading It Like a Sentence
A list comprehension has three parts, and the order they sit in the expression mirrors the order you'd describe them out loud. The structure is: [expression for item in iterable if condition]. Read it left to right and it says: 'Give me expression, for each item in iterable, but only if condition is true.' The if clause is completely optional — leave it out and every item gets transformed.
The reason the expression comes first — before the for — is that it puts the most important thing front and centre. You're telling the reader immediately what each element of the new list will look like. The loop mechanics and the filter are supporting details that follow.
Under the hood, CPython compiles a list comprehension into bytecode that's slightly faster than an equivalent for loop with .append(). That's because the loop version has to look up the .append method on the list object and call it on every single iteration, whereas the comprehension appends with a dedicated bytecode instruction (LIST_APPEND) and skips both the lookup and the call. For small lists the difference is negligible, but at tens of thousands of items it starts to matter.
One crucial mental model: a list comprehension always produces a brand-new list. It never mutates the original. If you're iterating over temperatures and writing [t * 1.8 + 32 for t in temperatures], your original temperatures list is completely untouched.
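As a minimal sketch of the anatomy described above, the snippet below builds the same list twice, once with the loop-and-append pattern and once as a comprehension, then checks that the source list is untouched. The temperatures values are made-up sample data.

```python
# Sample Celsius readings (illustrative values, not from any real dataset).
temperatures = [0, 25, 100]

# Loop-and-append version: empty list, loop, append.
fahrenheit_loop = []
for t in temperatures:
    fahrenheit_loop.append(t * 1.8 + 32)

# Comprehension version: the expression comes first, then the loop.
fahrenheit = [t * 1.8 + 32 for t in temperatures]

print(fahrenheit)
print(fahrenheit == fahrenheit_loop)   # both approaches build the same list
print(temperatures)                    # the original list is unchanged
```

Both versions do the same work; the comprehension just states the shape of each output element up front.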
Filtering with Conditions — The if Clause That Changes Everything
The if clause at the end of a comprehension is a gate: only items that pass the test make it into the output list. This is where list comprehensions really start to earn their keep in real-world code, because filtering and transforming at the same time is something you do constantly — think 'get me all active users and format their names' or 'find all log lines that contain an error and strip the timestamp'.
There's an important distinction to keep straight: the if at the end of the comprehension (after the for) is a filter — it controls which items are included. A conditional expression inside the output expression (using value_if_true if condition else value_if_false) is a transformation — it changes what an item becomes. You can use both in the same comprehension, and knowing which is which prevents a lot of confusing bugs.
For example, [score if score >= 50 else 0 for score in exam_scores] transforms every failing score to zero but keeps passing scores as-is. Compare that to [score for score in exam_scores if score >= 50] which simply drops failing scores entirely. The first changes items; the second removes them. These are fundamentally different operations, and mixing them up produces wrong results silently — Python won't complain either way.
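The difference is easiest to see side by side. This sketch runs a filter, a transform, and a combination of both over the same made-up exam_scores list. The "high"/"pass" labels and the threshold of 90 are assumptions added for illustration. The last part uses compile() to show that a conditional expression without an else is rejected before any code runs.

```python
# Illustrative scores; the pass mark of 50 follows the example in the text.
exam_scores = [72, 38, 50, 15, 91]

# Filter: the trailing `if` drops failing scores, so the list gets shorter.
passing_only = [score for score in exam_scores if score >= 50]

# Transform: the conditional expression keeps every item but rewrites failures as 0.
zeroed_failures = [score if score >= 50 else 0 for score in exam_scores]

# Both together: keep passing scores, then label each survivor.
labels = ["high" if score >= 90 else "pass" for score in exam_scores if score >= 50]

print(passing_only)      # [72, 50, 91]
print(zeroed_failures)   # [72, 0, 50, 0, 91]
print(labels)            # ['pass', 'pass', 'high']

# A conditional expression with no `else` does not compile at all.
try:
    compile("[x if x > 0 for x in numbers]", "<example>", "eval")
    missing_else_is_error = False
except SyntaxError:
    missing_else_is_error = True
print(missing_else_is_error)  # True
```

Note that the filtered list has three items while the transformed list still has five. That length difference is usually the quickest way to spot which operation you actually wrote.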
[x if x > 0 for x in numbers] is a SyntaxError, because a ternary expression always needs an else. Write [x if x > 0 else 0 for x in numbers] to transform, or [x for x in numbers if x > 0] to filter. They do different things.
Watch for [x for x in data if x] written instead of [x for x in data if x is not None]: the truthiness test also drops falsy values such as 0 and empty strings.
A trailing if excludes items; a ternary in the expression changes values.
- Filter with an if clause at the end: [x for x in iterable if condition]
- Transform with a ternary in the expression: [x if cond else y for x in iterable]
- Do both: [f(x) for x in iterable if cond]
Nested Comprehensions and Real-World Data: When One Loop Isn't Enough
Sometimes your data isn't a flat list — it's a list of lists. Think a spreadsheet (rows of rows), a game board, or a JSON response that returns a list of orders, each containing a list of items. Nested list comprehensions let you flatten or transform these structures without resorting to nested loops that take up half a screen.
The mental model for reading nested comprehensions is the same 'left to right' trick, but applied twice. In [cell for row in grid for cell in row], you read it as: 'for each row in grid, for each cell in that row, give me cell.' The outermost loop always comes first after the expression.
That said, nesting deeper than two levels is almost always a code smell. If you find yourself writing three for clauses inside one comprehension, stop and ask whether a helper function or a regular loop would be clearer. Readability is the whole point — a comprehension that requires five minutes to decode has failed at its one job.
A genuinely common real-world use case is flattening API response data: an endpoint returns paginated results as a list of pages, each page containing a list of records, and you need one flat list to work with. A two-level comprehension handles this in one expressive line.
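Here is a rough sketch of that flattening case. The pages structure and its "id" field are hypothetical stand-ins for whatever your endpoint returns. The equivalent nested loops are included to show that the for clauses keep the same order, and a second example keeps the 2D shape instead of flattening it.

```python
# Hypothetical paginated response: a list of pages, each a list of records.
pages = [
    [{"id": 1}, {"id": 2}],
    [{"id": 3}],
    [],                       # an empty page contributes nothing
    [{"id": 4}, {"id": 5}],
]

# Flatten: the `for` clauses appear in the same order as nested for statements.
record_ids = [record["id"] for page in pages for record in page]

# Equivalent nested loops, for comparison.
record_ids_loop = []
for page in pages:
    for record in page:
        record_ids_loop.append(record["id"])

# Preserve the 2D shape: an inner comprehension is the expression of the outer one.
grid = [[1, 2, 3], [4, 5, 6]]
doubled_grid = [[cell * 2 for cell in row] for row in grid]

print(record_ids)     # [1, 2, 3, 4, 5]
print(doubled_grid)   # [[2, 4, 6], [8, 10, 12]]
```

If you ever doubt the order of the for clauses, write out the nested loops first and copy the for lines across from top to bottom.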
[f(x) for row in matrix for x in row] flattens into a 1D list. [[f(x) for x in row] for row in matrix] preserves the 2D structure. The position of the inner brackets makes all the difference: the outer expression determines the shape of the result.
Keep to two for clauses at most, or switch to a loop.
- Flatten: [x for row in data for x in row]
- Preserve the 2D structure: [[f(x) for x in row] for row in data]
When NOT to Use a List Comprehension: Knowing the Limits
List comprehensions have a superpower, but like all superpowers, using them in the wrong situation creates problems. The most important rule is this: if the comprehension can't fit on two lines and still read clearly, it's time to switch to a regular loop.
The other big consideration is memory. A list comprehension always builds the entire list in memory immediately. If you're working with a million records and only need to consume them one at a time — say, writing them to a file line by line — you should use a generator expression instead. The syntax is identical, but with round brackets instead of square ones: (expression for item in iterable). A generator is lazy: it produces one item at a time on demand and holds almost nothing in memory at once.
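To make the eager-versus-lazy difference concrete, this sketch compares the container size of a list comprehension with that of the equivalent generator expression, then shows that a generator can be consumed only once. The size n is an arbitrary illustrative choice, and sys.getsizeof reports only the container itself, so treat the numbers as a rough indication rather than a full memory profile.

```python
import sys

n = 100_000  # illustrative size; large enough to make the difference visible

squares_list = [x * x for x in range(n)]   # built in full, right now
squares_gen = (x * x for x in range(n))    # nothing computed yet

# getsizeof measures only the container, not the integers it refers to,
# but that is enough to show the list grows with n while the generator does not.
list_bytes = sys.getsizeof(squares_list)
gen_bytes = sys.getsizeof(squares_gen)
print(list_bytes > gen_bytes)   # True

# A generator can be consumed exactly once.
total = sum(squares_gen)
leftover = list(squares_gen)    # already exhausted, so this is empty
print(total == sum(squares_list), leftover)   # True []
```

The single-use behaviour is the trade-off: if you need to index into the result or loop over it twice, you want the list.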
There's also a subtler reason to avoid comprehensions: side effects. If the main purpose of your loop is to do something — print to the console, update a database, send a request — rather than to produce a value, a comprehension is the wrong tool. Using a comprehension purely for its side effects, and throwing away the resulting list, is a code smell that confuses readers and wastes memory.
Finally, never use a list comprehension when a built-in function already does the job more clearly. map(), sum(), max(), and filter() all exist precisely for common cases. Knowing when to reach for them instead is what separates intermediate from advanced Python.
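A few cases where the built-in is arguably the clearer choice are sketched below. The prices, names, and fields lists are made-up sample data; the built-ins are used with their standard signatures.

```python
# Illustrative data.
prices = [4.50, 12.00, 7.25]
names = ["ada", "grace", "linus"]
fields = ["a", "", "b", ""]

# Aggregation: pass the iterable straight to the built-in.
total = sum(prices)                  # rather than sum([p for p in prices])
most_expensive = max(prices)

# Applying an existing function to every item: map() says so directly.
upper_names = list(map(str.upper, names))

# filter(None, ...) keeps only truthy items.
non_empty = list(filter(None, fields))

print(total, most_expensive)   # 23.75 12.0
print(upper_names)             # ['ADA', 'GRACE', 'LINUS']
print(non_empty)               # ['a', 'b']
```

Once you need a custom expression or a lambda to make map() or filter() work, the comprehension usually reads better again.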
Production trap: a team used [send_email(user) for user in users] to send 500k emails. The fix: switch to a plain for loop and use batching.
Performance Deep Dive: When Comprehension Speed Actually Matters
The common wisdom says list comprehensions are ~10–35% faster than for loops with .append(). That's true, but the real question is: does that speedup matter in your use case? For small lists (a few hundred items), the difference is microseconds — not worth sacrificing readability. For large lists (hundreds of thousands or millions), the difference can be seconds, which might matter in a latency-sensitive pipeline.
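If you'd rather check the claim on your own machine than take it on faith, a small timeit harness along these lines is enough. The input size and repeat counts are arbitrary choices, and the timings will differ between machines and Python versions, so no expected numbers are shown.

```python
import timeit

data = list(range(10_000))  # illustrative input size

def with_loop():
    result = []
    for x in data:
        result.append(x * 2)
    return result

def with_comprehension():
    return [x * 2 for x in data]

# Both must produce the same list, otherwise the comparison is meaningless.
same_output = with_loop() == with_comprehension()

# Take the best of several repeats to reduce noise from other processes.
loop_time = min(timeit.repeat(with_loop, number=100, repeat=3))
comp_time = min(timeit.repeat(with_comprehension, number=100, repeat=3))

print(f"same output:   {same_output}")
print(f"loop:          {loop_time:.4f}s")
print(f"comprehension: {comp_time:.4f}s")
```

Taking the minimum of the repeats rather than the mean is a common choice here, because slower runs mostly reflect interference from other work on the machine.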
Where comprehensions really shine is in data processing scripts, API response cleaning, and batch transformations. But be aware: if your comprehension calls a function that does I/O (database, file, network), the I/O cost will completely dominate — the comprehension's speedup becomes irrelevant.
There's another hidden cost: error handling. If a comprehension raises an exception, you lose all context — you can't easily tell which item caused it. In a for loop, you can wrap the transformation in a try/except and log the offending data. Debugging a comprehension that crashes in production often requires rewriting it as a loop just to add logging.
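As a sketch of that loop-with-logging approach, the code below parses a list of strings, records which item failed, and carries on. The raw_values data and the choice to skip bad items instead of aborting are assumptions for illustration.

```python
import logging

logger = logging.getLogger(__name__)

# Illustrative raw input with one bad value in the middle.
raw_values = ["10", "20", "oops", "40"]

# As a comprehension, [int(v) for v in raw_values] raises ValueError and
# returns nothing, so the three good values are lost along with the bad one.

parsed = []
rejected = []
for index, value in enumerate(raw_values):
    try:
        parsed.append(int(value))
    except ValueError:
        # Record exactly which item failed and keep going.
        logger.warning("skipping item %d: %r is not an integer", index, value)
        rejected.append((index, value))

print(parsed)     # [10, 20, 40]
print(rejected)   # [(2, 'oops')]
```

Whether to skip, substitute a default, or re-raise with extra context depends on the pipeline. The point is that the loop gives you somewhere to make that decision.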
Also, be careful with heavily nested comprehensions. Each level of nesting adds another loop's worth of iteration overhead. A two-level comprehension with a filter and a ternary may still be faster than the equivalent nested loop, but profile before you commit.
A practical tip: for numerical data, consider using NumPy instead of a list comprehension. A NumPy array operation runs in compiled C and is typically far faster than any Python-level comprehension. Compare np.square(arr) with [x**2 for x in arr]: the NumPy version is often 10-50x faster on large arrays, though the exact factor depends on the dtype and the array size.
Measure with timeit or cProfile before optimising. The speedup varies by context and is often negligible compared to I/O or function call overhead.
List Comprehension Built 500,000 Emails, Then Crashed the Server
Root cause: [send_email(user) for user in users] builds a list of 500,000 return values (probably None or a status object) in memory before discarding it. The real memory cost was not the result list but the fact that send_email() opened connections and allocated buffers for each call, all held until the comprehension completed. The comprehension also blocked the event loop (if async) because it's eager.
Fix: replace the comprehension with a plain for loop. A generator expression consumed with for status in (send_email(u) for u in users): would also avoid the list, but the core fix was to not use a comprehension for side effects. The team also added batching and connection pooling.
- Never use a list comprehension purely for side effects. It builds a list you don't need and wastes memory.
- If the primary goal is to do something (send, write, print), reach for a for loop. Comprehensions are for producing new collections.
- Generator expressions are a middle ground: they're lazy and don't build a list, but they still read as 'producing values' rather than 'performing a side effect'.
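One way the loop-plus-batching fix might look is sketched below. This is not the team's actual code: send_email is a hypothetical stand-in that only records its argument, the batched helper is written out here with itertools.islice, and the batch size of 4 is arbitrary.

```python
from itertools import islice

def batched(iterable, size):
    """Yield successive lists of at most `size` items from `iterable`."""
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch

sent = []  # stands in for the outside world so the sketch is runnable

def send_email(user):
    """Hypothetical stand-in for the real send function."""
    sent.append(user)

users = [f"user{i}@example.com" for i in range(10)]  # made-up addresses

batch_count = 0
for batch in batched(users, size=4):
    # Plain loop: the intent is the side effect, and no result list is built.
    for user in batch:
        send_email(user)
    batch_count += 1
    # A real job might pause, commit progress, or recycle connections here.

print(batch_count, len(sent))   # 3 10
```

Only one batch is held in memory at a time, and the gap between batches gives you a natural place for rate limiting or checkpointing.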
- NameError saying a variable is undefined: check the order of the for clauses. [x for row in matrix for x in row] is correct; [x for x in row for row in matrix] fails because row is referenced before assignment.
- Comprehension slower than expected: measure with timeit. If the comprehension includes a function call, as in [f(x) for x in data], the function overhead dominates. Inline the logic or use a vectorised library (NumPy) for numeric data.
- None values in output: the expression itself returns None (e.g., a .append() call used as the expression). Replace it with a proper expression or use a filter to drop None.
Key takeaways
- A trailing if filters which items survive; a ternary value_if_true if cond else value_if_false in the expression transforms what surviving items become. These are different operations and can be combined in the same comprehension.
- Use a generator expression (expr for item in iterable) any time you're processing a large dataset you only need to iterate once.
- When the goal is a side effect, a plain for loop is clearer, more debuggable, and the right tool for the job.
Common mistakes to avoid
Ternary inside expression without else branch
[x if x > 0 for x in numbers] raises SyntaxError. Python can't parse the ternary because it expects an else.
Fix: add an else branch: [x if x > 0 else 0 for x in numbers]. If you want to drop items, use a filter: [x for x in numbers if x > 0].
Using comprehension for side effects
[send_email(user) for user in users] works but builds a list of return values that's immediately discarded. Memory waste and confusing intent.
Fix: use a plain for loop: for user in users: send_email(user). If the function returns nothing, a loop is clearer and doesn't allocate an unused list.
Reversing loop order in nested comprehensions
[x for x in row for row in matrix] raises NameError because row is referenced before it's defined. The outer loop must come first.
Fix: write the for clauses in the same order as nested for statements: [x for row in matrix for x in row].
Interview Questions on This Topic
What's the difference between a list comprehension and a generator expression, and how do you decide which one to use for a given problem?
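A list comprehension builds the whole list in memory immediately; a generator expression is lazy and produces one item at a time. Use a list when you need to index, reuse, or iterate over the result more than once. Use a generator expression when you consume the result once, especially when passing it straight to an aggregating function such as sum(), max(), or any().
Example: sum(x**2 for x in range(10**7)) uses minimal memory, while sum([x**2 for x in range(10**7)]) would create a huge list first.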