PHP Generators Explained — yield, send(), and Memory-Efficient Iteration
Most PHP developers hit a wall the first time they try to process a 500MB CSV file or generate a million-row report inside a web request. The naive approach — loading everything into an array — sends memory usage through the roof and often crashes the process entirely. Generators exist precisely to solve this class of problem, and they do it elegantly without forcing you to rearchitect your entire codebase.
Before generators (introduced in PHP 5.5), your only options were to either build a full array and accept the memory cost, or write a stateful iterator class with five methods and a lot of ceremony. Generators give you lazy, resumable sequences with a syntax so clean it looks almost too simple to be real. Under the hood, PHP turns a generator function into a factory for Generator objects — a final internal class that implements Iterator and can also receive values mid-execution.
By the end of this article you'll understand not just how to write a generator, but WHY the VM pauses execution at yield, how to push values back INTO a running generator with send(), how to compose generators using yield from, how to extract a final return value, and where generators genuinely beat arrays — plus the exact gotchas that trip up experienced developers in production.
How PHP Compiles a Generator Function — The Internals You Actually Need
When PHP's parser sees the yield keyword inside a function body, it silently transforms the entire function into a generator factory. Calling that function no longer executes any of its code — it returns a Generator object immediately. This is the single most important thing to internalise, because it explains almost every 'why didn't my code run?' bug you'll ever see with generators.
The Generator class implements the Iterator interface internally (rewind, current, key, next, valid) plus three extra methods: send(), throw(), and getReturn(). The Zend Engine stores the generator's execution context — local variable table, instruction pointer, call stack frame — in a heap-allocated zend_generator struct. This is why a generator can pause mid-function and resume: its stack frame is preserved off the real call stack.
Each time you call next() (or foreach triggers it), the engine restores that frame, runs until the next yield, stashes the yielded value, and suspends again. Memory for local variables stays alive for the generator's lifetime, but you only ever hold the current value in userland — not the entire sequence. That's the performance win in one sentence.
A generator function can yield zero or more times. After the last yield (or an explicit return), valid() returns false and the generator is exhausted. Exhausted generators cannot be rewound: calling rewind() on a generator that has advanced past its first yield throws an Exception ('Cannot rewind a generator that was already run'), and starting a foreach over an already-closed generator throws 'Cannot traverse an already closed generator'.
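Both failure modes are easy to demonstrate. The sketch below uses an illustrative helper of ours, threeLetters(); it assumes PHP 7+, where the refusal surfaces as a catchable exception:

```php
<?php

// Illustrative helper — any generator behaves the same way.
function threeLetters(): Generator
{
    yield 'a';
    yield 'b';
    yield 'c';
}

$letters = threeLetters();

// First pass works and drains the sequence
echo implode(',', iterator_to_array($letters)) . "\n"; // a,b,c

// The generator is now exhausted — it cannot produce values again
var_dump($letters->valid()); // bool(false)

// A second iteration attempt is refused outright
try {
    foreach ($letters as $letter) {
        // never entered
    }
} catch (Exception $exception) {
    // Message: "Cannot traverse an already closed generator"
    echo $exception->getMessage() . "\n";
}
```

If you need a second pass, you have to call the factory function again for a fresh Generator object.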
```php
<?php

/**
 * A generator factory — calling this returns a Generator object instantly.
 * No code inside the function body runs yet.
 */
function countUpTo(int $limit): Generator
{
    echo "[generator] Starting execution\n"; // runs on first next()/current()

    for ($number = 1; $number <= $limit; $number++) {
        echo "[generator] About to yield $number\n";
        yield $number; // <-- execution PAUSES here; $number is the yielded value
        echo "[generator] Resumed after yield $number\n"; // runs on the NEXT next() call
    }

    echo "[generator] Generator exhausted\n"; // runs after the loop ends
}

// --- Calling the function does NOT execute any code yet ---
$generator = countUpTo(3);
echo "Generator object created. No output above this line yet.\n\n";

// --- Manually driving the generator ---
$generator->current(); // initialises + runs to first yield
echo "First value: " . $generator->current() . "\n\n";

$generator->next(); // resumes from yield 1, runs to yield 2
echo "Second value: " . $generator->current() . "\n\n";

$generator->next(); // resumes from yield 2, runs to yield 3
echo "Third value: " . $generator->current() . "\n\n";

$generator->next(); // resumes from yield 3, loop ends, generator exhausts
echo "Valid? " . ($generator->valid() ? 'yes' : 'no') . "\n";
```
```text
Generator object created. No output above this line yet.

[generator] Starting execution
[generator] About to yield 1
First value: 1

[generator] Resumed after yield 1
[generator] About to yield 2
Second value: 2

[generator] Resumed after yield 2
[generator] About to yield 3
Third value: 3

[generator] Resumed after yield 3
[generator] Generator exhausted
Valid? no
```
Practical Power — Processing Large Files and Infinite Sequences Without Blowing Memory
Here's where generators stop being a curiosity and start being a production tool. Reading a large CSV line by line with a generator keeps peak memory at roughly the size of one line, whether the file is 1 MB or 10 GB. The alternatives — file(), or an fgetcsv() loop that collects rows into an array — scale memory linearly with file size.
The key insight is that generators compose beautifully. You can pipe a file-reading generator through a filtering generator through a transforming generator — each stage lazy, each holding only one record at a time. This is PHP's equivalent of Unix pipes and it's just as powerful.
Infinite sequences are another natural fit. A Fibonacci generator, a UUID stream, a paginated API cursor — anything where 'all values' is meaningless or impossibly large becomes trivial with a generator. You pull exactly as many values as you need, then let the generator get garbage-collected.
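As a concrete sketch of the infinite-sequence pattern — fibonacci() is an illustrative name of ours, not a built-in — each value is computed only when the consumer asks for it:

```php
<?php

/** Endless Fibonacci stream — the loop never terminates on its own. */
function fibonacci(): Generator
{
    [$previous, $current] = [0, 1];
    while (true) {
        yield $current;
        [$previous, $current] = [$current, $previous + $current];
    }
}

// Pull exactly five values, then abandon the generator
$firstFive = [];
foreach (fibonacci() as $fibonacciNumber) {
    $firstFive[] = $fibonacciNumber;
    if (count($firstFive) === 5) {
        break; // without the break this loop would never end
    }
}

echo implode(', ', $firstFive) . "\n"; // 1, 1, 2, 3, 5
```

Once the foreach breaks, nothing references the generator and it is garbage-collected — no infinite computation ever happened.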
For the file example below, watch the memory figures. On a real 100k-row CSV the array approach consumes ~80 MB; the generator approach holds steady at ~1 MB regardless of file length. That's not an optimisation — it's a different class of solution.
```php
<?php

/**
 * Yields one parsed CSV row at a time — never loads the whole file.
 * Type: Generator<int, array<string, string>, void, void>
 */
function readCsvLazily(string $filePath, bool $hasHeaderRow = true): Generator
{
    $fileHandle = fopen($filePath, 'r');
    if ($fileHandle === false) {
        throw new RuntimeException("Cannot open file: $filePath");
    }

    try {
        $headers = [];
        if ($hasHeaderRow) {
            // Read the first line as column headers
            $headers = fgetcsv($fileHandle);
            if ($headers === false) {
                return; // empty file — generator yields nothing
            }
        }

        $rowIndex = 0;
        while (($rawFields = fgetcsv($fileHandle)) !== false) {
            // Combine headers with row values so callers get named keys
            $namedRow = $hasHeaderRow
                ? array_combine($headers, $rawFields)
                : $rawFields;

            yield $rowIndex++ => $namedRow; // key => value
        }
    } finally {
        // The finally block ALWAYS runs — even if the caller breaks early
        fclose($fileHandle);
        echo "[cleanup] File handle closed.\n";
    }
}

/**
 * Filtering generator — wraps another generator, yields only matching rows.
 * This is the 'pipe' pattern: generator -> generator -> consumer.
 */
function filterRows(Generator $source, string $column, string $matchValue): Generator
{
    foreach ($source as $rowIndex => $row) {
        if (isset($row[$column]) && $row[$column] === $matchValue) {
            yield $rowIndex => $row; // pass through the original key
        }
    }
}

// --- Demo: create a small sample CSV in a temp file to make this runnable ---
$tempFile = tempnam(sys_get_temp_dir(), 'csv_demo_');
file_put_contents($tempFile,
    "id,name,department,salary\n" .
    "1,Alice,Engineering,95000\n" .
    "2,Bob,Marketing,72000\n" .
    "3,Carol,Engineering,102000\n" .
    "4,Dave,Marketing,68000\n" .
    "5,Eve,Engineering,88000\n"
);

$memoryBefore = memory_get_usage();

// Build a lazy pipeline — nothing reads the file yet
$allRows = readCsvLazily($tempFile);
$engineeringRows = filterRows($allRows, 'department', 'Engineering');

echo "Memory after pipeline setup (nothing read yet): "
    . number_format(memory_get_usage() - $memoryBefore) . " bytes\n\n";

// NOW we pull values — file is read one line at a time
echo "Engineering department employees:\n";
foreach ($engineeringRows as $index => $employee) {
    echo " Row $index: {$employee['name']} — \${$employee['salary']}\n";
}

unlink($tempFile); // remove the temp file
```
```text
Engineering department employees:
 Row 0: Alice — $95000
 Row 2: Carol — $102000
 Row 4: Eve — $88000
[cleanup] File handle closed.
```
Two-Way Communication — send(), throw(), and getReturn() in Depth
Most tutorials stop at yield-as-output. But generators are bidirectional channels. The send() method lets you push a value INTO a running generator — the yield expression evaluates to whatever you sent. This unlocks coroutine-style patterns: think cooperative task schedulers, stateful parsers, or async I/O pipelines.
The flow is: you call send($value), the generator resumes, the yield expression evaluates to $value, execution continues to the next yield, and send() returns the newly yielded value. If there's no next yield, send() returns null.
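That flow fits in a few lines. The sketch below uses a made-up echoTwice() generator; note the current() call that primes it to its first yield before any real value is sent:

```php
<?php

function echoTwice(): Generator
{
    $firstWord = yield 'ready';                 // pauses here; resumes with the sent value
    $secondWord = yield strtoupper($firstWord); // yields the transformed word, waits again
    yield strtoupper($secondWord);              // final yield
}

$echoer = echoTwice();

echo $echoer->current() . "\n";     // "ready" — runs the body up to the first yield
echo $echoer->send('hello') . "\n"; // yield evaluates to 'hello'; prints "HELLO"
echo $echoer->send('world') . "\n"; // prints "WORLD"
var_dump($echoer->send('ignored')); // no further yield — send() returns NULL
```

Each send() is one full hop: value in at the paused yield, execution forward to the next yield, new value out as the return of send().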
throw() is send()'s sibling — it injects an exception at the point of suspension instead of a value. This lets a scheduler cancel a coroutine cleanly without a global flag variable.
getReturn() extracts the value from an explicit return statement inside the generator. This is only accessible after the generator is exhausted (valid() === false). It's perfect for aggregations: let the generator yield intermediate progress and return the final computed result — callers get both streaming updates and a clean final answer without two separate functions.
```php
<?php

/**
 * A running-total accumulator implemented as a coroutine.
 *
 * - Yields a status string each time a number is received.
 * - Runs forever until the caller stops it (e.g. via throw()).
 *
 * Caller sends numbers; generator responds with a receipt.
 */
function runningTotalAccumulator(): Generator
{
    $total = 0;
    $itemCount = 0;

    // The generator starts here when current(), next(), or send() first runs it
    echo "[accumulator] Ready to receive numbers.\n";

    while (true) {
        // yield does two jobs at once:
        //   1. Sends a status string OUT to the caller
        //   2. Receives the next number sent IN via send()
        $incomingNumber = yield "Current total: $total (items: $itemCount)";

        if ($incomingNumber === null) {
            // next() is equivalent to send(null) — the yield evaluates to null,
            // so treat it as a no-op to keep plain iteration from corrupting the total
            continue;
        }

        $total += $incomingNumber;
        $itemCount++;
        echo "[accumulator] Added $incomingNumber → running total = $total\n";
    }
}

$accumulator = runningTotalAccumulator();

// Initialise the generator — runs to first yield, prints the 'Ready' line
$initialStatus = $accumulator->current();
echo "Status: $initialStatus\n\n";

// Send values in — each send() resumes the generator and returns the NEW yielded value
$numbers = [150, 75, 200, 25];
foreach ($numbers as $number) {
    $status = $accumulator->send($number); // sends number IN, gets status string OUT
    echo "Caller received: $status\n";
}
echo "\n";

// --- Demonstrating throw() for clean cancellation ---
echo "Cancelling accumulator early via throw()...\n";
try {
    $accumulator->throw(new RuntimeException('Cancelled by caller'));
} catch (RuntimeException $exception) {
    // throw() re-throws the exception in the caller's scope if the
    // generator doesn't catch it internally
    echo "Caught in caller: " . $exception->getMessage() . "\n";
}

// --- Demonstrating getReturn() ---

/**
 * A generator that streams progress updates and returns a summary array.
 */
function processOrderBatch(array $orderIds): Generator
{
    $processedCount = 0;
    $failedIds = [];

    foreach ($orderIds as $orderId) {
        // Simulate work — in production this might be a DB write
        $success = ($orderId % 7 !== 0); // pretend multiples of 7 fail

        if ($success) {
            $processedCount++;
            yield "Processed order #$orderId"; // stream progress to caller
        } else {
            $failedIds[] = $orderId;
            yield "FAILED order #$orderId";
        }
    }

    // return is only accessible via getReturn() after the generator exhausts
    return [
        'processed' => $processedCount,
        'failed'    => $failedIds,
        'total'     => count($orderIds),
    ];
}

$orderIds = [101, 102, 103, 105, 112, 114, 119];
$orderBatch = processOrderBatch($orderIds);

echo "\n--- Order Batch Processing ---\n";
foreach ($orderBatch as $progressMessage) {
    echo $progressMessage . "\n";
}

// Generator is now exhausted — getReturn() is safe to call
$summary = $orderBatch->getReturn();
echo "\nSummary: {$summary['processed']} processed, "
    . count($summary['failed']) . " failed "
    . "(IDs: " . implode(', ', $summary['failed']) . ")\n";
```
```text
[accumulator] Ready to receive numbers.
Status: Current total: 0 (items: 0)

[accumulator] Added 150 → running total = 150
Caller received: Current total: 150 (items: 1)
[accumulator] Added 75 → running total = 225
Caller received: Current total: 225 (items: 2)
[accumulator] Added 200 → running total = 425
Caller received: Current total: 425 (items: 3)
[accumulator] Added 25 → running total = 450
Caller received: Current total: 450 (items: 4)

Cancelling accumulator early via throw()...
Caught in caller: Cancelled by caller

--- Order Batch Processing ---
Processed order #101
Processed order #102
Processed order #103
FAILED order #105
FAILED order #112
Processed order #114
FAILED order #119

Summary: 4 processed, 3 failed (IDs: 105, 112, 119)
```
Generator Delegation with yield from — Composing Sequences Like a Pro
yield from lets one generator delegate iteration to another iterable — another generator, an array, or any Traversable. It's not just syntactic sugar for a nested foreach: it transparently proxies send() and throw() calls through to the inner generator and captures the inner generator's return value as the result of the yield from expression itself.
This is critical for recursive algorithms. Without yield from, you can't simply call a generator recursively and have it yield into the outer stream — you'd have to foreach the inner generator and re-yield each value. yield from collapses this into one clean expression.
The return value capture is the part most developers miss. When the delegated generator finishes, yield from evaluates to that generator's return value. This lets you build tree-walking or recursive descent parsers where each level both yields leaf values AND returns metadata to its parent — a pattern that's genuinely hard to achieve with iterators.
Performance note: yield from avoids the double-copy overhead of re-yielding in a loop. The Zend Engine links the inner generator's frame directly, so values flow through without an intermediate allocation per element.
```php
<?php

/**
 * Recursively walks a nested directory tree, yielding file paths.
 * Uses yield from for recursive delegation — no re-yielding loop needed.
 */
function walkDirectoryTree(string $directoryPath): Generator
{
    $fileCount = 0;
    $entries = new DirectoryIterator($directoryPath);

    foreach ($entries as $entry) {
        if ($entry->isDot()) {
            continue; // skip '.' and '..'
        }

        if ($entry->isDir()) {
            // Delegate to a recursive call — send()/throw() are proxied through.
            // The return value of the inner generator is captured in $subDirCount.
            $subDirCount = yield from walkDirectoryTree($entry->getPathname());
            echo "[delegation] Subdirectory '{$entry->getFilename()}' contained $subDirCount file(s)\n";
        } else {
            yield $entry->getPathname(); // yield the file path to the top-level caller
            $fileCount++;
        }
    }

    // This return value flows back to the parent generator via yield from
    return $fileCount; // files directly in this directory (subdirectories excluded)
}

/**
 * Simpler example showing yield from with arrays and return capture.
 * This makes the delegation mechanics easy to observe without filesystem access.
 */
function innerSequence(string $label): Generator
{
    echo "[inner:$label] starting\n";
    yield "$label-A";
    yield "$label-B";
    echo "[inner:$label] done\n";

    return "metadata-from-$label"; // captured by yield from in the outer generator
}

function outerSequence(): Generator
{
    yield "outer-start";

    // yield from array — delegation works with plain arrays too
    yield from ["array-item-1", "array-item-2"];

    // yield from generator — return value is captured
    $returnedMetadata = yield from innerSequence('alpha');
    echo "[outer] inner returned: $returnedMetadata\n";

    $returnedMetadata = yield from innerSequence('beta');
    echo "[outer] inner returned: $returnedMetadata\n";

    yield "outer-end";
    return "outer-summary";
}

echo "=== Delegated sequence output ===\n";
// Note: yield from PRESERVES the inner iterable's keys, so $index restarts
// at 0 for each delegated sequence — it is not a global position counter.
foreach (outerSequence() as $index => $value) {
    echo " [$index] $value\n";
}

// --- Real-world: merging multiple paginated API result generators ---

function simulateApiPage(string $resource, int $page): Generator
{
    // Simulates fetching a page of records from a REST API
    $itemsPerPage = 3;
    $start = ($page - 1) * $itemsPerPage + 1;
    $end = $start + $itemsPerPage - 1;

    for ($id = $start; $id <= $end; $id++) {
        yield ['resource' => $resource, 'id' => $id, 'page' => $page];
    }

    return $itemsPerPage; // how many items this page had
}

function fetchAllPages(string $resource, int $totalPages): Generator
{
    $totalFetched = 0;

    for ($page = 1; $page <= $totalPages; $page++) {
        $pageCount = yield from simulateApiPage($resource, $page);
        $totalFetched += $pageCount;
        echo "[paginator] Fetched page $page ($pageCount items, $totalFetched total so far)\n";
    }

    return $totalFetched;
}

echo "\n=== Paginated API fetch ===\n";
$apiGenerator = fetchAllPages('products', 2);
foreach ($apiGenerator as $record) {
    echo " Product ID {$record['id']} (page {$record['page']})\n";
}
echo "Total fetched: " . $apiGenerator->getReturn() . "\n";
```
```text
=== Delegated sequence output ===
 [0] outer-start
 [0] array-item-1
 [1] array-item-2
[inner:alpha] starting
 [0] alpha-A
 [1] alpha-B
[inner:alpha] done
[outer] inner returned: metadata-from-alpha
[inner:beta] starting
 [0] beta-A
 [1] beta-B
[inner:beta] done
[outer] inner returned: metadata-from-beta
 [1] outer-end
```

Note the keys: yield from preserves each delegated iterable's own keys rather than continuing a global counter, which is why indexes repeat. This is also why iterator_to_array() on a delegating generator needs its second argument set to false — otherwise later values overwrite earlier ones.
```text
=== Paginated API fetch ===
 Product ID 1 (page 1)
 Product ID 2 (page 1)
 Product ID 3 (page 1)
[paginator] Fetched page 1 (3 items, 3 total so far)
 Product ID 4 (page 2)
 Product ID 5 (page 2)
 Product ID 6 (page 2)
[paginator] Fetched page 2 (3 items, 6 total so far)
Total fetched: 6
```
| Aspect | Array-Based Iteration | Generator-Based Iteration |
|---|---|---|
| Memory usage (1M rows) | ~100–800 MB depending on row size | ~1–2 MB regardless of total rows |
| Time to first value | After ALL values computed/loaded | After FIRST value computed (immediate) |
| Rewindable | Yes — iterate as many times as needed | No — exhausted generator cannot rewind |
| Bidirectional (send values in) | No — read-only | Yes — via send() and throw() |
| Return a final value | The array IS the return value | Via return + getReturn() after exhaustion |
| Recursive composition | Manual re-merging of arrays | Clean via yield from with return capture |
| Lazy / short-circuit friendly | No — full computation always happens | Yes — unused values are never computed |
| Type-hinting in PHPDoc | array<TKey, TValue> | Generator<TKey, TValue, TSend, TReturn> |
| Works as argument to foreach | Yes | Yes (implements Iterator) |
| Error propagation into sequence | Not applicable | Via throw() — injects exception at yield |
| PHP version requirement | All versions | PHP 5.5+ (yield from: PHP 7.0+) |
🎯 Key Takeaways
- A generator function returns a Generator object immediately — zero code inside it runs until you call current(), next(), or send(). This surprises even experienced developers.
- yield is a two-way valve: it sends a value out AND receives a value in (via send()). The yielded-OUT value is what callers see; the sent-IN value is what yield evaluates to inside the generator body.
- yield from isn't just a loop shortcut — it transparently proxies send() and throw() to the inner generator AND captures the inner generator's return value as an expression result, enabling clean recursive composition.
- Generators are single-use and cannot be rewound once started. If you need a replayable lazy sequence, encapsulate the generator factory in an IteratorAggregate class so getIterator() creates a fresh generator each time.
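The replayable-sequence idea from the last takeaway can be sketched as a small wrapper class (the name ReplayableSequence is ours, not a standard class):

```php
<?php

/**
 * Wraps a generator FACTORY so that every iteration gets a fresh generator.
 */
final class ReplayableSequence implements IteratorAggregate
{
    public function __construct(private Closure $generatorFactory)
    {
    }

    public function getIterator(): Generator
    {
        // A brand-new generator per foreach — this is what makes it replayable
        return ($this->generatorFactory)();
    }
}

$squares = new ReplayableSequence(function (): Generator {
    for ($i = 1; $i <= 3; $i++) {
        yield $i * $i;
    }
});

// Unlike a raw generator, this can be iterated as often as you like —
// foreach calls getIterator() again behind the scenes each time.
foreach ($squares as $square) {
    echo $square . ' '; // 1 4 9
}
echo "\n";
foreach ($squares as $square) {
    echo $square . ' '; // 1 4 9 again
}
echo "\n";
```

The laziness is preserved: each pass re-runs the factory, so nothing is cached — if replays are expensive, materialise into an array instead.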
⚠ Common Mistakes to Avoid
- ✕ Mistake 1: Calling send($value) on a brand-new generator and expecting to see its first yielded value — PHP doesn't error here; it silently advances the generator to its first yield and delivers your value there, so send() returns the SECOND yielded value and the first one is never observed. Always prime the generator with current() (or a bare next()) first if you need that first value. Think of it as a handshake: let the generator reach its first yield before pushing data in.
- ✕ Mistake 2: Assuming a generator can be iterated twice — after the first foreach exhausts it, a second foreach doesn't silently produce nothing; it throws an Exception ('Cannot traverse an already closed generator') — Generators are single-use. If you need to iterate multiple times, wrap the generator factory in a class that re-calls the generator function on each getIterator() call (implement IteratorAggregate), or store the results in an array explicitly.
- ✕ Mistake 3: Forgetting that returning from a generator doesn't work like a normal function return — developers expect $result = someGenerator() to capture the return value, but it just gets the Generator object; the actual return value is only accessible via $gen->getReturn() AFTER the generator is fully exhausted — Always drain the generator (via foreach or while ($gen->valid())) before calling getReturn(), otherwise you'll get a 'Cannot get return value of a generator that hasn't returned' exception.
Interview Questions on This Topic
- Q: What is the difference between yield and return in a PHP generator, and what happens if you call getReturn() before the generator is exhausted?
- Q: How does send() work internally — what value does the yield expression evaluate to, and what happens when you send a value to a generator that hasn't started yet?
- Q: Explain yield from in PHP 7+. How does it differ from a foreach loop that re-yields each value, and how do you capture the return value of a delegated generator?
Frequently Asked Questions
What is the difference between a PHP generator and a normal iterator?
A normal iterator requires a full class with five methods (rewind, current, key, next, valid). A generator achieves the same result with a plain function using yield — PHP auto-generates the Iterator implementation. Generators also add bidirectional communication (send/throw) and return value capture, which standard iterators don't have.
Does using generators actually save memory in PHP?
Yes, measurably and significantly. A generator holds exactly one value in memory at a time — the currently yielded item. An equivalent array-based approach allocates memory for every element simultaneously. For a 100,000-row CSV, the difference is typically 1–2 MB (generator) versus 50–200 MB (array), depending on row width.
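You can verify this on your own machine with a sketch like the one below. The exact byte counts vary by PHP version and platform, so treat the printed numbers as illustrative — the reliable observation is only that the array's footprint dwarfs the generator's:

```php
<?php

/** Eager: allocates every value up front. */
function numbersAsArray(int $count): array
{
    $values = [];
    for ($i = 0; $i < $count; $i++) {
        $values[] = $i;
    }
    return $values;
}

/** Lazy: allocates nothing until a value is pulled. */
function numbersAsGenerator(int $count): Generator
{
    for ($i = 0; $i < $count; $i++) {
        yield $i;
    }
}

$before = memory_get_usage();
$eager = numbersAsArray(100000);
$arrayBytes = memory_get_usage() - $before; // grows with $count
unset($eager);

$before = memory_get_usage();
$lazy = numbersAsGenerator(100000);           // nothing computed yet
$generatorBytes = memory_get_usage() - $before; // tiny and flat

echo "array: " . number_format($arrayBytes)
    . " bytes, generator: " . number_format($generatorBytes) . " bytes\n";

// The generator still produces the same values when consumed:
$sum = 0;
foreach ($lazy as $n) {
    $sum += $n;
}
echo "sum: $sum\n"; // 4999950000 either way
```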
Can I use a PHP generator with functions that expect an array, like array_map or array_filter?
Not directly — array_map and array_filter require actual arrays. You have two options: convert the generator to an array with iterator_to_array($generator) (which defeats the memory benefit), or use a pipeline of chained generators as shown in this article. For truly large datasets, always prefer generator pipelines over converting to an array mid-stream.
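If you miss the array_map/array_filter ergonomics, lazy equivalents are only a few lines each. mapLazily() and filterLazily() below are illustrative helpers of ours, not standard functions:

```php
<?php

/** Lazy array_map equivalent — applies $transform one element at a time. */
function mapLazily(callable $transform, iterable $source): Generator
{
    foreach ($source as $key => $value) {
        yield $key => $transform($value);
    }
}

/** Lazy array_filter equivalent — keeps only elements passing $predicate. */
function filterLazily(callable $predicate, iterable $source): Generator
{
    foreach ($source as $key => $value) {
        if ($predicate($value)) {
            yield $key => $value;
        }
    }
}

// Compose like array_map(array_filter(...)) — but nothing runs until consumed.
$evenSquares = mapLazily(
    fn (int $n): int => $n * $n,
    filterLazily(fn (int $n): bool => $n % 2 === 0, range(1, 10))
);

// Second argument false: discard the preserved (sparse) keys from the filter
echo implode(',', iterator_to_array($evenSquares, false)) . "\n"; // 4,16,36,64,100
```

Because both helpers accept any iterable, they chain with each other, with arrays, and with the file-reading generators from earlier in this article.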
Written and reviewed by senior developers with real-world experience across enterprise, startup and open-source projects. Every article on TheCodeForge is written to be clear, accurate and genuinely useful — not just SEO filler.