Senior 7 min · March 06, 2026

PHP Generators -- The 4 GB CSV That Crashed Production

Loading a 2M-row CSV with file() exhausted 1.5 GB memory -- generators yield rows on the fly, slashing memory to KB.

Naren Founder & Principal Engineer

20+ years shipping production PHP systems at scale. Notes here come from systems that actually shipped.

Last updated: May 24, 2026
Quick Answer
  • PHP Generators create resumable sequences without storing all values in memory
  • yield pauses execution and sends a value to the caller
  • send() pushes data back into the generator — two-way communication
  • yield from delegates to another iterator and captures its return value
  • Memory stays flat (~1 MB) even for million-row datasets vs arrays that scale with size
  • Biggest gotcha: send() on a brand-new generator first runs it to the first yield and silently discards that first yielded value. Read current() first if you need it
Definition
What Are PHP Generators?

PHP Generators are a language feature that lets you write functions that behave as iterators without building arrays in memory. When you call a generator function, PHP doesn't execute it — it returns a Generator object that lazily yields values one at a time via the yield keyword.


The critical insight: each yield pauses execution, saves the function's own execution frame (its local variables and the current instruction pointer), and returns control to the caller. On the next iteration, PHP restores that exact state and resumes from where it left off.

This is fundamentally different from returning an array, which requires allocating memory for every element upfront. That 4 GB CSV crashed because you loaded it into an array; a generator would process it row by row, keeping memory at a few kilobytes regardless of file size.

Internally, calling a generator function creates a Generator object with its own execution context, which the engine suspends at each yield and resumes on demand. The Generator object implements Iterator, so foreach works transparently, but the real power comes from two-way communication: send() lets you pass data back into the generator, throw() injects exceptions at the yield point, and getReturn() captures the final return value after the generator finishes.

This makes generators bidirectional data pipelines, not just lazy iterators.

In the ecosystem, generators compete with iterators (which require implementing five methods) and arrays (which are memory-bound). Use generators when you need to process large datasets, infinite sequences (like Fibonacci or paginated API responses), or when you want to compose data streams with yield from.

Don't use them for small, fixed datasets where an array is simpler and faster. Real-world patterns include: reading CSV/JSONL files line by line, consuming paginated REST APIs without buffering all pages, and building middleware pipelines where each generator transforms data before passing it to the next.

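The key metric: a generator's memory use stays roughly constant (one row plus the generator's own frame) whether it processes a thousand rows or 10 million; the equivalent array grows with every row and can reach gigabytes.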

Plain-English First

Imagine you ordered 10,000 cookies from a bakery. A normal function would bake ALL 10,000 cookies, stack them in a giant box, and hand you the whole box at once — your kitchen is trashed. A generator is a baker who hands you one cookie at a time, baking the next one only when you reach out your hand. The baker pauses between each cookie, keeping your kitchen spotless. That pause-and-resume magic is exactly what PHP generators do with data.

Most PHP developers hit a wall the first time they try to process a 500MB CSV file or generate a million-row report inside a web request. The naive approach — loading everything into an array — sends memory usage through the roof and often crashes the process entirely. Generators exist precisely to solve this class of problem, and they do it elegantly without forcing you to rearchitect your entire codebase.

Before generators (introduced in PHP 5.5), your only options were to either build a full array and accept the memory cost, or write a stateful iterator class with five methods and a lot of ceremony. Generators give you lazy, resumable sequences with a syntax so clean it looks almost too simple to be real. Under the hood, PHP compiles a generator function into a Generator object — a special internal class that implements both Iterator and the ability to receive values mid-execution.
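To make the "five methods and a lot of ceremony" concrete, here is a minimal sketch of the same sequence written both ways. The class and function names (NumberRangeIterator, numberRange) are made up for illustration, and the code assumes PHP 8.0 or later for constructor promotion.

```php
<?php

// Hypothetical example class: the hand-written Iterator that generators replace.
final class NumberRangeIterator implements Iterator
{
    private int $position;

    public function __construct(private int $start, private int $end)
    {
        $this->position = $start;
    }

    public function rewind(): void { $this->position = $this->start; }
    public function valid(): bool  { return $this->position <= $this->end; }
    public function current(): int { return $this->position; }
    public function key(): int     { return $this->position - $this->start; }
    public function next(): void   { $this->position++; }
}

// The same sequence as a generator: the engine keeps the loop state for us.
function numberRange(int $start, int $end): Generator
{
    for ($n = $start; $n <= $end; $n++) {
        yield $n;
    }
}

$fromClass     = iterator_to_array(new NumberRangeIterator(1, 5));
$fromGenerator = iterator_to_array(numberRange(1, 5));

var_dump($fromClass === $fromGenerator); // bool(true)
```

One difference worth keeping in mind: the class can be rewound and iterated again, while a generator has to be recreated by calling the function a second time.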

By the end of this article you'll understand not just how to write a generator, but WHY the VM pauses execution at yield, how to push values back INTO a running generator with send(), how to compose generators using yield from, how to extract a final return value, and where generators genuinely beat arrays — plus the exact gotchas that trip up experienced developers in production.

What PHP Generators Actually Do (and Why Your CSV Cratered)

A PHP generator is a function that uses yield instead of return to produce a sequence of values lazily — one at a time, on demand, without building an array in memory. The core mechanic: execution pauses at each yield, preserving local state, and resumes when the next value is requested. This turns O(n) memory into O(1) for iteration.
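The O(n) versus O(1) claim can be checked with a small, hedged experiment. The helper names below (squaresEager, squaresLazy, sumOf) are illustrative only, and the byte counts will differ between PHP versions and builds, so no specific figures are promised here.

```php
<?php

// Eager: builds the whole array before returning it.
function squaresEager(int $count): array
{
    $result = [];
    for ($i = 1; $i <= $count; $i++) {
        $result[] = $i * $i;
    }
    return $result;
}

// Lazy: produces one square per request and keeps nothing else around.
function squaresLazy(int $count): Generator
{
    for ($i = 1; $i <= $count; $i++) {
        yield $i * $i;
    }
}

// Accepts arrays and generators alike.
function sumOf(iterable $values): int
{
    $sum = 0;
    foreach ($values as $value) {
        $sum += $value;
    }
    return $sum;
}

$count = 200000;

$before    = memory_get_usage();
$eager     = squaresEager($count);
$eagerCost = memory_get_usage() - $before; // memory held by the full array
$eagerSum  = sumOf($eager);
unset($eager);

$before   = memory_get_usage();
$lazy     = squaresLazy($count);
$lazySum  = sumOf($lazy);
$lazyCost = memory_get_usage() - $before;  // memory held by the generator object

echo "Same result: " . ($eagerSum === $lazySum ? 'yes' : 'no') . "\n";
echo "Array held:     " . number_format($eagerCost) . " bytes\n";
echo "Generator held: " . number_format($lazyCost) . " bytes\n";
```

Raising $count makes the array figure grow while the generator figure stays put, which is the whole point.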

Internally, a generator function returns a Generator object implementing Iterator. The first call to ->current() runs the body up to the first yield; each ->next() then resumes it and runs to the following yield. Crucially, you can send values back into the generator via ->send(), and throw exceptions into it. But the practical superpower is memory: iterating a 4 GB CSV row by row keeps only the current row in memory, not gigabytes.

Use generators whenever you process streams, large files, database result sets, or any sequence where loading everything into an array would be wasteful or impossible. In production, they are the difference between a script that handles 10 million rows and one that OOMs at 500k. They also enable cooperative multitasking via libraries like Amp or ReactPHP.

Generator ≠ Coroutine
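Generators are asymmetric coroutines (semi-coroutines): a suspended generator can only hand control back to the code that resumed it, never to an arbitrary other routine. Running many of them cooperatively needs a driver, such as a scheduler or event loop, that decides which one to resume next.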
Production Insight
A team processed a 4 GB CSV with file() and array_map — the OOM killed the pod in 12 seconds.
Symptom: PHP Fatal error: Allowed memory size of 134217728 bytes exhausted.
Rule: If you iterate a data source larger than 10% of available memory, use a generator or stream — never load it whole.
Key Takeaway
Generators reduce memory from O(n) to O(1) for sequential iteration.
They are not a performance optimization for speed — they trade CPU for memory.
Use yield in any function that returns an iterable of unknown or large size.
[Figure: PHP Generator Lifecycle and Production Use (thecodeforge.io). From generator creation to delegation and cleanup: generator function definition, lazy iteration with yield, two-way communication (send(), throw(), getReturn()), generator delegation with yield from, large file / API pagination, early termination and cleanup.]

How PHP Compiles a Generator Function — The Internals You Actually Need

When PHP's parser sees the yield keyword inside a function body, it silently transforms the entire function into a generator factory. Calling that function no longer executes any of its code — it returns a Generator object immediately. This is the single most important thing to internalise, because it explains almost every 'why didn't my code run?' bug you'll ever see with generators.

The Generator class implements the Iterator interface internally (rewind, current, key, next, valid) plus three extra methods: send(), throw(), and getReturn(). The Zend Engine stores the generator's execution context — local variable table, instruction pointer, call stack frame — in a heap-allocated zend_generator struct. This is why a generator can pause mid-function and resume: its stack frame is preserved off the real call stack.

Each time you call next() (or foreach triggers it), the engine restores that frame, runs until the next yield, stashes the yielded value, and suspends again. Memory for local variables stays alive for the generator's lifetime, but you only ever hold the current value in userland — not the entire sequence. That's the performance win in one sentence.

A generator function can yield zero or more times. After the last yield (or an explicit return), valid() returns false and the generator is exhausted. Generators are forward-only: rewind() is a no-op while the generator is still at its first yield, but once it has advanced past that point rewind() throws an Exception.

GeneratorInternals.php (PHP)
<?php

/**
 * A generator factory — calling this returns a Generator object instantly.
 * No code inside the function body runs yet.
 */
function countUpTo(int $limit): Generator
{
    echo "[generator] Starting execution\n"; // runs on first next()/current()

    for ($number = 1; $number <= $limit; $number++) {
        echo "[generator] About to yield $number\n";

        yield $number; // <-- execution PAUSES here; $number is the yielded value

        echo "[generator] Resumed after yield $number\n"; // runs on the NEXT next() call
    }

    echo "[generator] Generator exhausted\n"; // runs after the loop ends
}

// --- Calling the function does NOT execute any code yet ---
$generator = countUpTo(3);
echo "Generator object created. No output above this line yet.\n\n";

// --- Manually driving the generator ---
$generator->current(); // initialises + runs to first yield
echo "First value: " . $generator->current() . "\n\n";

$generator->next();    // resumes from yield 1, runs to yield 2
echo "Second value: " . $generator->current() . "\n\n";

$generator->next();    // resumes from yield 2, runs to yield 3
echo "Third value: " . $generator->current() . "\n\n";

$generator->next();    // resumes from yield 3, loop ends, generator exhausts
echo "Valid? " . ($generator->valid() ? 'yes' : 'no') . "\n";
Output
Generator object created. No output above this line yet.
[generator] Starting execution
[generator] About to yield 1
First value: 1
[generator] Resumed after yield 1
[generator] About to yield 2
Second value: 2
[generator] Resumed after yield 2
[generator] About to yield 3
Third value: 3
[generator] Resumed after yield 3
[generator] Generator exhausted
Valid? no
Watch Out: First current() Initialises the Generator
Calling current() on a freshly created Generator automatically runs the function body up to the first yield — you don't need to call rewind() or next() first. But if you call next() before current(), you skip the first yielded value entirely. Always start with current() if you care about the first element.
Production Insight
The generator's stack frame is heap-allocated and persists for the generator's lifetime.
Long-lived generators can become memory leaks if they hold references to large objects via local variables.
Rule: release large local variables with unset() once they're no longer needed inside the generator.
Key Takeaway
Calling a generator function returns an object instantly — no code runs.
The first call to current()/next() moves execution to the first yield.
Never rewind a generator that has already advanced: rewind() throws an Exception.
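The forward-only rule is easy to verify. The sketch below uses a throwaway generator (letters() is a made-up name) to show that rewind() is harmless before the first advance and throws afterwards; the exact exception message is left to the runtime rather than quoted here.

```php
<?php

function letters(): Generator
{
    yield 'a';
    yield 'b';
    yield 'c';
}

$gen = letters();

// Before the generator has moved past its first yield, rewind() is harmless:
// it only makes sure the body has run up to that first yield.
$gen->rewind();
echo $gen->current() . "\n"; // a

$gen->next(); // now positioned on 'b'

try {
    $gen->rewind(); // too late: the first yield has already been passed
} catch (Exception $e) {
    echo get_class($e) . ": " . $e->getMessage() . "\n";
}

// To iterate again, create a fresh generator by calling the function again.
$again = iterator_to_array(letters());
echo implode(',', $again) . "\n"; // a,b,c
```

If a consumer genuinely needs several passes over the data, pass it the generator function (or a closure that calls it) rather than a single Generator object.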

Practical Power — Processing Large Files and Infinite Sequences Without Blowing Memory

Here's where generators stop being a curiosity and start being a production tool. Reading a large CSV line-by-line with a generator keeps peak memory at roughly the size of one line — no matter if the file is 1 MB or 10 GB. The alternative, file() or array-based fgetcsv loops that collect into an array, scales memory linearly with file size.

The key insight is that generators compose beautifully. You can pipe a file-reading generator through a filtering generator through a transforming generator — each stage lazy, each holding only one record at a time. This is PHP's equivalent of Unix pipes and it's just as powerful.
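As a rough sketch of that pipe idea, the helpers below (mapLazily, filterLazily, takeLazily, all hypothetical names) each wrap any iterable and do their work one element at a time. sampleRows() is a stand-in for a real row source such as a CSV reader.

```php
<?php

// Hypothetical helper: lazily applies $transform to each value, keeping keys.
function mapLazily(iterable $source, callable $transform): Generator
{
    foreach ($source as $key => $value) {
        yield $key => $transform($value);
    }
}

// Hypothetical helper: lazily keeps only values accepted by $predicate.
function filterLazily(iterable $source, callable $predicate): Generator
{
    foreach ($source as $key => $value) {
        if ($predicate($value)) {
            yield $key => $value;
        }
    }
}

// Hypothetical helper: stops pulling from the source after $limit values.
function takeLazily(iterable $source, int $limit): Generator
{
    if ($limit <= 0) {
        return;
    }
    foreach ($source as $key => $value) {
        yield $key => $value;
        if (--$limit === 0) {
            return;
        }
    }
}

// Stand-in for a row source such as a CSV reader.
function sampleRows(): Generator
{
    yield ['name' => 'Alice', 'salary' => '95000'];
    yield ['name' => 'Bob',   'salary' => '72000'];
    yield ['name' => 'Carol', 'salary' => '102000'];
    yield ['name' => 'Dave',  'salary' => '68000'];
}

$pipeline = takeLazily(
    mapLazily(
        filterLazily(sampleRows(), fn (array $row): bool => (int) $row['salary'] >= 70000),
        fn (array $row): string => strtoupper($row['name'])
    ),
    2
);

foreach ($pipeline as $key => $name) {
    echo "$key: $name\n";
}
// 0: ALICE
// 1: BOB
```

Because takeLazily() stops after two values, the rows for Carol and Dave are never pulled from the source at all.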

Infinite sequences are another natural fit. A Fibonacci generator, a UUID stream, a paginated API cursor — anything where 'all values' is meaningless or impossibly large becomes trivial with a generator. You pull exactly as many values as you need, then let the generator get garbage-collected.
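A Fibonacci stream is the smallest useful illustration of an infinite sequence. In this sketch, fibonacci() never ends on its own and firstFibonacci() is an assumed helper that pulls a fixed number of values and then abandons the generator.

```php
<?php

// Endless Fibonacci stream: the loop never terminates on its own.
function fibonacci(): Generator
{
    [$current, $next] = [0, 1];
    while (true) {
        yield $current;
        [$current, $next] = [$next, $current + $next];
    }
}

// Pull only the first $count values, then abandon the generator.
function firstFibonacci(int $count): array
{
    $values = [];
    if ($count <= 0) {
        return $values;
    }
    foreach (fibonacci() as $value) {
        $values[] = $value;
        if (count($values) === $count) {
            break;
        }
    }
    return $values;
}

echo implode(', ', firstFibonacci(10)) . "\n";
// 0, 1, 1, 2, 3, 5, 8, 13, 21, 34
```

Note that PHP integers overflow into floats once the values exceed PHP_INT_MAX, so a production version would need a cap or an arbitrary-precision library.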

For the file example below, watch the memory figures. On a real 100k-row CSV the array approach consumes ~80 MB; the generator approach holds steady at ~1 MB regardless of file length. That's not an optimisation — it's a different class of solution.

LargeFileProcessor.php (PHP)
<?php

/**
 * Yields one parsed CSV row at a time — never loads the whole file.
 * Type: Generator<int, array<string, string>, void, void>
 */
function readCsvLazily(string $filePath, bool $hasHeaderRow = true): Generator
{
    $fileHandle = fopen($filePath, 'r');
    if ($fileHandle === false) {
        throw new RuntimeException("Cannot open file: $filePath");
    }

    try {
        $headers = [];

        if ($hasHeaderRow) {
            // Read the first line as column headers
            $headers = fgetcsv($fileHandle);
            if ($headers === false) {
                return; // empty file — generator yields nothing
            }
        }

        $rowIndex = 0;
        while (($rawFields = fgetcsv($fileHandle)) !== false) {
            // Combine headers with row values so callers get named keys
            $namedRow = $hasHeaderRow
                ? array_combine($headers, $rawFields)
                : $rawFields;

            yield $rowIndex => $namedRow; // key => value yield

            $rowIndex++;
        }
    } finally {
        // finally runs when the loop completes or when an abandoned generator
        // is destroyed (for example after the caller breaks early and drops it)
        fclose($fileHandle);
        echo "[cleanup] File handle closed.\n";
    }
}

/**
 * Filtering generator — wraps another generator, yields only matching rows.
 * This is the 'pipe' pattern: generator -> generator -> consumer.
 */
function filterRows(Generator $source, string $column, string $matchValue): Generator
{
    foreach ($source as $rowIndex => $row) {
        if (isset($row[$column]) && $row[$column] === $matchValue) {
            yield $rowIndex => $row; // pass through the original key
        }
    }
}

// --- Demo: create a small sample CSV in memory to make this runnable ---
$tempFile = tempnam(sys_get_temp_dir(), 'csv_demo_');
file_put_contents($tempFile,
    "id,name,department,salary\n" .
    "1,Alice,Engineering,95000\n" .
    "2,Bob,Marketing,72000\n" .
    "3,Carol,Engineering,102000\n" .
    "4,Dave,Marketing,68000\n" .
    "5,Eve,Engineering,88000\n"
);

$memoryBefore = memory_get_usage();

// Build a lazy pipeline — nothing reads the file yet
$allRows        = readCsvLazily($tempFile);
$engineeringRows = filterRows($allRows, 'department', 'Engineering');

echo "Memory after pipeline setup (nothing read yet): "
    . number_format(memory_get_usage() - $memoryBefore) . " bytes\n\n";

// NOW we pull values — file is read one line at a time
echo "Engineering department employees:\n";
foreach ($engineeringRows as $index => $employee) {
    echo "  Row $index: {$employee['name']} — \${$employee['salary']}\n";
}

unlink($tempFile); // remove the temp file
Output
Memory after pipeline setup (nothing read yet): a small, non-zero figure (two Generator objects; varies by PHP version)
Engineering department employees:
Row 0: Alice — $95000
Row 2: Carol — $102000
Row 4: Eve — $88000
[cleanup] File handle closed.
Pro Tip: The finally Block is Your Best Friend in Generators
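If a consumer breaks out of a foreach early, the generator is simply left suspended. When the last reference to it goes away (the variable is unset, goes out of scope, or the script ends), PHP destroys the generator and runs any pending finally blocks. That makes finally the right place to release file handles, database cursors, or network connections. Never rely on the generator running to completion for cleanup, and unset long-lived generator variables when the cleanup needs to happen promptly.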
Production Insight
Generator pipelines avoid the memory doubling of chained array functions like array_filter(array_map(...)).
Each stage holds only one element at a time — total memory stays constant regardless of data volume.
Rule: when you need multiple transformations on a large dataset, chain generators instead of array functions.
Key Takeaway
Generators keep memory at O(1) — one row at a time.
Chain them like Unix pipes for lazy transformations.
Always use finally blocks for cleanup: they run when an abandoned generator is destroyed, not only on normal completion.
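When exactly the cleanup runs after an early break is worth seeing once. This sketch records events in a log array instead of touching real files; trackedResource() is a made-up stand-in for a generator that owns a resource.

```php
<?php

$log = [];

// Stand-in for a generator that opens a resource and must release it.
function trackedResource(array &$log): Generator
{
    $log[] = 'open';
    try {
        yield 1;
        yield 2;
        yield 3;
    } finally {
        $log[] = 'close';
    }
}

$gen = trackedResource($log);

foreach ($gen as $value) {
    $log[] = "got $value";
    break; // stop after the first value
}

// $gen still references the suspended generator, so cleanup has not run yet.
$log[] = 'after loop';

unset($gen); // last reference gone: the generator is destroyed and finally runs

$log[] = 'after unset';

echo implode(' | ', $log) . "\n";
// open | got 1 | after loop | close | after unset
```

In a long-running worker this timing matters: a generator kept in a property or a static variable holds its file handle open until that reference is cleared.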

Two-Way Communication — send(), throw(), and getReturn() in Depth

Most tutorials stop at yield-as-output. But generators are bidirectional channels. The send() method lets you push a value INTO a running generator — the yield expression evaluates to whatever you sent. This unlocks coroutine-style patterns: think cooperative task schedulers, stateful parsers, or async I/O pipelines.

The flow is: you call send($value), the generator resumes, the yield expression evaluates to $value, execution continues to the next yield, and send() returns the newly yielded value. If there's no next yield, send() returns null.

throw() is send()'s sibling: it injects an exception at the point of suspension instead of a value. This lets a scheduler cancel a coroutine cleanly without a global flag variable.

getReturn() extracts the value from an explicit return statement inside the generator. This is only accessible after the generator is exhausted (valid() === false). It's perfect for aggregations: let the generator yield intermediate progress and return the final computed result — callers get both streaming updates and a clean final answer without two separate functions.

CoroutineDemo.php (PHP)
<?php

/**
 * A running-total accumulator implemented as a coroutine.
 *
 * - Yields a status string each time a number is received.
 * - Never returns on its own; the demo below stops it by injecting an exception with throw().
 *
 * Caller sends numbers; generator responds with a receipt.
 */
function runningTotalAccumulator(): Generator
{
    $total     = 0;
    $itemCount = 0;

    // The generator starts here when current()/send(null) is called
    echo "[accumulator] Ready to receive numbers.\n";

    while (true) {
        // yield does two jobs at once:
        //   1. Sends a status string OUT to the caller
        //   2. Receives the next number sent IN via send()
        $incomingNumber = yield "Current total: $total (items: $itemCount)";

        if ($incomingNumber === null) {
            // A plain next() (which is what foreach uses) resumes the generator
            // with null as the yield result, so we treat null as a no-op
            continue;
        }

        $total     += $incomingNumber;
        $itemCount++;
        echo "[accumulator] Added $incomingNumber → running total = $total\n";
    }
}

$accumulator = runningTotalAccumulator();

// Initialise the generator — runs to first yield, prints the 'Ready' line
$initialStatus = $accumulator->current();
echo "Status: $initialStatus\n\n";

// Send values in — each send() resumes the generator and returns the NEW yielded value
$numbers = [150, 75, 200, 25];
foreach ($numbers as $number) {
    $status = $accumulator->send($number); // sends number IN, gets status string OUT
    echo "Caller received: $status\n";
}

echo "\n";

// --- Demonstrating throw() for clean cancellation ---
echo "Cancelling accumulator early via throw()...\n";
try {
    $accumulator->throw(new RuntimeException('Cancelled by caller'));
} catch (RuntimeException $exception) {
    // throw() re-throws the exception in the caller's scope if the
    // generator doesn't catch it internally
    echo "Caught in caller: " . $exception->getMessage() . "\n";
}

// --- Demonstrating getReturn() ---

/**
 * A generator that streams progress updates and returns a summary array.
 */
function processOrderBatch(array $orderIds): Generator
{
    $processedCount = 0;
    $failedIds      = [];

    foreach ($orderIds as $orderId) {
        // Simulate work — in production this might be a DB write
        $success = ($orderId % 7 !== 0); // pretend multiples of 7 fail

        if ($success) {
            $processedCount++;
            yield "Processed order #$orderId"; // stream progress to caller
        } else {
            $failedIds[] = $orderId;
            yield "FAILED order #$orderId";
        }
    }

    // return is only accessible via getReturn() after generator exhausts
    return [
        'processed' => $processedCount,
        'failed'    => $failedIds,
        'total'     => count($orderIds),
    ];
}

$orderIds     = [101, 102, 103, 105, 112, 114, 119];
$orderBatch   = processOrderBatch($orderIds);

echo "\n--- Order Batch Processing ---\n";
foreach ($orderBatch as $progressMessage) {
    echo $progressMessage . "\n";
}

// Generator is now exhausted — getReturn() is safe to call
$summary = $orderBatch->getReturn();
echo "\nSummary: {$summary['processed']} processed, "
    . count($summary['failed']) . " failed "
    . "(IDs: " . implode(', ', $summary['failed']) . ")\n";
Output
[accumulator] Ready to receive numbers.
Status: Current total: 0 (items: 0)
[accumulator] Added 150 → running total = 150
Caller received: Current total: 150 (items: 1)
[accumulator] Added 75 → running total = 225
Caller received: Current total: 225 (items: 2)
[accumulator] Added 200 → running total = 425
Caller received: Current total: 425 (items: 3)
[accumulator] Added 25 → running total = 450
Caller received: Current total: 450 (items: 4)
Cancelling accumulator early via throw()...
Caught in caller: Cancelled by caller
--- Order Batch Processing ---
Processed order #101
Processed order #102
Processed order #103
FAILED order #105
FAILED order #112
Processed order #114
FAILED order #119
Summary: 4 processed, 3 failed (IDs: 105, 112, 119)
Interview Gold: send() on a Fresh Generator Primes It for You
Unlike Python, PHP does not require you to prime a generator before sending. If the generator has not started, send() first runs it to the first yield, then delivers your value as the result of that yield expression. No error is thrown, but the value produced by that first yield is silently discarded, because send() returns the next yielded value. Call current() first if you need it.
Production Insight
In a task scheduler built on send(), an unprimed generator loses its first yielded value without any error, which can hide control-flow bugs.
Read current() before the first send() whenever the initial yield carries data, and check valid() before sending to a generator that may already have finished.
Rule: treat send() as a two-phase handshake: initialise first, then communicate.
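What send() actually does on a generator that has not started is quick to check. echoBack() below is a throwaway generator written for this sketch; it reports back whatever it receives.

```php
<?php

function echoBack(): Generator
{
    $received = yield 'first';          // evaluates to whatever is sent in
    $received = yield "got: $received";
    yield "got: $received";
}

// Case 1: send() straight away. PHP runs to the first yield, discards
// 'first', then resumes with 'hello' and returns the next yielded value.
$unprimed = echoBack();
$reply    = $unprimed->send('hello');
echo $reply . "\n"; // got: hello

// Case 2: read current() first so the initial yielded value is not lost.
$primed  = echoBack();
$initial = $primed->current();      // 'first'
$reply2  = $primed->send('hello');  // 'got: hello'
echo "$initial, $reply2\n"; // first, got: hello
```

Both cases deliver 'hello' to the generator; the only difference is whether the caller ever sees 'first'.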
Key Takeaway
send() pushes values into the generator, yield receives them.
throw() injects exceptions — use for clean cancellation.
getReturn() only works after the generator is exhausted.
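A generator can also catch the injected exception itself and finish with a proper return value. The sketch below assumes a marker exception class (CancelRequested, a made-up name) used purely as a cancellation signal, and shows that getReturn() is rejected until the generator has actually returned.

```php
<?php

// Hypothetical marker exception used as a cancellation signal.
final class CancelRequested extends RuntimeException {}

function summingWorker(): Generator
{
    $total = 0;
    try {
        while (true) {
            $value = yield $total;   // hand the running total out, wait for input
            if ($value !== null) {
                $total += $value;
            }
        }
    } catch (CancelRequested $signal) {
        // Fall through: leaving the loop lets the generator finish normally.
    }
    return $total;                   // readable via getReturn() once finished
}

$worker = summingWorker();
$worker->current();                  // run to the first yield
$worker->send(10);
$worker->send(32);

try {
    $worker->getReturn();            // too early: the generator has not returned yet
} catch (Exception $e) {
    echo "getReturn() before completion throws " . get_class($e) . "\n";
}

$worker->throw(new CancelRequested()); // caught inside; generator runs to its return

var_dump($worker->valid());          // bool(false)
echo $worker->getReturn() . "\n";    // 42
```

Catching the signal inside the generator keeps the final total available to the caller, whereas an uncaught exception would end the generator with no return value to read.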

Generator Delegation with yield from — Composing Sequences Like a Pro

yield from lets one generator delegate iteration to another iterable — another generator, an array, or any Traversable. It's not just syntactic sugar for a nested foreach: it transparently proxies send() and throw() calls through to the inner generator and captures the inner generator's return value as the result of the yield from expression itself.

This is critical for recursive algorithms. Without yield from, you can't simply call a generator recursively and have it yield into the outer stream — you'd have to foreach the inner generator and re-yield each value. yield from collapses this into one clean expression.

The return value capture is the part most developers miss. When the delegated generator finishes, yield from evaluates to that generator's return value. This lets you build tree-walking or recursive descent parsers where each level both yields leaf values AND returns metadata to its parent — a pattern that's genuinely hard to achieve with iterators.

Performance note: yield from avoids the double-copy overhead of re-yielding in a loop. The Zend Engine links the inner generator's frame directly, so values flow through without an intermediate allocation per element.

GeneratorDelegation.php (PHP)
<?php

/**
 * Recursively walks a nested directory tree, yielding file paths.
 * Uses yield from for recursive delegation — no re-yielding loop needed.
 */
function walkDirectoryTree(string $directoryPath): Generator
{
    $fileCount = 0; // files found in this directory and everything below it

    foreach (new DirectoryIterator($directoryPath) as $entry) {
        if ($entry->isDot()) {
            continue; // skip '.' and '..'
        }

        if ($entry->isDir()) {
            // Delegate to a recursive call; send()/throw() are proxied through.
            // The return value of the inner generator is captured in $subDirCount.
            $subDirCount = yield from walkDirectoryTree($entry->getPathname());
            $fileCount  += $subDirCount;

            echo "[delegation] Subdirectory '{$entry->getFilename()}' contained $subDirCount file(s)\n";
        } else {
            $fileCount++;
            yield $entry->getPathname(); // yield the file path to the top-level caller
        }
    }

    // This return value flows back to the parent generator via yield from.
    // Counting as we go avoids walking the tree a second time.
    return $fileCount;
}

/**
 * Simpler example showing yield from with arrays and return capture.
 * This makes the delegation mechanics easy to observe without filesystem access.
 */
function innerSequence(string $label): Generator
{
    echo "[inner:$label] starting\n";
    yield "$label-A";
    yield "$label-B";
    echo "[inner:$label] done\n";
    return "metadata-from-$label"; // captured by yield from in the outer generator
}

function outerSequence(): Generator
{
    yield "outer-start";

    // yield from array — delegation works with plain arrays too
    yield from ["array-item-1", "array-item-2"];

    // yield from generator — return value is captured
    $returnedMetadata = yield from innerSequence('alpha');
    echo "[outer] inner returned: $returnedMetadata\n";

    $returnedMetadata = yield from innerSequence('beta');
    echo "[outer] inner returned: $returnedMetadata\n";

    yield "outer-end";

    return "outer-summary";
}

echo "=== Delegated sequence output ===\n";
foreach (outerSequence() as $index => $value) {
    echo "  [$index] $value\n";
}

// --- Real-world: merging multiple paginated API result generators ---

function simulateApiPage(string $resource, int $page): Generator
{
    // Simulates fetching a page of records from a REST API
    $itemsPerPage = 3;
    $start        = ($page - 1) * $itemsPerPage + 1;
    $end          = $start + $itemsPerPage - 1;

    for ($id = $start; $id <= $end; $id++) {
        yield ['resource' => $resource, 'id' => $id, 'page' => $page];
    }

    return $itemsPerPage; // how many items this page had
}

function fetchAllPages(string $resource, int $totalPages): Generator
{
    $totalFetched = 0;
    for ($page = 1; $page <= $totalPages; $page++) {
        $pageCount     = yield from simulateApiPage($resource, $page);
        $totalFetched += $pageCount;
        echo "[paginator] Fetched page $page ($pageCount items, $totalFetched total so far)\n";
    }
    return $totalFetched;
}

echo "\n=== Paginated API fetch ===\n";
$apiGenerator = fetchAllPages('products', 2);
foreach ($apiGenerator as $record) {
    echo "  Product ID {$record['id']} (page {$record['page']})\n";
}
echo "Total fetched: " . $apiGenerator->getReturn() . "\n";
Output
=== Delegated sequence output ===
[0] outer-start
[0] array-item-1
[1] array-item-2
[inner:alpha] starting
[0] alpha-A
[1] alpha-B
[inner:alpha] done
[outer] inner returned: metadata-from-alpha
[inner:beta] starting
[0] beta-A
[1] beta-B
[inner:beta] done
[outer] inner returned: metadata-from-beta
[1] outer-end
=== Paginated API fetch ===
Product ID 1 (page 1)
Product ID 2 (page 1)
Product ID 3 (page 1)
[paginator] Fetched page 1 (3 items, 3 total so far)
Product ID 4 (page 2)
Product ID 5 (page 2)
Product ID 6 (page 2)
[paginator] Fetched page 2 (3 items, 6 total so far)
Total fetched: 6
Pro Tip: yield from Works on Any Traversable, Not Just Generators
You can use yield from with plain arrays, ArrayIterators, SPL data structures, or any class implementing Traversable. This makes generators a universal 'lazy wrapper' around existing collections — you can gradually migrate array-based code to streaming without rewriting consumers.
Production Insight
yield from in deeply recursive structures can create long call chains with hundreds of nested generator objects.
Each nested generator retains its local variables until the entire chain unwinds — memory can spike on large datasets.
Rule: limit recursion depth in yield from chains, or use iterative approaches for very deep trees.
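One way to follow that rule is to replace the recursion with an explicit stack. The sketch below flattens a nested array rather than a directory tree so that it runs anywhere; both function names are made up for illustration.

```php
<?php

// Recursive version: one nested generator per level of depth.
function flattenRecursively(array $tree): Generator
{
    foreach ($tree as $item) {
        if (is_array($item)) {
            yield from flattenRecursively($item);
        } else {
            yield $item;
        }
    }
}

// Iterative version: a single generator with an explicit stack of pending items.
function flattenIteratively(array $tree): Generator
{
    $stack = array_reverse($tree); // reversed so array_pop() returns items in order
    while ($stack !== []) {
        $item = array_pop($stack);
        if (is_array($item)) {
            // Push children in reverse so they come off the stack in original order.
            foreach (array_reverse($item) as $child) {
                $stack[] = $child;
            }
        } else {
            yield $item;
        }
    }
}

$tree = [1, [2, [3, 4]], 5];

$recursive = iterator_to_array(flattenRecursively($tree), false);
$iterative = iterator_to_array(flattenIteratively($tree), false);

echo implode(',', $iterative) . "\n"; // 1,2,3,4,5
var_dump($recursive === $iterative);  // bool(true)
```

The trade-off is that the iterative version cannot hand a per-subtree return value back to a parent the way yield from does, so it suits plain traversal better than aggregation.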
Key Takeaway
yield from delegates iteration and proxies send()/throw().
It captures the inner generator's return value.
Performance: avoids per-element allocation vs manual re-yield loops.
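One delegation detail that bites in practice: yield from preserves the keys of whatever it delegates to, so keys repeat across delegated generators. A minimal sketch (function names are illustrative) of how that interacts with iterator_to_array():

```php
<?php

function firstBatch(): Generator
{
    yield 'a'; // key 0
    yield 'b'; // key 1
}

function secondBatch(): Generator
{
    yield 'c'; // key 0 again: every generator counts its own keys
    yield 'd'; // key 1
}

function combined(): Generator
{
    yield from firstBatch();
    yield from secondBatch();
}

// Default preserve_keys = true: later values overwrite earlier ones.
$collapsed = iterator_to_array(combined());

// preserve_keys = false: values are re-indexed and nothing is lost.
$complete = iterator_to_array(combined(), false);

echo implode(',', $collapsed) . "\n"; // c,d
echo implode(',', $complete) . "\n";  // a,b,c,d
```

A foreach over combined() still sees all four values; the loss only happens when the stream is collected into an array keyed by the duplicated keys.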

Real-World Pattern: Generator-Based Paginated API Consumer

Combining a paginating generator with send() creates a powerful pattern for consuming paginated REST APIs. The generator manages pagination state (cursor, page number, rate-limit handling) and yields individual records, either directly or by delegating each page to an inner generator with yield from. The caller sees one flat stream of records and never deals with pagination logic.

This pattern also allows injection of rate-limit delays via send() — the caller can tell the generator to pause if a 429 response is received. The generator can then sleep and retry the same page, all transparently.

The memory guarantee is critical here: even if the API has 10,000 pages of 100 records each, the generator holds at most one page of records in memory at a time, never the full result set. An array-based approach would collect all records, blowing through memory limits on large data sets.

PaginatedApiConsumer.php (PHP)
<?php

/**
 * Simulates an external API that returns paginated results.
 * In production, replace with Guzzle or curl calls.
 */
class ExternalApi
{
    public function fetchPage(string $endpoint, int $page): array
    {
        // Simulate network latency
        usleep(20000); // 20ms

        if ($page > 5) {
            return ['items' => [], 'next_page' => null];
        }

        $items = [];
        for ($i = 1; $i <= 20; $i++) {
            $items[] = [
                'id'   => ($page - 1) * 20 + $i,
                'name' => "Item " . (($page - 1) * 20 + $i),
            ];
        }

        return [
            'items'     => $items,
            'next_page' => $page + 1,
            'rate_limited' => false,
        ];
    }
}

/**
 * Generator that fetches all pages and yields individual records.
 * Accepts send() to inject control signals (e.g., 'throttle').
 */
function fetchAllRecords(string $endpoint): Generator
{
    $api      = new ExternalApi();
    $page     = 1;
    $total    = 0;

    while (true) {
        $response = $api->fetchPage($endpoint, $page);

        if (empty($response['items'])) {
            break; // no more data
        }

        if ($response['rate_limited'] ?? false) {
            // Yield an empty sentinel so the caller gets a chance to reply.
            // Only a caller driving the generator with send() can answer;
            // foreach resumes with null and falls through to the items below.
            $signal = yield [];
            if ($signal === 'throttle') {
                echo "[generator] Throttling...\n";
                sleep(1); // backoff
                continue; // retry same page
            }
        }

        foreach ($response['items'] as $item) {
            yield $item;
            $total++;
        }

        $page = $response['next_page'];
        if ($page === null) {
            break;
        }
    }

    return $total;
}

$generator = fetchAllRecords('/api/products');

echo "=== Paginated API Client ===\n";
foreach ($generator as $record) {
    if ($record === []) {
        continue; // rate-limit sentinel: no record to process
    }
    echo "  Processed: {$record['id']} - {$record['name']}\n";
    // Do not call $generator->send() inside this loop: send() resumes the
    // generator and foreach then advances it again, so a record is skipped.
    // To send 'throttle', drive the generator manually with current()/send().
}
echo "Total records: " . $generator->getReturn() . "\n";
echo "Peak memory: " . memory_get_peak_usage(true) / 1024 . " KB\n";
Output
=== Paginated API Client ===
Processed: 1 - Item 1
Processed: 2 - Item 2
...
Processed: 100 - Item 100
Total records: 100
Peak memory: 520 KB
Generator as a Remote Control Co-Routine
  • yield sends data out to the caller and waits for a reply (the sent value).
  • send() and yield form a request-response pair — like a coroutine.
  • This enables patterns: lazy pagination, interactive parsers, cooperative multitasking.
Production Insight
Rate-limit handling via send() keeps the generator decoupled from HTTP client logic.
The caller decides when to retry, not the generator — cleaner separation of concerns.
Rule: use send() for control signals (throttle, cancel, change parameters), not for data transformation.
Key Takeaway
A generator that owns the pagination loop (optionally delegating each page with yield from) builds clean paginated consumers.
Memory is bounded by one page, regardless of total dataset size.
Use send() for external control signals like rate-limiting.
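
The yield from variant and the send() handshake described in this section can be sketched as follows. This is a minimal illustration rather than a production client: fakeFetchPage(), yieldPage() and paginate() are hypothetical names, the fake API reports rate limiting exactly once (first attempt at page 2), and the retry skips the real sleep. The caller drives the generator manually, because foreach cannot send values.

```php
<?php
declare(strict_types=1);

/** Hypothetical stand-in for an HTTP client: page 2 is rate limited on its first attempt. */
function fakeFetchPage(int $page, int $attempt): array
{
    if ($page === 2 && $attempt === 1) {
        return ['rate_limited' => true, 'items' => [], 'next_page' => $page];
    }
    $first = ($page - 1) * 2 + 1;
    return [
        'rate_limited' => false,
        'items'        => [['id' => $first], ['id' => $first + 1]],
        'next_page'    => $page < 3 ? $page + 1 : null,
    ];
}

/** Inner generator: yields the records of one page and returns how many it yielded. */
function yieldPage(array $items): Generator
{
    foreach ($items as $item) {
        yield $item;
    }
    return count($items);
}

/** Outer generator: owns pagination state and delegates records via yield from. */
function paginate(): Generator
{
    $page = 1;
    $attempt = 1;
    $total = 0;
    while ($page !== null) {
        $response = fakeFetchPage($page, $attempt);
        if ($response['rate_limited']) {
            // Ask the caller what to do; the reply becomes the value of this yield.
            $signal = yield ['rate_limited' => $page];
            if ($signal === 'throttle') {
                $attempt++;   // a real client would sleep here before retrying
                continue;     // retry the same page
            }
            break;            // no reply: give up
        }
        // yield from streams the inner records and evaluates to the inner return value.
        $total += yield from yieldPage($response['items']);
        $page = $response['next_page'];
        $attempt = 1;
    }
    return $total;
}

$gen = paginate();
$ids = [];
$value = $gen->current();                 // run to the first yield
while ($gen->valid()) {
    if (isset($value['rate_limited'])) {
        $value = $gen->send('throttle');  // reply, and receive the next yielded value
        continue;
    }
    $ids[] = $value['id'];
    $gen->next();
    $value = $gen->current();
}

echo implode(',', $ids), ' total=', $gen->getReturn(), "\n"; // 1,2,3,4,5,6 total=6
```

The design choice worth noting is that the sentinel is a distinct shape (an array with a rate_limited key), so the caller can tell control messages apart from records without guessing.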

Yielding Keys and Values — Don't Forget Associative Arrays

Most devs treat generators like firehoses for numeric sequences. That's fine until you're mapping log lines to timestamps or building lookup tables from a 2GB CSV. You can yield key-value pairs just like an associative array: the syntax is yield $key => $value, and the consumer reads them with foreach ($gen as $key => $value). This matters when your downstream code expects named indexes: think JSON encoding, database upserts, or feeding a template engine.

The WHY is simple: memory. Building an actual associative array for a million rows will crater your 256MB container. Yielding the pairs keeps the hash map illusion without the allocation. It is only an illusion, though: a generator performs no hash lookups, does not enforce unique keys, and cannot be indexed with $gen[$key]. Skip the explicit keys, and you're stuck with sequential integer indices that force your team to write brittle positional logic. Use associative yields, and your API responses stay readable without blowing the budget.

user_import.php (PHP)
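<?php
// io.thecodeforge
function parseUserCsv(string $path): \Generator {
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new \RuntimeException("Cannot open $path");
    }
    try {
        $headers = fgetcsv($handle);
        if ($headers === false) {
            return; // empty file: nothing to yield
        }
        // Compare against false so only end-of-file ends the loop.
        while (($row = fgetcsv($handle)) !== false) {
            if (count($row) !== count($headers)) {
                continue; // blank or malformed line; array_combine() would throw
            }
            $record = array_combine($headers, $row);
            // user_id becomes the key the consumer sees in foreach.
            yield $record['user_id'] => [
                'email' => $record['email'],
                'role' => $record['role'],
                'created_at' => $record['created_at'],
            ];
        }
    } finally {
        fclose($handle); // runs even if the consumer stops early
    }
}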

foreach (parseUserCsv('/var/data/import.csv') as $id => $user) {
    echo "Processed user $id ({$user['email']})" . PHP_EOL;
}
Output
Processed user 1001 (alice@example.com)
Processed user 1002 (bob@example.com)
Processed user 1003 (carl@example.com)
Production Trap:
Yielding without explicit keys assigns sequential integer keys starting at 0. If you later materialise the stream with iterator_to_array() and pass it to json_encode() without JSON_FORCE_OBJECT, you get a JSON array, not an object, and array_key_exists() lookups by ID fail. That's a silent data-type bug that surfaces in your mobile client as 'Expected object, got array'.
Key Takeaway
Yield keys explicitly when the consumer expects named fields. It costs nothing and prevents a JSON object from silently turning into a JSON array.
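
The difference shows up as soon as a keyed and an unkeyed stream are materialised and encoded. A small sketch, with emailsOnly() and emailsById() as hypothetical generator names:

```php
<?php
function emailsOnly(): Generator
{
    yield 'alice@example.com';   // implicit key 0
    yield 'bob@example.com';     // implicit key 1
}

function emailsById(): Generator
{
    yield 1001 => 'alice@example.com';
    yield 1002 => 'bob@example.com';
}

$list = json_encode(iterator_to_array(emailsOnly()));
$map  = json_encode(iterator_to_array(emailsById()));

echo $list, "\n"; // ["alice@example.com","bob@example.com"]
echo $map, "\n";  // {"1001":"alice@example.com","1002":"bob@example.com"}
```

Note that iterator_to_array() builds the full array, so it only belongs at the edge of the pipeline, where the data set is already known to be small.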

Early Termination and Cleanup — Don't Leak Handles

Generators are lazy, which means they might never finish iterating. When a foreach hits a break or return, the generator stays suspended at its current yield and the rest of the function never runs. That's a problem if you're holding file handles, database cursors, or temp streams: they stay open for as long as something still references the generator object, and in long-running processes like workers or daemons that compounds silently.

You need a finally block. Wrap your resource acquisition in a try-finally, and yield inside the try. PHP runs the finally block when the generator completes normally and also when a suspended generator is destroyed, which happens as soon as the last reference to it goes away (unset($gen), reassignment, or the variable leaving scope). PHP's Generator has no public close() method, so after an early exit, release the reference explicitly if the variable would otherwise live on. Don't assume the loop completes. In production, consumers are cut off by timeouts, exceptions, or user cancellations. Protect the resource, not the iteration.

database_stream.php (PHP)
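<?php
// io.thecodeforge
// Assumes $pdo is an existing PDO connection. With the MySQL driver, also set
// PDO::MYSQL_ATTR_USE_BUFFERED_QUERY to false; otherwise the whole result set
// is buffered client-side before the first fetch().
function streamLargeQuery(PDO $pdo, string $sql): \Generator {
    $stmt = $pdo->prepare($sql);
    $stmt->execute();
    try {
        while (($row = $stmt->fetch(PDO::FETCH_ASSOC)) !== false) {
            yield $row['id'] => $row;
        }
    } finally {
        // Runs on normal completion and when the generator is destroyed mid-stream.
        $stmt->closeCursor();
        echo "Cursor closed early or normally." . PHP_EOL;
    }
}

$gen = streamLargeQuery($pdo, 'SELECT * FROM logs WHERE processed = 0 LIMIT 100000');
$seen = 0;
foreach ($gen as $id => $row) {
    if (++$seen === 10) {
        break; // Simulate early exit after 10 rows
    }
}
// break alone does not destroy the generator: $gen still references it.
// Dropping the last reference triggers the finally block immediately.
unset($gen);
// Outputs: Cursor closed early or normally.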
Output
Cursor closed early or normally.
Production Trap:
Without a finally block, an abandoned foreach leaves the database cursor open for as long as the generator object stays referenced. In a high-traffic worker, that can exhaust the connection pool quickly. Always pair resource-heavy generators with try-finally, and release the generator after an early exit, for deterministic cleanup.
Key Takeaway
Wrap resource-based generators in try-finally. PHP runs the finally block when the generator is destroyed, even after early termination. Your production connections will thank you.
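
The timing of that cleanup is easy to check without a database. In this sketch, trackedRows() is a hypothetical generator that records its lifecycle in an ArrayObject instead of opening a real cursor:

```php
<?php
function trackedRows(ArrayObject $log): Generator
{
    $log[] = 'open';           // stands in for acquiring a cursor or file handle
    try {
        foreach ([1, 2, 3] as $row) {
            yield $row;
        }
    } finally {
        $log[] = 'closed';     // stands in for closeCursor() / fclose()
    }
}

$log = new ArrayObject();
$gen = trackedRows($log);

foreach ($gen as $row) {
    break;                     // stop after the first row
}
$afterBreak = $log->getArrayCopy();   // ['open']: finally has not run yet

unset($gen);                          // last reference gone, generator destroyed
$afterUnset = $log->getArrayCopy();   // ['open', 'closed']

echo implode(',', $afterBreak), ' | ', implode(',', $afterUnset), "\n";
```

The break by itself leaves the generator suspended inside the try block; it is the unset() that triggers the finally.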
● Production incident · POST-MORTEM · severity: high

The 4 GB CSV That Crashed a Production Reporting Job

Symptom
PHP process dies with 'Allowed memory size exhausted' error after processing ~300k rows of a 2M-row CSV. The OOM killer terminates the container after ~4 minutes of CPU thrashing.
Assumption
The dev team assumed file() and then iterating over the resulting array was fine because 'it worked fine on staging with 10k rows.' They didn't test at production scale.
Root cause
The original code called file($path), which loads the entire CSV into an array of strings, one full line per element. For a 2M-row file averaging 500 bytes per row, that's ~1 GB for the raw lines alone. PHP's per-string overhead (~56 bytes, roughly 110 MB across 2M rows) and the job's own working data come on top of that, and the container had only a 1.5 GB memory limit.
Fix
Replaced the file() call with a generator that reads one line at a time via fgetcsv() and yields each row. Peak memory dropped from over 1 GB to ~2 MB. The report now completes in under 3 minutes with 50 MB headroom.
Key lesson
  • Always assume input size is unbounded when processing files or API responses in production.
  • Test memory behaviour under realistic row counts — 10x your expected max is a good rule.
  • Use generators or SplFixedArray for any operation that could grow beyond a few thousand items.
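
As an illustration of the fix described in this post-mortem, here is a hedged sketch of a streaming replacement for file(). The helper name csvRows(), the temporary demo file, and the three-column layout are assumptions for the example; the original job's schema is not shown in this article.

```php
<?php
/** Hypothetical helper: streams parsed CSV rows one at a time. */
function csvRows(string $path): Generator
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    try {
        // Delimiter, enclosure and escape are passed explicitly (escape disabled).
        while (($row = fgetcsv($handle, null, ',', '"', '')) !== false) {
            if ($row === [null]) {
                continue; // blank line
            }
            yield $row;
        }
    } finally {
        fclose($handle);
    }
}

// Demo data: a small temporary CSV standing in for the production file.
$path = tempnam(sys_get_temp_dir(), 'csv');
$out = fopen($path, 'w');
for ($i = 1; $i <= 1000; $i++) {
    fputcsv($out, [$i, "user$i@example.com", $i * 2], ',', '"', '');
}
fclose($out);

// Aggregate while streaming: no row outlives its loop iteration.
$rowCount = 0;
$sum = 0;
foreach (csvRows($path) as $row) {
    $rowCount++;
    $sum += (int) $row[2];
}
unlink($path);

echo "$rowCount rows, sum=$sum\n"; // 1000 rows, sum=1001000
```

The consumer computes its result incrementally, which is what keeps memory flat; collecting the rows into an array inside the loop would reintroduce the original problem.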
Production debug guide
Symptom → Action: Diagnose generator issues fast
Symptom · 01
Generator yields no values / foreach produces nothing
Fix
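Verify the generator function doesn't hit a return (or fall off the end) before reaching any yield, for example on an empty file or a failed lookup. Also check whether the generator was already iterated elsewhere: generators are single-use, and traversing a finished one a second time throws an Exception rather than looping zero times.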
Symptom · 02
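First yielded value goes missing when send() is the first call on a generator
Fix
PHP does not throw here and does not require priming. send() on a fresh generator runs it to the first yield, delivers the sent value as that yield's result, and returns the value of the next yield, so the first yielded value is never handed to the caller. Call current() first if you need it, then send().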
Symptom · 03
Memory usage still high despite using a generator
Fix
Verify you're not storing yielded values into an array inside the foreach loop. Also check that the generator itself doesn't hold references to large objects in local variables — those stay alive for the generator's lifetime.
Symptom · 04
Generator never reaches code after yield (infinite loop observed?)
Fix
Check if the caller is using send() and the generator's yield expression is being assigned — if the loop condition depends on the sent value, ensure it eventually becomes falsy.
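
How send() behaves on a generator that has not started yet is easiest to see side by side. Per the PHP manual, send() first advances an unstarted generator to its first yield and then delivers the value, so no priming call is required. The sketch below uses a hypothetical echoer() generator:

```php
<?php
function echoer(): Generator
{
    $received = [];
    $received[] = yield 'first';
    $received[] = yield 'second';
    return $received;
}

// Case 1: send() is the very first call. The generator runs to its first
// yield, receives 'A' there, and send() returns the NEXT yielded value.
// The caller never sees 'first'.
$a = echoer();
$fromSend = $a->send('A');   // 'second'

// Case 2: read the first yielded value with current(), then reply.
$b = echoer();
$first  = $b->current();     // 'first'
$second = $b->send('A');     // 'second'
$b->send('B');               // generator finishes here
$all = $b->getReturn();      // ['A', 'B']

echo "$fromSend / $first, $second / ", implode(',', $all), "\n";
```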
★ Generator Debug Cheat Sheet
Quick commands and checks for diagnosing generator problems in production.
Memory spike despite generator
Immediate action
Check if the loop is building an array from yielded values.
Commands
memory_get_usage(true) before and after the generator loop
xdebug_debug_zval('generatorVar') to inspect retained references
Fix now
Remove array accumulation in the consumer — process each yield directly.
Generator doesn't start / first value missing
Immediate action
Verify you called current() before next() or just used foreach.
Commands
$gen->valid() to check if generator is already exhausted
var_dump($gen->current()) immediately after creation
Fix now
If using foreach, ensure you didn't accidentally iterate the generator twice.
send() skips the first yielded value
Immediate action
Identify where the generator was created and whether anything read from it before the first send().
Commands
Check if $gen->current() was called before send()
var_dump() the return value of each send() call to see which yield it came from
Fix now
Read the first value with current(), then call send($val).
Generator leak (never cleaned up)
Immediate action
Check if the consumer breaks out early while still holding a reference to the generator.
Commands
Use a finally block inside the generator to release resources
unset($gen) (or assign null) in the consumer after early exit; PHP's Generator has no close() method
Fix now
If using foreach, wrap it in try-finally and unset the generator variable in the finally.
Array vs Generator Iteration
| Aspect | Array-Based Iteration | Generator-Based Iteration |
| --- | --- | --- |
| Memory usage (1M rows) | ~100–800 MB depending on row size | ~1–2 MB regardless of total rows |
| Time to first value | After ALL values computed/loaded | After FIRST value computed (immediate) |
| Rewindable | Yes: iterate as many times as needed | No: exhausted generator cannot rewind |
| Bidirectional (send values in) | No: read-only | Yes: via send() and throw() |
| Return a final value | The array IS the return value | Via return + getReturn() after exhaustion |
| Recursive composition | Manual re-merging of arrays | Clean via yield from with return capture |
| Lazy / short-circuit friendly | No: full computation always happens | Yes: unused values are never computed |
| Type-hinting in PHPDoc | array<int, MyClass> | Generator<TKey, TValue, TSend, TReturn> |
| Works as argument to foreach | Yes | Yes (implements Iterator) |
| Error propagation into sequence | Not applicable | Via throw(): injects exception at yield |
| PHP version requirement | All versions | PHP 5.5+ (yield from: PHP 7.0+) |

Key takeaways

1. A generator function returns a Generator object immediately: zero code inside it runs until you call current(), next(), or send(). This surprises even experienced developers.
2. yield is a two-way valve: it sends a value out AND receives a value in (via send()). The yielded-OUT value is what callers see; the sent-IN value is what yield evaluates to inside the generator body.
3. yield from isn't just a loop shortcut: it transparently proxies send() and throw() to the inner generator AND captures the inner generator's return value as an expression result, enabling clean recursive composition.
4. Generators are single-use and cannot be rewound once started. If you need a replayable lazy sequence, encapsulate the generator factory in an IteratorAggregate class so getIterator() creates a fresh generator each time.
5. Real memory savings come from not storing intermediate results. Measure memory with memory_get_usage() to validate your generator is actually lazy in production.
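
A rough way to validate laziness is to compare the memory growth of an array against the highest usage seen while iterating a generator. The sketch below is an approximation under simple assumptions (integer payloads, a hypothetical numbers() generator); absolute byte counts vary by PHP version and build, so only the relative difference is meaningful.

```php
<?php
function numbers(int $count): Generator
{
    for ($i = 1; $i <= $count; $i++) {
        yield $i;
    }
}

$count = 100000;

// Array: every element exists before the first one is used.
$before = memory_get_usage();
$all = range(1, $count);
$arrayBytes = memory_get_usage() - $before;
$arraySum = array_sum($all);
unset($all);

// Generator: track the highest usage observed while iterating.
$before = memory_get_usage();
$generatorBytes = 0;
$generatorSum = 0;
foreach (numbers($count) as $n) {
    $generatorSum += $n;
    $generatorBytes = max($generatorBytes, memory_get_usage() - $before);
}

printf("array: %d bytes, generator: %d bytes\n", $arrayBytes, $generatorBytes);
```

If the generator figure grows with $count, something in the loop or inside the generator is accumulating values.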

Common mistakes to avoid


Calling send($value) first on a brand-new generator

Symptom
The first yielded value silently disappears. PHP runs the generator to its first yield, hands the sent value to that yield expression, and send() returns the value of the next yield. No error is thrown, unlike Python, which requires priming.
Fix
If you need the first yielded value, read it with current() before the first send(). Think of it as a handshake: look at what the generator offered before you reply.

Assuming a generator can be iterated twice

Symptom
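After the first foreach exhausts it, a second foreach throws an Exception ('Cannot traverse an already closed generator'). A generator abandoned partway through cannot be restarted either: rewinding it once it has advanced past its first yield also throws.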
Fix
Generators are single-use. If you need to iterate multiple times, wrap the generator factory in a class that re-calls the generator function on each getIterator() call (implement IteratorAggregate), or store results in an array explicitly.

Forgetting that returning from a generator doesn't work like a normal function return

Symptom
Developers expect $result = someGenerator() to capture the return value, but it just gets the Generator object; the actual return value is only accessible via $gen->getReturn() AFTER the generator is fully exhausted.
Fix
Always drain the generator (for example with foreach, or with a while ($gen->valid()) loop that calls next()) before calling getReturn(); otherwise you'll get a 'Cannot get return value of a generator that hasn't returned' exception.
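
The IteratorAggregate approach to replayable sequences can be sketched like this. ReplayableRange is a hypothetical class for illustration, and the constructor syntax assumes PHP 8.0 or later:

```php
<?php
final class ReplayableRange implements IteratorAggregate
{
    public function __construct(private int $from, private int $to) {}

    // A fresh generator is created on every call, so each foreach starts over.
    public function getIterator(): Generator
    {
        for ($i = $this->from; $i <= $this->to; $i++) {
            yield $i;
        }
    }
}

$range = new ReplayableRange(1, 3);
$firstPass  = iterator_to_array($range);   // [1, 2, 3]
$secondPass = iterator_to_array($range);   // [1, 2, 3] again

// A bare generator, by contrast, refuses a second traversal.
$bare = $range->getIterator();
foreach ($bare as $value) {
    // first pass exhausts it
}
$message = null;
try {
    foreach ($bare as $value) {
        // never reached
    }
} catch (Exception $e) {
    $message = $e->getMessage();
}

echo $message, "\n";
```

The wrapper keeps the laziness: nothing is stored between passes, and the sequence is simply recomputed each time.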
INTERVIEW PREP · PRACTICE MODE

Interview Questions on This Topic

Q01 · SENIOR
What is the difference between yield and return in a PHP generator, and ...
Q02 · SENIOR
How does send() work internally — what value does the yield expression e...
Q03 · SENIOR
Explain yield from in PHP 7+. How does it differ from a foreach loop tha...
Q01 of 03 · SENIOR

What is the difference between yield and return in a PHP generator, and what happens if you call getReturn() before the generator is exhausted?

ANSWER
yield pauses execution and sends a value to the caller without exiting the function; return in a generator provides a final value that can be retrieved via getReturn() only after exhaustion. Calling getReturn() before the generator is exhausted throws an exception because the generator hasn't actually executed its return statement yet.
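
A compact way to see both halves of that answer, using a hypothetical sumRows() generator:

```php
<?php
function sumRows(array $rows): Generator
{
    $total = 0;
    foreach ($rows as $row) {
        $total += $row;
        yield $row;      // pauses here and hands $row to the caller
    }
    return $total;       // retrievable only once the generator has finished
}

$gen = sumRows([10, 20, 30]);

$early = null;
try {
    $gen->getReturn();   // too early: the return statement has not run
} catch (Exception $e) {
    $early = $e->getMessage();
}

foreach ($gen as $row) {
    // drain the generator
}
$total = $gen->getReturn();

echo "early call failed: $early\n";
echo "total: $total\n";  // total: 60
```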
FAQ · 5 QUESTIONS

Frequently Asked Questions

01
What is the difference between a PHP generator and a normal iterator?
02
Does using generators actually save memory in PHP?
03
Can I use a PHP generator with functions that expect an array, like array_map or array_filter?
04
What happens if I break out of a foreach loop that iterates a generator?
05
Can I use yield inside a try block?