Senior 9 min · March 05, 2026

Node.js Streams — Missing drain Handler Causes OOM

Missing drain handler in Transform caused RSS climb from 120 MB to 3.

N
Naren · Founder
Plain-English first. Then code. Then the interview question.
About
 ● Production Incident 🔎 Debug Guide
Quick Answer

- npm i or npm install Install Packages

Plain-English First

Imagine you're filling a bathtub from a fire hose. If you just blast the water all at once, it floods the bathroom. Streams are like turning that fire hose into a gentle tap — water flows in at a rate the tub can handle. A Buffer is the plug in the drain: it holds a fixed chunk of water (raw bytes) temporarily so you can inspect or move it before letting more in. Together, they let Node.js handle huge amounts of data without drowning in memory. The moment you understand that analogy at a mechanical level — not just as a metaphor — is the moment streams stop being confusing.

Every Node.js server you have ever run has been silently relying on Streams and Buffers — whether you knew it or not. When you serve a 4 GB video file, parse an incoming multipart upload, or pipe data from a database cursor to an HTTP response, you are in stream territory. Get it wrong and your server leaks memory, stalls under load, or corrupts binary data in ways that are genuinely nightmarish to debug at 2 AM. Get it right and you can process files larger than your available RAM with a flat, predictable memory footprint that holds steady under load.

The core problem Streams solve is the mismatch between producer speed and consumer speed. A database might emit rows faster than your HTTP client can receive them. A file system read might outpace a gzip compressor. Without a flow-control mechanism, the fast side buffers everything into memory until something crashes. Node.js Streams solve this with backpressure — a built-in signalling protocol between producers and consumers that says 'slow down' or 'keep going' without you writing a single line of coordination logic.

I have personally traced three separate production OOM incidents back to misunderstood stream backpressure — in two cases, engineers had used pipe() and assumed it handled everything safely, not realizing that pipe() does not propagate errors and does not protect against a broken backpressure chain inside a custom Transform.

By the end of this article you will understand how the Buffer class maps onto V8 memory outside the garbage collector, what highWaterMark actually controls (it is not a hard limit, and most engineers get this wrong), how to build a production-grade Transform stream, why pipeline() is almost always safer than pipe(), and the specific failure modes that only show up under load — never in your development environment.

Buffer Internals — V8 Heap vs External Memory

Buffer is one of the most misunderstood classes in Node.js, and the misunderstanding usually surfaces in production as an OOM kill that heap profiling tools cannot explain. Engineers look at the heap snapshot, see 50 MB, and cannot reconcile it with the 2 GB RSS climbing in their dashboards. The reason is where Buffer memory actually lives.

Before Node.js 4.5, Buffer used the V8 heap, meaning every Buffer allocation competed with JavaScript objects for garbage-collected memory. Since then, Buffer.alloc() and Buffer.allocUnsafe() allocate memory from a pool managed outside V8's heap using the C++ layer through libuv. This memory is not tracked by V8's garbage collector in the same way — it is reference-counted and returned to the pool when the Buffer is dereferenced. The GC knows a Buffer exists as a JavaScript object, but the actual bytes that Buffer points to are in external memory that the GC cannot compact or move.

This has a concrete production implication: your heap snapshot will show a clean 50 MB heap while your process RSS sits at 2 GB because of accumulated Buffer allocations. Every heap profiling tool, every Chrome DevTools memory snapshot, every heapUsed metric will look completely healthy. You must monitor process.memoryUsage().external and process.memoryUsage().rss to detect Buffer-related memory growth.

Buffer.alloc(size) zero-fills the allocated memory before returning it, which is safe but slower. Buffer.allocUnsafe(size) skips zero-filling, making it roughly 10x faster for large allocations. The 'unsafe' name is precise: the returned memory may contain bytes from previous allocations — fragments of other users' data, previous tokens, partial file contents — if you read from it before writing. For security-sensitive paths, always use Buffer.alloc(). For internal processing where you guarantee write-before-read, Buffer.allocUnsafe() is the correct production choice and the performance difference is real at high allocation rates.

io/thecodeforge/buffers/buffer-allocation.jsJAVASCRIPT
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
const { Buffer } = require('buffer');

// SAFE: zero-filled before returning — use for user-facing or security-sensitive data.
// The zero-fill is not optional safety theater — it prevents information disclosure.
const safeBuf = Buffer.alloc(1024);
console.log('alloc [0]:', safeBuf[0]); // 0 — guaranteed, always

// FAST: returned uninitialized — use for internal processing with write-before-read.
// The 'unsafe' refers to information disclosure risk, not memory safety.
const fastBuf = Buffer.allocUnsafe(1024);
console.log('allocUnsafe [0]:', fastBuf[0]); // unpredictable — could be any byte

// Small allocations (< 4 KB) share a pre-allocated 8 KB pool.
// This means two small allocUnsafe calls may share an underlying ArrayBuffer.
// Writing to one at the wrong offset can corrupt the other.
const smallA = Buffer.allocUnsafe(100);
const smallB = Buffer.allocUnsafe(200);
console.log('Same backing store:', smallA.buffer === smallB.buffer); // true for small allocs

// The key metric to watch in production — NOT heapUsed.
// Buffer memory shows up in external and rss, not heapUsed.
const mem = process.memoryUsage();
console.log({
  rss:          `${(mem.rss          / 1024 / 1024).toFixed(1)} MB`, // total process memory
  heapUsed:     `${(mem.heapUsed     / 1024 / 1024).toFixed(1)} MB`, // V8 JS objects
  external:     `${(mem.external     / 1024 / 1024).toFixed(1)} MB`, // Buffer bytes
  arrayBuffers: `${(mem.arrayBuffers / 1024 / 1024).toFixed(1)} MB`  // ArrayBuffer bytes
});

// If external climbs while heapUsed stays flat — Buffer leak.
Output
alloc [0]: 0
allocUnsafe [0]: 47
Same backing store: true
{ rss: '28.4 MB', heapUsed: '5.8 MB', external: '0.3 MB', arrayBuffers: '0.3 MB' }
Buffer memory is invisible to heap profilers
process.memoryUsage().external is your only reliable indicator of Buffer memory growth. Heap snapshots will not show it. If your RSS climbs while heapUsed is stable, you have a Buffer leak — almost always from a broken backpressure chain.
Production Insight
process.memoryUsage().external tracks Buffer memory that lives outside V8's heap.
Heap snapshots and V8 profilers will not reveal Buffer leaks — RSS climbing while heap stays flat is the only reliable telltale.
Rule: instrument process.memoryUsage() in your production health endpoint and alert on RSS growth rate, not just heapUsed.
Key Takeaway
Buffer memory lives outside V8's garbage-collected heap — this is intentional, not an oversight.
Heap profilers will not reveal Buffer leaks. Monitor process.memoryUsage().external and .rss.
If RSS climbs while heap stays flat, your Buffers are accumulating — find the backpressure break.
Buffer Allocation Decision
IfBuffer contains user input, authentication tokens, or data sent to clients
UseUse Buffer.alloc() — zero-filled, prevents information disclosure
IfInternal processing with guaranteed write-before-read (file reads, binary protocol encoding)
UseUse Buffer.allocUnsafe() — roughly 10x faster, safe when you control the write cycle
IfAllocation size < 4 KB and using allocUnsafe
UseBe aware of shared pool corruption risk — use Buffer.alloc() or manually manage a pool

Stream Types and Their Internal State Machines

Node.js provides five stream types, each with a distinct role in a data pipeline. Understanding their internal state machines is not academic — it is what lets you diagnose production issues where streams silently stop flowing, emit data after destruction, or hold memory that the GC cannot reclaim.

Every Readable stream has two operating modes: paused and flowing. In paused mode — the default — data is buffered internally and you must explicitly call read() to pull chunks out. In flowing mode, data is pushed to you automatically via data events as fast as the underlying source can produce it. Calling .resume(), piping to a Writable, or attaching a data listener switches to flowing mode. The most common cause of 'stream hang' bugs I have debugged is a Readable created and then left in paused mode with no consumer attached — data accumulates in the internal buffer, the highWaterMark is crossed, and the underlying source pauses, and nothing ever flows. The process looks healthy. No error is emitted. Everything is just silently stuck.

Writable streams have a simpler state machine driven by the callback in _write(). When _write() invokes its callback, the stream is ready to receive the next chunk. When the internal buffer crosses highWaterMark, write() returns false — this is the backpressure signal. The drain event fires when the buffer drops back below highWaterMark.

Duplex streams like TCP sockets combine both — independent Readable and Writable sides with independent state machines sharing one underlying resource. Transform streams like zlib.createGzip() are Duplex streams where the write side feeds into the read side through your _transform() implementation. PassThrough streams are identity Transforms useful for injecting inspection points into a pipeline without modifying data.

io/thecodeforge/streams/stream-state-inspection.jsJAVASCRIPT
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
const { Readable, Writable, Transform, PassThrough } = require('stream');

// --- Readable state inspection ---
const readable = new Readable({
  highWaterMark: 16 * 1024, // 16 KB internal buffer
  read(size) {
    // This is called when the consumer wants data.
    const shouldContinue = this.push(Buffer.from('data chunk'));
    if (!shouldContinue) {
      // Consumer not reading fast enough — stop producing.
      // The stream will call read() again when the consumer is ready.
    }
    this.push(null); // null signals end of stream
  }
});

console.log('Initial state:', {
  readableFlowing: readable.readableFlowing, // null = paused, no listeners
  readableLength:  readable.readableLength,   // 0 = nothing buffered yet
  readableEnded:   readable.readableEnded     // false = not done
});

readable.resume(); // switch to flowing mode — data events start firing

console.log('After resume:', {
  readableFlowing: readable.readableFlowing, // true
});

// --- Transform with destroy guard ---
const safeTransform = new Transform({
  highWaterMark: 16 * 1024,
  transform(chunk, encoding, callback) {
    if (this.destroyed) return callback();
    const processed = chunk.toString().toUpperCase();
    callback(null, Buffer.from(processed));
  },
  flush(callback) {
    // emit buffered remainder here, if any
    callback();
  }
});

// --- PassThrough for pipeline inspection ---
const inspector = new PassThrough();
let bytesThrough = 0;
inspector.on('data', chunk => {
  bytesThrough += chunk.length;
});
// Insert inspector between any two pipeline stages without affecting data flow.
Output
Initial state: { readableFlowing: null, readableLength: 0, readableEnded: false }
After resume: { readableFlowing: true }
Streams as a Factory Assembly Line
  • Readable = raw material supplier — produces data chunks on demand
  • Transform = processing station — modifies chunks and pushes downstream at downstream pace
  • Writable = packaging station — consumes final product and writes it
  • Backpressure = conveyor belt speed controller — pauses upstream when downstream is slow
  • highWaterMark = buffer shelf at each station — triggers pause signal when full
Production Insight
A Readable with readableFlowing === null has no consumer and silently accumulates data in its internal buffer.
This is the most common cause of the 'stream hang' bug class in production.
Rule: check readableFlowing first when debugging a stream that appears to have stopped — if null, the stream is paused and waiting for a consumer that never showed up.
Key Takeaway
Readable has two modes (paused and flowing) — null readableFlowing is the silent hang state with no consumer.
Writable backpressure is driven by write() returning false and the drain event — ignore these and you get an OOM kill.
Always guard _transform() with a destroyed check to prevent post-destroy data emission into an already-closed stream.

Backpressure — The Flow Control Protocol

Backpressure is the single most important concept in Node.js Streams, and the one most frequently misunderstood in practice. It is not a rate limiter, not a throttle, and not a buffer size configuration. It is a cooperative protocol between a Readable and a Writable where the Writable signals 'I need you to slow down' and the Readable obliges — if the producer is paying attention.

The mechanism works through the return value of write(). When you call writable.write(chunk), the method returns true if the internal buffer is below highWaterMark and false if it is at or above it. When write() returns false, the protocol says the producer should stop writing and wait for the drain event before sending more data. If the producer ignores this signal and keeps calling write(), the data is still buffered — but in memory, without any bound, until the process runs out of memory and the OOM killer fires. There is no automatic enforcement. Backpressure is cooperative, not mandatory.

The highWaterMark is not a hard limit. This is the specific detail that most engineers who have read about streams still get wrong in interviews. It is a heuristic threshold — 16,384 bytes for binary streams, 16 objects for objectMode streams by default — where write() starts returning false to signal the producer to pause. But the stream will still accept data beyond this point. The buffer can grow arbitrarily beyond highWaterMark if the producer ignores the signal. Think of highWaterMark as the 'please slow down' sign on a highway, not the physical guardrail at the edge of a cliff.

pipe() handles backpressure automatically within its direct neighbours: when the destination's write() returns false, pipe() calls readable.pause(). When drain fires, pipe() calls readable.resume(). This is why pipe() seems to work in simple cases. The failure mode — which only appears in production under sustained load — is when a custom Transform breaks the backpressure chain by calling its callback immediately regardless of whether the downstream stream has drained.

io/thecodeforge/streams/backpressure-manual.jsJAVASCRIPT
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
const fs = require('fs');

// Manual backpressure implementation — shown to illustrate the protocol.
// In production, use pipeline() which implements this correctly for you.
function copyWithBackpressure(sourcePath, destPath) {
  const readable = fs.createReadStream(sourcePath);
  const writable = fs.createWriteStream(destPath);

  readable.on('data', (chunk) => {
    const canContinue = writable.write(chunk);
    if (!canContinue) {
      // Backpressure engaged — pause the producer.
      readable.pause();
      // Resume only when writable has drained.
      writable.once('drain', () => readable.resume());
    }
  });

  readable.on('end', () => writable.end());

  // Error handling on both sides — without this, errors crash the process.
  readable.on('error', (err) => {
    console.error('Read error:', err.message);
    writable.destroy(err); // destroy the other stream too
  });
  writable.on('error', (err) => {
    console.error('Write error:', err.message);
    readable.destroy(err);
  });

  return new Promise((resolve, reject) => {
    writable.on('finish', resolve);
    writable.on('error', reject);
  });
}
Output
// No visible output — data flows from source to destination with flat memory.
// Use this pattern to understand the protocol; use pipeline() in production.
highWaterMark Is the Polite Request, Not the Physical Guardrail
  • Default highWaterMark: 16 KB for binary streams, 16 objects for objectMode
  • write() returns false when buffered data >= highWaterMark — this is the backpressure signal
  • The stream still accepts data after write() returns false — it keeps buffering in memory without bound
  • Only pausing the producer (or using pipe/pipeline) actually stops the data flow
  • Tuning highWaterMark lower = more frequent pauses but lower peak memory; higher = smoother throughput but larger memory spikes
Production Insight
Ignoring write() returning false is the single most common cause of OOM kills in stream-based Node.js services.
The buffer grows unbounded — there is no automatic circuit breaker unless you use pipe() or pipeline().
Rule: if you call write() manually in any loop or data handler, always check the return value and implement the pause/drain cycle. Or use pipeline() and let it do this correctly.
Key Takeaway
Backpressure is cooperative, not automatic — the producer must respect write() returning false or risk an OOM kill.
highWaterMark is advisory, not a hard limit — the stream keeps buffering if the signal is ignored.
The only way to actually stop data flow is to pause the readable or use pipe()/pipeline().
Backpressure Handling Strategy
IfYou are manually piping data between two streams
UseUse pipeline() from stream/promises — it handles backpressure, error propagation, and resource cleanup automatically
IfYou are inside a custom Transform
UseEnsure _transform() callback is only called after the downstream stream has drained. Do not call callback() synchronously if push() returned false
IfYou have long-lived pipelines with asymmetric speeds (fast producer, slow consumer)
UseAdd highWaterMark tuning and monitor process.memoryUsage().external to detect early backpressure breaks

pipe() vs pipeline() — Error Propagation and Resource Cleanup

pipe() is the most commonly used stream API, and in production it is also the most dangerous one when used without fully understanding its limitations. I have seen three separate post-mortem write-ups at different companies trace back to the same root cause: pipe() does not propagate errors, and it does not destroy streams on error.

Here is the concrete failure mode. If you have readable.pipe(transform).pipe(writable) and the transform stream emits an error, the error is emitted only on the transform. The readable stream is not notified, not paused, not destroyed — it keeps emitting data into a transform that is in an error state. The writable stream is not notified and not destroyed — it keeps waiting for data that may never arrive or may arrive in a corrupted state. Both streams hold their underlying resources: the readable holds an open file descriptor, the writable holds an open socket or file handle. Under sustained error conditions — a flaky upstream service that errors on 5% of requests — this accumulates EMFILE errors as file descriptors exhaust the OS limit.

pipeline() from stream/promises solves both problems with one API call. When any stream in the chain errors or closes prematurely, pipeline() automatically destroys all other streams in the chain, propagates the error as a rejected Promise, and ensures all resources are cleaned up. In Node.js 18+, pipeline() also supports async generators as pipeline stages, which allows you to inject stateful processing logic — like computing a hash or accumulating metrics — inline without writing a full Transform class.

stream.finished() is the complementary utility for monitoring a single stream's completion. It returns a Promise that resolves when a stream emits 'finish' or 'end', and rejects on error or premature close. Use it when you need to wait for a stream to complete without piping it anywhere — for example, waiting for a write stream to flush before reading the file it wrote.

io/thecodeforge/streams/pipeline-production.jsJAVASCRIPT
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
const { pipeline } = require('stream/promises');
const { finished }  = require('stream/promises');
const fs     = require('fs');
const zlib   = require('zlib');
const crypto = require('crypto');

// Production pattern: process an upload, compute its hash, and write compressed.
// The async generator stage is a pipeline-compatible way to do inline processing
// without writing a full Transform class.
async function processUpload(inputStream, outputPath) {
  const gzip       = zlib.createGzip({ level: 6 });
  const fileStream = fs.createWriteStream(outputPath);
  const hasher     = crypto.createHash('sha256');

  await pipeline(
    inputStream,
    gzip,
    // Async generator as a pipeline stage
    async function* (source) {
      for await (const chunk of source) {
        hasher.update(chunk);
        yield chunk;
      }
    },
    fileStream
  );

  return hasher.digest('hex');
}

// Usage with specific error handling
async function handleUpload(req, outputPath) {
  try {
    const sha256 = await processUpload(req, outputPath);
    console.log('Upload complete. SHA256:', sha256);
    return { success: true, hash: sha256 };
  } catch (err) {
    if (err.code === 'ERR_STREAM_PREMATURE_CLOSE') {
      console.info('Client disconnected before upload completed');
      return { success: false, reason: "client_disconnect" };
    }
    console.error('Upload failed:', err.message);
    return { success: false, reason: "processing_error", error: err.message };
  }
}

// stream.finished() — wait for a single stream to complete
async function waitForFlush(writeStream) {
  await finished(writeStream);
  console.log('Write stream fully flushed — safe to read the file now');
}
Output
Upload complete. SHA256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
pipe() Leaves Streams Open on Error
If any stream in a pipe chain errors, the others are not destroyed. This can exhaust file descriptors under sustained errors. pipeline() auto-destroys all streams on any error.
Production Insight
pipe() leaves all other streams open on error — under sustained failure conditions, this exhausts file descriptor limits with EMFILE errors.
pipeline() auto-destroys all streams in the chain on any error and surfaces the error as a rejected Promise.
Rule: use pipeline() for any multi-stream operation. The only defensible exception is a prototype or script where you have explicitly added error listeners and destroy() calls to every stream in the chain — and even then, pipeline() is shorter.
Key Takeaway
pipe() is convenient but silently swallows errors and leaves all other streams open when any one fails.
pipeline() is the production-safe default — it propagates errors and auto-destroys all streams in the chain.
Use stream.finished() for single-stream completion tracking and async generators for inline stateful processing.
Choosing pipe() vs pipeline()
IfProduction code with multi-stream operations and error handling requirements
UseAlways use pipeline() from stream/promises — auto-destroys streams, propagates errors, returns a Promise
IfNeed inline stateful processing without a full Transform class
UseUse an async generator function as a pipeline() stage
IfNeed to wait for a single stream to complete without piping
UseUse stream.finished() from stream/promises

Building a Custom Transform Stream for Production

Custom Transform streams are where most backpressure bugs are born. You write a _transform() method, call the callback, and assume everything works. Then under load, memory grows, or data gets corrupted, or the stream hangs. The problem is almost always the same: the Transform breaks the backpressure chain by not waiting for the downstream consumer to drain before signalling readiness to the upstream producer.

The contract of _transform() is simple but unforgiving: you receive a chunk, you process it, and you call the callback with the result (or null if you want to pass it through). The stream uses the timing of that callback to decide whether to ask for more data from upstream. If you call callback() synchronously on every chunk — even when the downstream is struggling — the upstream never pauses, and your Transform becomes an unbounded buffer.

A production-grade Transform must respect the backpressure signal from its own writable side. Concretely: if this.push() returns false because the readable buffer is full, you should not call the callback until the drain event fires on the readable side. The built-in Transform class handles this in most cases, but if you are using a custom push mechanism or if you are writing to multiple destinations, you must implement the pause/drain cycle yourself.

Another critical pattern: always guard _transform() with a this.destroyed check. The destroy() method sets the destroyed flag but does not abort in-flight _transform() calls. Without the guard, chunks queued before destroy will still be processed, pushed to an already-closed stream, causing ERR_STREAM_DESTROYED or silent data loss.

And don't forget _flush(). It's called when the writable side ends. If your Transform buffers data across chunks (like a CSV row parser waiting for a newline), _flush() is where you emit the remainder. Forgetting _flush() means data loss at the end of every stream.

io/thecodeforge/streams/custom-transform-production.jsJAVASCRIPT
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
const { Transform } = require('stream');

class LineParser extends Transform {
  constructor(options = {}) {
    options.objectMode = true; // emit full lines as strings
    super(options);
    this._buffer = '';
    this._paused = false;
  }

  _transform(chunk, encoding, callback) {
    // 1. Guard against post-destroy processing
    if (this.destroyed) {
      return callback();
    }

    this._buffer += chunk.toString();
    const lines = this._buffer.split('\n');
    // Keep the last (potentially incomplete) piece in buffer
    this._buffer = lines.pop();

    for (const line of lines) {
      const processed = this._processLine(line);
      const shouldContinue = this.push(processed);
      if (!shouldContinue) {
        // Backpressure: stop processing and wait for drain
        this._paused = true;
        this.once('drain', () => {
          this._paused = false;
          this._flushBuffer(callback);
        });
        return; // don't call callback yet
      }
    }

    callback();
  }

  _flush(callback) {
    // Emit the final partial line (if any) when writable side ends
    if (this._buffer.length > 0) {
      this.push(this._processLine(this._buffer));
      this._buffer = '';
    }
    callback();
  }

  _processLine(line) {
    // Example processing: trim and uppercase
    return line.trim().toUpperCase();
  }

  _flushBuffer(callback) {
    // If we were paused, resume emitting from buffer when drain fires
    // This is a simplified version; production code would re-emit queued items
    callback();
  }
}

// Usage
const { pipeline } = require('stream/promises');
const fs = require('fs');

async function processFile(inputPath, outputPath) {
  const readable = fs.createReadStream(inputPath, { encoding: 'utf8' });
  const writable = fs.createWriteStream(outputPath);
  const parser = new LineParser();

  await pipeline(readable, parser, writable);
  console.log('File processed');
}
Output
File processed
Always Implement _flush()
If your Transform buffers data across chunks, _flush() is the only place to emit the final partial piece. Forgetting it causes silent data loss at the end of every stream.
Production Insight
Synchronous callback in _transform() breaks backpressure — upstream never pauses and memory grows unbounded.
Missing _flush() loses final data — a bug that only appears on the last chunk of every stream.
Rule: always guard with this.destroyed, respect push() return value, and implement _flush() for any buffering Transform.
Key Takeaway
Custom Transforms break backpressure unless _transform() respects push() return value.
Always guard with this.destroyed and implement _flush() for buffering Transforms.
Production Transforms must be tested under asymmetric I/O speeds — not just unit tests.
Transform Implementation Checklist
IfDoes your Transform buffer data across chunks?
UseImplement _flush() to emit the final partial chunk
IfAre you calling callback() synchronously even when push() returned false?
UseImplement pause/drain cycle: save state, wait for drain, then call callback
IfCould the synchronous callback be called after stream is destroyed?
UseAdd if (this.destroyed) return callback() at the top of _transform()
● Production incidentPOST-MORTEMseverity: high

The 2 AM OOM Kill: How a Missing drain Handler Crashed Our Upload Service

Symptom
Upload service container killed by the OOM killer (exit code 137) under moderate load. RSS climbed linearly from 120 MB to 3.8 GB in under 90 seconds. No error logs appeared — just a silent SIGKILL. The first signal in the dashboards was the container restart counter incrementing, not any application-level error. By the time the on-call engineer connected, three nodes had already cycled.
Assumption
The team assumed pipe() handles all flow control automatically and that Node.js Streams are always memory-safe by default. This is a reasonable assumption if you have only read the documentation without implementing streams under asymmetric I/O conditions. The code had been in production for months handling normal upload volumes without incident — which made the assumption feel validated.
Root cause
The writable destination — an S3 multipart upload stream from the AWS SDK — could only accept data at roughly 5 MB/s due to network throughput constraints. The readable source — a fast local NVMe SSD — could produce data at roughly 500 MB/s. The pipe() call internally paused the readable when write() returned false, which was correct. The problem was in the custom Transform stream sitting between them. The Transform's _transform() method called its callback immediately without waiting for the underlying S3 stream to drain. This broke the backpressure chain at exactly the wrong point: the Transform kept accepting chunks from the readable, calling its callback, triggering the readable to continue, but never signalling the readable to actually pause. The S3 stream was a 100:1 speed mismatch away, and the Transform was buffering everything in between with no bound. Total accumulation rate: approximately 40 MB/s of unreachable but referenced Buffer memory.
Fix
Replaced the manual Transform wrapper with pipeline() from stream/promises, which enforces backpressure across the entire chain including async generators. Ensured the Transform's _flush() method awaited the underlying S3 stream's drain event before calling its callback. Added a highWaterMark of 16 KB on the Transform to limit in-flight chunks. Added RSS and external memory monitoring to the health endpoint, with a 500 MB RSS threshold that returns HTTP 503 — giving the load balancer visibility into memory pressure before the OOM killer acts.
Key lesson
  • pipe() only propagates backpressure to direct neighbours — wrapping a stream in a Transform breaks the chain unless the Transform correctly propagates write() return values all the way through
  • Always test upload paths under asymmetric speed conditions — a fast local SSD and a slow S3 stream is not an edge case, it is the production reality for any service that accepts uploads
  • Monitor RSS, not just heap — Buffer memory lives outside V8's garbage collector and will not show up in heap snapshots or heapUsed metrics
  • Add memory-based health check thresholds that return 503 before the OOM killer fires — the load balancer cannot shed load if the process gives no signal that it is in trouble
Production debug guideDiagnose stream and buffer issues in production Node.js services5 entries
Symptom · 01
RSS grows linearly but heap stays flat
Fix
Check for Buffer accumulation — run node --expose-gc -e "setInterval(()=>{global.gc();console.log(process.memoryUsage())}, 5000)" and watch the external and rss values over time. If external climbs after each GC cycle, Buffers are being allocated and not released. Cross-reference with stream activity — if RSS growth correlates with upload or download volume, the backpressure chain is broken somewhere in the pipeline.
Symptom · 02
Writable stream never fires the finish event
Fix
Verify the final callback in _write() is being called on every code path — add a temporary console.log immediately before each callback() invocation in your _write() and _transform() methods. The most common cause is a code path that returns early without calling the callback, which hangs the stream indefinitely. Also check whether the stream's end() method was called on the writable side — if the readable ended but end() was never called, finish will not fire.
Symptom · 03
pipe() stops transferring data mid-stream with no error
Fix
Check if write() returned false and the readable was paused but never resumed. Listen for the drain event on the writable to confirm backpressure engaged: writable.on('drain', () => console.log('drained')). If drain never fires, the writable may be stalled waiting for an underlying resource — a network socket, a slow disk, or a rate-limited API. Check process.memoryUsage().external to confirm whether data is accumulating in memory rather than flowing.
Symptom · 04
Transform stream emits data after destroy() was called
Fix
Guard _transform() with a destroyed check at the top: if (this.destroyed) return callback(). The destroy() method sets the destroyed flag but does not immediately abort in-flight _transform() calls that are already executing. Without this guard, chunks queued before destruction will still be processed and pushed, which can cause downstream errors or unexpected behavior on already-closed streams.
Symptom · 05
ERR_STREAM_PREMATURE_CLOSE in production logs
Fix
A stream in a pipeline was destroyed before all data was flushed — most commonly because a client disconnected mid-upload or mid-download. Use pipeline() with async/await and catch the specific error code to handle this gracefully rather than letting it propagate as an unhandled rejection. For upload services, distinguish ERR_STREAM_PREMATURE_CLOSE from other errors so you can clean up partial S3 multipart uploads without logging a false alarm.
★ Streams and Buffers — Quick Debug ReferenceRapid diagnostics for stream and memory issues in production Node.js applications
Memory leak suspected from Buffers
Immediate action
Capture heap snapshot and check Buffer count — but remember external memory will not appear in the snapshot
Commands
node --inspect app.js
node -e "const v8=require('v8');v8.writeHeapSnapshot()"
Fix now
Replace Buffer.allocUnsafe with Buffer.alloc in user-facing paths. Ensure Buffers are dereferenced after use. Instrument process.memoryUsage().external in your health endpoint and watch for growth that correlates with stream activity.
Stream hangs — no data flowing+
Immediate action
Check if the stream is stuck in paused mode with no consumer pulling data
Commands
node -e "console.log(readableStream.readableFlowing)"
node -e "readableStream.on('data', chunk => console.log(chunk.length))"
Fix now
Call readableStream.resume() or attach a data listener to switch to flowing mode. If the stream was piped and then unpiped without cleanup, it may be stuck in a partially-flowing state — destroy it and recreate from the source.
High CPU usage from stream processing+
Immediate action
Check if synchronous operations inside _transform() are blocking the event loop for meaningful durations
Commands
node --prof app.js
node --prof-process isolate-*.log
Fix now
Move CPU-heavy work to a worker thread using worker_threads. For processing that can be chunked, break it into async pieces using setImmediate() to yield back to the event loop between chunks. Verify the event loop lag metric using perf_hooks.monitorEventLoopDelay to quantify the blocking duration.
Backpressure not working — consumer overwhelmed+
Immediate action
Verify write() return value is being checked, or switch to pipeline() which handles this automatically
Commands
node -e "const ws=require('fs').createWriteStream('/dev/null');console.log(ws.write(Buffer.alloc(65537)))"
node -e "console.log(require('stream').getDefaultHighWaterMark(false))"
Fix now
Use pipeline() from stream/promises instead of manual pipe() or manual event wiring. If you are inside a custom Transform, ensure _transform() only calls its callback after the downstream stream has had time to drain — calling callback immediately breaks the backpressure chain.
Streams vs Buffers vs Pipe vs Pipeline
FeatureStreamBufferpipe()pipeline()
What it isAsync iterator over data chunksFixed-size binary containerMethod to connect streamsProduction-safe stream chaining
Memory locationInternal buffer (V8 C++ layer)External memory (libuv)N/AN/A
BackpressureBuilt-in via highWaterMarkN/A (fixed size)Automatic between direct neighboursAutomatic across entire chain
Error propagationEmits 'error' eventN/ADoes not propagate errorsPropagates all errors; destroys all streams
Resource cleanup on errorDepends on consumerN/ADoes not clean upAuto-destroys all streams
Async generator supportN/AN/ANoYes (Node 18+)
Production recommendationUse pipeline() instead of manual pipeUse Buffer.alloc for security, allocUnsafe for performanceAvoid in productionAlways use for multi-stream operations

Key takeaways

1
Streams process data in chunks without loading everything into memory
enabling handling of files larger than RAM.
2
Buffers live outside V8's heap
heap profilers won't reveal their memory usage; monitor external and RSS instead.
3
Backpressure is cooperative, not automatic
ignoring write() returning false leads to unbounded memory growth.
4
highWaterMark is advisory
the stream will still accept data beyond it if the producer ignores the signal.
5
pipe() does not propagate errors
always use pipeline() from stream/promises in production.
6
Custom Transform streams break backpressure if _transform() calls callback() synchronously without checking push() return value.
7
Always guard _transform() with a destroyed check and implement _flush() for buffering Transforms.

Common mistakes to avoid

5 patterns
×

Using pipe() without error handling

Symptom
Under failure conditions, streams are left open and file descriptors leak; process eventually hits EMFILE limits.
Fix
Replace every pipe() call with pipeline() from stream/promises. If you must use pipe(), attach error listeners to every stream and destroy() all manually on error.
×

Calling _transform() callback synchronously without checking backpressure

Symptom
Memory grows unbounded under asymmetric I/O speeds; OOM kill after sustained load.
Fix
In _transform(), check the return value of this.push(). If false, wait for the drain event before calling the callback. Or use pipeline() with async generators to delegate backpressure handling.
×

Using Buffer.allocUnsafe for user-facing data

Symptom
Information disclosure — previous allocation contents (e.g., other users' session tokens) may be exposed.
Fix
Use Buffer.alloc() for any data that will be sent to clients or stored with user input. Reserve Buffer.allocUnsafe for internal buffers that are immediately overwritten.
×

Monitoring only heapUsed for memory leaks

Symptom
RSS climbs to 2 GB but heapUsed stays at 50 MB — leak is invisible to heap profilers.
Fix
Monitor process.memoryUsage().external and .rss in addition to heapUsed. Set alerts on RSS growth rate. Take heap snapshots with --expose-gc and manual GC to see if external memory is unreclaimed.
×

Forgetting _flush() in a custom Transform

Symptom
Last chunk of data is silently lost — only noticeable when processing files that end with a newline and the final line is missing.
Fix
Always implement _flush() in any Transform that accumulates data. Emit the final partial chunk there.
INTERVIEW PREP · PRACTICE MODE

Interview Questions on This Topic

Q01SENIOR
Explain the concept of backpressure in Node.js Streams. How does highWat...
Q02SENIOR
Why might a heap snapshot show only 50 MB while RSS is 2 GB? How would y...
Q03SENIOR
What is the difference between pipe() and pipeline() in Node.js? When wo...
Q04SENIOR
Describe the internal state machine of a Readable stream. What does read...
Q05SENIOR
How does Buffer.allocUnsafe differ from Buffer.alloc, and what are the i...
Q01 of 05SENIOR

Explain the concept of backpressure in Node.js Streams. How does highWaterMark influence it?

ANSWER
Backpressure is a cooperative flow-control mechanism between a Readable and a Writable stream. When the Writable's internal buffer exceeds highWaterMark (default 16 KB for binary streams), write() returns false, signalling the producer to pause. The correct producer behaviour is to stop writing and wait for the drain event before resuming. highWaterMark is not a hard limit — if the producer ignores the false return, the buffer keeps growing in memory until OOM. pipe() and pipeline() automate this protocol, but custom Transforms can break the chain by calling _transform() callback immediately without waiting for downstream drain.
FAQ · 5 QUESTIONS

Frequently Asked Questions

01
What is the default highWaterMark for a Node.js stream?
02
Can a process run out of memory because of a stream?
03
How do I check if a stream is in flowing mode?
04
What does ERR_STREAM_PREMATURE_CLOSE mean?
05
Why does my custom Transform lose the last chunk of data?
🔥

That's Node.js. Mark it forged?

9 min read · try the examples if you haven't

Previous
Authentication with JWT in Node.js
8 / 18 · Node.js
Next
Node.js Event Emitter