Diagnosing Unbounded Buffering in Node.js Streams

Resident set size climbs steadily and never plateaus while a Node.js process pipes data from a fast source into a slower destination — a specific failure mode within the broader topic of stream backpressure & memory growth, part of Node.js Server-Side Memory Management.


Symptom-to-fix diagnostic matrix

Symptom Root Cause Immediate Action
RSS climbs linearly during file/socket copy write() return value ignored Check return value, pause on false
writableLength grows every tick No 'drain' listener registered Add drain handler before resuming
Fast DB read, slow HTTP write leaks memory Manual data handler with no pause Use stream.pipeline() instead
Custom transform buffers unboundedly Side array used instead of stream API Push through this.push() only
Works locally, OOMs under real traffic highWaterMark too high for payload size Lower highWaterMark, measure again

Root Cause

Every writable stream in Node.js keeps an internal buffer of chunks that have been accepted via .write() but not yet flushed to the underlying resource — disk, socket, or downstream consumer. That buffer has a configurable soft ceiling, highWaterMark, and once the buffered byte count crosses it, .write() returns false as a signal: “stop sending more data until I emit 'drain'.” This return value is the entire backpressure contract on the writable side. Nothing in the stream internals forces a producer to respect it — if the calling code ignores the boolean and keeps calling .write() on every 'data' event from a fast readable, the buffer keeps accepting chunks and growing without bound, exactly as covered generally in stream backpressure & memory growth.

This is different from a classic reference leak. There is no forgotten closure or detached object here — every buffered chunk is legitimately reachable and pending a real I/O operation. The heap is doing exactly what it was told: hold data until the destination catches up. The bug is that the destination never gets a chance to signal “I’m behind” in a way the producer listens to. You can confirm this distinction by watching writable.writableLength (bytes currently buffered) alongside process.memoryUsage().rss — in a genuine backpressure leak, both climb in lockstep with the number of chunks processed, not with time or request count, which rules out a slow leak elsewhere and points squarely at the write path.

The most common trigger is a manual readable.on('data', chunk => writable.write(chunk)) handler. The 'data' event fires as fast as the readable can produce chunks — from a fast disk read, a database cursor, or a decompression stream — with no built-in throttle. If the destination is a network socket, a slow downstream service, or another transform doing CPU-bound work, its write buffer fills far faster than it drains. Async iterators and stream.pipeline() avoid this specific failure because they only pull the next chunk once the previous write has resolved, but a hand-rolled data/write pair has no such coupling unless you add it yourself with an explicit pause/drain cycle.

The diagram below contrasts the two paths: a producer that honours the false return value against one that ignores it.

Node.js stream backpressure: respected vs ignored write() signal Two parallel timelines. The left timeline shows a readable emitting a chunk, writable.write() being called, a false return being checked, the readable pausing, and a drain event resuming it — buffer stays bounded. The right timeline shows the same write() call but its return value discarded, so the readable never pauses, writableLength keeps growing every tick, and RSS climbs without limit. Return value checked — bounded Readable emits fast chunk writable.write(chunk) called Return value === false checked Readable paused, wait for drain 'drain' fires, reading resumes writableLength stays near highWaterMark Return value ignored — unbounded Readable emits fast chunk write(chunk) called, return discarded No pause — next chunk pushed anyway writableLength grows every tick RSS climbs without limit eventual OOM under real load Fix: check write(), or use pipeline()/for await

Step-by-Step Fix

Step 1 — Reproduce the growth with instrumentation

Action: Before touching any code, confirm the shape of the problem. Run the affected pipe under load and log writable.writableLength and process.memoryUsage().rss on an interval.

Expected output: Both values increase steadily and do not fall back down between chunks, even while 'data' events keep arriving normally.

Verification checkpoint: If writableLength oscillates and returns near zero periodically, backpressure is already working and the growth is elsewhere — check for a leak in a separate part of the code rather than continuing this fix.


Step 2 — Find where the write() return value is ignored

Action: Search the codebase for .write( calls whose boolean result is not checked, particularly inside readable.on('data', ...) handlers.

Expected output: A shortlist of call sites shaped like readable.on('data', c => writable.write(c)), where the return value has nowhere to go.

Verification checkpoint: Each flagged site should have no accompanying if (!ok) branch and no matching 'drain' listener anywhere in the same module.


Step 3 — Pause the source and wait for drain

Action: Capture the return value of .write(). When it is false, call readable.pause() immediately, then call readable.resume() only inside the writable’s 'drain' event handler.

Expected output: writableLength now rises to highWaterMark, dips as the destination flushes, and repeats in a bounded sawtooth instead of a straight climb.

Verification checkpoint: Chrome DevTools is not applicable to a headless Node process here — instead confirm in the terminal log from Step 1 that RSS growth stops increasing and settles into a stable band after the fix.


Step 4 — Prefer pipeline() or async iteration over manual loops

Action: Where the manual data/write pattern exists only to move bytes from A to B with no custom logic, replace it entirely with stream.pipeline(readable, writable, callback) or a for await (const chunk of readable) { ... } loop that awaits each write.

Expected output: The manual pause/resume/drain bookkeeping from Step 3 is no longer needed — pipeline() and async iteration handle it internally.

Verification checkpoint: pipeline()'s callback fires exactly once with null on success, or with the underlying error on failure — confirm no unhandled 'error' events are still attached separately on either stream.


Step 5 — Verify writableLength stays bounded under load

Action: Re-run the same load test used in Step 1 against the fixed code path, ideally with node --max-old-space-size=2048 set explicitly so the process fails predictably instead of being killed by the OS if the fix is incomplete.

Expected output: writableLength stays within a small multiple of highWaterMark (default 16384 bytes for streams in object mode off, or 16 objects in object mode) for the entire test duration.

Verification checkpoint: RSS reported by process.memoryUsage().rss plateaus within the first few seconds of sustained load and stays flat regardless of how long the test runs afterward.


Command & Code Reference

Manual pause/drain handling for a producer that must stay in 'data'-event mode:

// copy-with-backpressure.js — respects write()'s return value
const { createReadStream, createWriteStream } = require('fs');

const src = createReadStream('./big-input.log');
const dst = createWriteStream('./big-output.log');

src.on('data', (chunk) => {
  // write() returns false once highWaterMark is exceeded
  const ok = dst.write(chunk);
  if (!ok) {
    src.pause(); // stop pulling more data from the source
  }
});

dst.on('drain', () => {
  src.resume(); // destination caught up, safe to keep reading
});

src.on('end', () => dst.end());

Replacing the manual loop above with pipeline(), which is the recommended approach whenever no custom per-chunk logic is needed:

// copy-with-pipeline.js — backpressure handled internally
const { pipeline } = require('stream/promises');
const { createReadStream, createWriteStream } = require('fs');

async function copyFile(inPath, outPath) {
  // pipeline() awaits each write before pulling the next chunk
  await pipeline(
    createReadStream(inPath),
    createWriteStream(outPath),
  );
  // Rejects with the first stream error; no manual cleanup needed
}

copyFile('./big-input.log', './big-output.log')
  .then(() => console.log('copy complete, buffer stayed bounded'))
  .catch((err) => console.error('pipeline failed:', err));

Logging buffer and memory growth to confirm the fix before and after, using only process.memoryUsage() and writableLength:

// measure-buffer.js — attach to any writable under test
function watchBuffer(writable, label) {
  setInterval(() => {
    const mb = (process.memoryUsage().rss / 1048576).toFixed(1);
    // writableLength is bytes buffered but not yet flushed
    console.log(
      `${label} rss=${mb}MB ` +
      `writableLength=${writable.writableLength}B`,
    );
  }, 500).unref(); // unref so this timer never blocks exit
}

Verification & Regression Prevention

Metric Before fix Target after fix
writableLength under sustained load Climbs unbounded Bounded near highWaterMark
RSS during 60s load test Linear climb, no plateau Plateaus within a few seconds
write() calls with unchecked return Present in hot path Zero, or replaced by pipeline()

Add a lightweight ESLint rule to catch new regressions before they reach production, flagging discarded .write() return values:

// .eslintrc.js — flag write() calls with no captured result
module.exports = {
  rules: {
    // eslint-plugin-node's handle-callback-err covers callbacks;
    // pair it with a local rule requiring `if`/assignment around
    // any `.write(` call inside a `readable.on('data', ...)`
    // handler, or prefer stream/promises' pipeline() outright.
    'no-unused-expressions': ['error', { allowTernary: false }],
  },
};

For CI, add a monitoring threshold rather than only a unit test: run the affected pipe against a fixture file large enough to exceed highWaterMark many times over, and assert that process.memoryUsage().rss measured at the end of the run is within, say, 1.5x its value taken 2 seconds into the run — a flat ratio confirms the buffer never ran away. This pairs well with the general diagnosis techniques in diagnosing Node.js memory with heapdump & Clinic.js when the RSS growth is subtler than a straightforward stream copy, and with a broader budget check against Node.js memory limits & out-of-heap errors so the CI gate fails before a real deployment ever hits --max-old-space-size.


Frequently Asked Questions

Does raising highWaterMark fix the memory growth?

No — it only raises the ceiling before backpressure kicks in. If the code ignores write()'s return value and never waits for 'drain', the internal buffer will still grow past whatever highWaterMark you set, because nothing is enforcing the limit; highWaterMark is a threshold for signalling, not a hard cap on memory.

Why does writableLength keep growing even though write() sometimes returns true?

write() returns true whenever the buffer is below highWaterMark at that instant, but a fast producer can call write() many times per event-loop tick, faster than the writable can drain. Each individual call may report true while the cumulative writableLength still climbs, because nothing is pausing the producer between calls.

Do async iterators need manual backpressure handling like .pipe() does?

No. A for await...of loop over a readable stream, or piping into a writable via stream.pipeline(), already awaits internal readiness before pulling the next chunk. The main risk with async iterators is inside the loop body itself — an unawaited async write or a side-channel buffer (like pushing chunks into an array) reintroduces the same unbounded growth manually.