heapdump vs Clinic.js vs node --inspect for Node.js Memory

Choosing between the heapdump module, node --inspect with Chrome DevTools, and Clinic.js comes down to what access you have to the process and what you need to see. This page sits directly beneath the broader Node.js memory tooling workflow within Node.js Server-Side Memory Management.

Symptom-to-Fix Diagnostic Matrix

Symptom | Root Cause | Immediate Action
Can’t decide which tool to reach for first | No shared decision rule across the team | Classify by access model, then overhead
heapdump fails to npm install in CI | Native node-gyp build, no compiler in image | Switch to built-in v8.writeHeapSnapshot()
--inspect port left open after debugging | Tunnel or flag not torn down post-incident | Kill the SSH tunnel; restart without --inspect
Clinic.js report shows no clear heap trend | Doctor is a classifier, not an attributor | Pull a .heapsnapshot; Doctor alone won’t attribute
Snapshot capture adds seconds of latency | Full GC + serialize blocks the event loop | Only capture on a drained, load-balancer-pulled node

Root Cause

The three tools exist because they answer different questions about the same problem and require different levels of access to the running process. heapdump (and its built-in equivalent, v8.writeHeapSnapshot()) answers “what is currently retained”. It needs nothing more than the ability to send a signal or call a function inside the process, which makes it the only option on a locked-down production host where you have a shell but no network path to a debug port. node --inspect answers the much broader question “let me look inside this running process interactively”: stepping through code, evaluating expressions, watching allocations live. It requires an open inspector protocol port, which is a serious attack surface if reachable from outside 127.0.0.1. Clinic.js sits upstream of both. Its Doctor command answers “do I even have a heap problem, or is this event-loop delay or CPU” by sampling process.memoryUsage(), active handles, and loop lag under generated load and rendering synchronized charts. Its HeapProfiler subcommand takes a lower-overhead sampling profile of allocation sites without a full stop-the-world snapshot.

Overhead scales in the opposite direction from access requirements. A signal-triggered .heapsnapshot write is a single blocking pause: expensive per capture but otherwise silent. --inspect with the Memory panel open adds continuous instrumentation while attached, changing GC timing for as long as the DevTools session is live. Clinic.js wraps your process with sampling hooks for the full duration of the run, which is the highest sustained overhead of the three but the only one that produces a synchronized, correlated view across memory, CPU, and event-loop delay rather than one signal in isolation. Picking the wrong one for the access you actually have is why teams either open a debug port they didn’t need to, or reach for a full snapshot when Doctor alone would have told them the problem was elsewhere. That mistake is especially costly on the kind of per-request heap growth described in SSR heap exhaustion and per-request memory, where the wrong capture tool can cost you the exact request window you needed to observe.
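The decision rule described above (access first, then classification, then the lowest-overhead capture) can be sketched as a small pure function. This is only an illustration: the function name, the option names, and the returned labels are hypothetical and do not belong to any of the three tools.

```javascript
// Hypothetical decision helper. Option names and return labels are
// illustrative only; they are not part of heapdump, Node.js, or Clinic.js.
function chooseTool({ canSignal, canTunnelDebugPort, faultClassified, needInteractive }) {
  // No shell and no debug port: nothing can be captured yet.
  if (!canSignal && !canTunnelDebugPort) {
    return 'get-access-approved';
  }
  // Classify before capturing: confirm it is a heap problem at all.
  if (!faultClassified) {
    return 'clinic-doctor';
  }
  // Interactive inspection is only worth the open port when it is needed.
  if (canTunnelDebugPort && needInteractive) {
    return 'node-inspect';
  }
  // Otherwise prefer the single blocking pause of a snapshot.
  if (canSignal) {
    return 'heap-snapshot';
  }
  // A debug port is the only access available.
  return 'node-inspect';
}

console.log(chooseTool({
  canSignal: true,
  canTunnelDebugPort: false,
  faultClassified: true,
  needInteractive: false,
}));
```

Encoding the rule this way is mainly useful as a shared reference for a team; the ordering of the checks is the part that matters, not the labels.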

[Figure: Choosing a Node.js Memory Capture Tool. A flow starting from "What access do you have?" (shell only, debug port, or neither yet) that branches to three tools. heapdump / v8.writeHeapSnapshot: shell and signal access only, one blocking pause, outputs a .heapsnapshot file, used to attribute a confirmed leak to a retainer chain. node --inspect: debug port plus tunnel, overhead while attached, outputs a live DevTools session, used for interactive step-through and timeline recording. Clinic.js Doctor / HeapProfiler: wraps the process at launch, sustained overhead for the whole run, outputs a synchronized HTML report, used to classify heap versus loop delay before a deep capture. Typical order: classify with Doctor first to confirm a heap trend, then use heapdump or --inspect to attribute it to a retainer chain.]

Step-by-Step Fix

Work through these steps to pick the right tool the first time, instead of opening a debug port you didn’t need or waiting on a full snapshot when a lighter check would do.

Step 1 — Establish what access you actually have

Action: check whether you have an existing shell/SSH session to the host (signal access) or whether policy allows opening an inbound port for --inspect (network access). Many locked-down production environments allow the former but forbid the latter entirely.

Expected checkpoint: you can answer, in one sentence, “I can kill -USR2 this process” or “I can SSH-tunnel to its debug port” — if neither is true, stop and get access approved before profiling.

Step 2 — Classify the fault before capturing anything

Command: npx clinic doctor --on-port 'autocannon -d 30 localhost:$PORT' -- node server.js

Expected output: an HTML report with a verdict banner. If the verdict points to a memory issue and the memory chart climbs steadily, proceed to a snapshot. If it flags event-loop delay instead, investigate that first, and move on to interpreting heap snapshots only once memory is the confirmed fault.

Step 3 — Capture with the lowest-overhead tool your access allows

If you have shell access only: node --heapsnapshot-signal=SIGUSR2 server.js, then kill -USR2 <pid>. Expected output: a .heapsnapshot file appears in the working directory within seconds.

If you have network access and need interactivity: node --inspect=127.0.0.1:9229 server.js, then tunnel with ssh -L 9229:localhost:9229 host. Expected output: chrome://inspect shows the target under “Remote Target” once the tunnel is up.

Step 4 — Read the output with the matching workflow

DevTools path for snapshots: DevTools → Memory → Heap Snapshot → Comparison view, load both files, sort by # Delta — the exact sequence is walked through in take & compare heap snapshots. Expected checkpoint: a constructor with high positive delta and near-zero deletions is your candidate.

For a Clinic.js report, expected checkpoint: the memory chart’s floor visibly ratchets upward across load batches — a flat, recovering floor means no leak was reproduced yet.

Step 5 — Close the access you opened

Action: kill the SSH tunnel, stop any process launched with --inspect, and confirm with lsof -i :9229 (or your platform equivalent) that nothing is still listening. Expected checkpoint: the port returns “connection refused” from outside the host.
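As a complement to lsof, the "connection refused" checkpoint can also be probed from another machine with a few lines of Node.js. The sketch below uses only the built-in net module; the helper name probePort and its result labels are hypothetical, and the host name in the usage note is a placeholder.

```javascript
const net = require('net'); // built-in

// Hypothetical helper: resolves to 'open', 'refused', 'timeout',
// or 'error:<code>' for any other socket error.
function probePort(host, port, timeoutMs = 2000) {
  return new Promise((resolve) => {
    const socket = net.connect({ host, port });
    const finish = (result) => {
      socket.destroy();  // always release the socket
      resolve(result);   // later calls are ignored by the Promise
    };
    socket.setTimeout(timeoutMs, () => finish('timeout'));
    socket.once('connect', () => finish('open'));
    socket.once('error', (err) => {
      finish(err.code === 'ECONNREFUSED' ? 'refused' : `error:${err.code}`);
    });
  });
}
```

A call such as probePort('prod-host', 9229).then(console.log) should print refused once the inspector is gone. A firewall that silently drops packets will show up as timeout instead, which also means the port is not reachable from where the probe ran.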

Command & Code Reference

Use this reference to trigger a snapshot from your own signal handler when you need control over the filename or an upload step that the bare flag cannot give you.

const v8 = require('v8');     // built-in, ships with Node.js
const path = require('path');

// Sync write: forces a full GC, then serializes the heap.
function writeSnapshot(tag) {
  const dir = process.env.SNAPSHOT_DIR
    || '/var/tmp';             // must be writable at runtime
  const file = path.join(
    dir,
    `heap-${tag}-${process.pid}.heapsnapshot`
  );
  v8.writeHeapSnapshot(file); // blocks the event loop!
  return file;                // hand off to an uploader here
}

// SIGUSR2 is the conventional signal; confirm it is free on
// this host before relying on it (nodemon also claims it).
process.on('SIGUSR2', () => {
  const file = writeSnapshot('sigusr2');
  console.error(`heap snapshot: ${file}`); // stderr, not stdout
});
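
Because the write blocks the event loop, it can help to record how long each capture actually stalled the process. The following is a minimal sketch under stated assumptions: the helper name timedSnapshot is hypothetical, the 3000 ms budget is borrowed from the verification target later on this page, and the write parameter exists only so the timing logic can be exercised without producing a real snapshot.

```javascript
const v8 = require('v8'); // built-in

// Hypothetical helper: wraps a synchronous snapshot write and reports
// how long the event loop was blocked. `write` defaults to the real
// v8.writeHeapSnapshot and can be swapped for a stub in tests.
function timedSnapshot(file, write = v8.writeHeapSnapshot) {
  const start = process.hrtime.bigint();  // nanoseconds, monotonic
  const written = write(file);            // synchronous: blocks until done
  const stallMs = Number(process.hrtime.bigint() - start) / 1e6;
  // Assumed budget of 3000 ms; adjust to your own target.
  return { file: written, stallMs, withinBudget: stallMs < 3000 };
}
```

Logging stallMs next to the filename gives a record of what each capture cost on the host where it ran.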

Use this reference to open --inspect only over a tunnel, never on a routable interface, then attach DevTools for an interactive session.

# On the server: bind the inspector to loopback only.
node --inspect=127.0.0.1:9229 server.js

# From your machine: forward a local port over SSH.
ssh -N -L 9229:localhost:9229 deploy@prod-host

# Then in Chrome: chrome://inspect -> Configure ->
# add "localhost:9229" -> Open dedicated DevTools for Node.

Use this reference as a shared npm-script contract so the whole team runs Clinic.js the same way instead of memorising flag combinations.

{
  "scripts": {
    "profile:doctor": "clinic doctor -- node server.js",
    "profile:heap":
      "clinic heapprofiler -- node server.js"
  }
}

Verification & Regression Prevention

Confirm the chosen tool actually answered the question before closing the incident:

Metric | Target after verification
Snapshot capture stall on affected host | Under 3 s, on a drained instance only
--inspect port reachability from outside host | Refused; tunnel torn down
Clinic.js verdict re-run post-fix | No memory-issue banner
heapUsed floor across 3 load batches | Within ±5% of baseline
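
The last row of the table can be checked mechanically. The sketch below assumes you have already collected heapUsed samples (in bytes) for each load batch; the function name and the input shape are hypothetical, and the "floor" is taken to be the lowest sample within a batch.

```javascript
// Hypothetical helper: `batches` is an array of arrays of heapUsed
// samples in bytes, one inner array per load batch.
function floorsWithinTolerance(batches, baselineBytes, tolerance = 0.05) {
  // The floor of a batch is its lowest observed heapUsed value.
  const floors = batches.map((samples) => Math.min(...samples));
  const ok = floors.every(
    (floor) => Math.abs(floor - baselineBytes) <= baselineBytes * tolerance
  );
  return { floors, ok };
}

const MB = 1024 * 1024;
const result = floorsWithinTolerance(
  [[120 * MB, 101 * MB], [99 * MB, 140 * MB], [104 * MB, 150 * MB]],
  100 * MB
);
console.log(result.ok);
```

A floor that climbs from batch to batch fails the check even when the peaks look similar, which is the ratcheting pattern a leak produces.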

Add a lightweight CI/CD guard so a debug port never ships live: fail the deploy pipeline if --inspect appears in a production start script. Also add a monitoring alert on process.memoryUsage().heapUsed that pages when the floor exceeds 75% of --max-old-space-size across two consecutive samples. That alert is the trigger that should send someone to this decision flow in the first place, rather than straight to a snapshot.

# CI check: fail the build if --inspect leaks into prod scripts
if grep -n -e '--inspect' package.json Dockerfile; then
  echo "Remove --inspect from prod entrypoints"
  exit 1
fi
echo "OK: no --inspect in production entrypoints"
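
The paging rule for the monitoring alert (above 75% of the limit for two consecutive samples) can be sketched as a small stateful check. Several things here are assumptions: the helper name makeHeapAlarm is hypothetical, v8.getHeapStatistics().heap_size_limit is used as an approximation of --max-old-space-size, and each raw heapUsed reading stands in for the floor, which a real alert would derive from several readings.

```javascript
const v8 = require('v8'); // built-in

// Hypothetical helper: returns an observe() function that reports true
// once `consecutive` samples in a row exceed `ratio` of the limit.
function makeHeapAlarm({ limitBytes, ratio = 0.75, consecutive = 2 }) {
  let streak = 0;
  return function observe(heapUsedBytes) {
    // Any sample at or below the threshold resets the streak.
    streak = heapUsedBytes > limitBytes * ratio ? streak + 1 : 0;
    return streak >= consecutive;
  };
}

// Assumption: heap_size_limit approximates --max-old-space-size.
const observe = makeHeapAlarm({
  limitBytes: v8.getHeapStatistics().heap_size_limit,
});
if (observe(process.memoryUsage().heapUsed)) {
  console.error('heapUsed above 75% of the heap limit on consecutive samples');
}
```

Requiring consecutive breaches keeps a single pre-GC spike from paging anyone.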

Frequently Asked Questions

Is it safe to run node --inspect directly against a production process?

Only if the debug port is never exposed to the public network. --inspect binds to 127.0.0.1 by default, which is safe as long as you reach it through an SSH tunnel rather than binding --inspect=0.0.0.0. An open inspector port grants full code execution in the process, so treat it as equivalent to a shell running with the process's privileges.
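
The difference between a loopback bind and a routable one can also be checked at startup. The sketch below classifies launch arguments of the form --inspect[=[host:]port] and --inspect-brk[=[host:]port]; the helper name and its labels are hypothetical, other inspector flags are deliberately not handled, and the list of loopback hosts is an assumption you may need to extend.

```javascript
// Assumed set of hosts that count as loopback.
const LOOPBACK = new Set(['127.0.0.1', 'localhost', '::1', '[::1]']);

// Hypothetical helper: returns 'none', 'loopback', or 'exposed'
// for a single command-line argument.
function inspectExposure(arg) {
  const match = /^--inspect(?:-brk)?(?:=(.*))?$/.exec(arg);
  if (!match) return 'none';
  const value = match[1];
  // No value means the default bind, which is loopback.
  if (value === undefined || value === '') return 'loopback';
  // A bare port keeps the default host; a bare loopback host is fine too.
  if (/^\d+$/.test(value) || LOOPBACK.has(value)) return 'loopback';
  // Otherwise split host from port at the last colon.
  const sep = value.lastIndexOf(':');
  const host = sep === -1 ? value : value.slice(0, sep);
  return LOOPBACK.has(host) ? 'loopback' : 'exposed';
}

// Check both the node flags and NODE_OPTIONS of the current process.
const launchArgs = [
  ...process.execArgv,
  ...(process.env.NODE_OPTIONS || '').split(/\s+/).filter(Boolean),
];
const exposed = launchArgs.filter((arg) => inspectExposure(arg) === 'exposed');
if (exposed.length > 0) {
  console.error(`inspector bound beyond loopback: ${exposed.join(' ')}`);
}
```

Whether to log, alert, or refuse to start when the check finds an exposed bind is a policy choice left to the caller.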

Does Clinic.js replace the need for a heap snapshot?

No. Clinic.js Doctor classifies whether you have a heap, event-loop, or CPU problem from sampled metrics; it does not show individual objects or retainer chains. Once Doctor confirms a heap trend, you still need a .heapsnapshot from heapdump or v8.writeHeapSnapshot() to attribute the growth to specific constructors and retainer chains. The HeapProfiler subcommand does not produce a .heapsnapshot; its sampling profile attributes growth to allocation sites instead.

Which of the three has the lowest overhead for continuous production monitoring?

None of them is safe to run continuously at full capability. For always-on monitoring, sample process.memoryUsage() on an interval instead — it costs microseconds. Reserve heapdump, --inspect, and Clinic.js Doctor for on-demand, time-boxed investigation once that lightweight sampling flags an anomaly.
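
A minimal sketch of that interval sampling, assuming a 10-second interval and a caller-supplied onSample callback (both hypothetical, as are the function names):

```javascript
// Hypothetical helper: one cheap reading of the process memory counters.
function takeSample() {
  const { heapUsed, heapTotal, rss } = process.memoryUsage();
  return { at: Date.now(), heapUsed, heapTotal, rss };
}

// Hypothetical helper: calls onSample on an interval and returns a
// function that stops the sampler.
function startSampler({ intervalMs = 10000, onSample }) {
  const timer = setInterval(() => onSample(takeSample()), intervalMs);
  timer.unref(); // monitoring alone must not keep the process alive
  return () => clearInterval(timer);
}

const stop = startSampler({
  onSample: (sample) => console.error(JSON.stringify(sample)),
});
// Later, for example during shutdown: stop();
```

Writing samples to stderr is only a placeholder; in practice the callback would forward them to whatever metrics pipeline already exists.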