Fixing Memory Leaks in Next.js Server Rendering
If your Next.js server’s resident set climbs steadily under sustained traffic and only a restart brings it back down, you are almost certainly looking at SSR heap exhaustion caused by state that outlives the request it was created for, a pattern covered more broadly in Node.js Server-Side Memory Management.
| Symptom | Root Cause | Immediate Action |
|---|---|---|
| RSS climbs per request, never drops | Module-scope Map/array grows | Grep for top-level new Map() |
| heapUsed grows on flat traffic | Data Cache keys never expire | Add revalidate to fetch calls |
| Leak starts after N unique users | Per-user singleton at module scope | Move client into request scope |
| OOM under concurrent SSR requests | Full HTML buffered in old space | Stream with renderToPipeableStream |
Root Cause
Next.js in a self-hosted node server.js deployment (and in many serverless “warm container” configurations) reuses the same Node.js process, and therefore the same module registry, across many HTTP requests. Anything declared at module scope — a top-level const cache = new Map(), a database client built outside a handler, a let requestLog = [] used for debugging — is created exactly once when the module is first require’d or imported, and then lives for the entire lifetime of the process. Every request that touches it adds to the same structure. Nothing about a browser tab’s per-navigation cleanup applies here: there is no unmount event, no route change, no automatic release. This is the same class of problem as SSR heap exhaustion and per-request memory in general, but Next.js specifically layers two extra long-lived structures on top of the developer’s own module scope: the Data Cache (the deduplicated fetch() result store used by React Server Components) and the Full Route Cache (rendered RSC payloads kept for static and ISR routes).
Both caches are intentional and bounded by design — entries are meant to be evicted by a revalidate window, an explicit revalidateTag()/revalidatePath() call, or process memory pressure. The leak appears when a route opts into fetch(url, { cache: 'force-cache' }) or unstable_cache() with a dynamic cache key (a user ID, a search query, a timestamp) and no revalidate value. Every unique key becomes a permanent entry; the cache key space grows without bound instead of the intended small, stable set of keys. Combine that with a module-scope singleton — a rate limiter Map keyed by IP, an in-memory session store, an array used to buffer per-request analytics events before a batch flush that never triggers — and old-space growth becomes monotonic. Once you tune --max-old-space-size upward to buy time, you have only delayed the same crash, exactly as with any structural leak covered in Node.js memory limits and out-of-heap errors.
The diagram below shows why this differs from a normal request lifecycle: request-scoped objects are reclaimed by the next minor GC once the response is sent, but anything reachable from the module registry survives every collection because it is rooted outside the request.
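The same distinction can be reproduced in a few lines of plain Node.js. The sketch below is illustrative only: `handleRequest` and `requestLog` are hypothetical names rather than Next.js APIs, and the loop stands in for real HTTP traffic.

```javascript
// Hypothetical illustration; not Next.js code.
// Module scope: created once per process, shared by every request.
const requestLog = [];

function handleRequest(url) {
  // Request scope: on its own, this object would become unreachable
  // when the function returns, so the GC could reclaim it.
  const perRequest = { url, startedAt: Date.now() };

  // Leak: each call appends to a structure rooted in the module,
  // which keeps every perRequest object alive.
  requestLog.push(perRequest);

  return `rendered ${url}`;
}

// Simulate 1,000 requests against the same "server" process.
for (let i = 0; i < 1000; i++) {
  handleRequest(`/products/${i}`);
}

console.log(requestLog.length); // 1000: one retained object per request
```

Removing the `requestLog.push(...)` line is enough to make every `perRequest` object collectable again, which is the property the fix in the steps below aims to restore.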
Step-by-Step Fix
Step 1 — Reproduce the growth under repeated load
Start the server with a deliberately low ceiling so the leak surfaces quickly, then hammer one route.
# Lower the ceiling so a leak surfaces in minutes.
node --max-old-space-size=512 server.js &
SERVER_PID=$!
# Hit a single SSR route 20,000 times at moderate concurrency.
npx autocannon -c 20 -a 20000 \
http://localhost:3000/products/42
Expected output / verification: sample RSS every 5 seconds during the run, either in-process via process.memoryUsage().rss or from outside with ps -o rss= -p "$SERVER_PID" (reported in kilobytes). A healthy server plateaus after a warm-up window; a leaking server shows RSS climbing near-linearly with request count and never levelling off.
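One way to collect those samples in-process is a small interval logger, sketched below. `formatSample` and `startMemorySampler` are hypothetical helper names; the only Node.js APIs assumed are `process.memoryUsage()`, `setInterval`, and `Timeout.unref()`.

```javascript
// Hypothetical helper: log RSS and heapUsed in MiB at a fixed interval.
const MIB = 1024 * 1024;

function formatSample(usage) {
  const rss = (usage.rss / MIB).toFixed(1);
  const heapUsed = (usage.heapUsed / MIB).toFixed(1);
  return `rss=${rss}MiB heapUsed=${heapUsed}MiB`;
}

function startMemorySampler(intervalMs = 5000, log = console.log) {
  const timer = setInterval(() => {
    const stamp = new Date().toISOString();
    log(`${stamp} ${formatSample(process.memoryUsage())}`);
  }, intervalMs);
  // unref() so the sampler alone never keeps the process alive.
  timer.unref();
  // Return a stop function for tests and graceful shutdown.
  return () => clearInterval(timer);
}
```

Calling `startMemorySampler()` once from the server entrypoint (for this diagnostic run only) would produce one log line every 5 seconds that can be plotted against request count.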
Step 2 — Diff heap snapshots taken before and after load
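# Enable on-demand snapshots without attaching a debugger.
node --heapsnapshot-signal=SIGUSR2 server.js &
SERVER_PID=$!
sleep 10 # let startup finish; an early SIGUSR2 may terminate the process
kill -USR2 "$SERVER_PID" # baseline, before load
# ...run the autocannon load from step 1...
kill -USR2 "$SERVER_PID" # peak, after load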
Expected output / verification: two .heapsnapshot files appear in the working directory. Open DevTools → Memory → Load for both, select the later snapshot, switch to Comparison view against the baseline, and sort by Size Delta. A module-scope leak typically shows the objects held by the collection (entries, strings, closures) with a # Delta that tracks the request count closely; select one and follow the Retainers pane back to the owning Map or Array.
Step 3 — Locate module-scope retainers in the code
# Search for collections declared at module top level, outside
# any function/handler — the classic leak signature.
grep -rn "^const.*=\s*new Map()" ./app ./lib
grep -rn "unstable_cache(" ./app
Expected output / verification: review every match. A Map declared at module scope with no corresponding .delete() call or size cap anywhere in the file is the likely retainer; confirm it by checking that the Retainers pane for the growing objects in step 2's snapshots leads back to that variable.
Step 4 — Bound or scope the offending state
Replace the unbounded structure with a size-capped LRU cache (a sketch appears under Command & Code Reference below), or move the data inside the request handler so it becomes request-scoped and GC-eligible after the response is sent.
Expected output / verification: re-run the heap snapshot diff from step 2. The same constructor’s retained size delta should now stay flat regardless of request count, converging to maxEntries × averageEntrySize.
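For the second option, moving state into request scope, one possible approach in plain Node.js is `AsyncLocalStorage` from `node:async_hooks`. The sketch below is an assumption about how this could be wired up, not the way Next.js does it internally; `withRequestScope`, `getCached`, and the `req.userId` field are hypothetical.

```javascript
const { AsyncLocalStorage } = require('node:async_hooks');

// A single AsyncLocalStorage instance at module scope is fine: it holds
// no per-request data itself, only access to whichever store is active.
const requestContext = new AsyncLocalStorage();

// Hypothetical handler wrapper: each request gets a fresh store.
function withRequestScope(handler) {
  return (req) =>
    requestContext.run({ cache: new Map() }, () => handler(req));
}

// Hypothetical per-request memoisation, replacing a module-scope Map.
function getCached(key, compute) {
  const store = requestContext.getStore();
  if (!store) return compute(); // called outside a request: no caching
  if (!store.cache.has(key)) store.cache.set(key, compute());
  return store.cache.get(key);
}

// Usage: the second lookup in the same request reuses the first result.
let computeCalls = 0;
const handler = withRequestScope((req) => {
  const load = () => {
    computeCalls += 1;
    return `user:${req.userId}`;
  };
  getCached(req.userId, load);
  return getCached(req.userId, load);
});

console.log(handler({ userId: 'u1' })); // user:u1
console.log(computeCalls); // 1
```

Because the Map lives in the store passed to `run()`, it becomes unreachable when the request finishes. In App Router code, React's `cache()` function is likely a more natural fit for per-request memoisation in Server Components; the sketch above only shows the underlying idea.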
Step 5 — Verify RSS plateaus under sustained load
Re-run the autocannon load test from step 1, replacing -a 20000 with -d 300 so that it runs for at least 5 minutes.
Expected output / verification: process.memoryUsage().rss should rise during warm-up (JIT compilation, initial cache population) and then hold within a ±10% band for the remainder of the run — no further linear growth.
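The ±10% band can be checked mechanically rather than by eye. The sketch below assumes RSS samples have already been collected at a fixed interval; `isPlateau`, the sample arrays, and the choice to measure the band around the post-warm-up mean are illustrative assumptions.

```javascript
// Hypothetical check: do all post-warm-up samples stay within a
// ±tolerance band around their own mean?
function isPlateau(rssSamples, warmupCount, tolerance = 0.10) {
  const steady = rssSamples.slice(warmupCount);
  if (steady.length === 0) {
    throw new Error('no samples left after warm-up');
  }
  const mean = steady.reduce((sum, v) => sum + v, 0) / steady.length;
  return steady.every((v) => Math.abs(v - mean) <= tolerance * mean);
}

// RSS in MiB; the first two samples are treated as warm-up.
const healthy = [180, 260, 300, 310, 305, 298, 307];
const leaking = [180, 260, 300, 360, 420, 480, 540];

console.log(isPlateau(healthy, 2)); // true
console.log(isPlateau(leaking, 2)); // false
```

A band around the mean is a deliberately simple criterion. A slow leak over a short run can still fit inside it, so longer runs give a more trustworthy result.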
Command & Code Reference
Bounding a module-scope cache with an LRU eviction policy instead of an unbounded Map, so entries are capped regardless of unique key volume.
// lib/response-cache.js: module scope, but now size-bounded.
const MAX_ENTRIES = 500; // cap keeps memory predictable
class BoundedCache {
  constructor(max) {
    this.max = max;
    this.map = new Map(); // insertion order == LRU order
  }
  get(key) {
    if (!this.map.has(key)) return undefined;
    const val = this.map.get(key);
    this.map.delete(key); // move to most-recent position
    this.map.set(key, val);
    return val;
  }
  set(key, val) {
    if (this.map.has(key)) {
      this.map.delete(key);
    } else if (this.map.size >= this.max) {
      // evict the oldest (first) entry once the cap is hit
      this.map.delete(this.map.keys().next().value);
    }
    this.map.set(key, val);
  }
}
// One instance per process, but bounded, so it is safe at module scope.
module.exports = new BoundedCache(MAX_ENTRIES);
Fixing an unbounded fetch Data Cache key space by pinning a revalidate window instead of force-cache with a dynamic key.
// app/products/[id]/page.js (React Server Component)
// ProductView is assumed to be imported from your own components.
// Note: in Next.js 15 and later, params is a Promise; read it with
// `const { id } = await params` before building the URL.
export default async function ProductPage({ params }) {
  // BAD: force-cache + dynamic key => one cache entry per
  // product id, forever, with no expiry.
  // const res = await fetch(`https://api/products/${params.id}`,
  //   { cache: 'force-cache' });
  // GOOD: revalidate bounds the entry's lifetime so stale
  // keys are evicted and replaced instead of accumulating.
  const res = await fetch(
    `https://api/products/${params.id}`,
    { next: { revalidate: 60 } } // seconds until refetch
  );
  const product = await res.json();
  return <ProductView product={product} />;
}
Triggering an on-demand production heap snapshot without a debugger attached, wrapped in a process-manager-friendly start command.
# --heapsnapshot-signal writes a snapshot on SIGUSR2, no
# --inspect flag or restart required.
node --heapsnapshot-signal=SIGUSR2 \
--max-old-space-size=2048 \
server.js 2>>logs/server-$(date +%Y%m%d).log &
echo $! > server.pid
# Trigger later from a deploy hook or cron:
kill -USR2 "$(cat server.pid)"
Verification & Regression Prevention
Set explicit numeric targets before closing the fix out: process.memoryUsage().rss should plateau within 15% of its warm-up value across a 30-minute, 20,000-request load test, and any module-scope Map/Array identified during triage must show a bounded retained size in a follow-up heap snapshot — not merely a slower growth rate. For Next.js Data Cache entries, confirm revalidate is set on every dynamic fetch() call by grepping the app directory for force-cache and manually justifying each remaining match.
Add a CI guard so this class of leak cannot silently return. A custom ESLint rule, or a no-restricted-syntax rule targeting top-level VariableDeclarator nodes initialised with new Map(), new Array(), or [] outside any function, flags new module-scope collections for manual review. Pair it with a scheduled load-test job that runs the autocannon script from Step 1 against a staging deployment nightly and fails the pipeline if rss at the end of the run exceeds rss at the 2-minute mark by more than 20%.
In production, alert when RSS growth exceeds 5% per hour sustained over 3 hours, a slope that flat CPU and traffic metrics cannot explain. Feed the alert from whatever process-level exporter your monitoring stack already uses: prom-client's default process metrics for Prometheus, the Datadog Agent, or a custom /healthz/memory route reporting process.memoryUsage() fields directly.
For deeper triage beyond snapshots, pair this workflow with heapdump and Clinic.js tooling to get flame-graph-level attribution of allocation sites.
Frequently Asked Questions
Does restarting the Next.js server fix a memory leak temporarily?
Yes, but only as a stopgap. A restart clears the module registry and every module-scope cache built up in it, so RSS drops back to baseline immediately. The underlying retainer is still in the code, so memory climbs again on the same trajectory. Use restarts (via PM2 or a Kubernetes liveness probe on RSS) only as a safety net while you fix the actual root cause.
Is the Next.js Data Cache itself a memory leak?
Not by design. The Data Cache stores fetch results in memory (or on disk in some deployments) keyed by request signature, and entries are meant to be evicted by a revalidate window or an explicit revalidation call. It becomes a leak only when routes use force-cache with unbounded dynamic keys, or unstable_cache() with no revalidate value, so the key space grows forever without any entry ever expiring.
How do I take a heap snapshot from a production Next.js server without stopping it?
Start the process with node --heapsnapshot-signal=SIGUSR2 wrapping the Next.js server entrypoint, then send kill -USR2 <pid> to write a .heapsnapshot file to disk on demand. This avoids attaching a debugger to a live production process. Expect a multi-second event-loop pause while the snapshot serialises, so trigger it during low-traffic windows or on a canary instance.
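Where sending a signal is impractical, a snapshot can also be written from inside the process with `v8.writeHeapSnapshot()` from Node's built-in `node:v8` module. The sketch below is a hedged alternative, not part of the workflow above; `snapshotPath` and `takeSnapshot` are hypothetical helpers, and the file naming scheme is an assumption.

```javascript
const v8 = require('node:v8');
const path = require('node:path');

// Hypothetical naming scheme:
// <dir>/heap-<pid>-<ISO timestamp>.heapsnapshot
function snapshotPath(dir, pid, date) {
  // Replace ':' and '.' so the name is safe on every filesystem.
  const stamp = date.toISOString().replace(/[:.]/g, '-');
  return path.join(dir, `heap-${pid}-${stamp}.heapsnapshot`);
}

// Writes a snapshot synchronously and returns the file name.
// This blocks the event loop while it runs, just like the
// signal-based approach.
function takeSnapshot(dir = process.cwd()) {
  const file = snapshotPath(dir, process.pid, new Date());
  return v8.writeHeapSnapshot(file);
}
```

If `takeSnapshot()` is exposed through an HTTP route, that route should be restricted to operators, since a snapshot contains everything on the heap, including secrets and user data.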
Related
- SSR Heap Exhaustion and Per-Request Memory — parent guide covering the broader per-request retention pattern
- Caching vs Memory Bloat in SSR Data Layers — deeper treatment of bounding server-side caches
- Diagnosing Node.js Memory with heapdump & Clinic.js — tooling for attributing retained memory to specific allocation sites
- Node.js Memory Limits and Out-of-Heap Errors — main section on heap ceilings and OOM diagnosis
- Node.js Server-Side Memory Management — main section for server-side memory patterns