Skip to content

fix: memoize channel(name) to make Channel identity stable per name#27

Merged
bm1549 merged 3 commits into
mainfrom
brian.marks/memoize-channel-by-name
May 8, 2026
Merged

fix: memoize channel(name) to make Channel identity stable per name#27
bm1549 merged 3 commits into
mainfrom
brian.marks/memoize-channel-by-name

Conversation

@bm1549
Copy link
Copy Markdown
Contributor

@bm1549 bm1549 commented May 8, 2026

What does this PR do?

Memoizes dc.channel(name) by name in patch-garbage-collection-bug.js, so subsequent calls for the same name always return the same Channel object regardless of what the underlying `channel()` returns. Also adds a regression test that pins the contract.

 module.exports = function(unpatched) {
   const dc_channel = unpatched.channel;
   const channels = new WeakSet();
+  const byName = new Map();

   const dc = { ...unpatched };

   dc.channel = function() {
+    const name = arguments[0];
+    if (byName.has(name)) return byName.get(name);
     const ch = dc_channel.apply(this, arguments);
+    byName.set(name, ch);

     if (channels.has(ch)) return ch;

Motivation

The PHONY_SUBSCRIBE anti-GC patch keeps a created Channel alive, but Node's underlying channel(name) can still return a brand-new Channel object for the same name on a subsequent call (observed on Node 18.20.8 under heavy workloads — e.g. proxyquire-driven reload of a tracer + plugin manager teardown/recreate cycles). The existing tracking — `channels = new WeakSet()` keyed by Channel-object identity — can't unify those: each new object isn't in the WeakSet, so dc-polyfill treats it as new, attaches its own phony, and returns it.

The visible failure mode for callers (a common instrumentation pattern):

// instrumentation module loaded once
const ch = dc.channel('foo')
shimmer.wrap(target, 'method', () => function () {
  if (!ch.hasSubscribers) return original.apply(this, arguments) // fast path
  ch.publish({ ... })
  return original.apply(this, arguments)
})

A separate consumer doing `dc.channel('foo').subscribe(...)` later may get a different Channel instance, attach its handler there, and never see publishes from the wrap — because the wrap's closure holds the earlier Channel. With only the phony left on the publish-side Channel, hasSubscribers returns false and the fast-path bypass kicks in. Publishes go nowhere.

Why this fix lives here, not upstream in Node

The underlying bug is in Node's diagnostics_channel registry — the right architectural home for the fix is Node itself. It is not realistic to land the fix there, however:

  • The bug only manifests on Node 18. dc-polyfill's own hasFullSupport() check treats Node 20.6+ as fully supported (so this whole patch file is inert there). Empirically I see the same: dd-trace-js's reproducer triggers on Node 18.20.8 and not on Node 23.3.0 against the same code.
  • The upstream fix that landed in Node 20.6 is referenced in this file's own comment: nodejs/node#47520. Whatever wider registry-stability fix(es) shipped alongside that resolved the symptom in 20.6+. There's no upstream gap to file a new PR against.
  • Node 18 reached EOL on 2025-04-30 — over a year ago. Backports to EOL'd lines effectively only happen for security CVEs with active maintainer interest. "Diagnostic-channel returns inconsistent objects under specific workloads" wouldn't clear that bar, especially when a polyfill-layer workaround already exists.
  • This whole patch file (`patch-garbage-collection-bug.js`) is gated by `hasGarbageCollectionBug()` and runs only on Node versions that have the bug. As consumers drop Node 18 support, this code path becomes inert and can eventually be deleted alongside the version checks. Patching at the polyfill layer is exactly what the polyfill is for.

Verification

  • All existing dc-polyfill tests pass with the patch on Node 18.20.8 (`./test.sh`: 26/26 suites pre-existing, 27/27 with the new regression test).
  • Lint clean (`npm run lint`).
  • New regression test (test/garbage-collection-patch.spec.js) injects a mock `unpatched` whose `channel(name)` returns a different Channel each call — simulates the Node 18 misbehavior at the unit level. Fails 6/8 assertions on `main`, passes 8/8 with this patch. Covers identity stability, per-name memoization, and the subscribe/publish-across-calls case.
  • Empirically verified end-to-end against dd-trace-js's ws plugin stress workflow (10 runners × 25 iterations on Node 18.20.8 — 250 runs total): 0/10 runners pass without this patch, 10/10 pass with it. Context PR: DataDog/dd-trace-js#8297.

Additional Notes

  • The map is bounded by the number of unique channel names a process uses. In practice this is a small fixed set per process (per-instrumentation), so no meaningful retention concern.
  • Doesn't conflict with the cross-version process-symbol registry in acquire-channel-registry.js — that registry sits at a different layer (shared tracingChannel wrappers across dc-polyfill versions); this patch makes the underlying per-version channel() results stable.
  • Strengthens the contract callers already expect: `channel(name)` returns the channel for name. Doesn't weaken anything.

🤖 Generated with Claude Code

bm1549 added 2 commits May 8, 2026 09:54
The phony AVOID_GARBAGE_COLLECTION subscriber installed by this patch
keeps a created Channel alive, but Node's underlying channel(name) can
still return a brand-new Channel object for the same name on a
subsequent call (observed on Node 18 under certain workloads — e.g.
heavy proxyquire reload + plugin teardown/recreate cycles). Because the
existing anti-GC tracking is a WeakSet keyed by Channel-object
identity, each new object is treated as new and gets its own phony
subscriber. The two objects are never unified.

The visible failure mode for callers: code that captures a Channel in
a module-level closure (a common pattern for instrumentation
publishers) ends up publishing to a different Channel than later
subscribers attach to. Publishes go nowhere; subscribers see nothing.
hasSubscribers reports false (only the phony is on the publish-side
Channel) and traceSync/traceCallback fast-path-bypass the publish.

This fix memoizes dc.channel(name) by name and short-circuits before
delegating to channel() on subsequent calls. Channel identity is now
stable per name regardless of what the underlying channel() returns
— the contract callers actually depend on.

Empirically verified end-to-end against dd-trace-js's ws plugin
stress workflow (10 runners × 25 iterations of plugin tests on
Node 18.20.8, 250 runs total): 0/10 runners pass without this patch,
10/10 pass with it. dd-trace-js PR for context: DataDog/dd-trace-js#8297.

All existing dc-polyfill tests pass (26/26 suites).
Adds a unit test that injects a mock unpatched module whose channel(name)
returns a different Channel object on every call, simulating the Node 18
bug. Without the memoization in patch-garbage-collection-bug.js, this
test fails 6/8 assertions (channel identity diverges, subscribers attach
to a different Channel than later publishers, the underlying channel()
is re-invoked on every memoizable lookup). With the patch, all 8 pass.

This pins the contract the patch is meant to enforce — dc.channel(name)
returns a stable Channel object per name regardless of what the
underlying registry hands back — without requiring a real Node 18 bug
to be triggered to exercise the patch.
Comment thread patch-garbage-collection-bug.js Outdated
@bm1549 bm1549 marked this pull request as ready for review May 8, 2026 14:30
@bm1549 bm1549 force-pushed the brian.marks/memoize-channel-by-name branch from 6f0a888 to deac894 Compare May 8, 2026 15:24
Copy link
Copy Markdown
Member

@rochdev rochdev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The global subscribe was introduced to fix this. Let's use that and avoid patching a deprecated API. For versions that don't have it we should still do since it should also be in the polyfill. It's also possible to work around the issue in the integration itself by keeping a reference to channels instead of using channel() again every time (which is how this API was originally designed to be used in the first place).

module.exports = function(unpatched) {
const dc_channel = unpatched.channel;
const channels = new WeakSet();
const byName = new Map();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will grow forever but I think that's acceptable as channels should not be high cardinality and are pretty lightweight.

@bm1549 bm1549 merged commit 4edfca2 into main May 8, 2026
85 checks passed
@bm1549 bm1549 deleted the brian.marks/memoize-channel-by-name branch May 8, 2026 17:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants