perf(propagation): tighten tracestate, baggage, and tag inject paths #8234
Draft
W3C tracestate parse, baggage extract, and `_dd.p.*` tag inject all
run on every traced HTTP request. Several sub-allocations and one
quadratic parse loop are dropped.
`tracestate.js#fromString` walked `value.matchAll(regex)` and
inserted each match at the front of the result with `Array#unshift`,
which is `O(n)` per call. `push` plus a single `reverse()` gives the
same final order in `O(n)`; the `Map` then iterates oldest-first so
`toString`'s prepend builds the W3C-spec newest-first wire form.
`forVendor` reuses the `state.toString()` value computed one line
above instead of recomputing it.
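
A minimal sketch of the two insert strategies, for reviewers who want to see the ordering argument in isolation. This is not the `tracestate.js` code: the `MEMBER` regex is a simplified stand-in and the names `parseWithUnshift` / `parseWithPush` are hypothetical.

```javascript
// Simplified list-member pattern (assumption; not the real tracestate regex).
const MEMBER = /([^,=\s]+)=([^,]*)/g

// Old shape: every unshift moves all existing elements, so n inserts cost O(n^2).
function parseWithUnshift (value) {
  const entries = []
  for (const match of value.matchAll(MEMBER)) {
    entries.unshift([match[1], match[2]])
  }
  return new Map(entries)
}

// New shape: append each match, then reverse once, for O(n) total.
function parseWithPush (value) {
  const entries = []
  for (const match of value.matchAll(MEMBER)) {
    entries.push([match[1], match[2]])
  }
  entries.reverse()
  return new Map(entries)
}

// The wire form is newest-first, so the resulting Map iterates oldest-first.
const header = 'dd=s:1;o:rum,vendor1=a,vendor2=b'
const oldOrder = [...parseWithUnshift(header).keys()]
const newOrder = [...parseWithPush(header).keys()]
console.log(oldOrder, newOrder)
```

Both functions yield the same `Map` iteration order; only the cost of building the intermediate array differs.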
`text_map.js` swaps three `replaceAll(/[\xNN]/g, ...)` regex
literals matching a single character for `replaceAll('=', '~')` /
`replaceAll('~', '=')`, which skip the regex match path. The
`m`-flag on the extract regex was always dead (no anchors in the
pattern). `_injectTraceparent` walks `trace.tags` via
`Object.keys(...)` instead of the banned `for-in`; the trace-tags
object's prototype chain isn't ours to trust, and `for-in`
enumerates inherited keys. `_injectBaggageItems` swaps
`Object.entries` for `Object.keys` + indexed read, dropping the
per-baggage-item `[k, v]` tuple. `_extractBaggageItems` caches the
`baggageTagKeys` `Set` on the propagator (rebuilt only when the
config array reference changes, e.g. remote-config rotation) and
gates `decodeURIComponent` behind `value.includes('%')`. A
microbenchmark measures the gated path at 13.4x faster than running
`decodeURIComponent` on plain ASCII baggage and only 2% slower than
the raw call on percent-encoded values.
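
The two `_extractBaggageItems` changes can be sketched roughly as follows. The helper `decodeBaggageValue` and the class `TagKeyCache` are hypothetical names for illustration; the propagator in this PR may structure the cache differently. The gate is behavior-preserving because `decodeURIComponent` only rewrites `%`-escapes, so a string without `%` comes back unchanged.

```javascript
// Hypothetical helper: skip the decoder entirely when there is nothing to decode.
function decodeBaggageValue (value) {
  return value.includes('%') ? decodeURIComponent(value) : value
}

// Hypothetical cache: rebuild the Set only when the config array reference changes.
class TagKeyCache {
  #source
  #keys = new Set()

  get (configKeys) {
    if (configKeys !== this.#source) {
      this.#source = configKeys
      this.#keys = new Set(configKeys)
    }
    return this.#keys
  }
}

const cache = new TagKeyCache()
const configA = ['user.id', 'session.id']
const first = cache.get(configA)
const second = cache.get(configA) // same reference, so no rebuild
const third = cache.get(['user.id']) // new array reference, so rebuilt
console.log(first === second, first === third)
```

Keying on reference identity rather than array contents keeps the check to a single comparison per extract, at the cost of one extra rebuild if the config is replaced with an equal array.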
DSM observes every Kafka, SQS, SNS, Kinesis, Pub/Sub, and AMQP
message when enabled, so the per-checkpoint hot path compounds.
Several allocations are removed without changing the wire format:
1. `getSizeOrZero` stopped allocating a fresh Buffer copy of every
string just to read its UTF-8 byte length. `Buffer.byteLength`
returns the same value with no allocation. `getHeadersSize`'s
`Object.entries(...).reduce(...)` becomes a `for (const key of
Object.keys(headers))` loop, dropping the per-header `[k, v]`
tuple and the reducer closure.
2. `pathway.js#shaHash` extracted the first 8 bytes of SHA-256 by
round-tripping through a 64-char hex string + a 16-char slice +
a hex-decoded Buffer. `digest().subarray(0, 8)` produces the same
bytes directly. `computeHash` now also caches
`hashableEdgeTags.join('')` and `propagationHashBigInt.toString(16)`
once per call (each was computed twice), gates the
`manual_checkpoint:true` filter on `includes(...)` so the common
path skips the alloc, and reuses a module-scope 20-byte scratch
buffer to assemble `encodePathwayContext` with a single
`Buffer.from(subarray)` copy-out instead of seven nested allocs.
3. `setCheckpoint` precomputes `PATHWAY_HEADER_BYTES` from the static
header overhead instead of allocating a temp object, encoding
it, and JSON-stringifying just to read its length. It now reads
the direction from `edgeTags[0]` directly: every in-tree caller
places it there, the `DataStreamsCheckpointer` shape is updated
to match, and the test fixture pinning that arg order is updated
in the same commit.
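
The equivalences behind items 1 and 2 can be checked with a short standalone sketch. It uses only Node.js built-ins; the function names `first8ViaHex` and `first8ViaSubarray` are hypothetical and are not the names used in `pathway.js`.

```javascript
const { createHash } = require('node:crypto')

const text = 'h\u00e9llo w\u00f6rld'
// Allocating form: copies the string into a Buffer just to read its length.
const viaCopy = Buffer.from(text, 'utf8').length
// Non-allocating form: returns the same UTF-8 byte count.
const viaByteLength = Buffer.byteLength(text, 'utf8')

// Old shape: hex string, then a 16-char slice, then a hex-decoded Buffer.
function first8ViaHex (input) {
  const hex = createHash('sha256').update(input).digest('hex')
  return Buffer.from(hex.slice(0, 16), 'hex')
}

// New shape: a view over the first 8 bytes of the raw digest.
function first8ViaSubarray (input) {
  return createHash('sha256').update(input).digest().subarray(0, 8)
}

console.log(viaCopy === viaByteLength)
console.log(first8ViaHex('service:env').equals(first8ViaSubarray('service:env')))
```

`subarray` returns a view that shares memory with the digest buffer, which is safe here because the digest is freshly allocated per call and not reused.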
Drive-by fixes:
* `recordCheckpoint` reuses the `BigInt` already computed by the
`StatsPoint` returned from `forCheckpoint(...)` instead of running
`readBigUInt64LE` a second time. `setCheckpoint` returns
`undefined` (rather than `null`) on the disabled fast path so
the function shape matches the rest of the file.
* `processor.js` drops the `DsmPathwayCodec` import that the
inlined byte-count made unreachable; `pathway.js` exports
`CONTEXT_PROPAGATION_KEY_BASE64` so the constant calculation is
anchored to the actual header key.
* `encoding.js` adds an `encodeVarintInto(target, offset, value)`
helper so the pathway encoder can write directly into the scratch
buffer instead of allocating a per-varint `Uint8Array` and
copying.
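
The write-into-scratch pattern from the last bullet might look roughly like this. It is a sketch under stated assumptions, not the helper added to `encoding.js`: `writeUvarintInto` is a hypothetical name, it assumes plain unsigned LEB128 for non-negative safe integers, and it assumes the helper returns the next free offset. The real `encodeVarintInto` may encode or return differently.

```javascript
// Hypothetical sketch: write `value` as unsigned LEB128 into `target` at `offset`
// and return the offset just past the last byte written.
function writeUvarintInto (target, offset, value) {
  while (value >= 0x80) {
    target[offset++] = (value % 0x80) | 0x80 // low 7 bits plus continuation bit
    value = Math.floor(value / 0x80)
  }
  target[offset++] = value
  return offset
}

// Module-scope scratch buffer, reused across calls (20 bytes, as in the description).
const scratch = Buffer.alloc(20)

let end = writeUvarintInto(scratch, 0, 300)
end = writeUvarintInto(scratch, end, 1)

// Single copy-out: Buffer.from(buffer) copies, so the result does not alias scratch.
const encoded = Buffer.from(scratch.subarray(0, end))
console.log(encoded.toString('hex'))
```

Because the scratch buffer is shared, the copy-out has to happen before anything else can write into it; that is the reason for the one remaining allocation.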
Benchmarks
Benchmark execution time: 2026-05-01 23:11:27
Comparing candidate commit 00b73c3 in PR branch
Found 109 performance improvements and 0 performance regressions! Performance is the same for 1638 metrics, 97 unstable metrics.
scenario:datastreams-consume-18
scenario:datastreams-consume-20
scenario:datastreams-consume-22
scenario:datastreams-consume-24
scenario:datastreams-produce-18
scenario:datastreams-produce-20
scenario:datastreams-produce-22
scenario:datastreams-produce-24
scenario:datastreams-produce-high-cardinality-18
scenario:datastreams-produce-high-cardinality-20
scenario:datastreams-produce-high-cardinality-22
scenario:datastreams-produce-high-cardinality-24
scenario:datastreams-produce-manual-checkpoint-18
scenario:datastreams-produce-manual-checkpoint-20
scenario:datastreams-produce-manual-checkpoint-22
scenario:datastreams-produce-manual-checkpoint-24
scenario:datastreams-produce-with-message-size-18
scenario:datastreams-produce-with-message-size-20
scenario:datastreams-produce-with-message-size-22
scenario:datastreams-produce-with-message-size-24
scenario:propagation-extract-18
scenario:propagation-extract-20
scenario:propagation-extract-22
scenario:propagation-extract-24
scenario:propagation-extract-baggage-ascii-18
scenario:propagation-extract-baggage-ascii-20
scenario:propagation-extract-baggage-ascii-22
scenario:propagation-extract-baggage-ascii-24
scenario:propagation-extract-inject-18
scenario:propagation-extract-inject-20
scenario:propagation-extract-inject-22
scenario:propagation-extract-inject-24
scenario:propagation-inject-18
scenario:propagation-inject-20
scenario:propagation-inject-22
scenario:propagation-inject-24
Overall package size
Self size: 5.68 MB
Dependency sizes

| name | version | self size | total size |
|------|---------|-----------|------------|
| import-in-the-middle | 3.0.1 | 82.56 kB | 817.39 kB |
| dc-polyfill | 0.1.10 | 26.73 kB | 26.73 kB |

🤖 This report was automatically generated by heaviest-objects-in-the-universe