v5.111.0 proposal #9150

Merged
rochdev merged 50 commits into v5.x from v5.111.0-proposal
Jul 2, 2026

dd-octo-sts Bot commented Jun 30, 2026

Features

  • AppSec: In App WAF support for lambda #7783
  • General: SVLS-9168 add aws.durable.operation_attempt tag to durable operation spans #8595
  • OpenTelemetry: OTLP trace metrics support #8206
  • release: Add breaking changes to release proposal #9196
  • Test Optimization: Add vitest no-worker init mode #9173

Fixes

  • AppSec: Scope the mongodb nosql-analysis marker per query #9076
  • aws-durable-execution-sdk-js: Treat FAILED checkpoints as replays #9160
  • datastreams: Flush on write when flushInterval is 0 #9120
  • General: Wrap lazily defined fs.opendir on Node 20 #9094
  • Test Optimization: Handle missing beforeEach task result #9129
  • Test Optimization: Report typecheck tests #9176
  • Test Optimization: Route thread workers through main process #9169

Performance

  • General: Load the GCP pubsub push plugin lazily #9178
  • loader: Match instrumented modules with a shouldInclude predicate #9026

Documentation

  • agents: Clarify internal import ordering rule #9172
  • v6: Updated documentation for v6 #9159

Internal (CI, Testing, Benchmarking)

  • child_process: Fix Bluebird flake that cascades to every later spec #9078
  • child_process: Load the mock agent once per suite #9113
  • coverage: Merge per-integration coverage in All Green before upload #9086
  • Dependencies: Bump bullmq #9164
  • Dependencies: Bump openai #9122
  • Dependencies: Bump the vendor-minor-and-patch-dependencies group across 1 directory with 2 updates #9163
  • Dynamic Instrumentation: Cover breakpoint error paths #8996
  • engines: Widen engines.node to >=18 in CI to keep Node 18/20 jobs running #9145
  • General: Derive supported config paths from canonical names #9112
  • General: Drop dead profile assert helper and fix telemetry typo #7686
  • General: Introduce namespace field and start pruning internalPropertyName #8943
  • General: Key plugin version folders by single-digit major #9052
  • General: Widen GC pause p95 bound to deflake #9121
  • graphql: Migrate shimmer to orchestrion instrumentation #7757
  • kafkajs: Drop retries:0 from header-disable producer tests #9106
  • kafkajs: Stop pinning the produce offset in the sendBatch test #9074
  • LLM Observability: Bump tested langchain versions with new cassette #9135
  • LLM Observability: Use DD_LLMOBS_ENABLED in tagger sampling fixtures #9107
  • oracledb: Bound connect and query timeouts in the ESM integration fixture #9105
  • Add runtime family tag on appsec lambda span #9153
  • Test Optimization: Add HTTP cache reader #8860
  • release: Add v6 into release workflows #9103
  • release: Fix branch-diff Infinity crash on scientific notation SHAs #9101
  • release: Fix publish by dropping dist-tag add #9100
  • scripts: Disable V8 Maglev for Windows test children #9131
  • Test Optimization: Handle 5xx retry in getKnownTests error test #9110
  • Test Optimization: Prepare test metadata in main #9171
  • Test Optimization: Split main and worker instrumentation #9170

rochdev and others added 20 commits June 30, 2026 06:26
The OIDC-exchanged token from the npm registry is only valid for the
publish operation; using it for npm dist-tag add produced E401. Remove
the multi-tag logic and the OIDC exchange entirely: each branch now
publishes with a single tag (latest for the current release line,
latest-nodeXX for older lines), which is all npm's trusted publishing
model supports without a stored token.

Co-authored-by: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…ad (#9086)

Each test cell uploaded its own report to Codecov, so a commit sent ~430
uploads. Codecov silently parks uploads past its ~150-per-commit ceiling in
`started` and never merges them, so roughly 40 reports' worth of coverage was
dropped from every commit. The Datadog coverage upload was separately broken:
`upload-coverage-artifact` probed for files with `find -maxdepth 1`, but the
report lives one level deeper at `coverage/node-<version>/`, so the check found
nothing, no `coverage-*` artifact was produced, and `datadog-ci coverage
upload` reported nothing while passing green.

All Green already downloads every `coverage-*` artifact to drive the Datadog
upload, so it is the one place that sees a whole commit's coverage. It now
groups the per-cell reports by integration and uploads ~100 groups to both
backends instead of ~430 per-cell reports:

1. `upload-coverage-artifact` recurses for the report files and names each
   artifact `coverage-<flag>__<job>-<index>` so matrix cells that share a flag
   (cypress varies `spec` outside its flag) stop clobbering each other.
2. `scripts/group-coverage.mjs` sorts each cell's report into its integration's
   directory, stripping Node.js and library versions, which are noise for
   "which integration regressed". Reports are not merged locally — both
   backends merge same-flag uploads server-side — so each report passes through
   byte-for-byte and the harness needs no istanbul dependency in All Green's
   sparse checkout. ~430 cells collapse to ~100 groups.
3. Each cell emits both lcov and istanbul JSON: Codecov reads branch and
   function coverage from the JSON (its lcov parser ingests only line hits),
   Datadog reads the lcov and does not ingest the JSON. All Green uploads each
   format to the backend that reads it, one group per integration, flagged with
   the integration name.

`master-coverage` still rides every Codecov upload on PRs targeting master so
the `codecov/patch` gate fires; reruns de-duplicate to the newest run so a
stale rerun's counters are not double-counted.
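
The artifact naming from step 1 and the grouping from step 2 can be sketched roughly as follows. This is a minimal illustration and not the contents of `scripts/group-coverage.mjs`: `parseArtifactName` and `groupByFlag` are hypothetical helpers, the sample artifact names are made up, and the version-stripping step is left out because the commit message does not spell out its rules.

```javascript
// Hypothetical parser for names shaped like coverage-<flag>__<job>-<index>.
function parseArtifactName (name) {
  const match = /^coverage-(.+)__(.+)-(\d+)$/.exec(name)
  if (!match) return undefined
  const [, flag, job, index] = match
  return { flag, job, index: Number(index) }
}

// Collects artifact names under their flag so each flag is uploaded once.
function groupByFlag (names) {
  const groups = new Map()
  for (const name of names) {
    const parsed = parseArtifactName(name)
    if (!parsed) continue // ignore anything that is not a coverage artifact
    const group = groups.get(parsed.flag) ?? []
    group.push(name)
    groups.set(parsed.flag, group)
  }
  return groups
}

// Made-up artifact names: two cells share a flag, one has its own.
const groups = groupByFlag([
  'coverage-plugins-cypress__cypress-0',
  'coverage-plugins-cypress__cypress-1',
  'coverage-appsec__appsec-0'
])
console.log(groups.get('plugins-cypress').length) // 2
```

Because the job and index are part of the artifact name, the two cypress cells keep distinct names while still landing in the same group.
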
…9074)

The "should emit one kafka.produce span per topicMessages entry" test
hard-coded kafka.messages.offsets to start_offset "0". Kafka produce is
at-least-once: a transient NOT_LEADER_FOR_PARTITION right after topic
creation makes kafkajs retry and advance the broker-assigned base offset
past 0, so the span faithfully reports a non-zero offset (CI observed
"1") and the assertion never matched before the timeout. The expected
offsets are read back from the sendBatch result instead, which still
pins the per-topic isolation the test was written for.

Each topicMessages entry is its own root span, so the two spans are
separate traces the agent may deliver in a single payload in any order;
the span lookup now scans every trace rather than only traces[0].
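
A lookup that scans every trace instead of only `traces[0]` might look like the sketch below. `findSpans` is a hypothetical helper, and the payload shape (an array of traces, each an array of spans with `name` and `meta`) and the tag name are assumptions made for illustration, not the test's actual code.

```javascript
// Hypothetical helper: flattens all traces so delivery order does not matter.
function findSpans (traces, predicate) {
  return traces.flat().filter(predicate)
}

// Made-up payload: two single-span traces delivered in arbitrary order.
const traces = [
  [{ name: 'kafka.produce', meta: { 'kafka.topic': 'topic-b' } }],
  [{ name: 'kafka.produce', meta: { 'kafka.topic': 'topic-a' } }]
]

const spans = findSpans(traces, span => span.name === 'kafka.produce')
const topics = spans.map(span => span.meta['kafka.topic']).sort()
console.log(topics) // [ 'topic-a', 'topic-b' ]
```
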
Key each expanded major by its bare major (`versions/mongodb@3`) instead of a
bounded range (`versions/mongodb@>=3.0.0 <4.0.0`). The bare major reads cleanly
as a folder name and covers each major's latest, including the floor major's,
which the range form dropped. Follows the shared resolver from #9019.

Widening the matrix to every major's latest surfaced several latent failures:

1. A bare-major key resolves to that major's newest version, so a range ending
   inside its top major overshoots: microgateway-core `>=2.1 <=3.0.0` keyed `3`
   installed 3.3.7 and the span came back `web.request` instead of
   `microgateway.request`. The top major keeps the declared range whenever it
   stops short of the major's ceiling; fully-spanned and lower majors stay bare.
2. `versions/ai@4` and `@langchain/core@0` resolve to versions that have no VCR
   cassette and would hit the live API (401). A central `brokenVersions`
   registry drops a matching resolved version and surfaces the reason as a
   pending test, each entry a stop-gap carrying a TODO.
3. A manifest carrying a `workspace:` protocol dependency was copied verbatim
   into a generated workspace, so yarn failed with "Couldn't find any versions
   for X that matches workspace:*". Fall back to the pinned compatible version.
4. The Apollo fetch-failure test gated the error span on `version > '2.3.0'`, a
   lexicographic compare that breaks once the key is bare (`'2' > '2.3.0'` is
   false). Compare the resolved version with `semver.gt`.
5. Single-digit keying renames folders that several specs hard-code by range
   (express, langchain, bedrock runtime, aws-sdk). The bedrock require threw
   after `agent.load` with no `agent.close`, leaving the Remote Config poll
   running and hanging the job to the 45-minute timeout; the others silently
   skipped suites. Point the requires at the renamed folders.
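
The lexicographic pitfall from item 4 can be reproduced in isolation. The commit switches to `semver.gt`; the sketch below uses a hand-rolled, dependency-free comparator (`gtVersion`, a hypothetical name) purely to show the difference, and it only handles plain dotted numeric versions. The resolved version `2.9.1` is a made-up value.

```javascript
// Hypothetical stand-in for semver.gt, limited to dotted numeric versions.
function gtVersion (a, b) {
  const left = a.split('.').map(Number)
  const right = b.split('.').map(Number)
  const length = Math.max(left.length, right.length)
  for (let i = 0; i < length; i++) {
    const l = left[i] ?? 0
    const r = right[i] ?? 0
    if (l !== r) return l > r
  }
  return false
}

const key = '2' // bare-major folder key
const resolved = '2.9.1' // made-up resolved version for that key

console.log(key > '2.3.0') // false: '2' is a prefix of '2.3.0', so it sorts first
console.log('10.0.0' > '2.3.0') // false: strings compare character by character
console.log(gtVersion(resolved, '2.3.0')) // true: segments compare numerically
```
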
…As (#9101)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
…lPropertyName (#8943)

`internalPropertyName` carried a hand-maintained full property path
(`telemetry.debug`, `remoteConfig.enabled`) that diverged from the canonical
env name, so the same configuration was named twice and the two could drift.

A new `namespace` field nests the canonical env name under a property path
(`telemetry.DD_TELEMETRY_DEBUG`), so the runtime path is derived from the
canonical name plus a namespace with no separate alias to maintain. It takes
precedence over `configurationNames` and `internalPropertyName` when resolving
the path, in the eslint sync rule, and in the type generator. Every group of
entries (remoteConfig, telemetry, appsec api-security and sca, profiling,
stats, llmobs, iast security-controls, the per-integration llm span limits)
moves onto it, and their runtime consumers are updated to the renamed keys.

The canonical name telemetry reports is unchanged: the rename only affects how
the property path is derived, not which env name is sent. The namespace object
is always built from the defaults, so the optional chaining and `?? 0`
fallbacks on the api-security accesses guarded a state that cannot occur and
are dropped.
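
As a rough sketch of the derivation described above, the path can be computed from the canonical name and the namespace. `resolvePath` and `setPath` are hypothetical helpers and the entry shape is an assumption; only the `namespace` and `internalPropertyName` fields and the `telemetry.DD_TELEMETRY_DEBUG` example come from the commit message, and the `configurationNames` step is omitted.

```javascript
// Hypothetical resolver: namespace wins, then the legacy alias, then the canonical name.
function resolvePath (canonicalName, entry) {
  if (entry.namespace) return `${entry.namespace}.${canonicalName}`
  return entry.internalPropertyName ?? canonicalName
}

// Writes a value at a dotted path, creating intermediate objects as needed.
function setPath (target, path, value) {
  const keys = path.split('.')
  const last = keys.pop()
  let node = target
  for (const key of keys) {
    node = node[key] ??= {}
  }
  node[last] = value
  return target
}

const config = {}
setPath(config, resolvePath('DD_TELEMETRY_DEBUG', { namespace: 'telemetry' }), false)
console.log(config.telemetry.DD_TELEMETRY_DEBUG) // false
```
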

Drive-by fix:

* Exempt integration-test fixture apps from `n/no-extraneous-require`: they
  `require('dd-trace')` as a customer does, so the rule fired once dd-trace
  became locally resolvable (yarn link) but stayed silent on a clean install.
…te (#9026)

import-in-the-middle scanned the include and exclude arrays once per resolved
module — up to ~290 include entries (RegExp.test or string compare) plus a
fileURLToPath on every resolve, nearly all against modules that match nothing.
Supplying iitm 3.2.0's shouldInclude predicate replaces that scan with a single
Set lookup for bare specifiers and one combined RegExp covering every
instrumented node_modules path and the configured security-control subpaths,
plus one RegExp for the exclusions. Over a mixed resolve corpus this drops the
per-resolve matching cost from ~2.5µs to ~25ns (about 100x).

The Set also carries each built-in's node: specifier, mirroring iitm's include
expansion, so `import 'node:crypto'` stays instrumented alongside
`import 'crypto'`. Package names pass through regexpEscape so a metacharacter in
a future package name cannot mis-match.
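
The matching strategy described above (a Set for bare specifiers plus one combined RegExp for `node_modules` paths) could be sketched as follows. This is an illustration under assumptions, not the loader's code: `buildShouldInclude` and `escapeRegExp` are hypothetical names, the exact signature iitm expects for its predicate is not shown, the exclusion RegExp and security-control subpaths are left out, and only forward-slash paths are handled.

```javascript
// Escapes RegExp metacharacters so a package name always matches literally.
function escapeRegExp (text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')
}

// Hypothetical builder: returns a predicate over resolved specifiers.
function buildShouldInclude ({ packages, builtins }) {
  const bareSpecifiers = new Set(packages)
  for (const name of builtins) {
    bareSpecifiers.add(name)
    bareSpecifiers.add(`node:${name}`) // keep the node: form instrumented too
  }
  const alternation = packages.map(escapeRegExp).join('|')
  const modulePath = new RegExp(`/node_modules/(?:${alternation})/`)
  return specifier => bareSpecifiers.has(specifier) || modulePath.test(specifier)
}

const shouldInclude = buildShouldInclude({
  packages: ['express', 'lodash.get'],
  builtins: ['crypto']
})

console.log(shouldInclude('node:crypto')) // true
console.log(shouldInclude('file:///app/node_modules/express/index.js')) // true
console.log(shouldInclude('file:///app/node_modules/lodashxget/index.js')) // false
console.log(shouldInclude('file:///app/src/server.js')) // false
```

The third call shows why escaping matters: without it, the `.` in `lodash.get` would match any character.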

The .mjs rewriter loader spec was the repository's only .spec.mjs and no CI job
ran it: the misc suite glob matched .spec.js only, and the exercised-tests gate
collected .spec.js/.test.mjs but not .spec.mjs, so it could not flag the orphan.

1. Match *.spec.{js,mjs} in test:instrumentations:misc so the spec runs.
2. Widen verify-exercised-tests globs to @(spec|test).@(js|mjs|cjs) so every
   naming convention is tracked and an unrun one fails the gate.
3. Load the loader through require(esm) where the runtime supports it so its
   transforms land on nyc's CommonJS instrumentation path; gate on
   process.features.require_module so Node 18 falls back to import() instead of
   crashing the suite on the CommonJS compiler's SyntaxError.
The sampling tests in #9030 build their own taggers with `{ llmobs:
{ enabled: true } }`, and #8943 renamed that config key to
`DD_LLMOBS_ENABLED` everywhere it could see. The two landed in parallel,
so #8943 normalized the rest of the file but never saw these four
fixtures. On master the tagger now reads `DD_LLMOBS_ENABLED`, finds it
undefined, and returns before registering the span; `Tagger.tagMap.get`
then yields undefined and the "DROPPED at sampleRate 0" test throws
synchronously, aborting the whole `test:llmobs:sdk:ci` run with exit 7.

Fixes: https://github.com/DataDog/dd-trace-js/actions/runs/28265509637/job/83751636644
)

feat(graphql): migrate instrumentation to orchestrion

  Migrates GraphQL instrumentation from shimmer wrappers to orchestrion AST rewriting for graphql execute / parse / validate entry points, including CJS and ESM paths for graphql >=0.10 and @graphql-tools/executor.

  Moves resolver instrumentation into the GraphQL execute plugin. The execute plugin now owns per-execute root context, resolver wrapping, resolve-span lifecycle, source tracking, and resolver hook invocation. The old separate resolve plugin is removed.

  Preserves and tests the existing cross-feature contracts:
  - IAST still receives one apm:graphql:resolve:start publish per resolver call, using the actual GraphQL args object.
  - AppSec still receives resolver payloads through datadog:graphql:resolver:start and can abort synchronously through the shared abort controller.
  - depth only limits resolve-span creation; IAST/AppSec resolver publishes still happen for depth-gated fields.
  - depth-gated resolvers now honor abort signals before falling through the no-span fast path.
  - caller-owned execute args and contextValue are preserved without mutation.
  - default field resolver behavior matches graphql for primitive parent values.
  - graphql-yoga / @graphql-tools/executor execution is instrumented.

  Adds public TypeScript declarations for the GraphQL resolve hook and FieldContext payload.

  Keeps the implementation orchestrion-only, with no shimmer fallback, and updates the GraphQL long benchmark calibration for the migrated hot path.

  Regression coverage was added for:
  - resolver abort behavior past the configured depth
  - depth: 0 AppSec resolver-channel publishing
  - primitive-source defaultFieldResolver parity
  - caller-supplied and frozen execute args
  - primitive contextValue forwarding
  - Yoga normalized executor instrumentation
  - IAST/AppSec per-resolver channel cardinality

  Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
  Co-authored-by: Ruben Bridgewater <ruben@bridgewater.de>
Bumps the test-versions group with 1 update in the /integration-tests/esbuild directory: [openai](https://github.com/openai/openai-node).


Updates `openai` from 6.44.0 to 6.45.0
- [Release notes](https://github.com/openai/openai-node/releases)
- [Changelog](https://github.com/openai/openai-node/blob/main/CHANGELOG.md)
- [Commits](openai/openai-node@v6.44.0...v6.45.0)

---
updated-dependencies:
- dependency-name: openai
  dependency-version: 6.45.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: test-versions
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…th 10 updates (#9127)

Bumps the cloud-and-messaging group with 10 updates in the /packages/dd-trace/test/plugins/versions directory:

| Package | From | To |
| --- | --- | --- |
| [@aws-sdk/client-bedrock-runtime](https://github.com/aws/aws-sdk-js-v3/tree/HEAD/clients/client-bedrock-runtime) | `3.1074.0` | `3.1075.0` |
| [@aws-sdk/client-dynamodb](https://github.com/aws/aws-sdk-js-v3/tree/HEAD/clients/client-dynamodb) | `3.1074.0` | `3.1075.0` |
| [@aws-sdk/client-kinesis](https://github.com/aws/aws-sdk-js-v3/tree/HEAD/clients/client-kinesis) | `3.1074.0` | `3.1075.0` |
| [@aws-sdk/client-lambda](https://github.com/aws/aws-sdk-js-v3/tree/HEAD/clients/client-lambda) | `3.1074.0` | `3.1075.0` |
| [@aws-sdk/client-s3](https://github.com/aws/aws-sdk-js-v3/tree/HEAD/clients/client-s3) | `3.1074.0` | `3.1075.0` |
| [@aws-sdk/client-sfn](https://github.com/aws/aws-sdk-js-v3/tree/HEAD/clients/client-sfn) | `3.1074.0` | `3.1075.0` |
| [@aws-sdk/client-sns](https://github.com/aws/aws-sdk-js-v3/tree/HEAD/clients/client-sns) | `3.1074.0` | `3.1075.0` |
| [@aws-sdk/client-sqs](https://github.com/aws/aws-sdk-js-v3/tree/HEAD/clients/client-sqs) | `3.1074.0` | `3.1075.0` |
| [azure-functions-core-tools](https://github.com/Azure/azure-functions-core-tools) | `4.12.0` | `4.12.1` |
| [durable-functions](https://github.com/Azure/azure-functions-durable-js) | `3.3.1` | `3.4.0` |



Updates `@aws-sdk/client-bedrock-runtime` from 3.1074.0 to 3.1075.0
- [Release notes](https://github.com/aws/aws-sdk-js-v3/releases)
- [Changelog](https://github.com/aws/aws-sdk-js-v3/blob/main/clients/client-bedrock-runtime/CHANGELOG.md)
- [Commits](https://github.com/aws/aws-sdk-js-v3/commits/v3.1075.0/clients/client-bedrock-runtime)

Updates `@aws-sdk/client-dynamodb` from 3.1074.0 to 3.1075.0
- [Release notes](https://github.com/aws/aws-sdk-js-v3/releases)
- [Changelog](https://github.com/aws/aws-sdk-js-v3/blob/main/clients/client-dynamodb/CHANGELOG.md)
- [Commits](https://github.com/aws/aws-sdk-js-v3/commits/v3.1075.0/clients/client-dynamodb)

Updates `@aws-sdk/client-kinesis` from 3.1074.0 to 3.1075.0
- [Release notes](https://github.com/aws/aws-sdk-js-v3/releases)
- [Changelog](https://github.com/aws/aws-sdk-js-v3/blob/main/clients/client-kinesis/CHANGELOG.md)
- [Commits](https://github.com/aws/aws-sdk-js-v3/commits/v3.1075.0/clients/client-kinesis)

Updates `@aws-sdk/client-lambda` from 3.1074.0 to 3.1075.0
- [Release notes](https://github.com/aws/aws-sdk-js-v3/releases)
- [Changelog](https://github.com/aws/aws-sdk-js-v3/blob/main/clients/client-lambda/CHANGELOG.md)
- [Commits](https://github.com/aws/aws-sdk-js-v3/commits/v3.1075.0/clients/client-lambda)

Updates `@aws-sdk/client-s3` from 3.1074.0 to 3.1075.0
- [Release notes](https://github.com/aws/aws-sdk-js-v3/releases)
- [Changelog](https://github.com/aws/aws-sdk-js-v3/blob/main/clients/client-s3/CHANGELOG.md)
- [Commits](https://github.com/aws/aws-sdk-js-v3/commits/v3.1075.0/clients/client-s3)

Updates `@aws-sdk/client-sfn` from 3.1074.0 to 3.1075.0
- [Release notes](https://github.com/aws/aws-sdk-js-v3/releases)
- [Changelog](https://github.com/aws/aws-sdk-js-v3/blob/main/clients/client-sfn/CHANGELOG.md)
- [Commits](https://github.com/aws/aws-sdk-js-v3/commits/v3.1075.0/clients/client-sfn)

Updates `@aws-sdk/client-sns` from 3.1074.0 to 3.1075.0
- [Release notes](https://github.com/aws/aws-sdk-js-v3/releases)
- [Changelog](https://github.com/aws/aws-sdk-js-v3/blob/main/clients/client-sns/CHANGELOG.md)
- [Commits](https://github.com/aws/aws-sdk-js-v3/commits/v3.1075.0/clients/client-sns)

Updates `@aws-sdk/client-sqs` from 3.1074.0 to 3.1075.0
- [Release notes](https://github.com/aws/aws-sdk-js-v3/releases)
- [Changelog](https://github.com/aws/aws-sdk-js-v3/blob/main/clients/client-sqs/CHANGELOG.md)
- [Commits](https://github.com/aws/aws-sdk-js-v3/commits/v3.1075.0/clients/client-sqs)

Updates `azure-functions-core-tools` from 4.12.0 to 4.12.1
- [Release notes](https://github.com/Azure/azure-functions-core-tools/releases)
- [Changelog](https://github.com/Azure/azure-functions-core-tools/blob/4.12.1/release_notes.md)
- [Commits](Azure/azure-functions-core-tools@4.12.0...4.12.1)

Updates `durable-functions` from 3.3.1 to 3.4.0
- [Release notes](https://github.com/Azure/azure-functions-durable-js/releases)
- [Commits](Azure/azure-functions-durable-js@v3.3.1...v3.4.0)

---
updated-dependencies:
- dependency-name: "@aws-sdk/client-bedrock-runtime"
  dependency-version: 3.1075.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: cloud-and-messaging
- dependency-name: "@aws-sdk/client-dynamodb"
  dependency-version: 3.1075.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: cloud-and-messaging
- dependency-name: "@aws-sdk/client-kinesis"
  dependency-version: 3.1075.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: cloud-and-messaging
- dependency-name: "@aws-sdk/client-lambda"
  dependency-version: 3.1075.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: cloud-and-messaging
- dependency-name: "@aws-sdk/client-s3"
  dependency-version: 3.1075.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: cloud-and-messaging
- dependency-name: "@aws-sdk/client-sfn"
  dependency-version: 3.1075.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: cloud-and-messaging
- dependency-name: "@aws-sdk/client-sns"
  dependency-version: 3.1075.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: cloud-and-messaging
- dependency-name: "@aws-sdk/client-sqs"
  dependency-version: 3.1075.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: cloud-and-messaging
- dependency-name: azure-functions-core-tools
  dependency-version: 4.12.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: cloud-and-messaging
- dependency-name: durable-functions
  dependency-version: 3.4.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: cloud-and-messaging
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…pdates (#9128)

Bumps the test-versions group with 4 updates in the /packages/dd-trace/test/plugins/versions directory: [@types/node](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/node), [pnpm](https://github.com/pnpm/pnpm/tree/HEAD/pnpm11/pnpm), [protobufjs](https://github.com/protobufjs/protobuf.js) and [stripe](https://github.com/stripe/stripe-node).


Updates `@types/node` from 26.0.0 to 26.0.1
- [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases)
- [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/node)

Updates `pnpm` from 11.8.0 to 11.9.0
- [Release notes](https://github.com/pnpm/pnpm/releases)
- [Changelog](https://github.com/pnpm/pnpm/blob/main/pnpm11/pnpm/CHANGELOG.md)
- [Commits](https://github.com/pnpm/pnpm/commits/v11.9.0/pnpm11/pnpm)

Updates `protobufjs` from 8.6.4 to 8.6.5
- [Release notes](https://github.com/protobufjs/protobuf.js/releases)
- [Changelog](https://github.com/protobufjs/protobuf.js/blob/master/CHANGELOG.md)
- [Commits](protobufjs/protobuf.js@protobufjs-v8.6.4...protobufjs-v8.6.5)

Updates `stripe` from 22.2.3 to 22.3.0
- [Release notes](https://github.com/stripe/stripe-node/releases)
- [Changelog](https://github.com/stripe/stripe-node/blob/master/CHANGELOG.md)
- [Commits](stripe/stripe-node@v22.2.3...v22.3.0)

---
updated-dependencies:
- dependency-name: "@types/node"
  dependency-version: 26.0.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: test-versions
- dependency-name: pnpm
  dependency-version: 11.9.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: test-versions
- dependency-name: protobufjs
  dependency-version: 8.6.5
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: test-versions
- dependency-name: stripe
  dependency-version: 22.3.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: test-versions
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…#9110)

The request helper retries once on a 5xx with a 5–7.5 s jittered delay, but
the getKnownTests "should return an error if the request fails" test mocked a
single 500 interceptor and used real timers. The retried request had no
interceptor and its retry timer always exceeded mocha's 5 s timeout, so the
callback never fired and the test timed out. Collapse the retry delay to 0 ms
and add a second 500 interceptor so the test exercises the real retry path and
asserts both requests are consumed.
…9112)

`internalPropertyName` made each supported config carry a second hand-maintained runtime path next to its canonical env name. Drop that alias and derive the config-object path from the canonical name, with Test Optimization entries grouped under `testOptimization` and top-level entries using their canonical key directly.

The plugin shared-config boundary still forwards the existing per-plugin keys, including the flat dynamic-instrumentation flag `CiPlugin.configure` receives; plugins do not receive the namespaced tracer config.

Drive-by fix:

* Drop duplicate benchmark `enabled` leaves left behind by the previous namespace migration.
The NoSQL injection analyzer used `enterWith` to mark the async context,
which leaked the marker past the query. A request that ended before its
query finished stranded the marker for the next request, so that
request's injection went unreported. Two concurrent queries within the
same request also saw each other's marker, leaving one unanalyzed.

Binding the marker on the query-build channel fixed the leak but lost it
on deferred queries. A mongoose query builds, executes, and reaches the
driver in three separate async steps. `find().then()`/`.exec()` runs the
driver a turn after the synchronous build, outside the build's
`runStores` scope, so the driver re-analyzed the same filter and
reported the injection twice. Binding the marker around the execution
channel instead covers the full async scope that reaches the driver, and
`runStores` restores the parent on its own.
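
The difference between a leaking and a scoped marker can be shown with plain `AsyncLocalStorage`. The fix itself binds the marker on a channel and relies on `runStores`; the sketch below swaps in `AsyncLocalStorage.run` as a simpler stand-in for the same scoping behavior, and the `{ analyzed: true }` marker is a made-up value.

```javascript
const { AsyncLocalStorage } = require('node:async_hooks')

const storage = new AsyncLocalStorage()

// Scoped: the marker exists only while the callback runs.
const insideRun = storage.run({ analyzed: true }, () => storage.getStore())
const afterRun = storage.getStore()

// Unscoped: enterWith keeps the marker set after the call returns.
storage.enterWith({ analyzed: true })
const afterEnterWith = storage.getStore()

console.log(insideRun) // { analyzed: true }
console.log(afterRun) // undefined
console.log(afterEnterWith) // { analyzed: true }
```

The last line is the leak in miniature: whatever runs next in the same context still sees the marker.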

This re-enables the mongoose nosqli suite on Node 20 + Express 5 (skipped
for APPSEC-66705) and the mquery nosqli integration suite (skipped for
APPSEC-62431, where the unscoped marker caused each injection to be
reported N+1 times). The mquery suite skips mongodb >=7 on Node < 20:
that driver reads Web Crypto off the `crypto` global, which Node 18 does
not expose by default.
The `depth` filter counted a resolver's full execution path, including the
numeric list indices that `collapse` later folds away. The same query therefore
reached a different depth depending on whether `collapse` was on: a field one
list-hop below the limit was instrumented when collapsing was off and dropped
when it was on, even though both describe the same selection-set nesting.

Count only selection-set segments (string path keys) toward `depth`, so the
limit tracks query structure rather than execution artifacts.

This shifts which resolvers are instrumented at a given `depth`, so it is gated
behind `DD_MAJOR`: the v5 line keeps the old list-index counting when collapsing
is on, and v6 counts selection-set depth only. The `countListIndices` config
flag carries the gate so `shouldInstrumentNode` stays free of version checks.
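
A minimal sketch of the two counting modes, assuming the resolver path is the linked list graphql-js exposes as `info.path` (nodes with `key` and `prev`). `pathDepth` is a hypothetical helper, and how `countListIndices` is wired into `shouldInstrumentNode` is not shown in the commit message.

```javascript
// Walks the path from leaf to root and counts the segments that apply.
function pathDepth (path, countListIndices) {
  let depth = 0
  for (let node = path; node; node = node.prev) {
    // String keys are selection-set fields; numeric keys are list indices.
    if (countListIndices || typeof node.key === 'string') depth++
  }
  return depth
}

// Path for friends[0].name, written leaf first.
const path = {
  key: 'name',
  prev: { key: 0, prev: { key: 'friends', prev: undefined } }
}

console.log(pathDepth(path, true)) // 3: list index counted
console.log(pathDepth(path, false)) // 2: selection-set depth only
```
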

Fixes: #7468
Adds AppSec support for AWS lambda to dd-trace-js by introducing DC handlers that allow the datadog-lambda-js layer to delegate WAF execution to the tracer.

dd-octo-sts Bot commented Jun 30, 2026


Overall package size

Self size: 6.53 MB
Deduped: 7.6 MB
No deduping: 7.6 MB

Dependency sizes

| name | version | self size | total size |
|------|---------|-----------|------------|
| import-in-the-middle | 3.2.0 | 104.26 kB | 843.44 kB |
| opentracing | 0.14.7 | 194.81 kB | 194.81 kB |
| dc-polyfill | 0.1.11 | 25.74 kB | 25.74 kB |

🤖 This report was automatically generated by heaviest-objects-in-the-universe

datadog-prod-us1-5 Bot commented Jun 30, 2026


⚠️ Warnings

🚦 3 Pipeline jobs failed

DataDog/apm-reliability/dd-trace-js | benchmark: [20, 2]

DataDog/apm-reliability/dd-trace-js | benchmark: [24, 2]

DataDog/apm-reliability/dd-trace-js | benchmark: [26, 2]

ℹ️ Info

No other issues found

🧪 All tests passed
❄️ No new flaky tests detected

🎯 Code Coverage
Patch Coverage: 82.49%
Overall Coverage: 88.00%

🔗 Commit SHA: b9822a7

codecov Bot commented Jun 30, 2026

Codecov Report

❌ Patch coverage is 90.89452% with 284 lines in your changes missing coverage. Please review.
✅ Project coverage is 93.76%. Comparing base (6b35e7d) to head (b9822a7).
⚠️ Report is 1683 commits behind head on v5.x.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| ...instrumentations/src/vitest-main-no-worker-init.js | 80.43% | 125 Missing ⚠️ |
| ...ckages/datadog-instrumentations/src/vitest-main.js | 87.28% | 94 Missing ⚠️ |
| ...ages/datadog-instrumentations/src/vitest-worker.js | 94.50% | 22 Missing ⚠️ |
| loader-hook.mjs | 63.33% | 11 Missing ⚠️ |
| packages/datadog-plugin-graphql/src/execute.js | 97.98% | 6 Missing ⚠️ |
| ...-visibility/test-optimization-http-cache-schema.js | 94.20% | 4 Missing ⚠️ |
| .../src/ci-visibility/test-optimization-http-cache.js | 97.56% | 4 Missing ⚠️ |
| ...ckages/datadog-instrumentations/src/vitest-util.js | 96.25% | 3 Missing ⚠️ |
| packages/datadog-plugin-graphql/src/parse.js | 87.50% | 2 Missing ⚠️ |
| packages/datadog-plugin-vitest/src/index.js | 92.00% | 2 Missing ⚠️ |

... and 8 more
Additional details and impacted files
@@             Coverage Diff             @@
##             v5.x    #9150       +/-   ##
===========================================
+ Coverage   83.19%   93.76%   +10.57%     
===========================================
  Files         476      898      +422     
  Lines       20153    52365    +32212     
  Branches        0    12324    +12324     
===========================================
+ Hits        16766    49102    +32336     
+ Misses       3387     3263      -124     
Flag Coverage Δ
aiguard 34.62% <45.00%> (?)
aiguard-integration 41.78% <52.63%> (?)
apm-bucket-0 35.38% <43.69%> (?)
apm-bucket-1 40.25% <42.97%> (?)
apm-bucket-2 37.26% <42.97%> (?)
apm-capabilities-tracing 48.91% <57.44%> (?)
apm-integrations-aerospike 32.87% <43.69%> (?)
apm-integrations-confluentinc-kafka-javascript 39.78% <47.05%> (?)
apm-integrations-couchbase 33.15% <43.69%> (?)
apm-integrations-http 41.81% <42.36%> (?)
apm-integrations-kafkajs 40.00% <47.05%> (?)
apm-integrations-next 29.40% <41.32%> (?)
apm-integrations-prisma 34.94% <42.97%> (?)
apm-integrations-tedious 33.78% <42.97%> (?)
appsec 57.70% <78.50%> (?)
appsec-express_fastify_graphql 53.65% <68.43%> (?)
appsec-integration 35.61% <22.69%> (?)
appsec-kafka_ldapjs_lodash 43.54% <42.96%> (?)
appsec-mongodb-core_mongoose_mysql 48.71% <49.77%> (?)
appsec-next 27.68% <41.32%> (?)
appsec-node-serialize_passport_postgres 47.94% <46.33%> (?)
appsec-sourcing_stripe_template 45.44% <44.49%> (?)
debugger 44.46% <50.31%> (?)
instrumentations-bucket-0 27.92% <42.01%> (?)
instrumentations-bucket-1 37.33% <43.44%> (?)
instrumentations-bucket-10 40.29% <41.66%> (?)
instrumentations-bucket-11 27.72% <42.01%> (?)
instrumentations-bucket-12 28.56% <41.32%> (?)
instrumentations-bucket-13 27.55% <42.01%> (?)
instrumentations-bucket-2 30.13% <41.80%> (?)
instrumentations-bucket-3 35.82% <43.44%> (?)
instrumentations-bucket-4 28.33% <42.01%> (?)
instrumentations-bucket-5 36.20% <42.97%> (?)
instrumentations-bucket-6 38.18% <42.97%> (?)
instrumentations-bucket-7 35.93% <45.66%> (?)
instrumentations-bucket-8 36.88% <43.44%> (?)
instrumentations-bucket-9 39.39% <41.66%> (?)
instrumentations-instrumentation-couchbase 45.38% <50.00%> (?)
instrumentations-integration-esbuild 24.43% <11.13%> (?)
llmobs-ai_anthropic_bedrock 39.47% <49.21%> (?)
llmobs-google-genai_langchain_vertex-ai 36.91% <50.75%> (?)
llmobs-openai 39.51% <50.38%> (?)
llmobs-sdk 43.33% <50.00%> (?)
openfeature 37.73% <51.18%> (?)
openfeature-unit 49.92% <51.28%> (?)
platform-core_esbuild_instrumentations-misc 22.94% <10.79%> (?)
platform-integration 47.41% <54.49%> (?)
platform-shimmer_unit-guardrails_webpack 18.30% <7.54%> (?)
plugins-bucket-0 36.32% <51.02%> (?)
plugins-bucket-1 39.59% <43.10%> (?)
plugins-bucket-11 38.37% <42.40%> (?)
plugins-bucket-17 38.98% <42.97%> (?)
plugins-bucket-18 41.93% <22.26%> (?)
plugins-bucket-19 39.50% <82.40%> (?)
plugins-bucket-20 43.14% <43.80%> (?)
plugins-bucket-4 37.66% <43.44%> (?)
plugins-bullmq_cassandra_cookie 39.67% <43.80%> (?)
plugins-cookie-parser_crypto_dd-trace-api 32.98% <43.69%> (?)
plugins-fetch_fs_generic-pool 35.95% <44.62%> (?)
plugins-google-cloud-pubsub_grpc_handlebars 42.84% <45.45%> (?)
plugins-hapi_hono_ioredis 37.69% <42.97%> (?)
plugins-jest_knex_langgraph 32.36% <42.74%> (?)
plugins-ldapjs_light-my-request_limitd-client 27.62% <42.01%> (?)
plugins-lodash_mariadb_memcached 34.90% <43.69%> (?)
plugins-mongodb_mongodb-core_mongoose 36.21% <42.85%> (?)
plugins-multer_mysql_mysql2 34.88% <43.69%> (?)
plugins-nats_node-serialize_opensearch 37.07% <42.97%> (?)
plugins-passport-http_pino_postgres 35.28% <44.53%> (?)
plugins-process_pug_redis 34.01% <43.69%> (?)
plugins-undici_url_valkey 35.78% <42.97%> (?)
plugins-vm_winston_ws 37.47% <43.80%> (?)
profiling 43.65% <51.56%> (?)
serverless-aws-sdk-aws-sdk 33.11% <42.40%> (?)
serverless-aws-sdk-bedrockruntime 31.98% <42.40%> (?)
serverless-aws-sdk-client 36.21% <33.33%> (?)
serverless-aws-sdk-dynamodb 33.96% <43.20%> (?)
serverless-aws-sdk-eventbridge 27.16% <41.37%> (?)
serverless-aws-sdk-kinesis 37.33% <42.40%> (?)
serverless-aws-sdk-lambda 34.42% <42.40%> (?)
serverless-aws-sdk-s3 32.41% <42.40%> (?)
serverless-aws-sdk-serverless-peer-service 39.40% <50.50%> (?)
serverless-aws-sdk-sns 38.18% <43.20%> (?)
serverless-aws-sdk-sqs 37.92% <43.20%> (?)
serverless-aws-sdk-stepfunctions 33.00% <42.40%> (?)
serverless-aws-sdk-util 46.67% <100.00%> (?)
serverless-bucket-0 39.41% <47.61%> (?)
serverless-lambda 34.04% <54.54%> (?)
test-optimization-cucumber 52.55% <49.69%> (?)
test-optimization-cypress 49.12% <47.22%> (?)
test-optimization-jest 55.78% <50.40%> (?)
test-optimization-mocha 53.50% <50.00%> (?)
test-optimization-playwright-playwright-atr 43.36% <37.09%> (?)
test-optimization-playwright-playwright-efd 43.39% <37.57%> (?)
test-optimization-playwright-playwright-final-status 43.48% <41.06%> (?)
test-optimization-playwright-playwright-impacted-tests 42.93% <37.57%> (?)
test-optimization-playwright-playwright-reporting 43.24% <38.93%> (?)
test-optimization-playwright-playwright-test-management 44.50% <40.86%> (?)
test-optimization-playwright-playwright-test-span 44.24% <39.47%> (?)
test-optimization-selenium 45.39% <39.27%> (?)
test-optimization-testopt 46.75% <23.39%> (?)
test-optimization-vitest 52.63% <77.56%> (?)

CarlesDD and others added 6 commits July 1, 2026 06:30
1. Remove test/asserts/profile.js. Nothing requires it; the only profile
   value-type assertion in use is the standalone helper in the profiling
   agent exporter spec.
2. Fix the "amont" -> "amount" typo in both telemetry heartbeat comments.

* ci(scripts): disable V8 Maglev for Windows test children

Network-heavy specs intermittently abort with STATUS_STACK_BUFFER_OVERRUN
(0xC0000409) on Windows: mocha-run-file forces process.exit() the moment
mocha finishes, and that races V8's Maglev teardown while libuv still has
in-flight sockets from the spec's real HTTP traffic. The child dies with no
stderr and no crash report, so mocha-parallel-files only sees a non-zero exit
and reports the file as crashed (e.g. inferred_proxy.spec.js, which is just
whichever network spec lost the race that run).

--no-maglev sidesteps the faulty tier and can only be passed as a CLI flag,
not through NODE_OPTIONS, so the runner injects it into the spawned per-file
node processes on win32.

Refs: nodejs/node#62260

* ci(scripts): gate --no-maglev by V8 version

The Windows --no-maglev workaround was appended on every win32 child, but the
top-level --maglev/--no-maglev toggle only exists from V8 11 (Node 20). On the
supported Node 18 line (V8 10) the flag does not exist, so each spawned spec
aborts with `bad option: --no-maglev` (exit 9) before mocha runs, breaking the
parallel runner for Windows Node 18. Gate on the running V8 major, which the
children inherit via the shared binary.
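
A minimal sketch of the gating described above, assuming a hypothetical helper named `maglevExecArgv`; the runner's real function name and call site are not shown in this commit message. It returns the extra CLI flag only on win32 and only when the running V8 major is at least 11.

```javascript
'use strict'

// Hypothetical helper: extra CLI flags for a spawned per-file test child.
// `platform` and `v8Version` default to the running process; they are
// parameters only so the function can be exercised without spawning anything.
function maglevExecArgv (platform = process.platform, v8Version = process.versions.v8) {
  const v8Major = Number.parseInt(v8Version.split('.')[0], 10)
  // Per the commit message: the workaround targets win32 only, and the flag
  // is rejected as a bad option before V8 11 (Node 20).
  return platform === 'win32' && v8Major >= 11 ? ['--no-maglev'] : []
}
```

A runner could then spread the result in front of the spec file, for example `spawn(process.execPath, [...maglevExecArgv(), file])`. The flag has to travel on the command line because, as noted above, it cannot be passed through NODE_OPTIONS.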
Node 20 defines `fs.opendir` / `fs.opendirSync` as lazy getter+setter accessor
properties that resolve the real function on first read. Handing such a property
to `shimmer.wrap` instrumented the getter, so the property access was traced
while the real call ran uninstrumented — IAST then saw no `opendir` operation
and reported no PATH_TRAVERSAL vulnerability. Node 18/22 define these as plain
data properties, which is why the gap was Node-20-only.

Extend `shimmer.wrap`'s existing `replaceGetter` option to cover the lazy
getter+setter case: resolve the value once through the getter and wrap that,
rather than re-implementing the resolution in the fs instrumentation. The
property keeps its original shape — a getter+setter pair stays a getter+setter
pair whose setter still materializes a writable data property on assignment, so
the descriptor remains observationally identical for a downstream consumer that
inspects or overwrites it on that Node.js version.

A getter+setter pair without `replaceGetter` keeps being wrapped in place — the
wrapper becomes the new getter and the original setter is left untouched, as the
`url` instrumentation relies on for the `URL.prototype` `host` / `hostname`
accessors. Only a setter-only property throws. Narrowing the guard to reject
every unguarded getter+setter pair would have thrown inside the `url` hook,
silently dropping that instrumentation and the AppSec / IAST coverage built on
it.

`fs.js` now passes `{ replaceGetter: true }` through its `wrap` / `massWrap`
helpers instead of carrying its own materialization helper.
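
The lazy getter+setter case can be sketched as follows. This is an illustration under stated assumptions, not shimmer's implementation: `makeLazyModule` is a stand-in for the Node 20 property shape, and `wrapLazyAccessor` is a hypothetical helper that resolves the value once through the getter, wraps it, and keeps the accessor pair in place.

```javascript
'use strict'

// Stand-in for the Node 20 shape described above: an accessor that resolves
// the real function on read, and whose setter turns the property into a plain
// writable data property. Not Node's actual source.
function makeLazyModule () {
  const mod = {}
  Object.defineProperty(mod, 'opendir', {
    configurable: true,
    enumerable: true,
    get () { return function opendir (path) { return `opened ${path}` } },
    set (value) {
      Object.defineProperty(mod, 'opendir', {
        configurable: true, enumerable: true, writable: true, value,
      })
    },
  })
  return mod
}

// Hypothetical helper: resolve the value once through the original getter,
// wrap that value, and keep the getter+setter shape of the property.
function wrapLazyAccessor (target, name, wrapper) {
  const descriptor = Object.getOwnPropertyDescriptor(target, name)
  const wrapped = wrapper(descriptor.get.call(target))
  Object.defineProperty(target, name, {
    ...descriptor,
    get () { return wrapped },
  })
}

const calls = []
const fsLike = makeLazyModule()
wrapLazyAccessor(fsLike, 'opendir', original => function (...args) {
  calls.push(args[0]) // the real call is now observed, not just the property read
  return original.apply(this, args)
})

const result = fsLike.opendir('/tmp')
const shape = Object.getOwnPropertyDescriptor(fsLike, 'opendir')
```

After wrapping, `shape` still reports a getter and the original setter, which is the observational-identity property the commit message calls out.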
The second sendBatch/send in these tests is a real broker call that the
test expects to succeed after the stub is restored. A fresh topic's first
produce routinely returns the retryable NOT_LEADER_FOR_PARTITION while
metadata propagates; kafkajs normally refreshes metadata and retries it,
but retry:{retries:0} stripped that safety net, surfacing the transient
error as a hard KafkaJSNonRetriableError and flaking CI.

retries:0 bought nothing for the first call it was meant to speed up: the
stubbed UNKNOWN error is non-retryable, so it already fails on the first
attempt regardless of the retry count. Removing it restores the retry on
the real call while leaving the stubbed-rejection assertions unchanged.

The non-native runtime metrics test asserted that
runtime.node.gc.pause.by.type.95percentile lands in [0.1ms, 100ms).
On a fast or idle runner a gc_type can have a single scavenge sample
whose p95 sits below 0.1ms, so the matcher rejected a legitimate value
and the test failed on Windows.

The bounds exist to catch a unit-conversion regression, not to assert a
minimum pause length: a sub-microsecond value would mean the ms->ns
conversion was dropped, a value over 100ms that it was left in ms or
seconds. Lower the floor to 1µs so it still trips on a dropped
conversion while no longer assuming a GC pause takes at least 0.1ms.
juan-fernandez and others added 5 commits July 2, 2026 16:22
The http plugin requires datadog-plugin-google-cloud-pubsub's
pubsub-push-subscription module at the top of its own file, so every process
that instruments http pulls in that plugin graph even though it is only used
on GCP Cloud Run (K_SERVICE set and DD_TRACE_GCP_PUBSUB_PUSH_ENABLED not
opted out). Moving the require inside the existing gate keeps the module — and
its transitive graph — out of the startup path for every other deployment.

Drive-by fix:

* Drop the try/catch around the require: the pubsub plugin ships with the
  tracer, so this require cannot fail independently of the tracer's own load.
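
A rough sketch of the lazy-load pattern, with assumptions called out: `isPubSubPushEnabled`, `getPushSubscription` and the `load` callback are hypothetical names, and the exact opt-out parsing of DD_TRACE_GCP_PUBSUB_PUSH_ENABLED in the tracer may differ. `load` stands in for the `require` of the pubsub-push-subscription module.

```javascript
'use strict'

let pushSubscription // resolved on first use only

// Hypothetical gate mirroring the condition in the commit message: K_SERVICE
// set and DD_TRACE_GCP_PUBSUB_PUSH_ENABLED not opted out. Treating only the
// string 'false' as an opt-out is an assumption of this sketch.
function isPubSubPushEnabled (env = process.env) {
  return Boolean(env.K_SERVICE) && env.DD_TRACE_GCP_PUBSUB_PUSH_ENABLED !== 'false'
}

// `load` stands in for requiring the pubsub push subscription module.
function getPushSubscription (env, load) {
  if (!isPubSubPushEnabled(env)) return undefined
  pushSubscription ??= load()
  return pushSubscription
}

let loads = 0
const load = () => { loads++; return { name: 'push-subscription' } }
const skipped = getPushSubscription({}, load)
const first = getPushSubscription({ K_SERVICE: 'my-service' }, load)
const second = getPushSubscription({ K_SERVICE: 'my-service' }, load)
```

Outside Cloud Run the loader never runs, and inside it runs once, which is the startup saving the change is after.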
…spec (#9078)

The Bluebird specs set global.Promise = Bluebird in beforeEach and restored it
only in afterEach, holding the mutation across the awaited span round-trip. Under
load the subprocess span arrives late, and the shared assertion helper only
rejects once a payload has already errored, so the spec hangs to mocha's 5 s
timeout while leftover async keeps global.Promise flipped, corrupting context for
every later spec in the file.

1. The instrumentation only reads global.Promise synchronously, when the wrapped
   method runs. Scope the swap to that window via withBluebird() and restore it in
   a finally before any await.
2. assert.strictEqual(promise.constructor, Bluebird) asserts a property that
   depends on module-load order and the call-time global, not on any contract the
   instrumentation makes; it held locally and on Node 18 but saw the native
   constructor on Node 20. Assert the real contract instead: the promisified call
   resolves with the expected stdout and still produces the span.
3. Await the span expectation and the subprocess completion together in one
   Promise.all per spec, so a late or failed span cannot float onto the shared
   agent and match a later spec's expectation.

Drive-by fix:

* Drop the dead `delete require.cache[require.resolve('util')]` (util is a builtin).
* Replace the try/catch rejection check with assert.rejects.
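
Point 1 can be sketched like this. `withGlobalPromise` is a hypothetical helper written in the spirit of `withBluebird()`, whose real body is not shown here, and `FakeBluebird` is a dependency-free stand-in for Bluebird.

```javascript
'use strict'

// Hypothetical helper: swap global.Promise only for the synchronous part of
// `fn`, then restore it in a finally before any await can run.
function withGlobalPromise (PromiseImpl, fn) {
  const original = global.Promise
  global.Promise = PromiseImpl
  try {
    return fn()
  } finally {
    global.Promise = original
  }
}

// Stand-in for Bluebird so the sketch has no dependencies.
class FakeBluebird extends Promise {}

const NativePromise = global.Promise
let seenDuringCall
const pending = withGlobalPromise(FakeBluebird, () => {
  seenDuringCall = global.Promise // what a wrapped method would read synchronously
  return new global.Promise(resolve => setImmediate(resolve, 'stdout'))
})
const restoredBeforeAwait = global.Promise === NativePromise
```

The returned promise can still be awaited afterwards; by then the global is already back to the native constructor, so a late span or a timeout cannot leave it flipped for later specs.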
* feat(span-stats): add OTLP metrics export for span stats

Export client-computed span stats as OTLP metrics (dd.trace.span.hits,
dd.trace.span.errors, dd.trace.span.top_level_hits, dd.trace.span.duration)
via a new OtlpStatsExporter alongside the existing Datadog /v0.6/stats
exporter.

Enabled via DD_TRACE_OTEL_METRICS_ENABLED=true, or auto-enabled when both
OTEL_TRACES_EXPORTER=otlp and OTEL_METRICS_EXPORTER=otlp are set. URL and
protocol are derived from the OTLP trace export configuration.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test(config): register traceMetrics as internal runtime property

traceMetrics is a computed aggregate derived from OTEL_TRACES_EXPORTER,
OTEL_METRICS_EXPORTER, and DD_TRACE_OTEL_METRICS_ENABLED — not a raw
user-facing key — so it belongs in INTERNAL_RUNTIME_PROPERTIES alongside
sampler and stableConfig.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(config): guard traceMetrics URL derivation against invalid OTLP endpoint

When hostname is an unbracketed IPv6 address (e.g. ::1), the defaultOtlpBase
is http://::1:4318 which is not a valid URL. The new URL() call in the
traceMetrics block was the first code path to actually parse the string,
causing a TypeError that crashed config construction.

Wrap the URL derivation in a try/catch so that a malformed traces endpoint
falls back to the localhost default without throwing.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
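
A minimal sketch of the guard, assuming a hypothetical helper `deriveMetricsUrl` and a fallback constant chosen for illustration; the config code's actual property names are not reproduced. An unbracketed IPv6 host makes `new URL()` throw, so the derivation falls back instead of propagating the error.

```javascript
'use strict'

// Assumed fallback for this sketch; the commit only says "the localhost default".
const FALLBACK_BASE = 'http://localhost:4318'

// Hypothetical helper: derive the metrics URL from a base endpoint, falling
// back when the base does not parse as a URL.
function deriveMetricsUrl (base) {
  let url
  try {
    url = new URL(base)
  } catch {
    url = new URL(FALLBACK_BASE)
  }
  url.pathname = '/v1/metrics'
  return url.href
}
```

With this shape, `deriveMetricsUrl('http://::1:4318')` resolves to the fallback, while the bracketed form `http://[::1]:4318` parses and is kept.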

* chore(config): regenerate config types for DD_TRACE_OTEL_METRICS_ENABLED

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(config): add configurationNames to DD_TRACE_OTEL_METRICS_ENABLED entry

The eslint-config-names-sync rule verifies that every leaf property in
TracerOptions (index.d.ts) has a matching configurationNames entry in
supported-configurations.json. The entry for DD_TRACE_OTEL_METRICS_ENABLED
only had internalPropertyName, which is not checked by the rule.

Adding configurationNames: ["traceMetricsEnabled"] ties the two files
together and satisfies the lint check. Regenerated config types to match.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* bug fix

* update post RFC discussion

* bring implementation bug fix

* clean how configs are set

* feat(otlp-trace-metrics): align span-stats export with the trace-metrics contract

Update the OTLP trace-metrics export to match the agreed RFC/system-test contract:

- Rename the enablement env var to OTEL_CLIENT_STATS_COMPUTATION_ENABLED and add
  DD_TRACE_OTEL_SEMANTICS_ENABLED (OTel-semantics mode: emit only OTel attributes, no dd.*).
- Emit a single histogram named traces.span.sdk.metrics.duration.
- Map dimensions to OTel attributes (span.name, span.kind, http.*, rpc.* from grpc tags) and
  convey errors via OTel status.code; default mode also adds dd.operation.name, dd.span.type,
  dd.origin and dd.span.top_level.
- Add telemetry.sdk.{name,language,version} resource attributes and emit process tags as dd.<key>
  (default mode only); gate all dd.* resource attributes behind default mode.
- Drive the flush/export cadence from OTEL_METRIC_EXPORT_INTERVAL and drop the
  _DD_TRACE_STATS_WRITER_INTERVAL override.
- Read grpc.status.code from span.metrics (numeric) with a meta fallback.

Update unit tests accordingly and regenerate config types.

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(otlp-trace-metrics): report service identity per InstrumentationScope

Partition span-stats data points by service so one OTLP payload can carry
multiple services, each as its own InstrumentationScope with service.name,
service.version and deployment.environment.name. These move off the resource,
which now only carries SDK identity, host.name and dd.* attributes.

Fix the trace-metrics flush cadence at 10s (no longer driven by
OTEL_METRIC_EXPORT_INTERVAL); the internal _DD_TRACE_METRICS_OTEL_FLUSH_INTERVAL
overrides it in tests only.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(otlp-trace-metrics): apply internal flush interval override

The generic env applier only reads DD_/OTEL_ prefixed vars, so the
internal _DD_TRACE_METRICS_OTEL_FLUSH_INTERVAL (which starts with _DD_)
was never wired into config. Read it explicitly so the test-only flush
cadence override takes effect.

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(otlp-span-stats): emit fixed explicit-bounds histogram from the DDSketch

Derive the OTLP duration histogram from each group's DDSketch into the
spanmetrics-connector default bounds (in seconds), and drop the duplicate
exact-cell accumulator in span_stats. Each group now emits at most two data
points (ok/error) with a per-group dd.span.top_level heuristic, mirroring
libdatadog.

Co-authored-by: Cursor <cursoragent@cursor.com>
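
The re-bucketing idea can be sketched as below. Everything here is illustrative: the bounds are made-up values rather than the spanmetrics-connector defaults, `bins` is a hypothetical flattened view of a sketch, and DDSketch's own API is not used.

```javascript
'use strict'

// Illustrative bounds in seconds; the real export uses the
// spanmetrics-connector defaults, which are not reproduced here.
const EXAMPLE_BOUNDS_SECONDS = [0.01, 0.1, 1]

// `bins` is a hypothetical flattened view of a sketch: one representative
// value (in seconds) and a count per bin.
function toFixedHistogram (bins, bounds = EXAMPLE_BOUNDS_SECONDS) {
  const bucketCounts = new Array(bounds.length + 1).fill(0)
  for (const { value, count } of bins) {
    // OTLP explicit bounds are upper-inclusive: bucket i holds values <= bounds[i].
    let index = bounds.findIndex(bound => value <= bound)
    if (index === -1) index = bounds.length // overflow bucket
    bucketCounts[index] += count
  }
  return bucketCounts
}

const counts = toFixedHistogram([
  { value: 0.004, count: 3 },
  { value: 0.1, count: 2 },
  { value: 0.5, count: 1 },
  { value: 7, count: 4 },
])
```

The result always has one more bucket than there are bounds, since the last bucket collects everything above the highest bound.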

* refactor(span-stats): carry service identity as resource attributes

Move service.name/service.version/deployment.environment.name onto the OTLP
resource (the configured default service), emit a single InstrumentationScope,
and add service.name as a data-point attribute only when a span's service
differs from the configured default. Thread DD_SERVICE through the processor so
the transformer can compare against it.

Co-authored-by: Cursor <cursoragent@cursor.com>

* refactor(span-stats): drop redundant dd-trace InstrumentationScope

The exported OTLP metrics no longer carry an InstrumentationScope: a `dd-trace`
scope (name/version) is redundant with the resource's telemetry.sdk.* attributes.
The single scopeMetrics omits the scope field.

Co-authored-by: Cursor <cursoragent@cursor.com>

* refactor(span-stats): datadog.* attribute prefix and OTEL_TRACES_SPAN_METRICS_ENABLED

Rename the OTLP trace-metric attributes from dd.* to datadog.* (operation.name,
span.type, span.top_level, origin, runtime_id, datadog.<process tags>) and rename
the enablement env var OTEL_CLIENT_STATS_COMPUTATION_ENABLED ->
OTEL_TRACES_SPAN_METRICS_ENABLED.

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(otlp): set _dd.stats_computed resource attribute on OTLP traces when trace metrics enabled

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(span-stats): use timer.unref?.() for Electron compatibility

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
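
A small sketch of the optional-call pattern named in the title. `startFlushTimer` and its `schedule` parameter are hypothetical, and the numeric-id scheduler is an assumption about the kind of environment the Electron fix targets.

```javascript
'use strict'

// Start a periodic flush without keeping the process alive. In Node.js the
// returned Timeout has unref(); when the scheduler returns a plain numeric id
// instead, optional chaining turns the call into a no-op.
function startFlushTimer (flush, intervalMs, schedule = setInterval) {
  const timer = schedule(flush, intervalMs)
  timer.unref?.()
  return timer
}

const nodeTimer = startFlushTimer(() => {}, 10_000)
const keepsProcessAlive = nodeTimer.hasRef()
clearInterval(nodeTimer)

// Simulated scheduler that returns a numeric id, as browser-style timers do.
const browserLikeId = startFlushTimer(() => {}, 10_000, () => 42)
```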

* fix(rebase): restore extractRootTags; rename traceMetricsEnabled; add otelSemanticsEnabled to types

span_format.js: rebase conflict resolved to branch's addTag refactor which no longer exists in
master — revert to explicit typeof checks while keeping the FR06.3 BUG comment.

index.d.ts: rename traceMetricsEnabled -> otlpTraceMetricsEnabled to match supported-configurations.json;
add otelSemanticsEnabled (DD_TRACE_OTEL_SEMANTICS_ENABLED). Fixes eslint-config-names-sync errors.

Regenerate generated-config-types.d.ts from updated inputs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(span-stats): wire OTLP metrics endpoint/protocol and trim dead code

The crash fix: SpanStatsProcessor read config.otelMetricsUrl/otelMetricsProtocol,
which Config never set (dropped in 40014ae), so `new URL(undefined)` threw
ERR_INVALID_URL and crashed tracer init whenever OTLP trace metrics were enabled.
Read the canonical OTEL_EXPORTER_OTLP_METRICS_ENDPOINT/OTEL_EXPORTER_OTLP_METRICS_PROTOCOL
directly instead of introducing redundant alias properties.

Also fix the dead auto-enable check: `this.otelMetricsEnabled` does not exist
(the property is DD_METRICS_OTEL_ENABLED), so `undefined === true` made the
"auto-enable when OTLP traces + OTEL metrics are on" path never trigger.

Minimize the diff vs master without changing behavior:
- drop 4 unused SpanAggStats fields (errorDuration/topLevel*) and their test
- collapse the duplicate JSON/protobuf transformer methods into transform()
- remove two `// BUG` WIP narration comments (reverts the comment-only
  span_format.js hunk; tracked separately)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: Cursor <cursoragent@cursor.com>

* refactor(span-stats): privatize internals, trim telemetry, simplify transformer

- Make _drainBuckets / _toLegacyPayload truly private (#): neither
  crosses the class boundary, and the _ prefix only signalled privacy by convention
- Guard SpanStatsExporter construction behind !otlpTraceMetricsEnabled
  so it is never instantiated when the OTLP path is active
- Replace #errorStatus() / #boolAttr() one-shot methods with inline
  literals and a module-level ERROR_STATUS_ATTR constant to avoid
  per-call allocations
- sketchToFixedHistogram now returns number[] directly; #pushPoint
  references EXPLICIT_BOUNDS_SECONDS from the module constant
- Remove this.recordTelemetry calls from OtlpStatsExporter.export —
  not part of the OTLP trace-metrics spec
- Rewrite whitebox _drainBuckets test as blackbox: assert buckets are
  empty after onInterval() instead of calling the private method

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix configs

Co-authored-by: Munir Abdinur <munir.abdinur@datadoghq.com>

* chore: regenerate config types after supported-configurations.json update

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(opentelemetry): use otelSemanticsEnabled config key instead of DD_TRACE_OTEL_SEMANTICS_ENABLED

Our branch maps DD_TRACE_OTEL_SEMANTICS_ENABLED to the internal property
otelSemanticsEnabled via supported-configurations.json internalPropertyName.
The merged master code was reading config.DD_TRACE_OTEL_SEMANTICS_ENABLED
directly, which was undefined in our config layout.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): update span.spec.js to use otelSemanticsEnabled config key

The test was setting config.DD_TRACE_OTEL_SEMANTICS_ENABLED but span.js
now reads config.otelSemanticsEnabled (the internal property name mapped
from DD_TRACE_OTEL_SEMANTICS_ENABLED via supported-configurations.json).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(config): use string default for _DD_TRACE_METRICS_OTEL_FLUSH_INTERVAL

Schema requires default to be a string or null, not a number literal.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(config): update description for _DD_TRACE_METRICS_OTEL_FLUSH_INTERVAL to match registry

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(config): use short description for _DD_TRACE_METRICS_OTEL_FLUSH_INTERVAL

The description field maps to Short Description in the config registry.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(config): remove description field from _DD_TRACE_METRICS_OTEL_FLUSH_INTERVAL

No other int entry with allowed field uses description; may be mutually exclusive in schema.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(otlp-span-stats): address Codex review comments

- Use dd-trace VERSION (not app version) for telemetry.sdk.version resource attribute
- Pass OTEL_EXPORTER_OTLP_METRICS_HEADERS and OTEL_EXPORTER_OTLP_METRICS_TIMEOUT
  to OtlpStatsExporter so authenticated/custom endpoints work correctly
- Fix index.d.ts doc: env var is OTEL_TRACES_SPAN_METRICS_ENABLED and
  auto-enable condition is DD_METRICS_OTEL_ENABLED (not OTEL_METRICS_EXPORTER=otlp)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(lint): fix import order and no-useless-undefined in span_stats and otlp-span-stats

- Move ../../../version import before ./constants to satisfy import/order rule
- Remove explicit = undefined default for headers param (unicorn/no-useless-undefined)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test(config): cover _DD_TRACE_METRICS_OTEL_FLUSH_INTERVAL override branch

Adds test that exercises the setAndTrack call inside the conditional
that reads the internal flush interval override from the environment.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(config): align DD_TRACE_OTEL_SEMANTICS_ENABLED with registry definition

The config registry has this entry as a plain boolean with default "false"
and no internalPropertyName. Revert our custom mapping so the entry matches
the registry exactly — the validator compares against the registered definition.

All code that previously accessed config.otelSemanticsEnabled now reads
config.DD_TRACE_OTEL_SEMANTICS_ENABLED directly; the destructuring alias
in span_stats.js preserves the otelSemanticsEnabled local variable name.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(lint): add configurationNames for otelSemanticsEnabled to satisfy eslint-config-names-sync

The rule requires that every option name in index.d.ts has a corresponding
entry in supported-configurations.json (as a key, configurationNames value,
or internalPropertyName). Adding configurationNames: ["otelSemanticsEnabled"]
to DD_TRACE_OTEL_SEMANTICS_ENABLED satisfies this while keeping default: "false"
to match the registry. The generator uses configurationNames[0] as the config
key, so code reverts to config.otelSemanticsEnabled.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(config): remove internalPropertyName and unnecessary configurationNames

Per reviewer feedback:
- Remove internalPropertyName from _DD_TRACE_METRICS_OTEL_FLUSH_INTERVAL; the flush
  interval is read directly via getEnvironmentVariable() in #applyCalculated
- Remove configurationNames/internalPropertyName from OTEL_TRACES_SPAN_METRICS_ENABLED
  and drop otlpTraceMetricsEnabled as a programmatic option from index.d.ts; use
  this.OTEL_TRACES_SPAN_METRICS_ENABLED directly in #applyCalculated instead
- Remove configurationNames from DD_TRACE_OTEL_SEMANTICS_ENABLED and drop
  otelSemanticsEnabled as a programmatic option from index.d.ts; all callers
  now read config.DD_TRACE_OTEL_SEMANTICS_ENABLED directly

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(config): resolve Unknown Config properties for otlpTraceMetricsEnabled/ddTraceMetricsOtelFlushInterval

- Map setAndTrack to OTEL_TRACES_SPAN_METRICS_ENABLED (a declared config key)
  instead of the undeclared otlpTraceMetricsEnabled alias; all call sites
  updated to read config.OTEL_TRACES_SPAN_METRICS_ENABLED directly
- Move _DD_TRACE_METRICS_OTEL_FLUSH_INTERVAL reading from #applyCalculated
  into span_stats.js; import getEnvironmentVariable there directly — removes
  the undeclared ddTraceMetricsOtelFlushInterval setAndTrack write
- Update tests to use the env-var key names and remove the now-irrelevant
  config override test

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(lint): remove blank line before closing brace in config spec

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(config): generate GeneratedEnvVarConfig interface for env var types

- Extract getBaseType helper from getTypeForEntry to share base type computation
- Add getEnvVarType that only adds undefined when there is no registered default
- Add generateEnvVarConfigTypes to map every env var name (canonical + aliases) to its resolved type
- Append GeneratedEnvVarConfig interface to generated-config-types.d.ts

Rationale: Callers of getValueFromEnvSources need per-env-var typed return values instead of the full config property union, enabling type-safe lookups by literal env var name.

This commit made by [/dd:git:commit:quick](https://github.com/DataDog/claude-marketplace/tree/main/dd/commands/git/commit/quick.md)

* fix(span-stats): address PR review comments

- Gate OTLP-only SpanAggKey dimensions (origin, spanKind, rpcMethod,
  rpcStatusCode) on otlpTraceMetricsEnabled to avoid inflating legacy
  span stats aggregation key cardinality
- Thread _DD_TRACE_METRICS_OTEL_FLUSH_INTERVAL through the typed config
  system (via setAndTrack/getValueFromEnvSources) instead of reading
  the raw env var directly in SpanStatsProcessor
- Add config tests covering OTEL_TRACES_SPAN_METRICS_ENABLED auto-enable
  logic (both conditions, explicit override)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(span-stats): pass otlpEnabled=true in transformer test bucket helper

makeBucket is used exclusively by OtlpStatsTransformer tests, so spans
must be keyed with otlpEnabled=true to populate the OTLP-gated fields
(origin, spanKind, rpcMethod, rpcStatusCode).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* style(span-stats): minor style and readability cleanups

- Extract flush interval to variable before setAndTrack call
- Remove unnecessary quotes on property key
- Tighten test description wording

Rationale: Small consistency and readability improvements from PR review

This commit made by [/dd:git:commit:quick](https://github.com/DataDog/claude-marketplace/tree/main/dd/commands/git/commit/quick.md)

* fix(span-stats): remove rpc.method from otlp span stats aggregation key

- Drop grpc.method.name from SpanAggKey and OtlpStatsTransformer
- rpc.method inflates aggregation key cardinality without sufficient benefit
- Update all affected test assertions

This commit made by [/dd:git:commit:quick](https://github.com/DataDog/claude-marketplace/tree/main/dd/commands/git/commit/quick.md)

* refactor(span-stats): remove stale comments and clarify TODO

- Remove redundant inline comments in otlp-span-stats transformer
- Replace misleading comment about OTLP-only dimensions with a TODO
  noting origin and spanKind should eventually be included in legacy
  client stats aggregation

This commit made by [/dd:git:commit:quick](https://github.com/DataDog/claude-marketplace/tree/main/dd/commands/git/commit/quick.md)

* Apply suggestion from @mabdinur

* refactor(span-stats): remove redundant inline comments

- Drop comments that restate what the code already shows
- Keep code self-documenting per project style guidelines

This commit made by [/dd:git:commit:quick](https://github.com/DataDog/claude-marketplace/tree/main/dd/commands/git/commit/quick.md)

* fix(opentelemetry): fix max-len lint violation in span_processor.js

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(opentelemetry): fix max-len lint violation in span_processor.js

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: add otlp-span-stats exporter to CODEOWNERS

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(opentelemetry): encapsulate OTLP span stats in opentelemetry/metrics

Move OtlpStatsExporter and OtlpStatsTransformer into opentelemetry/metrics/
so the opentelemetry package is self-contained ahead of potential extraction
into its own npm package.

Key changes:
- Move exporters/otlp-span-stats/{index,transformer}.js to
  opentelemetry/metrics/otlp_span_stats_{exporter,transformer}.js
- Move buildResourceAttributes from span_stats.js to opentelemetry/metrics/index.js;
  add createOtlpSpanStatsExporter factory there
- Wire OtlpStatsExporter via DI: opentracing/tracer.js creates it when
  OTEL_TRACES_SPAN_METRICS_ENABLED and passes it through SpanProcessor to
  SpanStatsProcessor — span_stats.js no longer imports from opentelemetry/
- config/index.js mirrors OTEL_TRACES_SPAN_METRICS_ENABLED into
  stats.DD_TRACE_STATS_COMPUTATION_ENABLED so downstream checks are unified
- Remove otlpEnabled flag from SpanAggKey/SpanBuckets — origin, spanKind,
  rpcStatusCode are always populated
- Remove OTEL-specific check from AgentExporter (relies on mirrored flag)
- Remove CODEOWNERS entry for deleted exporters/otlp-span-stats/ path
- Move tests to test/opentelemetry/metrics/

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(opentelemetry): align grpc stats with libdatadog and enforce mutual exclusion

- Move GRPC_STATUS_CODE constant to ext/tags.js
- Emit rpc.response.status_code as string (aligns with libdatadog kv_str)
- Use else if in onInterval to make native and OTLP export mutually exclusive

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(opentelemetry): trim JSDoc, privatize transformer, drop empty description

- Make OtlpStatsExporter#transformer a private field (#transformer)
- Remove empty description field from histogram metric
- Trim redundant @param prose in exporter and transformer JSDoc
- Use GRPC_STATUS_CODE import in transformer spec instead of string literal

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(opentelemetry): suppress self-instrumentation spans from OTLP exporter requests

Wrap sendPayload's HTTP request in legacyStorage.run({ noop: true }) so the
tracer does not instrument its own outbound connections to the OTLP collector.
Without this, tcp.connect client spans for /v1/metrics requests were fed into
the traces.span.sdk.metrics.duration histogram, displacing real span data points
and inflating counts in the system-tests parametric suite.

Same pattern used by exporters/common/request.js and exporters/common/agents.js.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
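
The suppression pattern can be sketched with a plain `AsyncLocalStorage`. This is a stand-in for the tracer's `legacyStorage`, and `instrumentedConnect` and `sendPayload` here are simplified, hypothetical versions of the real hook and exporter method.

```javascript
'use strict'

const { AsyncLocalStorage } = require('node:async_hooks')

// Stand-in for the tracer's storage.
const storage = new AsyncLocalStorage()
const recorded = []

// Hypothetical instrumentation hook: skips span creation inside a noop store.
function instrumentedConnect (target) {
  if (storage.getStore()?.noop) return
  recorded.push(`tcp.connect ${target}`)
}

// Hypothetical exporter send: runs its own I/O inside a noop store so the
// hook above ignores it.
function sendPayload (target) {
  storage.run({ noop: true }, () => instrumentedConnect(target))
}

instrumentedConnect('db:5432') // application traffic: recorded
sendPayload('collector:4318') // exporter traffic: suppressed
```

Only the application connection ends up in `recorded`, so exporter requests no longer feed back into the histogram they are exporting.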

* test(opentelemetry): improve patch coverage for span stats OTLP export

- Inline URL parsing into OtlpHttpExporterBase constructor; remove setUrl
  (the if(telemetryTags !== undefined) branch was dead — telemetryTags is
  always undefined when setUrl was called from the constructor, and no
  external caller ever invoked it post-construction)
- Add tests for buildResourceAttributes (sdk identity, runtime-id, OTel-
  semantics mode) and createOtlpSpanStatsExporter
- Add tests for HTTP error response and request error paths in sendPayload

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(opentelemetry): emit raw grpc.status.code name for rpc.response.status_code

Prefer the meta status NAME string over the numeric metrics tag and emit it
upper-cased to rpc.response.status_code, aligning with the OTel gRPC semantic
conventions (canonical status name) without any code<->name mapping.

* fix(opentelemetry): restore setUrl method removed by dead-code cleanup

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(opentelemetry): trim redundant comments in grpc status mapping

Co-authored-by: Cursor <cursoragent@cursor.com>

* refactor(opentelemetry): read grpc.status.code from meta only

Drop the numeric metrics fallback; the gRPC status code is the canonical status
NAME and is read from span meta.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(opentelemetry): translate numeric grpc.status.code from metrics to status name

The dd gRPC plugin sets grpc.status.code as a numeric integer via
span.setTag, which span_format.js routes into span.metrics rather than
span.meta. SpanAggKey was reading meta only, so rpcStatusCode was always
empty for real gRPC spans.

Now falls back to span.metrics[GRPC_STATUS_CODE] and translates the
integer to the canonical status name (OK, NOT_FOUND, etc.) using the
gRPC status code table. Meta string takes priority when both are present.
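
A rough sketch of the lookup order described above, assuming a span shaped like `{ meta, metrics }`. The helper name `resolveRpcStatusCode` is hypothetical, and returning an empty string for unknown or missing codes is an assumption of this sketch. The name table follows the standard gRPC status codes 0 through 16.

```javascript
'use strict'

const GRPC_STATUS_CODE = 'grpc.status.code'

// Canonical gRPC status names, indexed by numeric code (0..16).
const GRPC_STATUS_NAMES = [
  'OK',
  'CANCELLED',
  'UNKNOWN',
  'INVALID_ARGUMENT',
  'DEADLINE_EXCEEDED',
  'NOT_FOUND',
  'ALREADY_EXISTS',
  'PERMISSION_DENIED',
  'RESOURCE_EXHAUSTED',
  'FAILED_PRECONDITION',
  'ABORTED',
  'OUT_OF_RANGE',
  'UNIMPLEMENTED',
  'INTERNAL',
  'UNAVAILABLE',
  'DATA_LOSS',
  'UNAUTHENTICATED',
]

// Hypothetical helper: the meta string wins when present; otherwise the
// numeric metric is translated to its canonical name.
function resolveRpcStatusCode (span) {
  const fromMeta = span.meta?.[GRPC_STATUS_CODE]
  if (typeof fromMeta === 'string' && fromMeta !== '') {
    return fromMeta.toUpperCase()
  }

  const fromMetrics = span.metrics?.[GRPC_STATUS_CODE]
  if (Number.isInteger(fromMetrics)) {
    // Assumption: codes outside the table resolve to an empty string.
    return GRPC_STATUS_NAMES[fromMetrics] ?? ''
  }

  return ''
}
```

For example, a span carrying only `metrics: { 'grpc.status.code': 5 }` resolves to `NOT_FOUND`, while a span that has both a meta string and a numeric metric resolves to the meta string.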

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(opentelemetry): split agg key by top-level to fix mixed-bucket OTLP metrics

When a bucket contained both top-level and measured non-top-level spans,
the heuristic (topLevelHits === hits) always resolved to false, causing
the OTLP histogram to be emitted as datadog.span.top_level=false and
dropping top-level traffic from APM metrics.

Adding topLevel as a dimension to SpanAggKey causes top-level and
non-top-level spans to bucket separately. Each bucket is now always
purely top-level or purely non-top-level, so the attribute is always
accurate. The native stats path is unaffected because toJSON() omits
topLevel; the Agent merges groups with identical key fields, preserving
the same Hits/TopLevelHits totals.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(opentelemetry): track per-top-level distributions to fix native stats regression

The previous fix added topLevel to SpanAggKey to separate top-level and
non-top-level spans into distinct buckets. This created a real regression
in the native /v0.6/stats path: the Agent's mergeDuplicates() correctly
sums Hits/Errors/Duration from duplicate rows but silently drops
TopLevelHits from the merged-away entry. If the non-top-level row is
processed first and becomes the canonical, TopLevelHits from the
top-level row is lost.

Fix: remove topLevel from SpanAggKey (no more duplicate rows). Instead,
split SpanAggStats into four distributions (topLevelOk, topLevelError,
nonTopLevelOk, nonTopLevelError). The native stats path merges them at
export time so toJSON() produces the same combined OkSummary/ErrorSummary
as before. The OTLP path emits separate data points per top-level status
with the correct datadog.span.top_level attribute. OTel-semantics mode
merges the distributions (no top-level attribute to distinguish them).

Also adds SpanKind, Origin, and RpcStatusCode to the native stats
toJSON() payload so the Agent receives these new dimensions.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(opentelemetry): split native stats rows by top-level status; use GRPCStatusCode key

toJSON() now returns an array of up to 2 rows (top-level row first, non-top-level
row second). #toLegacyPayload uses flatMap to flatten them. This eliminates the
merge-time DDSketch allocation and ensures TopLevelHits is always non-zero on
the top-level row, so the Agent's mergeDuplicates retains it as the canonical entry.

Duration and Errors are derived from distribution .sum/.count, removing the
redundant this.duration and this.errors accumulators.
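
The row-splitting idea can be sketched roughly as below. This is a simplified illustration under stated assumptions: `Distribution` is a hypothetical count/sum stand-in for the real DDSketch, `record` is a hypothetical entry point, and only the `Hits`, `TopLevelHits`, `Errors`, and `Duration` fields are modeled.

```javascript
'use strict'

// Hypothetical stand-in for a DDSketch: tracks only count and sum.
class Distribution {
  constructor () {
    this.count = 0
    this.sum = 0
  }

  accept (value) {
    this.count++
    this.sum += value
  }
}

class SpanAggStats {
  constructor () {
    this.topLevelOk = new Distribution()
    this.topLevelError = new Distribution()
    this.nonTopLevelOk = new Distribution()
    this.nonTopLevelError = new Distribution()
  }

  // Hypothetical entry point: route each span into one of four distributions.
  record ({ duration, error, topLevel }) {
    const dist = topLevel
      ? (error ? this.topLevelError : this.topLevelOk)
      : (error ? this.nonTopLevelError : this.nonTopLevelOk)
    dist.accept(duration)
  }

  // Returns up to two rows, top-level first. Hits, Errors, and Duration are
  // derived from the distributions, so no separate accumulators are kept.
  toJSON () {
    const groups = [
      [this.topLevelOk, this.topLevelError, true],
      [this.nonTopLevelOk, this.nonTopLevelError, false],
    ]
    const rows = []
    for (const [ok, err, topLevel] of groups) {
      const hits = ok.count + err.count
      if (hits === 0) continue
      rows.push({
        Hits: hits,
        TopLevelHits: topLevel ? hits : 0,
        Errors: err.count,
        Duration: ok.sum + err.sum,
      })
    }
    return rows
  }
}

const stats = new SpanAggStats()
stats.record({ duration: 10, error: false, topLevel: true })
stats.record({ duration: 5, error: true, topLevel: true })
stats.record({ duration: 7, error: false, topLevel: false })
const rows = stats.toJSON()
```

A caller holding many such stats objects could flatten their rows with `flatMap`, as the commit message describes for the v0.6 payload.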

GRPCStatusCode matches the agent's msgpack decoder key (confirmed from
pkg/proto/pbgo/trace/stats_gen.go).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(opentelemetry): minimize comments, move GRPC_STATUS_NAMES to constants

- Remove descriptive/narrating comments throughout; keep only non-obvious constraints
- Move GRPC_STATUS_NAMES from span_stats.js into constants.js
- Rename #toLegacyPayload -> #v06Payload
- Remove section comment from ext/tags.js

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(opentelemetry): restrict ORIGIN_KEY to synthetics boolean; remove origin from aggregation key

ORIGIN_KEY now only populates SpanAggKey.synthetics. The origin string field
is removed from SpanAggKey, toString(), and the v0.6 payload (Origin is not
a field the agent decodes). In the OTLP path, datadog.origin='synthetics' is
emitted when aggKey.synthetics is true.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: remove pr_description.md

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Munir <munir.abdinur@datadoghq.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
With `flushInterval: 0` (the test-agent config and the AWS Lambda config) the
processor armed `setInterval(onInterval, 0)`, which fires on every event-loop
tick instead of when a checkpoint is recorded. A tick landing in the window
between the test agent tearing its listener down and bringing it back up posts
to a dead port; the bucket is cleared on serialize, so that single payload is
lost. A producer-only DSM test waiting on exactly one payload then times out.

Honor `flushInterval === 0` as the flush-on-write sentinel the agent and
agentless trace exporters already use: skip the timer and push each checkpoint,
offset, and transaction the moment it is recorded, while the writer URL is live.
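
A minimal sketch of the flush-on-write sentinel, assuming a processor that buffers checkpoints in a bucket and hands them to a writer. `CheckpointProcessor`, `record`, and the `writer.flush` signature are hypothetical names for illustration, not the library's API.

```javascript
'use strict'

class CheckpointProcessor {
  constructor ({ flushInterval, writer }) {
    this.writer = writer
    this.bucket = []
    // flushInterval === 0 is treated as "flush on write": no timer is armed.
    this.flushOnWrite = flushInterval === 0

    if (!this.flushOnWrite) {
      this.timer = setInterval(() => this.flush(), flushInterval)
      // Do not keep the process alive just for the flush timer.
      this.timer.unref()
    }
  }

  record (checkpoint) {
    this.bucket.push(checkpoint)
    // Push immediately instead of waiting for a timer tick.
    if (this.flushOnWrite) this.flush()
  }

  flush () {
    if (this.bucket.length === 0) return
    const payload = this.bucket
    this.bucket = []
    this.writer.flush(payload)
  }
}

// Usage: with flushInterval 0, each recorded checkpoint is written at once.
const sent = []
const processor = new CheckpointProcessor({
  flushInterval: 0,
  writer: { flush: payload => sent.push(payload) },
})
processor.record('checkpoint-a')
processor.record('checkpoint-b')
```

Because nothing is sent unless something was recorded, an idle tick can no longer post a payload to a listener that is temporarily down.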
@dd-octo-sts dd-octo-sts Bot force-pushed the v5.111.0-proposal branch from 5c1620a to ade7b83 Compare July 2, 2026 16:22
* add breaking changes to release proposal

* fetch from master instead

* removes stable path fallback

* fix wrong look up
@dd-octo-sts dd-octo-sts Bot force-pushed the v5.111.0-proposal branch from ade7b83 to 1a8b486 Compare July 2, 2026 18:27
pabloerhard and others added 2 commits July 2, 2026 18:37
* docs(v6): updated documentation for v6

* update release date

* fix CI test session migration doc

* Fix for test-opt concerns
@dd-octo-sts dd-octo-sts Bot force-pushed the v5.111.0-proposal branch from 1a8b486 to b9822a7 Compare July 2, 2026 18:37
@rochdev rochdev marked this pull request as ready for review July 2, 2026 19:07
@rochdev rochdev requested review from a team as code owners July 2, 2026 19:07
@rochdev rochdev requested review from khanayan123 and tlhunter and removed request for a team July 2, 2026 19:07

@datadog-prod-us1-5 datadog-prod-us1-5 Bot left a comment

Datadog Autotest: PASS

PR aggregates 30+ merged features, fixes, and refactorings (AWS durable FAILED checkpoint replay, DataStreams flush-on-write, MongoDB nosql scoping, AppSec Lambda WAF, OpenTelemetry OTLP metrics, Vitest no-worker init refactor). All highest-risk behavioral changes carry comprehensive test coverage — attempt normalization, concurrent query isolation, checkpoint handling, and error cases all verified. No concrete bugs found; logic inversions and renamings are consistent and intentional.

📊 Validated against 9 scenarios · Open Bits AI session

🤖 Datadog Autotest · Commit b9822a7 · What is Autotest? · Any feedback? Reach out in #autotest

@rochdev rochdev merged commit 8db8e69 into v5.x Jul 2, 2026
648 of 652 checks passed
@rochdev rochdev deleted the v5.111.0-proposal branch July 2, 2026 19:16