[Debugger] Add memory pressure monitoring telemetry for Dynamic Instrumentation (observe-only) by dudikeleti · Pull Request #7834 · DataDog/dd-trace-dotnet

dudikeleti · 2025-11-17T15:52:49Z

Summary of changes

- Adds an observe-only memory pressure monitor for Dynamic Instrumentation.
- Samples runtime/system memory pressure around DI activity (no background timer).
- Emits low-cardinality telemetry only when entering or exiting high memory pressure, plus a one-shot counter if the monitor disables itself.
- Does not block, throttle, skip, or otherwise change probe behavior.

Reason for change

Dynamic Instrumentation can perform memory-sensitive work when capturing variables, creating snapshots, serializing payloads, and enqueueing/uploading data.

Before using memory pressure to affect DI behavior, we need production data showing how often high pressure occurs, what drives it, how severe each signal is at entry, how long episodes last, where the signal is even available, and whether it is stable enough for future throttling decisions. This PR adds exactly that observability and nothing else.

Implementation details

The monitor detects high pressure from concrete runtime/system signals only:

- Memory load ratio (fraction of available memory in use, so the value is comparable across platforms):
  - On .NET Core 3.1+, uses GC.GetGCMemoryInfo() (MemoryLoadBytes / TotalAvailableMemoryBytes). This is container/cgroup-aware and reports system/container-wide load rather than process-private bytes, which is the right scope for "are we close to an OOM".
  - On .NET Framework/Windows, uses GlobalMemoryStatusEx (dwMemoryLoad, a 0-100 percentage, converted to a ratio and clamped to [0,1]). This is machine-wide and not container/job-object aware.
  - On runtimes/platforms where neither is available, the memory signal is reported as unsupported.
- Gen2 collections per second:
  - Calculated from GC.CollectionCount(2) deltas over elapsed time. Identical and process-specific on every platform.
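
As a rough, language-neutral illustration of the two signals (written in Python rather than the PR's C#), the arithmetic can be sketched as below. The function names, the `None` return for an unavailable signal, and the handling of a counter that goes backwards are assumptions made for this example, not the PR's actual code.

```python
from typing import Optional


def memory_load_ratio(load_bytes: int, total_available_bytes: int) -> Optional[float]:
    """Return memory load as a fraction in [0, 1], or None when unsupported."""
    if total_available_bytes <= 0:
        return None  # assumption: a non-positive total means the signal is unavailable
    return min(1.0, max(0.0, load_bytes / total_available_bytes))


def gen2_per_second(prev_count: int, curr_count: int, elapsed_ms: int) -> Optional[float]:
    """Return Gen2 collections per second from a collection-count delta."""
    if elapsed_ms <= 0 or curr_count < prev_count:
        return None  # assumption: no elapsed time or a regressed counter yields no rate
    return (curr_count - prev_count) * 1000.0 / elapsed_ms
```

For example, four Gen2 collections observed over two seconds give a rate of 2.0 collections/sec.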

Sampling is activity-driven rather than always-on. DI calls into the monitor around capture/snapshot/upload-relevant work, and the monitor refreshes only when the last sample is stale (default min interval 1s). If DI is idle, the monitor does not continuously sample memory.
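
A minimal sketch of the activity-driven staleness check, in Python for illustration only. `StaleGate` and `should_sample` are hypothetical names; the real `RefreshIfStale()` is C# and its internals are not shown in this description. The only value taken from the text is the 1s default minimum interval.

```python
from typing import Optional


class StaleGate:
    """Decides whether a new sample is due, given a monotonic millisecond clock."""

    def __init__(self, min_interval_ms: int = 1000) -> None:
        self.min_interval_ms = min_interval_ms
        self.last_sample_ms: Optional[int] = None  # no sample taken yet

    def should_sample(self, now_ms: int) -> bool:
        """Return True only when the previous sample is older than the minimum interval."""
        if (self.last_sample_ms is not None
                and now_ms - self.last_sample_ms < self.min_interval_ms):
            return False  # last sample is still fresh; the caller does no work
        self.last_sample_ms = now_ms
        return True
```

Because the check runs only when DI activity calls it, an idle application never reaches the sampling code at all.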

Time is read from Environment.TickCount64 on .NET Core 3.0+ (a monotonic Stopwatch-based fallback is used on net461/netstandard2.0), so the hot RefreshIfStale() fast path is a couple of volatile reads with no clock abstraction.

If neither memory nor GC signals are available on the host, or a provider throws, the monitor permanently disables itself (it never samples again), writes a single log line (debug for "no signals", error for an exception), and records a one-shot disabled counter tagged by reason so a silent zero in the transition metrics is explainable rather than ambiguous.
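
The self-disable behavior can be pictured roughly as follows. This is an illustrative Python sketch under stated assumptions: the class name, the provider callables, and the dictionary standing in for the telemetry counter are all hypothetical, and logging is omitted.

```python
from typing import Callable, Dict, Optional, Tuple


class SelfDisablingSampler:
    """Samples two optional signals and permanently disables itself on failure."""

    def __init__(self,
                 read_memory: Callable[[], Optional[float]],
                 read_gen2: Callable[[], Optional[float]]) -> None:
        self._read_memory = read_memory
        self._read_gen2 = read_gen2
        self.disabled_reason: Optional[str] = None
        self.disabled_counts: Dict[str, int] = {}  # stand-in for the disabled{reason} counter

    def sample(self) -> Optional[Tuple[Optional[float], Optional[float]]]:
        if self.disabled_reason is not None:
            return None  # never samples again once disabled
        try:
            memory = self._read_memory()
            gen2 = self._read_gen2()
        except Exception:
            self._disable("error")
            return None
        if memory is None and gen2 is None:
            self._disable("no_signals")
            return None
        return memory, gen2

    def _disable(self, reason: str) -> None:
        # Reached at most once, because sample() returns early afterwards.
        self.disabled_reason = reason
        self.disabled_counts[reason] = self.disabled_counts.get(reason, 0) + 1
```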

Telemetry is emitted on transitions only (enter high pressure / exit high pressure):

- Transition count tagged by state (enter/exit) and trigger (memory/gen2/both on enter; none on exit), i.e. which signal drove entry.
- Severity values on transition: memory usage percent and Gen2 collections/sec (tagged by state).
- Duration of the high-pressure period, recorded on exit.

No high-cardinality tags such as probe ID, service name, file path, exception type, or method name are added.
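
To make the transition-only emission concrete, here is a small Python sketch. The names `classify_trigger` and `TransitionEmitter` are hypothetical, the event list stands in for the telemetry collector, and the inputs are assumed to be per-signal "high" flags that have already been through thresholding.

```python
from typing import List, Optional, Tuple

Event = Tuple[str, str, Optional[int]]  # (state, trigger, duration_ms)


def classify_trigger(memory_high: bool, gen2_high: bool) -> str:
    """Map the per-signal flags to the trigger tag value."""
    if memory_high and gen2_high:
        return "both"
    if memory_high:
        return "memory"
    if gen2_high:
        return "gen2"
    return "none"


class TransitionEmitter:
    """Records an event only when the high-pressure state changes."""

    def __init__(self) -> None:
        self.in_high_pressure = False
        self.entered_at_ms = 0
        self.events: List[Event] = []

    def observe(self, memory_high: bool, gen2_high: bool, now_ms: int) -> None:
        high = memory_high or gen2_high
        if high and not self.in_high_pressure:
            self.in_high_pressure = True
            self.entered_at_ms = now_ms
            self.events.append(("enter", classify_trigger(memory_high, gen2_high), None))
        elif not high and self.in_high_pressure:
            self.in_high_pressure = False
            self.events.append(("exit", "none", now_ms - self.entered_at_ms))
        # State unchanged: nothing is emitted.
```

In this sketch, a second signal crossing its threshold while already in high pressure emits nothing, since the trigger tag describes only what drove entry.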

flowchart TD
    diStart["DynamicInstrumentation starts"] --> createMonitor["Create MemoryPressureMonitor"]
    createMonitor --> idleMonitor["No background sampling while idle"]

    probeActivity["DI capture/snapshot/upload activity"] --> refreshIfStale["RefreshIfStale (min interval)"]
    refreshIfStale --> signalsAvailable{"Memory or GC signal available?"}
    signalsAvailable -->|"No / provider error"| disable["Disable monitor (logged) + disabled{reason}"]
    signalsAvailable -->|"Yes"| sampleSignals["Sample memory load and Gen2/sec"]
    sampleSignals --> pressureDecision{"High pressure transition?"}

    pressureDecision -->|"ENTER"| enterTelemetry["transitions{state:enter, trigger:*} + severity"]
    pressureDecision -->|"EXIT"| exitTelemetry["transitions{state:exit, trigger:none} + severity + duration"]
    pressureDecision -->|"No transition"| noOp["No telemetry emission"]

    enterTelemetry --> observeOnly["No DI behavior change"]
    exitTelemetry --> observeOnly
    noOp --> observeOnly
    disable --> observeOnly

    diStart --> dispose["DynamicInstrumentation.Dispose"]
    dispose --> stopMonitor["Dispose monitor (lock-free)"]

Risk is kept low by avoiding per-second telemetry gauges, background sampling while idle, probe-level cardinality, and allocation on hot paths. Refreshes are guarded by a non-blocking single-writer CAS so concurrent DI activity does not run overlapping samples.
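
The non-blocking single-writer idea can be sketched as below. Python has no direct compare-and-swap primitive, so this example substitutes a non-blocking `threading.Lock` acquire for the CAS-guarded flag; the class and method names are hypothetical.

```python
import threading
from typing import Callable


class SingleWriterRefresh:
    """Lets at most one caller sample at a time; everyone else skips without waiting."""

    def __init__(self, take_sample: Callable[[], None]) -> None:
        self._take_sample = take_sample
        self._busy = threading.Lock()  # stand-in for a CAS-guarded "refreshing" flag

    def try_refresh(self) -> bool:
        """Run the sample if no other refresh is in flight; never block the caller."""
        if not self._busy.acquire(blocking=False):
            return False  # another caller is already sampling, so skip
        try:
            self._take_sample()
            return True
        finally:
            self._busy.release()
```

The losing caller returns immediately instead of queueing, which is what keeps probe-path latency independent of how long a sample takes.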

Metrics implementation

All metrics are instrumentation-telemetry metrics, defined as common metrics (isCommon: true) under the live_debugger namespace. They surface in the backend as dd.instrumentation_telemetry_data.live_debugger.memory_pressure.*. Tags are fixed, low-cardinality enums.

| Metric | Type | Tags | Emitted |
| --- | --- | --- | --- |
| `memory_pressure.transitions` | count | `state` = `enter`\|`exit`, `trigger` = `none`\|`memory`\|`gen2`\|`both` | Once per high-pressure state change. `trigger` is the entry cause on `enter`; `none` on `exit`. |
| `memory_pressure.disabled` | count | `reason` = `no_signals`\|`error` | Once, when the monitor permanently disables itself (no signals available, or a provider threw). |
| `memory_pressure.memory_usage_pct` | count | `state` = `enter`\|`exit`, `bucket` = `lt_70`\|`70_80`\|`80_85`\|`85_90`\|`gte_90` | On each transition: count bucketed by memory load percent at that moment. |
| `memory_pressure.gen2_per_sec` | count | `state` = `enter`\|`exit`, `bucket` = `lt_1`\|`1_2`\|`2_5`\|`gte_5` | On each transition: count bucketed by Gen2 collections/sec at that moment. |
| `memory_pressure.duration_ms` | count | `bucket` = `lt_1s`\|`1_5s`\|`5_30s`\|`gte_30s` | On `exit`: count bucketed by length of the high-pressure episode. |

Runtime/platform segmentation (runtime name, OS, architecture, tracer version) is not added as per-metric tags; it comes from the telemetry payload's application/host metadata and is applied at query time in the backend. This keeps series count bounded while still allowing per-runtime analysis.

The severity and duration metrics use normal telemetry count metrics with a fixed bucket tag instead of distributions. Distribution telemetry is not supported for this internal telemetry path, and bucketed counts preserve the decision-making signal with bounded cardinality.
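
A possible way to map raw values onto the fixed bucket tags is sketched below in Python. The bucket labels come from the metrics table; the function names are hypothetical, and treating each bucket as lower-bound inclusive (for example, exactly 70% lands in `70_80`) is an assumption, since the boundary handling is not stated here.

```python
from bisect import bisect_right
from typing import Sequence


def _bucket(value: float, edges: Sequence[float], labels: Sequence[str]) -> str:
    """Pick the label whose range contains value; edges are inclusive lower bounds."""
    return labels[bisect_right(edges, value)]


def memory_usage_bucket(pct: float) -> str:
    return _bucket(pct, [70, 80, 85, 90],
                   ["lt_70", "70_80", "80_85", "85_90", "gte_90"])


def gen2_per_sec_bucket(rate: float) -> str:
    return _bucket(rate, [1, 2, 5], ["lt_1", "1_2", "2_5", "gte_5"])


def duration_bucket(duration_ms: float) -> str:
    return _bucket(duration_ms, [1000, 5000, 30000],
                   ["lt_1s", "1_5s", "5_30s", "gte_30s"])
```

With five, four, and four buckets respectively, the number of series stays fixed no matter how many transitions occur.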

The trigger tag lives only on the transition count, not on the severity bucket counts: if trigger:memory, the memory bucket already captures how severe the memory signal was, and splitting every bucket by trigger would mostly restate the transition breakdown with extra series. The transition count's trigger plus the state-tagged bucket counts answer "what drove it" and "how severe / how long".

Dashboards and how we will use the data

The goal is a single dashboard that answers, per runtime/OS, whether DI should ever gate capture under memory pressure and, if so, on which signal and at what threshold.

| Widget | Source metric | Question | Decision it informs |
| --- | --- | --- | --- |
| Coverage timeseries / count by `reason` | `memory_pressure.disabled` | Where does the monitor not run at all? | Defines the observed population: distinguishes "no pressure" from "no data". |
| Enter rate timeseries, split by `trigger` | `memory_pressure.transitions{state:enter}` | How often does high pressure happen, and is it memory- or GC-driven? | Whether gating is worth building at all, and which signal to gate on. |
| Trigger breakdown (top-list/pie) | `memory_pressure.transitions{state:enter}` | `memory` vs `gen2` vs `both` mix | Which threshold actually matters. |
| Memory severity bucket breakdown | `memory_pressure.memory_usage_pct{state:enter}` | How deep is memory pressure at entry? | Where to set a memory gating threshold. |
| GC severity bucket breakdown | `memory_pressure.gen2_per_sec{state:enter}` | How hot is GC at entry? | Where to set a gen2 gating threshold. |
| Episode duration bucket breakdown | `memory_pressure.duration_ms` | How long do episodes last? | Whether gating needs hysteresis/cooldown or episodes are too short to act on. |

Every widget can be grouped by the dimensions the telemetry intake stamps onto each series from the payload's application/host metadata.

How the data is gathered:

- These metrics flow through the existing tracer telemetry pipeline (generate-metrics payloads) to the Datadog backend, with no new transport.
- Because emission is transition-only and activity-driven, idle apps cost nothing and noisy apps cannot spam per-second points.
- We collect across runtimes for a few weeks, then read the dashboard top-down: memory_pressure.disabled (coverage) → memory_pressure.transitions{state:enter} by trigger (frequency + cause) → severity distributions (thresholds) → duration (persistence).
- The resulting numbers define concrete gating thresholds (or show that gating is unnecessary) for a follow-up PR.

Scope: what this telemetry answers (and what it doesn't)

This is a fleet-aggregate, pre-decision dataset: it tells us whether high memory pressure occurs during DI work often, severely, and long enough to justify gating capture, and on which signal at what threshold.
It is deliberately not a tool for measuring the impact of a future gate.

Test coverage

Added/updated tests cover:

- Memory threshold enter/exit behavior and hysteresis.
- Gen2/sec threshold and rate behavior.
- Consecutive high/low cycle (debounce) behavior.
- Unavailable memory/GC signals → self-disable with reason:no_signals, no further sampling.
- Provider exception handling → self-disable with reason:error, no crash.
- Refresh overlap handling (non-blocking single writer) and dispose during an in-flight refresh.
- Dispose/lifecycle behavior (no further sampling or emission after dispose).
- Activity-driven stale refresh (no sampling while idle; samples only when stale).
- Transition telemetry: enter emitted once, exit emitted once, severity recorded on transition, duration recorded on exit, no repeat while state is unchanged.
- Trigger classification: gen2 when only GC drives entry, both when memory and GC cross together, memory/none on the enter/exit pair.
- Extreme/edge configuration values are clamped to safe minimums.
- Windows GlobalMemoryStatusEx interop returns valid values.
- DI integration: the monitor is sampled only through the dedicated hook and is disposed by DynamicInstrumentation.
- Telemetry aggregation: transitions tagged by state+trigger, the disabled counter, and the three bucketed count metrics aggregate into the expected series/tags.

Other details

Initial thresholds are observation thresholds, not throttling thresholds:

- Memory pressure threshold: 0.85.
- Memory exit threshold with default hysteresis: 0.80.
- Gen2 threshold: 2 collections/sec.
- Gen2 exit threshold with default hysteresis: 1 collection/sec.

These values are intentionally conservative starting points for data collection and can be tuned after production telemetry is available.
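
The effect of separate enter and exit thresholds can be sketched in a few lines of Python. The function name is hypothetical, and the exact comparison operators (entering at `>=` the enter threshold, leaving only when strictly below the exit threshold) are assumptions; the debounce over consecutive samples mentioned under test coverage is not modeled here.

```python
def next_high_state(value: float, currently_high: bool,
                    enter_threshold: float, exit_threshold: float) -> bool:
    """Apply hysteresis: a higher bar to enter high pressure than to remain in it."""
    if currently_high:
        return value >= exit_threshold  # stay high until the value drops below exit
    return value >= enter_threshold


# Memory load samples evaluated against the default 0.85 / 0.80 thresholds.
state = False
history = []
for load in [0.84, 0.86, 0.82, 0.79, 0.81]:
    state = next_high_state(load, state, enter_threshold=0.85, exit_threshold=0.80)
    history.append(state)
print(history)
```

The sample at 0.82 keeps the state high even though it is below 0.85, while the later 0.81 does not re-enter. That gap is what prevents a load hovering around a single threshold from producing a stream of enter/exit transitions.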

This PR intentionally does not include an explicit/manual pressure event signal. There is no concrete production caller or well-defined meaning for that signal yet. If a future DI path has a specific condition, such as snapshot capture exceeding a memory budget, it should add a named API and metric for that scenario.

Any future DI throttling or blocking based on memory pressure should be done in a separate PR after reviewing the collected telemetry.

Note

Incidental DynamicInstrumentation lifecycle hardening (not memory-pressure specific): Wiring the new monitor in as a disposable owned by DynamicInstrumentation exposed pre-existing races in the start/subscribe/dispose paths. DynamicInstrumentation is disposed at runtime via RCM, not just at shutdown. Since the monitor's lifecycle is tied to those paths, we hardened them here.

Copilot

Pull request overview

This PR adds an observe-only memory pressure monitor for Dynamic Instrumentation (DI) and emits low-cardinality telemetry on high-pressure transitions (enter/exit), plus a one-shot “disabled” counter when signals are unavailable or the monitor errors.

Changes:

Introduces MemoryPressureMonitor + supporting system/GC signal readers (incl. Windows interop) and wires it into DI via a dedicated refresh hook.
Adds new telemetry metric definitions (counts + shared distributions) and regenerates telemetry collector code for all TFMs.
Adds focused unit/integration tests validating monitor behavior, transitions, and telemetry aggregation.

Reviewed changes

Copilot reviewed 14 out of 54 changed files in this pull request and generated 4 comments.


File	Description
tracer/src/Datadog.Trace/Debugger/RateLimiting/MemoryPressureMonitor.cs	New monitor implementation + transition/disable telemetry emission
tracer/src/Datadog.Trace/Debugger/RateLimiting/MemoryPressureConfig.cs	Configuration defaults and thresholds/hysteresis knobs
tracer/src/Datadog.Trace/Debugger/RateLimiting/SystemMemorySource.cs	Production memory/GC signal providers
tracer/src/Datadog.Trace/Debugger/RateLimiting/WindowsMemoryInfo.cs	Windows `GlobalMemoryStatusEx` interop helper
tracer/src/Datadog.Trace/Debugger/DynamicInstrumentation.cs	Owns/disposes monitor and exposes dedicated refresh hook
tracer/src/Datadog.Trace/Debugger/Expressions/ProbeProcessor.cs	Calls the dedicated memory-pressure refresh hook around capture-related activity
tracer/src/Datadog.Trace/Debugger/DebuggerFactory.cs	Constructs and injects the monitor into DI
tracer/src/Datadog.Trace/Debugger/Caching/DefaultMemoryChecker.cs	Reuses new Windows memory helper instead of local interop copy
tracer/src/Datadog.Trace/Telemetry/Metrics/MetricTags.cs	Adds low-cardinality tag enums for debugger memory-pressure metrics
tracer/src/Datadog.Trace/Telemetry/Metrics/Count.cs	Adds count metrics for transitions + disabled reasons
tracer/src/Datadog.Trace/Telemetry/Metrics/DistributionShared.cs	Adds shared distributions for severity + duration
tracer/src/Datadog.Trace/Generated/netstandard2.0/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/NullMetricsTelemetryCollector_DistributionShared.g.cs	Regenerated: null-collector stubs for new shared distributions
tracer/src/Datadog.Trace/Generated/netstandard2.0/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/NullMetricsTelemetryCollector_Count.g.cs	Regenerated: null-collector stubs for new counts
tracer/src/Datadog.Trace/Generated/netstandard2.0/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/MetricsTelemetryCollector_DistributionShared.g.cs	Regenerated: buffers + recorders for new shared distributions
tracer/src/Datadog.Trace/Generated/netstandard2.0/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/MetricsTelemetryCollector_Count.g.cs	Regenerated: buffers + recorders for new counts
tracer/src/Datadog.Trace/Generated/netstandard2.0/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/IMetricsTelemetryCollector_DistributionShared.g.cs	Regenerated: interface methods for new shared distributions
tracer/src/Datadog.Trace/Generated/netstandard2.0/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/IMetricsTelemetryCollector_Count.g.cs	Regenerated: interface methods for new counts
tracer/src/Datadog.Trace/Generated/netstandard2.0/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/DistributionSharedExtensions.g.cs	Regenerated: names/common-flag/namespaces for new shared distributions
tracer/src/Datadog.Trace/Generated/netstandard2.0/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/CountExtensions.g.cs	Regenerated: names/common-flag/namespaces for new counts
tracer/src/Datadog.Trace/Generated/netstandard2.0/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/CiVisibilityMetricsTelemetryCollector_DistributionShared.g.cs	Regenerated: CI visibility collector distribution buffers updated
tracer/src/Datadog.Trace/Generated/netstandard2.0/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/CiVisibilityMetricsTelemetryCollector_Count.g.cs	Regenerated: CI visibility collector count buffers updated
tracer/src/Datadog.Trace/Generated/netcoreapp3.1/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/NullMetricsTelemetryCollector_DistributionShared.g.cs	Regenerated: null-collector stubs for new shared distributions
tracer/src/Datadog.Trace/Generated/netcoreapp3.1/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/NullMetricsTelemetryCollector_Count.g.cs	Regenerated: null-collector stubs for new counts
tracer/src/Datadog.Trace/Generated/netcoreapp3.1/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/MetricsTelemetryCollector_DistributionShared.g.cs	Regenerated: buffers + recorders for new shared distributions
tracer/src/Datadog.Trace/Generated/netcoreapp3.1/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/MetricsTelemetryCollector_Count.g.cs	Regenerated: buffers + recorders for new counts
tracer/src/Datadog.Trace/Generated/netcoreapp3.1/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/IMetricsTelemetryCollector_DistributionShared.g.cs	Regenerated: interface methods for new shared distributions
tracer/src/Datadog.Trace/Generated/netcoreapp3.1/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/IMetricsTelemetryCollector_Count.g.cs	Regenerated: interface methods for new counts
tracer/src/Datadog.Trace/Generated/netcoreapp3.1/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/DistributionSharedExtensions.g.cs	Regenerated: names/common-flag/namespaces for new shared distributions
tracer/src/Datadog.Trace/Generated/netcoreapp3.1/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/CountExtensions.g.cs	Regenerated: names/common-flag/namespaces for new counts
tracer/src/Datadog.Trace/Generated/netcoreapp3.1/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/CiVisibilityMetricsTelemetryCollector_DistributionShared.g.cs	Regenerated: CI visibility collector distribution buffers updated
tracer/src/Datadog.Trace/Generated/netcoreapp3.1/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/CiVisibilityMetricsTelemetryCollector_Count.g.cs	Regenerated: CI visibility collector count buffers updated
tracer/src/Datadog.Trace/Generated/net6.0/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/NullMetricsTelemetryCollector_DistributionShared.g.cs	Regenerated: null-collector stubs for new shared distributions
tracer/src/Datadog.Trace/Generated/net6.0/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/NullMetricsTelemetryCollector_Count.g.cs	Regenerated: null-collector stubs for new counts
tracer/src/Datadog.Trace/Generated/net6.0/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/MetricsTelemetryCollector_DistributionShared.g.cs	Regenerated: buffers + recorders for new shared distributions
tracer/src/Datadog.Trace/Generated/net6.0/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/MetricsTelemetryCollector_Count.g.cs	Regenerated: buffers + recorders for new counts
tracer/src/Datadog.Trace/Generated/net6.0/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/IMetricsTelemetryCollector_DistributionShared.g.cs	Regenerated: interface methods for new shared distributions
tracer/src/Datadog.Trace/Generated/net6.0/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/IMetricsTelemetryCollector_Count.g.cs	Regenerated: interface methods for new counts
tracer/src/Datadog.Trace/Generated/net6.0/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/DistributionSharedExtensions.g.cs	Regenerated: names/common-flag/namespaces for new shared distributions
tracer/src/Datadog.Trace/Generated/net6.0/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/CountExtensions.g.cs	Regenerated: names/common-flag/namespaces for new counts
tracer/src/Datadog.Trace/Generated/net6.0/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/CiVisibilityMetricsTelemetryCollector_DistributionShared.g.cs	Regenerated: CI visibility collector distribution buffers updated
tracer/src/Datadog.Trace/Generated/net6.0/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/CiVisibilityMetricsTelemetryCollector_Count.g.cs	Regenerated: CI visibility collector count buffers updated
tracer/src/Datadog.Trace/Generated/net461/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/NullMetricsTelemetryCollector_DistributionShared.g.cs	Regenerated: null-collector stubs for new shared distributions
tracer/src/Datadog.Trace/Generated/net461/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/NullMetricsTelemetryCollector_Count.g.cs	Regenerated: null-collector stubs for new counts
tracer/src/Datadog.Trace/Generated/net461/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/MetricsTelemetryCollector_DistributionShared.g.cs	Regenerated: buffers + recorders for new shared distributions
tracer/src/Datadog.Trace/Generated/net461/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/MetricsTelemetryCollector_Count.g.cs	Regenerated: buffers + recorders for new counts
tracer/src/Datadog.Trace/Generated/net461/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/IMetricsTelemetryCollector_DistributionShared.g.cs	Regenerated: interface methods for new shared distributions
tracer/src/Datadog.Trace/Generated/net461/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/IMetricsTelemetryCollector_Count.g.cs	Regenerated: interface methods for new counts
tracer/src/Datadog.Trace/Generated/net461/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/DistributionSharedExtensions.g.cs	Regenerated: names/common-flag/namespaces for new shared distributions
tracer/src/Datadog.Trace/Generated/net461/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/CountExtensions.g.cs	Regenerated: names/common-flag/namespaces for new counts
tracer/src/Datadog.Trace/Generated/net461/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/CiVisibilityMetricsTelemetryCollector_DistributionShared.g.cs	Regenerated: CI visibility collector distribution buffers updated
tracer/src/Datadog.Trace/Generated/net461/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/CiVisibilityMetricsTelemetryCollector_Count.g.cs	Regenerated: CI visibility collector count buffers updated
tracer/test/Datadog.Trace.Tests/Debugger/RateLimiting/MemoryPressureMonitorTests.cs	New unit tests for thresholds, hysteresis, disable paths, and concurrency behavior
tracer/test/Datadog.Trace.Tests/Debugger/DynamicInstrumentationTests.cs	Integration tests for refresh hook usage + monitor disposal
tracer/test/Datadog.Trace.Tests/Telemetry/Collectors/MetricsTelemetryCollectorTests.cs	Adds aggregation test for new memory-pressure metrics


Copilot

Pull request overview

Copilot reviewed 14 out of 54 changed files in this pull request and generated 2 comments.

dd-trace-dotnet-ci-bot · 2026-05-29T01:28:28Z

Execution-Time Benchmarks Report ⏱️

Execution-time results for samples comparing This PR (7834) and master.

✅ No regressions detected - check the details below

Full Metrics Comparison

FakeDbCommand

Metric	Master (Mean ± 95% CI)	Current (Mean ± 95% CI)	Change	Status
.NET Framework 4.8 - Baseline
duration	76.09 ± (75.90 - 76.46) ms	75.83 ± (75.68 - 76.27) ms	-0.3%	✅
.NET Framework 4.8 - Bailout
duration	78.52 ± (78.32 - 78.80) ms	78.36 ± (78.35 - 78.89) ms	-0.2%	✅
.NET Framework 4.8 - CallTarget+Inlining+NGEN
duration	1105.77 ± (1105.04 - 1112.66) ms	1106.19 ± (1107.63 - 1114.94) ms	+0.0%	✅⬆️
.NET Core 3.1 - Baseline
process.internal_duration_ms	22.52 ± (22.47 - 22.58) ms	22.75 ± (22.68 - 22.81) ms	+1.0%	✅⬆️
process.time_to_main_ms	85.10 ± (84.80 - 85.40) ms	85.66 ± (85.36 - 85.95) ms	+0.7%	✅⬆️
runtime.dotnet.exceptions.count	0 ± (0 - 0)	0 ± (0 - 0)	+0.0%	✅
runtime.dotnet.mem.committed	10.91 ± (10.91 - 10.92) MB	10.92 ± (10.92 - 10.93) MB	+0.1%	✅⬆️
runtime.dotnet.threads.count	12 ± (12 - 12)	12 ± (12 - 12)	+0.0%	✅
.NET Core 3.1 - Bailout
process.internal_duration_ms	22.39 ± (22.34 - 22.43) ms	22.27 ± (22.23 - 22.31) ms	-0.5%	✅
process.time_to_main_ms	86.15 ± (85.94 - 86.37) ms	84.90 ± (84.69 - 85.10) ms	-1.5%	✅
runtime.dotnet.exceptions.count	0 ± (0 - 0)	0 ± (0 - 0)	+0.0%	✅
runtime.dotnet.mem.committed	10.95 ± (10.94 - 10.95) MB	10.95 ± (10.95 - 10.96) MB	+0.0%	✅⬆️
runtime.dotnet.threads.count	13 ± (13 - 13)	13 ± (13 - 13)	+0.0%	✅
.NET Core 3.1 - CallTarget+Inlining+NGEN
process.internal_duration_ms	214.93 ± (214.00 - 215.85) ms	212.40 ± (211.56 - 213.25) ms	-1.2%	✅
process.time_to_main_ms	544.18 ± (542.88 - 545.49) ms	539.75 ± (538.48 - 541.02) ms	-0.8%	✅
runtime.dotnet.exceptions.count	0 ± (0 - 0)	0 ± (0 - 0)	+0.0%	✅
runtime.dotnet.mem.committed	48.20 ± (48.16 - 48.25) MB	48.37 ± (48.34 - 48.40) MB	+0.3%	✅⬆️
runtime.dotnet.threads.count	28 ± (28 - 28)	28 ± (28 - 28)	+0.3%	✅⬆️
.NET 6 - Baseline
process.internal_duration_ms	21.27 ± (21.23 - 21.31) ms	21.64 ± (21.58 - 21.69) ms	+1.7%	✅⬆️
process.time_to_main_ms	73.40 ± (73.24 - 73.57) ms	76.41 ± (76.10 - 76.73) ms	+4.1%	✅⬆️
runtime.dotnet.exceptions.count	0 ± (0 - 0)	0 ± (0 - 0)	+0.0%	✅
runtime.dotnet.mem.committed	10.63 ± (10.63 - 10.63) MB	10.63 ± (10.63 - 10.64) MB	+0.0%	✅⬆️
runtime.dotnet.threads.count	10 ± (10 - 10)	10 ± (10 - 10)	+0.0%	✅
.NET 6 - Bailout
process.internal_duration_ms	21.82 ± (21.76 - 21.87) ms	21.10 ± (21.06 - 21.14) ms	-3.3%	✅
process.time_to_main_ms	78.27 ± (78.02 - 78.52) ms	75.11 ± (74.93 - 75.29) ms	-4.0%	✅
runtime.dotnet.exceptions.count	0 ± (0 - 0)	0 ± (0 - 0)	+0.0%	✅
runtime.dotnet.mem.committed	10.74 ± (10.74 - 10.75) MB	10.74 ± (10.74 - 10.74) MB	-0.0%	✅
runtime.dotnet.threads.count	11 ± (11 - 11)	11 ± (11 - 11)	+0.0%	✅
.NET 6 - CallTarget+Inlining+NGEN
process.internal_duration_ms	368.48 ± (366.34 - 370.61) ms	368.57 ± (366.58 - 370.55) ms	+0.0%	✅⬆️
process.time_to_main_ms	549.44 ± (547.97 - 550.91) ms	554.14 ± (552.73 - 555.56) ms	+0.9%	✅⬆️
runtime.dotnet.exceptions.count	0 ± (0 - 0)	0 ± (0 - 0)	+0.0%	✅
runtime.dotnet.mem.committed	49.78 ± (49.76 - 49.80) MB	49.79 ± (49.77 - 49.81) MB	+0.0%	✅⬆️
runtime.dotnet.threads.count	28 ± (28 - 28)	28 ± (28 - 28)	+0.0%	✅⬆️
.NET 8 - Baseline
process.internal_duration_ms	19.43 ± (19.40 - 19.47) ms	19.62 ± (19.56 - 19.67) ms	+1.0%	✅⬆️
process.time_to_main_ms	72.96 ± (72.80 - 73.12) ms	74.31 ± (74.00 - 74.62) ms	+1.9%	✅⬆️
runtime.dotnet.exceptions.count	0 ± (0 - 0)	0 ± (0 - 0)	+0.0%	✅
runtime.dotnet.mem.committed	7.66 ± (7.66 - 7.67) MB	7.67 ± (7.67 - 7.67) MB	+0.1%	✅⬆️
runtime.dotnet.threads.count	10 ± (10 - 10)	10 ± (10 - 10)	+0.0%	✅
.NET 8 - Bailout
process.internal_duration_ms	19.36 ± (19.33 - 19.39) ms	19.62 ± (19.56 - 19.67) ms	+1.3%	✅⬆️
process.time_to_main_ms	73.75 ± (73.57 - 73.94) ms	76.30 ± (76.00 - 76.59) ms	+3.5%	✅⬆️
runtime.dotnet.exceptions.count	0 ± (0 - 0)	0 ± (0 - 0)	+0.0%	✅
runtime.dotnet.mem.committed	7.71 ± (7.71 - 7.72) MB	7.73 ± (7.72 - 7.73) MB	+0.2%	✅⬆️
runtime.dotnet.threads.count	11 ± (11 - 11)	11 ± (11 - 11)	+0.0%	✅
.NET 8 - CallTarget+Inlining+NGEN
process.internal_duration_ms	293.43 ± (291.42 - 295.44) ms	295.99 ± (293.70 - 298.28) ms	+0.9%	✅⬆️
process.time_to_main_ms	502.11 ± (500.92 - 503.31) ms	500.04 ± (498.81 - 501.27) ms	-0.4%	✅
runtime.dotnet.exceptions.count	0 ± (0 - 0)	0 ± (0 - 0)	+0.0%	✅
runtime.dotnet.mem.committed	36.81 ± (36.78 - 36.84) MB	36.85 ± (36.82 - 36.88) MB	+0.1%	✅⬆️
runtime.dotnet.threads.count	27 ± (27 - 27)	27 ± (27 - 27)	+0.0%	✅

HttpMessageHandler

Metric	Master (Mean ± 95% CI)	Current (Mean ± 95% CI)	Change	Status
.NET Framework 4.8 - Baseline
duration	200.89 ± (200.87 - 201.63) ms	200.62 ± (200.14 - 201.02) ms	-0.1%	✅
.NET Framework 4.8 - Bailout
duration	204.05 ± (203.56 - 204.42) ms	204.08 ± (203.66 - 204.51) ms	+0.0%	✅⬆️
.NET Framework 4.8 - CallTarget+Inlining+NGEN
duration	1213.30 ± (1211.42 - 1217.19) ms	1210.60 ± (1209.36 - 1216.61) ms	-0.2%	✅
.NET Core 3.1 - Baseline
process.internal_duration_ms	196.78 ± (196.27 - 197.29) ms	194.78 ± (194.38 - 195.18) ms	-1.0%	✅
process.time_to_main_ms	85.56 ± (85.25 - 85.86) ms	85.07 ± (84.78 - 85.36) ms	-0.6%	✅
runtime.dotnet.exceptions.count	3 ± (3 - 3)	3 ± (3 - 3)	+0.0%	✅
runtime.dotnet.mem.committed	16.04 ± (16.01 - 16.06) MB	16.11 ± (16.09 - 16.13) MB	+0.5%	✅⬆️
runtime.dotnet.threads.count	20 ± (20 - 20)	20 ± (20 - 20)	+0.3%	✅⬆️
.NET Core 3.1 - Bailout
process.internal_duration_ms	195.40 ± (195.02 - 195.78) ms	194.15 ± (193.75 - 194.55) ms	-0.6%	✅
process.time_to_main_ms	86.62 ± (86.35 - 86.89) ms	85.87 ± (85.61 - 86.13) ms	-0.9%	✅
runtime.dotnet.exceptions.count	3 ± (3 - 3)	3 ± (3 - 3)	+0.0%	✅
runtime.dotnet.mem.committed	16.10 ± (16.08 - 16.13) MB	16.13 ± (16.10 - 16.15) MB	+0.1%	✅⬆️
runtime.dotnet.threads.count	21 ± (21 - 21)	21 ± (20 - 21)	-0.7%	✅
.NET Core 3.1 - CallTarget+Inlining+NGEN
process.internal_duration_ms	388.31 ± (386.98 - 389.64) ms	387.76 ± (386.39 - 389.13) ms	-0.1%	✅
process.time_to_main_ms	540.20 ± (538.94 - 541.46) ms	540.25 ± (538.99 - 541.50) ms	+0.0%	✅⬆️
runtime.dotnet.exceptions.count	3 ± (3 - 3)	3 ± (3 - 3)	+0.0%	✅
runtime.dotnet.mem.committed	57.76 ± (57.55 - 57.98) MB	57.87 ± (57.63 - 58.10) MB	+0.2%	✅⬆️
runtime.dotnet.threads.count	30 ± (30 - 30)	30 ± (30 - 30)	-0.3%	✅
.NET 6 - Baseline
process.internal_duration_ms	198.71 ± (198.24 - 199.18) ms	198.50 ± (198.07 - 198.93) ms	-0.1%	✅
process.time_to_main_ms	73.00 ± (72.74 - 73.25) ms	73.08 ± (72.80 - 73.35) ms	+0.1%	✅⬆️
runtime.dotnet.exceptions.count	4 ± (4 - 4)	4 ± (4 - 4)	+0.0%	✅
runtime.dotnet.mem.committed	16.34 ± (16.31 - 16.36) MB	16.38 ± (16.36 - 16.40) MB	+0.3%	✅⬆️
runtime.dotnet.threads.count	19 ± (19 - 19)	19 ± (19 - 19)	-0.6%	✅
.NET 6 - Bailout
process.internal_duration_ms	197.21 ± (196.75 - 197.67) ms	196.64 ± (196.24 - 197.04) ms	-0.3%	✅
process.time_to_main_ms	73.44 ± (73.23 - 73.66) ms	73.90 ± (73.64 - 74.16) ms	+0.6%	✅⬆️
runtime.dotnet.exceptions.count	4 ± (4 - 4)	4 ± (4 - 4)	+0.0%	✅
runtime.dotnet.mem.committed	16.39 ± (16.32 - 16.47) MB	16.43 ± (16.40 - 16.45) MB	+0.2%	✅⬆️
runtime.dotnet.threads.count	20 ± (20 - 20)	20 ± (20 - 20)	+1.2%	✅⬆️
.NET 6 - CallTarget+Inlining+NGEN
process.internal_duration_ms	585.27 ± (582.89 - 587.66) ms	586.63 ± (583.90 - 589.35) ms	+0.2%	✅⬆️
process.time_to_main_ms	546.51 ± (545.58 - 547.45) ms	547.48 ± (546.34 - 548.62) ms	+0.2%	✅⬆️
runtime.dotnet.exceptions.count	4 ± (4 - 4)	4 ± (4 - 4)	+0.0%	✅
runtime.dotnet.mem.committed	61.09 ± (61.00 - 61.18) MB	61.07 ± (60.98 - 61.16) MB	-0.0%	✅
runtime.dotnet.threads.count	31 ± (31 - 31)	31 ± (31 - 31)	+0.3%	✅⬆️
.NET 8 - Baseline
process.internal_duration_ms	197.00 ± (196.56 - 197.44) ms	195.63 ± (195.18 - 196.08) ms	-0.7%	✅
process.time_to_main_ms	72.70 ± (72.41 - 72.99) ms	72.12 ± (71.90 - 72.34) ms	-0.8%	✅
runtime.dotnet.exceptions.count	4 ± (4 - 4)	4 ± (4 - 4)	+0.0%	✅
runtime.dotnet.mem.committed	11.69 ± (11.66 - 11.72) MB	11.71 ± (11.69 - 11.72) MB	+0.1%	✅⬆️
runtime.dotnet.threads.count	18 ± (18 - 18)	18 ± (18 - 18)	+0.1%	✅⬆️
.NET 8 - Bailout
process.internal_duration_ms	196.15 ± (195.71 - 196.59) ms	195.11 ± (194.60 - 195.63) ms	-0.5%	✅
process.time_to_main_ms	73.59 ± (73.35 - 73.82) ms	73.11 ± (72.90 - 73.32) ms	-0.7%	✅
runtime.dotnet.exceptions.count	4 ± (4 - 4)	4 ± (4 - 4)	+0.0%	✅
runtime.dotnet.mem.committed	11.73 ± (11.71 - 11.76) MB	11.77 ± (11.75 - 11.79) MB	+0.3%	✅⬆️
runtime.dotnet.threads.count	19 ± (19 - 19)	19 ± (19 - 19)	+0.0%	✅⬆️
.NET 8 - CallTarget+Inlining+NGEN
process.internal_duration_ms	512.45 ± (509.93 - 514.96) ms	515.06 ± (512.50 - 517.62) ms	+0.5%	✅⬆️
process.time_to_main_ms	495.75 ± (495.00 - 496.50) ms	496.40 ± (495.57 - 497.23) ms	+0.1%	✅⬆️
runtime.dotnet.exceptions.count	4 ± (4 - 4)	4 ± (4 - 4)	+0.0%	✅
runtime.dotnet.mem.committed	50.53 ± (50.49 - 50.57) MB	50.55 ± (50.51 - 50.58) MB	+0.0%	✅⬆️
runtime.dotnet.threads.count	29 ± (29 - 30)	30 ± (30 - 30)	+0.8%	✅⬆️

Comparison explanation

Execution-time benchmarks measure the whole time it takes to execute a program, and are intended to measure the one-off costs. Cases where the execution time results for the PR are worse than latest master results are highlighted in **red**. The following thresholds were used for comparing the execution times:

- Welch test with a significance level of 5%
- Only results indicating a difference greater than 5% and 5 ms are considered.

Note that these results are based on a single point-in-time result for each branch. For full results, see the dashboard.

Graphs show the p99 interval based on the mean and StdDev of the test run, as well as the mean value of the run (shown as a diamond below the graph).

Duration charts

FakeDbCommand (.NET Framework 4.8)

gantt
    title Execution time (ms) FakeDbCommand (.NET Framework 4.8)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (7834) - mean (76ms)  : 71, 80
    master - mean (76ms)  : 72, 80

    section Bailout
    This PR (7834) - mean (79ms)  : 74, 83
    master - mean (79ms)  : 75, 82

    section CallTarget+Inlining+NGEN
    This PR (7834) - mean (1,111ms)  : 1059, 1164
    master - mean (1,109ms)  : 1054, 1164

FakeDbCommand (.NET Core 3.1)

gantt
    title Execution time (ms) FakeDbCommand (.NET Core 3.1)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (7834) - mean (116ms)  : 110, 121
    master - mean (115ms)  : 108, 121

    section Bailout
    This PR (7834) - mean (114ms)  : 111, 117
    master - mean (115ms)  : 112, 119

    section CallTarget+Inlining+NGEN
    This PR (7834) - mean (789ms)  : 759, 818
    master - mean (797ms)  : 773, 821

FakeDbCommand (.NET 6)

gantt
    title Execution time (ms) FakeDbCommand (.NET 6)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (7834) - mean (105ms)  : 98, 111
    master - mean (101ms)  : 98, 104

    section Bailout
    This PR (7834) - mean (103ms)  : 99, 106
    master - mean (107ms)  : 102, 111

    section CallTarget+Inlining+NGEN
    This PR (7834) - mean (952ms)  : 914, 990
    master - mean (950ms)  : 916, 983

FakeDbCommand (.NET 8)

gantt
    title Execution time (ms) FakeDbCommand (.NET 8)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (7834) - mean (102ms)  : 96, 108
    master - mean (100ms)  : 98, 103

    section Bailout
    This PR (7834) - mean (104ms)  : 99, 109
    master - mean (101ms)  : 99, 103

    section CallTarget+Inlining+NGEN
    This PR (7834) - mean (825ms)  : 787, 863
    master - mean (825ms)  : 788, 862

HttpMessageHandler (.NET Framework 4.8)

gantt
    title Execution time (ms) HttpMessageHandler (.NET Framework 4.8)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (7834) - mean (201ms)  : 196, 205
    master - mean (201ms)  : 197, 205

    section Bailout
    This PR (7834) - mean (204ms)  : 200, 208
    master - mean (204ms)  : 199, 209

    section CallTarget+Inlining+NGEN
    This PR (7834) - mean (1,213ms)  : 1160, 1266
    master - mean (1,214ms)  : 1173, 1256

HttpMessageHandler (.NET Core 3.1)

gantt
    title Execution time (ms) HttpMessageHandler (.NET Core 3.1)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (7834) - mean (289ms)  : 283, 295
    master - mean (292ms)  : 285, 299

    section Bailout
    This PR (7834) - mean (289ms)  : 284, 294
    master - mean (291ms)  : 285, 297

    section CallTarget+Inlining+NGEN
    This PR (7834) - mean (966ms)  : 944, 987
    master - mean (968ms)  : 947, 989

HttpMessageHandler (.NET 6)

gantt
    title Execution time (ms) HttpMessageHandler (.NET 6)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (7834) - mean (280ms)  : 273, 288
    master - mean (280ms)  : 274, 287

    section Bailout
    This PR (7834) - mean (279ms)  : 271, 286
    master - mean (279ms)  : 274, 284

    section CallTarget+Inlining+NGEN
    This PR (7834) - mean (1,165ms)  : 1132, 1197
    master - mean (1,163ms)  : 1120, 1207

HttpMessageHandler (.NET 8)

gantt
    title Execution time (ms) HttpMessageHandler (.NET 8)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (7834) - mean (278ms)  : 269, 286
    master - mean (280ms)  : 274, 286

    section Bailout
    This PR (7834) - mean (278ms)  : 272, 285
    master - mean (280ms)  : 273, 286

    section CallTarget+Inlining+NGEN
    This PR (7834) - mean (1,043ms)  : 997, 1088
    master - mean (1,042ms)  : 1000, 1083

pr-commenter · 2026-05-29T01:39:41Z

Benchmarks

Benchmark execution time: 2026-06-03 17:08:56

Comparing candidate commit e6b821c in PR branch dudik/cb-memory with baseline commit d8273e8 in branch master.

Found 0 performance improvements and 2 performance regressions! Performance is the same for 70 metrics, 0 unstable metrics, 62 known flaky benchmarks, 64 flaky benchmarks without significant changes.

Explanation

This is an A/B test comparing a candidate commit's performance against that of a baseline commit. Performance changes are noted in the tables below as:

🟩 = significantly better candidate vs. baseline
🟥 = significantly worse candidate vs. baseline

We compute a confidence interval (CI) over the relative difference of means between metrics from the candidate and baseline commits, considering the baseline as the reference.

If the CI is entirely outside the configured SIGNIFICANT_IMPACT_THRESHOLD (or the deprecated UNCONFIDENCE_THRESHOLD), the change is considered significant.

Feel free to reach out to #apm-benchmarking-platform on Slack if you have any questions.

More details about the CI and significant changes

You can imagine this CI as a range of values that is likely to contain the true difference of means between the candidate and baseline commits.

CIs of the difference of means are often centered around 0%, because often changes are not that big:

---------------------------------(------|---^--------)-------------------------------->
                              -0.6%    0%  0.3%     +1.2%
                                 |          |        |
         lower bound of the CI --'          |        |
sample mean (center of the CI) -------------'        |
         upper bound of the CI ----------------------'

As described above, a change is considered significant if the CI is entirely outside the configured SIGNIFICANT_IMPACT_THRESHOLD (or the deprecated UNCONFIDENCE_THRESHOLD).

For instance, for an execution time metric, this confidence interval indicates a significantly worse performance:

----------------------------------------|---------|---(---------^---------)---------->
                                       0%        1%  1.3%      2.2%      3.1%
                                                  |   |         |         |
       significant impact threshold --------------'   |         |         |
                      lower bound of CI --------------'         |         |
       sample mean (center of the CI) --------------------------'         |
                      upper bound of CI ----------------------------------'
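
The rule illustrated by the two diagrams can be written as a small check. This is only a sketch: the `SignificanceSketch` class, the `IsSignificant` name, its parameters, and the symmetric treatment of the threshold around 0% are assumptions made for illustration, not the benchmarking platform's actual code.

```csharp
using System;

public static class SignificanceSketch
{
    // Hypothetical helper. All three values are relative differences in percent,
    // with the baseline as the reference.
    // A change counts as significant only when the whole confidence interval
    // lies beyond the threshold on one side of zero (assumed symmetric here).
    public static bool IsSignificant(double ciLower, double ciUpper, double thresholdPercent)
    {
        if (ciLower > ciUpper)
        {
            throw new ArgumentException("ciLower must not exceed ciUpper");
        }

        return ciLower > thresholdPercent || ciUpper < -thresholdPercent;
    }
}
```

With a 1% threshold, the second diagram's interval `[1.3, 3.1]` is flagged as significant, while the first diagram's interval `[-0.6, 1.2]` is not, because it straddles 0%.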

scenario:Benchmarks.Trace.DbCommandBenchmark.ExecuteNonQuery net472

🟥 throughput [-22591.875op/s; -19477.941op/s] or [-6.363%; -5.486%]

scenario:Benchmarks.Trace.HttpClientBenchmark.SendAsync net472

🟥 throughput [-6975.472op/s; -6343.280op/s] or [-7.963%; -7.241%]

Known flaky benchmarks

These benchmarks are marked as flaky and will not trigger a failure. Modify FLAKY_BENCHMARKS_REGEX to control which benchmarks are marked as flaky.

scenario:Benchmarks.Trace.ActivityBenchmark.StartStopWithChild net472

🟥 throughput [-7794.907op/s; -7321.272op/s] or [-9.242%; -8.681%]

scenario:Benchmarks.Trace.AgentWriterBenchmark.WriteAndFlushEnrichedTraces net472

🟥 execution_time [+312.435ms; +319.611ms] or [+155.041%; +158.602%]
🟥 throughput [-44.798op/s; -40.520op/s] or [-8.060%; -7.290%]

scenario:Benchmarks.Trace.AgentWriterBenchmark.WriteAndFlushEnrichedTraces net6.0

🟥 execution_time [+381.232ms; +383.117ms] or [+301.197%; +302.686%]
🟩 throughput [+89.136op/s; +92.309op/s] or [+11.752%; +12.171%]

scenario:Benchmarks.Trace.AgentWriterBenchmark.WriteAndFlushEnrichedTraces netcoreapp3.1

🟥 execution_time [+390.139ms; +393.719ms] or [+345.258%; +348.426%]

scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.AllCycleMoreComplexBody net472

🟥 allocated_mem [+1.308KB; +1.308KB] or [+27.528%; +27.540%]

scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.AllCycleMoreComplexBody net6.0

🟥 allocated_mem [+471 bytes; +472 bytes] or [+9.976%; +9.987%]
🟩 execution_time [-15.598ms; -11.405ms] or [-7.285%; -5.327%]

scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.AllCycleMoreComplexBody netcoreapp3.1

🟥 allocated_mem [+1.272KB; +1.272KB] or [+27.500%; +27.510%]

scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.AllCycleSimpleBody net472

🟥 allocated_mem [+1.307KB; +1.307KB] or [+105.743%; +105.758%]
🟥 throughput [-275337.526op/s; -271716.265op/s] or [-28.113%; -27.744%]

scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.AllCycleSimpleBody net6.0

🟥 allocated_mem [+471 bytes; +472 bytes] or [+38.557%; +38.566%]
🟩 execution_time [-26.250ms; -21.372ms] or [-11.706%; -9.531%]

scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.AllCycleSimpleBody netcoreapp3.1

🟥 allocated_mem [+1.272KB; +1.272KB] or [+105.288%; +105.304%]
🟥 throughput [-153745.536op/s; -137847.194op/s] or [-22.090%; -19.806%]

scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.ObjectExtractorMoreComplexBody net6.0

🟩 throughput [+9016.970op/s; +11934.796op/s] or [+5.737%; +7.594%]

scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.ObjectExtractorMoreComplexBody netcoreapp3.1

🟩 throughput [+9638.279op/s; +12323.534op/s] or [+7.678%; +9.817%]

scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.ObjectExtractorSimpleBody net6.0

🟩 throughput [+441590.006op/s; +461715.638op/s] or [+14.724%; +15.396%]

scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.ObjectExtractorSimpleBody netcoreapp3.1

🟩 execution_time [-19.569ms; -15.232ms] or [-9.021%; -7.021%]
🟩 throughput [+181197.887op/s; +234383.557op/s] or [+7.192%; +9.303%]

scenario:Benchmarks.Trace.Asm.AppSecEncoderBenchmark.EncodeArgs net472

🟥 execution_time [+297.990ms; +299.152ms] or [+148.895%; +149.476%]

scenario:Benchmarks.Trace.Asm.AppSecEncoderBenchmark.EncodeArgs net6.0

🟥 execution_time [+300.641ms; +304.179ms] or [+151.614%; +153.398%]

scenario:Benchmarks.Trace.Asm.AppSecEncoderBenchmark.EncodeArgs netcoreapp3.1

🟥 execution_time [+299.801ms; +303.010ms] or [+151.016%; +152.633%]

scenario:Benchmarks.Trace.Asm.AppSecEncoderBenchmark.EncodeLegacyArgs net472

🟥 execution_time [+297.122ms; +298.073ms] or [+145.935%; +146.402%]

scenario:Benchmarks.Trace.Asm.AppSecEncoderBenchmark.EncodeLegacyArgs net6.0

🟥 execution_time [+292.507ms; +295.000ms] or [+142.996%; +144.215%]

scenario:Benchmarks.Trace.Asm.AppSecEncoderBenchmark.EncodeLegacyArgs netcoreapp3.1

🟥 execution_time [+298.114ms; +300.353ms] or [+148.997%; +150.116%]

scenario:Benchmarks.Trace.Asm.AppSecWafBenchmark.RunWafRealisticBenchmarkWithAttack net6.0

🟥 execution_time [+27.834µs; +51.531µs] or [+8.886%; +16.451%]
🟥 throughput [-473.414op/s; -274.121op/s] or [-14.758%; -8.545%]

scenario:Benchmarks.Trace.AspNetCoreBenchmark.SendRequest net472

🟥 execution_time [+300.314ms; +301.109ms] or [+149.888%; +150.284%]

scenario:Benchmarks.Trace.AspNetCoreBenchmark.SendRequest net6.0

🟥 execution_time [+416.840ms; +424.132ms] or [+452.914%; +460.837%]
🟩 throughput [+629.463op/s; +795.277op/s] or [+5.172%; +6.535%]

scenario:Benchmarks.Trace.AspNetCoreBenchmark.SendRequest netcoreapp3.1

🟥 execution_time [+356.434ms; +366.147ms] or [+270.637%; +278.012%]

scenario:Benchmarks.Trace.CIVisibilityProtocolWriterBenchmark.WriteAndFlushEnrichedTraces net472

unstable execution_time [+298.462ms; +338.680ms] or [+137.230%; +155.722%]
🟥 throughput [-538.798op/s; -488.961op/s] or [-48.820%; -44.305%]

scenario:Benchmarks.Trace.CIVisibilityProtocolWriterBenchmark.WriteAndFlushEnrichedTraces net6.0

unstable execution_time [+201.524ms; +334.804ms] or [+85.881%; +142.679%]
🟥 throughput [-669.072op/s; -585.663op/s] or [-44.627%; -39.064%]

scenario:Benchmarks.Trace.CIVisibilityProtocolWriterBenchmark.WriteAndFlushEnrichedTraces netcoreapp3.1

🟥 execution_time [+336.626ms; +350.616ms] or [+201.341%; +209.709%]
🟥 throughput [-433.407op/s; -393.081op/s] or [-30.178%; -27.370%]

scenario:Benchmarks.Trace.CharSliceBenchmark.OptimizedCharSlice net6.0

🟥 execution_time [+83.856µs; +92.966µs] or [+5.761%; +6.387%]
🟥 throughput [-41.275op/s; -37.306op/s] or [-6.008%; -5.430%]

scenario:Benchmarks.Trace.CharSliceBenchmark.OptimizedCharSliceWithPool net6.0

🟩 throughput [+55.169op/s; +123.454op/s] or [+5.948%; +13.311%]

scenario:Benchmarks.Trace.CharSliceBenchmark.OptimizedCharSliceWithPool netcoreapp3.1

unstable throughput [-19.980op/s; +48.206op/s] or [-3.729%; +8.998%]

scenario:Benchmarks.Trace.CharSliceBenchmark.OriginalCharSlice net6.0

🟩 throughput [+26.109op/s; +43.420op/s] or [+5.154%; +8.572%]

scenario:Benchmarks.Trace.ElasticsearchBenchmark.CallElasticsearch net472

🟥 execution_time [+302.239ms; +304.319ms] or [+152.202%; +153.249%]

scenario:Benchmarks.Trace.ElasticsearchBenchmark.CallElasticsearch net6.0

🟥 execution_time [+300.954ms; +301.989ms] or [+150.809%; +151.327%]

scenario:Benchmarks.Trace.ElasticsearchBenchmark.CallElasticsearch netcoreapp3.1

🟥 execution_time [+300.372ms; +303.522ms] or [+150.894%; +152.477%]

scenario:Benchmarks.Trace.ElasticsearchBenchmark.CallElasticsearchAsync net472

🟥 execution_time [+302.059ms; +303.745ms] or [+151.684%; +152.530%]

scenario:Benchmarks.Trace.ElasticsearchBenchmark.CallElasticsearchAsync net6.0

🟥 execution_time [+297.656ms; +300.065ms] or [+147.178%; +148.369%]

scenario:Benchmarks.Trace.ElasticsearchBenchmark.CallElasticsearchAsync netcoreapp3.1

🟥 execution_time [+302.121ms; +306.176ms] or [+153.128%; +155.183%]

scenario:Benchmarks.Trace.GraphQLBenchmark.ExecuteAsync net472

🟥 execution_time [+301.742ms; +305.394ms] or [+151.447%; +153.280%]

scenario:Benchmarks.Trace.GraphQLBenchmark.ExecuteAsync net6.0

🟥 execution_time [+299.874ms; +302.154ms] or [+149.460%; +150.596%]
🟩 throughput [+52577.144op/s; +62138.621op/s] or [+10.440%; +12.339%]

scenario:Benchmarks.Trace.GraphQLBenchmark.ExecuteAsync netcoreapp3.1

🟥 execution_time [+301.492ms; +304.323ms] or [+149.990%; +151.398%]

scenario:Benchmarks.Trace.ILoggerBenchmark.EnrichedLog net6.0

🟩 execution_time [-15.978ms; -12.322ms] or [-7.430%; -5.730%]
🟩 throughput [+20284.268op/s; +27216.395op/s] or [+5.565%; +7.466%]

scenario:Benchmarks.Trace.Iast.StringAspectsBenchmark.StringConcatAspectBenchmark net472

unstable execution_time [+3.143µs; +48.735µs] or [+0.776%; +12.038%]

scenario:Benchmarks.Trace.Iast.StringAspectsBenchmark.StringConcatAspectBenchmark net6.0

🟩 allocated_mem [-19.459KB; -19.437KB] or [-7.098%; -7.090%]
unstable execution_time [-51.289µs; +5.304µs] or [-10.137%; +1.048%]
unstable throughput [-9.945op/s; +192.470op/s] or [-0.496%; +9.604%]

scenario:Benchmarks.Trace.Iast.StringAspectsBenchmark.StringConcatAspectBenchmark netcoreapp3.1

unstable execution_time [-40.870µs; +22.689µs] or [-7.082%; +3.932%]
unstable throughput [-54.596op/s; +121.102op/s] or [-3.119%; +6.919%]

scenario:Benchmarks.Trace.Iast.StringAspectsBenchmark.StringConcatBenchmark net6.0

unstable execution_time [+5.546µs; +11.041µs] or [+13.109%; +26.097%]
🟥 throughput [-4681.151op/s; -2612.048op/s] or [-19.706%; -10.996%]

scenario:Benchmarks.Trace.Iast.StringAspectsBenchmark.StringConcatBenchmark netcoreapp3.1

unstable execution_time [-14.261µs; -6.189µs] or [-22.125%; -9.602%]
unstable throughput [+1603.303op/s; +3352.240op/s] or [+9.837%; +20.567%]

scenario:Benchmarks.Trace.Log4netBenchmark.EnrichedLog net472

🟥 execution_time [+302.842ms; +304.185ms] or [+153.073%; +153.752%]

scenario:Benchmarks.Trace.Log4netBenchmark.EnrichedLog net6.0

🟥 execution_time [+301.910ms; +306.807ms] or [+153.671%; +156.164%]

scenario:Benchmarks.Trace.Log4netBenchmark.EnrichedLog netcoreapp3.1

🟥 execution_time [+300.068ms; +303.411ms] or [+150.221%; +151.895%]

scenario:Benchmarks.Trace.RedisBenchmark.SendReceive net6.0

🟩 throughput [+33990.125op/s; +36707.835op/s] or [+6.434%; +6.948%]

scenario:Benchmarks.Trace.SerilogBenchmark.EnrichedLog net472

🟥 execution_time [+301.763ms; +304.173ms] or [+150.402%; +151.603%]

scenario:Benchmarks.Trace.SerilogBenchmark.EnrichedLog net6.0

🟥 execution_time [+301.245ms; +302.633ms] or [+151.271%; +151.968%]

scenario:Benchmarks.Trace.SerilogBenchmark.EnrichedLog netcoreapp3.1

🟥 execution_time [+304.020ms; +306.816ms] or [+154.179%; +155.598%]

scenario:Benchmarks.Trace.SingleSpanAspNetCoreBenchmark.SingleSpanAspNetCore net472

🟥 execution_time [+299.602ms; +300.578ms] or [+149.443%; +149.930%]
🟩 throughput [+61145175.522op/s; +61541392.831op/s] or [+44.530%; +44.818%]

scenario:Benchmarks.Trace.SingleSpanAspNetCoreBenchmark.SingleSpanAspNetCore net6.0

🟥 execution_time [+418.513ms; +423.233ms] or [+520.495%; +526.365%]

scenario:Benchmarks.Trace.SingleSpanAspNetCoreBenchmark.SingleSpanAspNetCore netcoreapp3.1

🟥 execution_time [+298.792ms; +299.990ms] or [+149.031%; +149.628%]
🟩 throughput [+11666913.478op/s; +16638455.320op/s] or [+5.168%; +7.370%]

scenario:Benchmarks.Trace.SpanBenchmark.StartFinishScope net6.0

🟩 throughput [+65954.827op/s; +80514.054op/s] or [+6.158%; +7.517%]

scenario:Benchmarks.Trace.SpanBenchmark.StartFinishScope netcoreapp3.1

🟩 throughput [+47296.598op/s; +67445.174op/s] or [+5.474%; +7.807%]

scenario:Benchmarks.Trace.SpanBenchmark.StartFinishSpan net6.0

🟩 throughput [+78917.839op/s; +108946.902op/s] or [+6.108%; +8.433%]

scenario:Benchmarks.Trace.SpanBenchmark.StartFinishSpan netcoreapp3.1

🟩 throughput [+79044.134op/s; +87219.940op/s] or [+7.850%; +8.662%]

scenario:Benchmarks.Trace.SpanBenchmark.StartFinishTwoScopes net6.0

🟩 throughput [+35985.579op/s; +47013.592op/s] or [+6.534%; +8.537%]

scenario:Benchmarks.Trace.TraceAnnotationsBenchmark.RunOnMethodBegin net6.0

🟩 throughput [+83438.183op/s; +101599.857op/s] or [+9.322%; +11.351%]

Known flaky benchmarks without significant changes:

scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.ActivityBenchmark.StartSpan net472
scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.ActivityBenchmark.StartSpan net6.0
scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.ActivityBenchmark.StartSpan netcoreapp3.1
scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.ActivityBenchmark.StartSpan_AddEvent_Sampled net472
scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.ActivityBenchmark.StartSpan_AddEvent_Sampled net6.0
scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.ActivityBenchmark.StartSpan_AddEvent_Sampled netcoreapp3.1
scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.ActivityBenchmark.StartSpan_GetContext_Sampled net472
scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.ActivityBenchmark.StartSpan_GetContext_Sampled net6.0
scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.ActivityBenchmark.StartSpan_GetContext_Sampled netcoreapp3.1
scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.ActivityBenchmark.StartSpan_SetAttributes_Sampled net472
scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.ActivityBenchmark.StartSpan_SetAttributes_Sampled net6.0
scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.ActivityBenchmark.StartSpan_SetAttributes_Sampled netcoreapp3.1
scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.ActivityBenchmark.StartSpan_SetStatus_Sampled net472
scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.ActivityBenchmark.StartSpan_SetStatus_Sampled net6.0
scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.ActivityBenchmark.StartSpan_SetStatus_Sampled netcoreapp3.1
scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.ActivityBenchmark.StartSpan_UpdateName_Sampled net472
scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.ActivityBenchmark.StartSpan_UpdateName_Sampled net6.0
scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.ActivityBenchmark.StartSpan_UpdateName_Sampled netcoreapp3.1
scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan net472
scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan net6.0
scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan netcoreapp3.1
scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan_AddEvent_Sampled net472
scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan_AddEvent_Sampled net6.0
scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan_AddEvent_Sampled netcoreapp3.1
scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan_GetContext_Sampled net472
scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan_GetContext_Sampled net6.0
scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan_GetContext_Sampled netcoreapp3.1
scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan_RecordException_Sampled net472
scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan_RecordException_Sampled net6.0
scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan_RecordException_Sampled netcoreapp3.1
scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan_SetAttributes_Sampled net472
scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan_SetAttributes_Sampled net6.0
scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan_SetAttributes_Sampled netcoreapp3.1
scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan_SetStatus_Sampled net472
scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan_SetStatus_Sampled net6.0
scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan_SetStatus_Sampled netcoreapp3.1
scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan_UpdateName_Sampled net472
scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan_UpdateName_Sampled net6.0
scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan_UpdateName_Sampled netcoreapp3.1
scenario:Benchmarks.Trace.ActivityBenchmark.StartStopWithChild net6.0
scenario:Benchmarks.Trace.ActivityBenchmark.StartStopWithChild netcoreapp3.1
scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.ObjectExtractorMoreComplexBody net472
scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.ObjectExtractorSimpleBody net472
scenario:Benchmarks.Trace.Asm.AppSecWafBenchmark.RunWafRealisticBenchmark net472
scenario:Benchmarks.Trace.Asm.AppSecWafBenchmark.RunWafRealisticBenchmark net6.0
scenario:Benchmarks.Trace.Asm.AppSecWafBenchmark.RunWafRealisticBenchmark netcoreapp3.1
scenario:Benchmarks.Trace.Asm.AppSecWafBenchmark.RunWafRealisticBenchmarkWithAttack net472
scenario:Benchmarks.Trace.Asm.AppSecWafBenchmark.RunWafRealisticBenchmarkWithAttack netcoreapp3.1
scenario:Benchmarks.Trace.CharSliceBenchmark.OptimizedCharSlice net472
scenario:Benchmarks.Trace.CharSliceBenchmark.OptimizedCharSlice netcoreapp3.1
scenario:Benchmarks.Trace.CharSliceBenchmark.OptimizedCharSliceWithPool net472
scenario:Benchmarks.Trace.CharSliceBenchmark.OriginalCharSlice net472
scenario:Benchmarks.Trace.CharSliceBenchmark.OriginalCharSlice netcoreapp3.1
scenario:Benchmarks.Trace.ILoggerBenchmark.EnrichedLog net472
scenario:Benchmarks.Trace.ILoggerBenchmark.EnrichedLog netcoreapp3.1
scenario:Benchmarks.Trace.Iast.StringAspectsBenchmark.StringConcatBenchmark net472
scenario:Benchmarks.Trace.RedisBenchmark.SendReceive net472
scenario:Benchmarks.Trace.RedisBenchmark.SendReceive netcoreapp3.1
scenario:Benchmarks.Trace.SpanBenchmark.StartFinishScope net472
scenario:Benchmarks.Trace.SpanBenchmark.StartFinishSpan net472
scenario:Benchmarks.Trace.SpanBenchmark.StartFinishTwoScopes net472
scenario:Benchmarks.Trace.SpanBenchmark.StartFinishTwoScopes netcoreapp3.1
scenario:Benchmarks.Trace.TraceAnnotationsBenchmark.RunOnMethodBegin net472
scenario:Benchmarks.Trace.TraceAnnotationsBenchmark.RunOnMethodBegin netcoreapp3.1

Copilot

Pull request overview

Copilot reviewed 14 out of 54 changed files in this pull request and generated 2 comments.

Copilot

Pull request overview

Copilot reviewed 14 out of 54 changed files in this pull request and generated 1 comment.

Copilot

Pull request overview

Copilot reviewed 16 out of 56 changed files in this pull request and generated 1 comment.

jpbempel · 2026-06-02T06:46:44Z

+                        AppDomain.CurrentDomain.AssemblyLoad += CheckUnboundProbes;
+                        assemblyLoadSubscribed = true;
+                        StartBackgroundProcess();
+                        Volatile.Write(ref _initializationState, 2);


nit: make it a named constant

jpbempel · 2026-06-02T09:12:28Z

+        // Refresh writes these fields through a single CAS-guarded writer; every cross-thread access
+        // goes through Volatile.Read/Volatile.Write (and Interlocked where a read-modify-write is needed).
+        // We deliberately use the Volatile.* helpers rather than the `volatile` keyword so the idiom is
+        // uniform across the int/long fields too (`long` cannot be marked `volatile`).


TIL: you cannot mark a long field volatile! Now I understand why you use it so sparingly, while in Java we use it more often :/

That being said, why not use a sub-object holding all the stats, and update it in one shot through a volatile reference on refresh?
That way there is only one place to manage the concurrency.

example:

class Stats
{
    private long _lastGen2Count;
    private long _lastRefreshMs;
    private long _highPressureStartMs;
    private bool _hasHighPressureStart;
    private int _highStreak;
    private int _lowStreak;
    private bool _hasGen2Baseline;
}

void refresh()
{
    Stats stats = new Stats();
    stats._lastGen2Count = ...;
    // ...
    stats._hasGen2Baseline = ...;
    currentStats = stats; // currentStats is a volatile field; this single assignment publishes all values
}

Then the reader just needs to do a volatile read of currentStats to snapshot the values, and use them locally.

But yes the downside of this is one alloc for stats at each call of refresh.
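
For illustration, the snapshot idea can be sketched in C# roughly as follows. This is a minimal sketch and not the PR's code: `MonitorStats`, `StatsHolder`, `Publish`, and `Read` are hypothetical names, and only a few of the fields listed above are included.

```csharp
using System.Threading;

// Immutable snapshot: every value is fixed at construction time,
// so a reader holding a reference can never observe a half-updated object.
public sealed class MonitorStats
{
    public MonitorStats(long lastGen2Count, long lastRefreshMs, int highStreak, int lowStreak)
    {
        LastGen2Count = lastGen2Count;
        LastRefreshMs = lastRefreshMs;
        HighStreak = highStreak;
        LowStreak = lowStreak;
    }

    public long LastGen2Count { get; }
    public long LastRefreshMs { get; }
    public int HighStreak { get; }
    public int LowStreak { get; }
}

public sealed class StatsHolder
{
    private MonitorStats _current = new MonitorStats(0, 0, 0, 0);

    // Writer: build a complete snapshot, then publish the reference in one step.
    public void Publish(long gen2Count, long nowMs, int highStreak, int lowStreak)
    {
        var next = new MonitorStats(gen2Count, nowMs, highStreak, lowStreak);
        Volatile.Write(ref _current, next);
    }

    // Reader: take the reference once and use the values locally.
    public MonitorStats Read() => Volatile.Read(ref _current);
}
```

Each `Publish` call allocates one `MonitorStats` object, which is the per-refresh allocation cost mentioned above.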

You can use Interlocked.Exchange(ref _long) though, which is generally preferable anyway IMO, but snapshotting is probably still the better design IMO.

In terms of allocations, you can use a "front and back buffer object with atomic exchange" - that way you create 2 objects for the life of the process 🙂

> You can use Interlocked.Exchange(ref _long) though

Yes but for this case volatile is enough imo

> but snapshotting is probably still the better design IMO.
> In terms of allocations, you can use a "front and back buffer object with atomic exchange" - that way you create 2 objects for the life of the process 🙂

I like it. Currently, no one really uses it in production, but I would definitely do that in the future if these values become part of a decision-making process. Noted.

On second thought, I need to get back to this PR anyway because of the distribution metrics limitation, so I might handle this while I'm at it.

On third thought 😆

I tried the two-buffer snapshot approach locally, but I’m leaning toward keeping the current simpler publication model for now.

It does give a more coherent internal snapshot, but it also adds quite a bit of complexity: versioning and more synchronization.

I think it’s worth revisiting if/when this becomes decision-making state or we expose a single snapshot API, but for the current observe-only telemetry use, I’d prefer to keep the simpler volatile fields.

jpbempel · 2026-06-02T09:36:32Z

+        // Only ever called from inside the CAS-guarded Refresh, so no synchronization is required.
+        private bool DisableCore(MetricTags.DebuggerMemoryPressureDisabledReason reason)


For my own understanding, as I probably missed something: if it is only used in the Refresh method, why not make it an "inner method" of Refresh, like ComputeNextHigh or ToScaledInt?

The reasoning behind that is that, unlike ComputeNextHigh and ToScaledInt, which are purely local helpers and should remain local functions, DisableCore has meaning beyond the refresh algorithm, and I haven't fully settled on its future shape yet (since this PR is observe-only).

But you didn't miss anything. For this PR, it is currently only called from Refresh, as you noted.

Done 5c99511

andrewlock

As discussed, I have concerns that these metrics won't actually be usable, because we don't tag them with service identity. So you have no way to compare between enter/exit conditions? 🤔 However, I now realise that you're relying on distributions and going to try to do some statistical comparison between them? I'm not sure..

However, the count metrics are fine in general, but we literally don't support distribution metrics at the moment (intentionally, because they have perf problems), so we'll need to remove those.

dudikeleti · 2026-06-02T13:09:35Z

> As discussed, I have concerns that these metrics won't actually be usable, because we don't tag them with service identity. So you have no way to compare between enter/exit conditions? 🤔 However, I now realise that you're relying on distributions and going to try to do some statistical comparison between them? I'm not sure..
>
> However, the count metrics are fine in general, but we literally don't support distribution metrics at the moment (intentionally, because they have perf problems), so we'll need to remove those.

The missing service tag isn't a blocker. We don't have to pair a given app's enter with its own exit. We want to be able to say:
"Memory pressure during DI is common, driven mainly by memory (not GC), clusters at 90%+, and episodes last tens of seconds - so gating is worth building, gate on memory at ~0.90 with a short cooldown." Or the opposite: "episodes are sub-second and rare - don't bother."

dudikeleti · 2026-06-02T18:06:24Z

> However, the count metrics are fine in general, but we literally don't support distribution metrics at the moment (intentionally, because they have perf problems), so we'll need to remove those.

I changed the memory-pressure severity/duration metrics from distributions to bucketed count metrics.

andrewlock

I haven't reviewed the Debugger code, just the shared metrics code. The important thing is you'll need to merge your changes into the telemetry intake before these new metrics will be accepted.

andrewlock · 2026-06-03T11:32:00Z

 #nullable enable
 using System.Diagnostics.CodeAnalysis;
 using Datadog.Trace.SourceGenerators;
+using NS = Datadog.Trace.Telemetry.MetricNamespaceConstants;


Suggested change

using NS = Datadog.Trace.Telemetry.MetricNamespaceConstants;

86f6bb3cd59a512bd61d8c4fe63ffa9e4d6f31ae

andrewlock · 2026-06-03T11:35:28Z

+      ],
+      "metric_type": "count",
+      "data_type": "transitions",
+      "description": "The number of Dynamic Instrumentation memory-pressure state transitions, tagged by state and the signal that triggered entry",


It's best to include the actual state tags allowed here, e.g.

Suggested change

-      "description": "The number of Dynamic Instrumentation memory-pressure state transitions, tagged by state and the signal that triggered entry",
+      "description": "The number of Dynamic Instrumentation memory-pressure state transitions, tagged by state (`state:enter` or `state:exit`) and the signal that triggered entry (`trigger:none`, `trigger:memory`, `trigger:gen2`, `trigger:both`)",

The same applies to all the other definitions here.

Additionally, these metrics need to be added to the telemetry intake, and deployed, otherwise they will be blocked. These metrics are common across all languages too, just FYI, so we should try to keep the tags applicable to a wide range of languages if possible (e.g. gen2 is very .NET specific, maybe we can use a more generic tag name?)

86f6bb3cd59a512bd61d8c4fe63ffa9e4d6f31ae

andrewlock · 2026-06-03T11:37:04Z

+
+    /// <summary>
+    /// Count of Dynamic Instrumentation high-memory-pressure periods bucketed by duration, recorded on exit.
+    /// </summary>
+    [TelemetryMetric<MetricTags.DebuggerMemoryPressureDurationBucket>("memory_pressure.duration_ms", isCommon: true, NS.LiveDebugger)] DebuggerMemoryPressureDurationMs,


Is this really what you want? The total duration of high-memory pressure periods? Maybe it is, just feels kind of strange to me 😅

andrewlock · 2026-06-03T11:37:37Z

+    /// <summary>
+    /// Count of Dynamic Instrumentation memory-pressure transitions bucketed by Gen2 collections per second at the transition.
+    /// </summary>
+    [TelemetryMetric<MetricTags.DebuggerMemoryPressureState, MetricTags.DebuggerMemoryPressureGen2Bucket>("memory_pressure.gen2_per_sec", isCommon: true, NS.LiveDebugger)] DebuggerMemoryPressureGen2PerSec,


It seems a bit weird to have a count of a rate... I guess this is just trying to hack around current lack of distribution support? 🤔
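For readers following the thread, the "count of a rate" pattern can be sketched as a histogram built from plain counters: compute the rate once at the transition, map it to a low-cardinality bucket tag, and increment a count for that tag. This is a minimal sketch only. The type names (`Gen2RateBuckets`, `BucketCounter`), the bucket edges, and the tag strings are hypothetical and are not taken from the PR.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration: names, bucket edges and tag strings are assumptions, not the PR's values.
public static class Gen2RateBuckets
{
    // Gen2 collections per second, derived from a GC.CollectionCount(2) delta over elapsed time.
    public static double PerSecond(int previousCount, int currentCount, TimeSpan elapsed)
    {
        if (elapsed <= TimeSpan.Zero)
        {
            return 0; // no usable interval, so report no rate instead of dividing by zero
        }

        return (currentCount - previousCount) / elapsed.TotalSeconds;
    }

    // Maps a rate to a bucket tag. Each bucket includes its lower edge and excludes its upper edge.
    public static string ToBucketTag(double perSecond)
    {
        if (perSecond < 0.1) return "bucket:lt_0_1";
        if (perSecond < 1.0) return "bucket:0_1_to_1";
        return "bucket:gte_1";
    }
}

// Stand-in for a count metric with one tag: one counter per bucket value.
public sealed class BucketCounter
{
    private readonly Dictionary<string, long> _counts = new Dictionary<string, long>();

    public void Record(string tag) => _counts[tag] = Count(tag) + 1;

    public long Count(string tag) => _counts.TryGetValue(tag, out var n) ? n : 0;
}
```

In real code the two count arguments would come from successive `GC.CollectionCount(2)` calls. The per-bucket counts can then be read as a coarse distribution of the rate at transition time, at the cost of losing resolution inside each bucket.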

andrewlock · 2026-06-03T11:38:26Z

+    {
+        [Description("bucket:lt_70")] LessThan70,
+        [Description("bucket:70_80")] From70To80,
+        [Description("bucket:80_85")] From80To85,


What if it's 80%? Which bucket does it go in? Might be preferable to make that clear
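One way to make the boundary explicit is to treat every bucket as a half-open interval that includes its lower edge, so 80% lands in `bucket:80_85`. The sketch below assumes that convention; the PR excerpt does not state which side is inclusive. It reuses the three tag strings from the diff above, while `MemoryLoadBuckets` and the catch-all `bucket:gte_85` are hypothetical.

```csharp
using System;

// Hypothetical illustration of inclusive-lower, exclusive-upper bucketing.
// The inclusive side is an assumption; the PR excerpt does not specify it.
public static class MemoryLoadBuckets
{
    // memoryLoadRatio is the fraction of available memory in use, expected in [0, 1].
    public static string ToBucketTag(double memoryLoadRatio)
    {
        if (memoryLoadRatio < 0.70) return "bucket:lt_70"; // [0.00, 0.70)
        if (memoryLoadRatio < 0.80) return "bucket:70_80"; // [0.70, 0.80)
        if (memoryLoadRatio < 0.85) return "bucket:80_85"; // [0.80, 0.85)
        return "bucket:gte_85";                            // hypothetical catch-all for everything above
    }
}
```

Under this convention `MemoryLoadBuckets.ToBucketTag(0.80)` returns `bucket:80_85`. Stating the same rule in the metric description would answer the question without reading the code.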

86f6bb3cd59a512bd61d8c4fe63ffa9e4d6f31ae

Co-authored-by: Cursor <cursoragent@cursor.com>

Remove the IGCInfoProvider/IHighResolutionClock/IMemoryPressureMonitor/ISamplerScheduler abstractions and their System* implementations in favor of a single SystemMemorySource. Update the monitor, config, and debugger consumers accordingly, and adjust tests (dropping the fake clock/GC/scheduler helpers).

Add debugger.memory_pressure.* count/distribution metrics and supporting metric tags (state/trigger/disabled-reason), plus regenerated telemetry collector source. Update collector tests.

dudikeleti force-pushed the dudik/cb-memory branch from d0b375f to 61a1154 Compare February 11, 2026 18:04

dudikeleti force-pushed the dudik/cb-memory branch from 61a1154 to 7d4161f Compare May 28, 2026 12:11

dudikeleti changed the title ~~Introduce MemoryPressureMonitor to detect high memory pressure~~ [Debugger] Add memory pressure monitoring telemetry for Dynamic Instrumentation (observe-only) May 28, 2026

dudikeleti added the area:debugger label May 28, 2026

dudikeleti force-pushed the dudik/cb-memory branch from 7d4161f to 82d7293 Compare May 29, 2026 00:24

dudikeleti requested a review from Copilot May 29, 2026 00:25

Copilot started reviewing on behalf of dudikeleti May 29, 2026 00:25