[Debugger] Add memory pressure monitoring telemetry for Dynamic Instrumentation (observe-only) #7834

Merged: dudikeleti merged 12 commits into master from dudik/cb-memory on Jun 3, 2026

dudikeleti (Contributor) commented Nov 17, 2025

Summary of changes

  • Adds an observe-only memory pressure monitor for Dynamic Instrumentation.
  • Samples runtime/system memory pressure around DI activity (no background timer).
  • Emits low-cardinality telemetry only when entering or exiting high memory pressure, plus a one-shot counter if the monitor disables itself.
  • Does not block, throttle, skip, or otherwise change probe behavior.

Reason for change

Dynamic Instrumentation can perform memory-sensitive work when capturing variables, creating snapshots, serializing payloads, and enqueueing/uploading data.

Before using memory pressure to affect DI behavior, we need production data showing how often high pressure occurs, what drives it, how severe each signal is at entry, how long episodes last, where the signal is even available, and whether it is stable enough for future throttling decisions. This PR adds exactly that observability and nothing else.

Implementation details

The monitor detects high pressure from concrete runtime/system signals only:

  • Memory load ratio (fraction of available memory in use, so the value is comparable across platforms):
    • On .NET Core 3.1+, uses GC.GetGCMemoryInfo() (MemoryLoadBytes / TotalAvailableMemoryBytes). This is container/cgroup-aware, and is system/container-wide load rather than process-private bytes - the right scope for "are we close to an OOM".
    • On .NET Framework/Windows, uses GlobalMemoryStatusEx (dwMemoryLoad, a 0-100 percentage, scaled and clamped to [0,1]). This is machine-wide and not container/job-object aware.
    • On runtimes/platforms where neither is available, the memory signal is reported as unsupported.
  • Gen2 collections per second:
    • Calculated from GC.CollectionCount(2) deltas over elapsed time. Identical and process-specific on every platform.
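
The two signals above reduce to simple arithmetic. The following is a minimal Python sketch of that arithmetic, not the tracer's C# code; the function names and the choice to return None for an unsupported memory signal are assumptions made for illustration.

```python
from typing import Optional


def memory_load_ratio(load_bytes: int, total_available_bytes: int) -> Optional[float]:
    """Fraction of available memory in use, clamped to [0, 1].

    Returns None when the total is unknown, standing in for the
    "unsupported" case (an assumption of this sketch).
    """
    if total_available_bytes <= 0:
        return None
    return min(1.0, max(0.0, load_bytes / total_available_bytes))


def gen2_per_second(prev_count: int, curr_count: int, elapsed_ms: float) -> float:
    """Gen2 collections per second from two cumulative counter readings."""
    if elapsed_ms <= 0:
        return 0.0
    # Guard against a counter that appears to go backwards.
    delta = max(0, curr_count - prev_count)
    return delta * 1000.0 / elapsed_ms
```

For example, three Gen2 collections observed over 1.5 seconds give a rate of 2.0 per second, which is exactly the default observation threshold listed later in this description.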

Sampling is activity-driven rather than always-on. DI calls into the monitor around capture/snapshot/upload-relevant work, and the monitor refreshes only when the last sample is stale (default min interval 1s). If DI is idle, the monitor does not continuously sample memory.

Time is read from Environment.TickCount64 on .NET Core 3.0+ (a monotonic Stopwatch-based fallback is used on net461/netstandard2.0), so the hot RefreshIfStale() fast path is a couple of volatile reads with no clock abstraction.
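
A minimal Python sketch of the activity-driven staleness check described above. It is illustrative only: the class name, the injectable clock, and the sample counter are hypothetical, and the real fast path is C# using volatile reads rather than plain attribute access.

```python
import time


class StaleGate:
    """Samples only when called by activity AND the last sample is old enough."""

    def __init__(self, min_interval_ms=1000, clock_ms=None):
        self._min_interval_ms = min_interval_ms
        # Monotonic clock by default; tests can inject a fake clock.
        self._clock_ms = clock_ms or (lambda: time.monotonic() * 1000.0)
        self._last_sample_ms = None
        self.samples = 0

    def refresh_if_stale(self) -> bool:
        now = self._clock_ms()
        fresh = (
            self._last_sample_ms is not None
            and now - self._last_sample_ms < self._min_interval_ms
        )
        if fresh:
            return False  # fast path: nothing to do
        self._last_sample_ms = now
        self.samples += 1  # stand-in for reading memory and GC signals
        return True
```

Because nothing calls refresh_if_stale() while DI is idle, no sampling happens in that case; there is no timer in the sketch at all.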

If neither memory nor GC signals are available on the host, or a provider throws, the monitor permanently disables itself (it never samples again), writes a single log line (debug for "no signals", error for an exception), and records a one-shot disabled counter tagged by reason so a silent zero in the transition metrics is explainable rather than ambiguous.
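
The self-disable rule can be sketched as follows. This is a hedged Python illustration, not the monitor's C# implementation: the class name, the reader callbacks, and the emit_disabled callback are hypothetical, and None is assumed to mean "signal unavailable".

```python
import logging

logger = logging.getLogger("memory_pressure_sketch")


class SelfDisablingSampler:
    """Disables itself permanently on missing signals or a provider error."""

    def __init__(self, read_memory, read_gen2, emit_disabled):
        self._read_memory = read_memory
        self._read_gen2 = read_gen2
        self._emit_disabled = emit_disabled
        self.disabled_reason = None

    def sample(self):
        if self.disabled_reason is not None:
            return None  # never samples again once disabled
        try:
            memory = self._read_memory()
            gen2 = self._read_gen2()
        except Exception:
            logger.error("memory pressure provider threw; disabling monitor")
            self._disable("error")
            return None
        if memory is None and gen2 is None:
            logger.debug("no memory or GC signal available; disabling monitor")
            self._disable("no_signals")
            return None
        return memory, gen2

    def _disable(self, reason):
        self.disabled_reason = reason
        self._emit_disabled(reason)  # one-shot counter tagged by reason
```

The early return at the top of sample() is what makes the disabled counter one-shot: after the first failure neither the providers nor the counter are touched again.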

Telemetry is emitted on transitions only (enter high pressure / exit high pressure):

  • Transition count tagged by state (enter/exit) and trigger (memory/gen2/both on enter; none on exit) - i.e. which signal drove entry.
  • Severity values on transition: memory usage percent and Gen2 collections/sec (tagged by state).
  • Duration of the high-pressure period, recorded on exit.

No high-cardinality tags such as probe ID, service name, file path, exception type, or method name are added.
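
To make the transition-only emission concrete, here is a small Python sketch of how the state and trigger tags could be derived. The function names are hypothetical, and the sketch assumes the two boolean inputs already reflect threshold and hysteresis decisions made elsewhere.

```python
def classify_trigger(memory_high: bool, gen2_high: bool) -> str:
    """Which signal drove entry into high pressure."""
    if memory_high and gen2_high:
        return "both"
    if memory_high:
        return "memory"
    if gen2_high:
        return "gen2"
    return "none"


def transition_tags(was_high: bool, memory_high: bool, gen2_high: bool):
    """Return tags for a transition, or None when the state is unchanged."""
    is_high = memory_high or gen2_high
    if is_high == was_high:
        return None  # no transition, so no telemetry
    if is_high:
        return {"state": "enter", "trigger": classify_trigger(memory_high, gen2_high)}
    return {"state": "exit", "trigger": "none"}
```

Staying in high pressure across many samples yields None every time, which is why a long episode costs exactly two transition points: one on enter and one on exit.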

flowchart TD
    diStart["DynamicInstrumentation starts"] --> createMonitor["Create MemoryPressureMonitor"]
    createMonitor --> idleMonitor["No background sampling while idle"]

    probeActivity["DI capture/snapshot/upload activity"] --> refreshIfStale["RefreshIfStale (min interval)"]
    refreshIfStale --> signalsAvailable{"Memory or GC signal available?"}
    signalsAvailable -->|"No / provider error"| disable["Disable monitor (logged) + disabled{reason}"]
    signalsAvailable -->|"Yes"| sampleSignals["Sample memory load and Gen2/sec"]
    sampleSignals --> pressureDecision{"High pressure transition?"}

    pressureDecision -->|"ENTER"| enterTelemetry["transitions{state:enter, trigger:*} + severity"]
    pressureDecision -->|"EXIT"| exitTelemetry["transitions{state:exit, trigger:none} + severity + duration"]
    pressureDecision -->|"No transition"| noOp["No telemetry emission"]

    enterTelemetry --> observeOnly["No DI behavior change"]
    exitTelemetry --> observeOnly
    noOp --> observeOnly
    disable --> observeOnly

    diStart --> dispose["DynamicInstrumentation.Dispose"]
    dispose --> stopMonitor["Dispose monitor (lock-free)"]

Risk is kept low by avoiding per-second telemetry gauges, background sampling while idle, probe-level cardinality, and allocation on hot paths. Refreshes are guarded by a non-blocking single-writer CAS so concurrent DI activity does not run overlapping samples.
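
The non-blocking single-writer guard can be illustrated in Python. Python has no direct compare-and-swap primitive, so this sketch substitutes a non-blocking lock acquire, which gives the same "first caller wins, others skip" behavior; the class and method names are hypothetical.

```python
import threading


class SingleWriterRefresh:
    """Lets one caller sample at a time; concurrent callers skip, never wait."""

    def __init__(self, sample):
        self._sample = sample
        # Stand-in for the CAS flag: acquire(blocking=False) either wins or fails fast.
        self._busy = threading.Lock()

    def try_refresh(self) -> bool:
        if not self._busy.acquire(blocking=False):
            return False  # another refresh is in flight; do not overlap
        try:
            self._sample()
            return True
        finally:
            self._busy.release()
```

The important property is that a losing caller returns immediately instead of queuing, so DI hot paths never block behind a sample.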

Metrics implementation

All metrics are instrumentation-telemetry metrics, defined as common metrics (isCommon: true) under the live_debugger namespace. They surface in the backend as dd.instrumentation_telemetry_data.live_debugger.memory_pressure.*. Tags are fixed, low-cardinality enums.

| Metric | Type | Tags | Emitted |
| --- | --- | --- | --- |
| memory_pressure.transitions | count | state = enter / exit, trigger = none / memory / gen2 / both | Once per high-pressure state change. trigger is the entry cause on enter; none on exit. |
| memory_pressure.disabled | count | reason = no_signals / error | Once, when the monitor permanently disables itself (no signals available, or a provider threw). |
| memory_pressure.memory_usage_pct | count | state = enter / exit, bucket = lt_70 / 70_80 / 80_85 / 85_90 / gte_90 | On each transition - count bucketed by memory load percent at that moment. |
| memory_pressure.gen2_per_sec | count | state = enter / exit, bucket = lt_1 / 1_2 / 2_5 / gte_5 | On each transition - count bucketed by Gen2 collections/sec at that moment. |
| memory_pressure.duration_ms | count | bucket = lt_1s / 1_5s / 5_30s / gte_30s | On exit - count bucketed by length of the high-pressure episode. |
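
The bucket tags above map a raw value onto a fixed label. A minimal Python sketch of that mapping follows; the helper names are hypothetical, and treating each lower bound as inclusive (so 85 falls in 85_90) is an assumption, since the description does not state which side of a boundary a value lands on.

```python
def _bucket(value, edges, labels):
    """Pick the label for value; labels has one more entry than edges.

    Lower bounds are treated as inclusive (an assumption of this sketch).
    """
    for edge, label in zip(edges, labels):
        if value < edge:
            return label
    return labels[-1]


def memory_usage_bucket(percent: float) -> str:
    return _bucket(percent, (70, 80, 85, 90),
                   ("lt_70", "70_80", "80_85", "85_90", "gte_90"))


def gen2_bucket(per_sec: float) -> str:
    return _bucket(per_sec, (1, 2, 5), ("lt_1", "1_2", "2_5", "gte_5"))


def duration_bucket(duration_ms: float) -> str:
    return _bucket(duration_ms, (1000, 5000, 30000),
                   ("lt_1s", "1_5s", "5_30s", "gte_30s"))
```

Because the label set is closed, any value, however extreme, lands in one of the listed buckets and cannot create a new series.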

Runtime/platform segmentation (runtime name, OS, architecture, tracer version) is not added as per-metric tags; it comes from the telemetry payload's application/host metadata and is applied at query time in the backend. This keeps series count bounded while still allowing per-runtime analysis.

The severity and duration metrics use normal telemetry count metrics with a fixed bucket tag instead of distributions. Distribution telemetry is not supported for this internal telemetry path, and bucketed counts preserve the decision-making signal with bounded cardinality.

The trigger tag lives only on the transition count, not on the severity bucket counts: if trigger:memory, the memory bucket already captures how severe the memory signal was, and splitting every bucket by trigger would mostly restate the transition breakdown with extra series. The transition count's trigger plus the state-tagged bucket counts answer "what drove it" and "how severe / how long".
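
The cardinality argument can be checked with a short calculation. This Python sketch multiplies the tag value counts from the metrics table to get an upper bound on series per application; it assumes every tag combination can occur, which overstates the real count because exits always carry trigger none.

```python
from math import prod

# Tag values per metric, taken from the metrics table in this description.
TAG_VALUES = {
    "memory_pressure.transitions": {
        "state": ("enter", "exit"),
        "trigger": ("none", "memory", "gen2", "both"),
    },
    "memory_pressure.disabled": {"reason": ("no_signals", "error")},
    "memory_pressure.memory_usage_pct": {
        "state": ("enter", "exit"),
        "bucket": ("lt_70", "70_80", "80_85", "85_90", "gte_90"),
    },
    "memory_pressure.gen2_per_sec": {
        "state": ("enter", "exit"),
        "bucket": ("lt_1", "1_2", "2_5", "gte_5"),
    },
    "memory_pressure.duration_ms": {"bucket": ("lt_1s", "1_5s", "5_30s", "gte_30s")},
}


def max_series(tags) -> int:
    """Upper bound on series for one metric: product of tag value counts."""
    return prod(len(values) for values in tags.values())


PER_METRIC = {name: max_series(tags) for name, tags in TAG_VALUES.items()}
TOTAL = sum(PER_METRIC.values())  # 8 + 2 + 10 + 8 + 4
```

Under these assumptions the bound is 32 series. Adding the four-valued trigger tag to the two severity metrics would raise them from 10 and 8 to 40 and 32, which is the extra series cost the paragraph above avoids.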

Dashboards and how we will use the data

The goal is a single dashboard that answers, per runtime/OS, whether DI should ever gate capture under memory pressure and, if so, on which signal and at what threshold.

| Widget | Source metric | Question | Decision it informs |
| --- | --- | --- | --- |
| Coverage timeseries / count by reason | memory_pressure.disabled | Where does the monitor not run at all? | Defines the observed population - distinguishes "no pressure" from "no data". |
| Enter rate timeseries, split by trigger | memory_pressure.transitions{state:enter} | How often does high pressure happen, and is it memory- or GC-driven? | Whether gating is worth building at all, and which signal to gate on. |
| Trigger breakdown (top-list/pie) | memory_pressure.transitions{state:enter} | memory vs gen2 vs both mix | Which threshold actually matters. |
| Memory severity bucket breakdown | memory_pressure.memory_usage_pct{state:enter} | How deep is memory pressure at entry? | Where to set a memory gating threshold. |
| GC severity bucket breakdown | memory_pressure.gen2_per_sec{state:enter} | How hot is GC at entry? | Where to set a gen2 gating threshold. |
| Episode duration bucket breakdown | memory_pressure.duration_ms | How long do episodes last? | Whether gating needs hysteresis/cooldown or episodes are too short to act on. |

Every widget can be grouped by the dimensions the telemetry intake stamps onto each series from the payload's application/host metadata.

How the data is gathered:

  1. These metrics flow through the existing tracer telemetry pipeline (generate-metrics payloads) to the Datadog backend - no new transport.
  2. Because emission is transition-only and activity-driven, idle apps cost nothing and noisy apps cannot spam per-second points.
  3. We collect across runtimes for a few weeks, then read the dashboard top-down: memory_pressure.disabled (coverage) → memory_pressure.transitions{state:enter} by trigger (frequency + cause) → severity bucket breakdowns (thresholds) → duration (persistence).
  4. The resulting numbers define concrete gating thresholds (or show that gating is unnecessary) for a follow-up PR.

Scope: what this telemetry answers (and what it doesn't)

This is a fleet-aggregate, pre-decision dataset: it tells us whether high memory pressure occurs during DI work often, severely, and long enough to justify gating capture, and on which signal at what threshold.
It is deliberately not a tool for measuring the impact of a future gate.

Test coverage

Added/updated tests cover:

  • Memory threshold enter/exit behavior and hysteresis.
  • Gen2/sec threshold and rate behavior.
  • Consecutive high/low cycle (debounce) behavior.
  • Unavailable memory/GC signals → self-disable with reason:no_signals, no further sampling.
  • Provider exception handling → self-disable with reason:error, no crash.
  • Refresh overlap handling (non-blocking single writer) and dispose during an in-flight refresh.
  • Dispose/lifecycle behavior (no further sampling or emission after dispose).
  • Activity-driven stale refresh (no sampling while idle; samples only when stale).
  • Transition telemetry: enter emitted once, exit emitted once, severity recorded on transition, duration recorded on exit, no repeat while state is unchanged.
  • trigger classification: memory when only memory drives entry, gen2 when only GC drives entry, both when memory and GC cross together, and none on the matching exit.
  • Extreme/edge configuration values are clamped to safe minimums.
  • Windows GlobalMemoryStatusEx interop returns valid values.
  • DI integration: the monitor is sampled only through the dedicated hook and is disposed by DynamicInstrumentation.
  • Telemetry aggregation: transitions tagged by state+trigger, the disabled counter, and the three bucketed count metrics aggregate into the expected series/tags.

Other details

Initial thresholds are observation thresholds, not throttling thresholds:

  • Memory pressure threshold: 0.85.
  • Memory exit threshold with default hysteresis: 0.80.
  • Gen2 threshold: 2 collections/sec.
  • Gen2 exit threshold with default hysteresis: 1 collection/sec.

These values are intentionally conservative starting points for data collection and can be tuned after production telemetry is available.
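
The enter and exit thresholds above form a hysteresis band. The following Python sketch shows one way such a band can behave; it is not the monitor's C# logic. The class names are hypothetical, and three details are assumptions: entry uses "at or above", exit uses "strictly below", and exit requires every available signal to be below its exit threshold. The consecutive-sample debounce covered by the tests is omitted.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Thresholds:
    memory_enter: float = 0.85
    memory_exit: float = 0.80
    gen2_enter: float = 2.0
    gen2_exit: float = 1.0


class HysteresisState:
    def __init__(self, thresholds: Optional[Thresholds] = None):
        self.t = thresholds or Thresholds()
        self.high = False

    def update(self, memory_ratio: Optional[float], gen2_rate: Optional[float]) -> bool:
        """Feed one sample; None means that signal is unavailable."""
        t = self.t
        if not self.high:
            # Enter when either available signal reaches its enter threshold.
            self.high = (
                (memory_ratio is not None and memory_ratio >= t.memory_enter)
                or (gen2_rate is not None and gen2_rate >= t.gen2_enter)
            )
        else:
            # Stay high until all available signals drop below their exit thresholds.
            memory_ok = memory_ratio is None or memory_ratio < t.memory_exit
            gen2_ok = gen2_rate is None or gen2_rate < t.gen2_exit
            self.high = not (memory_ok and gen2_ok)
        return self.high
```

With the defaults, a memory ratio that climbs to 0.86 and then hovers at 0.82 stays in the high state, so a value oscillating around 0.85 does not generate a stream of enter and exit events.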

This PR intentionally does not include an explicit/manual pressure event signal. There is no concrete production caller or well-defined meaning for that signal yet. If a future DI path has a specific condition, such as snapshot capture exceeding a memory budget, it should add a named API and metric for that scenario.

Any future DI throttling or blocking based on memory pressure should be done in a separate PR after reviewing the collected telemetry.

Note

Incidental DynamicInstrumentation lifecycle hardening (not memory-pressure specific): Wiring the new monitor in as a disposable owned by DynamicInstrumentation exposed pre-existing races in the start/subscribe/dispose paths. DynamicInstrumentation is disposed at runtime via RCM, not just at shutdown. Since the monitor's lifecycle is tied to those paths, we hardened them here.

dudikeleti changed the title from "Introduce MemoryPressureMonitor to detect high memory pressure" to "[Debugger] Add memory pressure monitoring telemetry for Dynamic Instrumentation (observe-only)" on May 28, 2026
dudikeleti requested a review from Copilot on May 29, 2026 00:25

Copilot AI left a comment

Pull request overview

This PR adds an observe-only memory pressure monitor for Dynamic Instrumentation (DI) and emits low-cardinality telemetry on high-pressure transitions (enter/exit), plus a one-shot “disabled” counter when signals are unavailable or the monitor errors.

Changes:

  • Introduces MemoryPressureMonitor + supporting system/GC signal readers (incl. Windows interop) and wires it into DI via a dedicated refresh hook.
  • Adds new telemetry metric definitions (counts + shared distributions) and regenerates telemetry collector code for all TFMs.
  • Adds focused unit/integration tests validating monitor behavior, transitions, and telemetry aggregation.

Reviewed changes

Copilot reviewed 14 out of 54 changed files in this pull request and generated 4 comments.

File Description
tracer/src/Datadog.Trace/Debugger/RateLimiting/MemoryPressureMonitor.cs New monitor implementation + transition/disable telemetry emission
tracer/src/Datadog.Trace/Debugger/RateLimiting/MemoryPressureConfig.cs Configuration defaults and thresholds/hysteresis knobs
tracer/src/Datadog.Trace/Debugger/RateLimiting/SystemMemorySource.cs Production memory/GC signal providers
tracer/src/Datadog.Trace/Debugger/RateLimiting/WindowsMemoryInfo.cs Windows GlobalMemoryStatusEx interop helper
tracer/src/Datadog.Trace/Debugger/DynamicInstrumentation.cs Owns/disposes monitor and exposes dedicated refresh hook
tracer/src/Datadog.Trace/Debugger/Expressions/ProbeProcessor.cs Calls the dedicated memory-pressure refresh hook around capture-related activity
tracer/src/Datadog.Trace/Debugger/DebuggerFactory.cs Constructs and injects the monitor into DI
tracer/src/Datadog.Trace/Debugger/Caching/DefaultMemoryChecker.cs Reuses new Windows memory helper instead of local interop copy
tracer/src/Datadog.Trace/Telemetry/Metrics/MetricTags.cs Adds low-cardinality tag enums for debugger memory-pressure metrics
tracer/src/Datadog.Trace/Telemetry/Metrics/Count.cs Adds count metrics for transitions + disabled reasons
tracer/src/Datadog.Trace/Telemetry/Metrics/DistributionShared.cs Adds shared distributions for severity + duration
tracer/src/Datadog.Trace/Generated/netstandard2.0/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/NullMetricsTelemetryCollector_DistributionShared.g.cs Regenerated: null-collector stubs for new shared distributions
tracer/src/Datadog.Trace/Generated/netstandard2.0/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/NullMetricsTelemetryCollector_Count.g.cs Regenerated: null-collector stubs for new counts
tracer/src/Datadog.Trace/Generated/netstandard2.0/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/MetricsTelemetryCollector_DistributionShared.g.cs Regenerated: buffers + recorders for new shared distributions
tracer/src/Datadog.Trace/Generated/netstandard2.0/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/MetricsTelemetryCollector_Count.g.cs Regenerated: buffers + recorders for new counts
tracer/src/Datadog.Trace/Generated/netstandard2.0/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/IMetricsTelemetryCollector_DistributionShared.g.cs Regenerated: interface methods for new shared distributions
tracer/src/Datadog.Trace/Generated/netstandard2.0/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/IMetricsTelemetryCollector_Count.g.cs Regenerated: interface methods for new counts
tracer/src/Datadog.Trace/Generated/netstandard2.0/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/DistributionSharedExtensions.g.cs Regenerated: names/common-flag/namespaces for new shared distributions
tracer/src/Datadog.Trace/Generated/netstandard2.0/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/CountExtensions.g.cs Regenerated: names/common-flag/namespaces for new counts
tracer/src/Datadog.Trace/Generated/netstandard2.0/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/CiVisibilityMetricsTelemetryCollector_DistributionShared.g.cs Regenerated: CI visibility collector distribution buffers updated
tracer/src/Datadog.Trace/Generated/netstandard2.0/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/CiVisibilityMetricsTelemetryCollector_Count.g.cs Regenerated: CI visibility collector count buffers updated
tracer/src/Datadog.Trace/Generated/netcoreapp3.1/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/NullMetricsTelemetryCollector_DistributionShared.g.cs Regenerated: null-collector stubs for new shared distributions
tracer/src/Datadog.Trace/Generated/netcoreapp3.1/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/NullMetricsTelemetryCollector_Count.g.cs Regenerated: null-collector stubs for new counts
tracer/src/Datadog.Trace/Generated/netcoreapp3.1/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/MetricsTelemetryCollector_DistributionShared.g.cs Regenerated: buffers + recorders for new shared distributions
tracer/src/Datadog.Trace/Generated/netcoreapp3.1/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/MetricsTelemetryCollector_Count.g.cs Regenerated: buffers + recorders for new counts
tracer/src/Datadog.Trace/Generated/netcoreapp3.1/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/IMetricsTelemetryCollector_DistributionShared.g.cs Regenerated: interface methods for new shared distributions
tracer/src/Datadog.Trace/Generated/netcoreapp3.1/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/IMetricsTelemetryCollector_Count.g.cs Regenerated: interface methods for new counts
tracer/src/Datadog.Trace/Generated/netcoreapp3.1/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/DistributionSharedExtensions.g.cs Regenerated: names/common-flag/namespaces for new shared distributions
tracer/src/Datadog.Trace/Generated/netcoreapp3.1/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/CountExtensions.g.cs Regenerated: names/common-flag/namespaces for new counts
tracer/src/Datadog.Trace/Generated/netcoreapp3.1/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/CiVisibilityMetricsTelemetryCollector_DistributionShared.g.cs Regenerated: CI visibility collector distribution buffers updated
tracer/src/Datadog.Trace/Generated/netcoreapp3.1/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/CiVisibilityMetricsTelemetryCollector_Count.g.cs Regenerated: CI visibility collector count buffers updated
tracer/src/Datadog.Trace/Generated/net6.0/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/NullMetricsTelemetryCollector_DistributionShared.g.cs Regenerated: null-collector stubs for new shared distributions
tracer/src/Datadog.Trace/Generated/net6.0/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/NullMetricsTelemetryCollector_Count.g.cs Regenerated: null-collector stubs for new counts
tracer/src/Datadog.Trace/Generated/net6.0/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/MetricsTelemetryCollector_DistributionShared.g.cs Regenerated: buffers + recorders for new shared distributions
tracer/src/Datadog.Trace/Generated/net6.0/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/MetricsTelemetryCollector_Count.g.cs Regenerated: buffers + recorders for new counts
tracer/src/Datadog.Trace/Generated/net6.0/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/IMetricsTelemetryCollector_DistributionShared.g.cs Regenerated: interface methods for new shared distributions
tracer/src/Datadog.Trace/Generated/net6.0/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/IMetricsTelemetryCollector_Count.g.cs Regenerated: interface methods for new counts
tracer/src/Datadog.Trace/Generated/net6.0/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/DistributionSharedExtensions.g.cs Regenerated: names/common-flag/namespaces for new shared distributions
tracer/src/Datadog.Trace/Generated/net6.0/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/CountExtensions.g.cs Regenerated: names/common-flag/namespaces for new counts
tracer/src/Datadog.Trace/Generated/net6.0/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/CiVisibilityMetricsTelemetryCollector_DistributionShared.g.cs Regenerated: CI visibility collector distribution buffers updated
tracer/src/Datadog.Trace/Generated/net6.0/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/CiVisibilityMetricsTelemetryCollector_Count.g.cs Regenerated: CI visibility collector count buffers updated
tracer/src/Datadog.Trace/Generated/net461/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/NullMetricsTelemetryCollector_DistributionShared.g.cs Regenerated: null-collector stubs for new shared distributions
tracer/src/Datadog.Trace/Generated/net461/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/NullMetricsTelemetryCollector_Count.g.cs Regenerated: null-collector stubs for new counts
tracer/src/Datadog.Trace/Generated/net461/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/MetricsTelemetryCollector_DistributionShared.g.cs Regenerated: buffers + recorders for new shared distributions
tracer/src/Datadog.Trace/Generated/net461/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/MetricsTelemetryCollector_Count.g.cs Regenerated: buffers + recorders for new counts
tracer/src/Datadog.Trace/Generated/net461/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/IMetricsTelemetryCollector_DistributionShared.g.cs Regenerated: interface methods for new shared distributions
tracer/src/Datadog.Trace/Generated/net461/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/IMetricsTelemetryCollector_Count.g.cs Regenerated: interface methods for new counts
tracer/src/Datadog.Trace/Generated/net461/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/DistributionSharedExtensions.g.cs Regenerated: names/common-flag/namespaces for new shared distributions
tracer/src/Datadog.Trace/Generated/net461/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/CountExtensions.g.cs Regenerated: names/common-flag/namespaces for new counts
tracer/src/Datadog.Trace/Generated/net461/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/CiVisibilityMetricsTelemetryCollector_DistributionShared.g.cs Regenerated: CI visibility collector distribution buffers updated
tracer/src/Datadog.Trace/Generated/net461/Datadog.Trace.SourceGenerators/TelemetryMetricGenerator/CiVisibilityMetricsTelemetryCollector_Count.g.cs Regenerated: CI visibility collector count buffers updated
tracer/test/Datadog.Trace.Tests/Debugger/RateLimiting/MemoryPressureMonitorTests.cs New unit tests for thresholds, hysteresis, disable paths, and concurrency behavior
tracer/test/Datadog.Trace.Tests/Debugger/DynamicInstrumentationTests.cs Integration tests for refresh hook usage + monitor disposal
tracer/test/Datadog.Trace.Tests/Telemetry/Collectors/MetricsTelemetryCollectorTests.cs Adds aggregation test for new memory-pressure metrics

Comment thread tracer/src/Datadog.Trace/Debugger/Expressions/ProbeProcessor.cs
Comment thread tracer/src/Datadog.Trace/Debugger/RateLimiting/MemoryPressureMonitor.cs Outdated
Comment thread tracer/src/Datadog.Trace/Debugger/RateLimiting/SystemMemorySource.cs Outdated
Comment thread tracer/src/Datadog.Trace/Debugger/RateLimiting/SystemMemorySource.cs Outdated

Copilot AI left a comment

Pull request overview

Copilot reviewed 14 out of 54 changed files in this pull request and generated 2 comments.

Comment thread tracer/src/Datadog.Trace/Debugger/RateLimiting/MemoryPressureMonitor.cs Outdated
dd-trace-dotnet-ci-bot (Bot) commented May 29, 2026

Execution-Time Benchmarks Report ⏱️

Execution-time results for samples comparing This PR (7834) and master.

✅ No regressions detected - check the details below

Full Metrics Comparison

FakeDbCommand

| Metric | Master (Mean ± 95% CI) | Current (Mean ± 95% CI) | Change | Status |
| --- | --- | --- | --- | --- |
| **.NET Framework 4.8 - Baseline** | | | | |
| duration | 76.09 ± (75.90 - 76.46) ms | 75.83 ± (75.68 - 76.27) ms | -0.3% | |
| **.NET Framework 4.8 - Bailout** | | | | |
| duration | 78.52 ± (78.32 - 78.80) ms | 78.36 ± (78.35 - 78.89) ms | -0.2% | |
| **.NET Framework 4.8 - CallTarget+Inlining+NGEN** | | | | |
| duration | 1105.77 ± (1105.04 - 1112.66) ms | 1106.19 ± (1107.63 - 1114.94) ms | +0.0% | ✅⬆️ |
| **.NET Core 3.1 - Baseline** | | | | |
| process.internal_duration_ms | 22.52 ± (22.47 - 22.58) ms | 22.75 ± (22.68 - 22.81) ms | +1.0% | ✅⬆️ |
| process.time_to_main_ms | 85.10 ± (84.80 - 85.40) ms | 85.66 ± (85.36 - 85.95) ms | +0.7% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% | |
| runtime.dotnet.mem.committed | 10.91 ± (10.91 - 10.92) MB | 10.92 ± (10.92 - 10.93) MB | +0.1% | ✅⬆️ |
| runtime.dotnet.threads.count | 12 ± (12 - 12) | 12 ± (12 - 12) | +0.0% | |
| **.NET Core 3.1 - Bailout** | | | | |
| process.internal_duration_ms | 22.39 ± (22.34 - 22.43) ms | 22.27 ± (22.23 - 22.31) ms | -0.5% | |
| process.time_to_main_ms | 86.15 ± (85.94 - 86.37) ms | 84.90 ± (84.69 - 85.10) ms | -1.5% | |
| runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% | |
| runtime.dotnet.mem.committed | 10.95 ± (10.94 - 10.95) MB | 10.95 ± (10.95 - 10.96) MB | +0.0% | ✅⬆️ |
| runtime.dotnet.threads.count | 13 ± (13 - 13) | 13 ± (13 - 13) | +0.0% | |
| **.NET Core 3.1 - CallTarget+Inlining+NGEN** | | | | |
| process.internal_duration_ms | 214.93 ± (214.00 - 215.85) ms | 212.40 ± (211.56 - 213.25) ms | -1.2% | |
| process.time_to_main_ms | 544.18 ± (542.88 - 545.49) ms | 539.75 ± (538.48 - 541.02) ms | -0.8% | |
| runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% | |
| runtime.dotnet.mem.committed | 48.20 ± (48.16 - 48.25) MB | 48.37 ± (48.34 - 48.40) MB | +0.3% | ✅⬆️ |
| runtime.dotnet.threads.count | 28 ± (28 - 28) | 28 ± (28 - 28) | +0.3% | ✅⬆️ |
| **.NET 6 - Baseline** | | | | |
| process.internal_duration_ms | 21.27 ± (21.23 - 21.31) ms | 21.64 ± (21.58 - 21.69) ms | +1.7% | ✅⬆️ |
| process.time_to_main_ms | 73.40 ± (73.24 - 73.57) ms | 76.41 ± (76.10 - 76.73) ms | +4.1% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% | |
| runtime.dotnet.mem.committed | 10.63 ± (10.63 - 10.63) MB | 10.63 ± (10.63 - 10.64) MB | +0.0% | ✅⬆️ |
| runtime.dotnet.threads.count | 10 ± (10 - 10) | 10 ± (10 - 10) | +0.0% | |
| **.NET 6 - Bailout** | | | | |
| process.internal_duration_ms | 21.82 ± (21.76 - 21.87) ms | 21.10 ± (21.06 - 21.14) ms | -3.3% | |
| process.time_to_main_ms | 78.27 ± (78.02 - 78.52) ms | 75.11 ± (74.93 - 75.29) ms | -4.0% | |
| runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% | |
| runtime.dotnet.mem.committed | 10.74 ± (10.74 - 10.75) MB | 10.74 ± (10.74 - 10.74) MB | -0.0% | |
| runtime.dotnet.threads.count | 11 ± (11 - 11) | 11 ± (11 - 11) | +0.0% | |
| **.NET 6 - CallTarget+Inlining+NGEN** | | | | |
| process.internal_duration_ms | 368.48 ± (366.34 - 370.61) ms | 368.57 ± (366.58 - 370.55) ms | +0.0% | ✅⬆️ |
| process.time_to_main_ms | 549.44 ± (547.97 - 550.91) ms | 554.14 ± (552.73 - 555.56) ms | +0.9% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% | |
| runtime.dotnet.mem.committed | 49.78 ± (49.76 - 49.80) MB | 49.79 ± (49.77 - 49.81) MB | +0.0% | ✅⬆️ |
| runtime.dotnet.threads.count | 28 ± (28 - 28) | 28 ± (28 - 28) | +0.0% | ✅⬆️ |
| **.NET 8 - Baseline** | | | | |
| process.internal_duration_ms | 19.43 ± (19.40 - 19.47) ms | 19.62 ± (19.56 - 19.67) ms | +1.0% | ✅⬆️ |
| process.time_to_main_ms | 72.96 ± (72.80 - 73.12) ms | 74.31 ± (74.00 - 74.62) ms | +1.9% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% | |
| runtime.dotnet.mem.committed | 7.66 ± (7.66 - 7.67) MB | 7.67 ± (7.67 - 7.67) MB | +0.1% | ✅⬆️ |
| runtime.dotnet.threads.count | 10 ± (10 - 10) | 10 ± (10 - 10) | +0.0% | |
| **.NET 8 - Bailout** | | | | |
| process.internal_duration_ms | 19.36 ± (19.33 - 19.39) ms | 19.62 ± (19.56 - 19.67) ms | +1.3% | ✅⬆️ |
| process.time_to_main_ms | 73.75 ± (73.57 - 73.94) ms | 76.30 ± (76.00 - 76.59) ms | +3.5% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% | |
| runtime.dotnet.mem.committed | 7.71 ± (7.71 - 7.72) MB | 7.73 ± (7.72 - 7.73) MB | +0.2% | ✅⬆️ |
| runtime.dotnet.threads.count | 11 ± (11 - 11) | 11 ± (11 - 11) | +0.0% | |
| **.NET 8 - CallTarget+Inlining+NGEN** | | | | |
| process.internal_duration_ms | 293.43 ± (291.42 - 295.44) ms | 295.99 ± (293.70 - 298.28) ms | +0.9% | ✅⬆️ |
| process.time_to_main_ms | 502.11 ± (500.92 - 503.31) ms | 500.04 ± (498.81 - 501.27) ms | -0.4% | |
| runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% | |
| runtime.dotnet.mem.committed | 36.81 ± (36.78 - 36.84) MB | 36.85 ± (36.82 - 36.88) MB | +0.1% | ✅⬆️ |
| runtime.dotnet.threads.count | 27 ± (27 - 27) | 27 ± (27 - 27) | +0.0% | |

HttpMessageHandler

| Metric | Master (Mean ± 95% CI) | Current (Mean ± 95% CI) | Change | Status |
| --- | --- | --- | --- | --- |
| **.NET Framework 4.8 - Baseline** | | | | |
| duration | 200.89 ± (200.87 - 201.63) ms | 200.62 ± (200.14 - 201.02) ms | -0.1% | |
| **.NET Framework 4.8 - Bailout** | | | | |
| duration | 204.05 ± (203.56 - 204.42) ms | 204.08 ± (203.66 - 204.51) ms | +0.0% | ✅⬆️ |
| **.NET Framework 4.8 - CallTarget+Inlining+NGEN** | | | | |
| duration | 1213.30 ± (1211.42 - 1217.19) ms | 1210.60 ± (1209.36 - 1216.61) ms | -0.2% | |
| **.NET Core 3.1 - Baseline** | | | | |
| process.internal_duration_ms | 196.78 ± (196.27 - 197.29) ms | 194.78 ± (194.38 - 195.18) ms | -1.0% | |
| process.time_to_main_ms | 85.56 ± (85.25 - 85.86) ms | 85.07 ± (84.78 - 85.36) ms | -0.6% | |
| runtime.dotnet.exceptions.count | 3 ± (3 - 3) | 3 ± (3 - 3) | +0.0% | |
| runtime.dotnet.mem.committed | 16.04 ± (16.01 - 16.06) MB | 16.11 ± (16.09 - 16.13) MB | +0.5% | ✅⬆️ |
| runtime.dotnet.threads.count | 20 ± (20 - 20) | 20 ± (20 - 20) | +0.3% | ✅⬆️ |
| **.NET Core 3.1 - Bailout** | | | | |
| process.internal_duration_ms | 195.40 ± (195.02 - 195.78) ms | 194.15 ± (193.75 - 194.55) ms | -0.6% | |
| process.time_to_main_ms | 86.62 ± (86.35 - 86.89) ms | 85.87 ± (85.61 - 86.13) ms | -0.9% | |
| runtime.dotnet.exceptions.count | 3 ± (3 - 3) | 3 ± (3 - 3) | +0.0% | |
| runtime.dotnet.mem.committed | 16.10 ± (16.08 - 16.13) MB | 16.13 ± (16.10 - 16.15) MB | +0.1% | ✅⬆️ |
| runtime.dotnet.threads.count | 21 ± (21 - 21) | 21 ± (20 - 21) | -0.7% | |
| **.NET Core 3.1 - CallTarget+Inlining+NGEN** | | | | |
| process.internal_duration_ms | 388.31 ± (386.98 - 389.64) ms | 387.76 ± (386.39 - 389.13) ms | -0.1% | |
| process.time_to_main_ms | 540.20 ± (538.94 - 541.46) ms | 540.25 ± (538.99 - 541.50) ms | +0.0% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 3 ± (3 - 3) | 3 ± (3 - 3) | +0.0% | |
| runtime.dotnet.mem.committed | 57.76 ± (57.55 - 57.98) MB | 57.87 ± (57.63 - 58.10) MB | +0.2% | ✅⬆️ |
| runtime.dotnet.threads.count | 30 ± (30 - 30) | 30 ± (30 - 30) | -0.3% | |
| **.NET 6 - Baseline** | | | | |
| process.internal_duration_ms | 198.71 ± (198.24 - 199.18) ms | 198.50 ± (198.07 - 198.93) ms | -0.1% | |
| process.time_to_main_ms | 73.00 ± (72.74 - 73.25) ms | 73.08 ± (72.80 - 73.35) ms | +0.1% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 4 ± (4 - 4) | 4 ± (4 - 4) | +0.0% | |
| runtime.dotnet.mem.committed | 16.34 ± (16.31 - 16.36) MB | 16.38 ± (16.36 - 16.40) MB | +0.3% | ✅⬆️ |
| runtime.dotnet.threads.count | 19 ± (19 - 19) | 19 ± (19 - 19) | -0.6% | |
| **.NET 6 - Bailout** | | | | |
| process.internal_duration_ms | 197.21 ± (196.75 - 197.67) ms | 196.64 ± (196.24 - 197.04) ms | -0.3% | |
| process.time_to_main_ms | 73.44 ± (73.23 - 73.66) ms | 73.90 ± (73.64 - 74.16) ms | +0.6% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 4 ± (4 - 4) | 4 ± (4 - 4) | +0.0% | |
| runtime.dotnet.mem.committed | 16.39 ± (16.32 - 16.47) MB | 16.43 ± (16.40 - 16.45) MB | +0.2% | ✅⬆️ |
| runtime.dotnet.threads.count | 20 ± (20 - 20) | 20 ± (20 - 20) | +1.2% | ✅⬆️ |
| **.NET 6 - CallTarget+Inlining+NGEN** | | | | |
| process.internal_duration_ms | 585.27 ± (582.89 - 587.66) ms | 586.63 ± (583.90 - 589.35) ms | +0.2% | ✅⬆️ |
| process.time_to_main_ms | 546.51 ± (545.58 - 547.45) ms | 547.48 ± (546.34 - 548.62) ms | +0.2% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 4 ± (4 - 4) | 4 ± (4 - 4) | +0.0% | |
| runtime.dotnet.mem.committed | 61.09 ± (61.00 - 61.18) MB | 61.07 ± (60.98 - 61.16) MB | -0.0% | |
| runtime.dotnet.threads.count | 31 ± (31 - 31) | 31 ± (31 - 31) | +0.3% | ✅⬆️ |
| **.NET 8 - Baseline** | | | | |
| process.internal_duration_ms | 197.00 ± (196.56 - 197.44) ms | 195.63 ± (195.18 - 196.08) ms | -0.7% | |
| process.time_to_main_ms | 72.70 ± (72.41 - 72.99) ms | 72.12 ± (71.90 - 72.34) ms | -0.8% | |
| runtime.dotnet.exceptions.count | 4 ± (4 - 4) | 4 ± (4 - 4) | +0.0% | |
| runtime.dotnet.mem.committed | 11.69 ± (11.66 - 11.72) MB | 11.71 ± (11.69 - 11.72) MB | +0.1% | ✅⬆️ |
| runtime.dotnet.threads.count | 18 ± (18 - 18) | 18 ± (18 - 18) | +0.1% | ✅⬆️ |
| **.NET 8 - Bailout** | | | | |
| process.internal_duration_ms | 196.15 ± (195.71 - 196.59) ms | 195.11 ± (194.60 - 195.63) ms | -0.5% | |
| process.time_to_main_ms | 73.59 ± (73.35 - 73.82) ms | 73.11 ± (72.90 - 73.32) ms | -0.7% | |
| runtime.dotnet.exceptions.count | 4 ± (4 - 4) | 4 ± (4 - 4) | +0.0% | |
| runtime.dotnet.mem.committed | 11.73 ± (11.71 - 11.76) MB | 11.77 ± (11.75 - 11.79) MB | +0.3% | ✅⬆️ |
| runtime.dotnet.threads.count | 19 ± (19 - 19) | 19 ± (19 - 19) | +0.0% | ✅⬆️ |
| **.NET 8 - CallTarget+Inlining+NGEN** | | | | |
| process.internal_duration_ms | 512.45 ± (509.93 - 514.96) ms | 515.06 ± (512.50 - 517.62) ms | +0.5% | ✅⬆️ |
| process.time_to_main_ms | 495.75 ± (495.00 - 496.50) ms | 496.40 ± (495.57 - 497.23) ms | +0.1% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 4 ± (4 - 4) | 4 ± (4 - 4) | +0.0% | |
| runtime.dotnet.mem.committed | 50.53 ± (50.49 - 50.57) MB | 50.55 ± (50.51 - 50.58) MB | +0.0% | ✅⬆️ |
| runtime.dotnet.threads.count | 29 ± (29 - 30) | 30 ± (30 - 30) | +0.8% | ✅⬆️ |

Comparison explanation

Execution-time benchmarks measure the whole time it takes to execute a program, and are intended to measure the one-off costs. Cases where the execution time results for the PR are worse than latest master results are highlighted in **red**. The following thresholds were used for comparing the execution times:

  • Welch's t-test with a significance level of 5%
  • Only results indicating a difference greater than 5% and 5 ms are considered.
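
The two thresholds above can be sketched in code. This is only an illustration: `ExecutionTimeFilter` and its members are hypothetical names, and taking the percentage relative to the master mean is an assumption, not something the benchmark tooling confirms. `WelchT` computes the standard Welch statistic and the Welch-Satterthwaite degrees of freedom; converting those to a p-value needs a t-distribution and is left out.

```csharp
using System;

// Hypothetical helper, not part of the benchmark tooling.
public static class ExecutionTimeFilter
{
    public const double MinRelativeDiff = 0.05;   // 5%
    public const double MinAbsoluteDiffMs = 5.0;  // 5 ms

    // True when the PR mean differs from the master mean by more than 5% AND more than 5 ms.
    // Assumption: the percentage is taken relative to the master mean.
    public static bool ExceedsThresholds(double masterMeanMs, double prMeanMs)
    {
        if (masterMeanMs <= 0) throw new ArgumentOutOfRangeException(nameof(masterMeanMs));
        double diffMs = Math.Abs(prMeanMs - masterMeanMs);
        return diffMs > MinAbsoluteDiffMs && diffMs / masterMeanMs > MinRelativeDiff;
    }

    // Welch's t statistic and degrees of freedom for two samples with unequal variances.
    public static (double T, double DegreesOfFreedom) WelchT(
        double mean1, double variance1, int n1,
        double mean2, double variance2, int n2)
    {
        if (n1 < 2 || n2 < 2) throw new ArgumentException("Each sample needs at least two observations.");
        double v1 = variance1 / n1;
        double v2 = variance2 / n2;
        if (v1 + v2 <= 0) throw new ArgumentException("At least one sample must have non-zero variance.");
        double t = (mean1 - mean2) / Math.Sqrt(v1 + v2);
        double df = (v1 + v2) * (v1 + v2) / (v1 * v1 / (n1 - 1) + v2 * v2 / (n2 - 1));
        return (t, df);
    }
}
```

Under these assumptions, means of 101 ms and 105 ms would not pass the filter, because the 4 ms gap is below both thresholds.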

Note that these results are based on a single point-in-time result for each branch. For full results, see the dashboard.

Graphs show the p99 interval based on the mean and StdDev of the test run, as well as the mean value of the run (shown as a diamond below the graph).

Duration charts
FakeDbCommand (.NET Framework 4.8)
gantt
    title Execution time (ms) FakeDbCommand (.NET Framework 4.8)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (7834) - mean (76ms)  : 71, 80
    master - mean (76ms)  : 72, 80

    section Bailout
    This PR (7834) - mean (79ms)  : 74, 83
    master - mean (79ms)  : 75, 82

    section CallTarget+Inlining+NGEN
    This PR (7834) - mean (1,111ms)  : 1059, 1164
    master - mean (1,109ms)  : 1054, 1164

FakeDbCommand (.NET Core 3.1)
gantt
    title Execution time (ms) FakeDbCommand (.NET Core 3.1)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (7834) - mean (116ms)  : 110, 121
    master - mean (115ms)  : 108, 121

    section Bailout
    This PR (7834) - mean (114ms)  : 111, 117
    master - mean (115ms)  : 112, 119

    section CallTarget+Inlining+NGEN
    This PR (7834) - mean (789ms)  : 759, 818
    master - mean (797ms)  : 773, 821

FakeDbCommand (.NET 6)
gantt
    title Execution time (ms) FakeDbCommand (.NET 6)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (7834) - mean (105ms)  : 98, 111
    master - mean (101ms)  : 98, 104

    section Bailout
    This PR (7834) - mean (103ms)  : 99, 106
    master - mean (107ms)  : 102, 111

    section CallTarget+Inlining+NGEN
    This PR (7834) - mean (952ms)  : 914, 990
    master - mean (950ms)  : 916, 983

FakeDbCommand (.NET 8)
gantt
    title Execution time (ms) FakeDbCommand (.NET 8)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (7834) - mean (102ms)  : 96, 108
    master - mean (100ms)  : 98, 103

    section Bailout
    This PR (7834) - mean (104ms)  : 99, 109
    master - mean (101ms)  : 99, 103

    section CallTarget+Inlining+NGEN
    This PR (7834) - mean (825ms)  : 787, 863
    master - mean (825ms)  : 788, 862

HttpMessageHandler (.NET Framework 4.8)
gantt
    title Execution time (ms) HttpMessageHandler (.NET Framework 4.8)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (7834) - mean (201ms)  : 196, 205
    master - mean (201ms)  : 197, 205

    section Bailout
    This PR (7834) - mean (204ms)  : 200, 208
    master - mean (204ms)  : 199, 209

    section CallTarget+Inlining+NGEN
    This PR (7834) - mean (1,213ms)  : 1160, 1266
    master - mean (1,214ms)  : 1173, 1256

HttpMessageHandler (.NET Core 3.1)
gantt
    title Execution time (ms) HttpMessageHandler (.NET Core 3.1)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (7834) - mean (289ms)  : 283, 295
    master - mean (292ms)  : 285, 299

    section Bailout
    This PR (7834) - mean (289ms)  : 284, 294
    master - mean (291ms)  : 285, 297

    section CallTarget+Inlining+NGEN
    This PR (7834) - mean (966ms)  : 944, 987
    master - mean (968ms)  : 947, 989

HttpMessageHandler (.NET 6)
gantt
    title Execution time (ms) HttpMessageHandler (.NET 6)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (7834) - mean (280ms)  : 273, 288
    master - mean (280ms)  : 274, 287

    section Bailout
    This PR (7834) - mean (279ms)  : 271, 286
    master - mean (279ms)  : 274, 284

    section CallTarget+Inlining+NGEN
    This PR (7834) - mean (1,165ms)  : 1132, 1197
    master - mean (1,163ms)  : 1120, 1207

HttpMessageHandler (.NET 8)
gantt
    title Execution time (ms) HttpMessageHandler (.NET 8)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (7834) - mean (278ms)  : 269, 286
    master - mean (280ms)  : 274, 286

    section Bailout
    This PR (7834) - mean (278ms)  : 272, 285
    master - mean (280ms)  : 273, 286

    section CallTarget+Inlining+NGEN
    This PR (7834) - mean (1,043ms)  : 997, 1088
    master - mean (1,042ms)  : 1000, 1083


@pr-commenter

pr-commenter Bot commented May 29, 2026


Benchmarks

Benchmark execution time: 2026-06-03 17:08:56

Comparing candidate commit e6b821c in PR branch dudik/cb-memory with baseline commit d8273e8 in branch master.

Found 0 performance improvements and 2 performance regressions! Performance is the same for 70 metrics, 0 unstable metrics, 62 known flaky benchmarks, 64 flaky benchmarks without significant changes.

Explanation

This is an A/B test comparing a candidate commit's performance against that of a baseline commit. Performance changes are noted in the tables below as:

  • 🟩 = significantly better candidate vs. baseline
  • 🟥 = significantly worse candidate vs. baseline

We compute a confidence interval (CI) over the relative difference of means between metrics from the candidate and baseline commits, considering the baseline as the reference.

If the CI is entirely outside the configured SIGNIFICANT_IMPACT_THRESHOLD (or the deprecated UNCONFIDENCE_THRESHOLD), the change is considered significant.

Feel free to reach out to #apm-benchmarking-platform on Slack if you have any questions.

More details about the CI and significant changes

You can imagine this CI as a range of values that is likely to contain the true difference of means between the candidate and baseline commits.

CIs of the difference of means are often centered around 0%, because often changes are not that big:

---------------------------------(------|---^--------)-------------------------------->
                              -0.6%    0%  0.3%     +1.2%
                                 |          |        |
         lower bound of the CI --'          |        |
sample mean (center of the CI) -------------'        |
         upper bound of the CI ----------------------'

As described above, a change is considered significant if the CI is entirely outside the configured SIGNIFICANT_IMPACT_THRESHOLD (or the deprecated UNCONFIDENCE_THRESHOLD).

For instance, for an execution time metric, this confidence interval indicates a significantly worse performance:

----------------------------------------|---------|---(---------^---------)---------->
                                       0%        1%  1.3%      2.2%      3.1%
                                                  |   |         |         |
       significant impact threshold --------------'   |         |         |
                      lower bound of CI --------------'         |         |
       sample mean (center of the CI) --------------------------'         |
                      upper bound of CI ----------------------------------'
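
The significance rule described above can be written as a small check. This is a sketch under assumptions: `CiSignificance.IsSignificant` is a hypothetical name, and "entirely outside the threshold" is read here as the whole interval lying above `+threshold` or below `-threshold`. Whether a significant change is better or worse depends on the metric's direction, which this sketch does not model.

```csharp
using System;

// Hypothetical helper; not the benchmarking platform's API.
public static class CiSignificance
{
    // All arguments are relative differences in percent.
    // Returns true when the interval [lower, upper] does not touch [-threshold, +threshold].
    public static bool IsSignificant(double lower, double upper, double threshold)
    {
        if (lower > upper) throw new ArgumentException("lower must not exceed upper.");
        if (threshold < 0) throw new ArgumentOutOfRangeException(nameof(threshold));
        return lower > threshold || upper < -threshold;
    }
}
```

With the 1% threshold drawn in the second diagram, `IsSignificant(1.3, 3.1, 1.0)` is true, while the first diagram's interval, `IsSignificant(-0.6, 1.2, 1.0)`, is false.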

scenario:Benchmarks.Trace.DbCommandBenchmark.ExecuteNonQuery net472

  • 🟥 throughput [-22591.875op/s; -19477.941op/s] or [-6.363%; -5.486%]

scenario:Benchmarks.Trace.HttpClientBenchmark.SendAsync net472

  • 🟥 throughput [-6975.472op/s; -6343.280op/s] or [-7.963%; -7.241%]

Known flaky benchmarks

These benchmarks are marked as flaky and will not trigger a failure. Modify FLAKY_BENCHMARKS_REGEX to control which benchmarks are marked as flaky.

scenario:Benchmarks.Trace.ActivityBenchmark.StartStopWithChild net472

  • 🟥 throughput [-7794.907op/s; -7321.272op/s] or [-9.242%; -8.681%]

scenario:Benchmarks.Trace.AgentWriterBenchmark.WriteAndFlushEnrichedTraces net472

  • 🟥 execution_time [+312.435ms; +319.611ms] or [+155.041%; +158.602%]
  • 🟥 throughput [-44.798op/s; -40.520op/s] or [-8.060%; -7.290%]

scenario:Benchmarks.Trace.AgentWriterBenchmark.WriteAndFlushEnrichedTraces net6.0

  • 🟥 execution_time [+381.232ms; +383.117ms] or [+301.197%; +302.686%]
  • 🟩 throughput [+89.136op/s; +92.309op/s] or [+11.752%; +12.171%]

scenario:Benchmarks.Trace.AgentWriterBenchmark.WriteAndFlushEnrichedTraces netcoreapp3.1

  • 🟥 execution_time [+390.139ms; +393.719ms] or [+345.258%; +348.426%]

scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.AllCycleMoreComplexBody net472

  • 🟥 allocated_mem [+1.308KB; +1.308KB] or [+27.528%; +27.540%]

scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.AllCycleMoreComplexBody net6.0

  • 🟥 allocated_mem [+471 bytes; +472 bytes] or [+9.976%; +9.987%]
  • 🟩 execution_time [-15.598ms; -11.405ms] or [-7.285%; -5.327%]

scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.AllCycleMoreComplexBody netcoreapp3.1

  • 🟥 allocated_mem [+1.272KB; +1.272KB] or [+27.500%; +27.510%]

scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.AllCycleSimpleBody net472

  • 🟥 allocated_mem [+1.307KB; +1.307KB] or [+105.743%; +105.758%]
  • 🟥 throughput [-275337.526op/s; -271716.265op/s] or [-28.113%; -27.744%]

scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.AllCycleSimpleBody net6.0

  • 🟥 allocated_mem [+471 bytes; +472 bytes] or [+38.557%; +38.566%]
  • 🟩 execution_time [-26.250ms; -21.372ms] or [-11.706%; -9.531%]

scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.AllCycleSimpleBody netcoreapp3.1

  • 🟥 allocated_mem [+1.272KB; +1.272KB] or [+105.288%; +105.304%]
  • 🟥 throughput [-153745.536op/s; -137847.194op/s] or [-22.090%; -19.806%]

scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.ObjectExtractorMoreComplexBody net6.0

  • 🟩 throughput [+9016.970op/s; +11934.796op/s] or [+5.737%; +7.594%]

scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.ObjectExtractorMoreComplexBody netcoreapp3.1

  • 🟩 throughput [+9638.279op/s; +12323.534op/s] or [+7.678%; +9.817%]

scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.ObjectExtractorSimpleBody net6.0

  • 🟩 throughput [+441590.006op/s; +461715.638op/s] or [+14.724%; +15.396%]

scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.ObjectExtractorSimpleBody netcoreapp3.1

  • 🟩 execution_time [-19.569ms; -15.232ms] or [-9.021%; -7.021%]
  • 🟩 throughput [+181197.887op/s; +234383.557op/s] or [+7.192%; +9.303%]

scenario:Benchmarks.Trace.Asm.AppSecEncoderBenchmark.EncodeArgs net472

  • 🟥 execution_time [+297.990ms; +299.152ms] or [+148.895%; +149.476%]

scenario:Benchmarks.Trace.Asm.AppSecEncoderBenchmark.EncodeArgs net6.0

  • 🟥 execution_time [+300.641ms; +304.179ms] or [+151.614%; +153.398%]

scenario:Benchmarks.Trace.Asm.AppSecEncoderBenchmark.EncodeArgs netcoreapp3.1

  • 🟥 execution_time [+299.801ms; +303.010ms] or [+151.016%; +152.633%]

scenario:Benchmarks.Trace.Asm.AppSecEncoderBenchmark.EncodeLegacyArgs net472

  • 🟥 execution_time [+297.122ms; +298.073ms] or [+145.935%; +146.402%]

scenario:Benchmarks.Trace.Asm.AppSecEncoderBenchmark.EncodeLegacyArgs net6.0

  • 🟥 execution_time [+292.507ms; +295.000ms] or [+142.996%; +144.215%]

scenario:Benchmarks.Trace.Asm.AppSecEncoderBenchmark.EncodeLegacyArgs netcoreapp3.1

  • 🟥 execution_time [+298.114ms; +300.353ms] or [+148.997%; +150.116%]

scenario:Benchmarks.Trace.Asm.AppSecWafBenchmark.RunWafRealisticBenchmarkWithAttack net6.0

  • 🟥 execution_time [+27.834µs; +51.531µs] or [+8.886%; +16.451%]
  • 🟥 throughput [-473.414op/s; -274.121op/s] or [-14.758%; -8.545%]

scenario:Benchmarks.Trace.AspNetCoreBenchmark.SendRequest net472

  • 🟥 execution_time [+300.314ms; +301.109ms] or [+149.888%; +150.284%]

scenario:Benchmarks.Trace.AspNetCoreBenchmark.SendRequest net6.0

  • 🟥 execution_time [+416.840ms; +424.132ms] or [+452.914%; +460.837%]
  • 🟩 throughput [+629.463op/s; +795.277op/s] or [+5.172%; +6.535%]

scenario:Benchmarks.Trace.AspNetCoreBenchmark.SendRequest netcoreapp3.1

  • 🟥 execution_time [+356.434ms; +366.147ms] or [+270.637%; +278.012%]

scenario:Benchmarks.Trace.CIVisibilityProtocolWriterBenchmark.WriteAndFlushEnrichedTraces net472

  • unstable execution_time [+298.462ms; +338.680ms] or [+137.230%; +155.722%]
  • 🟥 throughput [-538.798op/s; -488.961op/s] or [-48.820%; -44.305%]

scenario:Benchmarks.Trace.CIVisibilityProtocolWriterBenchmark.WriteAndFlushEnrichedTraces net6.0

  • unstable execution_time [+201.524ms; +334.804ms] or [+85.881%; +142.679%]
  • 🟥 throughput [-669.072op/s; -585.663op/s] or [-44.627%; -39.064%]

scenario:Benchmarks.Trace.CIVisibilityProtocolWriterBenchmark.WriteAndFlushEnrichedTraces netcoreapp3.1

  • 🟥 execution_time [+336.626ms; +350.616ms] or [+201.341%; +209.709%]
  • 🟥 throughput [-433.407op/s; -393.081op/s] or [-30.178%; -27.370%]

scenario:Benchmarks.Trace.CharSliceBenchmark.OptimizedCharSlice net6.0

  • 🟥 execution_time [+83.856µs; +92.966µs] or [+5.761%; +6.387%]
  • 🟥 throughput [-41.275op/s; -37.306op/s] or [-6.008%; -5.430%]

scenario:Benchmarks.Trace.CharSliceBenchmark.OptimizedCharSliceWithPool net6.0

  • 🟩 throughput [+55.169op/s; +123.454op/s] or [+5.948%; +13.311%]

scenario:Benchmarks.Trace.CharSliceBenchmark.OptimizedCharSliceWithPool netcoreapp3.1

  • unstable throughput [-19.980op/s; +48.206op/s] or [-3.729%; +8.998%]

scenario:Benchmarks.Trace.CharSliceBenchmark.OriginalCharSlice net6.0

  • 🟩 throughput [+26.109op/s; +43.420op/s] or [+5.154%; +8.572%]

scenario:Benchmarks.Trace.ElasticsearchBenchmark.CallElasticsearch net472

  • 🟥 execution_time [+302.239ms; +304.319ms] or [+152.202%; +153.249%]

scenario:Benchmarks.Trace.ElasticsearchBenchmark.CallElasticsearch net6.0

  • 🟥 execution_time [+300.954ms; +301.989ms] or [+150.809%; +151.327%]

scenario:Benchmarks.Trace.ElasticsearchBenchmark.CallElasticsearch netcoreapp3.1

  • 🟥 execution_time [+300.372ms; +303.522ms] or [+150.894%; +152.477%]

scenario:Benchmarks.Trace.ElasticsearchBenchmark.CallElasticsearchAsync net472

  • 🟥 execution_time [+302.059ms; +303.745ms] or [+151.684%; +152.530%]

scenario:Benchmarks.Trace.ElasticsearchBenchmark.CallElasticsearchAsync net6.0

  • 🟥 execution_time [+297.656ms; +300.065ms] or [+147.178%; +148.369%]

scenario:Benchmarks.Trace.ElasticsearchBenchmark.CallElasticsearchAsync netcoreapp3.1

  • 🟥 execution_time [+302.121ms; +306.176ms] or [+153.128%; +155.183%]

scenario:Benchmarks.Trace.GraphQLBenchmark.ExecuteAsync net472

  • 🟥 execution_time [+301.742ms; +305.394ms] or [+151.447%; +153.280%]

scenario:Benchmarks.Trace.GraphQLBenchmark.ExecuteAsync net6.0

  • 🟥 execution_time [+299.874ms; +302.154ms] or [+149.460%; +150.596%]
  • 🟩 throughput [+52577.144op/s; +62138.621op/s] or [+10.440%; +12.339%]

scenario:Benchmarks.Trace.GraphQLBenchmark.ExecuteAsync netcoreapp3.1

  • 🟥 execution_time [+301.492ms; +304.323ms] or [+149.990%; +151.398%]

scenario:Benchmarks.Trace.ILoggerBenchmark.EnrichedLog net6.0

  • 🟩 execution_time [-15.978ms; -12.322ms] or [-7.430%; -5.730%]
  • 🟩 throughput [+20284.268op/s; +27216.395op/s] or [+5.565%; +7.466%]

scenario:Benchmarks.Trace.Iast.StringAspectsBenchmark.StringConcatAspectBenchmark net472

  • unstable execution_time [+3.143µs; +48.735µs] or [+0.776%; +12.038%]

scenario:Benchmarks.Trace.Iast.StringAspectsBenchmark.StringConcatAspectBenchmark net6.0

  • 🟩 allocated_mem [-19.459KB; -19.437KB] or [-7.098%; -7.090%]
  • unstable execution_time [-51.289µs; +5.304µs] or [-10.137%; +1.048%]
  • unstable throughput [-9.945op/s; +192.470op/s] or [-0.496%; +9.604%]

scenario:Benchmarks.Trace.Iast.StringAspectsBenchmark.StringConcatAspectBenchmark netcoreapp3.1

  • unstable execution_time [-40.870µs; +22.689µs] or [-7.082%; +3.932%]
  • unstable throughput [-54.596op/s; +121.102op/s] or [-3.119%; +6.919%]

scenario:Benchmarks.Trace.Iast.StringAspectsBenchmark.StringConcatBenchmark net6.0

  • unstable execution_time [+5.546µs; +11.041µs] or [+13.109%; +26.097%]
  • 🟥 throughput [-4681.151op/s; -2612.048op/s] or [-19.706%; -10.996%]

scenario:Benchmarks.Trace.Iast.StringAspectsBenchmark.StringConcatBenchmark netcoreapp3.1

  • unstable execution_time [-14.261µs; -6.189µs] or [-22.125%; -9.602%]
  • unstable throughput [+1603.303op/s; +3352.240op/s] or [+9.837%; +20.567%]

scenario:Benchmarks.Trace.Log4netBenchmark.EnrichedLog net472

  • 🟥 execution_time [+302.842ms; +304.185ms] or [+153.073%; +153.752%]

scenario:Benchmarks.Trace.Log4netBenchmark.EnrichedLog net6.0

  • 🟥 execution_time [+301.910ms; +306.807ms] or [+153.671%; +156.164%]

scenario:Benchmarks.Trace.Log4netBenchmark.EnrichedLog netcoreapp3.1

  • 🟥 execution_time [+300.068ms; +303.411ms] or [+150.221%; +151.895%]

scenario:Benchmarks.Trace.RedisBenchmark.SendReceive net6.0

  • 🟩 throughput [+33990.125op/s; +36707.835op/s] or [+6.434%; +6.948%]

scenario:Benchmarks.Trace.SerilogBenchmark.EnrichedLog net472

  • 🟥 execution_time [+301.763ms; +304.173ms] or [+150.402%; +151.603%]

scenario:Benchmarks.Trace.SerilogBenchmark.EnrichedLog net6.0

  • 🟥 execution_time [+301.245ms; +302.633ms] or [+151.271%; +151.968%]

scenario:Benchmarks.Trace.SerilogBenchmark.EnrichedLog netcoreapp3.1

  • 🟥 execution_time [+304.020ms; +306.816ms] or [+154.179%; +155.598%]

scenario:Benchmarks.Trace.SingleSpanAspNetCoreBenchmark.SingleSpanAspNetCore net472

  • 🟥 execution_time [+299.602ms; +300.578ms] or [+149.443%; +149.930%]
  • 🟩 throughput [+61145175.522op/s; +61541392.831op/s] or [+44.530%; +44.818%]

scenario:Benchmarks.Trace.SingleSpanAspNetCoreBenchmark.SingleSpanAspNetCore net6.0

  • 🟥 execution_time [+418.513ms; +423.233ms] or [+520.495%; +526.365%]

scenario:Benchmarks.Trace.SingleSpanAspNetCoreBenchmark.SingleSpanAspNetCore netcoreapp3.1

  • 🟥 execution_time [+298.792ms; +299.990ms] or [+149.031%; +149.628%]
  • 🟩 throughput [+11666913.478op/s; +16638455.320op/s] or [+5.168%; +7.370%]

scenario:Benchmarks.Trace.SpanBenchmark.StartFinishScope net6.0

  • 🟩 throughput [+65954.827op/s; +80514.054op/s] or [+6.158%; +7.517%]

scenario:Benchmarks.Trace.SpanBenchmark.StartFinishScope netcoreapp3.1

  • 🟩 throughput [+47296.598op/s; +67445.174op/s] or [+5.474%; +7.807%]

scenario:Benchmarks.Trace.SpanBenchmark.StartFinishSpan net6.0

  • 🟩 throughput [+78917.839op/s; +108946.902op/s] or [+6.108%; +8.433%]

scenario:Benchmarks.Trace.SpanBenchmark.StartFinishSpan netcoreapp3.1

  • 🟩 throughput [+79044.134op/s; +87219.940op/s] or [+7.850%; +8.662%]

scenario:Benchmarks.Trace.SpanBenchmark.StartFinishTwoScopes net6.0

  • 🟩 throughput [+35985.579op/s; +47013.592op/s] or [+6.534%; +8.537%]

scenario:Benchmarks.Trace.TraceAnnotationsBenchmark.RunOnMethodBegin net6.0

  • 🟩 throughput [+83438.183op/s; +101599.857op/s] or [+9.322%; +11.351%]

Known flaky benchmarks without significant changes:

  • scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.ActivityBenchmark.StartSpan net472
  • scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.ActivityBenchmark.StartSpan net6.0
  • scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.ActivityBenchmark.StartSpan netcoreapp3.1
  • scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.ActivityBenchmark.StartSpan_AddEvent_Sampled net472
  • scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.ActivityBenchmark.StartSpan_AddEvent_Sampled net6.0
  • scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.ActivityBenchmark.StartSpan_AddEvent_Sampled netcoreapp3.1
  • scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.ActivityBenchmark.StartSpan_GetContext_Sampled net472
  • scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.ActivityBenchmark.StartSpan_GetContext_Sampled net6.0
  • scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.ActivityBenchmark.StartSpan_GetContext_Sampled netcoreapp3.1
  • scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.ActivityBenchmark.StartSpan_SetAttributes_Sampled net472
  • scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.ActivityBenchmark.StartSpan_SetAttributes_Sampled net6.0
  • scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.ActivityBenchmark.StartSpan_SetAttributes_Sampled netcoreapp3.1
  • scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.ActivityBenchmark.StartSpan_SetStatus_Sampled net472
  • scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.ActivityBenchmark.StartSpan_SetStatus_Sampled net6.0
  • scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.ActivityBenchmark.StartSpan_SetStatus_Sampled netcoreapp3.1
  • scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.ActivityBenchmark.StartSpan_UpdateName_Sampled net472
  • scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.ActivityBenchmark.StartSpan_UpdateName_Sampled net6.0
  • scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.ActivityBenchmark.StartSpan_UpdateName_Sampled netcoreapp3.1
  • scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan net472
  • scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan net6.0
  • scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan netcoreapp3.1
  • scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan_AddEvent_Sampled net472
  • scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan_AddEvent_Sampled net6.0
  • scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan_AddEvent_Sampled netcoreapp3.1
  • scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan_GetContext_Sampled net472
  • scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan_GetContext_Sampled net6.0
  • scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan_GetContext_Sampled netcoreapp3.1
  • scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan_RecordException_Sampled net472
  • scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan_RecordException_Sampled net6.0
  • scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan_RecordException_Sampled netcoreapp3.1
  • scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan_SetAttributes_Sampled net472
  • scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan_SetAttributes_Sampled net6.0
  • scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan_SetAttributes_Sampled netcoreapp3.1
  • scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan_SetStatus_Sampled net472
  • scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan_SetStatus_Sampled net6.0
  • scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan_SetStatus_Sampled netcoreapp3.1
  • scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan_UpdateName_Sampled net472
  • scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan_UpdateName_Sampled net6.0
  • scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan_UpdateName_Sampled netcoreapp3.1
  • scenario:Benchmarks.Trace.ActivityBenchmark.StartStopWithChild net6.0
  • scenario:Benchmarks.Trace.ActivityBenchmark.StartStopWithChild netcoreapp3.1
  • scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.ObjectExtractorMoreComplexBody net472
  • scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.ObjectExtractorSimpleBody net472
  • scenario:Benchmarks.Trace.Asm.AppSecWafBenchmark.RunWafRealisticBenchmark net472
  • scenario:Benchmarks.Trace.Asm.AppSecWafBenchmark.RunWafRealisticBenchmark net6.0
  • scenario:Benchmarks.Trace.Asm.AppSecWafBenchmark.RunWafRealisticBenchmark netcoreapp3.1
  • scenario:Benchmarks.Trace.Asm.AppSecWafBenchmark.RunWafRealisticBenchmarkWithAttack net472
  • scenario:Benchmarks.Trace.Asm.AppSecWafBenchmark.RunWafRealisticBenchmarkWithAttack netcoreapp3.1
  • scenario:Benchmarks.Trace.CharSliceBenchmark.OptimizedCharSlice net472
  • scenario:Benchmarks.Trace.CharSliceBenchmark.OptimizedCharSlice netcoreapp3.1
  • scenario:Benchmarks.Trace.CharSliceBenchmark.OptimizedCharSliceWithPool net472
  • scenario:Benchmarks.Trace.CharSliceBenchmark.OriginalCharSlice net472
  • scenario:Benchmarks.Trace.CharSliceBenchmark.OriginalCharSlice netcoreapp3.1
  • scenario:Benchmarks.Trace.ILoggerBenchmark.EnrichedLog net472
  • scenario:Benchmarks.Trace.ILoggerBenchmark.EnrichedLog netcoreapp3.1
  • scenario:Benchmarks.Trace.Iast.StringAspectsBenchmark.StringConcatBenchmark net472
  • scenario:Benchmarks.Trace.RedisBenchmark.SendReceive net472
  • scenario:Benchmarks.Trace.RedisBenchmark.SendReceive netcoreapp3.1
  • scenario:Benchmarks.Trace.SpanBenchmark.StartFinishScope net472
  • scenario:Benchmarks.Trace.SpanBenchmark.StartFinishSpan net472
  • scenario:Benchmarks.Trace.SpanBenchmark.StartFinishTwoScopes net472
  • scenario:Benchmarks.Trace.SpanBenchmark.StartFinishTwoScopes netcoreapp3.1
  • scenario:Benchmarks.Trace.TraceAnnotationsBenchmark.RunOnMethodBegin net472
  • scenario:Benchmarks.Trace.TraceAnnotationsBenchmark.RunOnMethodBegin netcoreapp3.1

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 14 out of 54 changed files in this pull request and generated 2 comments.

Comment thread tracer/src/Datadog.Trace/Debugger/RateLimiting/MemoryPressureMonitor.cs Outdated

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 14 out of 54 changed files in this pull request and generated 1 comment.

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 16 out of 56 changed files in this pull request and generated 1 comment.

Comment thread tracer/src/Datadog.Trace/Telemetry/Metrics/Count.cs
@dudikeleti dudikeleti marked this pull request as ready for review May 29, 2026 12:13
@dudikeleti dudikeleti requested review from a team as code owners May 29, 2026 12:13
@dudikeleti dudikeleti requested review from andrewlock and jpbempel June 2, 2026 07:08
AppDomain.CurrentDomain.AssemblyLoad += CheckUnboundProbes;
assemblyLoadSubscribed = true;
StartBackgroundProcess();
Volatile.Write(ref _initializationState, 2);

Member

nit: make it a named constant
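
A minimal sketch of what the nit asks for, assuming the integer encodes a three-step lifecycle. The class name, the constant names, and the meanings of 0 and 1 are hypothetical; the diff only shows the literal `2` being written.

```csharp
using System;
using System.Threading;

// Hypothetical class illustrating named state constants.
public sealed class InitializationGate
{
    // Assumed meanings; only the value 2 appears in the diff.
    private const int NotInitialized = 0;
    private const int Initializing = 1;
    private const int Initialized = 2;

    private int _initializationState = NotInitialized;

    public bool IsInitialized => Volatile.Read(ref _initializationState) == Initialized;

    // Runs the callback at most once; returns false if another caller already started.
    public bool TryInitialize(Action initialize)
    {
        if (Interlocked.CompareExchange(ref _initializationState, Initializing, NotInitialized) != NotInitialized)
        {
            return false;
        }

        try
        {
            initialize();
            Volatile.Write(ref _initializationState, Initialized);
            return true;
        }
        catch
        {
            // Allow a later retry if initialization failed.
            Volatile.Write(ref _initializationState, NotInitialized);
            throw;
        }
    }
}
```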

Comment on lines +66 to +69
// Refresh writes these fields through a single CAS-guarded writer; every cross-thread access
// goes through Volatile.Read/Volatile.Write (and Interlocked where a read-modify-write is needed).
// We deliberately use the Volatile.* helpers rather than the `volatile` keyword so the idiom is
// uniform across the int/long fields too (`long` cannot be marked `volatile`).

Member

TIL: you cannot mark a long field volatile! Now I understand why you use it so sparingly, while in Java we use it more often :/
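
The idiom described in the code comment can be sketched as follows. The class and its logic are hypothetical; only the field names echo the thread. C# rejects `volatile` on `long` fields (compiler error CS0677), so the sketch routes every access through `Volatile` and uses `Interlocked` for the read-modify-write.

```csharp
using System.Threading;

// Hypothetical class; illustrative only, not the PR's MemoryPressureMonitor.
public sealed class RefreshState
{
    // `private volatile long _lastRefreshMs;` would not compile (CS0677),
    // so long fields are accessed through Volatile/Interlocked instead.
    private long _lastRefreshMs;
    private int _highStreak;
    private long _refreshCount;

    public long LastRefreshMs => Volatile.Read(ref _lastRefreshMs);
    public int HighStreak => Volatile.Read(ref _highStreak);
    public long RefreshCount => Volatile.Read(ref _refreshCount);

    // Intended to be called by one writer at a time.
    public void Publish(long nowMs, bool isHigh)
    {
        int streak = isHigh ? Volatile.Read(ref _highStreak) + 1 : 0;
        Volatile.Write(ref _highStreak, streak);
        Volatile.Write(ref _lastRefreshMs, nowMs);
        Interlocked.Increment(ref _refreshCount); // read-modify-write needs Interlocked
    }
}
```

Each property here is read on its own, so a reader can observe a new `HighStreak` together with an old `LastRefreshMs`.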

Member

That being said, why not use a sub-object holding all the stats, and update a volatile reference in one shot on refresh?
That way you have only one place to manage the concurrency.

example:

class Stats
{
        internal long _lastGen2Count;
        internal long _lastRefreshMs;
        internal long _highPressureStartMs;
        internal bool _hasHighPressureStart;
        internal int _highStreak;
        internal int _lowStreak;
        internal bool _hasGen2Baseline;
}

void Refresh()
{
  Stats stats = new Stats();
  stats._lastGen2Count = ...;
  // ...
  stats._hasGen2Baseline = ...;
  currentStats = stats; // currentStats is a volatile field, assigned last so readers only ever see a complete snapshot
}

Then the reader just needs to do a volatile read of currentStats to snapshot the values, and then use them locally.

Member

But yes, the downside of this is one allocation for the stats object on each call to refresh.
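
A compilable sketch of the snapshot idea suggested above, assuming a single writer (as the CAS-guarded Refresh provides). `PressureStats`, `SnapshotPublisher`, and the streak logic are hypothetical and cover only a subset of the fields. The sketch allocates one object per refresh, which is the trade-off just mentioned.

```csharp
using System.Threading;

// Hypothetical immutable snapshot; not the PR's actual type.
public sealed class PressureStats
{
    public PressureStats(long lastGen2Count, long lastRefreshMs, int highStreak, int lowStreak)
    {
        LastGen2Count = lastGen2Count;
        LastRefreshMs = lastRefreshMs;
        HighStreak = highStreak;
        LowStreak = lowStreak;
    }

    public long LastGen2Count { get; }
    public long LastRefreshMs { get; }
    public int HighStreak { get; }
    public int LowStreak { get; }
}

public sealed class SnapshotPublisher
{
    private PressureStats _current = new PressureStats(0, 0, 0, 0);

    // Readers take one volatile read and then work on a consistent, immutable object.
    public PressureStats Current => Volatile.Read(ref _current);

    // Assumes a single writer at a time.
    public void Refresh(long gen2Count, long nowMs, bool isHigh)
    {
        PressureStats previous = Volatile.Read(ref _current);
        var next = new PressureStats(
            gen2Count,
            nowMs,
            isHigh ? previous.HighStreak + 1 : 0,
            isHigh ? 0 : previous.LowStreak + 1);
        Volatile.Write(ref _current, next); // single publication point
    }
}
```

A reader that holds on to `Current` keeps seeing the values from one refresh, even if the writer publishes a newer snapshot afterwards.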

Member

You can use Interlocked.Exchange(ref _long) though, which is generally preferable anyway IMO, but snapshotting is probably still the better design IMO.

In terms of allocations, you can use a "front and back buffer object with atomic exchange" - that way you create 2 objects for the life of the process 🙂

@dudikeleti dudikeleti Jun 2, 2026

Contributor Author

> You can use Interlocked.Exchange(ref _long) though

Yes, but for this case volatile is enough IMO

> but snapshotting is probably still the better design IMO.
> In terms of allocations, you can use a "front and back buffer object with atomic exchange" - that way you create 2 objects for the life of the process 🙂

I like it. Currently, no one really uses it in production, but I would definitely do that in the future if these values become part of a decision-making process. Noted.

Contributor Author

On second thought, I need to get back to this PR anyway because of the distribution metrics limitation, so I might handle this while I'm at it.

Contributor Author

On third thought 😆

I tried the two-buffer snapshot approach locally, but I’m leaning toward keeping the current simpler publication model for now.

It does give a more coherent internal snapshot, but it also adds quite a bit of complexity: versioning and more synchronization.

I think it’s worth revisiting if/when this becomes decision-making state or we expose a single snapshot API, but for the current observe-only telemetry use, I’d prefer to keep the simpler volatile fields.

Comment on lines +412 to +413
// Only ever called from inside the CAS-guarded Refresh, so no synchronization is required.
private bool DisableCore(MetricTags.DebuggerMemoryPressureDisabledReason reason)

Member

For my own understanding, as I probably missed something: if it is only used in the Refresh method, why not make it a local function ("inner method") of Refresh, like ComputeNextHigh or ToScaledInt?

Contributor Author

The reasoning behind that is that, unlike ComputeNextHigh and ToScaledInt, which are purely local helpers and should remain local functions, DisableCore has meaning beyond the refresh algorithm, and I haven't fully settled on its future shape yet (since this PR is observe-only).

But you didn't miss anything. For this PR, it is currently only called from Refresh, as you noted.

Done 5c99511
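
As background for the "CAS-guarded Refresh" comment under review, a minimal sketch of that pattern is shown below. It assumes the guard is an `Interlocked.CompareExchange` on an int flag; `GuardedRefresher`, `TryRefresh`, and the field names are hypothetical and do not reflect the PR's actual monitor.

```csharp
using System.Threading;

// Hypothetical sketch; only the names Refresh/DisableCore echo the PR.
public sealed class GuardedRefresher
{
    private int _refreshing;   // 0 = idle, 1 = a refresh is in progress
    private bool _disabled;    // written only while the guard is held
    private int _refreshCount; // written only while the guard is held

    public bool IsDisabled => Volatile.Read(ref _disabled);

    public int RefreshCount => Volatile.Read(ref _refreshCount);

    public bool TryRefresh(bool sampleFailed)
    {
        // Only the thread that flips 0 -> 1 enters; all others return immediately.
        if (Interlocked.CompareExchange(ref _refreshing, 1, 0) != 0)
        {
            return false;
        }

        try
        {
            if (_disabled)
            {
                return false;
            }

            if (sampleFailed)
            {
                DisableCore();
                return false;
            }

            _refreshCount++;
            return true;
        }
        finally
        {
            // Release the guard; the volatile write publishes the changes made above.
            Volatile.Write(ref _refreshing, 0);
        }
    }

    // Runs only inside the guard, so it needs no synchronization of its own.
    private void DisableCore() => _disabled = true;
}
```

Because at most one thread is ever inside the guarded region, a helper such as `DisableCore` can mutate state with plain writes whether it is a private method or a local function.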

@andrewlock andrewlock left a comment

Member

As discussed, I have concerns that these metrics won't actually be usable, because we don't tag them with service identity. So you have no way to compare between enter/exit conditions? 🤔 However, I now realise that you're relying on distributions and going to try to do some statistical comparison between them? I'm not sure..

However, the count metrics are fine in general, but we literally don't support distribution metrics at the moment (intentionally, because they have perf problems), so we'll need to remove those.

@dudikeleti

Contributor Author

> As discussed, I have concerns that these metrics won't actually be usable, because we don't tag them with service identity. So you have no way to compare between enter/exit conditions? 🤔 However, I now realise that you're relying on distributions and going to try to do some statistical comparison between them? I'm not sure..
>
> However, the count metrics are fine in general, but we literally don't support distribution metrics at the moment (intentionally, because they have perf problems), so we'll need to remove those.

The missing service tag isn't a blocker. We don't have to pair a given app's enter with its own exit. We want to be able to say:
"Memory pressure during DI is common, driven mainly by memory (not GC), clusters at 90%+, and episodes last tens of seconds - so gating is worth building, gate on memory at ~0.90 with a short cooldown." Or the opposite: "Episodes are sub-second and rare - don't bother."

@dudikeleti

Contributor Author

> However, the count metrics are fine in general, but we literally don't support distribution metrics at the moment (intentionally, because they have perf problems), so we'll need to remove those.

I changed the memory-pressure severity/duration metrics from distributions to bucketed count metrics.
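
To illustrate the general idea of replacing a distribution with bucketed counts, here is a small sketch. It is not the PR's implementation: `BucketedCounter` and its members are hypothetical, and the bucket bounds are supplied by the caller rather than taken from the PR.

```csharp
using System.Threading;

// Hypothetical sketch; bucket bounds and names are illustrative only.
public sealed class BucketedCounter
{
    private readonly double[] _upperBounds; // exclusive upper bounds, ascending
    private readonly long[] _counts;        // one extra slot for the overflow bucket

    public BucketedCounter(params double[] upperBounds)
    {
        _upperBounds = (double[])upperBounds.Clone();
        _counts = new long[upperBounds.Length + 1];
    }

    // Increments the bucket that contains the value and returns its index.
    public int Record(double value)
    {
        var index = 0;
        while (index < _upperBounds.Length && value >= _upperBounds[index])
        {
            index++;
        }

        Interlocked.Increment(ref _counts[index]);
        return index;
    }

    public long GetCount(int bucket) => Interlocked.Read(ref _counts[bucket]);
}
```

Each recorded value becomes a single increment of a low-cardinality counter, so the shape of the distribution can still be approximated from the per-bucket counts without a distribution metric type.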

@dudikeleti dudikeleti requested a review from andrewlock June 2, 2026 18:06

@andrewlock andrewlock left a comment

Member

I haven't reviewed the Debugger code, just the shared metrics code. The important thing is you'll need to merge your changes into the telemetry intake before these new metrics will be accepted.

#nullable enable
using System.Diagnostics.CodeAnalysis;
using Datadog.Trace.SourceGenerators;
using NS = Datadog.Trace.Telemetry.MetricNamespaceConstants;

Member

Suggested change
using NS = Datadog.Trace.Telemetry.MetricNamespaceConstants;

Contributor Author

86f6bb3cd59a512bd61d8c4fe63ffa9e4d6f31ae

],
"metric_type": "count",
"data_type": "transitions",
"description": "The number of Dynamic Instrumentation memory-pressure state transitions, tagged by state and the signal that triggered entry",

Member

It's best to include the actual state tags allowed here, e.g.

Suggested change
"description": "The number of Dynamic Instrumentation memory-pressure state transitions, tagged by state and the signal that triggered entry",
"description": "The number of Dynamic Instrumentation memory-pressure state transitions, tagged by state (`state:enter` or `state:exit`) and the signal that triggered entry (`trigger:none`, `trigger:memory`, `trigger:gen2`, `trigger:both`)",

The same applies to all the other definitions here.

Additionally, these metrics need to be added to the telemetry intake, and deployed, otherwise they will be blocked. These metrics are common across all languages too, just FYI, so we should try to keep the tags applicable to a wide range of languages if possible (e.g. gen2 is very .NET specific, maybe we can use a more generic tag name?)

Contributor Author

86f6bb3cd59a512bd61d8c4fe63ffa9e4d6f31ae

Comment on lines +230 to +234

/// <summary>
/// Count of Dynamic Instrumentation high-memory-pressure periods bucketed by duration, recorded on exit.
/// </summary>
[TelemetryMetric<MetricTags.DebuggerMemoryPressureDurationBucket>("memory_pressure.duration_ms", isCommon: true, NS.LiveDebugger)] DebuggerMemoryPressureDurationMs,

Member

Is this really what you want? The total duration of high-memory pressure periods? Maybe it is, just feels kind of strange to me 😅

/// <summary>
/// Count of Dynamic Instrumentation memory-pressure transitions bucketed by Gen2 collections per second at the transition.
/// </summary>
[TelemetryMetric<MetricTags.DebuggerMemoryPressureState, MetricTags.DebuggerMemoryPressureGen2Bucket>("memory_pressure.gen2_per_sec", isCommon: true, NS.LiveDebugger)] DebuggerMemoryPressureGen2PerSec,

Member

It seems a bit weird to have a count of a rate... I guess this is just trying to hack around current lack of distribution support? 🤔

{
[Description("bucket:lt_70")] LessThan70,
[Description("bucket:70_80")] From70To80,
[Description("bucket:80_85")] From80To85,

Member

What if it's 80%? Which bucket does it go in? Might be preferable to make that clear

Contributor Author

86f6bb3cd59a512bd61d8c4fe63ffa9e4d6f31ae
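
One way to make the boundary question unambiguous is to treat every bucket as lower-inclusive and upper-exclusive. The sketch below assumes that convention; the thread does not state which one the referenced commit adopted. Only the three tags visible in the diff are modeled, and `MemoryLoadBuckets`, `ToBucketTag`, and the `bucket:ge_85` catch-all are hypothetical.

```csharp
// Hypothetical sketch; the boundary convention is an assumption, not the PR's documented behavior.
public static class MemoryLoadBuckets
{
    // loadRatio is the memory load as a fraction in [0, 1].
    public static string ToBucketTag(double loadRatio)
    {
        if (loadRatio < 0.70)
        {
            return "bucket:lt_70";
        }

        if (loadRatio < 0.80)
        {
            return "bucket:70_80"; // [0.70, 0.80)
        }

        if (loadRatio < 0.85)
        {
            return "bucket:80_85"; // [0.80, 0.85): exactly 80% lands here
        }

        // Illustrative catch-all; the real tags above 85% are not shown in this excerpt.
        return "bucket:ge_85";
    }
}
```

Under this convention a load of exactly 80% is counted in `bucket:80_85`, and stating the convention in the tag documentation removes the ambiguity raised in the review.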

dudikeleti and others added 12 commits June 3, 2026 18:15
Co-authored-by: Cursor <cursoragent@cursor.com>
Remove the IGCInfoProvider/IHighResolutionClock/IMemoryPressureMonitor/
ISamplerScheduler abstractions and their System* implementations in favor
of a single SystemMemorySource. Update the monitor, config, and debugger
consumers accordingly, and adjust tests (dropping the fake clock/GC/
scheduler helpers).
Add debugger.memory_pressure.* count/distribution metrics and supporting
metric tags (state/trigger/disabled-reason), plus regenerated telemetry
collector source. Update collector tests.
@dudikeleti dudikeleti merged commit 19d68e3 into master Jun 3, 2026
141 checks passed
@dudikeleti dudikeleti deleted the dudik/cb-memory branch June 3, 2026 18:35
@github-actions github-actions Bot added this to the vNext-v3 milestone Jun 3, 2026

4 participants