[Runtime Metrics] Support for OTLP Runtime Metrics#8457
Execution-Time Benchmarks Report ⏱️
Execution-time results for samples comparing This PR (8457) and master.
✅ No regressions detected - check the details below
Full Metrics Comparison
FakeDbCommand
HttpMessageHandler
Comparison explanation
Execution-time benchmarks measure the whole time it takes to execute a program, and are intended to measure the one-off costs. Cases where the execution time results for the PR are worse than latest master results are highlighted in **red**. The following thresholds were used for comparing the execution times:
Note that these results are based on a single point-in-time result for each branch. For full results, see the dashboard. Graphs show the p99 interval based on the mean and StdDev of the test run, as well as the mean value of the run (shown as a diamond below the graph).
Duration charts
FakeDbCommand (.NET Framework 4.8)
gantt
title Execution time (ms) FakeDbCommand (.NET Framework 4.8)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8457) - mean (76ms) : 71, 81
master - mean (73ms) : 71, 76
section Bailout
This PR (8457) - mean (78ms) : 75, 82
master - mean (80ms) : 76, 84
section CallTarget+Inlining+NGEN
This PR (8457) - mean (1,082ms) : 1028, 1136
master - mean (1,076ms) : 1020, 1132
FakeDbCommand (.NET Core 3.1)
gantt
title Execution time (ms) FakeDbCommand (.NET Core 3.1)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8457) - mean (117ms) : 110, 123
master - mean (115ms) : 109, 121
section Bailout
This PR (8457) - mean (115ms) : 112, 118
master - mean (115ms) : 112, 117
section CallTarget+Inlining+NGEN
This PR (8457) - mean (780ms) : 756, 804
master - mean (779ms) : 750, 808
FakeDbCommand (.NET 6)
gantt
title Execution time (ms) FakeDbCommand (.NET 6)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8457) - mean (105ms) : 99, 112
master - mean (102ms) : 98, 107
section Bailout
This PR (8457) - mean (104ms) : 97, 111
master - mean (101ms) : 99, 104
section CallTarget+Inlining+NGEN
This PR (8457) - mean (940ms) : 905, 975
master - mean (940ms) : 894, 985
FakeDbCommand (.NET 8)
gantt
title Execution time (ms) FakeDbCommand (.NET 8)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8457) - mean (101ms) : 97, 105
master - mean (100ms) : 97, 104
section Bailout
This PR (8457) - mean (107ms) : 102, 112
master - mean (105ms) : 100, 109
section CallTarget+Inlining+NGEN
This PR (8457) - mean (824ms) : 783, 865
master - mean (822ms) : 786, 859
HttpMessageHandler (.NET Framework 4.8)
gantt
title Execution time (ms) HttpMessageHandler (.NET Framework 4.8)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8457) - mean (203ms) : 197, 210
master - mean (207ms) : 197, 216
section Bailout
This PR (8457) - mean (207ms) : 199, 214
master - mean (211ms) : 201, 220
section CallTarget+Inlining+NGEN
This PR (8457) - mean (1,206ms) : 1154, 1257
master - mean (1,218ms) : 1170, 1267
HttpMessageHandler (.NET Core 3.1)
gantt
title Execution time (ms) HttpMessageHandler (.NET Core 3.1)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8457) - mean (291ms) : 282, 301
master - mean (296ms) : 278, 314
section Bailout
This PR (8457) - mean (292ms) : 280, 303
master - mean (297ms) : 280, 313
section CallTarget+Inlining+NGEN
This PR (8457) - mean (965ms) : 937, 993
master - mean (976ms) : 946, 1007
HttpMessageHandler (.NET 6)
gantt
title Execution time (ms) HttpMessageHandler (.NET 6)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8457) - mean (285ms) : 274, 295
master - mean (294ms) : 274, 314
section Bailout
This PR (8457) - mean (286ms) : 277, 294
master - mean (292ms) : 272, 311
section CallTarget+Inlining+NGEN
This PR (8457) - mean (1,160ms) : 1122, 1199
master - mean (1,163ms) : 1111, 1215
HttpMessageHandler (.NET 8)
gantt
title Execution time (ms) HttpMessageHandler (.NET 8)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8457) - mean (283ms) : 275, 292
master - mean (285ms) : 274, 297
section Bailout
This PR (8457) - mean (285ms) : 272, 297
master - mean (286ms) : 275, 297
section CallTarget+Inlining+NGEN
This PR (8457) - mean (1,045ms) : 995, 1095
master - mean (1,044ms) : 993, 1094
Benchmarks
Benchmark execution time: 2026-04-29 20:17:14
Comparing candidate commit 2b2124a in PR branch
Found 0 performance improvements and 0 performance regressions! Performance is the same for 27 metrics, 0 unstable metrics, 57 known flaky benchmarks, 30 flaky benchmarks without significant changes.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 1b582a23e4
…race-dotnet into maximo/otlp-runtime-metrics
bouwkast left a comment
Haven't gone through everything yet but will pick it up again tomorrow
andrewlock left a comment
LGTM in general, with some relatively minor suggestions and questions.
While reviewing this I had a realization - OpenTelemetry themselves add polyfills for many of these... e.g.
What happens if you have the same polyfilled values, with the same meter and instrument names? 🤔 Do we need to handle that explicitly, or does the runtime handle it? And have you compared our implementation to theirs?
```csharp
// indicates a counter reset (e.g. process restart). Report currentValue
// as the delta, as if the previous cumulative was 0.
// ObservableUpDownCounter is non-monotonic so negative deltas are expected.
if (delta < 0 && InstrumentType is InstrumentType.ObservableCounter)
```
Just a thought, and not strictly related to this PR: Do we need a similar guard somewhere for negative values in InstrumentType.Counter too? 🤔 I don't think it needs to be right here (because this method is all about observable counters), but I believe we could have the same wrapping case to handle.
I think we may also need to handle this in our statsd case too - I'll look into it!
```csharp
// ObservableUpDownCounter is non-monotonic so negative deltas are expected.
if (delta < 0 && InstrumentType is InstrumentType.ObservableCounter)
{
    delta = _runningDoubleValue;
```
Had to work this one through to convince myself it's correct, so included here so others don't need to 😅
In an overflow scenario, given that you would have (for example):
- `_runningDoubleValue` is small (due to overflow), e.g. 10
- `_lastObservedCumulative` is huge, e.g. 1,000,000,000,000

Then
`delta = _runningDoubleValue - previousCumulative` ≈ -1,000,000,000,000
So we hit this branch and set
`delta = _runningDoubleValue` ≈ 10
so LGTM!
Aay thank you for adding this investigation here!
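For anyone following the thread above, here is a minimal standalone sketch of the reset guard being discussed. It is not the PR's code: `CumulativeToDelta`, its constructor flag, and `Observe` are hypothetical names used only for illustration, and the logic assumes that a negative delta on a monotonic counter always means the counter restarted from 0.

```csharp
using System;

// Hypothetical helper (not the PR's class): turns cumulative observations
// into per-interval deltas.
public sealed class CumulativeToDelta
{
    private readonly bool _isMonotonic;
    private double _lastObservedCumulative;

    public CumulativeToDelta(bool isMonotonic)
    {
        _isMonotonic = isMonotonic;
    }

    public double Observe(double currentCumulative)
    {
        double delta = currentCumulative - _lastObservedCumulative;

        // A monotonic counter should never decrease. If it does, assume a
        // reset (or wrap) and report the current value as the delta, as if
        // the previous cumulative had been 0. Non-monotonic instruments
        // (up/down counters) keep their negative delta.
        if (delta < 0 && _isMonotonic)
        {
            delta = currentCumulative;
        }

        _lastObservedCumulative = currentCumulative;
        return delta;
    }
}
```

With the numbers from the worked example, observing 1,000,000,000,000 and then 10 on a monotonic instance yields a second delta of 10, while a non-monotonic instance observing 5 and then 3 yields -2.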
Summary of changes
Add support for exporting .NET runtime metrics via OTLP, using OpenTelemetry semantic convention names. When both `DD_RUNTIME_METRICS_ENABLED=true` and `DD_METRICS_OTEL_ENABLED=true` are set, runtime metrics (System.Runtime, ASP.NET Core) are collected through the existing OTLP metrics pipeline instead of DogStatsD.
Related One Pager: Sending Runtime Metrics via OTLP from dd-trace-*
Reason for change
Enables runtime visibility for users on OTLP-native pipelines or without a local Datadog Agent. Aligns .NET runtime metric names with OTel semantic conventions (`dotnet.gc.collections`, `dotnet.process.cpu.time`, etc.).
Implementation details
- `TracerSettings`: New `OtlpRuntimeMetricsEnabled` property to check whether OTLP metrics and runtime metrics are enabled simultaneously.
- `TracerManagerFactory`: Disables the StatsD `RuntimeMetricsWriter` when OTLP runtime metrics are active to prevent duplicate reporting.
- `MetricReaderHandler`: Adds the `System.Runtime`, `Microsoft.AspNetCore.Hosting`, and `Microsoft.AspNetCore.Server.Kestrel` meters when `OtlpRuntimeMetricsEnabled` is true, even if they are not explicitly listed in `DD_METRICS_OTEL_METER_NAMES`.
- `RuntimeMetricsPolyfill` (new): Provides the .NET 9+ native `System.Runtime` meter instruments for .NET 6-8 using `System.Diagnostics.Metrics`. Includes counter reset detection for `ObservableCounter`.
- `MetricsRuntime`: Starts the polyfill on .NET <9; added thread-safe initialization.
Test coverage
- `OpenTelemetrySdkTests.SubmitsOtlpRuntimeMetrics`: End-to-end integration test verifying the OTLP runtime metrics payload via Verify snapshots (separate snapshots for .NET 6 and .NET 9 due to polyfill type differences). Confirms StatsD is not emitted when OTLP runtime metrics are active.
- `RuntimeMetricsMeterTests.InstrumentSurface`: Unit test verifying the polyfill exposes the correct set of instruments on both .NET 6 and .NET 9.
- Added `DD_RUNTIME_METRICS_ENABLED=false` to the existing `SubmitsOtlpMetrics` test to prevent runtime metrics from interfering with custom meter assertions.
Other details
- In the `system-tests` repo, `DD_RUNTIME_METRICS_ENABLED=false` needed to be added to their `DEFAULT_ENVVARS` to avoid unexpected runtime metrics in OTel metric assertions, and a new test was created based on language expectations.
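To illustrate the polyfill idea from the implementation details, here is a rough sketch of publishing one runtime instrument through `System.Diagnostics.Metrics` and reading it back with a `MeterListener`. This is not the PR's `RuntimeMetricsPolyfill`: `RuntimeMeterSketch`, `CreateMeter`, and `CollectOnce` are hypothetical names, the unit string is an assumption, and the single untagged gen 0 counter is a simplification.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics.Metrics;

// Hypothetical sketch, not the PR's polyfill.
public static class RuntimeMeterSketch
{
    // Creates a meter named "System.Runtime" with one observable counter.
    public static Meter CreateMeter()
    {
        var meter = new Meter("System.Runtime");
        meter.CreateObservableCounter<long>(
            "dotnet.gc.collections",
            () => (long)GC.CollectionCount(0), // cumulative gen 0 collections
            unit: "{collection}",              // assumed unit string
            description: "Number of gen 0 garbage collections (simplified).");
        return meter;
    }

    // Takes a single snapshot of the observable instruments on the given meter.
    public static Dictionary<string, long> CollectOnce(Meter meter)
    {
        var results = new Dictionary<string, long>();
        using var listener = new MeterListener();

        // Subscribe only to instruments owned by this exact Meter instance,
        // so a same-named meter created elsewhere is ignored.
        listener.InstrumentPublished = (instrument, l) =>
        {
            if (ReferenceEquals(instrument.Meter, meter))
            {
                l.EnableMeasurementEvents(instrument);
            }
        };
        listener.SetMeasurementEventCallback<long>(
            (instrument, value, tags, state) => results[instrument.Name] = value);

        listener.Start();
        listener.RecordObservableInstruments(); // invokes the observable callbacks
        return results;
    }
}
```

Calling `RuntimeMeterSketch.CollectOnce(RuntimeMeterSketch.CreateMeter())` should return a dictionary with a single `dotnet.gc.collections` entry. The value is cumulative, so a delta exporter would still need a reset guard on top of it.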