
Track service name source#10607

Merged
gh-worker-dd-mergequeue-cf854d[bot] merged 11 commits into master from
andrea.marziali/serviename-integration
Feb 20, 2026

Conversation

@amarziali
Contributor

@amarziali amarziali commented Feb 17, 2026

What Does This Do

This change introduces service name source tracking, allowing us to record which integration or feature overrides the service name on a span.

To support this, a new setServiceName API has been added to AgentSpan. In addition to the service name, this method also accepts a CharSequence identifying the source of the override.
For backward compatibility, the old signature has been kept but deprecated on AgentSpan. A forbiddenApis check has also been added to all the integrations to discourage use of the previous method, which lacks the override source.

When a source is set on a span, it is automatically propagated to its local children, since they inherit the same service name.
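The two-argument override and the child-inheritance behavior described above can be sketched as follows. This is a minimal illustration, not the actual dd-trace-java AgentSpan implementation; the class and field names here are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the behavior described above: a span records both
// the service name and the source of the override, and local children
// inherit them. Names are illustrative, not the real tracer internals.
final class SketchSpan {
    String serviceName;
    CharSequence serviceNameSource; // e.g. an integration name like "redis"
    private final List<SketchSpan> children = new ArrayList<>();

    /** Old single-argument form; deprecated because it loses the source. */
    @Deprecated
    SketchSpan setServiceName(String serviceName) {
        return setServiceName(serviceName, null);
    }

    /** New form: also records which integration/feature set the service name. */
    SketchSpan setServiceName(String serviceName, CharSequence source) {
        this.serviceName = serviceName;
        this.serviceNameSource = source;
        // Local children share the same service name, so the source propagates.
        for (SketchSpan child : children) {
            child.setServiceName(serviceName, source);
        }
        return this;
    }

    SketchSpan newLocalChild() {
        SketchSpan child = new SketchSpan();
        child.serviceName = this.serviceName;
        child.serviceNameSource = this.serviceNameSource;
        children.add(child);
        return child;
    }
}
```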

Manual tracing is currently out of scope. However, in the future we may extend this mechanism to track manual overrides as a "manual" source.

If a source is present on a span, an eager post-processor records it under the _dd.svc_src tag.
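That post-processing step can be sketched as below. The tag name `_dd.svc_src` comes from this PR; the helper itself is hypothetical, not the tracer's actual post-processor:

```java
import java.util.Map;

// Hypothetical helper mirroring the eager post-processor described above:
// if a span carries a service-name source, record it under _dd.svc_src;
// spans with no source get no tag at all.
final class ServiceSourceTagger {
    static final String SVC_SRC_TAG = "_dd.svc_src";

    static void tagIfPresent(CharSequence source, Map<String, Object> spanTags) {
        if (source != null && source.length() > 0) {
            spanTags.put(SVC_SRC_TAG, source.toString());
        }
    }
}
```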

Please note that trace statistics will also need to incorporate this source information. That update will be handled in a separate PR.

Motivation

Additional Notes

Contributor Checklist

Jira ticket: [PROJ-IDENT]

Note: Once your PR is ready to merge, add it to the merge queue by commenting /merge. /merge -c cancels the queue request. /merge -f --reason "reason" skips all merge queue checks; please use this judiciously, as some checks do not run at the PR-level. For more information, see this doc.

@pr-commenter

pr-commenter bot commented Feb 17, 2026

Benchmarks

Startup

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master andrea.marziali/serviename-integration
git_commit_date 1771530866 1771583914
git_commit_sha af8b844 c0e1c11
release_version 1.60.0-SNAPSHOT~af8b84438c 1.60.0-SNAPSHOT~c0e1c11a92
See matching parameters
Baseline Candidate
application insecure-bank insecure-bank
ci_job_date 1771585617 1771585617
ci_job_id 1442454771 1442454771
ci_pipeline_id 97817314 97817314
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version Linux runner-zfyrx7zua-project-304-concurrent-1-h1jjur9x 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux Linux runner-zfyrx7zua-project-304-concurrent-1-h1jjur9x 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
module Agent Agent
parent None None

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 62 metrics, 9 unstable metrics.

Startup time reports for insecure-bank
gantt
    title insecure-bank - global startup overhead: candidate=1.60.0-SNAPSHOT~c0e1c11a92, baseline=1.60.0-SNAPSHOT~af8b84438c

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.064 s) : 0, 1063569
Total [baseline] (8.742 s) : 0, 8742337
Agent [candidate] (1.066 s) : 0, 1065978
Total [candidate] (8.753 s) : 0, 8753159
section iast
Agent [baseline] (1.228 s) : 0, 1228078
Total [baseline] (9.389 s) : 0, 9389411
Agent [candidate] (1.229 s) : 0, 1228677
Total [candidate] (9.387 s) : 0, 9386927
  • baseline results
Module Variant Duration Δ tracing
Agent tracing 1.064 s -
Agent iast 1.228 s 164.509 ms (15.5%)
Total tracing 8.742 s -
Total iast 9.389 s 647.074 ms (7.4%)
  • candidate results
Module Variant Duration Δ tracing
Agent tracing 1.066 s -
Agent iast 1.229 s 162.7 ms (15.3%)
Total tracing 8.753 s -
Total iast 9.387 s 633.768 ms (7.2%)
gantt
    title insecure-bank - break down per module: candidate=1.60.0-SNAPSHOT~c0e1c11a92, baseline=1.60.0-SNAPSHOT~af8b84438c

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.195 ms) : 0, 1195
crashtracking [candidate] (1.215 ms) : 0, 1215
BytebuddyAgent [baseline] (626.134 ms) : 0, 626134
BytebuddyAgent [candidate] (629.476 ms) : 0, 629476
AgentMeter [baseline] (29.007 ms) : 0, 29007
AgentMeter [candidate] (28.957 ms) : 0, 28957
GlobalTracer [baseline] (257.348 ms) : 0, 257348
GlobalTracer [candidate] (256.927 ms) : 0, 256927
AppSec [baseline] (33.071 ms) : 0, 33071
AppSec [candidate] (32.89 ms) : 0, 32890
Debugger [baseline] (64.083 ms) : 0, 64083
Debugger [candidate] (65.586 ms) : 0, 65586
Remote Config [baseline] (619.275 µs) : 0, 619
Remote Config [candidate] (611.777 µs) : 0, 612
Telemetry [baseline] (10.584 ms) : 0, 10584
Telemetry [candidate] (9.03 ms) : 0, 9030
Flare Poller [baseline] (5.406 ms) : 0, 5406
Flare Poller [candidate] (5.213 ms) : 0, 5213
section iast
crashtracking [baseline] (1.22 ms) : 0, 1220
crashtracking [candidate] (1.192 ms) : 0, 1192
BytebuddyAgent [baseline] (793.416 ms) : 0, 793416
BytebuddyAgent [candidate] (794.184 ms) : 0, 794184
AgentMeter [baseline] (11.205 ms) : 0, 11205
AgentMeter [candidate] (11.291 ms) : 0, 11291
GlobalTracer [baseline] (246.916 ms) : 0, 246916
GlobalTracer [candidate] (247.123 ms) : 0, 247123
IAST [baseline] (27.037 ms) : 0, 27037
IAST [candidate] (26.989 ms) : 0, 26989
AppSec [baseline] (32.297 ms) : 0, 32297
AppSec [candidate] (34.718 ms) : 0, 34718
Debugger [baseline] (67.664 ms) : 0, 67664
Debugger [candidate] (64.57 ms) : 0, 64570
Remote Config [baseline] (541.875 µs) : 0, 542
Remote Config [candidate] (542.942 µs) : 0, 543
Telemetry [baseline] (8.537 ms) : 0, 8537
Telemetry [candidate] (8.627 ms) : 0, 8627
Flare Poller [baseline] (3.443 ms) : 0, 3443
Flare Poller [candidate] (3.501 ms) : 0, 3501
Startup time reports for petclinic
gantt
    title petclinic - global startup overhead: candidate=1.60.0-SNAPSHOT~c0e1c11a92, baseline=1.60.0-SNAPSHOT~af8b84438c

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.075 s) : 0, 1074836
Total [baseline] (10.945 s) : 0, 10944946
Agent [candidate] (1.065 s) : 0, 1065382
Total [candidate] (10.979 s) : 0, 10979177
section appsec
Agent [baseline] (1.236 s) : 0, 1236054
Total [baseline] (11.0 s) : 0, 11000190
Agent [candidate] (1.248 s) : 0, 1248346
Total [candidate] (11.112 s) : 0, 11111690
section iast
Agent [baseline] (1.235 s) : 0, 1234809
Total [baseline] (11.162 s) : 0, 11161935
Agent [candidate] (1.23 s) : 0, 1230130
Total [candidate] (11.205 s) : 0, 11204930
section profiling
Agent [baseline] (1.189 s) : 0, 1188529
Total [baseline] (10.92 s) : 0, 10919801
Agent [candidate] (1.191 s) : 0, 1190859
Total [candidate] (10.973 s) : 0, 10972945
  • baseline results
Module Variant Duration Δ tracing
Agent tracing 1.075 s -
Agent appsec 1.236 s 161.218 ms (15.0%)
Agent iast 1.235 s 159.974 ms (14.9%)
Agent profiling 1.189 s 113.693 ms (10.6%)
Total tracing 10.945 s -
Total appsec 11.0 s 55.244 ms (0.5%)
Total iast 11.162 s 216.989 ms (2.0%)
Total profiling 10.92 s -25.145 ms (-0.2%)
  • candidate results
Module Variant Duration Δ tracing
Agent tracing 1.065 s -
Agent appsec 1.248 s 182.964 ms (17.2%)
Agent iast 1.23 s 164.748 ms (15.5%)
Agent profiling 1.191 s 125.477 ms (11.8%)
Total tracing 10.979 s -
Total appsec 11.112 s 132.513 ms (1.2%)
Total iast 11.205 s 225.753 ms (2.1%)
Total profiling 10.973 s -6.232 ms (-0.1%)
gantt
    title petclinic - break down per module: candidate=1.60.0-SNAPSHOT~c0e1c11a92, baseline=1.60.0-SNAPSHOT~af8b84438c

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.207 ms) : 0, 1207
crashtracking [candidate] (1.187 ms) : 0, 1187
BytebuddyAgent [baseline] (633.786 ms) : 0, 633786
BytebuddyAgent [candidate] (628.561 ms) : 0, 628561
AgentMeter [baseline] (29.523 ms) : 0, 29523
AgentMeter [candidate] (29.008 ms) : 0, 29008
GlobalTracer [baseline] (259.928 ms) : 0, 259928
GlobalTracer [candidate] (257.92 ms) : 0, 257920
AppSec [baseline] (33.234 ms) : 0, 33234
AppSec [candidate] (33.08 ms) : 0, 33080
Debugger [baseline] (64.736 ms) : 0, 64736
Debugger [candidate] (63.598 ms) : 0, 63598
Remote Config [baseline] (637.124 µs) : 0, 637
Remote Config [candidate] (621.423 µs) : 0, 621
Telemetry [baseline] (10.879 ms) : 0, 10879
Telemetry [candidate] (9.148 ms) : 0, 9148
Flare Poller [baseline] (4.684 ms) : 0, 4684
Flare Poller [candidate] (6.231 ms) : 0, 6231
section appsec
crashtracking [baseline] (1.215 ms) : 0, 1215
crashtracking [candidate] (1.207 ms) : 0, 1207
BytebuddyAgent [baseline] (656.243 ms) : 0, 656243
BytebuddyAgent [candidate] (666.665 ms) : 0, 666665
AgentMeter [baseline] (11.961 ms) : 0, 11961
AgentMeter [candidate] (12.027 ms) : 0, 12027
GlobalTracer [baseline] (257.681 ms) : 0, 257681
GlobalTracer [candidate] (258.385 ms) : 0, 258385
AppSec [baseline] (167.192 ms) : 0, 167192
AppSec [candidate] (167.758 ms) : 0, 167758
Debugger [baseline] (66.855 ms) : 0, 66855
Debugger [candidate] (66.976 ms) : 0, 66976
Remote Config [baseline] (648.102 µs) : 0, 648
Remote Config [candidate] (652.651 µs) : 0, 653
Telemetry [baseline] (9.412 ms) : 0, 9412
Telemetry [candidate] (9.479 ms) : 0, 9479
Flare Poller [baseline] (3.674 ms) : 0, 3674
Flare Poller [candidate] (3.722 ms) : 0, 3722
IAST [baseline] (25.245 ms) : 0, 25245
IAST [candidate] (25.342 ms) : 0, 25342
section iast
crashtracking [baseline] (1.213 ms) : 0, 1213
crashtracking [candidate] (1.187 ms) : 0, 1187
BytebuddyAgent [baseline] (798.264 ms) : 0, 798264
BytebuddyAgent [candidate] (794.036 ms) : 0, 794036
AgentMeter [baseline] (11.325 ms) : 0, 11325
AgentMeter [candidate] (11.307 ms) : 0, 11307
GlobalTracer [baseline] (247.854 ms) : 0, 247854
GlobalTracer [candidate] (247.672 ms) : 0, 247672
AppSec [baseline] (34.816 ms) : 0, 34816
AppSec [candidate] (32.821 ms) : 0, 32821
Debugger [baseline] (65.681 ms) : 0, 65681
Debugger [candidate] (67.468 ms) : 0, 67468
Remote Config [baseline] (548.572 µs) : 0, 549
Remote Config [candidate] (534.697 µs) : 0, 535
Telemetry [baseline] (8.672 ms) : 0, 8672
Telemetry [candidate] (8.574 ms) : 0, 8574
Flare Poller [baseline] (3.435 ms) : 0, 3435
Flare Poller [candidate] (3.485 ms) : 0, 3485
IAST [baseline] (27.082 ms) : 0, 27082
IAST [candidate] (27.058 ms) : 0, 27058
section profiling
ProfilingAgent [baseline] (98.507 ms) : 0, 98507
ProfilingAgent [candidate] (99.023 ms) : 0, 99023
crashtracking [baseline] (1.188 ms) : 0, 1188
crashtracking [candidate] (1.189 ms) : 0, 1189
BytebuddyAgent [baseline] (681.33 ms) : 0, 681330
BytebuddyAgent [candidate] (682.314 ms) : 0, 682314
AgentMeter [baseline] (8.513 ms) : 0, 8513
AgentMeter [candidate] (8.553 ms) : 0, 8553
GlobalTracer [baseline] (215.696 ms) : 0, 215696
GlobalTracer [candidate] (216.157 ms) : 0, 216157
AppSec [baseline] (32.503 ms) : 0, 32503
AppSec [candidate] (32.917 ms) : 0, 32917
Debugger [baseline] (66.865 ms) : 0, 66865
Debugger [candidate] (66.992 ms) : 0, 66992
Remote Config [baseline] (621.58 µs) : 0, 622
Remote Config [candidate] (619.336 µs) : 0, 619
Telemetry [baseline] (9.026 ms) : 0, 9026
Telemetry [candidate] (8.831 ms) : 0, 8831
Flare Poller [baseline] (3.745 ms) : 0, 3745
Flare Poller [candidate] (3.731 ms) : 0, 3731
Profiling [baseline] (99.085 ms) : 0, 99085
Profiling [candidate] (99.591 ms) : 0, 99591

Load

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master andrea.marziali/serviename-integration
git_commit_date 1771530866 1771583914
git_commit_sha af8b844 c0e1c11
release_version 1.60.0-SNAPSHOT~af8b84438c 1.60.0-SNAPSHOT~c0e1c11a92
See matching parameters
Baseline Candidate
application insecure-bank insecure-bank
ci_job_date 1771586208 1771586208
ci_job_id 1442454775 1442454775
ci_pipeline_id 97817314 97817314
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version Linux runner-zfyrx7zua-project-304-concurrent-2-vmzmyypz 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux Linux runner-zfyrx7zua-project-304-concurrent-2-vmzmyypz 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

Summary

Found 1 performance improvement and 2 performance regressions! Performance is the same for 17 metrics, 16 unstable metrics.

scenario Δ mean agg_http_req_duration_p50 Δ mean agg_http_req_duration_p95 Δ mean throughput candidate mean agg_http_req_duration_p50 candidate mean agg_http_req_duration_p95 candidate mean throughput baseline mean agg_http_req_duration_p50 baseline mean agg_http_req_duration_p95 baseline mean throughput
scenario:load:insecure-bank:iast:high_load worse
[+69.266µs; +162.277µs] or [+2.848%; +6.673%]
unsure
[+105.634µs; +574.568µs] or [+1.459%; +7.936%]
unstable
[-210.354op/s; +80.979op/s] or [-14.459%; +5.566%]
2.548ms 7.580ms 1390.125op/s 2.432ms 7.240ms 1454.812op/s
scenario:load:petclinic:no_agent:high_load better
[-1.982ms; -0.517ms] or [-10.579%; -2.758%]
unstable
[-4.046ms; -0.693ms] or [-12.889%; -2.209%]
unstable
[-7.114op/s; +43.739op/s] or [-2.942%; +18.090%]
17.483ms 29.023ms 260.094op/s 18.733ms 31.392ms 241.781op/s
scenario:load:petclinic:appsec:high_load worse
[+391.301µs; +923.829µs] or [+2.139%; +5.051%]
unsure
[+73.181µs; +1418.121µs] or [+0.244%; +4.725%]
unstable
[-30.025op/s; +15.900op/s] or [-12.049%; +6.381%]
18.948ms 30.756ms 242.125op/s 18.291ms 30.011ms 249.188op/s
Request duration reports for petclinic
gantt
    title petclinic - request duration [CI 0.99] : candidate=1.60.0-SNAPSHOT~c0e1c11a92, baseline=1.60.0-SNAPSHOT~af8b84438c
    dateFormat X
    axisFormat %s
section baseline
no_agent (19.306 ms) : 19107, 19505
.   : milestone, 19306,
appsec (18.731 ms) : 18541, 18922
.   : milestone, 18731,
code_origins (17.782 ms) : 17607, 17957
.   : milestone, 17782,
iast (17.721 ms) : 17546, 17895
.   : milestone, 17721,
profiling (18.612 ms) : 18426, 18798
.   : milestone, 18612,
tracing (17.788 ms) : 17611, 17965
.   : milestone, 17788,
section candidate
no_agent (17.94 ms) : 17758, 18122
.   : milestone, 17940,
appsec (19.275 ms) : 19083, 19468
.   : milestone, 19275,
code_origins (18.324 ms) : 18137, 18510
.   : milestone, 18324,
iast (17.714 ms) : 17538, 17890
.   : milestone, 17714,
profiling (18.898 ms) : 18710, 19086
.   : milestone, 18898,
tracing (17.914 ms) : 17736, 18093
.   : milestone, 17914,
  • baseline results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 19.306 ms [19.107 ms, 19.505 ms] -
appsec 18.731 ms [18.541 ms, 18.922 ms] -574.46 µs (-3.0%)
code_origins 17.782 ms [17.607 ms, 17.957 ms] -1.524 ms (-7.9%)
iast 17.721 ms [17.546 ms, 17.895 ms] -1.585 ms (-8.2%)
profiling 18.612 ms [18.426 ms, 18.798 ms] -694.0 µs (-3.6%)
tracing 17.788 ms [17.611 ms, 17.965 ms] -1.518 ms (-7.9%)
  • candidate results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 17.94 ms [17.758 ms, 18.122 ms] -
appsec 19.275 ms [19.083 ms, 19.468 ms] 1.335 ms (7.4%)
code_origins 18.324 ms [18.137 ms, 18.51 ms] 383.621 µs (2.1%)
iast 17.714 ms [17.538 ms, 17.89 ms] -226.238 µs (-1.3%)
profiling 18.898 ms [18.71 ms, 19.086 ms] 957.681 µs (5.3%)
tracing 17.914 ms [17.736 ms, 18.093 ms] -25.879 µs (-0.1%)
Request duration reports for insecure-bank
gantt
    title insecure-bank - request duration [CI 0.99] : candidate=1.60.0-SNAPSHOT~c0e1c11a92, baseline=1.60.0-SNAPSHOT~af8b84438c
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.181 ms) : 1170, 1192
.   : milestone, 1181,
iast (3.144 ms) : 3102, 3186
.   : milestone, 3144,
iast_FULL (5.797 ms) : 5739, 5856
.   : milestone, 5797,
iast_GLOBAL (3.597 ms) : 3540, 3655
.   : milestone, 3597,
profiling (2.059 ms) : 2040, 2077
.   : milestone, 2059,
tracing (1.793 ms) : 1778, 1808
.   : milestone, 1793,
section candidate
no_agent (1.174 ms) : 1162, 1185
.   : milestone, 1174,
iast (3.292 ms) : 3246, 3337
.   : milestone, 3292,
iast_FULL (5.626 ms) : 5570, 5683
.   : milestone, 5626,
iast_GLOBAL (3.466 ms) : 3408, 3523
.   : milestone, 3466,
profiling (2.049 ms) : 2031, 2067
.   : milestone, 2049,
tracing (1.796 ms) : 1779, 1814
.   : milestone, 1796,
  • baseline results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 1.181 ms [1.17 ms, 1.192 ms] -
iast 3.144 ms [3.102 ms, 3.186 ms] 1.963 ms (166.2%)
iast_FULL 5.797 ms [5.739 ms, 5.856 ms] 4.616 ms (390.9%)
iast_GLOBAL 3.597 ms [3.54 ms, 3.655 ms] 2.416 ms (204.6%)
profiling 2.059 ms [2.04 ms, 2.077 ms] 877.61 µs (74.3%)
tracing 1.793 ms [1.778 ms, 1.808 ms] 612.471 µs (51.9%)
  • candidate results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 1.174 ms [1.162 ms, 1.185 ms] -
iast 3.292 ms [3.246 ms, 3.337 ms] 2.118 ms (180.4%)
iast_FULL 5.626 ms [5.57 ms, 5.683 ms] 4.452 ms (379.3%)
iast_GLOBAL 3.466 ms [3.408 ms, 3.523 ms] 2.292 ms (195.2%)
profiling 2.049 ms [2.031 ms, 2.067 ms] 874.889 µs (74.5%)
tracing 1.796 ms [1.779 ms, 1.814 ms] 622.427 µs (53.0%)

Dacapo

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master andrea.marziali/serviename-integration
git_commit_date 1771530866 1771583914
git_commit_sha af8b844 c0e1c11
release_version 1.60.0-SNAPSHOT~af8b84438c 1.60.0-SNAPSHOT~c0e1c11a92
See matching parameters
Baseline Candidate
application biojava biojava
ci_job_date 1771585825 1771585825
ci_job_id 1442454778 1442454778
ci_pipeline_id 97817314 97817314
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version Linux runner-zfyrx7zua-project-304-concurrent-0-loaih57q 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux Linux runner-zfyrx7zua-project-304-concurrent-0-loaih57q 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 12 metrics, 0 unstable metrics.

Execution time for tomcat
gantt
    title tomcat - execution time [CI 0.99] : candidate=1.60.0-SNAPSHOT~c0e1c11a92, baseline=1.60.0-SNAPSHOT~af8b84438c
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.483 ms) : 1471, 1495
.   : milestone, 1483,
appsec (2.602 ms) : 2544, 2661
.   : milestone, 2602,
iast (2.27 ms) : 2201, 2340
.   : milestone, 2270,
iast_GLOBAL (2.308 ms) : 2238, 2378
.   : milestone, 2308,
profiling (2.117 ms) : 2060, 2173
.   : milestone, 2117,
tracing (2.083 ms) : 2029, 2137
.   : milestone, 2083,
section candidate
no_agent (1.476 ms) : 1465, 1488
.   : milestone, 1476,
appsec (2.532 ms) : 2477, 2587
.   : milestone, 2532,
iast (2.262 ms) : 2192, 2331
.   : milestone, 2262,
iast_GLOBAL (2.33 ms) : 2259, 2401
.   : milestone, 2330,
profiling (2.107 ms) : 2052, 2163
.   : milestone, 2107,
tracing (2.079 ms) : 2025, 2133
.   : milestone, 2079,
  • baseline results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 1.483 ms [1.471 ms, 1.495 ms] -
appsec 2.602 ms [2.544 ms, 2.661 ms] 1.119 ms (75.5%)
iast 2.27 ms [2.201 ms, 2.34 ms] 787.297 µs (53.1%)
iast_GLOBAL 2.308 ms [2.238 ms, 2.378 ms] 825.277 µs (55.6%)
profiling 2.117 ms [2.06 ms, 2.173 ms] 633.742 µs (42.7%)
tracing 2.083 ms [2.029 ms, 2.137 ms] 599.807 µs (40.4%)
  • candidate results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 1.476 ms [1.465 ms, 1.488 ms] -
appsec 2.532 ms [2.477 ms, 2.587 ms] 1.056 ms (71.5%)
iast 2.262 ms [2.192 ms, 2.331 ms] 785.571 µs (53.2%)
iast_GLOBAL 2.33 ms [2.259 ms, 2.401 ms] 853.642 µs (57.8%)
profiling 2.107 ms [2.052 ms, 2.163 ms] 631.228 µs (42.8%)
tracing 2.079 ms [2.025 ms, 2.133 ms] 602.703 µs (40.8%)
Execution time for biojava
gantt
    title biojava - execution time [CI 0.99] : candidate=1.60.0-SNAPSHOT~c0e1c11a92, baseline=1.60.0-SNAPSHOT~af8b84438c
    dateFormat X
    axisFormat %s
section baseline
no_agent (14.996 s) : 14996000, 14996000
.   : milestone, 14996000,
appsec (14.903 s) : 14903000, 14903000
.   : milestone, 14903000,
iast (17.746 s) : 17746000, 17746000
.   : milestone, 17746000,
iast_GLOBAL (17.723 s) : 17723000, 17723000
.   : milestone, 17723000,
profiling (14.719 s) : 14719000, 14719000
.   : milestone, 14719000,
tracing (14.663 s) : 14663000, 14663000
.   : milestone, 14663000,
section candidate
no_agent (14.944 s) : 14944000, 14944000
.   : milestone, 14944000,
appsec (14.811 s) : 14811000, 14811000
.   : milestone, 14811000,
iast (17.99 s) : 17990000, 17990000
.   : milestone, 17990000,
iast_GLOBAL (17.694 s) : 17694000, 17694000
.   : milestone, 17694000,
profiling (14.744 s) : 14744000, 14744000
.   : milestone, 14744000,
tracing (14.576 s) : 14576000, 14576000
.   : milestone, 14576000,
  • baseline results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 14.996 s [14.996 s, 14.996 s] -
appsec 14.903 s [14.903 s, 14.903 s] -93.0 ms (-0.6%)
iast 17.746 s [17.746 s, 17.746 s] 2.75 s (18.3%)
iast_GLOBAL 17.723 s [17.723 s, 17.723 s] 2.727 s (18.2%)
profiling 14.719 s [14.719 s, 14.719 s] -277.0 ms (-1.8%)
tracing 14.663 s [14.663 s, 14.663 s] -333.0 ms (-2.2%)
  • candidate results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 14.944 s [14.944 s, 14.944 s] -
appsec 14.811 s [14.811 s, 14.811 s] -133.0 ms (-0.9%)
iast 17.99 s [17.99 s, 17.99 s] 3.046 s (20.4%)
iast_GLOBAL 17.694 s [17.694 s, 17.694 s] 2.75 s (18.4%)
profiling 14.744 s [14.744 s, 14.744 s] -200.0 ms (-1.3%)
tracing 14.576 s [14.576 s, 14.576 s] -368.0 ms (-2.5%)

@pr-commenter

pr-commenter bot commented Feb 17, 2026

Kafka / consumer-benchmark

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master andrea.marziali/serviename-integration
git_commit_date 1771530866 1771583914
git_commit_sha af8b844 c0e1c11
See matching parameters
Baseline Candidate
ci_job_date 1771584948 1771584948
ci_job_id 1442454787 1442454787
ci_pipeline_id 97817314 97817314
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
jdkVersion 11.0.25 11.0.25
jmhVersion 1.36 1.36
jvm /usr/lib/jvm/java-11-openjdk-amd64/bin/java /usr/lib/jvm/java-11-openjdk-amd64/bin/java
jvmArgs -Dfile.encoding=UTF-8 -Djava.io.tmpdir=/go/src/github.com/DataDog/apm-reliability/dd-trace-java/platform/src/consumer-benchmark/build/tmp/jmh -Duser.country=US -Duser.language=en -Duser.variant -Dfile.encoding=UTF-8 -Djava.io.tmpdir=/go/src/github.com/DataDog/apm-reliability/dd-trace-java/platform/src/consumer-benchmark/build/tmp/jmh -Duser.country=US -Duser.language=en -Duser.variant
vmName OpenJDK 64-Bit Server VM OpenJDK 64-Bit Server VM
vmVersion 11.0.25+9-post-Ubuntu-1ubuntu122.04 11.0.25+9-post-Ubuntu-1ubuntu122.04

Summary

Found 1 performance improvement and 0 performance regressions! Performance is the same for 2 metrics, 0 unstable metrics.

scenario Δ mean throughput
scenario:only-tracing-dsm-disabled-benchmarks/KafkaConsumerBenchmark.benchConsume better
[+3680.734op/s; +21345.148op/s] or [+1.243%; +7.208%]
See unchanged results
scenario Δ mean throughput
scenario:not-instrumented/KafkaConsumerBenchmark.benchConsume unsure
[+474.320op/s; +10580.746op/s] or [+0.163%; +3.639%]
scenario:only-tracing-dsm-enabled-benchmarks/KafkaConsumerBenchmark.benchConsume same

@amarziali amarziali force-pushed the andrea.marziali/serviename-integration branch from 7a4d869 to 32dba74 on February 18, 2026 12:59
@amarziali amarziali changed the title from wip to [WIP] Track service name source Feb 18, 2026
@pr-commenter

pr-commenter bot commented Feb 18, 2026

Kafka / producer-benchmark

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master andrea.marziali/serviename-integration
git_commit_date 1771530866 1771583914
git_commit_sha af8b844 c0e1c11
See matching parameters
Baseline Candidate
ci_job_date 1771585053 1771585053
ci_job_id 1442454785 1442454785
ci_pipeline_id 97817314 97817314
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
jdkVersion 11.0.25 11.0.25
jmhVersion 1.36 1.36
jvm /usr/lib/jvm/java-11-openjdk-amd64/bin/java /usr/lib/jvm/java-11-openjdk-amd64/bin/java
jvmArgs -Dfile.encoding=UTF-8 -Djava.io.tmpdir=/go/src/github.com/DataDog/apm-reliability/dd-trace-java/platform/src/producer-benchmark/build/tmp/jmh -Duser.country=US -Duser.language=en -Duser.variant -Dfile.encoding=UTF-8 -Djava.io.tmpdir=/go/src/github.com/DataDog/apm-reliability/dd-trace-java/platform/src/producer-benchmark/build/tmp/jmh -Duser.country=US -Duser.language=en -Duser.variant
vmName OpenJDK 64-Bit Server VM OpenJDK 64-Bit Server VM
vmVersion 11.0.25+9-post-Ubuntu-1ubuntu122.04 11.0.25+9-post-Ubuntu-1ubuntu122.04

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 3 metrics, 0 unstable metrics.

See unchanged results
scenario Δ mean throughput
scenario:not-instrumented/KafkaProduceBenchmark.benchProduce same
scenario:only-tracing-dsm-disabled-benchmarks/KafkaProduceBenchmark.benchProduce same
scenario:only-tracing-dsm-enabled-benchmarks/KafkaProduceBenchmark.benchProduce unsure
[+774.156op/s; +4023.107op/s] or [+0.526%; +2.735%]

@amarziali amarziali force-pushed the andrea.marziali/serviename-integration branch from f4aa3fe to a7f8b76 on February 18, 2026 16:24
@amarziali amarziali marked this pull request as ready for review February 18, 2026 16:24
@amarziali amarziali requested review from a team as code owners February 18, 2026 16:24
@amarziali amarziali requested review from claponcet, jandro996 and mcculls and removed request for a team February 18, 2026 16:24
@amarziali amarziali changed the title from [WIP] Track service name source to Track service name source Feb 18, 2026
@github-actions
Contributor

github-actions bot commented Feb 18, 2026

Hi! 👋 Thanks for your pull request! 🎉

To help us review it, please make sure to:

  • Add at least one type, and one component or instrumentation label to the pull request

If you need help, please check our contributing guidelines.

@amarziali amarziali added type: enhancement Enhancements and improvements comp: core Tracer core labels Feb 18, 2026
@raphaelgavache
Member

is the client stats part being done in a different PR? https://github.com/DataDog/datadog-agent/pull/45982/changes
srv_src field on client stats

@amarziali
Contributor Author

is the client stats part being done in a different PR? https://github.com/DataDog/datadog-agent/pull/45982/changes srv_src field on client stats

yep I will stack another PR on top

@raphaelgavache
Member

Could you send some test data to Datadog with this PR, to double-check from the UI?

Member

@raphaelgavache raphaelgavache left a comment


lgtm

Member

@jandro996 jandro996 left a comment


LGTM! FYI: I’m currently working on a PR that affects Inferred Proxy Span. If it gets merged first, we might need to make a few small adjustments on our side: #10561

@amarziali amarziali force-pushed the andrea.marziali/serviename-integration branch from d070b9a to 8bfae7a on February 19, 2026 15:23
@amarziali amarziali added this pull request to the merge queue Feb 20, 2026
@dd-octo-sts
Contributor

dd-octo-sts bot commented Feb 20, 2026

/merge

@gh-worker-devflow-routing-ef8351

gh-worker-devflow-routing-ef8351 bot commented Feb 20, 2026

View all feedbacks in Devflow UI.

2026-02-20 18:53:50 UTC ℹ️ Start processing command /merge


2026-02-20 18:53:54 UTC ℹ️ MergeQueue: pull request added to the queue

The expected merge time in master is approximately 1h (p90).


2026-02-20 19:53:01 UTC ℹ️ MergeQueue: This merge request was merged

@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Feb 20, 2026
@gh-worker-dd-mergequeue-cf854d gh-worker-dd-mergequeue-cf854d bot merged commit 20abae4 into master Feb 20, 2026
571 checks passed
@gh-worker-dd-mergequeue-cf854d gh-worker-dd-mergequeue-cf854d bot deleted the andrea.marziali/serviename-integration branch February 20, 2026 19:53
@github-actions github-actions bot added this to the 1.60.0 milestone Feb 20, 2026
gh-worker-dd-mergequeue-cf854d bot pushed a commit to DataDog/dd-trace-go that referenced this pull request Mar 12, 2026
This PR is the first of many that attach a service-name source to service overrides. In particular, this PR contains:
- introduction of the `instrumentation.ServiceNameWithSource` method, to be used by all integrations
- 4 integrations covered: grpc, gin-gonic, go-redis, database/sql
- inheritance of the service source
- coverage of the service mapping configuration
- encoding of the source in span.Meta

See other similar PRs in dd-trace-java [PR1 - integration services](DataDog/dd-trace-java#10607),  [PR2- client stats](DataDog/dd-trace-java#10653), [PR3 - config cases](DataDog/dd-trace-java#10658),  [PR4 - manual source](DataDog/dd-trace-java#10704)

<img width="1027" height="655" alt="Screenshot 2026-03-10 at 11 48 36" src="https://github.com/user-attachments/assets/a7db0a35-34cd-4541-bf23-1c8d500af032" />



### Reviewer's Checklist
<!--
* Authors can use this list as a reference to ensure that there are no problems
  during the review but the signing off is to be done by the reviewer(s).
-->

- [ ] Changed code has unit tests for its functionality at or near 100% coverage.
- [ ] [System-Tests](https://github.com/DataDog/system-tests/) covering this feature have been added and enabled with the va.b.c-dev version tag.
- [ ] There is a benchmark for any new code, or changes to existing code.
- [ ] If this interacts with the agent in a new way, a system test has been added.
- [ ] New code is free of linting errors. You can check this by running `make lint` locally.
- [ ] New code doesn't break existing tests. You can check this by running `make test` locally.
- [ ] Add an appropriate team label so this PR gets put in the right place for the release notes.
- [ ] All generated files are up to date. You can check this by running `make generate` locally.
- [ ] Non-trivial go.mod changes, e.g. adding new modules, are reviewed by @DataDog/dd-trace-go-guild. Make sure all nested modules are up to date by running `make fix-modules` locally.

Unsure? Have a question? Request a review!


Co-authored-by: raphael.gavache <raphael.gavache@datadoghq.com>
NachoEchevarria added a commit to DataDog/dd-trace-dotnet that referenced this pull request Mar 13, 2026
## Summary of changes

Add `_dd.svc_src` meta tag to spans to track the source of the service
name. When an integration overrides the default service name (e.g.,
schema V0 without `removeClientServiceNames`), the tag is set to the
integration name (e.g., `"redis"`, `"kafka"`, `"http-client"`). When a
service name mapping is configured via `DD_TRACE_SERVICE_MAPPING`, the
tag is set to `"opt.service_mapping"`. When the default service name is
used, no tag is emitted.

Jira: https://datadoghq.atlassian.net/browse/APMLP-1015
RFC:
https://docs.google.com/document/d/11OnbVYMDK-c5D-_V4QfOvL0Pc0z5oFQFGY3xSI-W7xk/edit?tab=t.0

## Reason for change

.NET equivalent of
[dd-trace-java#10607](DataDog/dd-trace-java#10607).
Service name source attribution lets the backend know which component
set the service name on each span.

## Implementation details

- **Tag constant**: `Tags.ServiceNameSource = "_dd.svc_src"` and
`SpanContext.ServiceNameSource` property.
- **`ServiceNameMetadata`**: Encapsulates resolved service name and
source attribution. Returned by all schema `GetServiceNameMetadata()`
methods and `PerTraceSettings.GetServiceNameMetadata()`.
- **Schema-level source**: `DatabaseSchema`, `MessagingSchema`, and
`ClientSchema` each use a `Resolve` helper that determines the source:
`"opt.service_mapping"` when from `DD_TRACE_SERVICE_MAPPING`, the
integration key when service name ≠ default, or `null` otherwise.
- **`PerTraceSettings.GetServiceNameMetadata()`**: For AdoNet and AWS
integrations that resolve service names dynamically, returns
`"opt.service_mapping"` for mapped names, the integration key for
suffixed names, or `null` for default.
- **Integration callsites**: ~30 files updated to pass
`serviceNameSource`. Server-side integrations using the default service
name are unchanged.
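
The precedence the schema-level `Resolve` helpers implement can be sketched roughly as follows (Python used purely as illustration; the real code is C#, and all names here are hypothetical):

```python
OPT_SERVICE_MAPPING = "opt.service_mapping"

def resolve_service_name(default_service, integration_key, mapped_name, schema_name):
    """Return (service_name, source) following the precedence described above.

    mapped_name: entry from DD_TRACE_SERVICE_MAPPING, or None.
    schema_name: the name the schema would otherwise use
                 (e.g. 'myapp-redis' under schema V0).
    Illustrative sketch only, not the real C# API.
    """
    if mapped_name is not None:
        return mapped_name, OPT_SERVICE_MAPPING   # user-configured mapping wins
    if schema_name != default_service:
        return schema_name, integration_key       # integration override
    return default_service, None                  # default name: no tag emitted
```

When the source comes back as `None`, no `_dd.svc_src` tag is set on the span.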

## Test coverage

- **`DatabaseSchemaTests`**, **`MessagingSchemaTests`**,
**`ClientSchemaTests`**: Tests for source attribution.
- **Snapshot files**: 130+ integration test snapshots and 3 smoke test
snapshots updated.

## Follow-up items

- V1 schema with `DD_TRACE_SERVICE_MAPPING` could emit
`opt.service_mapping` (mapped name differs from default). V1
SpanMetadata rules and snapshots would need updating. Deferred to a
follow-up PR.
- Manually set service names (`"m"` source) and the client stats payload —
separate PRs.

## Other details
korniltsev-grafanista added a commit to grafana/pyroscope-dotnet that referenced this pull request Apr 14, 2026
* [Version Bump] 3.40.0 (#8280)

The following files were found to be modified (as expected)

- [x] docs/CHANGELOG.md
- [x] .azure-pipelines/ultimate-pipeline.yml
- [x]
profiler/src/ProfilerEngine/Datadog.Profiler.Native.Linux/CMakeLists.txt
- [x]
profiler/src/ProfilerEngine/Datadog.Profiler.Native.Windows/Resource.rc
- [x]
profiler/src/ProfilerEngine/Datadog.Profiler.Native/dd_profiler_version.h
- [x]
profiler/src/ProfilerEngine/Datadog.Linux.ApiWrapper/CMakeLists.txt
- [x] profiler/src/ProfilerEngine/ProductVersion.props
- [x] shared/src/Datadog.Trace.ClrProfiler.Native/CMakeLists.txt
- [x] shared/src/Datadog.Trace.ClrProfiler.Native/Resource.rc
- [x] shared/src/msi-installer/WindowsInstaller.wixproj
- [x] shared/src/native-src/version.h
- [x] tracer/build/artifacts/dd-dotnet.sh
- [x] tracer/build/_build/Build.cs
- [x]
tracer/samples/AutomaticTraceIdInjection/MicrosoftExtensionsExample/MicrosoftExtensionsExample.csproj
- [x]
tracer/samples/AutomaticTraceIdInjection/Log4NetExample/Log4NetExample.csproj
- [x]
tracer/samples/AutomaticTraceIdInjection/NLog40Example/NLog40Example.csproj
- [x]
tracer/samples/AutomaticTraceIdInjection/NLog45Example/NLog45Example.csproj
- [x]
tracer/samples/AutomaticTraceIdInjection/NLog46Example/NLog46Example.csproj
- [x]
tracer/samples/AutomaticTraceIdInjection/SerilogExample/SerilogExample.csproj
- [x] tracer/samples/ConsoleApp/Alpine3.10.dockerfile
- [x] tracer/samples/ConsoleApp/Alpine3.9.dockerfile
- [x] tracer/samples/ConsoleApp/Debian.dockerfile
- [x] tracer/samples/OpenTelemetry/Debian.dockerfile
- [x] tracer/samples/WindowsContainer/Dockerfile
- [x] tracer/src/Datadog.Trace.ClrProfiler.Managed.Loader/Startup.cs
- [x] tracer/src/Datadog.Tracer.Native/CMakeLists.txt
- [x] tracer/src/Datadog.Tracer.Native/dd_profiler_constants.h
- [x] tracer/src/Datadog.Tracer.Native/Resource.rc
- [x] tracer/src/Directory.Build.props
- [x] tracer/src/Datadog.Trace/TracerConstants.cs

@DataDog/apm-dotnet

Co-authored-by: bouwkast <8877527+bouwkast@users.noreply.github.com>

* Add an `IArrayPool<char>` implementation for vendored Newtonsoft.JSON (#8228)

## Summary of changes

Adds a simple `IArrayPool<char>` for use by Newtonsoft.JSON, and uses it
everywhere we can

## Reason for change

Newtonsoft.JSON fundamentally works with .NET's `char` type (UTF-16),
(as opposed to System.Text.Json which works with UTF-8 where it can). To
do so, it needs to create a bunch of `char[]` instances to use as
buffers.

The `JsonTextReader` and `JsonTextWriter` abstractions allow plugging in
an `IArrayPool<char>` implementation, and luckily this matches (pretty
much exactly) the API exposed by the `ArrayPool` in corelib (+
vendored), so it's easy to implement.

This should help alleviate some GC pressure, as we currently do a fair
amount of serializing and deserializing.

## Implementation details

Pretty simple:
- Implement `IArrayPool<char>` using `ArrayPool<char>.Shared`
- Get 🤖 to find everywhere that we could use it (`JsonTextReader` and
`JsonTextWriter`) and initialize
- Fix some cases where these weren't being disposed.

> [!WARNING]
> It's important that we _do_ dispose these, so that the arrays are
> correctly returned to the pool and we don't leak memory
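
The shape of a rent/return buffer pool is simple; here is a language-agnostic sketch (Python stand-in for the C# `ArrayPool<char>.Shared` adapter, names hypothetical):

```python
class CharArrayPool:
    """Minimal rent/return buffer pool, bucketed by power-of-two size (sketch)."""

    def __init__(self):
        self._buckets = {}  # capacity -> list of free buffers

    @staticmethod
    def _round_up(n):
        size = 16
        while size < n:
            size *= 2
        return size

    def rent(self, minimum_length):
        size = self._round_up(minimum_length)
        free = self._buckets.get(size)
        if free:
            return free.pop()        # reuse a previously returned buffer
        return bytearray(size)       # allocate on a pool miss

    def return_buffer(self, buf):
        # Skipping this is the failure mode described above: the buffer
        # is never made reusable, so every rent allocates fresh.
        self._buckets.setdefault(len(buf), []).append(buf)
```

The real `ArrayPool<char>.Shared` is thread-safe and size-capped; this sketch only shows the rent/return contract the JSON reader/writer relies on.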

There are actually other places we can update too, as this PR doesn't
cover the common `JsonConvert.Serialize()` etc, but I'll follow up with
those in a separate PR.

## Test coverage

All the existing tests should pass. I worked on this as part of general
perf work on remote config, so the numbers are a bit fuzzy (I can't
remember whether they include the savings from #8226 as well), but the
results are pretty conclusive, especially for big payloads 😅


| Method | size | Mean | Error | Allocated |
| ---------------------------- | ----- | ----------: | ------------: | ----------: |
| DeserializeResponse_Original | Small | 22.71 ns | 2,076.771 ns | 23.83 KB |
| DeserializeResponse_Updated | Small | 13.70 us | 0.186 us | 17.23 KB |
| | | | | |
| DeserializeResponse_Original | Big | 1,953.04 us | 58,665.219 ns | 2,343.26 KB |
| DeserializeResponse_Updated | Big | 614.46 us | 11.988 us | 252.37 KB |


## Other details


https://datadoghq.atlassian.net/browse/LANGPLAT-940

Discovered this while exploring remote config optimizations

* Use `IArrayPool` in more places for JSON (#8236)

## Summary of changes

Adds JSON helper APIs to ensure we use the array pool where possible

## Reason for change

There are various "helper" APIs, which are wrappers around
Newtonsoft.JSON's `JsonSerializer` and `JsonReader`/`JsonWriter` APIs.
In #8228 we updated the explicit usages to use an array pool
implementation for `JsonReader`/`JsonWriter` calls, but they're used
internally without a pool in the helper cases.

This PR creates alternative implementations which _do_ use the pool,
updates existing code to use them, and adds the existing APIs to the
"banned API" list.

> There's another possible approach, in which we update the vendored
code to _always_ use the array pool. I was torn as to which is the
better option, and went for this approach in the end, but I'm not wedded
to it, so happy to take the alternative approach if people think it's
preferable?

## Implementation details

- Use 🤖 to find all the potential places that we need to convert.
- Create "array pool" versions of the APIs
- Update the call sites to use the new APIs
- Add the original APIs to the list of "banned" APIs to avoid using them
accidentally in the future

## Test coverage

Covered by all our existing tests, added some unit tests to confirm the
new tests behave as expected

## Other details

https://datadoghq.atlassian.net/browse/LANGPLAT-940

Discovered this while exploring remote config optimizations, but should
help lots of areas.

* Remote configuration tests and performance improvements (#8237)

## Summary of changes

- Add tests for `RcmSubscriptionManager` and `RemoteConfigurationPath`
- Replace regex with string comparison in `RemoteConfigurationPath`

## Reason for change

- We were missing unit tests for remote config stuff, and I want to
improve it without breaking things
- The `RemoteConfigurationPath` is running a `Regex` on every file
listed in the remote config response (which happens every 5s), but it's
a simple pattern that can be easily directly implemented

## Implementation details

- Used 🤖 to generate a bunch of tests, and verified they are really how
we want things to work
  - High level tests for `RcmSubscriptionManager`
  - Tests for `RemoteConfigurationPath` covering changes in this PR
- More 🤖 in the conversion, but it's relatively simple, once you decode
the allowed patterns from the Regex 😄
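
The regex-to-string-comparison conversion boils down to validating a fixed segment layout. A rough Python sketch of the idea (the exact grammar accepted by `RemoteConfigurationPath` may differ; this assumes the `datadog/<org>/<product>/<config_id>/<name>` layout visible in the benchmark below):

```python
def parse_rcm_path(path):
    """Parse an RCM file path with plain string operations instead of a
    Regex (illustrative layout only)."""
    parts = path.split("/")
    if len(parts) != 5:
        return None
    source, org, product, config_id, name = parts
    if source != "datadog" or not org.isdigit():
        return None                       # reject unknown sources / bad org ids
    if not product or not config_id or not name:
        return None                       # all segments must be non-empty
    return {"product": product, "id": config_id, "path": path}
```

Splitting and comparing segments avoids the regex engine (and its match-object allocations) on every file in every 5s response.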

## Test coverage

Unit tests in this PR cover compatibility with the existing
implementation.

Simple benchmarking for the regex improvements:


| Method | Runtime | Mean | Error | Allocated |
| ---------------------------------------- | ------------------ | -------: | ------: | --------: |
| RemoteConfigurationPathFromPath_Original | .NET 10.0 | 181.5 ns | 2.51 ns | 768 B |
| RemoteConfigurationPathFromPath_Updated | .NET 10.0 | 54.8 ns | 3.90 ns | 152 B |
| | | | | |
| RemoteConfigurationPathFromPath_Original | .NET 6.0 | 204.0 ns | 2.64 ns | 768 B |
| RemoteConfigurationPathFromPath_Updated | .NET 6.0 | 66.4 ns | 1.13 ns | 152 B |
| | | | | |
| RemoteConfigurationPathFromPath_Original | .NET Core 2.1 | 296.9 ns | 4.09 ns | 872 B |
| RemoteConfigurationPathFromPath_Updated | .NET Core 2.1 | 82.2 ns | 2.21 ns | 160 B |
| | | | | |
| RemoteConfigurationPathFromPath_Original | .NET Core 3.1 | 281.0 ns | 3.79 ns | 768 B |
| RemoteConfigurationPathFromPath_Updated | .NET Core 3.1 | 72.8 ns | 1.90 ns | 152 B |
| | | | | |
| RemoteConfigurationPathFromPath_Original | .NET Framework 4.8 | 326.7 ns | 2.11 ns | 875 B |
| RemoteConfigurationPathFromPath_Updated | .NET Framework 4.8 | 110.0 ns | 1.76 ns | 160 B |

<details><summary>Benchmarking code</summary>
<p>

```csharp
[MemoryDiagnoser, GroupBenchmarksBy(BenchmarkLogicalGroupRule.ByCategory), CategoriesColumn]
public class RemoteConfigBenchmark
{
    private string _pathToTest;
    private string _result;

    [GlobalSetup]
    public void GlobalSetup()
    {
        _pathToTest = "datadog/2/ASM_FEATURES/ASM_FEATURES-third/testname";
    }

    [Benchmark]
    public string RemoteConfigurationPathFromPath_Original()
    {
        var result = OriginalRemoteConfigurationPath.FromPath(_pathToTest);
        _result = result.Id;
        return result.Path;
    }

    [Benchmark]
    public string RemoteConfigurationPathFromPath_Updated()
    {
        var result = RemoteConfigurationPath.FromPath(_pathToTest);
        _result = result.Id;
        return result.Path;
    }
}
```


</p>
</details> 




## Other details

https://datadoghq.atlassian.net/browse/LANGPLAT-940

All part of the Remote Config perf stack

* Reduce payload size sent by agent in remote config (#8238)

## Summary of changes

Stop sending the fixed `RootVersion = 1` with every remote config
request

## Reason for change

Currently we're sending a fixed value of `RootVersion = 1` for all our
remote config requests, but doing so causes the agent to repeatedly send
us all the root certificates, significantly increasing the payload size,
because it thinks we haven't seen them. Sending the "final" version
acknowledges them and stops all the extra data, saving ~35kB per call.

## Implementation details

- Deserialize the roots in `GetRcmResponse`, leaving them as base64
encoded `string`s (which is how they are sent in the payload)
- When processing the response, decode the _last_ root object, and
deserialize it
  - Using the `Base64DecodingStream` introduced in #8226 to avoid extra
allocations
  - Using the `IArrayPool` from #8228
- Introduces a `MinimalTufRoot` (in contrast to `TufRoot`) so that we
only materialize what we need (the `roots.signed.version` key)

This implementation avoids ~35kB per call for subsequent remote config
requests.
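
The "minimal materialization" idea can be sketched like this (Python stand-in; the real C# code streams the base64 decode and deserializes into a `MinimalTufRoot` type, but the only value it needs is `signed.version`):

```python
import base64
import json

def latest_root_version(roots_b64):
    """Given the list of base64-encoded TUF roots from the RCM response,
    decode only the LAST one and read signed.version (sketch)."""
    if not roots_b64:
        return None
    decoded = base64.b64decode(roots_b64[-1])
    root = json.loads(decoded)            # real code avoids the intermediate buffer
    return root["signed"]["version"]      # the only key we materialize
```

Echoing that version back (instead of the fixed `1`) tells the agent the roots were seen, so it stops resending them.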

## Test coverage

Added a unit test, and did a manual test with the real agent, to confirm
the expected behaviour (reduction in data sent)

## Other details

https://datadoghq.atlassian.net/browse/LANGPLAT-940

All part of the Remote Config perf stack

* Exclude Otel for MongoDb 3.7 (#8279)

## Summary of changes

Adds MongoDB.Driver to the Activity ignore handler, to avoid duplicate
instrumentations

## Reason for change

MongoDB .NET driver v3.7.0 [adds support for
OpenTelemetry](https://github.com/mongodb/mongo-csharp-driver/releases/tag/v3.7.0),
but this results in duplicate instrumentations for our MongoDb
instrumentation. You can see this in play in [the test-package-version
PR](https://github.com/DataDog/dd-trace-dotnet/pull/8278) here

An AI analysis (shown in full below), has the following summary:

**Cannot replace custom instrumentation with OTel spans.** Key blockers:

1. **No query body** (`mongodb.query`) — This is the most valuable tag
for debugging and is completely absent from OTel spans
2. **Wrong span type** (`http` instead of `mongodb`) — Would break DB
categorization in Datadog UI
3. **Wrong span name** (`client.request` instead of `mongodb.query`) —
Loses MongoDB-specific identification
4. **No separate service name** in SchemaV0 — Breaks service map
topology
5. **Doubled span count** — 30 extra spans per trace adds overhead

There's a bunch of other tag differences, some of which may or may not
be a problem, but the fact that they don't tag the query body likely
precludes us deferring to the OTel approach, as it would be lacking
crucial information.

<details><summary>AI analysis comparing custom instrumentation to
OTel</summary>
<p>

# MongoDB 3.7.0 OTel Instrumentation Analysis

## Context

MongoDB.Driver 3.7.0 adds built-in OpenTelemetry instrumentation. When
used with our existing dd-trace-dotnet custom instrumentation, this
creates duplicate/overlapping spans. We need to determine whether the
OTel spans can replace our custom instrumentation, or whether we should
disable them.

This analysis is based on the diff at `mongodb.diff`, which compares the
SchemaV0 snapshot for the `3_0` package version (before) against the new
output with MongoDB 3.7.0 (after).

---

## Span Count Change

| | Before (3.0) | After (3.7.0) |
|---|---|---|
| App spans (internal) | 4 | 4 |
| DD `mongodb.query` spans | 16 | 15 |
| OTel Level 1 spans (logical operation) | 0 | **15** |
| OTel Level 2 spans (wire command) | 0 | **15** |
| **Total** | **20** | **49** |

The span count more than doubles: **+30 new OTel spans** (15 L1 + 15
L2).

Note: the table counts 16 `mongodb.query` spans before vs 15 after, but
a careful recount puts both snapshots at 15 `mongodb.query` spans; the
diff simply renumbers span IDs. The initial `find` that was directly
under Main() (Id_2) now has an OTel L1 parent (Id_6).

---

## New Span Hierarchy

### Before (2-tier):
```
App span (e.g., "sync-calls")
  └── mongodb.query (our custom instrumentation)
```

### After (4-tier):
```
App span (e.g., "sync-calls")
  └── OTel L1 - "client.request" (logical operation)
       └── mongodb.query (our custom instrumentation, RE-PARENTED)
            └── OTel L2 - "client.request" (wire protocol command)
```

Our `mongodb.query` spans are now sandwiched between two OTel spans. The
OTel L1 span becomes the parent of our span, and our span becomes the
parent of the OTel L2 span.

---

## Two Levels of OTel Spans

### OTel Level 1 — Logical Operation Spans

| Property | Value |
|---|---|
| Name | `client.request` |
| Resource | `<operation> <db>.<collection>` (e.g., `find test-db.employees`) |
| Service | `Samples.MongoDB` (app service name, NOT separate) |
| Type | `http` (incorrect for a DB span!) |
| span.kind | `client` |

**Tags:**
- `db.collection.name`: e.g., `employees`
- `db.namespace`: e.g., `test-db`
- `db.operation.name`: e.g., `find`, `delete`, `insert`,
`countDocuments`
- `db.operation.summary`: e.g., `find test-db.employees`
- `db.system.name`: `mongodb`
- `otel.library.name`: `MongoDB.Driver`
- `otel.library.version`: `3.7.0`
- `otel.status_code`: `STATUS_CODE_OK`

**No metrics.** No host/port info at this level.

### OTel Level 2 — Wire Protocol Command Spans

| Property | Value |
|---|---|
| Name | `client.request` |
| Resource | `<command>` (e.g., `find`, `delete`, `aggregate`) — **just the command, no db name** |
| Service | `Samples.MongoDB` (app service name) |
| Type | `http` (incorrect for a DB span!) |
| span.kind | `client` |

**Tags:**
- `db.collection.name`: e.g., `employees`
- `db.command.name`: e.g., `find`, `delete`, `insert`, `aggregate`
- `db.mongodb.lsid`: Session ID (BSON)
- `db.namespace`: e.g., `test-db`
- `db.query.summary`: e.g., `find test-db.employees`
- `db.system.name`: `mongodb`
- `network.transport`: `tcp`
- `server.address`: e.g., `mongo`
- `otel.library.name`: `MongoDB.Driver`
- `otel.library.version`: `3.7.0`
- `otel.status_code`: `STATUS_CODE_OK`

**Metrics:**
- `db.mongodb.driver_connection_id`: e.g., `3.0`
- `db.mongodb.server_connection_id`: e.g., `7.0`
- `server.port`: e.g., `27017.0`

---

## Tag-by-Tag Comparison: OTel Spans vs DD Custom Spans

### DD Custom `mongodb.query` span tags:
| Tag | DD Value | OTel L1 Equivalent | OTel L2 Equivalent |
|---|---|---|---|
| `component` | `MongoDb` | **MISSING** | **MISSING** |
| `db.name` | `test-db` | `db.namespace` = `test-db` | `db.namespace` = `test-db` |
| `mongodb.collection` | `employees` | `db.collection.name` = `employees` | `db.collection.name` = `employees` |
| `mongodb.query` | Full BSON query JSON | **MISSING** | **MISSING** |
| `out.host` | `mongo` | **MISSING** | `server.address` = `mongo` |
| `out.port` | `27017` | **MISSING** | `server.port` = `27017.0` (metric) |
| `span.kind` | `client` | `client` | `client` |
| `_dd.base_service` | `Samples.MongoDB` | N/A | N/A |

### DD Custom span properties:
| Property | DD Value | OTel L1 | OTel L2 |
|---|---|---|---|
| Name | `mongodb.query` | `client.request` | `client.request` |
| Resource | `<op> <db>` (e.g., `find test-db`) | `<op> <db>.<coll>` (e.g., `find test-db.employees`) | `<command>` (e.g., `find`) |
| Service | `Samples.MongoDB-mongodb` (SchemaV0) | `Samples.MongoDB` | `Samples.MongoDB` |
| Type | `mongodb` | `http` | `http` |

### Tags present in OTel but NOT in DD custom spans:
- `db.operation.name` / `db.command.name`
- `db.operation.summary` / `db.query.summary`
- `db.system.name`
- `db.mongodb.lsid` (session ID - L2 only)
- `network.transport` (L2 only)
- `server.address` (L2 only — similar to `out.host`)
- `otel.library.name`, `otel.library.version`
- `otel.status_code`

### Metrics present in OTel but NOT in DD custom spans:
- `db.mongodb.driver_connection_id` (L2 only)
- `db.mongodb.server_connection_id` (L2 only)
- `server.port` (L2 only — similar to `out.port`)

---

## Operation Name Mapping: `countDocuments` vs `aggregate`

Notable: The OTel L1 span uses the **logical operation name**
`countDocuments`, while our DD custom span (and OTel L2) use the **wire
protocol command** `aggregate`. This is because MongoDB implements
`countDocuments()` as an aggregate pipeline internally. The OTel L1 span
provides the more user-friendly logical name.

---

## Critical Gaps if Replacing DD Custom with OTel

### 1. **`mongodb.query` tag — MISSING from OTel**
The full BSON query body is the most significant tag in our custom
instrumentation. **Neither OTel level provides this.** The OTel spans
only include `db.query.summary` (e.g., `find test-db.employees`) which
is just the operation + namespace, not the actual query filter/pipeline.

### 2. **Span Type — `http` instead of `mongodb`**
Both OTel span levels use Type `http`, while our custom spans correctly
use Type `mongodb`. This affects categorization in the Datadog UI (DB
queries vs HTTP calls).

### 3. **Span Name — `client.request` instead of `mongodb.query`**
Generic OTel name vs our specific operation name.

### 4. **Service Name — No separate service (SchemaV0)**
In SchemaV0, our custom spans use `Samples.MongoDB-mongodb` (separate
service), while OTel spans use the app service name `Samples.MongoDB`.
This means no dedicated MongoDB service in the service map.

### 5. **Resource Name — Different format**
- DD: `find test-db` (operation + database)
- OTel L1: `find test-db.employees` (operation + db.collection — **more
specific, arguably better**)
- OTel L2: `find` (just the command — less useful)

### 6. **`component` tag — MISSING from OTel**
Our DD spans set `component: MongoDb`. OTel spans don't have this.

---

## What OTel Does Better

1. **Collection name in resource**: `find test-db.employees` is more
informative than `find test-db`
2. **Logical operation names**: `countDocuments` instead of `aggregate`
(L1 only)
3. **Connection metadata**: `driver_connection_id`,
`server_connection_id`, `network.transport`
4. **Session tracking**: `db.mongodb.lsid`
5. **Semantic conventions**: Uses standard OTel DB semantic conventions
(`db.namespace`, `db.system.name`, etc.)

---

## Summary / Recommendation

**Cannot replace custom instrumentation with OTel spans.** Key blockers:

1. **No query body** (`mongodb.query`) — This is the most valuable tag
for debugging and is completely absent from OTel spans
2. **Wrong span type** (`http` instead of `mongodb`) — Would break DB
categorization in Datadog UI
3. **Wrong span name** (`client.request` instead of `mongodb.query`) —
Loses MongoDB-specific identification
4. **No separate service name** in SchemaV0 — Breaks service map
topology
5. **Doubled span count** — 30 extra spans per trace adds overhead

**Recommended approach: Disable the OTel instrumentation** for MongoDB
3.7.0+ to maintain the existing behavior. This likely means either:
- Suppressing the MongoDB.Driver OTel ActivitySource at startup
- Or adjusting the integration to detect and skip OTel span creation
when our instrumentation is active

If keeping both is desired for any reason, the OTel spans should at
minimum be filtered out from the test snapshots, and ideally suppressed
at runtime to avoid the 2.5x span multiplication.



</p>
</details> 

## Implementation details

Add `MongoDB.Driver` to the activity ignore list.

## Test coverage

Bumped the tests to run with 3.7.0, so should be covered

## Other details

I do wonder if this is definitely the approach we _should_ be taking,
but let's take that offline

* Fix Datadog.Trace.Annotations dependency version in Datadog.AzureFunctions nuspec (#8285)

## Summary of changes

The `BuildAzureFunctionsNuget` target used `.SetVersion(Version)` which
appears to pass a `-p:Version` global MSBuild property, which overrode
the version of the Annotations project in the generated nuspec for the
Datadog.AzureFunctions package.

## Reason for change

In 3.39.0 we realized that this `SetVersion` overrides the version of
dependent projects when we build and pack them in CI; the intention of
the `SetVersion` call was to ease local development / debugging.

## Implementation details

## Test coverage

## Other details
<!-- Fixes #{issue} -->

This was used for Build-AzureFunctionsNuget.ps1 to help with local
development / debugging for generating versions to avoid NuGet caching
but had the side effect of changing the version of the Annotations in
the nuspec

<!--  ⚠️ Note:

Where possible, please obtain 2 approvals prior to merging. Unless
CODEOWNERS specifies otherwise, for external teams it is typically best
to have one review from a team member, and one review from apm-dotnet.
Trivial changes do not require 2 reviews.

MergeQueue is NOT enabled in this repository. If you have write access
to the repo, the PR has 1-2 approvals (see above), and all of the
required checks have passed, you can use the Squash and Merge button to
merge the PR. If you don't have write access, or you need help, reach
out in the #apm-dotnet channel in Slack.
-->

* Retry ChromeDriver startup in Selenium tests (#8284)

## Summary of changes
Improve resilience of the Selenium CI integration path against transient
Chrome startup crashes by retrying the ChromeDriver initialization.
```
System.InvalidOperationException: session not created: Chrome failed to start: crashed.
(session not created: DevToolsActivePort file doesn't exist)
(The process started from chrome location C:\Users\AzDevOps\.cache\selenium\chrome\win64\146.0.7680.66\chrome.exe is no longer running, so ChromeDriver is assuming that Chrome has crashed.)
```

## Reason for change
We observed failures that appear to occur when tests start close
together in CI and contend for shared resources. A longer pause between
ChromeDriver startup retries reduces pressure on those shared CI
resources and gives Chrome more time to recover between attempts.

## Implementation details
- Existing behavior retained:
  - ChromeDriver creation is retried up to 3 total attempts.
  - Startup exceptions (`InvalidOperationException` and
`WebDriverException`) are retried.
  - Partially initialized driver instances are disposed before retry.
- Scope remains minimal and targeted: only ChromeDriver initialization
is retried.
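
The retry loop described above can be sketched like this (Python stand-in; the real code catches `InvalidOperationException`/`WebDriverException` and disposes any partially initialized driver before retrying):

```python
import time

def start_driver_with_retry(factory, attempts=3, pause_seconds=5.0):
    """Retry driver startup on transient crashes (illustrative sketch)."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return factory()                 # e.g. lambda: ChromeDriver(options)
        except RuntimeError as e:            # stand-in for the startup exceptions
            last_error = e
            if attempt < attempts:
                time.sleep(pause_seconds)    # give Chrome time to recover
    raise last_error                         # all attempts exhausted
```

Only the specific startup exceptions are retried; anything else propagates immediately so real test failures are not masked.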

## Test coverage

## Other details



---

PR by Bits
[View session in
Datadog](https://app.datadoghq.com/code/edb8418c-6513-4707-a649-1a74f8d8cc5a)


Co-authored-by: datadog-datadog-prod-us1-2[bot] <261164178+datadog-datadog-prod-us1-2[bot]@users.noreply.github.com>

* [Test Package Versions Bump] Updating package versions (#8278)

Updates the package versions for integration tests.

Co-authored-by: andrewlock <18755388+andrewlock@users.noreply.github.com>

* More remote config performance improvements (#8241)

## Summary of changes

A variety of minor performance improvements to remote config

## Reason for change

I did some initial benchmarking of remote config, as well as running a
test app with ASM/Debugger enabled (which use RCM), and the results
weren't great. Given we make RCM requests every 5s, smallish changes
here should add up, though the real-world effect will be tricky to
gauge.

## Implementation details

Most of the individual changes are small. In summary:
- Making work lazy where possible (delaying creating collections)
  - Don't re-create dictionaries inside loops if not required
- Avoid creating empty dictionaries and collections for no-op RCM
responses
- Avoid creating empty collections in JSON objects when they're not in
the JSON
- Cache things that change rarely
  - The `ExtraServicesProvider` will rarely see new service names, so
cache the array (as the collection is append-only)
- Don't create a new request each time, just update any values that may
have changed
  - Cache the capabilities array which will rarely change
- There's actually a potential threading bug around the use of
`BigInteger`. We probably could/should use `ulong` instead, but I'll
look at that in a separate PR
- Use abstractions introduced earlier in the stack (e.g. #8226, #8228)
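
The append-only caching trick can be sketched like this (Python stand-in; `ExtraServicesProvider` is the real C# type, the method names here are hypothetical):

```python
class ExtraServicesProvider:
    """Append-only set of extra service names with a cached snapshot,
    rebuilt only when a new name actually arrives (sketch)."""

    def __init__(self):
        self._names = set()
        self._snapshot = ()   # cached array handed out on every RCM request

    def add(self, name):
        if name not in self._names:          # new names are rare
            self._names.add(name)
            self._snapshot = tuple(sorted(self._names))  # rebuild on change only

    def get_services(self):
        return self._snapshot                # no per-call allocation
```

Because the collection only grows, a stale snapshot is never wrong, just one update behind, which is fine for a request sent every 5s.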

Additionally, I did a little bit of cleanup:
- Remove unused members from the `IRcmSubscriptionManager` interface
- Add `#nullable enable` to the RCM types (and fix nullability where
required)

## Test coverage

Mostly covered by existing unit tests, also did some manual testing.
Finally, ran a few benchmarks, but it's a bit tricky to check reliably.
This benchmark is benchmarking `_manager.SendRequest(_rcmTracer, _ =>
_steadyStateResponseTask)` and passing the same response every time. In
practice, the response changes every time, so this isn't strictly
representative, but with the changes to request caching, this should
actually mean our improvements are _better_ than the original would be.
The regression in the .NET duration is curious, but I'm not massively
concerned, and the allocations are obviously down a lot at least


| Method | Runtime | Mean | Error | Allocated |
| ------------------------ | ------------------ | ----------: | -----------: | --------: |
| PollSteadyState_Original | .NET 10.0 | 14,235.8 ns | 275.09 ns | 12.04 KB |
| PollSteadyState_Updated | .NET 10.0 | 20,472.0 ns | 1,352.567 ns | 3.90 KB |
| | | | | |
| PollSteadyState_Original | .NET 6.0 | 33,915.7 ns | 3,339.42 ns | 12.27 KB |
| PollSteadyState_Updated | .NET 6.0 | 14,979.7 ns | 447.784 ns | 3.99 KB |
| | | | | |
| PollSteadyState_Original | .NET Core 2.1 | 21,488.2 ns | 365.39 ns | 12.85 KB |
| PollSteadyState_Updated | .NET Core 2.1 | 17,847.2 ns | 693.510 ns | 4.04 KB |
| | | | | |
| PollSteadyState_Original | .NET Core 3.1 | 18,925.2 ns | 304.16 ns | 12.23 KB |
| PollSteadyState_Updated | .NET Core 3.1 | 15,407.7 ns | 313.26 ns | 4.01 KB |
| | | | | |
| PollSteadyState_Original | .NET Framework 4.8 | 20,946.8 ns | 320.06 ns | 14.21 KB |
| PollSteadyState_Updated | .NET Framework 4.8 | 16,635.2 ns | 266.526 ns | 4.40 KB |


## Other details
https://datadoghq.atlassian.net/browse/LANGPLAT-940

All part of the Remote Config perf stack. I think this is probably about
the end of it for now

* Rename telemetry `GetData` methods to `GetIncrementalData` (#8269)

## Summary of changes

Renames `GetData` to `GetIncrementalData` to differentiate from
`GetFullData()`

## Reason for change

In #8227 it was flagged that `GetData()` and `GetFullData()` are easy to
confuse. Renaming `GetData` to `GetIncrementalData` should solve it.

## Implementation details

Deterministic rename (thank you IDEs, take that AI)

## Test coverage

All covered by existing

* WCF cleanup and trace context extraction (#8263)

## Summary of changes

Follow on fixes from WCF improvements made in #7842

## Reason for change

@zacharycmontoya and I identified some things to fix while working on
the above PR, but deferred fixing them till later. And now is that time!
We flagged two main things:

- We're using a weak table currently for storing the scopes. That's
fine, but we always dispose the scope, we should just remove the object
from the table at the same time to avoid accessing a closed span (and
actively reducing the size of the table)
- We should skip extracting WCF message headers if there's an active
scope. Previously we were only doing that if we were working with http
headers

While trying to create a repro for the second point (by using [the Otel
WCF
instrumentation](https://github.com/open-telemetry/opentelemetry-dotnet-contrib/blob/a1c22d0dac923b564ae7d2cfa0d27c479e65455c/src/OpenTelemetry.Instrumentation.Wcf/README.md))
🤖 discovered a different issue, whereby the WCF server `wcf.request`
spans were incorrectly parented under the manual span instead of the
OTel WCF client span `dotnet_wcf.client.request` for `NetTcpBinding`
(when `DD_TRACE_OTEL_ENABLED=true` and OpenTelemetry WCF client
instrumentation was active).

The latter issue was initially related to how we interact with the Otel
instrumentation in the sample, but switching to "pure" OTel for
propagation still revealed a parenting problem, because we were never
extracting the OTel headers.

> Note that this latter point _only_ impacts the `NetTcpBinding` because
with HTTP we handle all the propagation as expected.

## Implementation details

There are 3 fixes:
- (minor) Remove scope from weak table when we dispose it.
- (minor) Don't try to extract propagation context in server-side WCF if
there's already an active scope - couldn't repro, but it makes sense
- (fix) Also check the W3C trace context namespace for headers injected
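
Put together, the server-side extraction decision is roughly the following (Python sketch; the extractor callbacks are illustrative, not the real propagator API):

```python
def extract_parent_context(headers, active_scope, extractors):
    """Decide the parent for a server-side wcf.request span (sketch).

    - If a scope is already active, skip extraction and parent under it.
    - Otherwise try each header namespace in turn, e.g. the Datadog
      headers and then the W3C trace context ('traceparent') namespace.
    """
    if active_scope is not None:
        return active_scope          # fix 2: don't re-extract over an active scope
    for extract in extractors:       # fix 3: W3C namespace is now in this list
        ctx = extract(headers)
        if ctx is not None:
            return ctx
    return None                      # no propagated context: start a new trace
```

Before the fix, the W3C namespace was never consulted, which is why `NetTcpBinding` traffic propagated by "pure" OTel clients lost its parent.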

## Test coverage

To increase coverage, reworked the `Samples.Wcf` to allow optionally
using the Otel instrumentation:

- Switch to combinatorial data (makes it easier to add another boolean)
- Optionally allow setting up wcf client-side Otel instrumentation
- When enabled, we configure a `TraceProvider`, add the
`TelemetryEndpointBehavior`, and _stop_ manually injecting our headers
into the message
  - When disabled, the sample is identical to before
- Update the snapshots
- Rename the existing snapshots to include the additional
`useOtelClientInstrumentation=False` suffix
  - Generate new snapshots for `useOtelClientInstrumentation=True`
- Make the fix, and then update the snapshots in an additional commit,
to demonstrate the change

## Other details

Heavily used Claude Code on the testing and implementation. We knew
about the first two issues, and it discovered the third when trying to
follow our suggestion for a repro

* Detect whether an app has been trimmed (#8281)

## Summary of changes

Detects whether an app has been published with trimming and whether our
Trimming.xml file has been used.

## Reason for change

Using trimming and _not_ using our trimming file can cause all sorts of
strange errors. It would be useful for support (and for informing
customers) if an app _is_ using trimming and is _not_ using our trimming
file

## Implementation details

Uses three `Type`s in the BCL as "probes" for whether an app is trimmed:
- `System.Resources.ResourceWriter, System.Resources.Writer`
- `System.IO.IsolatedStorage.IsolatedStorageScope,
System.IO.IsolatedStorage`
- `System.Net.NetworkInformation.PingCompletedEventArgs,
System.Net.Ping`

These are intentionally somewhat obtuse types that we would _expect_ to
be trimmed. Settled on them by iterating with 🤖 but we could certainly
change these.

The overall approach is
- Try to load the first two types. These are manually specified in our
trimming file, so if we _fail_ to load _either_ one, then we know we
_are_ in a trimmed app, _and_ they didn't add our file
- Try to load the third type, which _isn't_ in our trimming file. If it
loads, we're _not_ trimmed at all. If it doesn't load, we're in a
trimmed app, but they _did_ add our file.

If we detect the bad situation, we add a warning to the logs. Either
way, we tag the telemetry error logs with `trim:err/yes/no` where:
- `err`: trimmed and didn't add our file
- `yes`: trimmed but they added our file
- `no`: not trimmed

## Test coverage

- Updated unit tests
- Added integration test about telemetry error logs in app trimming
scenario


[Pushed an initial
test](https://dev.azure.com/datadoghq/dd-trace-dotnet/_build/results?buildId=197054&view=logs&j=038b8080-1d19-502b-3685-9d5eff966aef&t=7b04c0a4-d19c-5f7e-a67e-3b6a219d2507&l=40)
without adding anything to the trimming file, and confirmed that we
tagged as `err` and wrote the error log (which correctly caused the
integration tests and trimming smoke tests to fail).

## Other details

The main consideration is the performance impact of loading these extra
types, from obtuse assemblies, on the hot path of app load. Each of the
assemblies is ~80kb (I swapped from System.Net.Mail because it's ~5x as
big), but then there's the dependency tree too... I considered using an
ALC (AssemblyLoadContext) and unloading, but as I understand it, that wouldn't necessarily
_actually_ unload them, seeing as they're part of the shared framework,
but I confess I'm trusting the AI on that one 😅

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Allow generating a snapshot of the MSI contents (#8270)

## Summary of changes

Adds a "snapshot generator" for the MSI contents

## Reason for change

We've discussed updating/switching to newer versions of Wix, and we want
to make sure we don't regress anything

## Implementation details

Uses the _WixToolset.Dtf.WindowsInstaller_ nuget package to read the
contents of the MSI. We then scrub values which we expect to change
(version numbers, filesizes etc) and dump the values out as a yaml file
(could have done any format, we already had a transitive reference to
yamldotnet, I just made it explicit)
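The scrubbing step can be sketched like this (illustrative Python; the actual patterns and placeholder names in the snapshot generator may differ):

```python
import re


def scrub(value):
    """Sketch of snapshot scrubbing: replace values expected to change
    between builds (version numbers, file sizes) with stable placeholders,
    so snapshots only diff on meaningful changes."""
    # Collapse 3- or 4-part version numbers to a placeholder.
    value = re.sub(r"\b\d+\.\d+\.\d+(?:\.\d+)?\b", "{version}", value)
    # Collapse file sizes to a placeholder.
    value = re.sub(r"(?i)\b(FileSize:\s*)\d+", r"\1{size}", value)
    return value
```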

## Test coverage

Tested with a couple of MSIs from master, and they pass.

## Other details

I considered an alternative, where we try to understand the _impact_ of
installing the MSI, which on the surface is what we _really_ care about,
but seemed like a much harder prospect 😅

* Remove unnecessary use of duck typing in HTTP integrations (#8275)

## Summary of changes

Removes unnecessary use of duck typing in HTTP integrations

## Reason for change

We're currently always using duck typing in the various HTTP
integrations, but these types are always unconditionally available in
.NET Core (and we already reference those assemblies), so the duck
typing is an unnecessary layer that will reduce performance for no
benefit.

## Implementation details

`#if` our way to glory on the HTTP (and one of the gRPC) integrations.
If we're on .NET Core, these types are available, so we just use them

## Test coverage

Just a refactoring, so all the existing tests should cover this change.

_Hopefully_ we'll see some tiny movement on the benchmarks, but I don't
hold out a huge amount of hope for that. Either way, I think this change
can only be an improvement, so is worth it.

## Other details

Deleted the excessive comments on our integrations, seeing as they're
just noise and don't tell us much (we exclude them by default on new
integrations now)

One very interesting point - we _can't_ reference the "well known types"
_directly_ in the integrations, because this causes failures when there
are multiple Assembly Load Contexts. This is kinda surprising, but
something to bear in mind, and something we potentially need to look
into elsewhere too...

* Update Wix to 5.x.x (#8268)

## Summary of changes

Updates our MSI project to use [Wix
5.x.x](https://docs.firegiant.com/wix/whatsnew/#whats-new-in-wix-v5)
instead of Wix 3

## Reason for change

[Wix 3 was deprecated a year ago](https://docs.firegiant.com/wix/wix3/),
and is generally clunky and hard to use, as it relies on a global
install + .NET Framework 3.5. The newer versions of Wix use newer
SDK-style projects, are deployed as nuget packages, and can just be
built with a normal `dotnet build`

- Wix 4: Quite a big change
- Wix 5: Pretty much back-compatible with 4
- Wix 6: [Shifted licensing
model](https://docs.firegiant.com/wix/whatsnew/#open-source-maintenance-fee)
- we need to look into this if we want to upgrade further.
- Wix 7: As above 

## Implementation details

This was entirely 🤖 driven, but [there's also a .NET
tool](https://docs.firegiant.com/wix/whatsnew/#convert-wix-authoring-from-the-command-line)
to help with the conversion. _Mostly_ the changes are just "annoying",
e.g. moving values from being element text to a `Value` property, etc.

## Test coverage

At the end of the day, the generated MSI is _essentially_ the same as
confirmed by the snapshots created in #8270. The changes all appear to
be benign changes in hashing algorithms, or renaming of wix properties.

What's more, I tested the install, and it _looks_ the same (and works),
and the MSI tests all pass, which is obviously the important thing! 😄

<img width="495" height="387" alt="image"
src="https://github.com/user-attachments/assets/54995470-b846-419c-9f18-e07c1daae127"
/>
<img width="495" height="387" alt="image"
src="https://github.com/user-attachments/assets/0997801f-7143-4c25-88d4-d66ad2404968"
/>
<img width="495" height="387" alt="image"
src="https://github.com/user-attachments/assets/63381b12-f312-4b45-9ae3-d2c77b23d377"
/>
<img width="495" height="387" alt="image"
src="https://github.com/user-attachments/assets/320758a9-f0b7-42e8-a1db-4c8b4b8514ad"
/>
<img width="495" height="387" alt="image"
src="https://github.com/user-attachments/assets/65e1fa04-f02e-42bf-8fcd-efe704000901"
/>


## Other details

The removal of the `Win64="yes"` and `Win64="$(var.Win64)"` attributes
were the main thing I was unsure about. There _is_ [a `Bitness`
attribute now](https://docs.firegiant.com/wix/schema/wxs/component/),
with values `default`, `always32`, or `always64`, which is pretty much
equivalent. However, seeing as we _only_ produce an x64 installer, and
not an x86 installer, I think this is essentially just legacy cruft
which is ok to remove. We _might_ regret that choice if/when we need an
arm64 installer, but I think we'll need to look at everything again at
that point anyway, so I don't think it's worth worrying about 😄

---------

Co-authored-by: Claude <noreply@anthropic.com>

* (CI) Migrating benchmark-serverless CI job to short-lived tokens (#8219)

## Summary of changes
Migrated from use of Gitlab personal access token to short-lived token
(using authanywhere) when cloning the serverless-tools repo for the
benchmark-serverless CI job.

## Reason for change
Keeping up-to-date with current policies about tokens, avoiding token
expiry issues.

## Implementation details

## Test coverage

## Other details
<!-- Fixes #{issue} -->


<!--  ⚠️ Note:

Where possible, please obtain 2 approvals prior to merging. Unless
CODEOWNERS specifies otherwise, for external teams it is typically best
to have one review from a team member, and one review from apm-dotnet.
Trivial changes do not require 2 reviews.

MergeQueue is NOT enabled in this repository. If you have write access
to the repo, the PR has 1-2 approvals (see above), and all of the
required checks have passed, you can use the Squash and Merge button to
merge the PR. If you don't have write access, or you need help, reach
out in the #apm-dotnet channel in Slack.
-->

* Add _dd.svc_src meta tag to track service name source (#8274)

## Summary of changes

Add `_dd.svc_src` meta tag to spans to track the source of the service
name. When an integration overrides the default service name (e.g.,
schema V0 without `removeClientServiceNames`), the tag is set to the
integration name (e.g., `"redis"`, `"kafka"`, `"http-client"`). When a
service name mapping is configured via `DD_TRACE_SERVICE_MAPPING`, the
tag is set to `"opt.service_mapping"`. When the default service name is
used, no tag is emitted.

Jira: https://datadoghq.atlassian.net/browse/APMLP-1015
RFC:
https://docs.google.com/document/d/11OnbVYMDK-c5D-_V4QfOvL0Pc0z5oFQFGY3xSI-W7xk/edit?tab=t.0

## Reason for change

.NET equivalent of
[dd-trace-java#10607](https://github.com/DataDog/dd-trace-java/pull/10607).
Service name source attribution lets the backend know which component
set the service name on each span.

## Implementation details

- **Tag constant**: `Tags.ServiceNameSource = "_dd.svc_src"` and
`SpanContext.ServiceNameSource` property.
- **`ServiceNameMetadata`**: Encapsulates resolved service name and
source attribution. Returned by all schema `GetServiceNameMetadata()`
methods and `PerTraceSettings.GetServiceNameMetadata()`.
- **Schema-level source**: `DatabaseSchema`, `MessagingSchema`, and
`ClientSchema` each use a `Resolve` helper that determines the source:
`"opt.service_mapping"` when from `DD_TRACE_SERVICE_MAPPING`, the
integration key when service name ≠ default, or `null` otherwise.
- **`PerTraceSettings.GetServiceNameMetadata()`**: For AdoNet and AWS
integrations that resolve service names dynamically, returns
`"opt.service_mapping"` for mapped names, the integration key for
suffixed names, or `null` for default.
- **Integration callsites**: ~30 files updated to pass
`serviceNameSource`. Server-side integrations using the default service
name are unchanged.
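The resolution rules described in the bullets above can be condensed into one decision function (an illustrative Python sketch, not the actual C# `Resolve` helper):

```python
def resolve_service_name_source(service_name, default_service_name,
                                integration_key, from_service_mapping):
    """Decision rules for the `_dd.svc_src` value, per the rules above."""
    if from_service_mapping:
        # The name came from DD_TRACE_SERVICE_MAPPING.
        return "opt.service_mapping"
    if service_name != default_service_name:
        # The integration overrode the default, e.g. "redis", "kafka".
        return integration_key
    # Default service name in use: no tag is emitted.
    return None
```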

## Test coverage

- **`DatabaseSchemaTests`**, **`MessagingSchemaTests`**,
**`ClientSchemaTests`**: Tests for source attribution.
- **Snapshot files**: 130+ integration test snapshots and 3 smoke test
snapshots updated.

## Follow-up items

- V1 schema with `DD_TRACE_SERVICE_MAPPING` could emit
`opt.service_mapping` (mapped name differs from default). V1
SpanMetadata rules and snapshots would need updating. Deferred to a
follow-up PR.
- Manual set services (`"m"` source) and client stats payload — separate
PRs.

## Other details

* Move running of smoke tests to Nuke (#8271)

## Summary of changes

Instead of most of the logic for running smoke tests being embedded in
the yaml and bash inside yaml, move the building and running of smoke
tests into Nuke.

## Reason for change

The previous design had some downsides:
- Very tied to Azure Devops. If we want to migrate to gitlab at some
point, this _should_ make it easier, because devops is "doing less"
- Hard to run smoke tests locally. If you wanted to investigate a
scenario, you'd have to decode all the docker, docker compose, and bash
scripts that you needed to run to get something _resembling_ the test
setup.
- There was a lot of duplication, because it's hard to remove that in a
clean way from some of the yaml without creating loads of fine-grained
steps (which have their own difficulties). Moving to C# makes it easy to
(for example) have try-catch blocks, custom retries etc
- Bash in YAML is kind of ewww

## Implementation details

I initially tried to implement this over a year ago, using
TestContainers, but tl;dr; I ran into a bunch of limitations that I
couldn't get past (APIs that we needed, which just didn't exist,
differences between windows/linux etc), so I abandoned it. Until 🤖 made
exploring these things easier again!

The latest approach uses the https://github.com/dotnet/Docker.DotNet/
project, which provides a strongly-typed way to call the docker HTTP API
(which is what TestContainers actually uses under the hood - it even
uses this project). This made it _much_ easier to convert the explicit
steps that we are doing currently in bash/yaml/docker-compose to being
simple C# methods.

At a high level, the implementation roughly follows what we have today,
but it's tied much _less_ to the azure devops infrastructure, as we just
run our Nuke tasks in the same way we do today (i.e. directly on the box
for Windows, in a docker container for Linux).

A high level overview:
- The `GenerateVariables` stage still generates the matrix of variables,
but it only needs to generate a _category_ (e.g.
`LinuxX64Installer`/`WindowsFleetInstallerIis`), and an associated
_scenario_ (the specific test, e.g. `ubuntu, .NET 10, .deb`).
- Renamed the stages (and associated matrices) to make them more
consistent, e.g.
`smoke_<x64|arm64|win|macos>_<installer|nuget|fleet>_tests`. We can
easily tweak this if we prefer
- To run a test (e.g. locally) `build.ps1 RunArtifactSmokeTests
-SmokeTestCategory "LinuxX64Installer" -SmokeTestScenario
"someScenario"`
- All of the work for building the images, building/pulling the test
agent/running the smoke tests/running crash tests/Doing snapshot
verification is handled by Nuke. We have automatic retries around all
the parts that could fail (i.e. anything docker or HTTP related)

That also means we can delete various things
- All the old stages in the pipeline
- The old run-snapshot-test.yml
- The entries in the docker-compose (the test-agent is actually still
used in a few places, so those stay)

Also includes a few tiny tweaks and cleanup (commented in the files as
appropriate)

## Test coverage

The same hopefully!? I've run the full suite of tests several times, and
spot-checked various tests to make sure everything looks ok, and
as far as I can tell, it does! Also temporarily [modified the snapshots
](https://dev.azure.com/datadoghq/dd-trace-dotnet/_build/results?buildId=196876&view=results)
to confirm that causes everything to fail too

## Other details

The _big_ one which I didn't/couldn't easily convert is the macos smoke
tests. These are written _completely_ differently today, because they
don't run in containers (which means we have to handle a whole bunch of
different issues) and rather just duplicate a whole bunch of logic. It's
_probably_ not worth the effort to port them into Nuke at the moment,
but I'm open to doing it in a follow up if people feel one way or the
other.

The other thing is that I _didn't_ move the "downloading of artifacts"
into the nuke job, though technically we could, and it would make
running locally even easier. My reason for _not_ doing that was that it
ties the nuke side to the azure devops side completely then, and if we
rename an artifact in the yaml (for some reason) it's far more likely
we'll forget it on the c# side.

https://datadoghq.atlassian.net/browse/LANGPLAT-823

* [Metrics] Fixing sync-over-async patterns in DogStatsD Client Fork (#8265)

## Summary of changes

Fix sync-over-async patterns in the vendored DogStatsD client and add
async disposal for shutdown.

## Reason for change

`NamedPipeTransport.SendBuffer()` calls `WriteAsync().Wait()` on
`NamedPipeClientStream` which leads to sync-over-async that can deadlock
under thread pool starvation. `AsynchronousWorker.Dispose()` also blocks
with `worker.Wait()` on LongRunning tasks, making shutdown slow when the
agent is unreachable.

Part of [Enabling .NET Runtime Metrics by
Default](https://docs.google.com/document/d/1tekvCvlOkn12pU3jLK-ePZ0nrynSW7p8IWgG-Dh7zwQ/edit?pli=1&tab=t.0#heading=h.7u0xyjk7jxq5).
This should land before we enable Runtime Metrics by default since it
hardens the shared DogStatsD client that runtime metrics uses.

## Implementation details

- `NamedPipeTransport`: Replace `WriteAsync().Wait()` with synchronous
`Write()`. This is a worker thread sending metrics, so sync is the right
fit.
- Add `DisposeAsync()` through the disposal chain: `AsynchronousWorker`
-> `StatsBufferize` -> `StatsdData` -> `DogStatsdService` ->
`StatsdManager`.
- `StatsdClientHolder` uses a `TaskCompletionSource` to signal when
async disposal finishes, bridging the sync `Release()` path and the
async shutdown path.
- `TracerManager.RunShutdownTasksAsync` now awaits
`StatsdManager.DisposeAsync()` instead of synchronous `Dispose()`.
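The sync/async bridging in `StatsdClientHolder` can be sketched as follows (illustrative Python, with `threading.Event` standing in for `TaskCompletionSource`; method names are hypothetical):

```python
import threading


class StatsdClientHolder:
    """Sketch of bridging the sync Release() path with the async shutdown
    path: async disposal signals completion, and the sync path waits on
    that signal."""
    def __init__(self):
        self._disposed = threading.Event()

    def signal_disposed(self):
        # Called when the async disposal chain finishes.
        self._disposed.set()

    def release(self, timeout=None):
        # Sync path: wait (optionally bounded) until async disposal is done.
        return self._disposed.wait(timeout)
```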

## Test coverage

Existing tests still pass 🤞.

## Other details

* [Metrics] Send Diagnostics GC Pause Time as Counter instead of Timer (#8266)

## Summary of changes

Send Diagnostics GC pause time as a Counter instead of a Timer.

## Reason for change

The Diagnostics listener calls `GC.GetTotalPauseDuration()` which
returns cumulative pause for the whole 10s interval. Sending that as a
single `Timer()` inflates `.median`/`.max` compared to EventListener,
which sends one `Timer()` per individual GC event. There's no public
.NET API for per-GC pause duration without EventPipe.

After consulting with the Agent Metrics team (#agent-metrics), the
recommended approach is a separate Counter for total pause, with average
computed at query time. See [RFC: Enabling .NET Runtime Metrics by
Default — GC Pause
Time](https://docs.google.com/document/d/1tekvCvlOkn12pU3jLK-ePZ0nrynSW7p8IWgG-Dh7zwQ/edit?pli=1&tab=t.0#heading=h.hz8udk7cv9zf).

## Implementation details

- Add `GcPauseTimeTotal` constant (`runtime.dotnet.gc.pause_time.total`)
to `MetricsNames`.
- Replace `statsd.Timer(MetricsNames.GcPauseTime, ...)` with
`statsd.Counter(MetricsNames.GcPauseTimeTotal, (long)totalPauseDelta)`
in `DiagnosticsMetricsRuntimeMetricsListener`.
- Average per-GC pause can be computed at query time: `pause_time.total
/ (gc.count.gen0 + gen1 + gen2)` in OOTB dashboard updated in a separate
PR:
https://github.com/DataDog/integrations-internal-core/pull/2823
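The query-time formula quoted above amounts to (a trivial sketch, guarding against the no-collections case):

```python
def average_gc_pause_ms(pause_time_total_ms, gen0_count, gen1_count, gen2_count):
    """Average per-GC pause derived at query time from the new counter:
    pause_time.total / (gc.count.gen0 + gen1 + gen2)."""
    total_collections = gen0_count + gen1_count + gen2_count
    if total_collections == 0:
        return 0.0  # no collections in the interval: avoid division by zero
    return pause_time_total_ms / total_collections
```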

## Test coverage

Updated
`DiagnosticMetricsRuntimeMetricsListenerTests.MonitorGarbageCollections`
to assert `Counter(GcPauseTimeTotal)` instead of `Timer(GcPauseTime)`.

## Other details

Part of [Enabling .NET Runtime Metrics by
Default](https://docs.google.com/document/d/1tekvCvlOkn12pU3jLK-ePZ0nrynSW7p8IWgG-Dh7zwQ/edit?pli=1&tab=t.0#heading=h.7u0xyjk7jxq5).

* [Agent Skill] Rename to `analyze-azdo-build`, add retry support, and more (#8247)

## Summary of changes

- ⭐ Rename skill from `troubleshoot-ci-build` to `analyze-azdo-build`
- ⭐ Add support for retrying failed/canceled stages
- Show Stage > Job > Task failures as a hierarchy (visual change only)
- Add prerequisite install docs for `gh`, `az` CLI, and `azure-devops`
extension

## Reason for change

The CI build analysis skill needed several improvements:
- The name `troubleshoot-ci-build` was generic; `analyze-azdo-build` is
more specific
- Failed stages/jobs/tasks were shown as flat lists, making it hard to
trace failures through the pipeline hierarchy
- API helper functions were duplicated and not reusable by other scripts
- There was no way to retry failed stages without navigating the Azure
DevOps UI
- Prerequisites lacked install instructions, leaving users to figure it
out themselves

## Implementation details

- **Skill rename**: Moved `.claude/skills/troubleshoot-ci-build/` to
`.claude/skills/analyze-azdo-build/` and updated the skill
name/description
- **Failure hierarchy**: Added `Get-FailureHierarchy` function that
walks `parentId` chains in timeline records to build a Stage > Job >
Task tree view
- **Shared module** (`AzureDevOpsHelpers.psm1`): Extracted
`Invoke-AzDevOpsApi`, `Get-BuildIdFromPR`, and new `Resolve-BuildId`
into a reusable PowerShell module
- **Retry script** (`Retry-AzureDevOpsFailedStages.ps1`): Retries
failed/canceled stages via the Azure DevOps stages API; supports `-All`,
`-Stage`, `-ForceRetryAllJobs`, `-WhatIf`, and interactive selection
- **Prerequisite docs**: Added verify commands, one-liner installs
(`winget`/`brew`), and official docs links for all CLI tools
- **Tighten `allowed-tools`**: Replaced broad `Bash(pwsh:*)` permission
with specific patterns for the three pwsh commands the skill uses (`pwsh
-Version`, `Get-AzureDevOpsBuildAnalysis.ps1`,
`Retry-AzureDevOpsFailedStages.ps1`)
- Updated `SKILL.md` and `scripts-reference.md` documentation throughout
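The parent-chain walk behind `Get-FailureHierarchy` (implemented in PowerShell) can be sketched like this; the record fields only loosely mirror the Azure DevOps timeline API, and the function name is illustrative:

```python
def build_failure_hierarchy(records):
    """Walk parentId chains in timeline records to render each failure as
    a Stage > Job > Task path."""
    by_id = {r["id"]: r for r in records}
    paths = []
    for record in records:
        if record.get("result") != "failed":
            continue
        # Walk from the failed record up to the root, collecting names.
        chain, current = [], record
        while current is not None:
            chain.append(current["name"])
            current = by_id.get(current.get("parentId"))
        paths.append(" > ".join(reversed(chain)))
    return paths
```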

## Test coverage

Tested against real Azure DevOps builds (`196471`, `196475`) with
different failure patterns.

## Other details

> *"I tried to retry my own failed stages once, but my therapist said
that's called 'rumination'."* — Claude 🤖

* [Runtime Metrics] Enables Runtime Metrics by Default for .NET 6+ (#8267)

## Summary of changes

Enable runtime metrics by default on .NET 6+. Services where the config
is unset use the Diagnostics listener; services that explicitly opted in
continue to use EventListener on .NET 6/7, but default to Diagnostics on
.NET 8+, since there is no loss of ASP.NET Core metrics there.

## Reason for change

Runtime metrics are currently opt-in. The EventListener/EventPipe
implementation has known runtime bugs: shutdown crashes
(dotnet/runtime#103480) and CPU/memory leaks (dotnet/runtime#111368).
The Diagnostics listener avoids these and has comparable or better
performance in our tests.

See [RFC: Enabling .NET Runtime Metrics by Default — Option
A](https://docs.google.com/document/d/1tekvCvlOkn12pU3jLK-ePZ0nrynSW7p8IWgG-Dh7zwQ/edit?pli=1&tab=t.0#heading=h.bs3q57sbzfh3).

## Implementation details

Configuration logic for `DD_RUNTIME_METRICS_ENABLED`:
- If explicitly set, use that value
- If not set, default to `true` on .NET 6+

Configuration logic for
`DD_RUNTIME_METRICS_DIAGNOSTICS_METRICS_API_ENABLED`:
- If explicitly set, use that value
- If not set:
- .NET 8+: defaults to `true` (full metric coverage including ASP.NET
Core meters)
- .NET 6/7 with `DD_RUNTIME_METRICS_ENABLED` not set: defaults to `true`
- .NET 6/7 with `DD_RUNTIME_METRICS_ENABLED=true`: defaults to `false`
(keeps EventListener to preserve ASP.NET Core EventCounter metrics)

The .NET 8 check uses `Environment.Version.Major >= 8`.

Also fixes `RuntimeMetricsWriter` to dispose
`DiagnosticsMetricsRuntimeMetricsListener` on .NET 6+ (`MeterListener`
is safe to dispose, unlike `EventListener`).

The runtime metrics collector selected depends on the runtime version
and whether `DD_RUNTIME_METRICS_ENABLED` was explicitly configured:

| Runtime | `DD_RUNTIME_METRICS_ENABLED` | Collector used |
|---|---|---|
| .NET 8+ | not set / invalid | Diagnostics (default) |
| .NET 8+ | `true` | Diagnostics (default) |
| .NET 6/7 | not set / invalid | Diagnostics (default) |
| .NET 6/7 | `true` (explicit) | EventListener (preserves ASP.NET Core
EventCounter metrics not yet available via Diagnostics on < .NET 8) |
| Any | `false` | Disabled |
| Any | `OTEL_METRICS_EXPORTER=none` | Disabled (takes precedence over
the default) |
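The table above can be expressed as a selection function (illustrative Python sketch; `explicit_enabled` is `None` when `DD_RUNTIME_METRICS_ENABLED` is unset or invalid, and the OTel exporter row is modelled as overriding only the default):

```python
def select_collector(runtime_major, explicit_enabled=None,
                     otel_exporter_none=False):
    """Pick the runtime metrics collector per the table above."""
    if explicit_enabled is False:
        return "Disabled"
    if explicit_enabled is None:
        if otel_exporter_none:
            return "Disabled"  # OTEL_METRICS_EXPORTER=none beats the default
        if runtime_major < 6:
            return "Disabled"  # the new default only applies to .NET 6+
        return "Diagnostics"
    # Explicitly enabled:
    if runtime_major >= 8:
        return "Diagnostics"
    # .NET 6/7 explicit opt-in keeps EventListener to preserve the
    # ASP.NET Core EventCounter metrics.
    return "EventListener"
```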

**Note for .NET 6/7 users** who explicitly set
`DD_RUNTIME_METRICS_ENABLED=true` and want to opt into the new
Diagnostics-based collector (at the cost of losing ASP.NET Core
EventCounter metrics): also set
`DD_RUNTIME_METRICS_DIAGNOSTICS_METRICS_API_ENABLED=true`.

## Test coverage

- Updated conditional `[InlineData]` in `RuntimeMetricsEnabled` test to
expect `true` when unset on .NET 6+.
- Added
`RuntimeMetrics_DefaultsToDignosticsOnNet6Plus_WhenNotExplicitlySet`.
- Added `RuntimeMetrics_ExplicitEnable_RespectsExplicitDiagnosticsFlag`
(explicit `true`/`false` cases).
- Added `RuntimeMetrics_ExplicitEnable_DefaultsToDiagnosticsOnNet8Plus`
(net8.0 TFM).
- Added
`RuntimeMetrics_ExplicitEnable_DefaultsToEventListenerOnNet6And7`
(net6.0/net7.0 TFMs).
- Added `RuntimeMetrics_ExplicitDisable_OverridesDefault`.
- Added `RuntimeMetrics_InvalidValue_TreatedAsUnset_DefaultsToDiagnostics`
(codifies that invalid values like `DD_RUNTIME_METRICS_ENABLED=blah` are
treated as unset, not as explicitly configured).

## Other details

Part of [Enabling .NET Runtime Metrics by
Default](https://docs.google.com/document/d/1tekvCvlOkn12pU3jLK-ePZ0nrynSW7p8IWgG-Dh7zwQ/edit?pli=1&tab=t.0#heading=h.hz8udk7cv9zf).
Should land after the PRs below are merged which include necessary
fixes:

- https://github.com/DataDog/dd-trace-dotnet/pull/8265
- https://github.com/DataDog/dd-trace-dotnet/pull/8266

### Moving forward

After this lands, monitor:
1. **Telemetry**: `DD_RUNTIME_METRICS_ENABLED` and
`DD_RUNTIME_METRICS_DIAGNOSTICS_METRICS_API_ENABLED` usage to track
adoption of the new defaults.
2. **Support tickets**: Customers reporting missing metrics on .NET 6/7
(`contention_time`, ASP.NET Core counters, `compacting_gc` tag) after
upgrading.
3. **Future**: Consider defaulting
`DD_RUNTIME_METRICS_DIAGNOSTICS_METRICS_API_ENABLED=true` for all .NET
6+ in a later release once the .NET 6/7 EventCounter gap is documented
and communicated.

* Bump test dependencies and add test.final_status to benchmark exporter (#8301)

## Summary
- Bump `DatadogTestCollector` from 0.0.52 to 0.0.53
- Bump `DatadogTestLogger` from 0.0.52 to 0.0.53
- Add `test.final_status` tag to the benchmark exporter
(`DatadogExporter`)
- Bump `timeitsharp` from 0.4.7 to 0.4.8

## Test plan
- [ ] CI passes with the updated packages
- [ ] Benchmark tests emit `test.final_status` tag correctly

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* [Test Package Versions Bump] Updating package versions (#8306)

Updates the package versions for integration tests.

Co-authored-by: andrewlock <18755388+andrewlock@users.noreply.github.com>

* Only allow creating hotfix branches from a tag (#8297)

## Summary of changes

Ensures you're creating a hotfix branch from a tag

## Reason for change

The "correct" usage is to create a hotfix from a previous normal
release, or from a previous hotfix. The easy way to enforce that is to
ensure you create from a tag

## Implementation details

Check the reference passed in, like we do for other workflows

## Test coverage

Tested it
[here](https://github.com/DataDog/dd-trace-dotnet/actions/runs/23005198182)
and it failed (as expected).

* ServiceNameSource for inferred proxy (#8309)

## Summary of changes

Pass explicit `serviceNameSource` for inferred proxy span factories so
`_dd.svc_src` correctly identifies the integration that set the service
name.

## Reason for change

Per the service name source
[RFC](https://docs.google.com/document/d/1c47iSTWxIOHMHfZTF2nT9xfyQaIBP9KJvI9sRn5SvpM/edit?tab=t.0),
integration-driven service overrides must set `_dd.svc_src` to the
integration name. The proxy factories were passing `serviceName` without
`serviceNameSource`, leaving the tag unset.

## Implementation details

- `AwsApiGatewaySpanFactory`: passes `serviceNameSource:
"aws-apigateway"`
- `AzureApiManagementSpanFactory`: passes `serviceNameSource:
"azure-apim"`

## Test coverage

Updated 30 snapshot files to include the new `_dd.svc_src` tag on
inferred proxy spans.

## Other details

Part of the `_dd.svc_src` RFC implementation. Manual instrumentation
source (`"m"`) is handled in a separate PR.

* [Profiler] Add unit tests for ManagedCodeCache (#8307)

* Add some documentation on running system-tests (#8298)

## Summary of changes

This adds a brief document on how to test PR changes in/against
system-tests

## Reason for change

Feedback on onboarding into the repository where it wasn't obvious how
to run system-tests (it isn't great)

## Implementation details

Searched for `system-tests.git` in `ultimate-pipeline.yaml` and outlined
changing the branch
Looked at the `docker_image_artifacts` label (introduced in
https://github.com/DataDog/dd-trace-dotnet/pull/7337) and documented
that

## Test coverage

None!

## Other details

I haven't _really_ done this much so it may be wrong.
We have some documentation on running locally but last time I did that
(~6 mos ago) it didn't work 😭


* Adjust Log levels in OtlpExporter (#8133)

## Summary of changes

Changes the log levels in `OtlpExporter` to skip telemetry for some
errors (while promoting them from Debug to Error), and swaps some
per-attempt Error logs down to Debug.

## Reason for change

Noticed a lot of new errors but they just seem to be transient
networking errors.

https://app.datadoghq.com/error-tracking/issue/443da854-fe18-11f0-ae87-da7ad0900002

Some of the current logs were either the wrong level or absent.

## Implementation details

- Bad Request (400) - not retrying - swapped to Error from Warning
- Add two `Log.ErrorSkipTelemetry("An error occurred while sending OTLP
request to {AgentEndpoint}.` for when we get transient network issues
- `Log.Error(ex, "Error sending OTLP request (attempt {Attempt})",
(attempt + 1).ToString());` -> `Log.Debug<int>(ex, "Error sending OTLP
request (attempt {Attempt})", attempt + 1);` along with removing the
string allocation

## Test coverage

## Other details

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Refactor log directory resolution into smaller methods, add tests (#7255)

## Summary of changes

Refactor `DatadogLoggingFactory` log directory resolution logic into
smaller, testable methods and add unit tests.

## Reason for change

The log directory resolution logic was a single large method mixing
concerns:
- config reading
- default directory lookup
- directory creation 😦
- fallback logic

Breaking it apart makes each piece independently testable and easier to
understand.

## Implementation details

- Extract `GetDefaultLogDirectory()` — determines the _default_ log
directory based on platform and Azure environment
- Extract `GetProgramDataDirectory()` — resolves the `ProgramData` path
on Windows (with Nano Server fallbacks)
- Extract `TryCreateLogDirectory()` — attempts to _create_ a log
directory, returning success/failure
- Make extracted methods `internal` with `[TestingAndPrivateOnly]` for
testability

Simplify `GetLogDirectory()` to orchestrate the above methods with a
clear fallback chain:
1. `DD_TRACE_LOG_DIRECTORY` env var
2. `DD_TRACE_LOG_PATH` (deprecated; directory extracted from file path)
3. `GetDefaultLogDirectory()` (platform/Azure-aware default)
4. `Path.GetTempPath()` (last resort)
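The fallback chain above can be sketched in a few lines. This is a hypothetical Python illustration of the resolution order, not the real `DatadogLoggingFactory` code — the function names mirror the PR description, and `get_default_log_directory` is stubbed out since the platform/Azure lookup is elided here:

```python
import os
import tempfile

def get_default_log_directory():
    # Placeholder for the platform/Azure-aware default lookup.
    return None

def get_log_directory(env):
    """Resolve the log directory using the fallback chain (sketch)."""
    # 1. Explicit directory setting wins.
    explicit = env.get("DD_TRACE_LOG_DIRECTORY")
    if explicit:
        return explicit
    # 2. Deprecated file-path setting: use its containing directory.
    legacy_path = env.get("DD_TRACE_LOG_PATH")
    if legacy_path:
        return os.path.dirname(legacy_path)
    # 3. Platform/Azure-aware default, if one is available.
    default = get_default_log_directory()
    if default:
        return default
    # 4. Last resort: the system temp directory.
    return tempfile.gettempdir()
```

Each step only runs if every earlier step produced nothing, so an explicitly configured directory always takes precedence over the deprecated setting and the defaults.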

## Test coverage

Flattened nested test classes (`FileLoggingConfiguration`,
`RedactedLogConfiguration`, `SinkConfiguration`) into top-level methods
with consistent `MethodUnderTest_Scenario_ExpectedBehavior` naming:

| Old (nested class.method) | New (flat method) |
|---|---|
| `FileLoggingConfiguration.UsesLogDirectoryWhenItExists` | `GetConfiguration_WithLogDirectory_UsesLogDirectory` |
| `FileLoggingConfiguration.UsesObsoleteLogDirectoryWhenAvailable` | `GetConfiguration_WithObsoleteLogPath_UsesObsoleteLogDirectory` |
| `FileLoggingConfiguration.UsesEnvironmentFallBackWhenBothNull` | `GetConfiguration_WithNoLogDirectorySettings_UsesEnvironmentFallback` |
| `FileLoggingConfiguration.CreatesLogDirectoryWhenItDoesntExist` | `GetConfiguration_WithNonExistentLogDirectory_CreatesDirectory` |
| `RedactedLogConfiguration.WhenNoOrInvalidConfiguration_TelemetryLogsEnabled` | `GetConfiguration_WithNoOrInvalidTelemetryLogsSetting_EnablesErrorLogging` |
| `RedactedLogConfiguration.WhenDisabled_TelemetryLogsDisabled` | `GetConfiguration_WithTelemetryLogsDisabled_DisablesErrorLogging` |
| `SinkConfiguration.WhenNoSinksProvided_UsesFileSink` | `GetConfiguration_WithNoSinksSetting_UsesFileSink` |
| `SinkConfiguration.WhenFileSinkIsIncluded_UsesFileSink` | `GetConfiguration_WithFileSinkIncluded_UsesFileSink` |
| `SinkConfiguration.WhenFileSinkIsNotIncluded_DoesNotUseFileSink` | `GetConfiguration_WithFileSinkNotIncluded_DoesNotUseFileS…` |

Labels

comp: core (Tracer core), type: enhancement (Enhancements and improvements)
