Provide optimized writers for OpenTelemetry's "trace.proto" wire protocol #11120

Open
mcculls wants to merge 19 commits into master from mcculls/otlp-traces-proto

@mcculls mcculls commented Apr 15, 2026

What Does This Do

Uses a single temporary buffer, as in #10983, to prepare message chunks at different nesting levels (resource / scope / span).

First we chunk all nested messages (i.e. span links) for a given span. Once the span is complete we add the first part of the span message and its chunked links to the scoped chunks. Once the scope is complete we add the first part of the scoped spans message and all its chunks (span messages and their links) to the payload. Once all the span data has been chunked we add the enclosing resource spans message to the start of the payload.

Multiple traces can be added to the collector before collecting them into a payload. Note that this payload is only valid for the calling thread until the next collection. Adding traces after collection automatically starts a new payload.
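
The bottom-up chunking described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in this PR: `ChunkedMessageSketch`, `wrap`, and `totalSize` are hypothetical names, and the field numbers passed in are chosen by the caller. The point is that a parent's length prefix can only be written once the sizes of all its child chunks are known, so the parent's head chunk is produced last but placed first.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: builds nested length-delimited protobuf fields as a list of chunks.
final class ChunkedMessageSketch {
  private static final int WIRE_TYPE_LEN = 2; // protobuf wire type for length-delimited fields

  /** Appends a non-negative int as a base-128 varint. */
  static void writeVarint(ByteArrayOutputStream out, int value) {
    while ((value & ~0x7F) != 0) {
      out.write((value & 0x7F) | 0x80); // low 7 bits plus continuation bit
      value >>>= 7;
    }
    out.write(value);
  }

  /**
   * Wraps a message as a length-delimited field of its parent.
   * Returns [tag + length + ownFields, child chunks...] without copying the child chunks.
   */
  static List<byte[]> wrap(int fieldNumber, byte[] ownFields, List<byte[]> childChunks) {
    int size = ownFields.length + totalSize(childChunks); // exact body size is needed up front
    ByteArrayOutputStream head = new ByteArrayOutputStream();
    writeVarint(head, (fieldNumber << 3) | WIRE_TYPE_LEN);
    writeVarint(head, size);
    head.write(ownFields, 0, ownFields.length);
    List<byte[]> chunks = new ArrayList<>(childChunks.size() + 1);
    chunks.add(head.toByteArray());
    chunks.addAll(childChunks);
    return chunks;
  }

  /** Sums the lengths of all chunks. */
  static int totalSize(List<byte[]> chunks) {
    int total = 0;
    for (byte[] chunk : chunks) {
      total += chunk.length;
    }
    return total;
  }
}
```

Calling `wrap` first for a link, then for the span that contains it, and so on up through scope and resource, yields a flat chunk list that can be written out in order.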

Motivation

Avoids the need for the full protobuf library while keeping intermediate array creation to a minimum.

Additional Notes

OtlpTraceProtoTest was created with the help of Claude.

Contributor Checklist

Jira ticket: [PROJ-IDENT]

Note: Once your PR is ready to merge, add it to the merge queue by commenting /merge. /merge -c cancels the queue request. /merge -f --reason "reason" skips all merge queue checks; please use this judiciously, as some checks do not run at the PR-level. For more information, see this doc.

@mcculls mcculls added tag: do not merge Do not merge changes type: feature request inst: opentelemetry OpenTelemetry instrumentation labels Apr 15, 2026

pr-commenter bot commented Apr 15, 2026

Benchmarks

Startup

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master mcculls/otlp-traces-proto
git_commit_date 1776423906 1776461456
git_commit_sha d5d2097 397618f
release_version 1.62.0-SNAPSHOT~d5d2097cb9 1.62.0-SNAPSHOT~397618fe1c
See matching parameters
Baseline Candidate
application insecure-bank insecure-bank
ci_job_date 1776463320 1776463320
ci_job_id 1607682814 1607682814
ci_pipeline_id 108343309 108343309
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version Linux runner-zfyrx7zua-project-304-concurrent-0-9ovytqsw 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux Linux runner-zfyrx7zua-project-304-concurrent-0-9ovytqsw 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
module Agent Agent
parent None None

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 63 metrics, 8 unstable metrics.

Startup time reports for insecure-bank
gantt
    title insecure-bank - global startup overhead: candidate=1.62.0-SNAPSHOT~397618fe1c, baseline=1.62.0-SNAPSHOT~d5d2097cb9

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.057 s) : 0, 1056864
Total [baseline] (8.792 s) : 0, 8792297
Agent [candidate] (1.058 s) : 0, 1058128
Total [candidate] (8.832 s) : 0, 8831719
section iast
Agent [baseline] (1.225 s) : 0, 1224854
Total [baseline] (9.555 s) : 0, 9555351
Agent [candidate] (1.221 s) : 0, 1221465
Total [candidate] (9.531 s) : 0, 9530635
  • baseline results
Module Variant Duration Δ tracing
Agent tracing 1.057 s -
Agent iast 1.225 s 167.99 ms (15.9%)
Total tracing 8.792 s -
Total iast 9.555 s 763.054 ms (8.7%)
  • candidate results
Module Variant Duration Δ tracing
Agent tracing 1.058 s -
Agent iast 1.221 s 163.337 ms (15.4%)
Total tracing 8.832 s -
Total iast 9.531 s 698.916 ms (7.9%)
gantt
    title insecure-bank - break down per module: candidate=1.62.0-SNAPSHOT~397618fe1c, baseline=1.62.0-SNAPSHOT~d5d2097cb9

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.228 ms) : 0, 1228
crashtracking [candidate] (1.225 ms) : 0, 1225
BytebuddyAgent [baseline] (632.795 ms) : 0, 632795
BytebuddyAgent [candidate] (634.341 ms) : 0, 634341
AgentMeter [baseline] (29.555 ms) : 0, 29555
AgentMeter [candidate] (29.54 ms) : 0, 29540
GlobalTracer [baseline] (248.117 ms) : 0, 248117
GlobalTracer [candidate] (249.378 ms) : 0, 249378
AppSec [baseline] (32.329 ms) : 0, 32329
AppSec [candidate] (32.406 ms) : 0, 32406
Debugger [baseline] (59.021 ms) : 0, 59021
Debugger [candidate] (59.035 ms) : 0, 59035
Remote Config [baseline] (599.472 µs) : 0, 599
Remote Config [candidate] (609.97 µs) : 0, 610
Telemetry [baseline] (8.0 ms) : 0, 8000
Telemetry [candidate] (8.014 ms) : 0, 8014
Flare Poller [baseline] (9.066 ms) : 0, 9066
Flare Poller [candidate] (7.426 ms) : 0, 7426
section iast
crashtracking [baseline] (1.24 ms) : 0, 1240
crashtracking [candidate] (1.222 ms) : 0, 1222
BytebuddyAgent [baseline] (801.389 ms) : 0, 801389
BytebuddyAgent [candidate] (798.808 ms) : 0, 798808
AgentMeter [baseline] (11.676 ms) : 0, 11676
AgentMeter [candidate] (11.586 ms) : 0, 11586
GlobalTracer [baseline] (239.829 ms) : 0, 239829
GlobalTracer [candidate] (238.72 ms) : 0, 238720
IAST [baseline] (25.982 ms) : 0, 25982
IAST [candidate] (25.839 ms) : 0, 25839
AppSec [baseline] (33.35 ms) : 0, 33350
AppSec [candidate] (31.985 ms) : 0, 31985
Debugger [baseline] (61.747 ms) : 0, 61747
Debugger [candidate] (63.662 ms) : 0, 63662
Remote Config [baseline] (539.057 µs) : 0, 539
Remote Config [candidate] (534.116 µs) : 0, 534
Telemetry [baseline] (9.292 ms) : 0, 9292
Telemetry [candidate] (9.407 ms) : 0, 9407
Flare Poller [baseline] (3.639 ms) : 0, 3639
Flare Poller [candidate] (3.615 ms) : 0, 3615
Startup time reports for petclinic
gantt
    title petclinic - global startup overhead: candidate=1.62.0-SNAPSHOT~397618fe1c, baseline=1.62.0-SNAPSHOT~d5d2097cb9

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.056 s) : 0, 1056322
Total [baseline] (11.042 s) : 0, 11042462
Agent [candidate] (1.058 s) : 0, 1057877
Total [candidate] (11.085 s) : 0, 11084609
section appsec
Agent [baseline] (1.252 s) : 0, 1252335
Total [baseline] (11.027 s) : 0, 11026633
Agent [candidate] (1.248 s) : 0, 1248192
Total [candidate] (11.211 s) : 0, 11210673
section iast
Agent [baseline] (1.223 s) : 0, 1223292
Total [baseline] (11.19 s) : 0, 11189571
Agent [candidate] (1.225 s) : 0, 1224534
Total [candidate] (11.313 s) : 0, 11313499
section profiling
Agent [baseline] (1.196 s) : 0, 1195708
Total [baseline] (11.033 s) : 0, 11033181
Agent [candidate] (1.187 s) : 0, 1187043
Total [candidate] (11.011 s) : 0, 11011386
  • baseline results
Module Variant Duration Δ tracing
Agent tracing 1.056 s -
Agent appsec 1.252 s 196.013 ms (18.6%)
Agent iast 1.223 s 166.97 ms (15.8%)
Agent profiling 1.196 s 139.386 ms (13.2%)
Total tracing 11.042 s -
Total appsec 11.027 s -15.829 ms (-0.1%)
Total iast 11.19 s 147.109 ms (1.3%)
Total profiling 11.033 s -9.281 ms (-0.1%)
  • candidate results
Module Variant Duration Δ tracing
Agent tracing 1.058 s -
Agent appsec 1.248 s 190.315 ms (18.0%)
Agent iast 1.225 s 166.657 ms (15.8%)
Agent profiling 1.187 s 129.166 ms (12.2%)
Total tracing 11.085 s -
Total appsec 11.211 s 126.064 ms (1.1%)
Total iast 11.313 s 228.89 ms (2.1%)
Total profiling 11.011 s -73.224 ms (-0.7%)
gantt
    title petclinic - break down per module: candidate=1.62.0-SNAPSHOT~397618fe1c, baseline=1.62.0-SNAPSHOT~d5d2097cb9

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.217 ms) : 0, 1217
crashtracking [candidate] (1.227 ms) : 0, 1227
BytebuddyAgent [baseline] (631.786 ms) : 0, 631786
BytebuddyAgent [candidate] (632.214 ms) : 0, 632214
AgentMeter [baseline] (29.486 ms) : 0, 29486
AgentMeter [candidate] (29.5 ms) : 0, 29500
GlobalTracer [baseline] (248.13 ms) : 0, 248130
GlobalTracer [candidate] (249.184 ms) : 0, 249184
AppSec [baseline] (32.293 ms) : 0, 32293
AppSec [candidate] (32.275 ms) : 0, 32275
Debugger [baseline] (59.772 ms) : 0, 59772
Debugger [candidate] (59.747 ms) : 0, 59747
Remote Config [baseline] (589.264 µs) : 0, 589
Remote Config [candidate] (590.356 µs) : 0, 590
Telemetry [baseline] (8.709 ms) : 0, 8709
Telemetry [candidate] (8.047 ms) : 0, 8047
Flare Poller [baseline] (8.253 ms) : 0, 8253
Flare Poller [candidate] (8.914 ms) : 0, 8914
section appsec
crashtracking [baseline] (1.22 ms) : 0, 1220
crashtracking [candidate] (1.211 ms) : 0, 1211
BytebuddyAgent [baseline] (664.211 ms) : 0, 664211
BytebuddyAgent [candidate] (661.673 ms) : 0, 661673
AgentMeter [baseline] (12.301 ms) : 0, 12301
AgentMeter [candidate] (12.256 ms) : 0, 12256
GlobalTracer [baseline] (249.769 ms) : 0, 249769
GlobalTracer [candidate] (248.638 ms) : 0, 248638
IAST [baseline] (24.628 ms) : 0, 24628
IAST [candidate] (24.435 ms) : 0, 24435
AppSec [baseline] (184.928 ms) : 0, 184928
AppSec [candidate] (185.262 ms) : 0, 185262
Debugger [baseline] (66.265 ms) : 0, 66265
Debugger [candidate] (65.881 ms) : 0, 65881
Remote Config [baseline] (609.87 µs) : 0, 610
Remote Config [candidate] (605.608 µs) : 0, 606
Telemetry [baseline] (8.333 ms) : 0, 8333
Telemetry [candidate] (8.343 ms) : 0, 8343
Flare Poller [baseline] (3.498 ms) : 0, 3498
Flare Poller [candidate] (3.542 ms) : 0, 3542
section iast
crashtracking [baseline] (1.222 ms) : 0, 1222
crashtracking [candidate] (1.231 ms) : 0, 1231
BytebuddyAgent [baseline] (799.935 ms) : 0, 799935
BytebuddyAgent [candidate] (799.759 ms) : 0, 799759
AgentMeter [baseline] (11.584 ms) : 0, 11584
AgentMeter [candidate] (11.65 ms) : 0, 11650
GlobalTracer [baseline] (238.668 ms) : 0, 238668
GlobalTracer [candidate] (239.58 ms) : 0, 239580
IAST [baseline] (25.787 ms) : 0, 25787
IAST [candidate] (25.842 ms) : 0, 25842
AppSec [baseline] (31.153 ms) : 0, 31153
AppSec [candidate] (31.576 ms) : 0, 31576
Debugger [baseline] (65.418 ms) : 0, 65418
Debugger [candidate] (65.344 ms) : 0, 65344
Remote Config [baseline] (537.354 µs) : 0, 537
Remote Config [candidate] (531.559 µs) : 0, 532
Telemetry [baseline] (9.253 ms) : 0, 9253
Telemetry [candidate] (9.365 ms) : 0, 9365
Flare Poller [baseline] (3.576 ms) : 0, 3576
Flare Poller [candidate] (3.623 ms) : 0, 3623
section profiling
crashtracking [baseline] (1.199 ms) : 0, 1199
crashtracking [candidate] (1.18 ms) : 0, 1180
BytebuddyAgent [baseline] (700.314 ms) : 0, 700314
BytebuddyAgent [candidate] (692.817 ms) : 0, 692817
AgentMeter [baseline] (9.213 ms) : 0, 9213
AgentMeter [candidate] (9.267 ms) : 0, 9267
GlobalTracer [baseline] (208.475 ms) : 0, 208475
GlobalTracer [candidate] (208.03 ms) : 0, 208030
AppSec [baseline] (32.862 ms) : 0, 32862
AppSec [candidate] (32.986 ms) : 0, 32986
Debugger [baseline] (66.127 ms) : 0, 66127
Debugger [candidate] (65.942 ms) : 0, 65942
Remote Config [baseline] (581.002 µs) : 0, 581
Remote Config [candidate] (583.759 µs) : 0, 584
Telemetry [baseline] (7.814 ms) : 0, 7814
Telemetry [candidate] (7.754 ms) : 0, 7754
Flare Poller [baseline] (3.576 ms) : 0, 3576
Flare Poller [candidate] (3.498 ms) : 0, 3498
ProfilingAgent [baseline] (93.611 ms) : 0, 93611
ProfilingAgent [candidate] (93.764 ms) : 0, 93764
Profiling [baseline] (94.166 ms) : 0, 94166
Profiling [candidate] (94.304 ms) : 0, 94304

Load

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master mcculls/otlp-traces-proto
git_commit_date 1776423906 1776461456
git_commit_sha d5d2097 397618f
release_version 1.62.0-SNAPSHOT~d5d2097cb9 1.62.0-SNAPSHOT~397618fe1c
See matching parameters
Baseline Candidate
application insecure-bank insecure-bank
ci_job_date 1776463802 1776463802
ci_job_id 1607682815 1607682815
ci_pipeline_id 108343309 108343309
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version Linux runner-zfyrx7zua-project-304-concurrent-1-cw18d2ct 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux Linux runner-zfyrx7zua-project-304-concurrent-1-cw18d2ct 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

Summary

Found 1 performance improvement and 0 performance regressions! Performance is the same for 19 metrics, 16 unstable metrics.

scenario:load:insecure-bank:iast:high_load
  • Δ mean agg_http_req_duration_p50: better, [-178.227µs; -66.529µs] or [-6.710%; -2.505%]
  • Δ mean agg_http_req_duration_p95: unsure, [-585.053µs; -0.415µs] or [-7.561%; -0.005%]
  • Δ mean throughput: unstable, [-85.265op/s; +206.202op/s] or [-6.337%; +15.325%]
  • candidate mean: agg_http_req_duration_p50 2.534ms, agg_http_req_duration_p95 7.445ms, throughput 1405.969op/s
  • baseline mean: agg_http_req_duration_p50 2.656ms, agg_http_req_duration_p95 7.737ms, throughput 1345.500op/s
Request duration reports for petclinic
gantt
    title petclinic - request duration [CI 0.99] : candidate=1.62.0-SNAPSHOT~397618fe1c, baseline=1.62.0-SNAPSHOT~d5d2097cb9
    dateFormat X
    axisFormat %s
section baseline
no_agent (18.064 ms) : 17882, 18247
.   : milestone, 18064,
appsec (18.821 ms) : 18630, 19011
.   : milestone, 18821,
code_origins (17.901 ms) : 17722, 18080
.   : milestone, 17901,
iast (17.657 ms) : 17484, 17831
.   : milestone, 17657,
profiling (18.281 ms) : 18102, 18460
.   : milestone, 18281,
tracing (17.511 ms) : 17337, 17686
.   : milestone, 17511,
section candidate
no_agent (19.111 ms) : 18916, 19307
.   : milestone, 19111,
appsec (18.871 ms) : 18680, 19061
.   : milestone, 18871,
code_origins (17.713 ms) : 17537, 17890
.   : milestone, 17713,
iast (17.923 ms) : 17747, 18100
.   : milestone, 17923,
profiling (18.334 ms) : 18154, 18514
.   : milestone, 18334,
tracing (17.98 ms) : 17803, 18158
.   : milestone, 17980,
  • baseline results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 18.064 ms [17.882 ms, 18.247 ms] -
appsec 18.821 ms [18.63 ms, 19.011 ms] 756.168 µs (4.2%)
code_origins 17.901 ms [17.722 ms, 18.08 ms] -163.397 µs (-0.9%)
iast 17.657 ms [17.484 ms, 17.831 ms] -407.158 µs (-2.3%)
profiling 18.281 ms [18.102 ms, 18.46 ms] 216.345 µs (1.2%)
tracing 17.511 ms [17.337 ms, 17.686 ms] -552.997 µs (-3.1%)
  • candidate results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 19.111 ms [18.916 ms, 19.307 ms] -
appsec 18.871 ms [18.68 ms, 19.061 ms] -240.908 µs (-1.3%)
code_origins 17.713 ms [17.537 ms, 17.89 ms] -1.398 ms (-7.3%)
iast 17.923 ms [17.747 ms, 18.1 ms] -1.188 ms (-6.2%)
profiling 18.334 ms [18.154 ms, 18.514 ms] -777.563 µs (-4.1%)
tracing 17.98 ms [17.803 ms, 18.158 ms] -1.131 ms (-5.9%)
Request duration reports for insecure-bank
gantt
    title insecure-bank - request duration [CI 0.99] : candidate=1.62.0-SNAPSHOT~397618fe1c, baseline=1.62.0-SNAPSHOT~d5d2097cb9
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.254 ms) : 1242, 1266
.   : milestone, 1254,
iast (3.404 ms) : 3357, 3450
.   : milestone, 3404,
iast_FULL (6.064 ms) : 6002, 6126
.   : milestone, 6064,
iast_GLOBAL (3.869 ms) : 3813, 3925
.   : milestone, 3869,
profiling (2.232 ms) : 2208, 2256
.   : milestone, 2232,
tracing (1.929 ms) : 1911, 1947
.   : milestone, 1929,
section candidate
no_agent (1.297 ms) : 1284, 1311
.   : milestone, 1297,
iast (3.257 ms) : 3210, 3304
.   : milestone, 3257,
iast_FULL (5.943 ms) : 5883, 6003
.   : milestone, 5943,
iast_GLOBAL (3.826 ms) : 3759, 3893
.   : milestone, 3826,
profiling (2.087 ms) : 2067, 2108
.   : milestone, 2087,
tracing (1.873 ms) : 1856, 1889
.   : milestone, 1873,
  • baseline results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 1.254 ms [1.242 ms, 1.266 ms] -
iast 3.404 ms [3.357 ms, 3.45 ms] 2.15 ms (171.4%)
iast_FULL 6.064 ms [6.002 ms, 6.126 ms] 4.81 ms (383.5%)
iast_GLOBAL 3.869 ms [3.813 ms, 3.925 ms] 2.615 ms (208.5%)
profiling 2.232 ms [2.208 ms, 2.256 ms] 978.199 µs (78.0%)
tracing 1.929 ms [1.911 ms, 1.947 ms] 675.028 µs (53.8%)
  • candidate results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 1.297 ms [1.284 ms, 1.311 ms] -
iast 3.257 ms [3.21 ms, 3.304 ms] 1.96 ms (151.1%)
iast_FULL 5.943 ms [5.883 ms, 6.003 ms] 4.646 ms (358.1%)
iast_GLOBAL 3.826 ms [3.759 ms, 3.893 ms] 2.528 ms (194.9%)
profiling 2.087 ms [2.067 ms, 2.108 ms] 790.085 µs (60.9%)
tracing 1.873 ms [1.856 ms, 1.889 ms] 575.259 µs (44.3%)

Dacapo

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master mcculls/otlp-traces-proto
git_commit_date 1776423906 1776461456
git_commit_sha d5d2097 397618f
release_version 1.62.0-SNAPSHOT~d5d2097cb9 1.62.0-SNAPSHOT~397618fe1c
See matching parameters
Baseline Candidate
application biojava biojava
ci_job_date 1776463511 1776463511
ci_job_id 1607682816 1607682816
ci_pipeline_id 108343309 108343309
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version Linux runner-zfyrx7zua-project-304-concurrent-2-0zc9woyi 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux Linux runner-zfyrx7zua-project-304-concurrent-2-0zc9woyi 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 12 metrics, 0 unstable metrics.

Execution time for tomcat
gantt
    title tomcat - execution time [CI 0.99] : candidate=1.62.0-SNAPSHOT~397618fe1c, baseline=1.62.0-SNAPSHOT~d5d2097cb9
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.487 ms) : 1476, 1499
.   : milestone, 1487,
appsec (2.527 ms) : 2472, 2581
.   : milestone, 2527,
iast (2.267 ms) : 2198, 2335
.   : milestone, 2267,
iast_GLOBAL (2.31 ms) : 2241, 2380
.   : milestone, 2310,
profiling (2.089 ms) : 2035, 2144
.   : milestone, 2089,
tracing (2.075 ms) : 2022, 2128
.   : milestone, 2075,
section candidate
no_agent (1.486 ms) : 1474, 1497
.   : milestone, 1486,
appsec (2.529 ms) : 2474, 2583
.   : milestone, 2529,
iast (2.274 ms) : 2204, 2344
.   : milestone, 2274,
iast_GLOBAL (2.307 ms) : 2237, 2376
.   : milestone, 2307,
profiling (2.099 ms) : 2044, 2154
.   : milestone, 2099,
tracing (2.077 ms) : 2024, 2131
.   : milestone, 2077,
  • baseline results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 1.487 ms [1.476 ms, 1.499 ms] -
appsec 2.527 ms [2.472 ms, 2.581 ms] 1.039 ms (69.9%)
iast 2.267 ms [2.198 ms, 2.335 ms] 779.288 µs (52.4%)
iast_GLOBAL 2.31 ms [2.241 ms, 2.38 ms] 823.098 µs (55.3%)
profiling 2.089 ms [2.035 ms, 2.144 ms] 602.115 µs (40.5%)
tracing 2.075 ms [2.022 ms, 2.128 ms] 587.826 µs (39.5%)
  • candidate results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 1.486 ms [1.474 ms, 1.497 ms] -
appsec 2.529 ms [2.474 ms, 2.583 ms] 1.043 ms (70.2%)
iast 2.274 ms [2.204 ms, 2.344 ms] 788.285 µs (53.1%)
iast_GLOBAL 2.307 ms [2.237 ms, 2.376 ms] 820.949 µs (55.3%)
profiling 2.099 ms [2.044 ms, 2.154 ms] 613.186 µs (41.3%)
tracing 2.077 ms [2.024 ms, 2.131 ms] 591.704 µs (39.8%)
Execution time for biojava
gantt
    title biojava - execution time [CI 0.99] : candidate=1.62.0-SNAPSHOT~397618fe1c, baseline=1.62.0-SNAPSHOT~d5d2097cb9
    dateFormat X
    axisFormat %s
section baseline
no_agent (14.912 s) : 14912000, 14912000
.   : milestone, 14912000,
appsec (14.744 s) : 14744000, 14744000
.   : milestone, 14744000,
iast (18.559 s) : 18559000, 18559000
.   : milestone, 18559000,
iast_GLOBAL (18.404 s) : 18404000, 18404000
.   : milestone, 18404000,
profiling (14.942 s) : 14942000, 14942000
.   : milestone, 14942000,
tracing (15.048 s) : 15048000, 15048000
.   : milestone, 15048000,
section candidate
no_agent (15.239 s) : 15239000, 15239000
.   : milestone, 15239000,
appsec (14.725 s) : 14725000, 14725000
.   : milestone, 14725000,
iast (18.706 s) : 18706000, 18706000
.   : milestone, 18706000,
iast_GLOBAL (17.975 s) : 17975000, 17975000
.   : milestone, 17975000,
profiling (14.886 s) : 14886000, 14886000
.   : milestone, 14886000,
tracing (14.806 s) : 14806000, 14806000
.   : milestone, 14806000,
  • baseline results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 14.912 s [14.912 s, 14.912 s] -
appsec 14.744 s [14.744 s, 14.744 s] -168.0 ms (-1.1%)
iast 18.559 s [18.559 s, 18.559 s] 3.647 s (24.5%)
iast_GLOBAL 18.404 s [18.404 s, 18.404 s] 3.492 s (23.4%)
profiling 14.942 s [14.942 s, 14.942 s] 30.0 ms (0.2%)
tracing 15.048 s [15.048 s, 15.048 s] 136.0 ms (0.9%)
  • candidate results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 15.239 s [15.239 s, 15.239 s] -
appsec 14.725 s [14.725 s, 14.725 s] -514.0 ms (-3.4%)
iast 18.706 s [18.706 s, 18.706 s] 3.467 s (22.8%)
iast_GLOBAL 17.975 s [17.975 s, 17.975 s] 2.736 s (18.0%)
profiling 14.886 s [14.886 s, 14.886 s] -353.0 ms (-2.3%)
tracing 14.806 s [14.806 s, 14.806 s] -433.0 ms (-2.8%)

@mcculls mcculls force-pushed the mcculls/otlp-traces-proto branch 2 times, most recently from 583dc0c to 4adb56e Compare April 15, 2026 17:53
@mcculls mcculls changed the title WIP: Provide optimized writers for OpenTelemetry's "trace.proto" wire protocol Provide optimized writers for OpenTelemetry's "trace.proto" wire protocol Apr 15, 2026
@mcculls mcculls marked this pull request as ready for review April 15, 2026 17:56
@mcculls mcculls requested a review from a team as a code owner April 15, 2026 17:56
@mcculls mcculls requested review from mtoffl01 and ygree and removed request for ygree April 15, 2026 17:56
@mcculls mcculls removed the tag: do not merge Do not merge changes label Apr 15, 2026
Comment on lines +59 to +65
/**
* Collects trace spans and marshalls them into a chunked payload.
*
* <p>This payload is only valid for the calling thread until the next collection.
*/
@Override
public OtlpPayload collectSpans(List<DDSpan> spans) {
Contributor

Is List<DDSpan> spans expected to contain the spans of a single trace? If so, each collectSpans call produces a full TracesData envelope, with resource and scope wrappers, per trace. That doesn't seem optimal, and it differs from the Datadog/msgpack implementation. Unless the expectation is that the eventual OtlpWriter will accumulate completed traces and call this once per flush cycle with a combined span list (although that can't be right based on the MetaWriter, which expects just a single trace at a time).

Contributor Author

Very good point - on reflection I'll change this to add a flush method so we can accumulate trace chunks over multiple calls.

Contributor Author

OK, I've updated the collector API so it has two methods:

  • addTrace(spans) which adds a trace to the collector
  • collectTraces() which marshals the collected spans into a payload

This should allow its use as a replacement PayloadDispatcher, which means we can re-use more of the existing remote writer code.
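
For readers following along, the two-method shape could look roughly like the sketch below. It is a simplified stand-in, not the collector in this PR: `AccumulatingCollectorSketch` is a hypothetical name, plain strings stand in for DDSpan, and a list stands in for the marshalled payload. It only illustrates the lifecycle from the description: traces accumulate across calls, the collected payload is a view that stays valid only until the next collection cycle begins, and adding a trace after collection starts a new payload.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of an accumulate-then-collect lifecycle (strings stand in for spans).
final class AccumulatingCollectorSketch {
  private final List<String> pending = new ArrayList<>();
  private boolean collected; // true once the current payload has been handed out

  /** Adds one trace; starts a fresh payload if the previous one was already collected. */
  void addTrace(List<String> spans) {
    if (collected) {
      pending.clear(); // invalidates the previously returned payload view
      collected = false;
    }
    pending.addAll(spans);
  }

  /** Returns a read-only view over everything added since the last collection. */
  List<String> collectTraces() {
    collected = true;
    return Collections.unmodifiableList(pending);
  }
}
```

Returning a view rather than a copy keeps allocation down, at the cost of the "valid until the next collection" restriction on callers.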

Comment thread dd-trace-core/src/main/java/datadog/trace/core/otlp/trace/OtlpTraceProto.java Outdated
@mcculls mcculls force-pushed the mcculls/otlp-traces-proto branch from ab2ef0b to 7cdfed7 Compare April 17, 2026 11:53
@mcculls mcculls requested a review from a team as a code owner April 17, 2026 11:53
@mcculls mcculls requested review from dougqh and removed request for a team April 17, 2026 11:53
Comment thread dd-trace-core/src/main/java/datadog/trace/core/otlp/trace/OtlpTraceProto.java Outdated
Comment thread dd-trace-core/src/main/java/datadog/trace/core/otlp/common/OtlpCommonProto.java Outdated
Comment thread dd-trace-core/src/main/java/datadog/trace/core/otlp/trace/OtlpTraceProto.java Outdated
Contributor

@dougqh dougqh left a comment

Claude caught a couple of issues...

  • NPE and ClassCastException

Since I'm off next week, I'm not going to "request changes".
I'll just trust those get fixed and let someone else do the final review.

Also, I added one key performance suggestion around the use of forEach.

And here are a couple more that Claude reported, which I'll leave to your discretion...

  1. Config.get().getServiceName() on every span — OtlpTraceProto.java:137
    if (!Config.get().getServiceName().equalsIgnoreCase(span.getServiceName())) {
    Cache the default service name (ideally as a UTF8BytesString for cheap equality). This runs for every span in every payload.

  2. recordMessage allocates a fresh ByteBuffer + backing array per chunk — OtlpCommonProto.java:126-140
    Every span, every link, every scope prefix gets its own heap allocation. Precisely-sized allocations are nice but total allocation count scales with the chunk count. If profiling shows GC pressure, a small reusable scratch arena that hands out slices (or an OtlpPayload that owns a large backing buffer with offset/length pairs) would eliminate most of this. Trade-off is lifetime complexity, so only worth it if measurements show it matters.
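
As an illustration of the first suggestion, the default service name could be looked up once and reused for every span. The sketch below is hypothetical: `DefaultServiceNameCheck` is not a class in this PR, a `Supplier<String>` stands in for the `Config.get().getServiceName()` lookup, and it assumes the default service name does not change while the writer is alive. It uses a plain `String` rather than `UTF8BytesString`, whose API is not shown here.

```java
import java.util.function.Supplier;

// Hypothetical sketch: resolve the default service name once instead of once per span.
final class DefaultServiceNameCheck {
  private final String defaultServiceName; // cached at construction time

  DefaultServiceNameCheck(Supplier<String> configLookup) {
    this.defaultServiceName = configLookup.get();
  }

  /** Returns true when the span's service name matches the cached default, ignoring case. */
  boolean isDefault(CharSequence spanServiceName) {
    return spanServiceName != null
        && defaultServiceName.equalsIgnoreCase(spanServiceName.toString());
  }
}
```

The per-span check then becomes `if (!check.isDefault(span.getServiceName()))`, with no configuration lookup on the hot path.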

Contributor Author

mcculls commented Apr 17, 2026

recordMessage allocates a fresh ByteBuffer + backing array per chunk — OtlpCommonProto.java:126-140
Every span, every link, every scope prefix gets its own heap allocation. Precisely-sized allocations are nice but total allocation count scales with the chunk count. If profiling shows GC pressure, a small reusable scratch arena that hands out slices (or an OtlpPayload that owns a large backing buffer with offset/length pairs) would eliminate most of this. Trade-off is lifetime complexity, so only worth it if measurements show it matters.

Yes, sadly this is the nature of heavily nested protobuf messages (the protobuf manual says to avoid too much nesting).

It means that before we can write out a span we need to know its exact message size. Because the size field is varint-encoded, different sizes take up different numbers of bytes, even for the size field itself. So we can't leave a placeholder and come back later to write the size: we don't know how many bytes the size field will need, and if we guess wrong we'd have to shift chunks around, possibly multiple times. Ditto for any links the span has, because the OTel proto spec nests them inside spans, and span links may have attributes, which means yet another level of nested message!
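
To make the placeholder problem concrete, here is a small sketch of protobuf's base-128 varint encoding (`VarintSizeSketch` is a hypothetical helper name, not something from this PR). A body of 127 bytes needs a one-byte length prefix, while a body of 128 bytes needs two, so the position of everything after the prefix depends on the final size.

```java
// Hypothetical sketch: size and encoding of a non-negative int as a protobuf varint.
final class VarintSizeSketch {

  /** Number of bytes needed to encode the value (7 payload bits per byte). */
  static int varintSize(int value) {
    int size = 1;
    while ((value & ~0x7F) != 0) {
      size++;
      value >>>= 7;
    }
    return size;
  }

  /** Writes the value at pos and returns the position just after the last byte written. */
  static int writeVarint(byte[] buf, int pos, int value) {
    while ((value & ~0x7F) != 0) {
      buf[pos++] = (byte) ((value & 0x7F) | 0x80); // continuation bit set
      value >>>= 7;
    }
    buf[pos++] = (byte) value;
    return pos;
  }
}
```

For example, `varintSize(127)` is 1 and `varintSize(128)` is 2, and the value 300 encodes as the two bytes 0xAC 0x02.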

You could process traces twice - once to size everything, and again to write it out - but the book-keeping needed for that gets complicated, and you're doubling the CPU time doing two passes.

Initial benchmarking showed we're allocating less than OTel with the current approach, mainly because we re-use the same buffer for doing the initial writes before recording each message slice. But I might look into pooling of slices to reduce churn.
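
One possible shape for the backing-buffer idea raised in the review, purely as a sketch: a single growable array that records each message and identifies it by offset and length instead of a separately allocated buffer. `SliceArenaSketch` and its methods are hypothetical names, not part of this PR. Because slices are addressed by offset, the backing array can grow without invalidating earlier records, and `reset()` lets the next payload reuse the same memory.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical sketch: one shared backing array, slices identified by (offset, length).
final class SliceArenaSketch {
  private byte[] backing;
  private int used;

  SliceArenaSketch(int initialCapacity) {
    backing = new byte[initialCapacity];
  }

  /** Copies the bytes into the arena and returns the offset where they start. */
  int record(byte[] src, int offset, int length) {
    if (used + length > backing.length) {
      backing = Arrays.copyOf(backing, Math.max(backing.length * 2, used + length));
    }
    System.arraycopy(src, offset, backing, used, length);
    int start = used;
    used += length;
    return start;
  }

  /** Returns a buffer over a recorded region; valid only until the next reset or growth. */
  ByteBuffer slice(int start, int length) {
    return ByteBuffer.wrap(backing, start, length);
  }

  /** Discards all recorded slices so the next payload can reuse the memory. */
  void reset() {
    used = 0;
  }
}
```

The trade-off mentioned in the review still applies: callers must not hold on to slices across a reset, which is the lifetime complexity that per-chunk allocation avoids.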

@mcculls mcculls force-pushed the mcculls/otlp-traces-proto branch from 97b5fc9 to e77fe7e Compare April 17, 2026 20:25