Capture deeper diagnostics for log-injection smoke test flake #11211

Draft
bm1549 wants to merge 1 commit into master from brian.marks/deep-diagnose-log-injection-flake

@bm1549 bm1549 commented Apr 27, 2026

What Does This Do

Extends the failure diagnostic in LogInjectionSmokeTest.waitForTraceCountAlive so that the next failure of the "check raw file injection" test produces enough information to identify the actual root cause in one CI run, rather than requiring another iteration. Test-only change; no agent or product code touched.

Motivation

Prior work

This test has flaked across 11+ logging backends for months.

| PR | Theory | Outcome |
|---|---|---|
| #10920 | Wrong index in fallback assertion | Real bug, fixed; did not reduce flake rate (failures continued at the timeout, not the assertion) |
| #10999 | App's 10s BaseApplication.TIMEOUT_IN_NANOS fires before the test's 30s PollingConditions, kills the daemon trace writer | Bumped app timeout to 30s; did not reduce flake rate |
| #11075 | We can't tell what's going wrong; add diagnostics on timeout (process alive? trace count? RC polls? last process output?) | Did its job: gave us actionable data on the next two failures |

The two #11075-instrumented failures both showed:

process.alive=true   rcPolls=132–135   traceCount=0   Last process output: <empty>

Hypotheses ruled out by #11075

  • Process crash: process.alive=true
  • Network / RC connectivity: 132+ polls reached /v0.7/config in ~27s
  • Trace decoder bug: would surface via traceDecodingFailure
  • PR #10999's "app times out and kills the writer" theory: the process didn't exit; the JVM is silent but alive
  • SSI-specific: 12/16 failures are test_ssi_smoke, but 4/16 are regular test_smoke, so the same root cause appears in both job types

What #11075 could not tell us (and why this PR exists)

The captured stdout file is empty. With dd-trace at defaultLogLevel=debug, that is unexpected: in 30s of agent + app startup we would normally see thousands of log lines. The tailProcessLog from #11075 cannot distinguish:

| | Hypothesis | What stdout being empty actually means |
|---|---|---|
| H1 | Class-load deadlock between main thread and an agent transformer | main is wedged before producing any output; RC daemon (started before the wedge) keeps polling. Agent has a ClassLoadCallBack.run() defer pattern (Agent.java:596) that exists because this class of bug is real |
| H2 | @Trace advice deadlocks on first invocation | main is in the bytecode wrapper for firstTracedMethod; System.out.println("FIRSTTRACEID …") is inside the wedged advice |
| H3 | App ran fully; trace writer (Disruptor) is stuck | main is sleeping in waitForCondition; spans were created but never flushed |
| H4 | Tracer init silently failed → @Trace is no-op | App runs to completion; spans created have traceId/spanId of 0 |
| H5 | OutputThreads writer thread died (uncaught exception in wc.write / decoder) | Captured file is empty regardless of app state |
| H6 | Stdout pipe filled and JVM blocked on writeBytes | Captured file has some bytes followed by nothing more |
| H7 | RC polls came from somewhere unexpected (invalidating "agent reached Agent.java:689") | Test process's own thread dump shows no RC poller |

The existing fields are insufficient to discriminate these. The next iteration should not have to be another guessing game.

Additional Notes

What this PR captures

In waitForTraceCountAlive's catch block (failure path only), captureFullDiagnostic collects:

  • PID, alive, exit value: Process.pid() is Java 9+, guarded for zulu8
  • Captured stdout file metadata + bounded tail: size in bytes, last-modified, last 60 lines (closes "0 bytes" vs "has content but no newlines")
  • SIGQUIT thread dump on the live process: JVM dumps to stderr → captured pipe; we sleep briefly so the writer thread can drain it before we re-read
  • Application logger file (outputLogFile, the dd.test.logfile target): bounded tail. This is the H1/H2 vs H3/H4 discriminator: if the app log has BEFORE FIRST SPAN, the app started; if it has all 4 lines, the app finished
  • smoke-output ThreadGroup enumeration: count + state of OutputThreads writer threads. A non-empty thread set tells us H5 is wrong; if it is empty, the captured file is unreliable regardless
  • jcmd Thread.print fallback: works even if the OutputThreads writer is dead. Drains stdout incrementally (the naive form deadlocks on full pipes, per codex review)
  • Previously-swallowed rcClientDecodingFailure: surfaced
  • Launch command line: saves digging through Gradle test reports
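
The "bounded tail" items above could look roughly like the following. This is a minimal sketch, not the PR's actual code: the class name BoundedTail, the method tail, and the maxBytes parameter are hypothetical, and the file is assumed to be UTF-8. It reads only the last maxBytes of the file through RandomAccessFile, so a log inflated by a thread dump never has to fit in memory.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper for illustration; not taken from the PR. */
final class BoundedTail {
  private BoundedTail() {}

  /**
   * Returns at most maxLines trailing lines of the file while reading no
   * more than maxBytes from its end, so memory use stays bounded.
   */
  static List<String> tail(Path file, int maxLines, int maxBytes) {
    try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
      long length = raf.length();
      long start = Math.max(0L, length - maxBytes);
      byte[] window = new byte[(int) (length - start)];
      raf.seek(start);
      raf.readFully(window);

      String text = new String(window, StandardCharsets.UTF_8);
      List<String> lines = new ArrayList<>(Arrays.asList(text.split("\r?\n", -1)));
      // A trailing newline yields one empty element at the end; drop it.
      if (!lines.isEmpty() && lines.get(lines.size() - 1).isEmpty()) {
        lines.remove(lines.size() - 1);
      }
      // If the window starts mid-file, the first line is probably partial.
      // Dropping it is conservative: it may discard one complete line.
      if (start > 0 && !lines.isEmpty()) {
        lines.remove(0);
      }
      int from = Math.max(0, lines.size() - maxLines);
      return new ArrayList<>(lines.subList(from, lines.size()));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

A call such as BoundedTail.tail(path, 60, 64 * 1024) would return the last 60 lines at most. A file with content but no newline comes back as a single line, which keeps the "0 bytes" and "no newlines" cases distinguishable when combined with the file size.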

How each remaining hypothesis falls out

| Hypothesis | Signature in the new diagnostic |
|---|---|
| H1 class-load deadlock | thread dump shows main BLOCKED on loadClass/<clinit>, monitor held by an agent transformer thread |
| H2 stuck @Trace advice | thread dump shows main inside datadog.trace.bootstrap.instrumentation.api.* / datadog.trace.core.* |
| H3 trace writer stuck | app log has all 4 BEFORE/INSIDE/AFTER lines; captured stdout has FIRSTTRACEID/SECONDTRACEID; thread dump shows trace-writer/disruptor thread BLOCKED/WAITING |
| H4 no-op tracer | captured stdout has FIRSTTRACEID 0 0 (literal zero ids) |
| H5 dead writer | smoke-output threads=0 (or threads exist but in TERMINATED state) |
| H6 pipe full | captured stdout file size is non-zero with old mtime; thread dump shows main in writeBytes native frame |
| H7 phantom RC poller | thread dump shows no RC polling thread in the test process |
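
The H5 signature depends on enumerating a thread group inside the test JVM. The sketch below shows one way to do that with the standard ThreadGroup API. The class and method names are hypothetical, and the group name is passed in rather than assumed. Note that ThreadGroup.enumerate copies only live threads, so with this approach a writer thread that has already terminated would most likely show up as missing from the list rather than as TERMINATED.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical probe for illustration; not taken from the PR. */
final class ThreadGroupProbe {
  private ThreadGroupProbe() {}

  /** Finds a thread group by name anywhere under the root group, or null. */
  static ThreadGroup findGroup(String name) {
    ThreadGroup root = Thread.currentThread().getThreadGroup();
    while (root.getParent() != null) {
      root = root.getParent();
    }
    if (name.equals(root.getName())) {
      return root;
    }
    // activeGroupCount() is only an estimate, so leave some headroom.
    ThreadGroup[] groups = new ThreadGroup[root.activeGroupCount() + 16];
    int count = root.enumerate(groups, true);
    for (int i = 0; i < count; i++) {
      if (name.equals(groups[i].getName())) {
        return groups[i];
      }
    }
    return null;
  }

  /** Returns "name state=STATE" for each live thread in the named group. */
  static List<String> describeThreads(String groupName) {
    List<String> result = new ArrayList<>();
    ThreadGroup group = findGroup(groupName);
    if (group == null) {
      return result;
    }
    Thread[] threads = new Thread[group.activeCount() + 16];
    int count = group.enumerate(threads, true);
    for (int i = 0; i < count; i++) {
      result.add(threads[i].getName() + " state=" + threads[i].getState());
    }
    return result;
  }
}
```

An empty result from describeThreads("smoke-output") would then correspond to the "threads=0" signature in the H5 row.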

Review fixes already applied

Codex adversarial review caught two issues that would have made this PR ironic — diagnostic code that itself behaves badly under stress:

  • The naive proc.in.text after proc.waitFor() self-deadlocks when the dump exceeds the OS pipe buffer (~64KB on Linux), exactly when we most need it. The implementation now drains stdout incrementally with redirectErrorStream(true).
  • The naive readLines() on a thread-dump-inflated log loads everything into memory and could OOM the Gradle worker. Replaced with a bounded RandomAccessFile tail.
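
The first fix can be illustrated as follows. This is a sketch under assumptions rather than the PR's implementation: DrainingRunner, run, and the byte cap are hypothetical names and choices. The child's output is read while it is still running, with stderr merged into stdout through redirectErrorStream(true), so the child can never block on a full pipe while the parent sits in waitFor().

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.concurrent.TimeUnit;

/** Hypothetical runner for illustration; not taken from the PR. */
final class DrainingRunner {
  private DrainingRunner() {}

  /** Runs the command and returns up to maxBytes of merged stdout and stderr. */
  static String run(List<String> command, int maxBytes, long timeoutSeconds) {
    try {
      Process process = new ProcessBuilder(command).redirectErrorStream(true).start();
      ByteArrayOutputStream captured = new ByteArrayOutputStream();
      // Drain on a separate thread so the timeout below still applies
      // when the child hangs without closing its output.
      Thread drainer =
          new Thread(() -> drain(process.getInputStream(), captured, maxBytes), "output-drainer");
      drainer.setDaemon(true);
      drainer.start();

      if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
        process.destroyForcibly();
      }
      drainer.join(TimeUnit.SECONDS.toMillis(5));
      return new String(captured.toByteArray(), StandardCharsets.UTF_8);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting for child process", e);
    }
  }

  private static void drain(InputStream in, ByteArrayOutputStream sink, int maxBytes) {
    byte[] buffer = new byte[8192];
    try (InputStream stream = in) {
      int read;
      while ((read = stream.read(buffer)) != -1) {
        int room = maxBytes - sink.size();
        if (room > 0) {
          sink.write(buffer, 0, Math.min(read, room));
        }
        // Keep reading past the cap so the child never stalls on a full pipe.
      }
    } catch (IOException ignored) {
      // The stream can fail once the process has been destroyed; keep what we have.
    }
  }
}
```

With a resolved jcmd path and a PID, a call might look like DrainingRunner.run(Arrays.asList(jcmdPath, pid, "Thread.print"), 1 << 20, 30). The cap bounds memory use, and reading continues past it on purpose, because stopping the reads would reintroduce the pipe-buffer stall.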

Contributor Checklist

  • Title is imperative ("Capture deeper diagnostics for log-injection smoke test flake")
  • Labels: type: bug, comp: testing, tag: ai generated, tag: no release notes, tag: flaky test, tag: diagnostics
  • No public documentation impact (test-only change)
  • No CODEOWNERS change (no source file additions or moves)

Jira ticket: N/A — test infrastructure / flake investigation

🤖 Generated with Claude Code

PR #11075 narrowed the flake to "process alive, RC polling, captured
stdout empty" but couldn't pinpoint where the JVM was wedged. This
extends the failure diagnostic to capture every signal needed to
discriminate the remaining hypotheses in a single failure.

On the failure path of `waitForTraceCountAlive`:
- SIGQUIT to the live process so its thread dump flows through the
  captured stdout pipe (when the OutputThreads writer is alive)
- bounded tail of captured stdout via RandomAccessFile (avoids loading
  a now-thread-dump-inflated log into memory)
- bounded tail of the application logger file (logback/log4j2/JBoss
  target via dd.test.logfile) — distinguishes "app never ran" from
  "app ran but stdout pipe lost output"
- enumeration of the smoke-output ThreadGroup so we can see if the
  OutputThreads writer thread is dead
- jcmd Thread.print fallback that drains stdout incrementally (codex
  review caught the pipe-buffer deadlock the naive form would hit)
- previously-swallowed rcClientDecodingFailure surfaced
- launch command line included so we don't have to dig through the
  Gradle test reports

Process.pid() is guarded for Java 8, and resolveJcmdPath probes both
JRE and JDK layouts.

tag: no release note
tag: ai generated

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
bm1549 added the labels type: bug, comp: testing, tag: no release notes, tag: flaky test, tag: diagnostics, and tag: ai generated on Apr 27, 2026

pr-commenter Bot commented Apr 27, 2026

Benchmarks

Startup

Parameters

| | Baseline | Candidate |
|---|---|---|
| baseline_or_candidate | baseline | candidate |
| git_branch | master | brian.marks/deep-diagnose-log-injection-flake |
| git_commit_date | 1777299542 | 1777300921 |
| git_commit_sha | 3ba8d13 | cb27fd6 |
| release_version | 1.62.0-SNAPSHOT~3ba8d13c73 | 1.62.0-SNAPSHOT~cb27fd64db |

See matching parameters

| | Baseline | Candidate |
|---|---|---|
| application | insecure-bank | insecure-bank |
| ci_job_date | 1777302732 | 1777302732 |
| ci_job_id | 1633902846 | 1633902846 |
| ci_pipeline_id | 109918212 | 109918212 |
| cpu_model | Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz | Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz |
| kernel_version | Linux runner-zfyrx7zua-project-304-concurrent-0-eqq4zk13 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux | Linux runner-zfyrx7zua-project-304-concurrent-0-eqq4zk13 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux |
| module | Agent | Agent |
| parent | None | None |

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 61 metrics, 10 unstable metrics.

Startup time reports for insecure-bank
gantt
    title insecure-bank - global startup overhead: candidate=1.62.0-SNAPSHOT~cb27fd64db, baseline=1.62.0-SNAPSHOT~3ba8d13c73

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.065 s) : 0, 1065486
Total [baseline] (8.803 s) : 0, 8803278
Agent [candidate] (1.063 s) : 0, 1062711
Total [candidate] (8.86 s) : 0, 8859633
section iast
Agent [baseline] (1.249 s) : 0, 1248750
Total [baseline] (9.539 s) : 0, 9538927
Agent [candidate] (1.251 s) : 0, 1250835
Total [candidate] (9.523 s) : 0, 9522530
  • baseline results

| Module | Variant | Duration | Δ tracing |
|---|---|---|---|
| Agent | tracing | 1.065 s | - |
| Agent | iast | 1.249 s | 183.264 ms (17.2%) |
| Total | tracing | 8.803 s | - |
| Total | iast | 9.539 s | 735.649 ms (8.4%) |

  • candidate results

| Module | Variant | Duration | Δ tracing |
|---|---|---|---|
| Agent | tracing | 1.063 s | - |
| Agent | iast | 1.251 s | 188.124 ms (17.7%) |
| Total | tracing | 8.86 s | - |
| Total | iast | 9.523 s | 662.897 ms (7.5%) |

gantt
    title insecure-bank - break down per module: candidate=1.62.0-SNAPSHOT~cb27fd64db, baseline=1.62.0-SNAPSHOT~3ba8d13c73

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.232 ms) : 0, 1232
crashtracking [candidate] (1.216 ms) : 0, 1216
BytebuddyAgent [baseline] (637.21 ms) : 0, 637210
BytebuddyAgent [candidate] (635.013 ms) : 0, 635013
AgentMeter [baseline] (29.477 ms) : 0, 29477
AgentMeter [candidate] (29.656 ms) : 0, 29656
GlobalTracer [baseline] (248.826 ms) : 0, 248826
GlobalTracer [candidate] (249.734 ms) : 0, 249734
AppSec [baseline] (32.889 ms) : 0, 32889
AppSec [candidate] (33.081 ms) : 0, 33081
Debugger [baseline] (59.897 ms) : 0, 59897
Debugger [candidate] (60.297 ms) : 0, 60297
Remote Config [baseline] (596.576 µs) : 0, 597
Remote Config [candidate] (603.398 µs) : 0, 603
Telemetry [baseline] (9.65 ms) : 0, 9650
Telemetry [candidate] (8.942 ms) : 0, 8942
Flare Poller [baseline] (9.752 ms) : 0, 9752
Flare Poller [candidate] (8.257 ms) : 0, 8257
section iast
crashtracking [baseline] (1.243 ms) : 0, 1243
crashtracking [candidate] (1.232 ms) : 0, 1232
BytebuddyAgent [baseline] (828.4 ms) : 0, 828400
BytebuddyAgent [candidate] (830.064 ms) : 0, 830064
AgentMeter [baseline] (11.41 ms) : 0, 11410
AgentMeter [candidate] (11.418 ms) : 0, 11418
GlobalTracer [baseline] (238.347 ms) : 0, 238347
GlobalTracer [candidate] (239.246 ms) : 0, 239246
AppSec [baseline] (30.359 ms) : 0, 30359
AppSec [candidate] (32.29 ms) : 0, 32290
Debugger [baseline] (62.262 ms) : 0, 62262
Debugger [candidate] (62.294 ms) : 0, 62294
Remote Config [baseline] (520.643 µs) : 0, 521
Remote Config [candidate] (524.722 µs) : 0, 525
Telemetry [baseline] (7.628 ms) : 0, 7628
Telemetry [candidate] (7.66 ms) : 0, 7660
Flare Poller [baseline] (3.419 ms) : 0, 3419
Flare Poller [candidate] (3.307 ms) : 0, 3307
IAST [baseline] (29.046 ms) : 0, 29046
IAST [candidate] (26.599 ms) : 0, 26599
Startup time reports for petclinic
gantt
    title petclinic - global startup overhead: candidate=1.62.0-SNAPSHOT~cb27fd64db, baseline=1.62.0-SNAPSHOT~3ba8d13c73

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.064 s) : 0, 1064282
Total [baseline] (11.05 s) : 0, 11050284
Agent [candidate] (1.069 s) : 0, 1069466
Total [candidate] (11.075 s) : 0, 11075238
section appsec
Agent [baseline] (1.273 s) : 0, 1273056
Total [baseline] (11.067 s) : 0, 11066925
Agent [candidate] (1.268 s) : 0, 1267613
Total [candidate] (11.046 s) : 0, 11046423
section iast
Agent [baseline] (1.267 s) : 0, 1267360
Total [baseline] (11.331 s) : 0, 11331229
Agent [candidate] (1.243 s) : 0, 1243230
Total [candidate] (11.251 s) : 0, 11251021
section profiling
Agent [baseline] (1.185 s) : 0, 1185397
Total [baseline] (11.052 s) : 0, 11052359
Agent [candidate] (1.189 s) : 0, 1189067
Total [candidate] (11.09 s) : 0, 11089924
  • baseline results

| Module | Variant | Duration | Δ tracing |
|---|---|---|---|
| Agent | tracing | 1.064 s | - |
| Agent | appsec | 1.273 s | 208.775 ms (19.6%) |
| Agent | iast | 1.267 s | 203.079 ms (19.1%) |
| Agent | profiling | 1.185 s | 121.115 ms (11.4%) |
| Total | tracing | 11.05 s | - |
| Total | appsec | 11.067 s | 16.641 ms (0.2%) |
| Total | iast | 11.331 s | 280.945 ms (2.5%) |
| Total | profiling | 11.052 s | 2.076 ms (0.0%) |

  • candidate results

| Module | Variant | Duration | Δ tracing |
|---|---|---|---|
| Agent | tracing | 1.069 s | - |
| Agent | appsec | 1.268 s | 198.147 ms (18.5%) |
| Agent | iast | 1.243 s | 173.764 ms (16.2%) |
| Agent | profiling | 1.189 s | 119.601 ms (11.2%) |
| Total | tracing | 11.075 s | - |
| Total | appsec | 11.046 s | -28.815 ms (-0.3%) |
| Total | iast | 11.251 s | 175.783 ms (1.6%) |
| Total | profiling | 11.09 s | 14.685 ms (0.1%) |

gantt
    title petclinic - break down per module: candidate=1.62.0-SNAPSHOT~cb27fd64db, baseline=1.62.0-SNAPSHOT~3ba8d13c73

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.217 ms) : 0, 1217
crashtracking [candidate] (1.227 ms) : 0, 1227
BytebuddyAgent [baseline] (636.454 ms) : 0, 636454
BytebuddyAgent [candidate] (639.569 ms) : 0, 639569
AgentMeter [baseline] (29.454 ms) : 0, 29454
AgentMeter [candidate] (29.684 ms) : 0, 29684
GlobalTracer [baseline] (248.773 ms) : 0, 248773
GlobalTracer [candidate] (250.116 ms) : 0, 250116
AppSec [baseline] (32.755 ms) : 0, 32755
AppSec [candidate] (32.85 ms) : 0, 32850
Debugger [baseline] (60.523 ms) : 0, 60523
Debugger [candidate] (60.867 ms) : 0, 60867
Remote Config [baseline] (592.92 µs) : 0, 593
Remote Config [candidate] (593.866 µs) : 0, 594
Telemetry [baseline] (8.111 ms) : 0, 8111
Telemetry [candidate] (9.6 ms) : 0, 9600
Flare Poller [baseline] (10.491 ms) : 0, 10491
Flare Poller [candidate] (8.98 ms) : 0, 8980
section appsec
crashtracking [baseline] (1.228 ms) : 0, 1228
crashtracking [candidate] (1.222 ms) : 0, 1222
BytebuddyAgent [baseline] (680.288 ms) : 0, 680288
BytebuddyAgent [candidate] (677.635 ms) : 0, 677635
AgentMeter [baseline] (12.306 ms) : 0, 12306
AgentMeter [candidate] (12.352 ms) : 0, 12352
GlobalTracer [baseline] (250.807 ms) : 0, 250807
GlobalTracer [candidate] (250.64 ms) : 0, 250640
AppSec [baseline] (185.843 ms) : 0, 185843
AppSec [candidate] (186.022 ms) : 0, 186022
Debugger [baseline] (65.329 ms) : 0, 65329
Debugger [candidate] (64.879 ms) : 0, 64879
Remote Config [baseline] (582.616 µs) : 0, 583
Remote Config [candidate] (579.585 µs) : 0, 580
Telemetry [baseline] (7.674 ms) : 0, 7674
Telemetry [candidate] (7.622 ms) : 0, 7622
Flare Poller [baseline] (5.69 ms) : 0, 5690
Flare Poller [candidate] (5.626 ms) : 0, 5626
IAST [baseline] (24.826 ms) : 0, 24826
IAST [candidate] (24.913 ms) : 0, 24913
section iast
crashtracking [baseline] (1.25 ms) : 0, 1250
crashtracking [candidate] (1.229 ms) : 0, 1229
BytebuddyAgent [baseline] (843.307 ms) : 0, 843307
BytebuddyAgent [candidate] (823.126 ms) : 0, 823126
AgentMeter [baseline] (11.524 ms) : 0, 11524
AgentMeter [candidate] (11.309 ms) : 0, 11309
GlobalTracer [baseline] (239.584 ms) : 0, 239584
GlobalTracer [candidate] (237.373 ms) : 0, 237373
AppSec [baseline] (29.153 ms) : 0, 29153
AppSec [candidate] (31.34 ms) : 0, 31340
Debugger [baseline] (63.783 ms) : 0, 63783
Debugger [candidate] (62.832 ms) : 0, 62832
Remote Config [baseline] (537.124 µs) : 0, 537
Remote Config [candidate] (523.253 µs) : 0, 523
Telemetry [baseline] (7.834 ms) : 0, 7834
Telemetry [candidate] (7.691 ms) : 0, 7691
Flare Poller [baseline] (3.473 ms) : 0, 3473
Flare Poller [candidate] (3.468 ms) : 0, 3468
IAST [baseline] (28.891 ms) : 0, 28891
IAST [candidate] (26.647 ms) : 0, 26647
section profiling
crashtracking [baseline] (1.181 ms) : 0, 1181
crashtracking [candidate] (1.173 ms) : 0, 1173
BytebuddyAgent [baseline] (691.877 ms) : 0, 691877
BytebuddyAgent [candidate] (694.88 ms) : 0, 694880
AgentMeter [baseline] (8.868 ms) : 0, 8868
AgentMeter [candidate] (8.912 ms) : 0, 8912
GlobalTracer [baseline] (207.36 ms) : 0, 207360
GlobalTracer [candidate] (208.002 ms) : 0, 208002
AppSec [baseline] (32.742 ms) : 0, 32742
AppSec [candidate] (32.576 ms) : 0, 32576
Debugger [baseline] (65.843 ms) : 0, 65843
Debugger [candidate] (65.657 ms) : 0, 65657
Remote Config [baseline] (568.088 µs) : 0, 568
Remote Config [candidate] (576.076 µs) : 0, 576
Telemetry [baseline] (7.801 ms) : 0, 7801
Telemetry [candidate] (7.765 ms) : 0, 7765
Flare Poller [baseline] (3.557 ms) : 0, 3557
Flare Poller [candidate] (3.519 ms) : 0, 3519
ProfilingAgent [baseline] (93.901 ms) : 0, 93901
ProfilingAgent [candidate] (94.64 ms) : 0, 94640
Profiling [baseline] (94.466 ms) : 0, 94466
Profiling [candidate] (95.193 ms) : 0, 95193

Load

Parameters

| | Baseline | Candidate |
|---|---|---|
| baseline_or_candidate | baseline | candidate |
| git_branch | master | brian.marks/deep-diagnose-log-injection-flake |
| git_commit_date | 1777299542 | 1777300921 |
| git_commit_sha | 3ba8d13 | cb27fd6 |
| release_version | 1.62.0-SNAPSHOT~3ba8d13c73 | 1.62.0-SNAPSHOT~cb27fd64db |

See matching parameters

| | Baseline | Candidate |
|---|---|---|
| application | insecure-bank | insecure-bank |
| ci_job_date | 1777303227 | 1777303227 |
| ci_job_id | 1633902848 | 1633902848 |
| ci_pipeline_id | 109918212 | 109918212 |
| cpu_model | Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz | Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz |
| kernel_version | Linux runner-zfyrx7zua-project-304-concurrent-1-iesexfse 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux | Linux runner-zfyrx7zua-project-304-concurrent-1-iesexfse 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux |

Summary

Found 0 performance improvements and 3 performance regressions! Performance is the same for 17 metrics, 16 unstable metrics.

| scenario | Δ mean agg_http_req_duration_p50 | Δ mean agg_http_req_duration_p95 | Δ mean throughput | candidate mean agg_http_req_duration_p50 | candidate mean agg_http_req_duration_p95 | candidate mean throughput | baseline mean agg_http_req_duration_p50 | baseline mean agg_http_req_duration_p95 | baseline mean throughput |
|---|---|---|---|---|---|---|---|---|---|
| scenario:load:insecure-bank:iast:high_load | worse [+58.299µs; +181.801µs] or [+2.292%; +7.148%] | unsure [+85.501µs; +549.559µs] or [+1.150%; +7.391%] | unstable [-224.169op/s; +86.919op/s] or [-15.885%; +6.159%] | 2.663ms | 7.753ms | 1342.594op/s | 2.543ms | 7.435ms | 1411.219op/s |
| scenario:load:petclinic:tracing:high_load | worse [+417.898µs; +1271.450µs] or [+2.390%; +7.273%] | unsure [+46.447µs; +1236.944µs] or [+0.161%; +4.290%] | unstable [-33.962op/s; +13.525op/s] or [-13.042%; +5.194%] | 18.327ms | 29.474ms | 250.188op/s | 17.483ms | 28.833ms | 260.406op/s |
| scenario:load:petclinic:no_agent:high_load | worse [+1.165ms; +2.446ms] or [+6.917%; +14.518%] | unstable [+1.148ms; +4.062ms] or [+4.032%; +14.264%] | unstable [-53.614op/s; -2.074op/s] or [-19.806%; -0.766%] | 18.651ms | 31.082ms | 242.844op/s | 16.845ms | 28.477ms | 270.688op/s |

Request duration reports for petclinic
gantt
    title petclinic - request duration [CI 0.99] : candidate=1.62.0-SNAPSHOT~cb27fd64db, baseline=1.62.0-SNAPSHOT~3ba8d13c73
    dateFormat X
    axisFormat %s
section baseline
no_agent (17.236 ms) : 17066, 17406
.   : milestone, 17236,
appsec (18.598 ms) : 18409, 18787
.   : milestone, 18598,
code_origins (17.996 ms) : 17818, 18175
.   : milestone, 17996,
iast (17.98 ms) : 17803, 18158
.   : milestone, 17980,
profiling (18.902 ms) : 18715, 19090
.   : milestone, 18902,
tracing (17.917 ms) : 17740, 18094
.   : milestone, 17917,
section candidate
no_agent (19.222 ms) : 19024, 19421
.   : milestone, 19222,
appsec (18.875 ms) : 18683, 19066
.   : milestone, 18875,
code_origins (17.901 ms) : 17723, 18078
.   : milestone, 17901,
iast (17.749 ms) : 17575, 17923
.   : milestone, 17749,
profiling (18.676 ms) : 18491, 18862
.   : milestone, 18676,
tracing (18.651 ms) : 18465, 18837
.   : milestone, 18651,
  • baseline results

| Variant | Request duration [CI 0.99] | Δ no_agent |
|---|---|---|
| no_agent | 17.236 ms [17.066 ms, 17.406 ms] | - |
| appsec | 18.598 ms [18.409 ms, 18.787 ms] | 1.363 ms (7.9%) |
| code_origins | 17.996 ms [17.818 ms, 18.175 ms] | 760.631 µs (4.4%) |
| iast | 17.98 ms [17.803 ms, 18.158 ms] | 744.377 µs (4.3%) |
| profiling | 18.902 ms [18.715 ms, 19.09 ms] | 1.666 ms (9.7%) |
| tracing | 17.917 ms [17.74 ms, 18.094 ms] | 681.496 µs (4.0%) |

  • candidate results

| Variant | Request duration [CI 0.99] | Δ no_agent |
|---|---|---|
| no_agent | 19.222 ms [19.024 ms, 19.421 ms] | - |
| appsec | 18.875 ms [18.683 ms, 19.066 ms] | -347.747 µs (-1.8%) |
| code_origins | 17.901 ms [17.723 ms, 18.078 ms] | -1.322 ms (-6.9%) |
| iast | 17.749 ms [17.575 ms, 17.923 ms] | -1.473 ms (-7.7%) |
| profiling | 18.676 ms [18.491 ms, 18.862 ms] | -545.847 µs (-2.8%) |
| tracing | 18.651 ms [18.465 ms, 18.837 ms] | -571.161 µs (-3.0%) |

Request duration reports for insecure-bank
gantt
    title insecure-bank - request duration [CI 0.99] : candidate=1.62.0-SNAPSHOT~cb27fd64db, baseline=1.62.0-SNAPSHOT~3ba8d13c73
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.276 ms) : 1264, 1289
.   : milestone, 1276,
iast (3.242 ms) : 3206, 3279
.   : milestone, 3242,
iast_FULL (6.065 ms) : 6003, 6127
.   : milestone, 6065,
iast_GLOBAL (3.601 ms) : 3541, 3660
.   : milestone, 3601,
profiling (2.007 ms) : 1991, 2023
.   : milestone, 2007,
tracing (1.932 ms) : 1916, 1949
.   : milestone, 1932,
section candidate
no_agent (1.249 ms) : 1236, 1261
.   : milestone, 1249,
iast (3.411 ms) : 3365, 3456
.   : milestone, 3411,
iast_FULL (6.212 ms) : 6148, 6276
.   : milestone, 6212,
iast_GLOBAL (3.773 ms) : 3708, 3839
.   : milestone, 3773,
profiling (2.1 ms) : 2082, 2119
.   : milestone, 2100,
tracing (1.895 ms) : 1879, 1912
.   : milestone, 1895,
  • baseline results

| Variant | Request duration [CI 0.99] | Δ no_agent |
|---|---|---|
| no_agent | 1.276 ms [1.264 ms, 1.289 ms] | - |
| iast | 3.242 ms [3.206 ms, 3.279 ms] | 1.966 ms (154.0%) |
| iast_FULL | 6.065 ms [6.003 ms, 6.127 ms] | 4.788 ms (375.2%) |
| iast_GLOBAL | 3.601 ms [3.541 ms, 3.66 ms] | 2.325 ms (182.1%) |
| profiling | 2.007 ms [1.991 ms, 2.023 ms] | 730.64 µs (57.2%) |
| tracing | 1.932 ms [1.916 ms, 1.949 ms] | 656.185 µs (51.4%) |

  • candidate results

| Variant | Request duration [CI 0.99] | Δ no_agent |
|---|---|---|
| no_agent | 1.249 ms [1.236 ms, 1.261 ms] | - |
| iast | 3.411 ms [3.365 ms, 3.456 ms] | 2.162 ms (173.2%) |
| iast_FULL | 6.212 ms [6.148 ms, 6.276 ms] | 4.963 ms (397.5%) |
| iast_GLOBAL | 3.773 ms [3.708 ms, 3.839 ms] | 2.525 ms (202.2%) |
| profiling | 2.1 ms [2.082 ms, 2.119 ms] | 851.649 µs (68.2%) |
| tracing | 1.895 ms [1.879 ms, 1.912 ms] | 646.573 µs (51.8%) |

Dacapo

Parameters

| | Baseline | Candidate |
|---|---|---|
| baseline_or_candidate | baseline | candidate |
| git_branch | master | brian.marks/deep-diagnose-log-injection-flake |
| git_commit_date | 1777299542 | 1777300921 |
| git_commit_sha | 3ba8d13 | cb27fd6 |
| release_version | 1.62.0-SNAPSHOT~3ba8d13c73 | 1.62.0-SNAPSHOT~cb27fd64db |

See matching parameters

| | Baseline | Candidate |
|---|---|---|
| application | biojava | biojava |
| ci_job_date | 1777302984 | 1777302984 |
| ci_job_id | 1633902849 | 1633902849 |
| ci_pipeline_id | 109918212 | 109918212 |
| cpu_model | Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz | Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz |
| kernel_version | Linux runner-zfyrx7zua-project-304-concurrent-1-673ur2pq 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux | Linux runner-zfyrx7zua-project-304-concurrent-1-673ur2pq 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux |

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 11 metrics, 1 unstable metrics.

Execution time for tomcat
gantt
    title tomcat - execution time [CI 0.99] : candidate=1.62.0-SNAPSHOT~cb27fd64db, baseline=1.62.0-SNAPSHOT~3ba8d13c73
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.486 ms) : 1474, 1498
.   : milestone, 1486,
appsec (3.785 ms) : 3567, 4004
.   : milestone, 3785,
iast (2.289 ms) : 2219, 2360
.   : milestone, 2289,
iast_GLOBAL (2.332 ms) : 2260, 2403
.   : milestone, 2332,
profiling (2.116 ms) : 2060, 2172
.   : milestone, 2116,
tracing (2.082 ms) : 2028, 2136
.   : milestone, 2082,
section candidate
no_agent (1.489 ms) : 1478, 1501
.   : milestone, 1489,
appsec (3.825 ms) : 3602, 4049
.   : milestone, 3825,
iast (2.295 ms) : 2223, 2367
.   : milestone, 2295,
iast_GLOBAL (2.336 ms) : 2264, 2408
.   : milestone, 2336,
profiling (2.115 ms) : 2059, 2172
.   : milestone, 2115,
tracing (2.086 ms) : 2031, 2140
.   : milestone, 2086,
  • baseline results

| Variant | Execution Time [CI 0.99] | Δ no_agent |
|---|---|---|
| no_agent | 1.486 ms [1.474 ms, 1.498 ms] | - |
| appsec | 3.785 ms [3.567 ms, 4.004 ms] | 2.3 ms (154.8%) |
| iast | 2.289 ms [2.219 ms, 2.36 ms] | 803.341 µs (54.1%) |
| iast_GLOBAL | 2.332 ms [2.26 ms, 2.403 ms] | 845.639 µs (56.9%) |
| profiling | 2.116 ms [2.06 ms, 2.172 ms] | 630.116 µs (42.4%) |
| tracing | 2.082 ms [2.028 ms, 2.136 ms] | 596.084 µs (40.1%) |

  • candidate results

| Variant | Execution Time [CI 0.99] | Δ no_agent |
|---|---|---|
| no_agent | 1.489 ms [1.478 ms, 1.501 ms] | - |
| appsec | 3.825 ms [3.602 ms, 4.049 ms] | 2.336 ms (156.9%) |
| iast | 2.295 ms [2.223 ms, 2.367 ms] | 806.008 µs (54.1%) |
| iast_GLOBAL | 2.336 ms [2.264 ms, 2.408 ms] | 846.668 µs (56.9%) |
| profiling | 2.115 ms [2.059 ms, 2.172 ms] | 626.152 µs (42.0%) |
| tracing | 2.086 ms [2.031 ms, 2.14 ms] | 596.656 µs (40.1%) |

Execution time for biojava
gantt
    title biojava - execution time [CI 0.99] : candidate=1.62.0-SNAPSHOT~cb27fd64db, baseline=1.62.0-SNAPSHOT~3ba8d13c73
    dateFormat X
    axisFormat %s
section baseline
no_agent (15.547 s) : 15547000, 15547000
.   : milestone, 15547000,
appsec (14.836 s) : 14836000, 14836000
.   : milestone, 14836000,
iast (18.438 s) : 18438000, 18438000
.   : milestone, 18438000,
iast_GLOBAL (17.854 s) : 17854000, 17854000
.   : milestone, 17854000,
profiling (15.024 s) : 15024000, 15024000
.   : milestone, 15024000,
tracing (14.944 s) : 14944000, 14944000
.   : milestone, 14944000,
section candidate
no_agent (15.429 s) : 15429000, 15429000
.   : milestone, 15429000,
appsec (14.921 s) : 14921000, 14921000
.   : milestone, 14921000,
iast (18.821 s) : 18821000, 18821000
.   : milestone, 18821000,
iast_GLOBAL (17.939 s) : 17939000, 17939000
.   : milestone, 17939000,
profiling (15.044 s) : 15044000, 15044000
.   : milestone, 15044000,
tracing (15.115 s) : 15115000, 15115000
.   : milestone, 15115000,
  • baseline results

| Variant | Execution Time [CI 0.99] | Δ no_agent |
|---|---|---|
| no_agent | 15.547 s [15.547 s, 15.547 s] | - |
| appsec | 14.836 s [14.836 s, 14.836 s] | -711.0 ms (-4.6%) |
| iast | 18.438 s [18.438 s, 18.438 s] | 2.891 s (18.6%) |
| iast_GLOBAL | 17.854 s [17.854 s, 17.854 s] | 2.307 s (14.8%) |
| profiling | 15.024 s [15.024 s, 15.024 s] | -523.0 ms (-3.4%) |
| tracing | 14.944 s [14.944 s, 14.944 s] | -603.0 ms (-3.9%) |

  • candidate results

| Variant | Execution Time [CI 0.99] | Δ no_agent |
|---|---|---|
| no_agent | 15.429 s [15.429 s, 15.429 s] | - |
| appsec | 14.921 s [14.921 s, 14.921 s] | -508.0 ms (-3.3%) |
| iast | 18.821 s [18.821 s, 18.821 s] | 3.392 s (22.0%) |
| iast_GLOBAL | 17.939 s [17.939 s, 17.939 s] | 2.51 s (16.3%) |
| profiling | 15.044 s [15.044 s, 15.044 s] | -385.0 ms (-2.5%) |
| tracing | 15.115 s [15.115 s, 15.115 s] | -314.0 ms (-2.0%) |

Labels

comp: testing, tag: ai generated, tag: diagnostics, tag: flaky test, tag: no release notes, type: bug
