Enable live heap profiling by default on safe JVM versions#11039
Adds unified config key `profiling.liveheap.enabled` that auto-detects safe systems (`isJmethodIDSafe || isOldObjectSampleAvailable`) and enables live heap profiling by default. The ddprof native library is preferred, with JFR OldObjectSample as fallback. New `profiling.liveheap.jfr.enabled` replaces deprecated `profiling.heap.enabled` with context-aware deprecation warnings.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
/merge
The merge request has been interrupted because the build 106243977 took longer than expected. The current limit for the base branch 'master' is 120 minutes.
Benchmarks
Startup
Parameters
See matching parameters
Summary
Found 0 performance improvements and 0 performance regressions! Performance is the same for 56 metrics, 15 unstable metrics.
Startup time reports for insecure-bank
gantt
title insecure-bank - global startup overhead: candidate=1.61.0-SNAPSHOT~749172f660, baseline=1.61.0-SNAPSHOT~3123b191d9
dateFormat X
axisFormat %s
section tracing
Agent [baseline] (1.076 s) : 0, 1076144
Total [baseline] (8.889 s) : 0, 8888522
Agent [candidate] (1.055 s) : 0, 1054513
Total [candidate] (8.831 s) : 0, 8830568
section iast
Agent [baseline] (1.231 s) : 0, 1231188
Total [baseline] (9.555 s) : 0, 9554629
Agent [candidate] (1.228 s) : 0, 1228321
Total [candidate] (9.563 s) : 0, 9563025
gantt
title insecure-bank - break down per module: candidate=1.61.0-SNAPSHOT~749172f660, baseline=1.61.0-SNAPSHOT~3123b191d9
dateFormat X
axisFormat %s
section tracing
crashtracking [baseline] (1.262 ms) : 0, 1262
crashtracking [candidate] (1.224 ms) : 0, 1224
BytebuddyAgent [baseline] (645.072 ms) : 0, 645072
BytebuddyAgent [candidate] (631.623 ms) : 0, 631623
AgentMeter [baseline] (29.864 ms) : 0, 29864
AgentMeter [candidate] (29.229 ms) : 0, 29229
GlobalTracer [baseline] (252.474 ms) : 0, 252474
GlobalTracer [candidate] (248.366 ms) : 0, 248366
AppSec [baseline] (32.521 ms) : 0, 32521
AppSec [candidate] (32.038 ms) : 0, 32038
Debugger [baseline] (59.905 ms) : 0, 59905
Debugger [candidate] (59.116 ms) : 0, 59116
Remote Config [baseline] (626.741 µs) : 0, 627
Remote Config [candidate] (662.015 µs) : 0, 662
Telemetry [baseline] (8.212 ms) : 0, 8212
Telemetry [candidate] (8.084 ms) : 0, 8084
Flare Poller [baseline] (9.691 ms) : 0, 9691
Flare Poller [candidate] (8.107 ms) : 0, 8107
section iast
crashtracking [baseline] (1.245 ms) : 0, 1245
crashtracking [candidate] (1.227 ms) : 0, 1227
BytebuddyAgent [baseline] (807.3 ms) : 0, 807300
BytebuddyAgent [candidate] (805.534 ms) : 0, 805534
AgentMeter [baseline] (11.634 ms) : 0, 11634
AgentMeter [candidate] (11.613 ms) : 0, 11613
GlobalTracer [baseline] (240.178 ms) : 0, 240178
GlobalTracer [candidate] (239.552 ms) : 0, 239552
IAST [baseline] (26.1 ms) : 0, 26100
IAST [candidate] (25.943 ms) : 0, 25943
AppSec [baseline] (31.973 ms) : 0, 31973
AppSec [candidate] (32.592 ms) : 0, 32592
Debugger [baseline] (60.746 ms) : 0, 60746
Debugger [candidate] (59.7 ms) : 0, 59700
Remote Config [baseline] (1.147 ms) : 0, 1147
Remote Config [candidate] (631.329 µs) : 0, 631
Telemetry [baseline] (10.966 ms) : 0, 10966
Telemetry [candidate] (11.346 ms) : 0, 11346
Flare Poller [baseline] (3.461 ms) : 0, 3461
Flare Poller [candidate] (3.865 ms) : 0, 3865
Startup time reports for petclinic
gantt
title petclinic - global startup overhead: candidate=1.61.0-SNAPSHOT~749172f660, baseline=1.61.0-SNAPSHOT~3123b191d9
dateFormat X
axisFormat %s
section tracing
Agent [baseline] (1.055 s) : 0, 1055284
Total [baseline] (11.083 s) : 0, 11082800
Agent [candidate] (1.059 s) : 0, 1058579
Total [candidate] (11.096 s) : 0, 11096067
section appsec
Agent [baseline] (1.247 s) : 0, 1247436
Total [baseline] (11.09 s) : 0, 11089943
Agent [candidate] (1.247 s) : 0, 1246609
Total [candidate] (11.285 s) : 0, 11285329
section iast
Agent [baseline] (1.225 s) : 0, 1225496
Total [baseline] (11.282 s) : 0, 11282306
Agent [candidate] (1.227 s) : 0, 1227356
Total [candidate] (11.314 s) : 0, 11314374
section profiling
Agent [baseline] (1.186 s) : 0, 1185607
Total [baseline] (11.016 s) : 0, 11016050
Agent [candidate] (1.187 s) : 0, 1186582
Total [candidate] (11.175 s) : 0, 11175443
gantt
title petclinic - break down per module: candidate=1.61.0-SNAPSHOT~749172f660, baseline=1.61.0-SNAPSHOT~3123b191d9
dateFormat X
axisFormat %s
section tracing
crashtracking [baseline] (1.235 ms) : 0, 1235
crashtracking [candidate] (1.237 ms) : 0, 1237
BytebuddyAgent [baseline] (631.868 ms) : 0, 631868
BytebuddyAgent [candidate] (633.277 ms) : 0, 633277
AgentMeter [baseline] (29.228 ms) : 0, 29228
AgentMeter [candidate] (29.218 ms) : 0, 29218
GlobalTracer [baseline] (248.49 ms) : 0, 248490
GlobalTracer [candidate] (248.726 ms) : 0, 248726
AppSec [baseline] (31.926 ms) : 0, 31926
AppSec [candidate] (32.053 ms) : 0, 32053
Debugger [baseline] (59.775 ms) : 0, 59775
Debugger [candidate] (60.021 ms) : 0, 60021
Remote Config [baseline] (595.863 µs) : 0, 596
Remote Config [candidate] (616.573 µs) : 0, 617
Telemetry [baseline] (8.034 ms) : 0, 8034
Telemetry [candidate] (8.071 ms) : 0, 8071
Flare Poller [baseline] (8.04 ms) : 0, 8040
Flare Poller [candidate] (8.98 ms) : 0, 8980
section appsec
crashtracking [baseline] (1.224 ms) : 0, 1224
crashtracking [candidate] (1.218 ms) : 0, 1218
BytebuddyAgent [baseline] (661.162 ms) : 0, 661162
BytebuddyAgent [candidate] (660.669 ms) : 0, 660669
AgentMeter [baseline] (12.029 ms) : 0, 12029
AgentMeter [candidate] (12.098 ms) : 0, 12098
GlobalTracer [baseline] (249.102 ms) : 0, 249102
GlobalTracer [candidate] (248.829 ms) : 0, 248829
IAST [baseline] (24.584 ms) : 0, 24584
IAST [candidate] (24.529 ms) : 0, 24529
AppSec [baseline] (183.824 ms) : 0, 183824
AppSec [candidate] (183.628 ms) : 0, 183628
Debugger [baseline] (66.26 ms) : 0, 66260
Debugger [candidate] (66.303 ms) : 0, 66303
Remote Config [baseline] (615.448 µs) : 0, 615
Remote Config [candidate] (605.214 µs) : 0, 605
Telemetry [baseline] (8.658 ms) : 0, 8658
Telemetry [candidate] (8.789 ms) : 0, 8789
Flare Poller [baseline] (3.56 ms) : 0, 3560
Flare Poller [candidate] (3.617 ms) : 0, 3617
section iast
crashtracking [baseline] (1.234 ms) : 0, 1234
crashtracking [candidate] (1.23 ms) : 0, 1230
BytebuddyAgent [baseline] (802.259 ms) : 0, 802259
BytebuddyAgent [candidate] (802.47 ms) : 0, 802470
AgentMeter [baseline] (11.427 ms) : 0, 11427
AgentMeter [candidate] (11.489 ms) : 0, 11489
GlobalTracer [baseline] (239.493 ms) : 0, 239493
GlobalTracer [candidate] (240.294 ms) : 0, 240294
IAST [baseline] (25.812 ms) : 0, 25812
IAST [candidate] (25.89 ms) : 0, 25890
AppSec [baseline] (30.974 ms) : 0, 30974
AppSec [candidate] (30.308 ms) : 0, 30308
Debugger [baseline] (63.035 ms) : 0, 63035
Debugger [candidate] (64.229 ms) : 0, 64229
Remote Config [baseline] (1.144 ms) : 0, 1144
Remote Config [candidate] (549.763 µs) : 0, 550
Telemetry [baseline] (10.26 ms) : 0, 10260
Telemetry [candidate] (11.152 ms) : 0, 11152
Flare Poller [baseline] (3.614 ms) : 0, 3614
Flare Poller [candidate] (3.621 ms) : 0, 3621
section profiling
crashtracking [baseline] (1.186 ms) : 0, 1186
crashtracking [candidate] (1.188 ms) : 0, 1188
BytebuddyAgent [baseline] (692.371 ms) : 0, 692371
BytebuddyAgent [candidate] (691.139 ms) : 0, 691139
AgentMeter [baseline] (9.167 ms) : 0, 9167
AgentMeter [candidate] (9.155 ms) : 0, 9155
GlobalTracer [baseline] (207.499 ms) : 0, 207499
GlobalTracer [candidate] (208.251 ms) : 0, 208251
AppSec [baseline] (32.6 ms) : 0, 32600
AppSec [candidate] (32.853 ms) : 0, 32853
Debugger [baseline] (65.776 ms) : 0, 65776
Debugger [candidate] (66.164 ms) : 0, 66164
Remote Config [baseline] (574.093 µs) : 0, 574
Remote Config [candidate] (571.893 µs) : 0, 572
Telemetry [baseline] (7.855 ms) : 0, 7855
Telemetry [candidate] (7.866 ms) : 0, 7866
Flare Poller [baseline] (3.535 ms) : 0, 3535
Flare Poller [candidate] (3.642 ms) : 0, 3642
ProfilingAgent [baseline] (93.665 ms) : 0, 93665
ProfilingAgent [candidate] (94.533 ms) : 0, 94533
Profiling [baseline] (94.24 ms) : 0, 94240
Profiling [candidate] (95.12 ms) : 0, 95120
Load
Parameters
See matching parameters
Summary
Found 2 performance improvements and 2 performance regressions! Performance is the same for 14 metrics, 18 unstable metrics.
Request duration reports for petclinic
gantt
title petclinic - request duration [CI 0.99] : candidate=1.61.0-SNAPSHOT~749172f660, baseline=1.61.0-SNAPSHOT~3123b191d9
dateFormat X
axisFormat %s
section baseline
no_agent (18.492 ms) : 18303, 18681
. : milestone, 18492,
appsec (18.952 ms) : 18762, 19142
. : milestone, 18952,
code_origins (17.735 ms) : 17560, 17909
. : milestone, 17735,
iast (18.197 ms) : 18018, 18376
. : milestone, 18197,
profiling (19.341 ms) : 19145, 19537
. : milestone, 19341,
tracing (18.897 ms) : 18708, 19086
. : milestone, 18897,
section candidate
no_agent (17.641 ms) : 17461, 17821
. : milestone, 17641,
appsec (18.857 ms) : 18665, 19049
. : milestone, 18857,
code_origins (18.208 ms) : 18025, 18392
. : milestone, 18208,
iast (18.194 ms) : 18016, 18373
. : milestone, 18194,
profiling (18.508 ms) : 18323, 18693
. : milestone, 18508,
tracing (17.674 ms) : 17501, 17847
. : milestone, 17674,
Request duration reports for insecure-bank
gantt
title insecure-bank - request duration [CI 0.99] : candidate=1.61.0-SNAPSHOT~749172f660, baseline=1.61.0-SNAPSHOT~3123b191d9
dateFormat X
axisFormat %s
section baseline
no_agent (1.249 ms) : 1237, 1262
. : milestone, 1249,
iast (3.139 ms) : 3096, 3182
. : milestone, 3139,
iast_FULL (5.941 ms) : 5881, 6000
. : milestone, 5941,
iast_GLOBAL (3.501 ms) : 3447, 3555
. : milestone, 3501,
profiling (2.313 ms) : 2292, 2335
. : milestone, 2313,
tracing (1.982 ms) : 1964, 2001
. : milestone, 1982,
section candidate
no_agent (1.258 ms) : 1245, 1271
. : milestone, 1258,
iast (3.292 ms) : 3250, 3333
. : milestone, 3292,
iast_FULL (5.997 ms) : 5937, 6056
. : milestone, 5997,
iast_GLOBAL (3.564 ms) : 3511, 3618
. : milestone, 3564,
profiling (2.322 ms) : 2295, 2349
. : milestone, 2322,
tracing (1.931 ms) : 1915, 1947
. : milestone, 1931,
Dacapo
Parameters
See matching parameters
Summary
Found 0 performance improvements and 0 performance regressions! Performance is the same for 10 metrics, 2 unstable metrics.
Execution time for biojava
gantt
title biojava - execution time [CI 0.99] : candidate=1.61.0-SNAPSHOT~749172f660, baseline=1.61.0-SNAPSHOT~3123b191d9
dateFormat X
axisFormat %s
section baseline
no_agent (15.426 s) : 15426000, 15426000
. : milestone, 15426000,
appsec (14.509 s) : 14509000, 14509000
. : milestone, 14509000,
iast (18.258 s) : 18258000, 18258000
. : milestone, 18258000,
iast_GLOBAL (17.883 s) : 17883000, 17883000
. : milestone, 17883000,
profiling (15.026 s) : 15026000, 15026000
. : milestone, 15026000,
tracing (14.993 s) : 14993000, 14993000
. : milestone, 14993000,
section candidate
no_agent (15.564 s) : 15564000, 15564000
. : milestone, 15564000,
appsec (14.833 s) : 14833000, 14833000
. : milestone, 14833000,
iast (18.508 s) : 18508000, 18508000
. : milestone, 18508000,
iast_GLOBAL (17.99 s) : 17990000, 17990000
. : milestone, 17990000,
profiling (14.705 s) : 14705000, 14705000
. : milestone, 14705000,
tracing (14.945 s) : 14945000, 14945000
. : milestone, 14945000,
Execution time for tomcat
gantt
title tomcat - execution time [CI 0.99] : candidate=1.61.0-SNAPSHOT~749172f660, baseline=1.61.0-SNAPSHOT~3123b191d9
dateFormat X
axisFormat %s
section baseline
no_agent (1.488 ms) : 1476, 1499
. : milestone, 1488,
appsec (3.782 ms) : 3565, 4000
. : milestone, 3782,
iast (2.274 ms) : 2205, 2344
. : milestone, 2274,
iast_GLOBAL (2.319 ms) : 2249, 2389
. : milestone, 2319,
profiling (2.128 ms) : 2071, 2185
. : milestone, 2128,
tracing (2.107 ms) : 2052, 2161
. : milestone, 2107,
section candidate
no_agent (1.487 ms) : 1476, 1499
. : milestone, 1487,
appsec (3.825 ms) : 3604, 4047
. : milestone, 3825,
iast (2.276 ms) : 2206, 2346
. : milestone, 2276,
iast_GLOBAL (2.326 ms) : 2256, 2397
. : milestone, 2326,
profiling (2.511 ms) : 2298, 2724
. : milestone, 2511,
tracing (2.095 ms) : 2041, 2149
. : milestone, 2095,
/merge
This PR is rejected because it was updated
- On Java 8, neither ddprof MEMLEAK (requires Java 11+) nor JFR OldObjectSample is available; explicitly enabling `profiling.heap.enabled` now logs a warning instead of silently enabling an unsupported event
- `ddprofLikelyActive` heuristic uses `isJmethodIDSafe()` as the default for the liveheap flag (was hardcoded `true`), matching ddprof's own resolution and preventing false disablement of OldObjectSample on Java 11.0.12-11.0.22 and 17.0.3-17.0.10
- Fix `testHeapProfilerIsStillOverriddenThroughConfig` to expect `isOldObjectSampleAvailable()` instead of unconditional `true`
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
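To illustrate the second point of the commit message above, here is a minimal sketch of why a hardcoded `true` default could falsely disable OldObjectSample. The class name, method signatures, and the `legacyDefault` switch are hypothetical and exist only for this illustration; the real `ddprofLikelyActive` heuristic probably considers more conditions than shown. The version thresholds are the ones quoted in this PR, and treating them as per-release-line update numbers is an assumption.

```java
final class DdprofLikelyActiveSketch {
    // Thresholds quoted in this PR: 11.0.23+, 17.0.11+, 21.0.3+, 22+.
    static boolean isJmethodIDSafe(int major, int update) {
        if (major >= 22) return true;
        if (major == 21) return update >= 3;
        if (major == 17) return update >= 11;
        if (major == 11) return update >= 23;
        return false;
    }

    // Thresholds quoted in this PR: 11.0.12+, 17.0.3+, 18+.
    static boolean isOldObjectSampleAvailable(int major, int update) {
        if (major >= 18) return true;
        if (major == 17) return update >= 3;
        if (major == 11) return update >= 12;
        return false;
    }

    // liveheapFlag is null when profiling.ddprof.liveheap.enabled is unset.
    // legacyDefault=true models the old hardcoded default; false models the new one.
    static boolean ddprofLikelyActive(int major, int update, Boolean liveheapFlag, boolean legacyDefault) {
        boolean defaultValue = legacyDefault || isJmethodIDSafe(major, update);
        boolean liveheap = liveheapFlag != null ? liveheapFlag : defaultValue;
        return major >= 11 && liveheap;
    }

    // JFR OldObjectSample is used only when ddprof is not expected to handle live heap.
    static boolean oldObjectSampleOn(int major, int update, Boolean liveheapFlag, boolean legacyDefault) {
        return !ddprofLikelyActive(major, update, liveheapFlag, legacyDefault)
                && isOldObjectSampleAvailable(major, update);
    }

    public static void main(String[] args) {
        // Java 11.0.15: OldObjectSample is available, but ddprof MEMLEAK is not on by default.
        System.out.println("legacy default: " + oldObjectSampleOn(11, 15, null, true));
        System.out.println("new default:    " + oldObjectSampleOn(11, 15, null, false));
    }
}
```

With the legacy default, the sketch assumes ddprof is active on Java 11.0.15 and turns the JFR fallback off, so neither mechanism would collect live heap data; with the new default the fallback stays on.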
What Does This Do
Changes `profiling.heap.enabled` from a JFR-only flag (default `false`) to a unified live heap master switch that auto-detects safe systems and enables live heap profiling by default. The ddprof native library is prioritized, with JFR `jdk.OldObjectSample` as an automatic fallback when ddprof is not active.
Enablement Logic
flowchart TD
A["profiling.heap.enabled\n(unified master switch)\ndefault: isLiveHeapProfilingSafe()"]
A -->|"false"| B["ALL live heap OFF\n(ddprof + JFR)"]
A -->|"unset or true"| C{Split into two paths}
C --> D["ddprof path\nDatadogProfilerConfig"]
C --> E["JFR path\nOpenJdkController"]
D --> J11{"Java 11+?\n(JVMTI Allocation Sampler\nrequired)"}
J11 -->|no| J11NO["ddprof MEMLEAK OFF\n(JFR fallback if available)"]
J11 -->|yes| F["profiling.ddprof.liveheap.enabled"]
F -->|"false"| G["MEMLEAK OFF"]
F -->|"unset or true"\ndefault = isJmethodIDSafe| H{"JVM safe?\n11.0.23+, 17.0.11+,\n21.0.3+, 22+"}
H -->|yes| I[MEMLEAK ON]
H -->|no| JWARN["MEMLEAK ON\n⚠️ warn: not stable on this JVM"]
G --> GCHK{"JFR OldObjectSample\navailable?\n11.0.12+, 17.0.3+, 18+"}
GCHK -->|no| GWARN["⚠️ warn: live heap\nwill be inactive"]
GCHK -->|yes| GOK[JFR fallback active]
E --> EJCHK{"Java 11+ AND\nddprof likely active?\n(liveheap default = isJmethodIDSafe())"}
EJCHK -->|yes| NOKEY_OFF["OldObjectSample OFF\n(ddprof handles live heap)"]
EJCHK -->|no| NOKEY_JFP{"JFR OldObjectSample\navailable?\n11.0.12+, 17.0.3+, 18+"}
NOKEY_JFP -->|yes| O["OldObjectSample ON\n(JFR fallback)"]
NOKEY_JFP -->|no| P["OldObjectSample OFF\n⚠️ warn if explicitly enabled"]
style B fill:#f66,color:#fff
style G fill:#f66,color:#fff
style J11NO fill:#f66,color:#fff
style NOKEY_OFF fill:#69c,color:#fff
style P fill:#f66,color:#fff
style GWARN fill:#fa0,color:#fff
style I fill:#6b6,color:#fff
style JWARN fill:#fa0,color:#fff
style O fill:#6b6,color:#fff
style GOK fill:#6b6,color:#fff
Motivation
Live heap profiling provides valuable memory leak detection but was previously opt-in (`profiling.ddprof.liveheap.enabled` defaulted to `false`). Users had to know about and explicitly enable it.
This change makes it enabled by default on safe systems, matching the pattern already used by allocation profiling. The two live heap mechanisms (ddprof native and JFR OldObjectSample) are now unified under `profiling.heap.enabled` with automatic fallback.
PROF-14188
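The enablement flowchart in this description can be read as a small decision function. The following is only a sketch of that reading, not the code in `DatadogProfilerConfig` or `OpenJdkController`: the class, enum, and method names are hypothetical, the JVM capability checks are passed in as plain booleans, and the warning paths are reduced to a single `ON_WITH_WARNING` state (interpreted here as "explicitly enabled on a JVM that is not jmethodID-safe").

```java
final class LiveHeapResolutionSketch {
    enum Memleak { OFF, ON, ON_WITH_WARNING }

    static final class Outcome {
        final Memleak ddprofMemleak;
        final boolean jfrOldObjectSample;

        Outcome(Memleak ddprofMemleak, boolean jfrOldObjectSample) {
            this.ddprofMemleak = ddprofMemleak;
            this.jfrOldObjectSample = jfrOldObjectSample;
        }

        @Override
        public String toString() {
            return "ddprof MEMLEAK=" + ddprofMemleak
                    + ", JFR OldObjectSample=" + (jfrOldObjectSample ? "ON" : "OFF");
        }
    }

    // heapEnabled / ddprofLiveheap are null when the corresponding key is unset:
    //   profiling.heap.enabled, profiling.ddprof.liveheap.enabled
    static Outcome resolve(Boolean heapEnabled, Boolean ddprofLiveheap,
                           boolean java11Plus, boolean jmethodIDSafe,
                           boolean oldObjectSampleAvailable) {
        // Master switch: defaults to isLiveHeapProfilingSafe().
        boolean master = heapEnabled != null
                ? heapEnabled
                : (jmethodIDSafe || oldObjectSampleAvailable);
        if (!master) {
            return new Outcome(Memleak.OFF, false);
        }

        // ddprof path: the liveheap flag defaults to isJmethodIDSafe().
        boolean liveheap = ddprofLiveheap != null ? ddprofLiveheap : jmethodIDSafe;
        Memleak memleak;
        if (!java11Plus || !liveheap) {
            memleak = Memleak.OFF;
        } else {
            memleak = jmethodIDSafe ? Memleak.ON : Memleak.ON_WITH_WARNING;
        }

        // JFR path: OldObjectSample is a fallback when ddprof is not expected to be active.
        boolean ddprofLikelyActive = java11Plus && liveheap;
        boolean oldObjectSample = !ddprofLikelyActive && oldObjectSampleAvailable;
        return new Outcome(memleak, oldObjectSample);
    }

    public static void main(String[] args) {
        // Safe JVM, nothing configured.
        System.out.println(resolve(null, null, true, true, true));
        // JVM with OldObjectSample but without jmethodID safety, nothing configured.
        System.out.println(resolve(null, null, true, false, true));
        // Java 8 with profiling.heap.enabled=true: neither mechanism is available.
        System.out.println(resolve(Boolean.TRUE, null, false, false, false));
    }
}
```

Passing the capability checks as booleans keeps the sketch independent of how the agent actually detects JVM versions.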
Additional Notes
- `profiling.heap.enabled` semantics changed: was JFR-only (default `false`), now unified master switch (default auto-detect via `ProfilingSupport.isLiveHeapProfilingSafe()`)
- `isLiveHeapProfilingSafe()` = `isJmethodIDSafe() || isOldObjectSampleAvailable()` (moved to `ProfilingSupport`)
- `profiling.ddprof.liveheap.enabled` default changed from `false` to dynamic `isJmethodIDSafe()`
- `ddprofLikelyActive` heuristic in `OpenJdkController` uses `isJmethodIDSafe()` as the default for the liveheap flag, matching ddprof's own resolution logic; this prevents silently losing live heap on versions where OldObjectSample is available but ddprof won't actually enable MEMLEAK (e.g. Java 11.0.12–11.0.22, 17.0.3–17.0.10)
- On Java 8, `profiling.heap.enabled=true` logs a warning but has no effect
- `isEventEnabled` was called with `jdk.OldObjectSample#enabled`, which resulted in a double `#enabled` lookup, making the warning dead code
Contributor Checklist
- Assign the `type:` and (`comp:` or `inst:`) labels in addition to any other useful labels
- Avoid using `close`, `fix`, or any linking keywords when referencing an issue. Use `solves` instead, and assign the PR milestone to the issue
Jira ticket: PROF-14188
Note: Once your PR is ready to merge, add it to the merge queue by commenting `/merge`. `/merge -c` cancels the queue request. `/merge -f --reason "reason"` skips all merge queue checks; please use this judiciously, as some checks do not run at the PR-level. For more information, see this doc.
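As a footnote to the last item under Additional Notes, the double `#enabled` lookup can be reproduced with a toy settings map. This is only an illustration of the failure mode: the `isEventEnabled` signature and the map-based settings below are assumptions, not the actual `OpenJdkController` code.

```java
import java.util.HashMap;
import java.util.Map;

final class DoubleSuffixLookupSketch {
    // Hypothetical helper: appends "#enabled" to the event name before the lookup.
    static boolean isEventEnabled(Map<String, String> settings, String eventName) {
        // Boolean.parseBoolean(null) is false, so a missing key reads as "disabled".
        return Boolean.parseBoolean(settings.get(eventName + "#enabled"));
    }

    public static void main(String[] args) {
        Map<String, String> settings = new HashMap<>();
        settings.put("jdk.OldObjectSample#enabled", "true");

        // Buggy call: the key becomes "jdk.OldObjectSample#enabled#enabled", which is never set,
        // so any warning guarded by this check can never fire.
        System.out.println(isEventEnabled(settings, "jdk.OldObjectSample#enabled"));

        // Corrected call: pass the bare event name.
        System.out.println(isEventEnabled(settings, "jdk.OldObjectSample"));
    }
}
```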