Upgrade async-profiler to 4.0 in inferred spans feature#1872
Upgrade async-profiler to 4.0 in inferred spans feature#1872trask merged 1 commit intoopen-telemetry:mainfrom
Conversation
| String createStartCommand() { | ||
| StringBuilder startCommand = | ||
| new StringBuilder("start,jfr,clock=m,event=wall,cstack=n,interval=") | ||
| new StringBuilder("start,jfr,clock=m,event=wall,nobatch,cstack=n,interval=") |
There was a problem hiding this comment.
In 4.0, I found there is an attractive feature in stack walking async-profiler/async-profiler#1073, it can avoid using AsyncGetCallTrace in JDK that it may causes all kind of crash problems. For detail, can refer to async-profiler/async-profiler#795 Is there any plan to switch to that stack walking mode?
There was a problem hiding this comment.
I think it is best to by default stick with the default stack walking of async-profiler: Like the maintainer stated, async-profiler will eventually switch to the custom stack walking implementation by default, and then we'll switch our default to.
However, I don't see a reason why we shouldn't allow the stack walker to be configured on the inferred-spans extension. Feel free to open a PR adding such a config option if you think it would be useful.
4ae092c
Description:
Upgrades async-profiler used by the inferred-spans extension to 4.0. Supersedes #1840.
Async-profiler 4.0 introduced a new batching mode as the default for wall clock profiling (PR). This mode yields great efficiency improvements:
Previously, if a thread was blocked for e.g. a second, the profiler would periodically wake it up and collect the same stacktrace every time. With batching, the profiler got smarter: It still wakes the thread up, but then detects that the CPU-time of the thread hasn't changed, so it can omit fetching the same stacktrace again.
Unfortunately I don't think we can make use of this improvement yet, see async-profiler/async-profiler#1277. However, eventually we'll definitely add it, as I'm expecting great efficiency improvements.
Based on the outcome of the discussion in the linked issue, I'll open a issue in this repo to make sure we don't forget about it.
This feature has also changed how the samples are stored in the JFR files, causing the inferred spans extension to not find them anymore. To fix this for now, I've disabled the batching functionality, causing async-profiler to operate in the "old" wall clock profiling mode.
Testing:
Quick manual test plus our unit test coverage is already decent.