fix: make use of frame pointer unwinding on python3.15 #1404
Signed-off-by: hanshal101 <hanshalmehta10@gmail.com>
florianl
left a comment
Can you please add a coredump test?
// used by detectFramePointers, separated for unit testing.
func prologueHasFramePointer(machine elf.Machine, code []byte) bool {
switch machine {
case elf.EM_X86_64:
Please make use of asm/amd to correctly identify the prologue.
// unwinder can read the live _PyInterpreterFrame* directly from a callee-
// saved register at sample time, bypassing the PyThreadState/TLS lookup
// on the hot path.
u8 uses_frame_pointers;
Can we make this a bool?
- u8 uses_frame_pointers;
+ bool uses_frame_pointers;
fabled
left a comment
This does not really seem to use frame pointer unwinding; it is rather an optimization to get the Python frame from a native register if possible. But given that we cannot track the native register, I'm not sure whether this is a huge improvement. It adds quite a bit of complexity to avoid a few bpf_probe_read_user reads. I also doubt the value of frame pointer unwinding for us. I'll comment on this on the original issue.
candidate = (void *)ctx->regs[28];
#else
return ERR_PYTHON_BAD_FRAME_OBJECT_ADDR;
#endif
This test is not correct. It does not take into account signal handlers and other corner cases. You could probably just use state->return_address. If return_address is set, this is no longer a leaf frame and the registers are not recovered.
Fixes #1389
Closes #1389
Summary
Implements python/cpython#149201: CPython 3.15 ships with -fno-omit-frame-pointer by default, putting the live _PyInterpreterFrame* in a callee-saved register (r14 on x86-64, x28 on arm64) at every entry to _PyEval_EvalFrameDefault. When detected, the eBPF Python unwinder takes a fast path that reads that register directly and skips the pthread_getspecific(autoTLSKey) / _Py_tss_tstate lookup on the hot path.
Detection is by prologue inspection rather than by version check: Debian and a few other distros already enable the flag on earlier 3.x releases, so those builds get the benefit too.
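The fast path and its fallback described above can be modelled in user space. The following Go sketch is only an illustration of the control flow, not the eBPF code from this PR: userMemory, fExecutableOffset, frameFromRegister and currentFrame are hypothetical names, the offset is made up, and the "non-zero f_executable" check is an assumption about what the validation looks like.

```go
package main

import "fmt"

// userMemory is a toy stand-in for the target process address space.
// In eBPF the equivalent read would go through bpf_probe_read_user.
type userMemory map[uint64]uint64

// read returns the 8-byte word at addr, or false if addr is unmapped.
func (m userMemory) read(addr uint64) (uint64, bool) {
	v, ok := m[addr]
	return v, ok
}

// fExecutableOffset is a made-up offset of f_executable inside
// _PyInterpreterFrame; real offsets depend on the CPython build.
const fExecutableOffset uint64 = 0x20

// frameFromRegister models the fast path: treat the callee-saved register
// value as a frame pointer candidate and accept it only if f_executable is
// readable and non-zero (the non-zero check is an assumption).
func frameFromRegister(mem userMemory, usesFramePointers bool, reg uint64) (uint64, bool) {
	if !usesFramePointers || reg == 0 {
		return 0, false
	}
	exe, ok := mem.read(reg + fExecutableOffset)
	if !ok || exe == 0 {
		return 0, false
	}
	return reg, true
}

// currentFrame prefers the register fast path and falls back to the
// slow lookup (standing in for the thread-state/TLS path) on any miss.
func currentFrame(mem userMemory, usesFramePointers bool, reg uint64, slowPath func() uint64) uint64 {
	if frame, ok := frameFromRegister(mem, usesFramePointers, reg); ok {
		return frame
	}
	return slowPath()
}

func main() {
	mem := userMemory{0x1000 + fExecutableOffset: 0xdeadbeef}
	slow := func() uint64 { return 0x2000 }

	fmt.Printf("%#x\n", currentFrame(mem, true, 0x1000, slow))  // register accepted
	fmt.Printf("%#x\n", currentFrame(mem, true, 0x3000, slow))  // unreadable candidate
	fmt.Printf("%#x\n", currentFrame(mem, false, 0x1000, slow)) // flag not set
}
```

Note that this model does not address the review concern about leaf frames: a register value is only meaningful if the unwinder can still trust the sampled registers at that point.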
Changes
- interpreter/python/python.go: detectFramePointers() scans the first 32 bytes of _PyEval_EvalFrameDefault for the canonical FP setup (push %rbp; mov %rsp,%rbp on x86-64; mov x29,sp / 0x910003fd on arm64). Result is plumbed into pythonData and onward into the eBPF PyProcInfo.
- interpreter/python/python.go: folds 3.15 into the existing 3.14 offset case (_PyInterpreterFrame and PyThreadState layouts verified identical via DWARF on the upstream 3.15-rc build).
- support/ebpf/types.h + support/types.go: adds uses_frame_pointers flag to PyProcInfo (regenerated via cgo -godefs; consumes one byte of existing padding, struct size unchanged).
- support/ebpf/python_tracer.ebpf.c: new get_PyFrame_from_fp() helper reads ctx->r14 / ctx->regs[28], validates by reading f_executable, falls through to the existing slow path on any miss.
- interpreter/python/python_test.go: unit tests for prologueHasFramePointer (synthetic byte sequences for both archs, covering FP / no-FP / endbr64 / paciasp / too-short input), plus an env-var-gated integration test for real binaries.
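For readers unfamiliar with the prologue check, here is a minimal Go sketch of the byte-matching idea. It is not the code from this PR: hasFramePointerPrologue, scanWindow and the machine aliases are hypothetical names, and the exact matching rules (endbr64 skipped only at the very start on x86-64, any aligned word matching on arm64) are assumptions. As the review points out, a real instruction decoder would be more robust than raw byte comparison.

```go
package main

import (
	"bytes"
	"debug/elf"
	"encoding/binary"
	"fmt"
)

// Aliases so callers of this sketch do not need to import debug/elf.
const (
	machineAMD64 = elf.EM_X86_64
	machineARM64 = elf.EM_AARCH64
)

// scanWindow mirrors the 32-byte window mentioned in the description.
const scanWindow = 32

// hasFramePointerPrologue is a hypothetical stand-in for the PR's
// prologueHasFramePointer. It reports whether the start of a function
// contains the canonical frame pointer setup for the given machine.
func hasFramePointerPrologue(machine elf.Machine, code []byte) bool {
	if len(code) > scanWindow {
		code = code[:scanWindow]
	}
	switch machine {
	case elf.EM_X86_64:
		// Skip an optional endbr64 (CET landing pad) at function entry.
		code = bytes.TrimPrefix(code, []byte{0xf3, 0x0f, 0x1e, 0xfa})
		// push %rbp followed by either encoding of mov %rsp,%rbp.
		return bytes.HasPrefix(code, []byte{0x55, 0x48, 0x89, 0xe5}) ||
			bytes.HasPrefix(code, []byte{0x55, 0x48, 0x8b, 0xec})
	case elf.EM_AARCH64:
		// Instructions are fixed-width little-endian words, so walk them
		// and look for mov x29, sp (0x910003fd). Leading instructions such
		// as paciasp or stp x29, x30 are simply stepped over.
		for i := 0; i+4 <= len(code); i += 4 {
			if binary.LittleEndian.Uint32(code[i:]) == 0x910003fd {
				return true
			}
		}
	}
	return false
}

func main() {
	// endbr64; push %rbp; mov %rsp,%rbp
	x86 := []byte{0xf3, 0x0f, 0x1e, 0xfa, 0x55, 0x48, 0x89, 0xe5}
	// paciasp; stp x29, x30, [sp, #-16]!; mov x29, sp
	arm := []byte{
		0x3f, 0x23, 0x03, 0xd5,
		0xfd, 0x7b, 0xbf, 0xa9,
		0xfd, 0x03, 0x00, 0x91,
	}
	fmt.Println(hasFramePointerPrologue(machineAMD64, x86))
	fmt.Println(hasFramePointerPrologue(machineARM64, arm))
}
```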