fix: make use of frame pointer unwinding on python3.15 #1404

Open
hanshal101 wants to merge 5 commits into open-telemetry:main from hanshal101:hanshal101/python-fp-unwind
@hanshal101
Contributor

Fixes #1389
Closes #1389

Summary

Implements python/cpython#149201: CPython 3.15 ships with -fno-omit-frame-pointer by default, putting the live _PyInterpreterFrame* in a callee-saved register (r14 on x86-64, x28 on arm64) at every entry to _PyEval_EvalFrameDefault. When detected, the eBPF Python unwinder takes a fast path that reads that register directly and skips the pthread_getspecific(autoTLSKey) / _Py_tss_tstate lookup on the hot path.

Detection is by prologue inspection rather than a version check: Debian and a few other distros already enable the flag on earlier 3.x releases, so those builds get the benefit too.

Changes

  • interpreter/python/python.go: detectFramePointers() scans the first 32 bytes of _PyEval_EvalFrameDefault for the canonical FP setup (push %rbp; mov %rsp,%rbp on x86-64; mov x29,sp / 0x910003fd on arm64). Result is plumbed into pythonData and onward into the eBPF PyProcInfo.
  • interpreter/python/python.go: folds 3.15 into the existing 3.14 offset case (_PyInterpreterFrame and PyThreadState layouts verified identical via DWARF on the upstream 3.15-rc build).
  • support/ebpf/types.h + support/types.go: adds uses_frame_pointers flag to PyProcInfo (regenerated via cgo -godefs; consumes one byte of existing padding, struct size unchanged).
  • support/ebpf/python_tracer.ebpf.c: new get_PyFrame_from_fp() helper reads ctx->r14 / ctx->regs[28], validates by reading f_executable, falls through to the existing slow path on any miss.
  • interpreter/python/python_test.go: unit tests for prologueHasFramePointer (synthetic byte sequences for both archs, covering FP / no-FP / endbr64 / paciasp / too-short input), plus an env-var-gated integration test for real binaries.

Signed-off-by: hanshal101 <hanshalmehta10@gmail.com>
@hanshal101 hanshal101 requested review from a team as code owners May 9, 2026 18:16
Member

@florianl florianl left a comment

Can you please add a coredump test?

// used by detectFramePointers, separated for unit testing.
func prologueHasFramePointer(machine elf.Machine, code []byte) bool {
switch machine {
case elf.EM_X86_64:
Member

Please make use of asm/amd to correctly identify the prologue.

Comment thread support/ebpf/types.h
// unwinder can read the live _PyInterpreterFrame* directly from a callee-
// saved register at sample time, bypassing the PyThreadState/TLS lookup
// on the hot path.
u8 uses_frame_pointers;
Member

Can we make this a bool?

Suggested change
- u8 uses_frame_pointers;
+ bool uses_frame_pointers;

Contributor

@fabled fabled left a comment

This does not really seem to use frame pointer unwinding; it is rather an optimization that gets the Python frame from a native register when possible. But given that we cannot track the native register, I'm not sure whether this is a huge improvement. It adds quite a bit of complexity to avoid a few bpf_probe_read_user reads. I also doubt the value of frame pointer unwinding for us. I'll comment on this on the original issue.

candidate = (void *)ctx->regs[28];
#else
return ERR_PYTHON_BAD_FRAME_OBJECT_ADDR;
#endif
Contributor

This test is not correct. It does not take into account signal handlers and other corner cases. You could probably just use state->return_address: if return_address is set, this is no longer a leaf frame and the registers are not recovered.

Development

Successfully merging this pull request may close these issues.

python: use frame pointer unwinding

3 participants