Use abseil for readable POSIX stack traces in debug builds#28405

Open
tianleiwu wants to merge 7 commits into main from tlwu/20260507/stacktrace

Conversation

@tianleiwu
Contributor

@tianleiwu tianleiwu commented May 7, 2026

Description

Replace glibc backtrace()/backtrace_symbols() with abseil's absl::GetStackTrace()/absl::Symbolize() for POSIX/Linux debug builds, and add automatic addr2line resolution for file paths and line numbers. The previous implementation produced raw addresses requiring manual addr2line translation. The new implementation produces demangled function names with source locations directly in exception messages, with zero new dependencies.

Summary of Changes

Stack Trace Implementation

| File | Change |
| --- | --- |
| onnxruntime/core/platform/posix/stacktrace.cc | Replace glibc backtrace()/backtrace_symbols() with absl::GetStackTrace()/absl::Symbolize(). Use dladdr() + addr2line to resolve source file and line number for each frame. |
| onnxruntime/core/session/environment.cc | Add one-time absl::InitializeSymbolizer(nullptr) call via std::call_once in Environment::Initialize(). On Linux, nullptr works because abseil reads /proc/self/exe. |

Before vs After

Before (raw addresses requiring manual addr2line):

Stacktrace:
 /home/me/build/Debug/onnxruntime_test_all(+0x3f46cc) [0x559543faf6cc]
 /home/me/build/Debug/onnxruntime_test_all(+0x2bef04d) [0x559543faf6cc]

After (demangled function names + file:line):

Stacktrace:
onnxruntime::OpKernelContext::Output() at .../core/framework/op_kernel.cc:45
onnxruntime::Add<>::Compute() at .../core/providers/cpu/math/element_wise_ops.cc:596
onnxruntime::ExecuteKernel() at .../core/framework/sequential_executor.cc:535
onnxruntime::InferenceSession::Run() at .../core/session/inference_session.cc:3142
onnxruntime::test::...TestBody() at .../test/framework/execution_frame_test.cc:506

Motivation and Context

Follow-up on #26257, which was closed because abseil's backtrace/symbolize is already available as a dependency. This PR implements that suggestion with additional file:line resolution:

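  • No new dependency: absl::stacktrace and absl::symbolize are already in ABSEIL_LIBS and linked to onnxruntime_common. dladdr() is provided by the platform's C library (glibc on Linux), and addr2line is a standard GNU binutils tool that is only needed at runtime for the optional file:line lookup.
  • No CMake changes needed: Everything is already wired up.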
  • Debug-only: Guarded by #ifndef NDEBUG — no performance impact in release builds
  • Best-effort file:line: Uses dladdr() to compute file offsets, then calls addr2line in batch (once per binary). Falls back gracefully to function-name-only output if addr2line is unavailable.
  • Windows unchanged: Windows already has superior stack traces via C++23 <stacktrace>
  • Platform exclusions preserved: Android, WebAssembly, AIX, and _OPSCHEMA_LIB_ builds continue to return empty stack traces
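
The best-effort fallback described in the list above could look roughly like the following. This is a minimal sketch, not the PR's code: `FormatFrame` and its parameters are hypothetical names, the bracketed address format is an assumption, and it assumes addr2line reports an unresolved location with a leading `??` (GNU addr2line prints `??:0` or `??:?`).

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper (not from the PR): combine a demangled symbol with an
// addr2line "file:line" result, falling back to the raw address when no
// source location is available.
std::string FormatFrame(const std::string& symbol, const std::string& location,
                        std::uintptr_t address) {
  const std::string name = symbol.empty() ? "(unknown)" : symbol;
  // Assumption: an unresolved location is empty or starts with "??".
  const bool resolved = !location.empty() && location.compare(0, 2, "??") != 0;
  if (resolved) {
    return name + " at " + location;
  }
  char buf[32];
  std::snprintf(buf, sizeof(buf), "0x%llx",
                static_cast<unsigned long long>(address));
  return name + " [" + buf + "]";
}
```

With this shape, a frame that addr2line cannot resolve still yields a usable line (function name plus address), which matches the fallback behavior described above.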

How it works

  1. absl::GetStackTrace() captures raw frame addresses
  2. absl::Symbolize() resolves each address to a demangled function name
  3. dladdr() determines which binary each address belongs to and computes the file offset
  4. addr2line is called in batch (one invocation per binary) to resolve file:line
  5. Results are combined into a single readable string per frame
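
Steps 3 and 4 can be sketched as a small grouping pass. This is an illustration under stated assumptions, not the PR's implementation: `Frame` and `GroupOffsetsByBinary` are hypothetical names, the `binary` and `base` fields stand in for what `dladdr()` reports in `Dl_info::dli_fname` and `Dl_info::dli_fbase`, and subtracting the base assumes a position-independent executable or shared object.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical per-frame record (not from the PR).
struct Frame {
  std::string binary;      // containing binary, e.g. Dl_info::dli_fname
  std::uintptr_t base;     // load address, e.g. Dl_info::dli_fbase
  std::uintptr_t address;  // absolute address captured from the stack
};

// Group file offsets (address - base) by binary so that a single addr2line
// invocation per binary can resolve all of that binary's frames.
std::map<std::string, std::vector<std::uintptr_t>> GroupOffsetsByBinary(
    const std::vector<Frame>& frames) {
  std::map<std::string, std::vector<std::uintptr_t>> batches;
  for (const Frame& f : frames) {
    // Skip frames whose binary is unknown or whose address is below the base.
    if (f.binary.empty() || f.address < f.base) continue;
    batches[f.binary].push_back(f.address - f.base);
  }
  return batches;
}
```

Each map entry then corresponds to one batched lookup, and frames that were skipped keep their function-name-only output.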

Testing

  • Built and verified on Linux with CUDA EP in Debug mode
  • Ran onnxruntime_test_all --gtest_filter="*BadModelInvalidDimParamUsage*" — confirmed stack trace shows demangled function names with file paths and line numbers through the full call chain
  • Verified graceful fallback when addr2line cannot resolve a frame (shows function name + address only)
  • No CMake changes, so no risk of build system regressions on other platforms

@tianleiwu tianleiwu force-pushed the tlwu/20260507/stacktrace branch 2 times, most recently from 821dab8 to 65086c2 Compare May 7, 2026 23:37
Replace glibc backtrace()/backtrace_symbols() with abseil's
GetStackTrace()/Symbolize() in debug builds on Linux/POSIX.

The previous implementation produced raw addresses requiring manual
addr2line translation. The new implementation produces demangled
function names (e.g. onnxruntime::InferenceSession::Run) directly
in exception messages.

No new dependency is introduced — absl::stacktrace and absl::symbolize
are already linked via ABSEIL_LIBS. Windows stack traces (C++23
<stacktrace>) are unchanged.

Closes #26257
@tianleiwu tianleiwu force-pushed the tlwu/20260507/stacktrace branch 2 times, most recently from 418a610 to f4a02a5 Compare May 8, 2026 06:06
Contributor

Copilot AI left a comment

Pull request overview

This PR updates ONNX Runtime’s POSIX/Linux debug stack trace generation to produce more readable, symbolized call stacks by switching from glibc backtrace()/backtrace_symbols() to Abseil’s stacktrace/symbolization APIs, with optional addr2line-based file:line resolution.

Changes:

  • Replace glibc backtrace collection/symbol printing with absl::GetStackTrace() + absl::Symbolize() in POSIX debug builds.
  • Add best-effort addr2line integration to resolve source file and line numbers for stack frames (opt-in via env var).
  • Initialize Abseil’s symbolizer once during Environment::Initialize() on non-Windows POSIX platforms.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| onnxruntime/core/session/environment.cc | Adds one-time Abseil symbolizer initialization during environment setup on non-Windows POSIX platforms. |
| onnxruntime/core/platform/posix/stacktrace.cc | Reworks debug stack trace capture/symbolization to use Abseil and optionally resolve file:line via addr2line. |

Comment thread onnxruntime/core/platform/posix/stacktrace.cc Outdated
Comment thread onnxruntime/core/platform/posix/stacktrace.cc Outdated
Comment thread onnxruntime/core/platform/posix/stacktrace.cc Outdated
Comment thread onnxruntime/core/session/environment.cc
tianleiwu added 5 commits May 16, 2026 01:03
- Fix env var name inconsistency: comments now correctly say ORT_ADDR2LINE
  to match the implementation (was ORT_ENABLE_ADDR2LINE)
- Replace popen() shell command with posix_spawnp() + pipe to eliminate
  shell injection risk from binary paths with spaces or special chars
- Guard InitializeSymbolizer() and its include with #ifndef NDEBUG to
  match the debug-only stacktrace implementation
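
For readers unfamiliar with the popen() to posix_spawnp() change listed above, the general pattern looks something like the sketch below. It is not the PR's code: `RunAndCaptureStdout` is a hypothetical helper, error handling is deliberately minimal, and it assumes stdout is already open in the parent. Because the arguments are passed as an argv array and never parsed by a shell, spaces or special characters in a binary path are treated as literal text.

```cpp
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <string>
#include <vector>

extern char** environ;

// Hypothetical helper (not from the PR): run args[0] (looked up on PATH)
// without a shell and return whatever it writes to stdout.
// Returns an empty string if the process cannot be started.
std::string RunAndCaptureStdout(const std::vector<std::string>& args) {
  if (args.empty()) return {};
  int fds[2];
  if (pipe(fds) != 0) return {};

  std::vector<char*> argv;
  for (const std::string& a : args) argv.push_back(const_cast<char*>(a.c_str()));
  argv.push_back(nullptr);

  // In the child: redirect stdout to the pipe's write end, then close both
  // original pipe descriptors.
  posix_spawn_file_actions_t actions;
  posix_spawn_file_actions_init(&actions);
  posix_spawn_file_actions_adddup2(&actions, fds[1], STDOUT_FILENO);
  posix_spawn_file_actions_addclose(&actions, fds[0]);
  posix_spawn_file_actions_addclose(&actions, fds[1]);

  pid_t pid = 0;
  const int rc =
      posix_spawnp(&pid, argv[0], &actions, nullptr, argv.data(), environ);
  posix_spawn_file_actions_destroy(&actions);
  close(fds[1]);  // the parent keeps only the read end

  std::string output;
  if (rc == 0) {
    char buf[4096];
    ssize_t n;
    while ((n = read(fds[0], buf, sizeof(buf))) > 0) {
      output.append(buf, static_cast<size_t>(n));
    }
    int status = 0;
    waitpid(pid, &status, 0);  // reap the child to avoid a zombie process
  }
  close(fds[0]);
  return output;
}
```

A caller could pass something like `{"addr2line", "-C", "-f", "-e", binary_path, offset}` (standard GNU addr2line options; the exact flags the PR uses are not shown here), and an empty result would trigger the function-name-only fallback.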