
gh-148178: Validate remote debug offset tables on load#148187

Open
pablogsal wants to merge 1 commit into python:main from pablogsal:148178

Conversation

@pablogsal
Member

@pablogsal pablogsal commented Apr 6, 2026

Treat the debug offset tables read from a target process as untrusted input and validate them before the unwinder uses any reported sizes or offsets.

Add a shared validator in debug_offsets_validation.h and run it once when _Py_DebugOffsets is loaded and once when AsyncioDebug is loaded. The checks cover section sizes used for fixed local buffers and every offset that is later dereferenced against a local buffer or local object view. This keeps the bounds checks out of the sampling hot path while rejecting malformed tables up front.
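As a rough illustration of the load-time check described above, here is a minimal sketch. All names (`demo_section`, `DEMO_LOCAL_BUFFER_SIZE`, `demo_check_field`, `demo_validate_section`) are hypothetical stand-ins and are not taken from debug_offsets_validation.h; the real tables have many more sections and fields.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for one section of an offset table
 * that was read from the target process and is therefore untrusted. */
typedef struct {
    uint64_t size;          /* reported size of the remote struct */
    uint64_t field_offset;  /* reported offset of one field inside it */
} demo_section;

/* Hypothetical size of the fixed local buffer the section is copied into. */
#define DEMO_LOCAL_BUFFER_SIZE 256

/* Return 0 if a field of field_size bytes starting at offset fits inside
 * limit bytes, -1 otherwise. Written so the comparison cannot overflow. */
static int
demo_check_field(uint64_t offset, size_t field_size, uint64_t limit)
{
    if (field_size > limit) {
        return -1;
    }
    if (offset > limit - field_size) {
        return -1;
    }
    return 0;
}

/* Validate one section once, at load time, so later reads need no checks. */
static int
demo_validate_section(const demo_section *s, size_t field_size)
{
    /* The reported size must fit the fixed local buffer. */
    if (s->size == 0 || s->size > DEMO_LOCAL_BUFFER_SIZE) {
        return -1;
    }
    /* The field must lie entirely inside the reported size. */
    return demo_check_field(s->field_offset, field_size, s->size);
}
```

With this shape, a caller would run `demo_validate_section` once per section right after the table is copied from the target and bail out with an invalid-offset error on the first failure, so the sampling loop can read fields without repeating the checks.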

Contributor

@maurycy maurycy left a comment


> Also add Linux poison tests that corrupt both the main and asyncio offset tables and assert that RemoteUnwinder fails with an invalid-offset error instead of walking into bad accesses or misleading late failures.

Where are they?

I'm also wondering about char[] buffers and GET_MEMBER? I'm thinking about alignment of offsets for a given type. Perhaps there's a risk of undefined behaviour there as well. Do we care about it?

@pablogsal
Member Author

pablogsal commented Apr 8, 2026

> > Also add Linux poison tests that corrupt both the main and asyncio offset tables and assert that RemoteUnwinder fails with an invalid-offset error instead of walking into bad accesses or misleading late failures.
>
> Where are they?
>
> I'm also wondering about char[] buffers and GET_MEMBER? I'm thinking about alignment of offsets for a given type. Perhaps there's a risk of undefined behaviour there as well. Do we care about it?

I decided to remove the tests because they were stupidly long and verbose (I am basically parsing ELF by hand in Python), only worked on Linux, and had reduced value (they exercise just one way to trigger the failure, not all of them). The PR is already quite big, but I didn't update the commit message.

You can see them in the diff here https://github.com/python/cpython/compare/f12f6c4f0467837e25a445339f2e5a24388a317d..b8dcd8fa2b3be846b1dad6d2afb1dccd968789b3

@pablogsal
Member Author

> I'm also wondering about char[] buffers and GET_MEMBER? I'm thinking about alignment of offsets for a given type. Perhaps there's a risk of undefined behaviour there as well. Do we care about it?

I don't think so. But if you have a specific code suggestion, I'm happy to consider it ;)
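To make the alignment question concrete, here is a minimal sketch of two possible mitigations. It assumes that GET_MEMBER reads a typed value through a pointer computed from a char buffer plus an offset supplied by the target; that assumption is not confirmed in this thread. The helpers `demo_read_u64` and `demo_offset_is_aligned` are hypothetical and not part of the PR.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy a uint64_t out of a raw byte buffer.
 * memcpy has no alignment requirement, so this stays well-defined even
 * when offset is not a multiple of the type's alignment.
 * Returns 0 on success, -1 if the read would run past the buffer. */
static int
demo_read_u64(const char *buf, size_t buf_size, size_t offset, uint64_t *out)
{
    if (buf_size < sizeof(*out) || offset > buf_size - sizeof(*out)) {
        return -1;
    }
    memcpy(out, buf + offset, sizeof(*out));
    return 0;
}

/* Hypothetical alternative: keep the typed-pointer reads, but reject
 * misaligned offsets once, when the table is validated. */
static int
demo_offset_is_aligned(size_t offset, size_t alignment)
{
    return alignment != 0 && (offset % alignment) == 0;
}
```

The first option adds a small copy on every read. The second keeps the hot path unchanged and only guarantees alignment if the local buffer itself is suitably aligned for the type being read.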

@pablogsal pablogsal force-pushed the 148178 branch 2 times, most recently from e5e1f21 to f6f2faf on April 8, 2026 12:58