72 changes: 55 additions & 17 deletions peps/pep-0768.rst
@@ -141,8 +141,10 @@ A new structure is added to PyThreadState to support remote debugging:

This structure is appended to ``PyThreadState``, adding only a few fields that
are **never accessed during normal execution**. The ``debugger_pending_call`` field
- indicates when a debugger has requested execution, while ``debugger_script``
- provides Python code to be executed when the interpreter reaches a safe point.
+ indicates when a debugger has requested execution, while ``debugger_script_path``
+ provides a filesystem path to a Python source file (.py) that will be executed when
+ the interpreter reaches a safe point. The path must point to a Python source file,
+ not compiled Python code (.pyc) or any other format.
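
As a rough illustration of the layout described above, the following Python sketch models the structure with ``ctypes``. The two field names come from the text; the C types (a 32-bit integer flag and a fixed-size ``char`` array), the value 512 for ``MAX_SCRIPT_PATH_SIZE``, and the helper ``make_request`` are assumptions made for the example, not values fixed by this section.

```python
import ctypes

# Assumed value for illustration only; the text leaves the final size open.
MAX_SCRIPT_PATH_SIZE = 512


class RemoteDebuggerSupport(ctypes.Structure):
    """Hypothetical mirror of ``_PyRemoteDebuggerSupport`` (field types assumed)."""

    _fields_ = [
        ("debugger_pending_call", ctypes.c_int32),
        ("debugger_script_path", ctypes.c_char * MAX_SCRIPT_PATH_SIZE),
    ]


def make_request(script_path: str) -> RemoteDebuggerSupport:
    """Build a request, rejecting paths that are not .py or do not fit."""
    if not script_path.endswith(".py"):
        raise ValueError("path must point to a Python source file (.py)")
    encoded = script_path.encode("utf-8")
    # Leave room for the terminating NUL byte.
    if len(encoded) >= MAX_SCRIPT_PATH_SIZE:
        raise ValueError("path does not fit in the script path buffer")
    support = RemoteDebuggerSupport()
    support.debugger_script_path = encoded
    support.debugger_pending_call = 1
    return support
```

Under these assumptions the per-thread cost is the size of the buffer plus one integer, which is why the buffer size is a trade-off.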

The value for ``MAX_SCRIPT_PATH_SIZE`` will be a trade-off between binary size
and how big debugging scripts' paths can be. To limit the memory overhead per
@@ -177,7 +179,7 @@ debugger support:
These offsets allow debuggers to locate critical debugging control structures in
the target process's memory space. The ``eval_breaker`` and ``remote_debugger_support``
offsets are relative to each ``PyThreadState``, while the ``debugger_pending_call``
- and ``debugger_script`` offsets are relative to each ``_PyRemoteDebuggerSupport``
+ and ``debugger_script_path`` offsets are relative to each ``_PyRemoteDebuggerSupport``
structure, allowing the new structure and its fields to be found regardless of
where they are in memory. ``debugger_script_path_size`` informs the attaching
tool of the size of the buffer.
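
The two levels of relative offsets can be sketched as plain address arithmetic. The container ``DebuggerOffsets`` and the function ``resolve_addresses`` are hypothetical names used only for this example; how a tool actually reads the offsets from the target process is outside the scope of the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DebuggerOffsets:
    """Hypothetical container for offsets a tool has read from the target."""

    eval_breaker: int               # relative to PyThreadState
    remote_debugger_support: int    # relative to PyThreadState
    debugger_pending_call: int      # relative to _PyRemoteDebuggerSupport
    debugger_script_path: int       # relative to _PyRemoteDebuggerSupport
    debugger_script_path_size: int  # size of the path buffer, in bytes


def resolve_addresses(tstate_addr: int, offsets: DebuggerOffsets) -> dict[str, int]:
    """Turn relative offsets into absolute addresses for one thread state."""
    # First hop: locate the support structure inside the thread state.
    support_addr = tstate_addr + offsets.remote_debugger_support
    # Second hop: locate the fields inside the support structure.
    return {
        "eval_breaker": tstate_addr + offsets.eval_breaker,
        "debugger_pending_call": support_addr + offsets.debugger_pending_call,
        "debugger_script_path": support_addr + offsets.debugger_script_path,
    }
```

The numeric offsets in any usage of this sketch would be made up; real values depend on the interpreter build.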
@@ -200,13 +202,19 @@ When a debugger wants to attach to a Python process, it follows these steps:

5. Write control information:

- - Write a filename containing Python code to be executed into the
-   ``debugger_script`` field in ``_PyRemoteDebuggerSupport``.
+ - Most debuggers will pause the process before writing to its memory. This is
+   standard practice for tools like GDB, which use SIGSTOP or ptrace to pause the process.
+   This approach prevents races when writing to process memory. Profilers and other tools
+   that don't wish to stop the process can still use this interface, but they need to
+   handle possible races, which is a normal consideration for profilers in general.
+
+ - Write a file path to a Python source file (.py) into the
+   ``debugger_script_path`` field in ``_PyRemoteDebuggerSupport``.
- Set the ``debugger_pending_call`` flag in ``_PyRemoteDebuggerSupport`` to 1.
- Set ``_PY_EVAL_PLEASE_STOP_BIT`` in the ``eval_breaker`` field.

- Once the interpreter reaches the next safe point, it will execute the script
- provided by the debugger.
+ Once the interpreter reaches the next safe point, it will execute the Python code
+ contained in the file specified by the debugger.
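
The order of the three writes in step 5 can be sketched against a simulated memory image. This is not how a real tool touches another process; a ``bytearray`` stands in for the target's memory, and the field widths (4-byte flag, 8-byte ``eval_breaker``), the little-endian layout, and the numeric value used for the stop bit are all assumptions for the example. ``request_script`` is a hypothetical helper name.

```python
import struct

# Assumed bit value for illustration; the real constant is defined by CPython.
PY_EVAL_PLEASE_STOP_BIT = 1 << 5


def request_script(
    memory: bytearray,
    *,
    eval_breaker_addr: int,
    pending_call_addr: int,
    script_path_addr: int,
    script_path_size: int,
    script_path: str,
) -> None:
    """Apply the step-5 writes to a simulated memory image, in order."""
    encoded = script_path.encode("utf-8") + b"\0"
    if len(encoded) > script_path_size:
        raise ValueError("script path does not fit in the target buffer")

    # 1. Write the NUL-terminated path into debugger_script_path.
    memory[script_path_addr:script_path_addr + len(encoded)] = encoded

    # 2. Set debugger_pending_call to 1 (assumed 32-bit little-endian int).
    struct.pack_into("<i", memory, pending_call_addr, 1)

    # 3. OR the stop bit into eval_breaker, preserving the other bits
    #    (assumed 64-bit little-endian word).
    (current,) = struct.unpack_from("<Q", memory, eval_breaker_addr)
    struct.pack_into(
        "<Q", memory, eval_breaker_addr, current | PY_EVAL_PLEASE_STOP_BIT
    )
```

Writing the path before raising the flag and the stop bit means the interpreter never observes a pending request with a half-written path, provided the target is paused or the writes are otherwise ordered.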

Interpreter Integration
-----------------------
@@ -237,7 +245,7 @@ to be audited or disabled if desired by a system's administrator.
    if (tstate->eval_breaker) {
        if (tstate->remote_debugger_support.debugger_pending_call) {
            tstate->remote_debugger_support.debugger_pending_call = 0;
-           const char *path = tstate->remote_debugger_support.debugger_script;
+           const char *path = tstate->remote_debugger_support.debugger_script_path;
            if (*path) {
                if (0 != PySys_Audit("debugger_script", "s", path)) {
                    PyErr_Clear();
@@ -273,16 +281,17 @@ arbitrary Python code within the context of a specified Python process:

.. code-block:: python

-   def remote_exec(pid: int, code: str, timeout: int = 0) -> None:
+   def remote_exec(pid: int, code: str) -> None:
        """
        Executes a block of Python code in a given remote Python process.

+       This function returns immediately, and the code will be executed at the next
+       available opportunity in the target process, similar to how signals are handled.
+       There is no way to determine when or if the code has been executed.
+
        Args:
            pid (int): The process ID of the target Python process.
            code (str): A string containing the Python code to be executed.
-           timeout (int): An optional timeout for waiting for the remote
-               process to execute the code. If the timeout is exceeded a
-               ``TimeoutError`` will be raised.
        """

An example usage of the API would look like:
@@ -292,9 +301,7 @@ An example usage of the API would look like:
    import sys
    # Execute a print statement in a remote Python process with PID 12345
    try:
-       sys.remote_exec(12345, "print('Hello from remote execution!')", timeout=3)
-   except TimeoutError:
-       print(f"The remote process took too long to execute the code")
+       sys.remote_exec(12345, "print('Hello from remote execution!')")
    except Exception as e:
        print(f"Failed to execute code: {e}")
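
Because the call returns immediately and reports nothing about completion, a tool that needs confirmation has to arrange it inside the injected code. One possible approach, sketched below, is to have the injected code create a sentinel file that the caller polls for. The helpers ``wrap_with_sentinel``, ``wait_for_sentinel``, and ``local_demo`` are hypothetical and not part of the proposed API; ``local_demo`` simulates the remote side with a local ``exec`` so the sketch can run without a second process.

```python
import os
import tempfile
import time


def wrap_with_sentinel(code: str, sentinel_path: str) -> str:
    """Return source that runs ``code`` and then creates ``sentinel_path``."""
    return (
        "try:\n"
        f"    exec({code!r})\n"
        "finally:\n"
        f"    open({sentinel_path!r}, 'w').close()\n"
    )


def wait_for_sentinel(sentinel_path: str, timeout: float, poll: float = 0.05) -> bool:
    """Poll until the sentinel file exists or ``timeout`` seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if os.path.exists(sentinel_path):
            return True
        time.sleep(poll)
    return os.path.exists(sentinel_path)


def local_demo() -> bool:
    """Simulate the target process by executing the wrapped code locally."""
    with tempfile.TemporaryDirectory() as tmp:
        sentinel = os.path.join(tmp, "done")
        wrapped = wrap_with_sentinel("x = 1 + 1", sentinel)
        exec(wrapped, {})  # stands in for the remote interpreter
        return wait_for_sentinel(sentinel, timeout=1.0)
```

With the API shown above, a caller would pass ``wrap_with_sentinel(code, path)`` to ``sys.remote_exec`` and then call ``wait_for_sentinel``; a timeout here only means the sentinel was not observed, not that the code will never run.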

@@ -322,6 +329,37 @@ feature. This way, tools can offer a useful error message explaining why they
won't work, instead of believing that they have attached and then never having
their script run.

+ Multi-threading Considerations
+ ------------------------------
+
+ Debugging code injected through this interface executes opportunistically in
+ whichever thread first encounters a safe evaluation point after the request is
+ made. This behavior mirrors how Python handles signals, providing a reliable
+ execution model without adding overhead to normal execution. To target a
+ specific thread, a debugger can write the request only into that thread's
+ ``PyThreadState``; to reach whichever thread runs first, it can write the
+ request into all of them.
+
+ The Global Interpreter Lock (GIL) continues to govern execution as normal when
+ debug code runs. This means that if a target thread is currently executing a C
+ extension that holds the GIL without releasing it, the debug code will wait
+ until that operation completes and the GIL becomes available. However, the
+ interface introduces no additional GIL contention beyond what the debugging
+ code itself requires. Importantly, the interface remains fully compatible with
+ Python's free-threaded build, where the GIL is disabled, allowing debugger code
+ to execute in any available thread.
+
+ In situations where all threads in the target process are blocked (waiting on I/O
+ operations, sleep states, or external resources), the debugging code might not
+ execute immediately. In these cases, a debugger can send a pre-registered signal
+ to the process, which interrupts the sleep state and forces thread scheduling,
+ creating an opportunity for the debug code to run. Alternatively, it can use any
+ other mechanism that forces the target process to resume execution.
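
The blocked-thread case can be illustrated in-process with an analogy rather than the interface itself: a signal delivered to a sleeping main thread lets Python-level code run before the sleep finishes. The sketch is POSIX-only and must run in the main thread. ``SIGUSR1`` stands in for the "pre-registered signal", and ``signal.pthread_kill`` is used only to make delivery to the main thread deterministic; an external debugger would send the signal from outside the process. ``demo_signal_wakeup`` is a hypothetical name.

```python
import signal
import threading
import time


def demo_signal_wakeup(sleep_seconds: float = 1.0, delay: float = 0.1) -> list[str]:
    """Show that a signal lets Python code run while the main thread sleeps."""
    events: list[str] = []

    def handler(signum, frame):
        # Runs in the main thread at a safe point, interrupting the sleep.
        events.append("handler ran")

    previous = signal.signal(signal.SIGUSR1, handler)
    main_ident = threading.main_thread().ident
    # After `delay` seconds, a helper thread signals the sleeping main thread.
    timer = threading.Timer(
        delay, signal.pthread_kill, args=(main_ident, signal.SIGUSR1)
    )
    timer.start()
    try:
        # The sleep resumes after the handler returns and then completes.
        time.sleep(sleep_seconds)
        events.append("sleep finished")
    finally:
        timer.join()
        signal.signal(signal.SIGUSR1, previous)
    return events
```

The handler entry is recorded before the sleep completes, which is the same window in which a pending debugger request would be noticed.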

+ The execution pattern closely resembles how Python handles signals internally.
+ The interpreter guarantees that debug code only runs at safe points, never
+ interrupting atomic operations within the interpreter itself. This approach
+ ensures that debugging operations cannot corrupt the interpreter state while
+ still providing timely execution in most real-world scenarios.

Backwards Compatibility
=======================
@@ -454,8 +492,8 @@ Rejected Ideas
Writing Python code into the buffer
-----------------------------------

- We have chosen to have debuggers write the code to be executed into a file
- whose path is written into a buffer in the remote process. This has been deemed
+ We have chosen to have debuggers write the path to a file containing Python code
+ into a buffer in the remote process. This has been deemed
more secure than writing the Python code to be executed itself into a buffer in
the remote process, because it means that an attacker who has gained arbitrary
writes in a process but not arbitrary code execution or file system