Commit eea3c60

Fix pdb / breakpoint() hang in workflow code (temporalio#1104)
Closes temporalio#1104.

breakpoint() and pdb.set_trace() inside workflow code silently hang even with debug_mode=True and an unsandboxed runner. Three orthogonal issues contribute; this PR addresses all three behind the existing debug_mode flag so production behavior is unchanged.

1. Thread placement. Activations run on a ThreadPoolExecutor worker thread, so pdb's cmdloop() calls input() from a thread that doesn't own the controlling TTY. Fixed by scheduling the activation as a loop.call_soon callback and awaiting a future the callback completes. The dispatch task suspends at the await so it's no longer mid-__step() when the workflow's internal task stepping happens. (A direct synchronous call ran afoul of Python 3.14's tightened asyncio task-entry validation: "Cannot enter into task while another task is being executed.")

2. Sandbox restriction. The sandbox flags `breakpoint` and `input` as non-deterministic builtins. With debug_mode=True the user has explicitly accepted non-determinism for the debugging session, so we relax those two specific restrictions when the runner is a SandboxedWorkflowRunner. Other sandbox checks remain intact.

3. Silent-hang failure mode. Installs a process-wide sys.breakpointhook at worker startup that raises a clear RuntimeError when breakpoint() is called from a workflow worker thread without debug_mode, replacing the silent hang.

Adds a "Debugging Workflows with breakpoint() / pdb" subsection to the README under Workflow Sandbox, including a runnable example and the caveats around workflow task timeouts.

Tests at tests/worker/test_breakpoint_hang.py cover thread placement in both modes, the sandboxed-workflow path, and the defensive hook.

Verified on Python 3.13 and 3.14 locally; CI matrix green on fork.

The load-bearing observation for the dispatch fix: `await future` suspends the dispatch task such that asyncio no longer considers it "currently executing," even though it's still in a pending state. That's what lets workflow.activate(act) step the workflow's internal task without 3.14's task-entry error.
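
The dispatch pattern from item 1 can be sketched with plain asyncio. This is an illustration only, not the SDK's code: `run_on_loop_thread`, `probe`, and `compare` are hypothetical names, and asyncio's default executor stands in for the worker's thread pool.

```python
import asyncio
import threading
from typing import Callable, TypeVar

T = TypeVar("T")


async def run_on_loop_thread(fn: Callable[[], T]) -> T:
    """Run a synchronous callable as a loop callback and await its result."""
    loop = asyncio.get_running_loop()
    done = loop.create_future()

    def callback() -> None:
        # Runs on the event loop thread, outside of any task step.
        try:
            done.set_result(fn())
        except Exception as exc:
            done.set_exception(exc)

    loop.call_soon(callback)
    # The calling task suspends here until the callback completes the future.
    return await done


def probe() -> tuple:
    """Report the executing thread and whether asyncio sees no running task."""
    try:
        no_current_task = asyncio.current_task() is None
    except RuntimeError:
        # Raised on threads that have no running event loop.
        no_current_task = True
    return threading.current_thread().name, no_current_task


async def compare() -> dict:
    loop = asyncio.get_running_loop()
    return {
        "caller": threading.current_thread().name,
        "executor": await loop.run_in_executor(None, probe),
        "call_soon": await run_on_loop_thread(probe),
    }
```

Running `asyncio.run(compare())` should report the `call_soon` probe on the caller's thread with no current task, while the `executor` probe runs on a pool thread.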
1 parent 7ea54e6 commit eea3c60

3 files changed

Lines changed: 523 additions & 45 deletions

README.md

Lines changed: 74 additions & 0 deletions
@@ -82,6 +82,7 @@ informal introduction to the features and their implementation.
- [Customizing the Sandbox](#customizing-the-sandbox)
- [Passthrough Modules](#passthrough-modules)
- [Invalid Module Members](#invalid-module-members)
- [Debugging Workflows with `breakpoint()` / `pdb`](#debugging-workflows-with-breakpoint--pdb)
- [Known Sandbox Issues](#known-sandbox-issues)
- [Global Import/Builtins](#global-importbuiltins)
- [Sandbox is not Secure](#sandbox-is-not-secure)
@@ -1241,6 +1242,79 @@ my_worker = Worker(..., workflow_runner=SandboxedWorkflowRunner(restrictions=my_

See the API for more details on exact fields and their meaning.

##### Debugging Workflows with `breakpoint()` / `pdb`

Setting `debug_mode=True` on the `Worker` (or `TEMPORAL_DEBUG=1` in the environment) routes workflow activations
onto the asyncio main thread instead of a worker thread pool. This lets `breakpoint()` and `pdb.set_trace()`
inside workflow code open an interactive `(Pdb)` prompt. Without it, pdb hangs because its `input()` call runs on a
thread that does not own the controlling TTY.

A minimal runnable example:

```python
import asyncio
from datetime import timedelta

from temporalio import workflow
from temporalio.client import Client
from temporalio.worker import Worker


@workflow.defn
class DebugMeWorkflow:
    @workflow.run
    async def run(self) -> str:
        x = 42
        breakpoint()  # interactive pdb prompt opens at this line
        return f"x was {x}"


async def main() -> None:
    client = await Client.connect("localhost:7233")
    async with Worker(
        client,
        task_queue="debug-me",
        workflows=[DebugMeWorkflow],
        debug_mode=True,
    ):
        result = await client.execute_workflow(
            DebugMeWorkflow.run,
            id="debug-me-wf",
            task_queue="debug-me",
            task_timeout=timedelta(minutes=10),  # see caveat below
        )
        print(result)


if __name__ == "__main__":
    asyncio.run(main())
```

Run with `python debug_me.py`, or under pytest with `pytest -s` (the `-s` flag disables pytest's output and stdin
capture, which pdb needs for its prompt). At the `(Pdb)` prompt you'll land at the line where `breakpoint()` was
called, with workflow locals in scope. Try `p x`, `n`, `c`, `q`.

**Quitting cleanly.** Typing `q` or hitting Ctrl-D continues the workflow rather than raising `BdbQuit`
(which would fail the workflow task). To genuinely abort, kill the outer process with Ctrl-C.
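
How `q` is remapped is not shown in this diff. One way such behavior could be built, purely as an assumption, is a `pdb.Pdb` subclass whose quit commands delegate to `continue`. `ContinueOnQuitPdb` and `run_scripted` are hypothetical names, and scripted stdin stands in for a real terminal:

```python
import io
import pdb


class ContinueOnQuitPdb(pdb.Pdb):
    """Hypothetical debugger where quitting resumes instead of raising BdbQuit."""

    def do_quit(self, arg):
        # Delegate to `continue` so the debugged frame keeps running.
        return self.do_continue(arg)

    # pdb dispatches each alias by attribute name, so rebind them all.
    do_q = do_exit = do_quit
    do_EOF = do_quit  # Ctrl-D arrives as the EOF command


def run_scripted(commands: str):
    """Drive the debugger from a string and return (result, transcript)."""
    transcript = io.StringIO()
    debugger = ContinueOnQuitPdb(
        stdin=io.StringIO(commands),
        stdout=transcript,
        nosigint=True,  # leave the SIGINT handler alone
        readrc=False,  # ignore any local .pdbrc
    )

    def body() -> str:
        x = 42
        debugger.set_trace()
        return f"x was {x}"

    return body(), transcript.getvalue()
```

Calling `run_scripted("p x\nq\n")` should let `body()` run to completion after the scripted `q`, where a stock `pdb.Pdb` would raise `BdbQuit` instead.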

Two caveats when pausing at a breakpoint inside a workflow:

1. **Workflow task timeout.** Temporal expires a workflow task after ~10 seconds by default. If you sit at the
   `(Pdb)` prompt longer than that, the server reassigns the task and your workflow replays from the start when
   you continue, re-hitting the breakpoint. Pass `task_timeout=timedelta(minutes=N)` to `execute_workflow` /
   `start_workflow` to give yourself debugging headroom:

   ```python
   await client.execute_workflow(MyWorkflow.run, ..., task_timeout=timedelta(minutes=10))
   ```

2. **Deterministic replay.** Workflow code must be deterministic because it is replayed from history; a wall-clock
   pause at a prompt falls outside that contract. For post-mortem debugging without these caveats, use the
   [Replayer](#replayer) on a recorded history instead of live debugging.

A `breakpoint()` call from workflow code without `debug_mode` enabled raises a `RuntimeError` with a pointer to
this section, so the failure mode is loud rather than a silent hang.

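The hook itself is outside this README hunk. A minimal stdlib sketch of the idea, assuming worker threads can be recognized by a name prefix (the prefix, `make_guarded_hook`, and `call_breakpoint_on_worker` are all hypothetical, not SDK names):

```python
import sys
import threading
from concurrent.futures import ThreadPoolExecutor

WORKER_THREAD_PREFIX = "workflow-worker"  # hypothetical naming convention


def make_guarded_hook(debug_mode: bool, fallback=sys.__breakpointhook__):
    """Build a breakpoint hook that fails loudly on worker threads."""

    def hook(*args, **kwargs):
        on_worker = threading.current_thread().name.startswith(WORKER_THREAD_PREFIX)
        if on_worker and not debug_mode:
            raise RuntimeError(
                "breakpoint() called from a workflow worker thread; "
                "enable debug_mode on the Worker to debug workflow code"
            )
        return fallback(*args, **kwargs)

    return hook


def call_breakpoint_on_worker(debug_mode: bool) -> str:
    """Install the hook, trigger breakpoint() on a pool thread, then restore."""
    previous = sys.breakpointhook
    # A no-op fallback keeps this demo from opening a real debugger.
    sys.breakpointhook = make_guarded_hook(debug_mode, fallback=lambda *a, **k: None)
    try:
        with ThreadPoolExecutor(
            max_workers=1, thread_name_prefix=WORKER_THREAD_PREFIX
        ) as pool:
            try:
                pool.submit(breakpoint).result()
                return "debugger would open"
            except RuntimeError as exc:
                return f"RuntimeError: {exc}"
    finally:
        sys.breakpointhook = previous
```

With `debug_mode=False` the call surfaces the `RuntimeError` raised on the pool thread; with `debug_mode=True` it falls through to the fallback hook.
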
##### Known Sandbox Issues

Below are known sandbox issues. As the sandbox is developed and matures, some may be resolved.
