Commit 791f853
chore(internal): honor positional timeout in PeriodicThread.join (#17632)
## Description
`PeriodicThread_join` (native, in `ddtrace/internal/_threads.cpp`) guarded its
argument parsing with `args != NULL && kwargs != NULL`. CPython passes
`kwargs == NULL` whenever the caller uses only positional arguments, so
`t.join(0.1)` skipped parsing and `timeout` stayed `Py_None`, falling through
to `self->_stopped->wait()` — an unbounded wait.
Both `PeriodicService.join` (`ddtrace/internal/periodic.py:64`) and
`Timer.join` (`ddtrace/internal/periodic.py:151`) forward to the underlying
worker positionally, so every caller that did `service.join(timeout=...)` was
silently waiting for the thread to exit on its own instead of honoring the
requested timeout. The telemetry writer shutdown path (`self.join(timeout=2)`)
is one user-visible example.
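The forwarding pattern described above can be illustrated with a minimal sketch. `RecordingWorker` and `ServiceWrapper` are hypothetical stand-ins, not the actual `PeriodicService` or `Timer` code; the sketch only shows how a keyword call on a wrapper turns into a positional-only call on the worker, which is the call shape the native guard mishandled.

```python
class RecordingWorker:
    """Hypothetical stand-in for the native worker; records how join() was called."""

    def __init__(self):
        self.calls = []

    def join(self, *args, **kwargs):
        # Record the call shape. A native method would receive kwargs == NULL
        # when no keyword arguments are passed; Python shows an empty dict.
        self.calls.append((args, kwargs))


class ServiceWrapper:
    """Hypothetical wrapper that forwards the timeout positionally."""

    def __init__(self, worker):
        self._worker = worker

    def join(self, timeout=None):
        # The caller's keyword argument is re-passed as a positional one.
        self._worker.join(timeout)


worker = RecordingWorker()
ServiceWrapper(worker).join(timeout=2)
print(worker.calls)  # [((2,), {})]
```

Even though the caller wrote `timeout=2`, the worker sees one positional argument and no keywords.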
The fix drops the spurious `kwargs != NULL` check: `PyArg_ParseTupleAndKeywords`
already accepts `kwargs == NULL`.
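As a rough illustration of the guard logic, the following pure-Python model mimics the before and after behavior. It is a sketch, not the C++ source: `None` stands in for `NULL` and `Py_None`, `parse_join_args` is a hypothetical stand-in for `PyArg_ParseTupleAndKeywords`, and keeping the `args` check in the fixed version is an assumption.

```python
def parse_join_args(args, kwargs):
    """Hypothetical model of parsing a single optional `timeout` argument."""
    if len(args) > 1:
        raise TypeError("join() takes at most 1 argument")
    if args:
        return args[0]
    # A NULL kwargs (None here) is treated the same as "no keywords".
    return (kwargs or {}).get("timeout")


def join_timeout_before(args, kwargs):
    timeout = None  # stands in for Py_None
    # Old guard: parsing is skipped whenever kwargs is NULL.
    if args is not None and kwargs is not None:
        timeout = parse_join_args(args, kwargs)
    return timeout


def join_timeout_after(args, kwargs):
    timeout = None
    # Assumed shape of the fix: only the kwargs check is dropped.
    if args is not None:
        timeout = parse_join_args(args, kwargs)
    return timeout


# t.join(0.1): positional only, so kwargs is NULL.
print(join_timeout_before((0.1,), None))  # None
print(join_timeout_after((0.1,), None))   # 0.1
# t.join(timeout=0.1): empty args tuple plus a kwargs dict.
print(join_timeout_before((), {"timeout": 0.1}))  # 0.1
```

A `None` timeout corresponds to the unbounded wait, which is why only the positional form was affected.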
This bug has been in the file since `PeriodicThread_join` was first
introduced; it was found by a randomized lifecycle stress test targeting the
native `_threads.cpp` that I'm prototyping separately.
## Testing
New regression test in `tests/internal/test_periodic.py`:
`test_periodic_join_positional_timeout_is_honored` — asserts both
`t.join(0.1)` and `t.join(timeout=0.1)` return within 1s against a long-interval
thread. Before the fix, the positional form hangs forever; after the fix both
forms return in ~0.1s.
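The shape of such a check can be sketched as follows. This is not the test added in this commit: it uses a plain `threading.Thread` as a stand-in for a long-interval `PeriodicThread`, and `elapsed_join` is a hypothetical helper, so it only demonstrates the timing assertions.

```python
import threading
import time


def elapsed_join(join, *args, **kwargs):
    """Return how long a join call takes, in seconds."""
    start = time.monotonic()
    join(*args, **kwargs)
    return time.monotonic() - start


stop = threading.Event()
# Stand-in for a long-interval periodic thread: it blocks until told to stop.
t = threading.Thread(target=stop.wait, args=(60,), daemon=True)
t.start()
try:
    positional = elapsed_join(t.join, 0.1)
    keyword = elapsed_join(t.join, timeout=0.1)
    # Both call forms must return well before the thread would exit on its own.
    assert positional < 1.0, positional
    assert keyword < 1.0, keyword
    assert t.is_alive()  # join timed out; the thread itself is still running
finally:
    stop.set()
    t.join()
```

Timing each form separately makes it clear which one regressed if the assertion fails.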
Quick standalone verification on a Linux workspace:
Before:
```
kwarg join(timeout=0.1): elapsed=0.100s
WATCHDOG: positional join hung past 2s — killing
```
After:
```
positional join(0.1): elapsed=0.100s
keyword join(t=0.1): elapsed=0.100s
```
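A watchdog harness along these lines could produce similar output. This is a sketch under assumptions, not the script used for the runs above: `run_with_watchdog` is a hypothetical helper, and it only reports a hang rather than killing the process.

```python
import threading
import time


def run_with_watchdog(label, fn, limit=2.0):
    """Run fn in a daemon thread and report whether it finished within limit."""
    result = {}

    def target():
        start = time.monotonic()
        fn()
        result["elapsed"] = time.monotonic() - start

    runner = threading.Thread(target=target, daemon=True)
    runner.start()
    runner.join(limit)
    if runner.is_alive():
        # fn is still blocked; the daemon thread is abandoned, not killed.
        return f"WATCHDOG: {label} hung past {limit:g}s"
    return f"{label}: elapsed={result['elapsed']:.3f}s"


# Usage with a stand-in thread that would otherwise run for a long time.
stop = threading.Event()
worker = threading.Thread(target=stop.wait, args=(60,), daemon=True)
worker.start()
print(run_with_watchdog("positional join(0.1)", lambda: worker.join(0.1)))
stop.set()
worker.join()
```

Running the join inside a daemon thread keeps a hung call from blocking the verification script itself.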
## Risks
Very low. Behavior change is scoped to `PeriodicThread.join(X)` called
positionally — previously an unbounded wait, now bounded by `X`. Callers that
were relying on the (undocumented) infinite-wait-when-passing-a-number
behavior would be affected, but that matches no documented contract and all
existing positional callsites in the repo already want the timeout honored
(they are `service.join(timeout=...)` wrappers forwarding positionally).
No API surface change. Release note included.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: erwan.viollet <erwan.viollet@datadoghq.com>
1 parent 5a9e351 commit 791f853
2 files changed
Lines changed: 34 additions & 1 deletion
Diff summary (file names and line contents were not preserved):
| File | Original lines | New lines | Change |
|---|---|---|---|
| First file | 745 | 745-750 | 1 line removed, 6 lines added |
| Second file | (none) | 51-78 | 28 lines added |