Commit a1ecf73
committed
feat: cross-platform force-kill primitive for stuck PHP threads
Introduces a self-contained primitive that unblocks a PHP thread stuck
in a blocking call (sleep, synchronous I/O, etc.) so the graceful drain
used by RestartWorkers, DrainWorkers, and Shutdown makes progress
instead of hanging for the duration of the block. The primitive is
useful on its own and gives follow-up graceful-shutdown work a reviewed
foundation to build on.
Design: each PHP thread, at boot from its own TSRM context, hands a
force_kill_slot (pointers to its own EG(vm_interrupt) and EG(timed_out)
atomic bools, plus pthread_t / Windows HANDLE for the wake-up) back to
Go via go_frankenphp_store_force_kill_slot. The slot lives on phpThread
and is protected by a per-thread RWMutex so the zero-and-release path
at thread exit cannot race an in-flight kill. From any goroutine, Go
passes the slot back to frankenphp_force_kill_thread, which stores true
into both bools (waking the VM at the next opcode boundary and routing
through zend_timeout -> "Maximum execution time exceeded") and then
delivers a platform-specific wake-up:
- Linux/FreeBSD: pthread_kill(SIGRTMIN+3) with a no-op handler installed
via pthread_once, SA_ONSTACK, no SA_RESTART. Signal delivery causes
any in-flight blocking syscall to return EINTR.
- Windows: CancelSynchronousIo + QueueUserAPC covers alertable I/O and
SleepEx. Non-alertable Sleep (including PHP's usleep) stays
uninterruptible.
- macOS: atomic-bool-only path. Threads stuck in blocking syscalls wait
to return on their own.
JIT caveat: under the OPcache JIT some hot code paths skip vm_interrupt
checks (see php-src#21267), so a pure-PHP busy loop under JIT may not
observe the store and will fall through to the abandon path below.
Drain flow:
- worker.go: drainWorkerThreads waits drainGracePeriod (5s) for each
drained thread to reach Yielding; then arms force-kill on stragglers
and waits forceKillDeadline (5s) more. Threads still stuck past that
are abandoned rather than hanging the drain forever.
- drainWorkerThreads returns (drained, abandoned). RestartWorkers puts
drained threads back to Ready and abandoned ones into the new
state.Abandoned (handlers treat it like ShuttingDown on next
callback) so an abandoned thread that finally unwinds exits instead
of re-entering the serve loop under stale request state. Abandoned
threads are also filtered out of worker.threads immediately so
isAtThreadLimit and the scaler see accurate capacity; the matching
deactivateThreads cleanup drops Abandoned/Done entries from the
auto-scaling slice so they do not permanently consume a global
scaling slot. If any were abandoned, RestartWorkers returns
errIncompleteRestart wrapped with abandoned/restarted counts -
admin endpoint and watcher surface it.
- phpthread.go: phpThread.shutdown mirrors the same grace + force-kill
+ abandon pattern so Shutdown cannot hang on an uninterruptible
blocking call either. RequestSafeStateChange and shutdown's fast-fail
WaitFor both accept Abandoned so Shutdown racing a RestartWorkers
that marks a thread Abandoned does not park on a transition that
will never come.
Lifecycle hardening: Shutdown intentionally leaves phpThreads allocated
and thread_metrics alive - an abandoned thread that eventually unwinds
still calls through the SAPI and lifecycle callbacks which index those
structures. initPHPThreads blocks on a package-level sync.WaitGroup
(Add on every php_thread entry, Done on every exit path, routed
through a single goto exit: label in php_thread so future return paths
cannot silently leak an Add) so the next Init cycle cannot reassign
them out from under a lingering abandoned thread, then frees the
previous allocation inside frankenphp_init_thread_metrics before
allocating fresh. A dedicated C-side atomic (shutdown_in_progress,
toggled by frankenphp_set_shutdown_in_progress) is the signal the
unhealthy-thread restart path uses to refuse respawning past Shutdown.
- go_frankenphp_store_force_kill_slot / clear_force_kill_slot /
on_thread_shutdown: take the per-thread write lock; clear runs before
ts_free_thread on both healthy and unhealthy exit paths so the
captured &EG() pointers are zeroed before their backing storage is
freed.
- php_thread unblocks FRANKENPHP_KILL_SIGNAL with pthread_sigmask at
startup so Go's runtime signal mask cannot silently drop deliveries.
Safe main-thread teardown after abandonment: an abandoned thread that
unwinds after sapi_shutdown / tsrm_shutdown would touch freed TSRM
storage (ts_free_thread) and torn-down SAPI (php_request_shutdown via
zend_catch, ub_write, etc). php_main now calls back into Go through
go_frankenphp_can_teardown after the main thread's Done signal; Go
does a bounded wait (mainThreadShutdownDeadline, 5s) on the
lingeringThreads WaitGroup. If every thread has exited, teardown runs
normally; on timeout Go logs a warning and returns false, and the C
side skips frankenphp_sapi_module.shutdown / sapi_shutdown /
tsrm_shutdown entirely. PHP's SAPI/TSRM state then leaks until process
exit, which is the safe outcome: a subsequent Shutdown after an
errIncompleteRestart can no longer crash on a late-unwinding abandoned
thread. Callers that want a fully-clean process after observing
errIncompleteRestart should terminate rather than re-Init, since a
skipped teardown leaves SAPI un-torn-down and the next Init would try
to initialise on top of it.
- worker_test.go + testdata/worker-sleep.php: the regression test
drives the full path via a request marker file so it only arms
RestartWorkers once the worker is proven parked in sleep(), then
asserts both the bounded elapsed time and that the "should not
reach" line after sleep never runs (which would indicate the VM
interrupt was never observed).
RestartWorkers now returns an error - a source-compatible Go API change
(callers that ignored it still compile) but worth noting for embedders.1 parent a05e6dd commit a1ecf73
15 files changed
Lines changed: 758 additions & 43 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
39 | 39 | | |
40 | 40 | | |
41 | 41 | | |
42 | | - | |
| 42 | + | |
| 43 | + | |
| 44 | + | |
| 45 | + | |
| 46 | + | |
| 47 | + | |
| 48 | + | |
43 | 49 | | |
44 | 50 | | |
45 | 51 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
92 | 92 | | |
93 | 93 | | |
94 | 94 | | |
| 95 | + | |
| 96 | + | |
| 97 | + | |
| 98 | + | |
| 99 | + | |
| 100 | + | |
| 101 | + | |
| 102 | + | |
| 103 | + | |
| 104 | + | |
| 105 | + | |
| 106 | + | |
| 107 | + | |
| 108 | + | |
| 109 | + | |
| 110 | + | |
| 111 | + | |
| 112 | + | |
| 113 | + | |
| 114 | + | |
| 115 | + | |
| 116 | + | |
| 117 | + | |
| 118 | + | |
| 119 | + | |
| 120 | + | |
| 121 | + | |
| 122 | + | |
| 123 | + | |
| 124 | + | |
| 125 | + | |
| 126 | + | |
| 127 | + | |
| 128 | + | |
| 129 | + | |
| 130 | + | |
| 131 | + | |
| 132 | + | |
| 133 | + | |
| 134 | + | |
| 135 | + | |
| 136 | + | |
| 137 | + | |
| 138 | + | |
| 139 | + | |
| 140 | + | |
| 141 | + | |
| 142 | + | |
| 143 | + | |
| 144 | + | |
| 145 | + | |
| 146 | + | |
| 147 | + | |
| 148 | + | |
| 149 | + | |
| 150 | + | |
| 151 | + | |
| 152 | + | |
| 153 | + | |
| 154 | + | |
| 155 | + | |
| 156 | + | |
| 157 | + | |
| 158 | + | |
| 159 | + | |
| 160 | + | |
| 161 | + | |
| 162 | + | |
| 163 | + | |
| 164 | + | |
| 165 | + | |
| 166 | + | |
| 167 | + | |
| 168 | + | |
| 169 | + | |
| 170 | + | |
| 171 | + | |
| 172 | + | |
| 173 | + | |
| 174 | + | |
| 175 | + | |
| 176 | + | |
| 177 | + | |
| 178 | + | |
| 179 | + | |
| 180 | + | |
| 181 | + | |
| 182 | + | |
| 183 | + | |
| 184 | + | |
| 185 | + | |
| 186 | + | |
| 187 | + | |
| 188 | + | |
| 189 | + | |
| 190 | + | |
| 191 | + | |
| 192 | + | |
| 193 | + | |
| 194 | + | |
| 195 | + | |
| 196 | + | |
| 197 | + | |
| 198 | + | |
| 199 | + | |
| 200 | + | |
| 201 | + | |
| 202 | + | |
| 203 | + | |
| 204 | + | |
| 205 | + | |
| 206 | + | |
| 207 | + | |
| 208 | + | |
| 209 | + | |
| 210 | + | |
| 211 | + | |
| 212 | + | |
| 213 | + | |
| 214 | + | |
| 215 | + | |
| 216 | + | |
| 217 | + | |
| 218 | + | |
| 219 | + | |
| 220 | + | |
| 221 | + | |
| 222 | + | |
| 223 | + | |
| 224 | + | |
| 225 | + | |
| 226 | + | |
| 227 | + | |
| 228 | + | |
| 229 | + | |
| 230 | + | |
| 231 | + | |
| 232 | + | |
| 233 | + | |
| 234 | + | |
| 235 | + | |
| 236 | + | |
95 | 237 | | |
96 | 238 | | |
97 | 239 | | |
| |||
1065 | 1207 | | |
1066 | 1208 | | |
1067 | 1209 | | |
| 1210 | + | |
| 1211 | + | |
| 1212 | + | |
| 1213 | + | |
| 1214 | + | |
| 1215 | + | |
| 1216 | + | |
| 1217 | + | |
| 1218 | + | |
| 1219 | + | |
| 1220 | + | |
| 1221 | + | |
| 1222 | + | |
| 1223 | + | |
| 1224 | + | |
| 1225 | + | |
| 1226 | + | |
1068 | 1227 | | |
1069 | 1228 | | |
1070 | 1229 | | |
| |||
1073 | 1232 | | |
1074 | 1233 | | |
1075 | 1234 | | |
| 1235 | + | |
| 1236 | + | |
| 1237 | + | |
| 1238 | + | |
| 1239 | + | |
1076 | 1240 | | |
1077 | 1241 | | |
1078 | 1242 | | |
| |||
1150 | 1314 | | |
1151 | 1315 | | |
1152 | 1316 | | |
| 1317 | + | |
| 1318 | + | |
| 1319 | + | |
| 1320 | + | |
| 1321 | + | |
| 1322 | + | |
| 1323 | + | |
| 1324 | + | |
| 1325 | + | |
1153 | 1326 | | |
1154 | 1327 | | |
1155 | 1328 | | |
| |||
1158 | 1331 | | |
1159 | 1332 | | |
1160 | 1333 | | |
1161 | | - | |
1162 | | - | |
| 1334 | + | |
1163 | 1335 | | |
1164 | 1336 | | |
1165 | 1337 | | |
1166 | | - | |
| 1338 | + | |
| 1339 | + | |
| 1340 | + | |
| 1341 | + | |
| 1342 | + | |
| 1343 | + | |
| 1344 | + | |
| 1345 | + | |
| 1346 | + | |
| 1347 | + | |
1167 | 1348 | | |
1168 | 1349 | | |
1169 | 1350 | | |
1170 | 1351 | | |
1171 | 1352 | | |
1172 | 1353 | | |
1173 | 1354 | | |
| 1355 | + | |
| 1356 | + | |
| 1357 | + | |
| 1358 | + | |
| 1359 | + | |
| 1360 | + | |
1174 | 1361 | | |
1175 | 1362 | | |
1176 | 1363 | | |
| |||
1265 | 1452 | | |
1266 | 1453 | | |
1267 | 1454 | | |
1268 | | - | |
1269 | | - | |
1270 | | - | |
1271 | | - | |
| 1455 | + | |
| 1456 | + | |
| 1457 | + | |
| 1458 | + | |
| 1459 | + | |
| 1460 | + | |
| 1461 | + | |
| 1462 | + | |
| 1463 | + | |
| 1464 | + | |
| 1465 | + | |
1272 | 1466 | | |
1273 | | - | |
| 1467 | + | |
1274 | 1468 | | |
1275 | 1469 | | |
1276 | | - | |
1277 | | - | |
1278 | | - | |
| 1470 | + | |
| 1471 | + | |
| 1472 | + | |
| 1473 | + | |
1279 | 1474 | | |
1280 | 1475 | | |
1281 | 1476 | | |
| |||
1470 | 1665 | | |
1471 | 1666 | | |
1472 | 1667 | | |
| 1668 | + | |
| 1669 | + | |
| 1670 | + | |
| 1671 | + | |
| 1672 | + | |
| 1673 | + | |
1473 | 1674 | | |
1474 | 1675 | | |
1475 | 1676 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
46 | 46 | | |
47 | 47 | | |
48 | 48 | | |
| 49 | + | |
| 50 | + | |
| 51 | + | |
| 52 | + | |
| 53 | + | |
| 54 | + | |
| 55 | + | |
| 56 | + | |
| 57 | + | |
| 58 | + | |
| 59 | + | |
| 60 | + | |
| 61 | + | |
| 62 | + | |
| 63 | + | |
| 64 | + | |
| 65 | + | |
| 66 | + | |
| 67 | + | |
| 68 | + | |
| 69 | + | |
| 70 | + | |
49 | 71 | | |
50 | 72 | | |
51 | 73 | | |
| |||
193 | 215 | | |
194 | 216 | | |
195 | 217 | | |
| 218 | + | |
| 219 | + | |
| 220 | + | |
| 221 | + | |
| 222 | + | |
| 223 | + | |
| 224 | + | |
| 225 | + | |
| 226 | + | |
| 227 | + | |
| 228 | + | |
| 229 | + | |
| 230 | + | |
196 | 231 | | |
197 | 232 | | |
198 | 233 | | |
0 commit comments