Commit d36b9eb
committed
feat: cross-platform force-kill primitive for stuck PHP threads
Introduces a self-contained primitive that unblocks a PHP thread stuck
in a blocking call (sleep, synchronous I/O, etc.) so the graceful drain
used by RestartWorkers, DrainWorkers, and Shutdown makes progress
instead of hanging for the duration of the block. The primitive is
useful on its own and gives follow-up graceful-shutdown work a reviewed
foundation to build on.
Design: each PHP thread, at boot from its own TSRM context, hands a
force_kill_slot (pointers to its own EG(vm_interrupt) and EG(timed_out)
atomic bools, plus pthread_t / Windows HANDLE for the wake-up) back to
Go via go_frankenphp_store_force_kill_slot. The slot lives on phpThread
and is protected by a per-thread RWMutex so the zero-and-release path
at thread exit cannot race an in-flight kill. From any goroutine, Go
passes the slot back to frankenphp_force_kill_thread, which stores true
into both bools (waking the VM at the next opcode boundary and routing
through zend_timeout -> "Maximum execution time exceeded") and then
delivers a platform-specific wake-up:
- Linux/FreeBSD: pthread_kill(SIGRTMIN+3) with a no-op handler installed
via pthread_once, SA_ONSTACK, no SA_RESTART. Signal delivery causes
any in-flight blocking syscall to return EINTR.
- Windows: CancelSynchronousIo + QueueUserAPC covers alertable I/O and
SleepEx. Non-alertable Sleep (including PHP's usleep) stays
uninterruptible.
- macOS: atomic-bool-only path. Threads stuck in blocking syscalls wait
to return on their own.
JIT caveat: under the OPcache JIT some hot code paths skip vm_interrupt
checks (see php-src#21267), so a pure-PHP busy loop under JIT may not
observe the store and will fall through to the abandon path below.
Drain flow:
- worker.go: drainWorkerThreads waits drainGracePeriod (5s) for each
drained thread to reach Yielding; then arms force-kill on stragglers
and waits forceKillDeadline (5s) more. Threads still stuck past that
are abandoned rather than hanging the drain forever.
- drainWorkerThreads returns (drained, abandoned). RestartWorkers puts
drained threads back to Ready and abandoned ones into the new
state.Abandoned (handlers treat it like ShuttingDown on next
callback) so an abandoned thread that finally unwinds exits instead
of re-entering the serve loop under stale request state. If any were
abandoned, RestartWorkers returns errIncompleteRestart wrapped with
abandoned/restarted counts - admin endpoint and watcher surface it.
- phpthread.go: phpThread.shutdown mirrors the same grace + force-kill
+ abandon pattern so Shutdown cannot hang on an uninterruptible
blocking call either.
Lifecycle hardening: Shutdown intentionally leaves phpThreads allocated
and thread_metrics alive - an abandoned thread that eventually unwinds
still calls through the SAPI and lifecycle callbacks which index those
structures. initPHPThreads blocks on a package-level sync.WaitGroup
(Add on every php_thread entry, Done on every exit path) so the next
Init cycle cannot reassign them out from under a lingering abandoned
thread, then frees the previous allocation inside
frankenphp_init_thread_metrics before allocating fresh. A dedicated
C-side atomic (shutdown_in_progress, toggled by
frankenphp_set_shutdown_in_progress) is the signal the unhealthy-thread
restart path uses to refuse respawning past Shutdown.
- go_frankenphp_store_force_kill_slot / clear_force_kill_slot /
on_thread_shutdown: take the per-thread write lock; clear runs before
ts_free_thread on both healthy and unhealthy exit paths so the
captured &EG() pointers are zeroed before their backing storage is
freed. A threadForLateCallback helper guards callbacks against
phpThreads races anyway, belt-and-suspenders.
- php_thread unblocks FRANKENPHP_KILL_SIGNAL with pthread_sigmask at
startup so Go's runtime signal mask cannot silently drop deliveries.
- worker_test.go + testdata/worker-sleep.php: the regression test
drives the full path via a request marker file so it only arms
RestartWorkers once the worker is proven parked in sleep(), then
asserts both the bounded elapsed time and that the "should not
reach" line after sleep never runs (which would indicate the VM
interrupt was never observed).
RestartWorkers now returns an error - a source-compatible Go API change
(callers that ignored it still compile) but worth noting for embedders.1 parent a05e6dd commit d36b9eb
15 files changed
Lines changed: 758 additions & 43 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
39 | 39 | | |
40 | 40 | | |
41 | 41 | | |
42 | | - | |
| 42 | + | |
| 43 | + | |
| 44 | + | |
| 45 | + | |
| 46 | + | |
| 47 | + | |
| 48 | + | |
43 | 49 | | |
44 | 50 | | |
45 | 51 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
92 | 92 | | |
93 | 93 | | |
94 | 94 | | |
| 95 | + | |
| 96 | + | |
| 97 | + | |
| 98 | + | |
| 99 | + | |
| 100 | + | |
| 101 | + | |
| 102 | + | |
| 103 | + | |
| 104 | + | |
| 105 | + | |
| 106 | + | |
| 107 | + | |
| 108 | + | |
| 109 | + | |
| 110 | + | |
| 111 | + | |
| 112 | + | |
| 113 | + | |
| 114 | + | |
| 115 | + | |
| 116 | + | |
| 117 | + | |
| 118 | + | |
| 119 | + | |
| 120 | + | |
| 121 | + | |
| 122 | + | |
| 123 | + | |
| 124 | + | |
| 125 | + | |
| 126 | + | |
| 127 | + | |
| 128 | + | |
| 129 | + | |
| 130 | + | |
| 131 | + | |
| 132 | + | |
| 133 | + | |
| 134 | + | |
| 135 | + | |
| 136 | + | |
| 137 | + | |
| 138 | + | |
| 139 | + | |
| 140 | + | |
| 141 | + | |
| 142 | + | |
| 143 | + | |
| 144 | + | |
| 145 | + | |
| 146 | + | |
| 147 | + | |
| 148 | + | |
| 149 | + | |
| 150 | + | |
| 151 | + | |
| 152 | + | |
| 153 | + | |
| 154 | + | |
| 155 | + | |
| 156 | + | |
| 157 | + | |
| 158 | + | |
| 159 | + | |
| 160 | + | |
| 161 | + | |
| 162 | + | |
| 163 | + | |
| 164 | + | |
| 165 | + | |
| 166 | + | |
| 167 | + | |
| 168 | + | |
| 169 | + | |
| 170 | + | |
| 171 | + | |
| 172 | + | |
| 173 | + | |
| 174 | + | |
| 175 | + | |
| 176 | + | |
| 177 | + | |
| 178 | + | |
| 179 | + | |
| 180 | + | |
| 181 | + | |
| 182 | + | |
| 183 | + | |
| 184 | + | |
| 185 | + | |
| 186 | + | |
| 187 | + | |
| 188 | + | |
| 189 | + | |
| 190 | + | |
| 191 | + | |
| 192 | + | |
| 193 | + | |
| 194 | + | |
| 195 | + | |
| 196 | + | |
| 197 | + | |
| 198 | + | |
| 199 | + | |
| 200 | + | |
| 201 | + | |
| 202 | + | |
| 203 | + | |
| 204 | + | |
| 205 | + | |
| 206 | + | |
| 207 | + | |
| 208 | + | |
| 209 | + | |
| 210 | + | |
| 211 | + | |
| 212 | + | |
| 213 | + | |
| 214 | + | |
| 215 | + | |
| 216 | + | |
| 217 | + | |
| 218 | + | |
| 219 | + | |
| 220 | + | |
| 221 | + | |
| 222 | + | |
| 223 | + | |
| 224 | + | |
| 225 | + | |
| 226 | + | |
| 227 | + | |
| 228 | + | |
| 229 | + | |
| 230 | + | |
| 231 | + | |
| 232 | + | |
| 233 | + | |
| 234 | + | |
| 235 | + | |
| 236 | + | |
95 | 237 | | |
96 | 238 | | |
97 | 239 | | |
| |||
1065 | 1207 | | |
1066 | 1208 | | |
1067 | 1209 | | |
| 1210 | + | |
| 1211 | + | |
| 1212 | + | |
| 1213 | + | |
| 1214 | + | |
| 1215 | + | |
| 1216 | + | |
| 1217 | + | |
| 1218 | + | |
| 1219 | + | |
| 1220 | + | |
| 1221 | + | |
| 1222 | + | |
| 1223 | + | |
| 1224 | + | |
| 1225 | + | |
| 1226 | + | |
1068 | 1227 | | |
1069 | 1228 | | |
1070 | 1229 | | |
| |||
1073 | 1232 | | |
1074 | 1233 | | |
1075 | 1234 | | |
| 1235 | + | |
| 1236 | + | |
| 1237 | + | |
| 1238 | + | |
| 1239 | + | |
1076 | 1240 | | |
1077 | 1241 | | |
1078 | 1242 | | |
| |||
1150 | 1314 | | |
1151 | 1315 | | |
1152 | 1316 | | |
| 1317 | + | |
| 1318 | + | |
| 1319 | + | |
| 1320 | + | |
| 1321 | + | |
| 1322 | + | |
| 1323 | + | |
| 1324 | + | |
| 1325 | + | |
1153 | 1326 | | |
1154 | 1327 | | |
1155 | 1328 | | |
| |||
1158 | 1331 | | |
1159 | 1332 | | |
1160 | 1333 | | |
1161 | | - | |
1162 | | - | |
| 1334 | + | |
1163 | 1335 | | |
1164 | 1336 | | |
1165 | 1337 | | |
1166 | | - | |
| 1338 | + | |
| 1339 | + | |
| 1340 | + | |
| 1341 | + | |
| 1342 | + | |
| 1343 | + | |
| 1344 | + | |
| 1345 | + | |
| 1346 | + | |
| 1347 | + | |
1167 | 1348 | | |
1168 | 1349 | | |
1169 | 1350 | | |
1170 | 1351 | | |
1171 | 1352 | | |
1172 | 1353 | | |
1173 | 1354 | | |
| 1355 | + | |
| 1356 | + | |
| 1357 | + | |
| 1358 | + | |
| 1359 | + | |
| 1360 | + | |
1174 | 1361 | | |
1175 | 1362 | | |
1176 | 1363 | | |
| |||
1265 | 1452 | | |
1266 | 1453 | | |
1267 | 1454 | | |
1268 | | - | |
1269 | | - | |
1270 | | - | |
1271 | | - | |
| 1455 | + | |
| 1456 | + | |
| 1457 | + | |
| 1458 | + | |
| 1459 | + | |
| 1460 | + | |
| 1461 | + | |
| 1462 | + | |
| 1463 | + | |
| 1464 | + | |
| 1465 | + | |
1272 | 1466 | | |
1273 | | - | |
| 1467 | + | |
1274 | 1468 | | |
1275 | 1469 | | |
1276 | | - | |
1277 | | - | |
1278 | | - | |
| 1470 | + | |
| 1471 | + | |
| 1472 | + | |
| 1473 | + | |
1279 | 1474 | | |
1280 | 1475 | | |
1281 | 1476 | | |
| |||
1470 | 1665 | | |
1471 | 1666 | | |
1472 | 1667 | | |
| 1668 | + | |
| 1669 | + | |
| 1670 | + | |
| 1671 | + | |
| 1672 | + | |
| 1673 | + | |
1473 | 1674 | | |
1474 | 1675 | | |
1475 | 1676 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
46 | 46 | | |
47 | 47 | | |
48 | 48 | | |
| 49 | + | |
| 50 | + | |
| 51 | + | |
| 52 | + | |
| 53 | + | |
| 54 | + | |
| 55 | + | |
| 56 | + | |
| 57 | + | |
| 58 | + | |
| 59 | + | |
| 60 | + | |
| 61 | + | |
| 62 | + | |
| 63 | + | |
| 64 | + | |
| 65 | + | |
| 66 | + | |
| 67 | + | |
| 68 | + | |
| 69 | + | |
| 70 | + | |
49 | 71 | | |
50 | 72 | | |
51 | 73 | | |
| |||
193 | 215 | | |
194 | 216 | | |
195 | 217 | | |
| 218 | + | |
| 219 | + | |
| 220 | + | |
| 221 | + | |
| 222 | + | |
| 223 | + | |
| 224 | + | |
| 225 | + | |
| 226 | + | |
| 227 | + | |
| 228 | + | |
| 229 | + | |
| 230 | + | |
196 | 231 | | |
197 | 232 | | |
198 | 233 | | |
0 commit comments