[🐛 BUG]: SIGSEGV in PHP worker causes full activity pool restart instead of single worker replacement #2335

@koren88

No duplicates 🥲.

  • I have searched for a similar issue in our bug tracker and didn't find any solutions.

What happened?

When a PHP worker process receives SIGSEGV (e.g. via posix_kill(getmypid(), SIGSEGV) inside activity code), RoadRunner does not isolate the fault to that single worker. Instead, the entire activity pool is restarted — all 5 sibling workers are killed and replaced.
This is analogous to issue temporalio/roadrunner-temporal#699 which reported missing graceful handling for SIGQUIT/SIGUSR2. In that case the signal bypassed proper shutdown. Here the issue is that a fatal signal on one worker propagates to the whole pool, causing unnecessary disruption to in-flight activities on unaffected workers.
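The isolation expected above can be sketched as a toy supervisor. This is a hypothetical illustration, not RoadRunner's implementation: `spawnWorker()` and `runPoolOnce()` are made-up names, the "workers" are plain forked PHP processes, and the sketch assumes the CLI SAPI with the pcntl and posix extensions loaded. When one child dies from a signal, only its slot is refilled and the remaining PIDs stay the same.

```php
<?php
declare(strict_types=1);

// Hypothetical sketch of per-worker fault isolation (not RoadRunner code).
// Assumes PHP CLI with the pcntl and posix extensions.

// Fork one "worker". If $crash is true the child sends itself SIGSEGV.
function spawnWorker(bool $crash): int
{
    $pid = pcntl_fork();
    if ($pid === -1) {
        throw new RuntimeException('fork failed');
    }
    if ($pid === 0) {
        if ($crash) {
            posix_kill(getmypid(), SIGSEGV);
            exit(1); // only reached if SIGSEGV did not terminate the child
        }
        while (true) {
            sleep(1); // healthy worker: idle until the supervisor stops it
        }
    }
    return $pid;
}

// Start $size workers (slot 0 crashes), replace only the worker that died,
// then stop everything and return the PID lists for inspection.
function runPoolOnce(int $size): array
{
    $before = [];
    for ($i = 0; $i < $size; $i++) {
        $before[] = spawnWorker($i === 0);
    }

    $dead = pcntl_wait($status); // blocks until any child terminates
    $after = $before;
    $slot = array_search($dead, $before, true);
    if ($slot !== false) {
        $after[$slot] = spawnWorker(false); // refill only the failed slot
    }

    foreach ($after as $pid) { // clean up the demo processes
        posix_kill($pid, SIGKILL);
        pcntl_waitpid($pid, $ignored);
    }

    return ['before' => $before, 'after' => $after, 'replaced' => $dead];
}

$result = runPoolOnce(5);
$unchanged = count(array_intersect($result['before'], $result['after']));
echo "replaced PID {$result['replaced']}, {$unchanged} workers untouched\n";
```

In the behaviour reported here, the equivalent of `$unchanged` is 0 for the activity pool: every PID in the pool changes after a single worker crash.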

Version (rr --version)

Plugin / Component: roadrunner-temporal / activity pool
RoadRunner version: 2025.1.8 (buildtime: 2026-02-19)

How to reproduce the issue?

Add the following anywhere inside activity code:
posix_kill(getmypid(), SIGSEGV);
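To confirm outside RoadRunner that this call terminates only the calling process, a minimal standalone sketch can fork a child, run the same call in it, and inspect the wait status in the parent. The helper name `crashChildWithSegv()` is hypothetical, and the script assumes the PHP CLI with the pcntl and posix extensions and the default SIGSEGV disposition.

```php
<?php
declare(strict_types=1);

// Standalone sketch, not RoadRunner code. Assumes PHP CLI with pcntl and posix.
// Returns the number of the signal that terminated the child, or 0 if the
// child exited normally.
function crashChildWithSegv(): int
{
    $pid = pcntl_fork();
    if ($pid === -1) {
        throw new RuntimeException('fork failed');
    }
    if ($pid === 0) {
        // Child: the same call as in the reproduction step.
        posix_kill(getmypid(), SIGSEGV);
        exit(0); // only reached if SIGSEGV did not terminate the child
    }

    // Parent: reap the child and decode how it ended.
    pcntl_waitpid($pid, $status);
    return pcntl_wifsignaled($status) ? pcntl_wtermsig($status) : 0;
}

$signal = crashChildWithSegv();
echo $signal === SIGSEGV
    ? "child terminated by SIGSEGV, parent still running\n"
    : "child was not terminated by SIGSEGV\n";
```

At the OS level the signal affects a single process, so any pool-wide restart is a decision made by the supervising side rather than a consequence of the signal itself.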

Relevant log output

[INFO] RoadRunner server started; version: 2025.1.8, buildtime: 2026-02-19T14:58:23+0000
WARN  RoadRunner can't communicate with the worker  {"reason": "worker hung or process was killed", "pid": 22, "internal_event_name": "EventWorkerError", "error": "sync_worker_receive_frame: Network: EOF"}
Activity error.  {"ActivityType": "AttributeTableIntegration.Process", "Attempt": 1, "Error": "activity_pool_execute_activity: Network:\n\tsync_worker_receive_frame: EOF"}
reset signal received, resetting activity pool
activity pool restarted

Worker table before (5 stable workers, PIDs 167–171; PID 115 is listed as well and is unchanged in both tables):
┌─────┬────────┬───────┬────────┬───────┬───────────────┐
│ PID │ STATUS │ EXECS │ MEMORY │ CPU % │    CREATED    │
├─────┼────────┼───────┼────────┼───────┼───────────────┤
│ 115 │ ready  │ 9     │ 169 MB │ 0.92  │ 6 minutes ago │
│ 167 │ ready  │ 0     │ 162 MB │ 1.10  │ 5 minutes ago │
│ 168 │ ready  │ 0     │ 162 MB │ 1.11  │ 5 minutes ago │
│ 169 │ ready  │ 0     │ 162 MB │ 1.10  │ 5 minutes ago │
│ 170 │ ready  │ 0     │ 162 MB │ 1.10  │ 5 minutes ago │
│ 171 │ ready  │ 0     │ 162 MB │ 1.11  │ 5 minutes ago │
└─────┴────────┴───────┴────────┴───────┴───────────────┘
Worker table after (all 5 workers replaced, PIDs 350–354, high CPU spike during init):
┌─────┬────────┬───────┬────────┬───────┬───────────────┐
│ PID │ STATUS │ EXECS │ MEMORY │ CPU % │    CREATED    │
├─────┼────────┼───────┼────────┼───────┼───────────────┤
│ 115 │ ready  │ 14    │ 169 MB │ 0.85  │ 6 minutes ago │
│ 350 │ ready  │ 0     │ 160 MB │ 37.75 │ 7 seconds ago │
│ 351 │ ready  │ 0     │ 162 MB │ 37.52 │ 7 seconds ago │
│ 352 │ ready  │ 0     │ 162 MB │ 37.86 │ 7 seconds ago │
│ 353 │ ready  │ 0     │ 162 MB │ 37.86 │ 7 seconds ago │
│ 354 │ ready  │ 0     │ 162 MB │ 37.52 │ 7 seconds ago │
└─────┴────────┴───────┴────────┴───────┴───────────────┘

Metadata

Labels

C-enhancement (Category: enhancement. Improvements to the current module, transport, etc.)

Projects

Status

📋 Backlog
