| 1 | +# opensandbox-supervisor |
| 2 | + |
| 3 | +A lightweight process supervisor that wraps a single worker with restart backoff, lifecycle hooks, a crashloop circuit breaker, and a structured event log. Designed to run as a container `ENTRYPOINT` or as a child of another process; it does not assume PID 1 and performs no zombie reaping. |
| 4 | + |
| 5 | +## Usage |
| 6 | + |
| 7 | +``` |
| 8 | +opensandbox-supervisor [flags] -- <worker-cmd> [worker-args...] |
| 9 | +``` |
| 10 | + |
| 11 | +Everything after `--` is the worker command. The supervisor starts the worker, monitors it, and restarts it on unexpected exits. |
| 12 | + |
| 13 | +### Example (egress sidecar) |
| 14 | + |
| 15 | +```dockerfile |
| 16 | +ENTRYPOINT ["/opt/opensandbox-egress/supervisor", \ |
| 17 | + "--pre-start=/opt/opensandbox-egress/cleanup.sh", \ |
| 18 | + "--name=egress", \ |
| 19 | + "--grace-period=20s", \ |
| 20 | + "--", \ |
| 21 | + "/opt/opensandbox-egress/egress"] |
| 22 | +``` |
| 23 | + |
| 24 | +## Flags |
| 25 | + |
| 26 | +| Flag | Default | Description | |
| 27 | +|------|---------|-------------| |
| 28 | +| `--pre-start` | _(none)_ | Executable to run before each worker launch (repeatable). No shell expansion; wrap in a script if needed. | |
| 29 | +| `--post-exit` | _(none)_ | Executable to run after each worker exit (repeatable). Receives `WORKER_*` env vars. Failures are logged, not fatal. | |
| 30 | +| `--event-log` | stderr | Path to JSONL event log file. Supports rotation via lumberjack. | |
| 31 | +| `--backoff-min` | `1s` | Minimum restart backoff. | |
| 32 | +| `--backoff-max` | `30s` | Maximum restart backoff (exponential growth capped here). | |
| 33 | +| `--backoff-jitter` | `0.1` | Jitter fraction (±10%). Set to `0` to disable. | |
| 34 | +| `--stable-after` | `60s` | Worker uptime after which backoff resets to minimum. | |
| 35 | +| `--burst-window` | `5m` | Sliding window for crashloop detection. | |
| 36 | +| `--burst-max` | `10` | Maximum launches allowed within `burst-window` before the breaker trips. | |
| 37 | +| `--on-burst-exit` | `true` | `true`: supervisor exits non-zero when burst budget trips (lets kubelet react). `false`: keep retrying indefinitely. | |
| 38 | +| `--grace-period` | `10s` | Time between SIGTERM and SIGKILL when shutting the worker down. | |
| 39 | +| `--pre-start-timeout` | `30s` | Timeout for each pre-start hook execution. | |
| 40 | +| `--post-exit-timeout` | `30s` | Timeout for each post-exit hook execution. | |
| 41 | +| `--name` | _(basename of worker cmd)_ | Worker name shown in logs and events. | |
| 42 | +| `--log-level` | `info` | Supervisor diagnostic log level (`debug`\|`info`\|`warn`\|`error`). | |
| 43 | + |
| 44 | +## Restart Behavior |
| 45 | + |
| 46 | +### Exponential Backoff |
| 47 | + |
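| 48 | +When the worker exits unexpectedly, the supervisor sleeps before restarting it. The delay starts at `backoff-min` and doubles after each consecutive failure until it reaches `backoff-max`; with the defaults: |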
| 49 | + |
| 50 | +``` |
| 51 | +1s → 2s → 4s → 8s → 16s → 30s → 30s → ... |
| 52 | +``` |
| 53 | + |
| 54 | +Each delay is perturbed by ±`backoff-jitter` (default ±10%) to avoid thundering herds. After the worker has been alive for at least `stable-after` (default 60 s), the backoff resets to `backoff-min`. |
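
The schedule above can be sketched in a few lines of Go. This is an illustrative sketch, not the supervisor's actual code: the names `baseDelay` and `withJitter` are hypothetical, and the jitter is assumed to be drawn uniformly from the ±`backoff-jitter` range.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// baseDelay returns the un-jittered delay before the given restart (0-based):
// minDelay, 2*minDelay, 4*minDelay, ... capped at maxDelay.
func baseDelay(restart int, minDelay, maxDelay time.Duration) time.Duration {
	d := minDelay
	// Stop doubling once the cap is reached, which also avoids overflow.
	for i := 0; i < restart && d < maxDelay; i++ {
		d *= 2
	}
	if d > maxDelay {
		d = maxDelay
	}
	return d
}

// withJitter scales d by a random factor in [1-frac, 1+frac].
// Assumption: uniform jitter. A frac of 0 returns d unchanged.
func withJitter(d time.Duration, frac float64, rng *rand.Rand) time.Duration {
	if frac <= 0 {
		return d
	}
	factor := 1 + frac*(2*rng.Float64()-1)
	return time.Duration(float64(d) * factor)
}

func main() {
	rng := rand.New(rand.NewSource(1))
	for restart := 0; restart < 7; restart++ {
		base := baseDelay(restart, time.Second, 30*time.Second)
		fmt.Printf("restart %d: base=%v jittered=%v\n",
			restart, base, withJitter(base, 0.1, rng))
	}
}
```

With the default flags, the base delays follow the documented 1s, 2s, 4s, 8s, 16s, 30s, 30s sequence; the jittered values vary from run to run unless the random source is seeded.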
| 55 | + |
| 56 | +### Crashloop Circuit Breaker |
| 57 | + |
| 58 | +A sliding-window counter tracks launches. If more than `burst-max` (default 10) launches occur within `burst-window` (default 5 min), the supervisor either: |
| 59 | + |
| 60 | +- **Exits non-zero** (`--on-burst-exit=true`, default) — surfacing the crashloop via Kubernetes pod status instead of silently retrying. |
| 61 | +- **Continues retrying** (`--on-burst-exit=false`) — for environments without an outer restart supervisor. |
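
A sliding-window launch counter of the kind described above might look like the following. This is a minimal sketch under stated assumptions: the `burstWindow` type and its `record` method are hypothetical, and launch times are tracked as elapsed time since the supervisor started rather than as wall-clock timestamps.

```go
package main

import (
	"fmt"
	"time"
)

// burstWindow is a hypothetical sliding-window launch counter.
type burstWindow struct {
	window   time.Duration   // corresponds to --burst-window
	max      int             // corresponds to --burst-max
	launches []time.Duration // launch times still inside the window, oldest first
}

// record notes a launch at the given elapsed time and reports whether the
// budget is exceeded, i.e. more than max launches fall inside the window.
func (b *burstWindow) record(now time.Duration) bool {
	cutoff := now - b.window
	// Drop launches that have slid out of the window.
	i := 0
	for i < len(b.launches) && b.launches[i] <= cutoff {
		i++
	}
	b.launches = append(b.launches[i:], now)
	return len(b.launches) > b.max
}

func main() {
	b := &burstWindow{window: 5 * time.Minute, max: 10}
	// Eleven launches ten seconds apart all land inside one window.
	for n := 1; n <= 11; n++ {
		tripped := b.record(time.Duration(n) * 10 * time.Second)
		fmt.Printf("launch %d tripped=%v\n", n, tripped)
	}
}
```

Whether the breaker then exits or keeps retrying is the decision controlled by `--on-burst-exit`; the counter itself only reports that the budget was exceeded.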
| 62 | + |
| 63 | +## Lifecycle Hooks |
| 64 | + |
| 65 | +### Pre-start hooks |
| 66 | + |
| 67 | +Run **before each worker launch**. A non-zero exit aborts that launch attempt and counts toward the crashloop budget. Use them for cleanup tasks such as cleaning up orphaned child processes left behind by a previous crash. |
| 68 | + |
| 69 | +### Post-exit hooks |
| 70 | + |
| 71 | +Run **after the worker has been reaped**. Failures are logged but do not block the restart loop. Post-exit hooks run to completion even during shutdown (bounded by `--post-exit-timeout`) so cleanup paths are not aborted. |
| 72 | + |
| 73 | +Post-exit hooks receive these environment variables: |
| 74 | + |
| 75 | +| Variable | Description | |
| 76 | +|----------|-------------| |
| 77 | +| `WORKER_EXIT_CODE` | Worker's exit code (`-1` if not available) | |
| 78 | +| `WORKER_SIGNAL` | Signal name if worker was signaled (e.g. `terminated`, `killed`) | |
| 79 | +| `WORKER_DURATION_MS` | Wall-clock worker runtime in milliseconds | |
| 80 | +| `WORKER_PID` | Worker's PID | |
| 81 | +| `WORKER_ATTEMPT` | Launch attempt number (1-based) | |
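
As an illustration, a post-exit hook written in Go could read these variables as follows. This is a sketch only: the `workerExit` type and `parseWorkerExit` function are hypothetical, and it assumes every numeric variable is set to a decimal integer.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// workerExit holds the documented WORKER_* values in typed form.
type workerExit struct {
	ExitCode   int    // -1 if not available
	Signal     string // empty if the worker was not signaled
	DurationMS int64
	PID        int
	Attempt    int
}

// parseWorkerExit reads the WORKER_* variables through getenv
// (os.Getenv in a real hook; injectable so it can be tested).
func parseWorkerExit(getenv func(string) string) (workerExit, error) {
	var firstErr error
	// num parses one integer variable and remembers the first failure.
	num := func(name string) int64 {
		v, err := strconv.ParseInt(getenv(name), 10, 64)
		if err != nil && firstErr == nil {
			firstErr = fmt.Errorf("%s: %w", name, err)
		}
		return v
	}
	w := workerExit{
		ExitCode:   int(num("WORKER_EXIT_CODE")),
		Signal:     getenv("WORKER_SIGNAL"),
		DurationMS: num("WORKER_DURATION_MS"),
		PID:        int(num("WORKER_PID")),
		Attempt:    int(num("WORKER_ATTEMPT")),
	}
	return w, firstErr
}

func main() {
	w, err := parseWorkerExit(os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, "post-exit hook:", err)
		os.Exit(1)
	}
	fmt.Printf("attempt %d (pid %d) ran %dms, exit_code=%d signal=%q\n",
		w.Attempt, w.PID, w.DurationMS, w.ExitCode, w.Signal)
}
```

Because hook failures are logged but not fatal, exiting non-zero on a parse error only records the problem; it does not stop the restart loop.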
| 82 | + |
| 83 | +## Graceful Shutdown |
| 84 | + |
| 85 | +On context cancellation (typically from `SIGTERM` or `SIGINT`): |
| 86 | + |
| 87 | +1. Supervisor sends `SIGTERM` to the worker. |
| 88 | +2. Waits up to `--grace-period` for the worker to exit on its own. |
| 89 | +3. Sends `SIGKILL` if the worker does not exit in time. |
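
The three steps above correspond to a common Go pattern, sketched here for Unix systems. The helper names `stopWorker` and `runAndStop` are hypothetical and the sketch is not taken from the supervisor's source; it assumes a `sleep` binary is available for the demonstration.

```go
package main

import (
	"fmt"
	"os/exec"
	"syscall"
	"time"
)

// stopWorker sends SIGTERM to a started command and escalates to SIGKILL
// if it is still running after grace. It returns the error from cmd.Wait.
func stopWorker(cmd *exec.Cmd, grace time.Duration) error {
	done := make(chan error, 1)
	go func() { done <- cmd.Wait() }()

	_ = cmd.Process.Signal(syscall.SIGTERM) // step 1: ask politely
	select {
	case err := <-done: // step 2: worker exited within the grace period
		return err
	case <-time.After(grace): // step 3: grace period elapsed
		_ = cmd.Process.Kill()
		return <-done
	}
}

// runAndStop starts argv in its own process group, stops it immediately,
// and returns the name of the signal that ended it ("" if it exited normally).
func runAndStop(grace time.Duration, argv ...string) (string, error) {
	cmd := exec.Command(argv[0], argv[1:]...)
	cmd.SysProcAttr = &syscall.SysProcAttr{Setpgid: true}
	if err := cmd.Start(); err != nil {
		return "", err
	}
	// A signal-induced exit surfaces as an *exec.ExitError, so the
	// process state is inspected instead of the returned error.
	_ = stopWorker(cmd, grace)
	if ps := cmd.ProcessState; ps != nil {
		if ws, ok := ps.Sys().(syscall.WaitStatus); ok && ws.Signaled() {
			return ws.Signal().String(), nil
		}
	}
	return "", nil
}

func main() {
	sig, err := runAndStop(2*time.Second, "sleep", "30")
	fmt.Printf("signal=%q err=%v\n", sig, err)
}
```

The signal names Go reports here (`terminated`, `killed`) are the same strings listed for `WORKER_SIGNAL`.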
| 90 | + |
| 91 | +### Signal Handling |
| 92 | + |
| 93 | +- The supervisor does **not** call `signal.Notify` itself; the caller (e.g. `cmd/supervisor/main.go`) translates OS signals into context cancellation. |
| 94 | +- `SIGINT` and `SIGTERM` both result in `SIGTERM` to the worker. |
| 95 | +- Other signals (`SIGHUP`, `SIGUSR1`, etc.) are **not forwarded**. Add forwarding in the caller if the worker needs them. |
| 96 | + |
| 97 | +### Process Group Isolation |
| 98 | + |
| 99 | +On Unix, the worker is started in its own process group (`Setpgid: true` in the command's `SysProcAttr`), so signals delivered to the supervisor's process group, such as a terminal Ctrl-C, do not also reach the worker directly. The supervisor signals the worker explicitly via its PID. |
| 100 | + |
| 101 | +## Structured Event Log |
| 102 | + |
| 103 | +One JSONL record per lifecycle event, written to stderr by default or to the file specified by `--event-log` (with automatic rotation). |
| 104 | + |
| 105 | +### Event Kinds |
| 106 | + |
| 107 | +| Event | When | Key Fields | |
| 108 | +|-------|------|------------| |
| 109 | +| `start` | Worker process launched | `pid`, `gen`, `attempt` | |
| 110 | +| `exit` | Worker exited | `pid`, `gen`, `attempt`, `exit_code`, `signal`, `duration_ms`, `reason` | |
| 111 | +| `prestart` | Pre-start hook ran | `hook`, `exit_code`, `duration_ms` | |
| 112 | +| `postexit` | Post-exit hook ran | `hook`, `exit_code`, `duration_ms` | |
| 113 | +| `backoff` | Sleeping before next restart | `sleep_ms`, `next_attempt` | |
| 114 | +| `stable` | Worker uptime exceeded `stable-after`; backoff reset | `pid`, `gen`, `duration_ms`, `reset_backoff` | |
| 115 | +| `burst_exit` | Crashloop budget exceeded | `attempts`, `window` | |
| 116 | +| `shutdown` | Supervisor shutting down | `reason` | |
| 117 | + |
| 118 | +### Example Events |
| 119 | + |
| 120 | +```jsonl |
| 121 | +{"ts":"2026-01-15T10:30:00Z","name":"egress","event":"start","pid":42,"gen":1,"attempt":1} |
| 122 | +{"ts":"2026-01-15T10:30:00.15Z","name":"egress","event":"exit","pid":42,"gen":1,"attempt":1,"exit_code":1,"duration_ms":150,"reason":"crashed"} |
| 123 | +{"ts":"2026-01-15T10:30:00.15Z","name":"egress","event":"backoff","sleep_ms":1000,"next_attempt":2} |
| 124 | +{"ts":"2026-01-15T10:30:01.15Z","name":"egress","event":"prestart","hook":"cleanup.sh","exit_code":0,"duration_ms":50} |
| 125 | +{"ts":"2026-01-15T10:30:01.2Z","name":"egress","event":"start","pid":43,"gen":2,"attempt":2} |
| 126 | +``` |
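
Because each record is a standalone JSON object, the log is easy to post-process. The sketch below tallies `exit` events by `reason`; the `event` struct and `countExits` function are hypothetical and cover only a subset of the documented fields.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

// event mirrors a subset of the documented event fields.
type event struct {
	TS         string `json:"ts"`
	Name       string `json:"name"`
	Event      string `json:"event"`
	Attempt    int    `json:"attempt"`
	DurationMS int64  `json:"duration_ms"`
	Reason     string `json:"reason"`
}

// countExits tallies exit events by reason from JSONL text.
func countExits(jsonl string) (map[string]int, error) {
	counts := map[string]int{}
	sc := bufio.NewScanner(strings.NewReader(jsonl))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" {
			continue // tolerate blank lines
		}
		var e event
		if err := json.Unmarshal([]byte(line), &e); err != nil {
			return nil, fmt.Errorf("bad event line: %w", err)
		}
		if e.Event == "exit" {
			counts[e.Reason]++
		}
	}
	return counts, sc.Err()
}

func main() {
	sample := `{"ts":"2026-01-15T10:30:00Z","name":"egress","event":"start","pid":42,"gen":1,"attempt":1}
{"ts":"2026-01-15T10:30:00.15Z","name":"egress","event":"exit","pid":42,"gen":1,"attempt":1,"exit_code":1,"duration_ms":150,"reason":"crashed"}`
	counts, err := countExits(sample)
	fmt.Println(counts, err)
}
```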
| 127 | + |
| 128 | +### Exit Reasons |
| 129 | + |
| 130 | +| Reason | Meaning | |
| 131 | +|--------|---------| |
| 132 | +| `exited` | Worker exited with code 0 | |
| 133 | +| `crashed` | Worker exited with non-zero code | |
| 134 | +| `signaled` | Worker killed by signal | |
| 135 | +| `shutdown` | Supervisor-initiated stop (context cancelled) | |
| 136 | +| `launch_failed` | Worker binary could not be started | |
| 137 | +| `no_processstate` | Unexpected: no process state available | |
| 138 | + |
| 139 | +## Library Usage |
| 140 | + |
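| 141 | +The `internal/supervisor` package can also be used programmatically. Because it lives under `internal/`, Go only allows it to be imported from code rooted at the parent of that directory (here, `github.com/alibaba/opensandbox/...`): |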
| 142 | + |
```go
package main

import (
	"context"
	"log"
	"os/signal"
	"syscall"
	"time"

	"github.com/alibaba/opensandbox/internal/supervisor"
)

func main() {
	spec := supervisor.Spec{
		Name:        "my-worker",
		Cmd:         "/usr/local/bin/worker",
		Args:        []string{"--config", "/etc/worker.toml"},
		PreStart:    []supervisor.Hook{{Argv: []string{"/usr/local/bin/cleanup.sh"}}},
		BackoffMin:  time.Second,
		BackoffMax:  30 * time.Second,
		GracePeriod: 15 * time.Second,
	}

	// The caller owns signal handling: SIGINT and SIGTERM cancel ctx.
	ctx, cancel := signal.NotifyContext(context.Background(), syscall.SIGINT, syscall.SIGTERM)
	defer cancel()

	// Run blocks until ctx is cancelled or the crashloop breaker trips.
	if err := supervisor.Run(ctx, spec); err != nil {
		log.Printf("supervisor stopped: %v", err)
	}
}
```
| 161 | + |
| 162 | +`Run` blocks until the context is cancelled or the crashloop breaker trips (`ErrBurstExceeded`). Zero-valued fields receive the defaults listed in the Flags table above. |