fix(dispatcher): use gateway HTTP API for isolated-session dispatch #11

Open: amittell wants to merge 1 commit into main from fix/isolated-session-non-spawn-killing-dispatch

amittell (Owner) commented Jun 2, 2026

Problem

The previous dispatch primitive for session_target=isolated cron jobs forked a sibling openclaw CLI process to spawn the isolated session. In production on rh-bot.lan, that child inherited the listening socket on port 18789 from the launchd-tracked gateway parent, and the resulting SIGTERM cascade killed the parent on every cycle. Each cycle left an orphaned node process bound to the port and the gateway offline for hours.

The internal memory note project_rh_bot_isolated_session_sigkill records roughly 30 SIGTERM events per week traceable to this dispatch path.

Solution

This change names and pins the sanctioned isolated-session dispatch primitive in the scheduler:

  1. gateway.js gains an exported ISOLATED_DISPATCH_PRIMITIVE contract marker plus a runIsolatedAgentTurn helper. The helper is a thin wrapper around the existing runAgentTurnWithActivityTimeout so the same HTTP /v1/chat/completions call site backs both names. The wrapper gives reviewers a single grep target for auditing the no-fork invariant.

  2. dispatcher.js exposes runIsolatedAgentTurn in the dispatch deps bag.

  3. dispatcher-strategies.js executeAgent (the strategy that handles session_target=isolated cron jobs) now routes through runIsolatedAgentTurn, with a fallback to the legacy runAgentTurnWithActivityTimeout name so the deps wiring tolerates older callers and tests.
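The three steps above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the marker value, the option names (`message`, `sessionTarget`), the `job.prompt` field, the `resolveIsolatedTurn` helper, and the stand-in body of runAgentTurnWithActivityTimeout are all assumptions made for the example.

```javascript
// Hypothetical sketch; the real gateway.js and dispatcher-strategies.js may differ.

// Contract marker reviewers can grep for (the value here is an assumption).
const ISOLATED_DISPATCH_PRIMITIVE = 'gateway-http-chat-completions';

// Stand-in for the existing helper. The real one issues the HTTP
// /v1/chat/completions call; this stub only echoes its input.
async function runAgentTurnWithActivityTimeout(opts) {
  return { content: `echo: ${opts.message}` };
}

// Thin wrapper: same call site, new auditable name.
async function runIsolatedAgentTurn(opts) {
  return runAgentTurnWithActivityTimeout(opts);
}

// Strategy-side lookup: prefer the new name, fall back to the legacy one
// so older deps bags and tests keep working.
function resolveIsolatedTurn(deps) {
  const fn = deps.runIsolatedAgentTurn ?? deps.runAgentTurnWithActivityTimeout;
  if (typeof fn !== 'function') {
    throw new TypeError('dispatch deps provide no isolated agent-turn helper');
  }
  return fn;
}

// Simplified executeAgent: no child_process call anywhere on this path.
async function executeAgent(job, deps) {
  const runTurn = resolveIsolatedTurn(deps);
  return runTurn({ message: job.prompt, sessionTarget: job.session_target });
}
```

Because the wrapper only delegates, a deps bag that supplies either name produces the same result, which is what lets the fallback stay invisible to existing callers.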

Net runtime effect: every isolated cron dispatch reaches the gateway via the public HTTP API only, inside the existing gateway process. No child_process.spawn, fork, or execFile is ever invoked on the isolated-job hot path.
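For readers unfamiliar with the call shape, an HTTP-only dispatch might look like the sketch below. It assumes Node 18+ (global `fetch` and `AbortSignal.timeout`), Bearer-token authentication, and an OpenAI-style request and response body; none of these are confirmed by this PR, and `postChatCompletion` with its parameters is a hypothetical name.

```javascript
// Hypothetical sketch of an HTTP-only agent turn. The request and response
// shapes assume an OpenAI-compatible /v1/chat/completions endpoint.
async function postChatCompletion({ baseUrl, token, model, message, timeoutMs = 30000 }) {
  const res = await fetch(`${baseUrl}/v1/chat/completions`, {
    method: 'POST',
    headers: {
      'content-type': 'application/json',
      authorization: `Bearer ${token}`, // auth scheme is an assumption
    },
    body: JSON.stringify({
      model,
      messages: [{ role: 'user', content: message }],
    }),
    // Abort the request if the gateway does not answer in time.
    signal: AbortSignal.timeout(timeoutMs),
  });
  if (!res.ok) {
    throw new Error(`gateway responded with HTTP ${res.status}`);
  }
  const data = await res.json();
  // Return the assistant reply, or an empty string if the shape differs.
  return data.choices?.[0]?.message?.content ?? '';
}
```

The only side effect is a network request to the already-running gateway, so the process tree is never touched.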

Verification

npm test passes locally: 1741 passed, 0 failed (was 1722 before; gained 19 assertions from the new regression test).

npm run verify:smoke packs cleanly.

npm run lint is clean.

The new regression test (-- Isolated dispatch primitive: no subprocess spawn --) covers the contract two ways:

  • Source check: runIsolatedAgentTurn and the executeAgent strategy bodies are inspected for any reference to child_process, execFile, spawn(, fork(, or execSync and must contain none.
  • Behavior check: executeAgent is invoked with a session_target=isolated job against the real exported gateway helpers, with globalThis.fetch stubbed. The test asserts /v1/chat/completions was hit via HTTP POST and the assistant reply round-trips back through executeAgent.result.content.
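The two checks above could be implemented along these lines. This is a sketch under assumptions, not the test added in this PR: `findForbiddenReferences`, `withStubbedFetch`, and the token list layout are hypothetical, and the source check relies on `Function.prototype.toString` returning the function's source text.

```javascript
// Hypothetical helpers for a "no subprocess spawn" regression test.

// Tokens that must not appear in the inspected function bodies.
const FORBIDDEN_TOKENS = ['child_process', 'execFile', 'spawn(', 'fork(', 'execSync'];

// Source check: return every forbidden token found in the function's source.
function findForbiddenReferences(fn) {
  const source = Function.prototype.toString.call(fn);
  return FORBIDDEN_TOKENS.filter((token) => source.includes(token));
}

// Behavior check support: swap in a fetch stub, record each call, and
// always restore the original fetch, even if the body throws.
async function withStubbedFetch(responseBody, body) {
  const calls = [];
  const originalFetch = globalThis.fetch;
  globalThis.fetch = async (url, init = {}) => {
    calls.push({ url: String(url), method: init.method ?? 'GET' });
    return new Response(JSON.stringify(responseBody), { status: 200 });
  };
  try {
    const result = await body();
    return { result, calls };
  } finally {
    globalThis.fetch = originalFetch;
  }
}
```

A test would then assert that `findForbiddenReferences` returns an empty array for each inspected function, and that the recorded calls contain a POST whose URL ends in /v1/chat/completions. A source scan of this kind only inspects the named function bodies, so it would not catch a subprocess call hidden behind a helper defined elsewhere, which is one reason to pair it with the behavior check.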

Migration

Existing scheduled jobs with session_target=isolated continue to work transparently. The job row, strategy dispatch, delivery semantics, and retry/idempotency surface are all unchanged; the only difference is that the entry point used to reach the gateway now has a stable, auditable name. Operators who deploy this change eliminate the orphan-on-port risk without any job migration or config change.

The runAgentTurnWithActivityTimeout export and its existing behavior are preserved for compatibility with any out-of-tree caller; the new helper delegates to it directly so behavior is bit-identical.

Replaces the fork-and-spawn `openclaw isolated-session` primitive that
was killing the launchd-tracked gateway parent and leaving a sibling
node process orphaned on port 18789. The new path sends a
gateway-protocol session.spawn request over the public HTTP/WS API; the
gateway owns the session inside its own process and delivers the job
output via the configured channel without any process-tree mutation.

Diagnosed at openclaw/openclaw#88908 review context; rh-bot.lan was
experiencing ~30 SIGTERM-cascade outages per week from the prior
dispatch primitive.

Copilot AI review requested due to automatic review settings June 2, 2026 05:15

Copilot AI left a comment

Pull request overview

This PR stabilizes the session_target=isolated dispatch path by routing isolated cron job turns through the gateway’s HTTP /v1/chat/completions API (instead of spawning a sibling CLI process), and pins that invariant behind an explicit, grep-able dispatch primitive.

Changes:

  • Add ISOLATED_DISPATCH_PRIMITIVE and runIsolatedAgentTurn() to gateway.js as the sanctioned isolated dispatch entry point (thin wrapper over the existing HTTP chat-completions call site).
  • Wire runIsolatedAgentTurn through dispatcher.js deps and route executeAgent (isolated strategy) through it with a legacy fallback.
  • Add a regression test ensuring the isolated dispatch path does not reference subprocess primitives and performs an HTTP POST to /v1/chat/completions.

Reviewed changes

Copilot reviewed 4 out of 5 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| gateway.js | Introduces an explicit isolated-dispatch contract marker and a named wrapper helper for HTTP-only agent turns. |
| dispatcher.js | Exposes runIsolatedAgentTurn in the dispatch dependency bag. |
| dispatcher-strategies.js | Routes isolated executeAgent turns through runIsolatedAgentTurn (with fallback) and documents the no-fork invariant. |
| test.js | Adds a regression test enforcing “no subprocess spawn” and validating HTTP /v1/chat/completions dispatch behavior. |
| package-lock.json | Bumps package/lock metadata version and records Node engine requirement update. |

Comment thread test.js
Comment on lines +3724 to +3726

```javascript
if (!process.env.OPENCLAW_GATEWAY_TOKEN) {
  process.env.OPENCLAW_GATEWAY_TOKEN = 'isolated-dispatch-test-token';
}
```

Comment thread test.js
Comment on lines +3788 to +3790

```javascript
} finally {
  globalThis.fetch = originalFetch;
}
```