Bridge seam: harness-agnostic FO-event egress (DRC-3798) #445

Open
gcko wants to merge 14 commits into bridge-seam-inbox-events from feature/drc-3798-harness-agnostic-fo-events

gcko (Contributor) commented Jun 30, 2026

What & why

Makes the Bridge ↔ Spacedock seam harness-agnostic and closes the full captain-intent conversation loop. Implements the DRC-3799 harness-agnosticism audit of the Bridge seam (PR #435) plus the follow-up contract, wake, and alert work needed for a live Bridge conversation across Claude, Codex, and Pi.

The seam is two flows over the shared _bridge/ directory:

  • Ingress (captain intent → FO): Bridge appends intent to _bridge/inbox.jsonl; each FO drains it on its own portable mod-hook loop, with an opt-in durable wake for parked Codex sessions.
  • Egress (FO liveness/activity/replies → Bridge): Spacedock writes host-normalized activity, session markers, FO replies/acks, and permission alerts back to Bridge.

The event schema stays Spacedock-owned and harness-neutral; concrete producers are bound per host. Claude, Codex, and Pi ship packaged event producers via the shared spacedock bridge egress emit --host <host> command; deterministic session→entity marker parity remains Claude-proven only.
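
A minimal Go sketch of the normalization idea described above, not the actual `internal/bridgeegress` code. The `hostPayload` field names, the `event` struct (a subset of the neutral fields named in this PR), and the function names are assumptions for illustration. The Pi mappings are only the ones the review below mentions; names it does not cover are passed through unchanged here, which is also an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// hostPayload is a hypothetical host hook payload; field names are assumed.
type hostPayload struct {
	EventName string `json:"hook_event_name"`
	SessionID string `json:"session_id"`
}

// event carries a subset of the harness-neutral fields named in this PR.
type event struct {
	Host      string `json:"host"`
	ActorID   string `json:"actor_id"`
	Event     string `json:"event"`
	SessionID string `json:"session_id"`
}

// piCanonical lists only the Pi-native names mentioned in the review notes.
var piCanonical = map[string]string{
	"tool_execution_start": "PostToolUse",
	"tool_execution_end":   "PostToolUse",
	"tool_call":            "PostToolUse",
	"tool_result":          "PostToolUse",
	"session_shutdown":     "Stop",
	"turn_end":             "Stop",
}

// canonicalEventName rewrites Pi-native names and leaves other hosts alone.
func canonicalEventName(host, name string) string {
	if host == "pi" {
		if canonical, ok := piCanonical[name]; ok {
			return canonical
		}
	}
	return name
}

// normalize returns one JSON line, or ok=false when the payload is unusable.
// Returning ok=false instead of an error models the observe-only no-op.
func normalize(host, actorID string, raw []byte) ([]byte, bool) {
	var p hostPayload
	if err := json.Unmarshal(raw, &p); err != nil || p.EventName == "" {
		return nil, false
	}
	line, err := json.Marshal(event{
		Host:      host,
		ActorID:   actorID,
		Event:     canonicalEventName(host, p.EventName),
		SessionID: p.SessionID,
	})
	if err != nil {
		return nil, false
	}
	return line, true
}

func main() {
	raw := []byte(`{"hook_event_name":"tool_call","session_id":"s1"}`)
	line, ok := normalize("pi", "fo-1", raw)
	fmt.Println(ok, string(line))
}
```

Keeping the per-host knowledge in one lookup table means a new host only adds a mapping, while the emitted line shape stays the same for every producer.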

Stacked on #435 (bridge-seam-inbox-events) — target files only exist on that branch; base retargets to main when #435 merges. Review only the diff here. Companion Bridge implementation: spacedock-dev/bridge#48.

What's in this PR

  1. Harness-neutral egress (internal/bridgeegress/) — normalizes host payloads into a stable _bridge/events.jsonl schema and first-write-wins _bridge/sessions/<actor_id>.json markers; normalizes Pi-native lifecycle names to Bridge canonical events. Observe-only: malformed input or write failures degrade to no-op.

  2. Shared egress command + per-host producers — hidden spacedock bridge egress emit --host <host> (internal/cli/cli.go). Claude's hook (scripts/spacedock-bridge-events.sh) becomes a thin wrapper; Codex gets its own non-async hooks (scripts/codex-bridge-events.sh, hooks/codex-hooks.json, .codex-plugin/plugin.json) that call the emitter via SPACEDOCK_BIN/PATH without reusing Claude's async file, env vars, or plugin-cache wrapper; Pi forwards lifecycle events via .pi/extensions/spacedock.ts.

  3. Durable Codex wake (internal/bridgeingress/) — spacedock bridge ingress wake --host codex resumes parked Codex FO sessions for inbox records not yet delivered (by cursor or FO reply/ack). A wake is only an attempt; delivery is confirmed by the FO-owned drain + ack, so it stays safe against the gate-vs-inbox race.

  4. FO permission alerts (internal/bridgealert/) — spacedock bridge alert permission appends non-blocking captain interrupts to _bridge/fo-alerts.jsonl.

  5. Front-door --plugin-dir split (internal/cli/frontdoor.go) — --plugin-dir dirs given before the -- separator are kept separate from host passthrough: Claude receives them at launch, while Codex installs them via a local marketplace symlink (Codex has no launch-time --plugin-dir). Combining --no-install with --plugin-dir is rejected for Codex.

  6. Pi extension gating (internal/cli/pi.go) — Pi package/runtime checks now require the Spacedock Pi extension and the ensign skill, so a skills-only package is not treated as extension-capable.

  7. Contract + skill docs (docs/dev/bridge-egress-contract.md and docs/dev/_mods/bridge-inbox.md) pin the harness-neutral producer contract, target_set routing, per-workflow cursors, _bridge/fo-replies.jsonl reply/ack semantics, and at-least-once (not exactly-once) delivery; first-officer runtime references document per-host producer support vs. Claude-only marker parity.

Validation

go build ./..., go vet ./..., go test ./... (all green except a pre-existing, environment-dependent TestSurveyCodexPresenceThroughSync that fails identically on the base branch and is unrelated to this PR).

gcko and others added 5 commits June 30, 2026 18:56
…rn (DRC-3798)

The DRC-3799 audit of the Bridge seam found it is two flows in opposite
directions: ingress (captain intent -> FO, the bridge-inbox drain) was
already harness-agnostic on the portable mod-hook loop, while egress (FO
liveness/activity -> Bridge: events.jsonl, the session->entity marker, the
heartbeat session id) was Claude-Code-coupled with no adapter seam.

This routes the egress through Spacedock's existing per-host adapter pattern
(the same PRESENT/ABSENT idiom fo-dispatch-core.md uses), keeping the schema
Spacedock-owned and the producer per-host:

- docs/dev/bridge-egress-contract.md (new): the harness-neutral schema for all
  four egress surfaces (events.jsonl, fo.$SLUG.json heartbeat, fo-feed.jsonl,
  the session->entity marker) + the per-host producer bindings. Claude PRESENT;
  Codex/Pi ABSENT/TODO with the exact open work named. Records the decision
  that the deterministic RUNNING badge is Claude-only for now, with graceful
  degradation on other hosts (heartbeat still attaches; git + fo-feed still
  drive fleet-history; only the live FO-vs-ensign badge is withheld).
- A "## Bridge egress" binding section in each of the claude/codex/pi
  first-officer runtime adapters; the claude ensign badge paragraph and
  shared-core step 7b now point at it.
- The bridge-inbox heartbeat session id moves off the hardcoded
  ${CLAUDE_CODE_SESSION_ID:-} onto a neutral-first token with a built-in
  per-host fallback: ${SD_SESSION_ID:-${CLAUDE_CODE_SESSION_ID:-${CODEX_THREAD_ID:-}}}.
  The launcher cannot export SD_SESSION_ID (the harness mints the session id
  inside the session) and a per-tick FO export is fragile, so the fallback
  keeps Claude/Codex populated with no regression while SD_SESSION_ID stays the
  neutral override the contract documents.
- bridge_session_link_test.go reframed as TestClaudeAdapterConformsToEgressContract:
  the harness-neutral contract is the unit under test (parse-based assertions on
  the events.jsonl line shape + nesting + the session-marker shape), so a future
  Codex/Pi producer reuses the same assertions with its own input builder.
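
For readers more at home in Go than in shell parameter expansion, the neutral-first fallback `${SD_SESSION_ID:-${CLAUDE_CODE_SESSION_ID:-${CODEX_THREAD_ID:-}}}` can be sketched as below. The function name `sessionID` and the injected `getenv` parameter are assumptions for illustration; the commit itself changes a shell expression, not Go code.

```go
package main

import (
	"fmt"
	"os"
)

// sessionID returns the first non-empty value in neutral-first order.
// Like the shell ":-" form, an empty value counts the same as unset.
func sessionID(getenv func(string) string) string {
	for _, name := range []string{
		"SD_SESSION_ID",          // neutral override
		"CLAUDE_CODE_SESSION_ID", // Claude fallback
		"CODEX_THREAD_ID",        // Codex fallback
	} {
		if v := getenv(name); v != "" {
			return v
		}
	}
	return ""
}

func main() {
	fmt.Println(sessionID(os.Getenv))
}
```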

Doc/contract + test only; no producer behavior change. contractlint, build,
go vet, and the reframed contract test pass. (Pre-existing, unrelated:
TestSurveyCodexPresenceThroughSync.)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Jared Scott <jared.scott@infuseai.io>
…o the DRC-3798 egress branch

Keeps #445 stacked on the refreshed base so its diff stays just the egress-abstraction changes. Clean merge; contractlint + the egress contract test pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Jared Scott <jared.scott@infuseai.io>
Signed-off-by: Jared Scott <jared.scott@variable.team>
gcko self-assigned this Jul 1, 2026
Signed-off-by: Jared Scott <jared.scott@variable.team>

Copilot AI left a comment

Pull request overview

This PR extends the Bridge seam to be harness-agnostic on egress by routing FO/ensign lifecycle activity through a shared, host-neutral internal/bridgeegress emitter and binding host-specific wrappers (Claude hooks, Codex hooks, Pi extension) to a single hidden CLI surface (spacedock bridge egress emit --host <host>). It also tightens the Bridge ↔ FO conversation-loop contract (intent ids, frozen target_set, and best-effort _bridge/fo-replies.jsonl acknowledgements) with doc updates and contractlint tests.

Changes:

  • Add a host-neutral egress normalizer/writer for _bridge/events.jsonl and first-write-wins _bridge/sessions/* markers, including Pi lifecycle-name normalization.
  • Add a hidden, silent CLI command (bridge egress emit) and update Claude/Codex/Pi host bindings to call it.
  • Update Bridge inbox/egress contract docs and add tests locking the reply-loop semantics and host packaging expectations.

Reviewed changes

Copilot reviewed 23 out of 23 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| skills/integration/testdata/codex/bridge-egress-minimal-session-start.json | Adds a minimal Codex hook payload fixture for egress tests. |
| skills/integration/codex_bridge_egress_hook_test.go | Verifies Codex plugin manifest/hooks wiring and wrapper silence/argv contract. |
| skills/integration/bridge_session_link_test.go | Reworks Claude adapter conformance test to assert harness-neutral egress output shapes. |
| skills/first-officer/references/pi-first-officer-runtime.md | Documents Pi Bridge-egress support boundaries (events packaged; markers unclaimed). |
| skills/first-officer/references/first-officer-shared-core.md | Updates fleet-mode + startup guidance to reflect target_set and egress capability split. |
| skills/first-officer/references/codex-first-officer-runtime.md | Documents Codex Bridge-egress bindings and session-id fallback semantics. |
| skills/first-officer/references/claude-first-officer-runtime.md | Documents Claude Bridge-egress bindings (events + deterministic marker path). |
| skills/ensign/references/claude-ensign-runtime.md | Clarifies ensign-side running-badge behavior via Claude egress hooks/markers. |
| scripts/spacedock-bridge-events.sh | Simplifies Claude wrapper to delegate to shared CLI emitter. |
| scripts/codex-bridge-events.sh | Adds Codex wrapper delegating to the shared CLI emitter (silent, observe-only). |
| internal/contractlint/fo_feed_and_eager_drain_test.go | Adds contract locks for inbox id/target_set and _bridge/fo-replies.jsonl semantics. |
| internal/cli/pi.go | Tightens Pi package gating to require the Spacedock Pi extension in addition to skills. |
| internal/cli/pi_frontdoor_test.go | Updates Pi tests for new extension gate and dev-override behavior. |
| internal/cli/pi_egress_test.go | Adds tests for Pi extension wiring + package manifest advertising + runtime extension gate. |
| internal/cli/cli.go | Adds hidden spacedock bridge egress emit --host <host> command wiring to bridgeegress. |
| internal/cli/bridge_egress_test.go | Tests the hidden bridge egress CLI for silence, output, and malformed-payload no-op. |
| internal/bridgeegress/egress.go | Introduces host-neutral egress normalizer/writer (events + markers + truncation). |
| internal/bridgeegress/egress_test.go | Adds unit tests for event schema, Claude marker behavior, truncation, and Pi normalization. |
| hooks/codex-hooks.json | Adds Codex non-async command hooks invoking the Codex wrapper via PLUGIN_ROOT. |
| docs/dev/bridge-egress-contract.md | Adds/updates the harness-neutral egress contract (events, heartbeat, feed, replies, markers). |
| docs/dev/_mods/bridge-inbox.md | Updates inbox schema/routing (id, target_set) and defines _bridge/fo-replies.jsonl rules. |
| .pi/extensions/spacedock.ts | Extends Pi extension to forward lifecycle events into the shared CLI emitter. |
| .codex-plugin/plugin.json | Points Codex plugin manifest at the Codex-specific hooks file. |

Comment thread internal/bridgeegress/egress.go Outdated
Comment thread skills/integration/bridge_session_link_test.go Outdated
Comment thread docs/dev/bridge-egress-contract.md Outdated
gcko and others added 8 commits July 1, 2026 20:34
Signed-off-by: Jared Scott <jared.scott@variable.team>
Signed-off-by: Jared Scott <jared.scott@variable.team>
Signed-off-by: Jared Scott <jared.scott@variable.team>
Signed-off-by: Jared Scott <jared.scott@variable.team>
Signed-off-by: Jared Scott <jared.scott@variable.team>
Review follow-ups on the harness-agnostic FO-event work:

- wake: reclaim a stale .wake-lock.codex left by a crashed/killed wake
  (O_EXCL alone permanently wedged durable delivery on any crash).
- wake: validate session ids read from _bridge/ state before they become
  a codex argv positional (argument-injection hardening).
- wake/egress: raise the bufio scan limit above 64KB so a large record
  cannot hard-fail the inbox scan or silently disable the event-log trim.
- codex: drop the orphaned scripts/codex-bridge-events.sh wrapper and its
  test; codex-hooks.json inlines the emitter call and a test forbids ever
  wiring the wrapper in, so it was superseded dead code.
- docs: fix the Claude events.jsonl schema (was missing timestamp/host/
  actor_id), the "Last-write" mislabel on the first-write-wins marker, and
  the codex/pi marker path (<actor_id>.json, not <session_id>.json).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
gcko (Contributor, Author) commented Jul 2, 2026

Code Review: PR #445

Reviewed 1b16bbe3 · Fixes pushed eb1fba3a · Verdict GO (blocking findings resolved in this pass)

Full antagonistic pass (correctness, security data-flow, cross-reference, error-handling, tests, diff, docs-vs-code) over the egress/ingress/alert Go code, the host wiring (Claude/Codex/Pi hooks + scripts), and the contract/skill docs. Reviewed against base bridge-seam-inbox-events (stacked PR). No BLOCKERs found. The ISSUEs below were fixed and pushed in eb1fba3a.

Issues (fixed in eb1fba3a)

  1. internal/bridgeingress/wake.go:308 (orig) — acquireLock used O_CREATE|O_EXCL with no stale recovery. A wake that is killed/crashes before its deferred unlock() leaves .wake-lock.codex forever, so every later wake returns locked and Codex is never woken again — permanently defeating the PR's own "durable delivery" goal.
    Fix: reclaim a lock older than staleLockTTL (5m; a wake pass is sub-second). Added TestWakeReclaimsStaleLock + TestWakeSkipsWhenFreshLockHeld.

  2. internal/bridgeegress/egress.go:248 (orig) — truncateEvents used a default bufio.Scanner (64KB cap). A single line >64KB makes scanner.Err() non-nil, the guard bails without trimming, and events.jsonl then grows unbounded for every later emit. detail.tool/source/timestamp are copied verbatim from the payload, so a pathological host defeats the size cap.
    Fix: raised the scan buffer to 1 MiB.

  3. scripts/codex-bridge-events.sh — orphaned dead code. .codex-plugin/plugin.json → hooks/codex-hooks.json, which inlines the binary-resolution + emit logic; the wrapper is referenced nowhere in production and codex_bridge_egress_hook_test.go:81 forbids the hooks file from ever referencing it. Its header still claimed to be the live hook, and it duplicated drift-prone logic. Git history confirms it was superseded when the hooks were inlined (commit 0682d239).
    Fix: removed the script and its dead wrapper test.

  4. skills/first-officer/references/claude-first-officer-runtime.md:39 — published a wrong events.jsonl schema {"ts","event","session_id","agent_id","agent_type","detail"}, omitting timestamp, host, actor_id, which egress.go writes unconditionally (Event, egress.go:41-51). The codex/pi adapters and the contract already show the full shape; only Claude drifted.
    Fix: aligned to the full contract schema.

  5. docs/dev/bridge-egress-contract.md:96 — the session-marker section was labeled Last-write (first-write-wins per host actor) — self-contradictory. writeMarker uses O_EXCL (first-write-wins).
    Fix: dropped the "Last-write" prefix.

Notes

  • Fixed (hardening): wake.go passed a session_id read from _bridge/ state straight into codex exec resume <sessionID> without the safeID/slug validation used elsewhere; a poisoned marker/heartbeat/event could inject a leading-dash token codex parses as a flag (requires local write to _bridge/, so defense-in-depth). Now validated via safeSessionID. readInbox's 64KB hard-fail is also lifted by the same 1 MiB buffer bump.
  • Not changed — flagged for author (docs, contract-pinned + functionally correct):
    • bridge-egress-contract.md:65 / _mods/bridge-inbox.md:118 say replies fold "by intent id (or legacy line fallback)… and reply kind"; wake.go:410 replyKey actually keys on in_reply_to_line (always) + id-or-ts + intent_kind + target. Works because the FO emits these deterministically, but the stated mechanism is inaccurate. Left unchanged because the exact strings are pinned by contractlint — worth a follow-up that updates doc + test together.
    • _mods/bridge-inbox.md:117 / contract:65 call fo-replies.jsonl "best-effort, not a delivery ledger," but wake.go:360 drops a target from the pending set on any matching reply regardless of status (a blocked ack still suppresses re-wake). Bounded to the crash-window; the framing understates it.
    • bridge-egress-contract.md:43 says emit normalizes host-native names for all hosts; egress.go:293 only normalizes for pi (Claude/Codex rely on already-canonical hook registration). Safe today, but the phrasing overstates the code.
  • Not changed — behavior (best-effort telemetry): .pi/extensions/spacedock.ts:29-45 registers 10 lifecycle events, but canonicalPiEventName maps tool_execution_start/_end, tool_call, tool_result all to PostToolUse (and session_shutdown+turn_end to Stop). If Pi emits overlapping pairs, each tool call spawns up to 4 short-lived spacedock processes writing duplicate liveness lines. Non-fatal; worth de-duping later.
  • Test gaps: TestEmitMarkerFirstWriteWins is sequential (never exercises the concurrent race the O_EXCL guard defends); no test drives the >64KB scan path; the codex inlined hook command and hook→emitter stdin forwarding are only structurally asserted, never executed end-to-end; TestCodexBridgeEgressMinimalPayloadFixture only checks the fixture's own fields.
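
The argv hardening in the first note can be illustrated with an allow-list check applied before the id becomes a positional argument. This is a sketch under assumptions: the exact rules of the PR's `safeSessionID` are not shown in this thread, so the pattern below (alphanumeric first character, then alphanumerics, `.`, `_` or `-`, at most 128 characters) and the helper names are hypothetical. Only the `codex exec resume <sessionID>` shape comes from the review.

```go
package main

import (
	"fmt"
	"regexp"
)

// safeSessionIDPattern is an assumed allow-list. Requiring an alphanumeric
// first character is what rules out leading-dash tokens such as "--flag".
var safeSessionIDPattern = regexp.MustCompile(`^[A-Za-z0-9][A-Za-z0-9._-]{0,127}$`)

// isSafeSessionID reports whether id may be used as an argv positional.
func isSafeSessionID(id string) bool {
	return safeSessionIDPattern.MatchString(id)
}

// resumeArgv builds the argv for resuming a session, or returns ok=false
// when the id read from shared state does not pass validation.
func resumeArgv(sessionID string) ([]string, bool) {
	if !isSafeSessionID(sessionID) {
		return nil, false
	}
	return []string{"codex", "exec", "resume", sessionID}, true
}

func main() {
	for _, id := range []string{"0195f2c4-7b1e-7d3a-9c55-2f6f0e1a4b7c", "--dangerous-flag"} {
		argv, ok := resumeArgv(id)
		fmt.Println(ok, argv)
	}
}
```

Because the process is started from an argv slice rather than a shell string, the only remaining injection surface is a value the callee parses as a flag, which is exactly what the leading-character rule closes.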

Confirmed correct (hostile hypotheses that did not hold)

  • No path traversal: slugs (wake.go) and actor_id (egress.go) are safeID/slug-gated (no /); DeriveEntity enforces pathRelInside(cwd,…) and rejects _archive.
  • No shell injection: codex is exec'd via argv, no shell. Marker writes are atomically first-write-wins (O_EXCL); event trim is temp-file + atomic rename with cleanup on every branch.
  • Pi lifecycle→canonical mapping is complete vs. the extension's emitted names; Event/Marker JSON tags match the contract field-for-field; egress/emit are panic-free on malformed/empty/wrong-type stdin (typed struct + early return).
  • pi.go correctly requires both the Pi extension and the ensign skill (a skills-only package is not extension-capable); the Codex hooks carry no Claude env / async / plugin-cache dependency.

Limits (verification gaps, not PR defects)

  • TestSurveyCodexPresenceThroughSync fails locally, but it fails identically on base bridge-seam-inbox-events, is not in this PR's diff, and depends on the locally-installed agentsview binary version (asserts blank_cwd > 0). Pre-existing and environment-dependent — out of scope.
  • No failing GitHub Actions exist for this PR. docs.yml and install-e2e.yml only trigger on PRs targeting main; this PR targets the stacked bridge-seam-inbox-events, so they don't run. The only check that ran, Copilot Code Review, passed. Normal CI will engage automatically once the base retargets to main.

gcko (Contributor, Author) left a comment

Claude Code Review complete: 5 issues found and fixed in eb1fba3 (durable-wake stale-lock recovery, egress/inbox scan limit, session-id argv hardening, orphaned codex wrapper removed, egress-doc schema/label corrections). Residual notes (contract-pinned doc wording, Pi lifecycle fan-out, test gaps) are non-blocking — see the review comment. No failing Actions exist on this base.
