Skip to content

fix(opencode-plugin): allow --agent mode to register session binding#35

Open
mmkzer0 wants to merge 4 commits into
aannoo:mainfrom
mmkzer0:fix/bindIdentity-headless
Open

fix(opencode-plugin): allow --agent mode to register session binding#35
mmkzer0 wants to merge 4 commits into
aannoo:mainfrom
mmkzer0:fix/bindIdentity-headless

Conversation

@mmkzer0
Copy link
Copy Markdown
Contributor

@mmkzer0 mmkzer0 commented May 1, 2026

Summary

OpenCode instances launched in headless --agent mode were invisible to hcom hooks and hcom list (showing up as PTY-only). This PR fixes the session registration bypass so they correctly create session_bindings rows.
(Issue I encountered when codex should spawn headless opencode instance via hcom script when --agent was provided; env var worked).

Root cause

In --agent mode, OpenCode does not emit a session.created event. Execution starts directly at chat.message with the agent pre-configured.

bindIdentity() is the only path in the plugin that calls hcom opencode-start --session-id ... to create the session DB row. However, it was strictly guarded by:

if (process.env.HCOM_LAUNCHED !== "1") return

HCOM_LAUNCHED is injected by the PTY launcher, but not by the headless --agent path. This caused the guard to fire unconditionally. The fallback paths (chat.message and transform) hit the early return and skipped opencode-start entirely.

Changes

HCOM_LAUNCHED guard removal

  • Removed the strict HCOM_LAUNCHED !== "1" check in bindIdentity().
  • Safety is fully preserved by the internal instanceName || bindingPromise concurrency fence and the overarching checkHcom() binary verification gates.

Fallback caller hardening

  • Added !bindingPromise pre-checks to chat.message and experimental.chat.messages.transform.
  • This aligns the fallback callers with how the primary event handlers (session.created, permission.asked) already operate.

chat.message unbound guard

  • Added an explicit !instanceName && !sessionId check.
  • Previously, if bindIdentity() was called but failed (e.g. daemon absent), isBoundSession(null) would short-circuit to true and silently mutate agent/model state.
  • Now, it emits a WARN (plugin.chat_message_unbound) and explicitly skips mutation.

Regression Coverage

  • Added 4 new unit tests in src/instance_binding.rs covering the exact --agent mode binding invariants, lifecycle, and idempotency.

Testing

  • cargo test (all 1473 pass, including new instance_binding coverage)
  • cargo build --release
  • Manual: Launch hcom opencode --agent <name> headless. Verified it successfully registers its session and is visible to hcom hooks and hcom list.

Copilot AI review requested due to automatic review settings May 1, 2026 11:54
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes OpenCode --agent (headless) launches being invisible to hcom hooks / hcom list by ensuring session bindings are created even when session.created is not emitted.

Changes:

  • Remove the HCOM_LAUNCHED gate in the OpenCode plugin’s bindIdentity() path so opencode-start can run in --agent mode.
  • Harden fallback binding calls in chat.message and experimental.chat.messages.transform to avoid duplicate/concurrent binds.
  • Add Rust unit tests covering --agent-mode session binding creation and idempotency.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
src/opencode_plugin/hcom.ts Adjusts binding/guard logic so headless --agent sessions invoke hcom opencode-start and avoid unsafe state mutation when unbound.
src/instance_binding.rs Adds regression tests to verify bind_session_to_process creates session_bindings rows in --agent-like flows and remains idempotent.
Comments suppressed due to low confidence (1)

src/opencode_plugin/hcom.ts:305

  • With the HCOM_LAUNCHED guard removed, bindIdentity() can now run in non-hcom launches whenever the hcom binary is present. Since hcom opencode-start hard-requires HCOM_PROCESS_ID (it returns {error:"HCOM_PROCESS_ID not set"}), consider adding a cheap process.env.HCOM_PROCESS_ID presence check here to avoid repeatedly starting the notify server + spawning a subprocess that is guaranteed to fail (and to reduce WARN log noise) when OpenCode isn't launched by hcom.
  async function bindIdentity(sid: string): Promise<void> {
    if (instanceName || bindingPromise) return
    // No HCOM_LAUNCHED guard — allows binding even when OpenCode is launched
    // via `hcom opencode --agent` (agent mode skips session.created events).
    // Safe to call repeatedly: $.nothrow() catches failures and bindIdentity's
    // own early-return (instanceName || bindingPromise) handles duplicates.
    bindingPromise = (async () => {
      try {
        // Start TCP notify server before binding so port is registered atomically
        const notifyPort = startNotifyServer()
        const result = notifyPort
          ? await $.nothrow()`hcom opencode-start --session-id ${sid} --notify-port ${String(notifyPort)}`.quiet()
          : await $.nothrow()`hcom opencode-start --session-id ${sid}`.quiet()

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/opencode_plugin/hcom.ts Outdated
Comment on lines +459 to +465
// Guard: if bindIdentity was called but produced no binding (daemon absent,
// missing HCOM_PROCESS_ID, etc.), both instanceName and sessionId remain null.
// isBoundSession(null) short-circuits to true on double-null, which would
// silently mutate agent/model state. Emit WARN and skip instead.
if (!instanceName && !sessionId) {
log("WARN", "plugin.chat_message_unbound", null, {
session_id: input.sessionID,
Copy link

Copilot AI May 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new unbound guard only triggers when both instanceName and sessionId are null, but sessionId can be populated from earlier events even if binding ultimately fails (and input.sessionID may be absent). In that case isBoundSession(input.sessionID) still returns true for a null/undefined candidate and the handler will keep mutating currentAgent/currentModel while unbound. If the intent is to skip mutation whenever the chat.message lacks a sessionID or when there is no successful binding, guard on input.sessionID presence (or tighten isBoundSession semantics) before updating agent/model state.

Suggested change
// Guard: if bindIdentity was called but produced no binding (daemon absent,
// missing HCOM_PROCESS_ID, etc.), both instanceName and sessionId remain null.
// isBoundSession(null) short-circuits to true on double-null, which would
// silently mutate agent/model state. Emit WARN and skip instead.
if (!instanceName && !sessionId) {
log("WARN", "plugin.chat_message_unbound", null, {
session_id: input.sessionID,
// Only mutate agent/model state when the message carries a session ID and
// binding has actually succeeded. `sessionId` may be populated from earlier
// events before bindIdentity completes successfully, so do not treat it as
// proof of a bound session here.
if (!input.sessionID) {
log("WARN", "plugin.chat_message_unbound", null, {
session_id: input.sessionID,
reason: "chat.message missing sessionID",
})
} else if (!instanceName) {
log("WARN", "plugin.chat_message_unbound", null, {
session_id: input.sessionID,

Copilot uses AI. Check for mistakes.
@aannoo
Copy link
Copy Markdown
Owner

aannoo commented May 1, 2026

Hey, I couldn't reproduce, it already was working fine for me - do you have more info?

AI:

PR #35 review

target: mmkzer0/hcom#35 (fix/bindIdentity-headless)
base: main @ e0560a7a
scope: src/opencode_plugin/hcom.ts env-guard swap + chat.message guard tightening + 4 Rust tests in src/instance_binding.rs

live test

method: hcom opencode --agent build --headless --hcom-prompt ... on each branch, fresh agent, plugin log inspected.

run branch instance session session_created bound via session_bindings row reply
A main roge ses_21c349dc7ffet3tMN0ymnBRMlg yes session.created path yes "live test ok"
B PR rila ses_21c317bc6ffeQMiPZW3GGfXOoy yes session.created path yes "pr branch test ok"

note: plugin.session_created log entry has no instance field (logs before instanceName is set). grepping by instance name omits it. earlier conclusion that session.created did not fire was an artifact of that filter and is retracted.

result: identical plugin log sequence on both branches. neither run exercised the chat.message-first / no-session.created path the PR claims to fix. PR is functionally inert for the tested invocation.

env-guard swap analysis

change: if (process.env.HCOM_LAUNCHED !== "1") returnif (!process.env.HCOM_PROCESS_ID) return at src/opencode_plugin/hcom.ts:295.

evidence both vars are coupled in the standard launcher path:

  • src/launcher.rs:890 sets HCOM_LAUNCHED=1.
  • src/launcher.rs:912 sets HCOM_PROCESS_ID=<uuid>.
  • same loop, no --agent-aware branching.
  • src/pty/mod.rs pty_child_env also sets HCOM_LAUNCHED=1.
  • launch scripts for both A and B exported both vars before exec of hcom pty opencode --agent build --prompt ....
  • src/terminal.rs:716 unsets both together in post-exit cleanup.

implication: for any invocation that goes through prepare_launch_execution, HCOM_LAUNCHED == "1" iff HCOM_PROCESS_ID is set. swap is a no-op on this path.

PR description claim: "HCOM_LAUNCHED is injected by the PTY launcher, but not by the headless --agent path." not supported by the code on main. likely refers to a non-launcher path (e.g. codex subprocess spawning opencode via a wrapper that propagates one env var but not the other) — description does not specify.

OpenCode behavior

PR claim: "in --agent mode, OpenCode does not emit a session.created event."

observed (current OpenCode in this env): plugin.session_created fired in both A and B with --agent build. OpenCode run.ts calls sdk.session.create() when no existing session is provided; subsequent chat.message carries sessionID. claim does not reproduce on the OpenCode version installed.

needed from author: OpenCode version/commit where --agent skips session.created.

test coverage assessment

4 new tests in src/instance_binding.rs:

  • test_agent_mode_bind_session_creates_session_binding
  • test_agent_mode_bind_session_idempotent
  • test_agent_mode_initialize_then_bind_session
  • test_agent_mode_no_process_id_no_session_binding

all exercise the Rust DB function bind_session_to_process. none exercise:

  • the TS plugin bindIdentity env-guard.
  • the chat.message fallback bind path.
  • the HCOM_PROCESS_ID set / HCOM_LAUNCHED unset asymmetry the PR claims to fix.

DB invariants tested were already correct; the load-bearing change is in hcom.ts and is uncovered. a future refactor of the guard could silently regress the target case.

non-hcom session side effect

new branch at src/opencode_plugin/hcom.ts:466-471 emits WARN plugin.chat_message_unbound per chat.message when instanceName is null.

triggers for users who:

  • have hcom binary on $PATH (checkHcom() passes at :449).
  • launch OpenCode without going through hcom launcher (HCOM_PROCESS_ID unset → bindIdentity early-returns at :295).
  • have ~/.hcom/.tmp/logs/ populated (otherwise appendFileSync silently fails at :62).

result: every prompt appends a WARN line. WARN reason text "hcom absent or daemon error" is incorrect in this case (cause is "not launched by hcom").

remediation: gate the WARN on process.env.HCOM_PROCESS_ID so it fires only on genuine bind-attempt failures. preserves "hooks do nothing if you are not using hcom" guarantee.

independent of env-guard issue

chat.message guard tightening at :461-478 is correct on its own merits. old code's isBoundSession(sid) returned true when local sessionId was null (!sessionId short-circuit), allowing currentAgent / currentModel mutation pre-binding and from foreign sessions. new code requires instanceName set before mutation. retain regardless of env-guard outcome.

defensive !bindingPromise checks

added at :458 and :495. each is preceded by if (bindingPromise) await bindingPromise and bindIdentity's finally clears the promise. condition is always true at evaluation. dead code in current control flow. harmless. comment intent if guarding against re-entrancy not present in current handler model.

requested artifacts before merge

  1. exact failing invocation: command, OpenCode version/commit, hcom commit.
  2. plugin log from failing run showing HCOM_PROCESS_ID set, HCOM_LAUNCHED !== "1", no plugin.session_created, no plugin.bound, no session_bindings row.
  3. matching log from PR branch with same invocation showing plugin.bound and session_bindings row.
  4. plugin-level regression test driving the asymmetric env state (or documented manual repro path).
  5. fix or removal of plugin.chat_message_unbound WARN for the non-hcom-launched OpenCode case.

status

merge not justified by current evidence on main. hypothesized real fix scope: a non-standard env-inheritance edge case the description does not specify. independent value: chat.message guard tightening (item: retain). other changes pending verification.

@mmkzer0
Copy link
Copy Markdown
Contributor Author

mmkzer0 commented May 1, 2026

@aannoo thanks for the repro attempt. I think the important difference is that the failing path was not the same as the smoke test in your comment.

The observed failure happened from Codex via hcom run partner-setup. That script is our session-partner bootstrap helper: it looks for an existing co-located OpenCode partner, and if none exists it spawns one, records it under ~/.hcom/session-partners/<owner>-<dirhash>.state, then later task dispatch sends work to that partner.

The spawn path is ~/dev/code-configs/hcom-scripts/partner-setup.sh:88:

hcom opencode --headless --go ${HCOM_NAME:+--name "$HCOM_NAME"} \
  --hcom-system-prompt "$HCOM_SP" -- --agent "$PARTNER_AGENT"

Your test used --hcom-prompt .... That matters in current launcher code:

  • --hcom-prompt becomes params.initial_prompt, and for OpenCode launcher injects --prompt <text> (src/launcher.rs:771-799).
  • --hcom-system-prompt becomes params.system_prompt, but launcher currently only applies that for Gemini (src/launcher.rs:745-750). For OpenCode it is effectively ignored.

So the partner-setup path is “OpenCode headless + --agent + no OpenCode --prompt”, while the smoke test was “OpenCode headless + --agent + --prompt”. That could explain why your run followed the normal session.created path while the partner spawned from Codex got stuck/unregistered.

I agree the PR should not claim this reproduces for every --agent launch. More precise claim: this was observed with the partner-setup.sh launch shape above. Best next repro is to run that exact command/path from a Codex-launched hcom session and compare hcom list/plugin log for whether opencode-start runs and whether a session_bindings row is created.

@mmkzer0
Copy link
Copy Markdown
Contributor Author

mmkzer0 commented May 1, 2026

@aannoo follow-up after more repro work today.

I can no longer reproduce the full failure mode on current local main + OpenCode 1.14.31.

Original failure observed yesterday:

  • hcom run partner-setup from Codex spawned OpenCode via:
    hcom opencode --headless --go --hcom-system-prompt ... -- --agent <partner-agent>
  • The spawned partner appeared pty-only / unregistered:
    session_id: (none), bindings: pty, transcript: (none)
  • Terminal was stuck with raw <hcom> text, ready=false, no usable OC term view, and high CPU.

What I verified today:

  • partner-setup.sh itself looks sound. --agent X and PARTNER_AGENT=X both resolve to the same OpenCode launch line.
  • With OpenCode 1.14.31, --agent still initially returns a pty-only partner before any OC session exists: session_id: (none), bindings: pty, transcript: (none).
  • But the first hcom message now usually bootstraps the OC session successfully: plugin.session_created -> opencode-start -> plugin.bound.
  • Launching with --hcom-prompt also binds cleanly because the initial prompt creates a session immediately.
  • I tested both default m2p7-executor and non-default dsv4flash-executor; both now recover.

So my earlier “messages never deliver” statement was too strong for current OpenCode. The hard failure may have depended on an OpenCode-side prompt/session startup bug that changed with the recent OC 1.14.31 update, or with the local Codex/OpenCode updates today.

Why I still think this hardening is useful:

  • partner-setup readiness currently means PTY/process readiness, not OC hook/session readiness.
  • --agent launches can still sit in an observable unbound window until the first prompt/message creates a session.
  • This patch makes the plugin binding path less dependent on only the earliest session.created timing and hardens startup when the first real session event arrives later.
  • Even if current OC recovers, avoiding a pty-only unbound window is safer for hcom partner orchestration and reduces dependence on fragile first-message PTY bootstrap behavior.

Happy to narrow the PR description from “fixes stuck launch” to “hardens --agent headless startup binding; original hard failure no longer reproduces on OC 1.14.31”.

mmkzer0 added 4 commits May 1, 2026 20:29
In --agent mode OpenCode skips session.created and starts at chat.message
with the agent pre-configured. bindIdentity() was guarded by a check for
HCOM_LAUNCHED=1, which is set by the PTY launcher but not by the headless
--agent path. The guard fired unconditionally, blocking opencode-start from
running and leaving session_bindings empty. Result: has_session_binding=false
and the instance invisible to hcom hooks/list.

Remove the HCOM_LAUNCHED guard. Safety is preserved by the existing
instanceName||bindingPromise fence inside bindIdentity, $.nothrow() on the
subprocess call, and the checkHcom() gate on all event handler entry points.

Also add !bindingPromise pre-checks to the chat.message and transform
fallback call sites, aligning them with the three primary event handlers
(session.created, permission.asked, session.status) that already carry it.
Four new unit tests in instance_binding::tests covering the DB-level
invariants the plugin fix relies on:

- test_agent_mode_bind_session_creates_session_binding: bind without a
  prior session.created creates a session_bindings row (was broken before)
- test_agent_mode_bind_session_idempotent: double-call with the same
  session_id returns the same name and produces exactly one row
- test_agent_mode_initialize_then_bind_session: full sequence of
  initialize_instance_in_position_file (no session_id) followed by
  bind_session_to_process matches the --agent lifecycle
- test_agent_mode_no_process_id_no_session_binding: absent process binding
  returns None and leaves session_bindings empty
isBoundSession() returns true when both instanceName and sessionId are null
(double-null short-circuit). If bindIdentity() was called but failed silently
(daemon absent, missing HCOM_PROCESS_ID), the chat.message handler would
fall through to isBoundSession and mutate currentAgent/currentModel in
unbound state.

Add an explicit !instanceName && !sessionId check before the isBoundSession
branch. When hit, emit plugin.chat_message_unbound at WARN level with the
session_id and reason, then skip agent/model mutation. Existing bound and
foreign-session branches are unchanged.
…nbound state

Addressed two Copilot xreview findings:
1. Re-introduced a cheap check for HCOM_PROCESS_ID in bindIdentity() to prevent
   manual OpenCode GUI launches from unnecessarily starting the TCP notify server
   and spawning failing opencode-start subprocesses.
2. Hardened the chat.message unbound guard to explicitly check !instanceName
   instead of the combined (!instanceName && !sessionId) check. This prevents
   isBoundSession(null) from improperly evaluating to true on chat messages
   lacking a session_id when binding previously failed.
@mmkzer0 mmkzer0 force-pushed the fix/bindIdentity-headless branch from 1db6d34 to c02743d Compare May 1, 2026 18:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants