@chio/opencode-plugin live smoke test

End-to-end smoke proving every chio_* tool in @chio/opencode-plugin works against the real arc binary (standalone/arc/target/release/arc), the real @chio/bridge@0.1.0, and the real chio-test-harness (trust plane + MCP edge as arc subprocesses).

No mocks. No stubs. The only shim is a markBond() call that flips the plugin's in-memory bond state, needed while ChioBridge.bond() v0.1.0 still hits the 422 documented in /tmp/chio-debate/SMOKE_HARNESS_VERIFY.md §2. Everything downstream of that (policy check, receipts, replay, signature verify) runs real.

How to run

cd standalone/chio-open-code-plugin
./smoke.sh

The runner is idempotent: a second back-to-back run reuses the harness clone at /tmp/chio-smoke-opencode/. The latest log is written to smoke-results/latest.log (git-ignored).

Ports used: trust plane 8942, MCP edge 8933 (deliberately distinct from the other smoke-test agents: claude-code 8931/8940, codex 8935/8944). Total runtime under 30 s.

Headless invocation path

OpenCode has opencode run, opencode serve, and opencode acp subcommands. opencode run and opencode serve both require a configured LLM provider (the smoke environment has no API key). The plugin contract, however, is a pure async factory (input) => Promise<Hooks> plus tool.execute.before/after functions — OpenCode itself is just the caller.

Chosen path: direct-plugin-invocation. smoke/driver.mjs loads the built plugin as an ES module, calls its default export with a mock PluginInput, and invokes the tools' execute() functions and both hooks directly. The hooks run the real ChioBridge.check against the real policy through the real arc CLI. The wrap/allow/deny/budget paths drive real MCP JSON-RPC against the arc MCP edge on 8933 (and against a second, per-wrap edge that chio_wrap spawns on a random port).
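The direct-invocation approach described above can be sketched roughly as follows. This is not the contents of smoke/driver.mjs: the Hooks and PluginFactory shapes are trimmed, hypothetical stand-ins for OpenCode's real types, and fakePlugin is an inline placeholder so the sketch runs without the built plugin.

```typescript
// Hypothetical, trimmed shapes. The real PluginInput and Hooks types come from
// OpenCode and carry more fields than shown here.
type ToolArgs = Record<string, unknown>;

interface ToolDef {
  execute(args: ToolArgs): Promise<string>;
}

interface Hooks {
  tool: Record<string, ToolDef>;
  "tool.execute.before"?: (call: { tool: string }, output: { args: ToolArgs }) => Promise<void>;
  "tool.execute.after"?: (
    call: { tool: string },
    output: { metadata: Record<string, unknown> },
  ) => Promise<void>;
}

type PluginFactory = (input: { directory: string }) => Promise<Hooks>;

// Placeholder plugin so the sketch is self-contained. A real driver would load
// the built module instead, e.g. `(await import(pathToBuiltPlugin)).default`.
const fakePlugin: PluginFactory = async ({ directory }) => ({
  tool: {
    chio_status: { execute: async () => `status for ${directory}` },
  },
  // A real before-hook would consult the policy; this one allows everything.
  "tool.execute.before": async () => {},
  "tool.execute.after": async (_call, output) => {
    output.metadata["chio.decision"] = "allow";
  },
});

async function drive(plugin: PluginFactory, directory: string) {
  // 1. Call the factory the way the host would.
  const hooks = await plugin({ directory });
  const toolNames = Object.keys(hooks.tool).sort();

  // 2. Invoke a tool's execute() directly.
  const statusTool = hooks.tool["chio_status"];
  if (!statusTool) throw new Error("chio_status tool missing");
  const statusText = await statusTool.execute({});

  // 3. Invoke both hooks around a simulated tool call.
  const before: { args: ToolArgs } = { args: { msg: "hello" } };
  await hooks["tool.execute.before"]?.({ tool: "echo" }, before);
  const after: { metadata: Record<string, unknown> } = { metadata: {} };
  await hooks["tool.execute.after"]?.({ tool: "echo" }, after);

  return { toolNames, statusText, args: before.args, metadata: after.metadata };
}
```

Because the factory and hooks are plain async functions, the driver needs no agent loop or LLM provider; it only has to supply an input object and read back what the hooks mutate.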

This is a partial-host smoke in the sense that OpenCode's agent loop is not running. It still exercises the plugin's whole surface: the default-export factory, its 8 chio_* tool definitions (all asserted by name; chio_guard_add and chio_deploy are not invoked), and both the tool.execute.before and tool.execute.after hooks. The opencode daemon is the only untested dependency, and the plugin's own unit tests already cover its contract conformance.

Plugin claims vs proof

| Capability (README) | Step | Proof |
| --- | --- | --- |
| Default export is async ({client, project, directory, worktree, $, serverUrl}) => Hooks | 2 | pluginModule.default(...) resolves with a tool map; typeof === "function" asserted |
| Exposes chio_init, chio_wrap, chio_guard_add, chio_policy_lint, chio_replay, chio_deploy, chio_doctor, chio_status tools | 2 | 8 tools asserted by name; chio_guard_add / chio_deploy not invoked in the smoke (they shell out to arc-cli subcommands; covered indirectly via chio_doctor guard enumeration) |
| chio_init writes policy.yaml, agent.md, guards/, receipts.db | 3 | All four paths fs.accessed post-invocation; policy round-trips through ChioBridge.loadPolicy |
| chio_init calls ChioBridge.bond() | 3 | Invoked; bond() returns non-arc DID or throws due to v0.1.0 trust-plane 422 (SMOKE_HARNESS_VERIFY §2). Workaround: flip markBond via the plugin's own lib/status module so downstream hooks run |
| chio_wrap spawns arc mcp serve-http -- <cmd> and registers URL in opencode.json | 4 | arc child alive on random port; MCP initialize returns 200 + mcp-session-id; opencode.json gets mcp.<id>.url injected |
| tool.execute.before routes through ChioBridge.check with policy; allow path | 5 | echo({msg:"hello"}) → before-hook keeps args untouched; after-hook tags chio.decision: allow; real MCP JSON-RPC returns {text: "hello"} |
| tool.execute.before denies via real guard | 6 | delete_file({path:"/etc/hosts"}) → __chio_denied.decision === "deny", reason === "requested tool delete_file on server * is not in capability scope". This is arc's tool_access guard fail-closing before forbidden_paths can fire |
| chio_policy_lint flags unknown rule keys | 8 | rules.nonexistent triggers errors > 0; bridge lint report names forbidden_paths, path_allowlist, egress, ..., human_in_loop as the valid set |
| Budget exhaust via rules.velocity | 7 | With tiny-budget.yaml (max_invocations_per_window: 3), iterations 1–3 return charged 50 USD, iterations 4–5 return guard "guard-pipeline" error (fail-closed): guard denied; cancel receipts verified |
| chio_replay 1h prints deterministic trace with per-receipt signature verify | 9 | 7 receipts, 7 verified, 0 invalid. Echo + delete_file + 5× paid_action receipts present in order |
| chio_doctor 5-check battery | 10 | 5 checks (arc binary / trust plane / policy / guards / receipts.db); 0 fail, 1 warn (guards dir empty, which is correct since chio_guard_add was not run) |
| chio_status reports live bond state | 11 | ◉ BONDED · $0.00 / $80.00 · 7 guards · 1/1 allow/block; counters reflect the allow (echo) + block (delete_file) from steps 5–6 |
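For steps 4 and 5, the "real MCP JSON-RPC" traffic amounts to an initialize request followed by tools/call requests that carry the session id returned by the edge. The sketch below only builds those payloads and headers; it is not taken from the driver. The protocolVersion value and the clientInfo fields are assumptions, and the helper names are hypothetical.

```typescript
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params: Record<string, unknown>;
}

// Build the MCP initialize request sent once per session.
function initializeRequest(id: number): JsonRpcRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "initialize",
    params: {
      protocolVersion: "2025-03-26", // assumption: whichever revision the edge negotiates
      capabilities: {},
      clientInfo: { name: "smoke-driver", version: "0.0.0" }, // hypothetical identity
    },
  };
}

// Build a tools/call request, e.g. echo({msg: "hello"}).
function toolCallRequest(id: number, name: string, args: Record<string, unknown>): JsonRpcRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Headers for an HTTP POST to the edge. The session id is omitted on the
// initialize call and echoed back on every later request.
function mcpHeaders(sessionId?: string): Record<string, string> {
  const headers: Record<string, string> = {
    "content-type": "application/json",
    accept: "application/json, text/event-stream",
  };
  if (sessionId !== undefined) headers["mcp-session-id"] = sessionId;
  return headers;
}

const echoCall = toolCallRequest(2, "echo", { msg: "hello" });
console.log(JSON.stringify(echoCall));
```

In a live run each payload would be POSTed to the wrapped URL, with the mcp-session-id response header from initialize passed to mcpHeaders for the calls that follow.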

Transcript excerpt (≤50 lines, live)

--- step 3: chio_init(tool-agent) scaffolds policy + bonds
chio_init output:
scaffolded policy.yaml, agent.md, receipts.db in /tmp/chio-smoke-opencode/ws
preset: tool-agent  capability: cap_ws
not bonded (arc daemon/CLI unreachable — run /chio-doctor)
policy lint clean
✓ step 3 passed (71ms)

--- step 4: chio_wrap echo node hello-mcp/server.mjs
chio_wrap output:
attenuating tools · arc mcp serve-http --policy /tmp/chio-smoke-opencode/ws/policy.yaml --server-id chio_node -- node /tmp/chio-smoke-opencode/hello-mcp/server.mjs
mounted chio_node · http://127.0.0.1:64458
registered in /tmp/chio-smoke-opencode/ws/opencode.json
  wrapped url: http://127.0.0.1:64458  session: fc31f79198aa9851...
✓ step 4 passed (199ms)

--- step 5: echo({msg: 'hello'}) is ALLOWED through wrapped MCP
  echo resp status=200 body={"result":{"content":[{"text":"hello"}],"isError":false}}
✓ step 5 passed (51ms)

--- step 6: delete_file({path:'/etc/hosts'}) is DENIED via plugin policy check
  deny marker: decision=deny guard=(none)
  reason: requested tool delete_file on server * is not in capability scope
✓ step 6 passed (40ms)

--- step 7: paid_action spam on tiny-budget.yaml triggers velocity deny
  iter 1: isError=false text=charged 50 USD
  iter 2: isError=false text=charged 50 USD
  iter 3: isError=false text=charged 50 USD
  iter 4: isError=true text=guard denied the request: guard "guard-pipeline" error (fail-closed)
  iter 5: isError=true text=guard denied the request: guard "guard-pipeline" error (fail-closed)
✓ step 7 passed (214ms)

--- step 9: chio_replay 1h produces deterministic receipt trace
replaying 7 receipt(s) since 2026-04-20T19:56:13.430Z
✓ *:echo  ✓ *:delete_file  ✓ hello-mcp:paid_action ×5
verified 7/7 · invalid 0 · allow 0 · deny 0 · cancel 0

--- step 10: chio_doctor reports health
chio doctor · 5 checks · 0 fail, 1 warn
✓ arc binary: arc-cli 0.1.0
✓ arc trust serve @ http://127.0.0.1:8942: HTTP 200
✓ policy.yaml parses clean
⚠ guards/: no guard crates scaffolded
✓ receipts.db: readable + writable

--- step 11: chio_status reflects live state
◉ BONDED · $0.00 / $80.00 · 7 guards · 1/1 allow/block

all 11 steps passed — total elapsed 3006ms
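The "verified 7/7 · invalid 0" line in step 9 is a tally of per-receipt signature checks. A minimal sketch of such a tally with Ed25519 and Node's built-in crypto module is shown below. The Receipt shape and the canonicalBytes encoding are hypothetical; the real receipt format and ChioBridge.verifyReceipt internals are not documented here.

```typescript
import { generateKeyPairSync, sign, verify } from "node:crypto";
import type { KeyObject } from "node:crypto";

// Hypothetical receipt shape, for illustration only.
interface Receipt {
  tool: string;
  decision: string;
}

interface SignedReceipt {
  receipt: Receipt;
  signature: Buffer;
}

// Assumption: receipts are signed over JSON with a fixed key order.
function canonicalBytes(r: Receipt): Buffer {
  return Buffer.from(JSON.stringify({ tool: r.tool, decision: r.decision }), "utf8");
}

function signReceipt(receipt: Receipt, privateKey: KeyObject): SignedReceipt {
  // Ed25519 takes no separate digest algorithm, hence the null first argument.
  return { receipt, signature: sign(null, canonicalBytes(receipt), privateKey) };
}

// Count how many receipts carry a signature that verifies under publicKey.
function tally(receipts: SignedReceipt[], publicKey: KeyObject) {
  let verified = 0;
  for (const r of receipts) {
    if (verify(null, canonicalBytes(r.receipt), publicKey, r.signature)) verified += 1;
  }
  return { total: receipts.length, verified, invalid: receipts.length - verified };
}

// Demo with a throwaway key pair.
const demoKeys = generateKeyPairSync("ed25519");
const demoReceipts = [
  signReceipt({ tool: "echo", decision: "allow" }, demoKeys.privateKey),
  signReceipt({ tool: "delete_file", decision: "deny" }, demoKeys.privateKey),
];
console.log(tally(demoReceipts, demoKeys.publicKey));
```

Any change to a receipt's fields after signing moves it from the verified count to the invalid count, which is what makes the replay trace tamper-evident.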

Plugin patches applied during this smoke

  • src/tools/chio_wrap.ts: rewritten to spawn arc mcp serve-http directly instead of calling ChioBridge.wrapMcp(cmd). The bridge facade does not forward --policy, --server-id, or --auth-token, all of which arc requires. Flagged for the Wave 2 bridge follow-up: extend ChioBridge.wrapMcp to accept the full option surface so plugins don't have to re-implement the spawn.
  • templates/presets/tool-agent.yaml: dropped the extends: chio://preset/tool-agent line (arc cannot resolve the chio:// URI scheme); moved velocity from extensions.chio.velocity to rules.velocity (where it landed in HushSpec 0.1.0); trimmed extensions.chio.* to the keys arc accepts (signing only; capability and budget are not yet first-class, and arc rejects them). Also added echo to the tool_access.allow list and a forbidden_paths: ["/**"] guard so the preset satisfies both the allow-path and deny-path assertions out of the box.
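The chio_wrap patch comes down to assembling the full argument list itself. A rough sketch of that assembly follows; it is not the code in src/tools/chio_wrap.ts. WrapOptions and buildServeHttpArgs are hypothetical names, and treating --auth-token as a flag that takes a value is an assumption.

```typescript
// Hypothetical option bag; the field names are illustrative.
interface WrapOptions {
  policyPath: string;
  serverId: string;
  authToken?: string; // assumption: --auth-token takes a single value
  command: string[]; // the MCP server command to wrap, e.g. ["node", "server.mjs"]
}

// Build the argv for `arc mcp serve-http ... -- <cmd>`.
function buildServeHttpArgs(opts: WrapOptions): string[] {
  if (opts.command.length === 0) {
    throw new Error("wrapped command must not be empty");
  }
  const args = ["mcp", "serve-http", "--policy", opts.policyPath, "--server-id", opts.serverId];
  if (opts.authToken !== undefined) {
    args.push("--auth-token", opts.authToken);
  }
  // Everything after "--" is the wrapped command, passed through untouched.
  return [...args, "--", ...opts.command];
}

const wrapArgs = buildServeHttpArgs({
  policyPath: "/tmp/chio-smoke-opencode/ws/policy.yaml",
  serverId: "chio_node",
  command: ["node", "/tmp/chio-smoke-opencode/hello-mcp/server.mjs"],
});
console.log(["arc", ...wrapArgs].join(" "));
```

The resulting array can be handed to Node's child_process.spawn("arc", wrapArgs), which avoids shell quoting issues in the wrapped command.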

Decision trace keys

  • Deny reason: "requested tool delete_file on server * is not in capability scope". arc's tool_access guard fires first, before forbidden_paths is evaluated (matches harness-verify §3).
  • Cancel text (budget): guard "guard-pipeline" error (fail-closed): guard denied the request. Raised by the velocity guard on iterations 4 and 5 under max_invocations_per_window: 3.
  • Lint error: rules.nonexistent: unknown rule key "nonexistent" (not in HushSpec 0.1.0 closed schema).
  • Receipts verified: 7/7 (1 echo, 1 delete_file, 5 paid_action), every one ed25519-verified via ChioBridge.verifyReceipt.
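The budget behaviour in the trace (three paid_action calls charged, the fourth and fifth denied) matches what a simple per-window invocation counter would produce. The sketch below assumes a fixed window; arc's actual velocity guard may use a different scheme, and InvocationWindow is a hypothetical name.

```typescript
type Decision = { allowed: true } | { allowed: false; reason: string };

// Fixed-window invocation limiter (an assumption, not arc's implementation).
class InvocationWindow {
  private readonly maxInvocations: number;
  private readonly windowMs: number;
  private windowStart: number | null = null;
  private count = 0;

  constructor(maxInvocations: number, windowMs: number) {
    this.maxInvocations = maxInvocations;
    this.windowMs = windowMs;
  }

  check(nowMs: number): Decision {
    // Open a fresh window on first use or once the current one has elapsed.
    if (this.windowStart === null || nowMs - this.windowStart >= this.windowMs) {
      this.windowStart = nowMs;
      this.count = 0;
    }
    // Fail closed: at the limit, deny without counting the call.
    if (this.count >= this.maxInvocations) {
      return { allowed: false, reason: "velocity limit reached" };
    }
    this.count += 1;
    return { allowed: true };
  }
}

// Five calls inside one window with max_invocations_per_window: 3.
const velocity = new InvocationWindow(3, 60_000);
const outcomes = [0, 1, 2, 3, 4].map((t) => velocity.check(t).allowed);
console.log(outcomes); // [ true, true, true, false, false ]
```

Passing the clock in as an argument keeps the limiter deterministic, which is convenient when a smoke test needs to assert exactly which iteration trips the limit.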