Skip to content

feat(channels/discord): outbound attachments with SSRF-guarded multipart batching#1229

Open
benhoverter wants to merge 106 commits into
RightNow-AI:mainfrom
benhoverter:topic/discord-outbound-attachments
Open

feat(channels/discord): outbound attachments with SSRF-guarded multipart batching#1229
benhoverter wants to merge 106 commits into
RightNow-AI:mainfrom
benhoverter:topic/discord-outbound-attachments

Conversation

@benhoverter

Copy link
Copy Markdown
Contributor

Summary

Lets agents send images and files outbound through the Discord channel.
This is a deliberately narrowed re-submission of the transport half of #1162
(closed 2026-05-12) — see "How this addresses the #1162 review" below.

Scope is exactly one file of production code (crates/openfang-channels/src/discord.rs):

  • New Multipart arm batches N attachments into a single Discord message via
    multipart/form-data, with greedy byte-cap packing for Discord's 25 MiB / 10-file
    limits.
  • Handles ChannelContent::FileData (pre-resolved bytes), and File / Image
    blocks carrying URLs.
  • SSRF guard on every URL fetch: scheme allowlist, literal-IP host check
    (loopback, RFC1918, link-local, unique-local, multicast, unspecified, CGNAT),
    IPv4-mapped IPv6, and redirect revalidation at each hop (URL_FETCH_MAX_REDIRECTS).
  • Streaming download with a hard 25 MiB size-cap and mid-stream abort; 15 s timeout.
  • Body-aware 429 retry-after honored on the multi-file path; partial-send failure
    emits a structured, grep-friendly WARN.
  • Fetcher trait extracted so tests can stub the wire without disabling the guard.

How this addresses the #1162 review

#1162 was closed for two blockers. This branch resolves both.

/cc @jaberjaber23 — you reviewed the original #1162; flagging so you can see both
blockers you raised are addressed below.

  1. Credential-exfil via too-wide allow_roots — the blocking concern. Not
    present in this branch.
    This PR contains no <openfang:attach path=…> parser,
    no default_allow_roots/allow_root logic, and no local filesystem reads at all

    (fs::read / File::open / tokio::fs — none). It transports pre-resolved bytes
    (FileData) and SSRF-guarded URLs only. The local-path attach surface that the
    reviewer flagged is intentionally out of scope here and will land separately
    with allow_roots empty-by-default + a hard ~/.openfang/ refusal, as requested.
    This matches the reviewer's own suggested split ("PR A: smaller, easier review").

  2. Dirty against main (fix(channels/discord): surface image attachments to text-only providers #1143/feat(channels): harden channel_id binding — adapter allowlist, strict validation, single source of truth for routing #1147/runtime/claude_code: materialize image blocks to tmpfile + extract image_cache module #1151/Unintended behavior of OpenFang workspaces #1097 drift) — resolved. Branch is
    rebased on current upstream/main; git merge-tree is clean. It builds on the
    #1143 Multipart types already in main (no duplicate variant) and carries
    no parallel image_cache implementation.

Out of scope (intentional)

Test plan

Tests live in-file (discord.rs test module):

  • SSRF guard: blocks loopback, 10/192.168, link-local, non-HTTP scheme, [::1],
    IPv4-mapped metadata ([::ffff:169.254.169.254]); allows public IP literal;
    asserts errors never leak the query string.
  • Multipart: greedy byte-cap chunking, empty/whitespace-only caption handling,
    mixed Text/Image/File dispatch, empty-multipart rejection, body-aware 429 on the
    aggregated path, SSRF-blocked File URL surfaces send() -> Err.
  • Fixture servers bound to 127.0.0.1 use an explicit test-only ssrf_bypass
    fetcher so the production guard stays enforced everywhere else.

Run before submit: cargo test -p openfang-channels, cargo clippy -p openfang-channels --all-targets -- -D warnings.

Note for review

The Fetcher trait + SSRF guard is the security-sensitive surface — flagging it for
closer review (same as #1162; that engineering was praised, only the path-attach
allow-root sank the prior PR, and it isn't here).

benhoverter and others added 30 commits May 12, 2026 15:34
Standalone crate exposing OpenFang's tool surface to MCP clients (primarily
Claude Code subprocesses) over stdio. Per architectural decision in ANAI-22:
not folded into openfang-runtime — keeps the protocol adapter out of the
kernel/compactor blast radius and the dep graph clean.

This commit is scaffolding only:
  * Cargo manifest with rmcp 1.x (server, transport-io, macros)
  * lib.rs: ToolDispatcher seam trait (runtime-implements, bridge-consumes,
    one-way dep), ToolDispatchError enum, Bridge struct wrapping an
    Optional<Arc<dyn ToolDispatcher>>, single stub `ping` tool
  * main.rs: stdio MCP server entrypoint, tracing -> stderr (stdout is the
    transport), no dispatcher attached
  * Workspace members updated

Identity is bound at Bridge construction time, not per-call — the security
invariant tracked by ANAI-31. Real tool surface mapping lands in ANAI-30.

cargo check -p openfang-mcp-bridge: clean.
cargo check --workspace: clean (pre-existing imap-proto future-incompat
warning unrelated).

Refs: ANAI-22, ANAI-29

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds the daemon-side foundation for the MCP bridge per the ANAI-30 plan
(topology 1b: daemon → CC → bridge → unix socket → daemon dispatcher).

- New `protocol` module in openfang-mcp-bridge: Frame/Hello/HelloAck/
  CallRequest/CallResponse types with length-prefixed JSON framing
  (1 MiB cap, 4-byte BE length prefix). Gated by `ipc-codec` feature
  so type-only consumers can drop the tokio io traits.
- New `bridge_ipc` module in openfang-api: BridgeIpcServer binds
  <home_dir>/run/bridge.sock (0600), accept loop with graceful
  shutdown via Notify, per-connection Hello validation and CallRequest
  → CallResponse loop.
- run_daemon spawns the listener; failure is non-fatal (HTTP keeps
  serving; bridge just unavailable). Socket file removed on shutdown.

Step 1 stub: the dispatcher returns CallResult::Error
("not yet wired"). Step 2 replaces this with a call into
openfang_runtime::tool_runner::execute_tool, scoped to the four-tool
allowlist (file_read, file_list, agent_list, channel_send). Identity
binding + token-table auth land in ANAI-31.

Tests: 3 protocol roundtrip tests + 4 IPC handler tests
(handshake/dispatch end-to-end via tempfile socket, version mismatch
rejection, empty-token rejection).

Refs ANAI-30, ANAI-22.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Replaces the step-1 stub in `BridgeIpcServer` with a real call into
`openfang_runtime::tool_runner::execute_tool`, mirroring the argument
bundle used by the HTTP /mcp endpoint in routes.rs.

- Added ALLOWED_TOOLS allowlist: file_read, file_list, agent_list,
  channel_send. Rejection happens at the protocol layer (CallResult::Error)
  before any kernel touch.
- Added dispatch_call(): snapshots the skill registry, builds a
  KernelHandle from Arc<OpenFangKernel>, and invokes execute_tool.
- ToolResult mapped to CallResult::Ok { content, is_error }, preserving
  the Ok/Error distinction (Error = bridge couldn't dispatch; Ok with
  is_error = tool ran but returned an error).
- Identity stub: caller_agent_id taken at face value from
  CallRequest::agent_id. Real per-spawn token-bound identity lands in
  ANAI-31.

Test: ipc_handshake_and_allowlist_gate verifies wire shape end-to-end:
disallowed tool gets allowlist Error, allowed tool gets Ok response. Real
execute_tool integration tests come once the daemon spawns the bridge
for real (ANAI-31).
…l surface

Replaces the stub `ping` tool with the four ANAI-30 tools (file_read,
file_list, agent_list, channel_send) and wires the bridge binary to forward
each `tools/call` over the daemon IPC socket established in step 1.

Library (lib.rs):
- ToolDispatcher::call now returns DispatchOk { content, is_error }
  preserving the tool-error-vs-dispatch-error distinction across the seam
- built_in_tools() declares the four-tool slice; schemas mirror
  runtime::tool_runner::builtin_tool_definitions() (kept in lockstep)
- Bridge: manual ServerHandler impl (drops the #[tool_router] macro). Filters
  advertised tools by intersecting built_in_tools() with
  ToolDispatcher::allowed_tools(); double-checks before dispatch
- Bridge::new now requires a dispatcher (was Option<_>)

Binary (main.rs):
- Reads OPENFANG_BRIDGE_SOCKET / TOKEN / AGENT_ID env vars (last is stub for
  ANAI-30; ANAI-31 derives identity from token)
- Connects to daemon, performs Hello/HelloAck handshake, exits on rejection
- IpcDispatcher: bridge-side ToolDispatcher impl. Forwards each call via mpsc
  to an actor task that owns the stream; correlation-by-request_id with a
  PendingMap<u64, oneshot> so concurrent tools/call invocations don't
  serialize at the dispatcher layer
- Reader task drains pending oneshots with an error on connection close so
  in-flight calls don't hang; production path exits the process so CC
  notices and tears down (gated behind cfg(not(test)))

Tests:
- lib: built_in_tools_has_anai30_slice, permitted_tools_intersects_with_dispatcher_allowed
- main: ipc_dispatcher_round_trip_and_correlation — fake daemon listener,
  full handshake, two concurrent calls, verifies per-id correlation and the
  NotPermitted gate

Workspace check clean. Daemon-side bridge_ipc tests still pass (4/4).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
End-to-end topology now exists at the type level:
daemon → claude (per-prompt) → openfang-mcp-bridge → IPC → daemon

- Add `caller_agent_id: Option<String>` to CompletionRequest. Plumbed
  through all construction sites; agent_loop populates it with
  session.agent_id, everywhere else passes None.
- Daemon (`server.rs::run_daemon`): after BridgeIpcServer starts,
  publish OPENFANG_BRIDGE_SOCKET and OPENFANG_BRIDGE_BIN as process env
  for subprocess drivers to discover. Bridge bin defaults to a sibling
  of current_exe; operators can override with OPENFANG_BRIDGE_BIN. Both
  set with `unsafe` (edition 2024) but only during single-threaded
  daemon startup, before any subprocess spawns.
- BridgeIpcServer gains `socket_path()` accessor.
- ClaudeCodeDriver: per-spawn `try_build_bridge_mcp_config`. When
  caller_agent_id is set AND both discovery env vars are present,
  generate a UUID token, write `<home>/run/cc-mcp-<uuid>.json` (0600),
  and add `--mcp-config <path> --strict-mcp-config` to the claude args.
  RAII guard removes the file on drop so per-spawn token lifetime is
  bounded by the CC subprocess.
- apply_env_filter extended to strip OPENFANG_BRIDGE_* from CC's child
  env. Bridge gets these only via the explicit `env` map in the
  mcp-config — CC inheriting them would risk a stray bridge picking up
  the daemon socket without a fresh per-spawn token.
- Tests:
  - test_build_bridge_mcp_config_shape — verifies wire shape claude
    expects: mcpServers.openfang.{command,args,env} with exactly the
    three discovery vars in env (no extras to leak state).
  - test_apply_env_filter_strips_bridge_discovery_vars — confirms
    filter removes all four bridge vars from CC's child env.
  - test_bridge_mcp_config_drop_removes_file — RAII cleanup invariant.

Stub points still flagged: token validated as non-empty (ANAI-31
replaces with daemon-issued per-spawn token table); agent_id taken
in-band from CallRequest (ANAI-31 derives from token).

11 CC driver tests pass. bridge_ipc (4) and bridge crate (6) tests
unchanged. Workspace check clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…NABLED

Default-off kill switch so we can deploy the bridge code path without
inlining it into every CC invocation. When the gate is unset or not in
{1, true}, try_build_bridge_mcp_config returns None and CC is spawned
exactly as it was pre-step-4 — no --mcp-config, no temp file, no bridge
child. Validation flow: deploy with gate off (sanity), launchctl setenv
OPENFANG_BRIDGE_ENABLED 1, bounce daemon, observe; if anything regresses,
flip back to 0 and bounce for instant recovery.

Daemon still starts the IPC listener and publishes BRIDGE_SOCKET/BIN env
unconditionally — both are harmless without a bridge child connecting.
Pure additive switch; zero behavior change when off.

Test exercises the full truth table for bridge_enabled() (unset, truthy
variants, falsy/garbage variants) and confirms the gate suppresses
config generation regardless of other env. Single test owns the global
env var so no serial_test infra needed.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Bridge IPC handshake works standalone (bridge binary connects + Hello/HelloAck
ok against the live socket), and the daemon-side `wired CC --mcp-config for
OpenFang bridge` debug line confirms the flag is being passed to claude. But
no `bridge IPC accepted connection` events ever fire — meaning claude is
launched with `--mcp-config` but isn't spawning the MCP server subprocess.

Without `--debug`, claude swallows MCP launch errors silently. And we drop
CC's stderr on success spawns, so any silent rejection is invisible.

Add (both spawn paths):
- `--debug` flag when bridge config is wired, so MCP errors print to stderr.
- Always log a 4 KB tail of CC stderr at info when bridge_wired, regardless
  of success/failure. Streaming path now drains stderr concurrently to avoid
  pipe deadlock under chatty --debug output.

Existing 12/12 claude_code unit tests still pass; workspace check clean.

Diagnostic only — once the cause is identified we'll pare back to bounded
on-demand logging.
- bridge_ipc: promote handshake/dispatch events to INFO and add an
  `accepted connection` log on accept. Operators can now observe the
  full bridge lifecycle from daemon stderr without crawling through
  ~/.claude/debug/<uuid>.txt.
- claude_code driver: gate --debug + the 4 KB CC-stderr-tail diagnostic
  behind a new OPENFANG_BRIDGE_DEBUG env var (off by default). With
  proper INFO logs daemon-side, the noisy --debug output and the
  per-spawn ~/.claude/debug/ files are no longer load-bearing.
- server: validate operator-supplied OPENFANG_BRIDGE_BIN path at boot
  and log the resolution outcome (override vs. probe). Catches deploy
  ordering bugs where the env points at a binary that doesn't exist.

Stderr is still drained concurrently in the streaming path — required
whenever --debug might be on, cheap when it isn't.
The MCP bridge IPC is unix-domain-socket-only by construction (daemon
listens on a unix socket; bridge subprocess connects to it). The bridge
crate and the daemon-side `bridge_ipc` module unconditionally imported
`tokio::net::{UnixStream, UnixListener}`, which broke Windows CI with
E0432 unresolved-import errors in `openfang-mcp-bridge::main` and
`openfang-api::bridge_ipc`.

Gates:
- `openfang-mcp-bridge::main` — entire body cfg-gated to `unix`; on
  non-unix the binary is a no-op stub that prints a clear message and
  exits non-zero. Tests gated `cfg(all(test, unix))`.
- `openfang-api::lib` — `pub mod bridge_ipc` gated to `unix`.
- `openfang-api::server::run_daemon` — `BridgeIpcServer::start` call
  gated to `unix`; non-unix logs a single info line and proceeds without
  bridge IPC. The CC driver's existing missing-socket fallthrough means
  CC subprocesses spawn without `--mcp-config` on Windows, matching the
  bridge-disabled path.

No behavioral change on Linux/macOS. Windows users get a daemon that
boots without bridge support; MCP-routed tools are unavailable until a
Windows-native transport (named pipes / TCP loopback) lands as a
follow-up.

Verified: cargo check --workspace, cargo check --workspace --tests,
cargo test -p openfang-mcp-bridge -p openfang-api --lib, cargo fmt
--check, and cargo clippy all clean on macOS.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Introduce the security primitives that will gate bridge IPC handshake:

- openfang-types::bridge_auth::Token — 32-byte CSPRNG-generated opaque
  token. No Debug impl (anti-leak); only an 8-hex-char fingerprint() for
  logs. Constant-time equality, hex round-trip. 8 unit tests.

- openfang-runtime::bridge_auth — TokenIssuer trait (issue/revoke) and
  SpawnGuard. Guard holds Arc<dyn TokenIssuer>; Drop revokes. Trait lives
  here so the Claude Code driver (phase B) can hold Arc<dyn TokenIssuer>
  without a circular dep on openfang-api. 2 unit tests via stub issuer.

- openfang-api::bridge_auth::BridgeAuthority — concrete TokenIssuer impl.
  Mutex<HashMap<Token, AgentId>> behind Arc::new_cyclic so issued guards
  carry a Weak<Self> back for self-revocation. Inherent resolve() and
  live_spawn_count() for the IPC dispatcher and Debug impl. Manual Debug
  redacts to spawn count (Token has none by design). 7 unit tests,
  including an Arc<dyn TokenIssuer> round-trip that exercises the exact
  abstraction the driver will hold.

Dep direction respected: openfang-api -> openfang-runtime -> openfang-types.
No wiring yet; phase B threads Arc<dyn TokenIssuer> into ClaudeCodeDriver,
phase C constructs the Arc<BridgeAuthority> in the daemon and hands it to
both BridgeIpcServer and the driver factory.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Wire the spawn side of the bridge handshake. The driver can now mint
authenticated bridge tokens via the daemon-side authority, with the legacy
UUID path preserved so dev builds without an issuer don't regress.

- ClaudeCodeDriver gains token_issuer: Option<Arc<dyn TokenIssuer>>.
  Builder with_token_issuer(...) for phase C wiring; new()/with_timeout()
  default to None.

- BridgeMcpConfig gains _guard: Option<SpawnGuard>. Lifetime-only field
  (underscore-prefixed) — when the spawned process exits and the config
  drops, SpawnGuard::drop revokes the token from the authority's spawn
  table.

- try_build_bridge_mcp_config is now a &self method with two branches:
    * issuer present  -> parse caller_agent_id as AgentId, refuse to wire
      if it doesn't parse (no anonymous tokens), then issuer.issue(...) and
      emit guard.token().to_hex() (64-char hex) on OPENFANG_BRIDGE_TOKEN.
      Debug log carries token fingerprint.
    * issuer absent   -> legacy UUID path (renamed
      generate_legacy_bridge_token) — preserves current ANAI-30 behavior
      where any non-empty token is treated as authenticated.

- Both call sites in complete() and stream_complete() updated to call
  through self. Env var name OPENFANG_BRIDGE_TOKEN unchanged.

- Two test fixtures updated to construct via the driver and to set
  _guard: None on synthetic BridgeMcpConfig values. claude_code tests:
  21/21 green.

Phase C will construct Arc<BridgeAuthority> in openfang-api::server and
hand it to both BridgeIpcServer::start and the driver factory.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Phase C1 of bridge auth wiring. create_driver now accepts an optional
TokenIssuer; the Claude Code driver consumes it via with_token_issuer.
OpenFangKernel gains a token_issuer slot + setter/getter; resolve_driver
threads the issuer through to fresh driver builds. Boot-time and
one-shot probe sites keep the legacy UUID path (None). Agent-loop
fallback sites are marked for C2.
…loop

Phase C2 of bridge auth wiring. KernelHandle gains a default-None
token_issuer() accessor; OpenFangKernel overrides it to expose the
daemon's BridgeAuthority. The agent loop reads the issuer via
KernelHandle and threads it into call_with_retry / stream_with_retry,
which forward it to create_driver on the ModelNotFound fallback path so
fallback drivers also get hardened bridge tokens. server.rs constructs
the BridgeAuthority at daemon startup (unix-gated) and hands it to the
kernel before background agents start.
…r connection

Phase D of bridge auth wiring. validate_hello is renamed to
authenticate_hello and now resolves the presented token through the
daemon's BridgeAuthority. Well-formed hex tokens that the authority
issued resolve to an AgentId that overrides the bridge's claimed
agent_id on every subsequent CallRequest (with a warn! on mismatch).
Well-formed hex tokens the authority never issued are rejected. Non-hex
tokens fall through to the ANAI-30 legacy path for back-compat with
spawn sites that haven't yet been wired through the TokenIssuer (boot-
time create_driver in particular). Handshake log line now includes the
token fingerprint and authenticated agent id for audit correlation.
BridgeIpcServer::start gains the authority argument; server.rs hoists
the authority out of the C2 cfg block so it can be threaded in.
…gents

Phase E of the bridge tool surface v2 work closes a boot-time loophole
left by phases C1/C2/D: agents brought up during kernel boot
(autostart + persisted) were instantiated before the post-boot
`set_token_issuer` call, so their drivers were baked with legacy-UUID
identity even though every later code path was already on the hardened
BridgeAuthority/TokenIssuer track.

Changes:
- kernel: add `boot_with_config_and_issuer(config, Option<Arc<dyn TokenIssuer>>)`;
  the existing `boot_with_config` becomes a thin wrapper passing `None`.
  The issuer is stashed into the kernel's `token_issuer` slot before the
  boot-time driver chain is built, so the three in-boot `create_driver`
  sites see the hardened path. Adds two unit tests covering both entry
  points.
- cli/main: construct `BridgeAuthority` before kernel boot and pass
  `Some(issuer)` into the new entrypoint; thread the same `Arc` into
  `run_daemon`.
- server::run_daemon: gain `#[cfg(unix)] bridge_authority` param;
  remove the now-redundant post-boot `set_token_issuer` call.
- bridge_auth: add `as_token_issuer()` ergonomic helper.
- bridge_ipc: comment refresh - the legacy lane is now reserved for
  non-unix / non-daemon callers only.

Tests: 307/307 green, including the two new boot_with_config tests.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…IDGE_ALLOWED

The bridge subprocess has been deafer than the rest of the system:
every CC spawn fell back to the static four-tool DEFAULT_ALLOWED set
inside `openfang-mcp-bridge`, regardless of what the calling agent's
`agent.toml` actually permits. Capability/RBAC decisions made by the
kernel and agent loop didn't reach the bridge.

This commit plumbs the existing source of truth — `agent.toml` →
kernel-resolved `available_tools` → agent loop — into the bridge's
env via `OPENFANG_BRIDGE_ALLOWED`:

- Add `allowed_tools: Option<Vec<String>>` to `CompletionRequest`.
- Populate at both agent_loop call sites from `available_tools`.
- Other 9 construction sites (compactor, routing tests, driver tests,
  HTTP probe, kernel router probe, fallback test helper) pass `None`
  — they don't drive bridge subprocesses or don't have an agent context.
- Thread the list through `try_build_bridge_mcp_config` →
  `build_bridge_mcp_config_value`, which now emits
  `OPENFANG_BRIDGE_ALLOWED` in the per-spawn `--mcp-config` env map.
- Strip `OPENFANG_BRIDGE_ALLOWED` in `apply_env_filter` so a CC child
  never inherits a stale allowlist.
- `None` → env absent → bridge falls back to its hard-coded default,
  matching today's behavior. Empty list is meaningful: it emits the
  env var as the empty string so the bridge produces an explicit
  zero-tool surface instead of silently defaulting.

Tool surface change: zero. This is pure plumbing; the bridge still
only knows about the ANAI-30 four-tool slice. Adding `agent_send` to
that slice ships in the next commit.

Tests:
- New `test_build_bridge_mcp_config_emits_allowed_tools` asserts the
  comma-separated env var lands in the config.
- New `test_build_bridge_mcp_config_emits_empty_allowed_tools_explicitly`
  pins the empty-list-vs-absent distinction.
- Existing `test_build_bridge_mcp_config_shape` updated to assert
  `OPENFANG_BRIDGE_ALLOWED` is *absent* when no list is supplied.
- Existing env-filter test extended to assert the new var is stripped.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
First tool added to the bridge surface past the original ANAI-30 slice.
Now that the previous commit makes the bridge honor per-agent capability
gates (via OPENFANG_BRIDGE_ALLOWED sourced from each agent's agent.toml),
expanding the bridge's *capability* set is safe: agents whose toml does
not list agent_send will not see or be able to dispatch it.

Schema mirrors `openfang_runtime::tool_runner::builtin_tool_definitions`
→ agent_send entry. Kept in sync by hand; the bridge crate is
runtime-free by design and can't import the source.

- `built_in_tools()` in openfang-mcp-bridge gains the agent_send Tool.
- `DEFAULT_ALLOWED` in openfang-mcp-bridge/src/main.rs gains agent_send
  so the legacy/dev path (env var unset) still advertises every tool
  the bridge can dispatch. Production daemon spawns always set the env.
- `bridge_ipc::ALLOWED_TOOLS` (daemon-side ceiling) gains agent_send so
  the dispatch_call gate doesn't reject it.
- `built_in_tools_has_anai30_slice` → `built_in_tools_surface`: now
  asserts the five-tool set and includes a drift-warning comment.
- `permitted_tools_intersects_with_dispatcher_allowed` swaps `agent_send`
  for `shell_exec` as its "unknown to the bridge" example, since
  agent_send is no longer unknown.
- Bridge `instructions` string updated: drops the ANAI-30 enumeration,
  explains the gating model instead.

Tool surface effect: agents whose agent.toml grants `agent_send` (which
already includes the coder agents that have been using it via the
legacy path) can now call it through MCP. Agents that don't list it
remain unable to.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The bridge IPC path (`bridge_ipc::dispatch_call`) and HTTP `/mcp`
endpoint both passed `workspace_root: None` to `execute_tool`, which
caused `resolve_file_path` to fall through to a `..`-only validation
check. Absolute paths bypassed it entirely and bare-relative paths
resolved against the daemon CWD (`~/.openfang`), allowing any agent
advertised `mcp__openfang__file_read`/`file_list` to read every sibling
workspace plus `secrets.env` and GCP service-account JSON sitting at the
openfang root.

Fix:
- bridge_ipc: look up workspace via authenticated `AgentId` → registry
  manifest. Refuse FS tools when no workspace is registered rather than
  silently falling back to an unscoped view.
- routes (`POST /mcp`): same gate; HTTP has no ambient agent identity,
  so callers must pass `_agent_id` in arguments to scope FS calls.
  Kernel-level tools (agent_list, channel_send, etc.) keep working
  without `_agent_id` since they don't touch the filesystem.
- Both paths now thread `workspace_root` through to `execute_tool`, so
  `workspace_sandbox::resolve_sandbox_path` actually runs — canonicalize-
  then-prefix-check blocks absolute escapes and symlink traversal; the
  existing `..` denial covers relative traversal.

Smoke (post-rebuild, from `coder-learn-rust` with file_read/list):
- `secrets.env` (bare relative) → ENOENT inside sandboxed workspace
- `/Users/rlyeh/.openfang/secrets.env` → "Access denied: resolves
  outside workspace"
- `../assistant/IDENTITY.md` → "Path traversal denied"
- `IDENTITY.md` → success (workspace-relative legitimate read works)
- `file_list .` → returns the agent's own workspace contents only;
  zero openfang-root entries

Deferred: defensive log on disallowed-tool invocation at execution
time (thread C), and removing `tool_runner`'s `workspace_root: None`
legacy fallback (used only by tests).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Defense-in-depth against a misbehaving subprocess crafting JSON-RPC calls
to tools outside its advertised set. Restructures dispatch_call into three
ordered gates:

1. Static bridge allowlist (unchanged) — reject unknown tools pre-identity.
2. Identity resolution (hardened) — authenticated AgentId wins; legacy lane
   requires claimed string to parse as AgentId AND have a registry entry.
   Closes the "trust the claimed string" loophole (ANAI-30 follow-up).
3. Per-agent permission gate (new) — calls the canonical resolver
   kernel.available_tools_with_registry + AgentMode::filter_tools, the
   exact pair agent_loop uses to build OPENFANG_BRIDGE_ALLOWED at spawn.
   Advertise-time and execute-time gates cannot drift.
4. Workspace sandbox (D-fix, unchanged) — FS tools require a workspace.

Snapshot construction now matches agent_loop's bundled→global→workspace
layering, so workspace skill overrides are visible bridge-side.

Rejections log warn! with request_id, tool, agent, mode, permitted_count.
Caller sees a generic "tool 'X' not permitted for this agent".

Promotes available_tools_with_registry from fn to pub fn.

Verified end-to-end via raw IPC frames: rejection on agent_list for
coder-learn-rust (Gate 2), positive path on file_read, identity gate
rejection for unregistered UUID and unparseable legacy claim, and Gate 1
rejection for shell_exec outside bridge surface.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Add file_write and web_fetch to the bridge tool surface so CC-backed
agents reach parity with API-backed agents for FS-write and HTTP-out.
Both inherit the existing four-gate pipeline (static allowlist, identity,
per-agent permission via available_tools_with_registry, FS sandbox).

- openfang-mcp-bridge: hand-mirrored schemas in built_in_tools(); drift
  catcher test updated. Both added to DEFAULT_ALLOWED.
- openfang-api/bridge_ipc: both added to ALLOWED_TOOLS. file_write added
  to FS_SANDBOXED_TOOLS; web_fetch deliberately not (no FS touch — SSRF
  guardrails live in the runtime impl, including the external-content
  delimiter wrapper).

Smoke (raw-IPC harness): permitted positive + unpermitted negative for
each tool, all four pass with matching daemon warn lines.
Adds the four remaining agent-lifecycle tools to the MCP bridge surface,
bringing parity with OpenFang's native kernel tool set for agent control.

- openfang-mcp-bridge/lib.rs: register all four in built_in_tools();
  drift-catcher test bumped to 11 tools.
- openfang-mcp-bridge/main.rs: add to DEFAULT_ALLOWED.
- openfang-api/bridge_ipc.rs: add to ALLOWED_TOOLS. Not added to
  FS_SANDBOXED_TOOLS (kernel-only, no filesystem touch).

Gate behavior is inherited from existing plumbing:
- Gate 1 (static allowlist): all four now pass.
- Gate 2 (per-agent capability via agent.toml): enforced through
  the canonical resolver, no extra hardcoded checks.
- Gate 3 (FS sandbox): skipped by design.

Smoke: 8/8 scenarios green (positives via coder-openfang against a
throwaway coder-smoketest agent; negatives via code-reviewer hit
Gate 2 cleanly with permitted_count=5 matching her agent.toml).
Full spawn → activate → kill lifecycle verified end-to-end through
the bridge. Smoketest workspace + registry entry cleaned up.
Brings the MCP bridge surface to 13 tools, leaving only shell_exec
pending the Claude Code Bash-disable spike. Memory tools are gated
per-agent via agent.toml; not sandboxed at the FS layer since memory
writes route through kernel-managed workspace memory.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Material change: OpenFang now authoritatively blocks Claude Code's native
filesystem, shell, and web tool surface for every CC subprocess it spawns,
by injecting a per-spawn settings.json (written under the bridge socket
dir, removed on drop) and passing it via `claude --settings <file>`.

Why this works alongside `--dangerously-skip-permissions`: per Anthropic's
settings documentation, `permissions.deny` rules are enforced even when
the YOLO flag is set — that flag only bypasses allow/ask prompts, not
security-critical denies. The daemon keeps non-interactive operation
(no TTY to answer prompts) while still authoritatively blocking the
native surface.

Deny set: Bash, Read, Write, Edit, MultiEdit, NotebookEdit, WebFetch,
WebSearch, Glob, Grep. The MCP namespace (`mcp__openfang__*`) lives in
a separate pattern space and is untouched — that's the point: replace
the bypassable native surface with the gated, RBAC-checked bridge
surface. TodoWrite and Task (subagent) are deliberately left alone:
agent-internal control flow with no escape to the host.

Gated on `bridge_enabled()` so the deny set co-travels with the bridge
wiring — either both or neither. Without the bridge wired the agent has
no `mcp__openfang__*` fallback, so blanket-denying native tools would
yield a useless agent.

Also retroactively earns the docstring claim in
`llm_driver.rs::skip_permissions`: before this commit the claim that
"OpenFang's RBAC layer makes YOLO safe" was true at the bridge wire but
false at CC's wide-open native surface. With native tools denied, the
bridge is now the only path to host resources, and per-agent
agent.toml capabilities become authoritative for CC subprocesses.

Tests:
- test_build_cc_settings_shape: wire shape, bare tool names only
- test_cc_native_deny_includes_glob_grep: pins the deny set decisions
- test_cc_settings_file_drop_removes_file: RAII cleanup

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Commit 13a denies CC's native Bash via --settings; every CC-driven agent
now needs shell access through the OpenFang bridge instead. This adds
`shell_exec` to the bridge allowlist and threads the two pieces of
context the runtime needs to dispatch it safely:

  * `exec_policy` resolved from the registered manifest (per-agent
    override or kernel fallback, already populated by the kernel at
    agent boot — kernel.rs:1440-1447). Drives the allowlist/full/deny
    decision plus shell-metacharacter rejection in tool_runner.
  * `allowed_env_vars` read from `manifest.metadata["hand_allowed_env"]`,
    mirroring agent_loop.rs:320 — same lookup, same semantics, so the
    bridge path and the in-proc path see identical hand-granted env.

`shell_exec` joins `FS_SANDBOXED_TOOLS`: tool_runner sets the command's
cwd to the workspace_root, so without a registered workspace the shell
would default to the daemon CWD (`~/.openfang`, where secrets.env and
the GCP service-account JSON live). Refusing the call when no workspace
is registered keeps that closed.

Tests:
  * `allowlist_contains_shell_exec` — name-locked allowlist membership
  * `shell_exec_is_workspace_sandboxed` — name-locked sandbox membership
  * Updated `ipc_handshake_and_allowlist_gate` to use a synthetic
    non-tool name as the "not on allowlist" example; shell_exec is now
    permitted there.

Backcompat: agents that already declared `shell_exec` in their
capabilities get it via the bridge with the same gating they had
in-proc. Agents that didn't, can't — gate 2 still rejects.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Commit 13a's deny set blocks CC's native `WebSearch`; researcher agents
(researcher, researcher-autumn-business, researcher-autumn-medical)
depend on this surface for primary research. Restoring it through the
bridge means the request flows through OpenFang's multi-provider chain
(Tavily → Brave → Perplexity → DDG) instead of CC's own provider, which
is also a sturdier behavior win — we control the retry/fallback policy.

Implementation is a one-line allowlist add. Every prerequisite was
already in place:
  * `web_ctx` is already passed to `execute_tool` at the dispatch site.
  * `web_search` is already a fully-implemented native tool with a
    ToolDefinition (tool_runner.rs:645) and provider chain
    (tool_runner.rs:233).
  * Per-agent capability gating already honors `web_search` against the
    agent.toml's resolved surface (gate 2 at bridge_ipc.rs:475).

No new context arg, no new FS sandbox concerns (pure-net tool). The
smallest bridge add to date.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Commits 9bd3e48 (13b) and 64078d0 (13d) added shell_exec and web_search
to the daemon-side ALLOWED_TOOLS allowlist in bridge_ipc, but missed the
two corresponding files on the bridge subprocess side:

- crates/openfang-mcp-bridge/src/lib.rs :: built_in_tools()
  The MCP tools/list surface advertised to CC subprocesses. Without an
  entry here, CC never sees the tool name or schema; permitted_tools()
  intersects this list with the dispatcher's allowed_tools, so the
  daemon's allowlist alone is not enough.

- crates/openfang-mcp-bridge/src/main.rs :: DEFAULT_ALLOWED
  Fallback allowlist used when OPENFANG_BRIDGE_ALLOWED is unset
  (legacy / dev spawn path). Mirrors the daemon's bridge_ipc list.

Net effect: with 13a deploying the deny of CC's native Bash/WebSearch
tools, CC agents would have lost shell + web search entirely on deploy.
This commit closes that gap so the bridge actually delivers what
13b/13d advertised.

Surface drift test (built_in_tools_surface) updated with both names.
The permitted_tools intersection test now uses a synthetic
'not_a_real_tool' name since shell_exec is no longer an outsider.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Bridges OpenFang's native apply_patch tool (registered in tool_runner.rs
at the ToolDefinition table, dispatched at line 209) end-to-end so CC
subprocesses can perform surgical multi-hunk edits instead of paying
the token + drift cost of whole-file file_write rewrites.

This is the closest analogue to CC's native `Edit` tool that OpenFang
ships today — slightly higher emit cost (unified diff vs string
substitution) but the same core ergonomic: change only what changes.
A native `string_edit` follow-up may replace this as the primary edit
ergonomic for agents; for now apply_patch closes the worst of the
"no Edit" gap left by 13a's CC native-tool deny set.

Three files, mirroring every prior bridge-tool add since 908c39c:

  - crates/openfang-api/src/bridge_ipc.rs
      • apply_patch added to ALLOWED_TOOLS
      • apply_patch added to FS_SANDBOXED_TOOLS — tool_apply_patch
        resolves Add/Update/Delete paths embedded in the patch against
        workspace_root, so the no-workspace fail-closed gate is critical
        (same sibling-leak class as file_write)
      • Two name-locked tests: allowlist + sandbox membership

  - crates/openfang-mcp-bridge/src/lib.rs
      • Tool::new("apply_patch", ...) added to built_in_tools()
        (mirrors the runtime schema by hand, as the bridge crate is
        runtime-free by design)
      • built_in_tools_surface test updated

  - crates/openfang-mcp-bridge/src/main.rs
      • apply_patch added to DEFAULT_ALLOWED

No new plumbing required — workspace_root + the kernel handle are
already threaded into the bridge's execute_tool call site, and the
native apply_patch implementation already enforces the sandbox via
crate::apply_patch::apply_patch(&ops, root).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Commit 18 closes the gaps surfaced by the deny-set audit
(plan §10d). Adds 14 entries to CC_NATIVE_DENY, bringing it
from 10 → 24 names, grouped into five categories:

- Bash adjuncts / shell streamer: BashOutput, KillShell,
  KillBash, Monitor. Inert today (Bash is denied) but locked
  down as defense in depth — Monitor in particular is
  Bash-with-streaming and bypasses shell_exec's exec_policy
  gating entirely.
- FS escape via worktree: EnterWorktree, ExitWorktree.
  Direct FS-escape primitive; native workspace-aware variant
  is on the follow-up backlog.
- Notebook read: NotebookRead. Symmetry with NotebookEdit;
  forward-compat against future CC versions that ship it.
- Skill substrate: SlashCommand. CC's parallel skill
  curation surface; OF's is canonical.
- OpenFang-First scheduling / control plane: CronCreate,
  CronDelete, CronList, ScheduleWakeup, RemoteTrigger, plus
  PushNotification on the comms-routing axis.

Deliberately NOT denied (pinned in tests): TodoWrite, Task,
Skill, AskUserQuestion, EnterPlanMode, ExitPlanMode,
TaskOutput, TaskStop, ToolSearch — agent-internal control
flow / UX with no escape surface.

Tests: existing pins extended; new
test_cc_native_deny_covers_audit_gaps locks in each new
addition with a category-rationale assertion. 20/20 in
drivers::claude_code green; 1005/1005 openfang-runtime lib
green.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Locks in the invariant that the bridge tool surface is described by three
co-equal lists, all of which must agree:

  1. openfang_api::bridge_ipc::ALLOWED_TOOLS    (daemon dispatch gate)
  2. openfang_mcp_bridge::built_in_tools()      (MCP advertise surface)
  3. openfang_mcp_bridge::DEFAULT_ALLOWED       (bridge process default)

13b/13d shipped with a tool dispatchable on the daemon side but absent
from the MCP advertise surface — invisible to CC. The existing IPC smoke
tests exercised the daemon path directly and never hit `tools/list`, so
the gap shipped silently in two commits in a row. This commit closes
that class of failure permanently.

Changes:
- Move `DEFAULT_ALLOWED` from `main.rs` (binary, untestable) into
  `lib.rs` as `pub const`, so the drift-catcher can reach all three sets
  from `openfang-api` tests. `main.rs` now imports it.
- Add `allowlist_three_way_correspondence` in `bridge_ipc::tests` —
  asserts the three sets are equal, with diff output identifying which
  set is missing what when it fires.
- Add `allowlist_count_is_sixteen` — pins cardinality on all three sets
  so an off-by-one add/remove on any single side fails loudly.

Both tests are documented with the lesson from 13b/13d and the
three-file pattern any future bridge tool add must follow.

Test results:
- openfang-api lib: 13/13 bridge_ipc tests pass (was 11; +2 new)
- openfang-mcp-bridge lib: 5/5
- openfang-runtime lib: 1005/1005
- openfang-runtime drivers::claude_code: 20/20

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The exec_policy deny path in shell_exec and process_start formatted the
rejection message as "{reason}. Current exec_policy.mode = ...", but
validate_command_allowlist already terminates its reason string with a
period. Result was "safe_bins.. Current ..." in every deny payload.

Trim a trailing "." off the reason before interpolating so the rendered
message has a single, correctly-placed period.

Cosmetic only - no behavior change to the deny gate itself.
benhoverter and others added 29 commits June 4, 2026 11:15
…_DENY

Closes a subprocess-sandbox gap missed during bridge tool-surface work. Three CC-native tools bypassed the mcp-bridge enforcement seam by construction.

- Task/Skill are parallel control-plane substrates. OpenFang owns orchestration via agent_spawn/agent_send and skill curation via skill_execute. This reconciles with the already-denied SlashCommand, which is the same skill room.

- AskUserQuestion is headless-inert under -p/stdin-null. An explicit deny tool_result replaces the silent dead-end that plausibly fed phantom picker-failed openers.

Removed the three from the must_allow test loop and added positive deny assertions. Loop-critical control flow stays allowed: TodoWrite, EnterPlanMode/ExitPlanMode, TaskOutput/TaskStop, ToolSearch.
Path-tier filesystem governance config layer, mirroring ExecPolicy's
ergonomics. Inert by default (enabled=false means the legacy workspace
clamp applies, byte-for-byte). Fail-closed when enabled
(default_tier=Deny). Adds the global Config.file_policy field, the
per-agent AgentManifest.file_policy override with a lenient serde shim,
and FilePolicy::tier_for (pure longest-prefix match). No consumer yet.
Threading into the tool layer lands in later commits.
…r resolver

Refactors workspace_sandbox.rs so the absolute, non-overridable parts (the
"floor": ..-rejection, canonicalization, sensitive-path deny-list) are
separable from the workspace clamp:

  - sandbox_floor()           floor only, returns the canonical candidate
  - enforce_workspace_clamp() the legacy starts_with(workspace_root) check
  - resolve_sandbox_path()    floor + clamp; public signature unchanged, so
                              existing callers compile untouched
  - resolve_with_policy()     floor, then tier evaluation when a FilePolicy
                              is active (else the legacy clamp)

The sensitive-path deny-list (~/.openfang config/secrets/vault/run/daemon
logs/credential files, and ~/.mcp-auth OAuth tokens) is ported verbatim
from fork commits 421e5d9 and 71639c0 into the floor, so it ships with the
governance feature that needs it and always wins over any tier rule.

Tests: 6 original sandbox tests preserved, 9 sensitive-path tests ported,
6 new resolve_with_policy tier tests (None==legacy, disabled==legacy,
default-deny, read-blocks-write, longest-prefix, prompt-fail-closed).
21 pass.
Wires the path-tier governance into the tool layer:

  - execute_tool gains a file_policy: Option<&FilePolicy> param, a sibling
    of exec_policy.
  - The five filesystem verbs (file_read / file_write / file_list /
    create_directory / apply_patch) resolve paths through
    resolve_with_policy, passing needs_write so the read tier blocks
    mutating verbs. create_directory also gets sensitive-path floor parity
    (its bespoke resolver never ran sandbox_floor).
  - Prompt tier (D1-a / Q1=B): a path-aware pre-pass in execute_tool's top
    block reuses the existing RightNow-AI#1182 request_approval primitive for
    single-path fs tools. Approve => proceed; deny / timeout / no-kernel =>
    fail closed. apply_patch is multi-path and fails closed on prompt tier
    for v1.
  - Threaded through both agent_loop call sites (effective_file_policy =
    manifest.file_policy.as_ref()); the HTTP /mcp route and all 15 test
    stubs pass None (the /mcp surface graduates from None with the fork's
    AgentExecContext leaf).

Scope notes: two auxiliary file-reading tools (channel media attachment,
speech_to_text) keep the legacy clamp (None) — governing non-file_* tools
is a follow-up. No subprocess/bridge awareness here; §6 inertness lands as
a fork leaf.

Builds clean; clippy -D warnings clean; openfang-runtime 1008 / types 386
tests pass.
Two-layer floor in is_sensitive_home_path: a curated hard-deny set that
ignores any allow rule (.ssh, .aws, .gnupg, .kube, .netrc,
.git-credentials, .npmrc, .pypirc, .pgpass) plus component-sequence-aware
two-deep entries (.config/gcloud, .config/gh, .docker/config.json,
.cargo/credentials). Other $HOME dotfiles fall through to Layer-2 prompt.

Closes the credential-exposure path (security finding F1) where a broad
allow rule could reach SSH/AWS/GPG/kube/.netrc secrets. Adds
test_sensitive_home_credential_floor.
… [F2]

The global [file_policy] was dead config: the runtime only ever read
manifest.file_policy, which stays None unless an agent.toml sets it, so a
global deny silently no-opped. This wires global -> manifest at the same
four sites exec_policy uses (spawn, restart re-resolve, persist-strip,
merge-disk), but with hard-floor (narrow-only) semantics rather than
exec_policy's wholesale replace: a per-agent override may only narrow
access, never widen past the global floor.

Mechanism:
- FilePolicy gains a transient #[serde(skip)] `floor` field, re-derived
  from current config on every spawn/restart (relative-path tiers resolve
  at match time, so the floor must be evaluated then, not merged at config
  time). tier_for now returns the more-restrictive of own vs floor tier;
  a disabled override defers entirely to its floor.
- FileAccessTier::floor() picks the more restrictive of two tiers
  (Write < Read < Prompt < Deny).
- FilePolicy::resolve_under_floor(global, override) is the single resolver
  called at all four kernel sites.
- resolve_with_policy gates on is_active() (enabled || enabled floor) so a
  disabled per-agent override cannot escape an enabled global floor.

Adds 6 unit tests covering inherit-wholesale, cannot-widen, may-narrow,
disabled-override-cannot-escape, disabled-global-no-floor, and the tier
floor helper.
apply_patch resolved and wrote each op inline, so a policy-denied (or
out-of-workspace) target later in a multi-op patch could land after
earlier ops had already written. Add a pre-pass that resolves every
target (AddFile / UpdateFile path + move_to / DeleteFile) against
workspace confinement and file_policy before any write; if any target is
rejected, the whole patch is refused with zero writes.

Scope note (per security review): this is a policy pre-check, not
filesystem-transactional rollback. A mid-apply I/O failure can still
leave partial writes — std offers no clean rollback — but no write occurs
past a policy-denied target.

Adds test_apply_patch_f3_denied_target_blocks_all_writes.
create_directory only consulted file_policy inside
`if let (Some(fp), Some(root)) = (file_policy, workspace_root) { if fp.enabled`,
so it fell open in two ways: (1) with no workspace root the policy was
skipped entirely and the directory was created with only the sensitive-
path floor; (2) gating on `fp.enabled` meant a disabled per-agent override
beneath an enabled global floor (F2) bypassed the check.

Now gates on `is_active()` (enabled || enabled floor) and refuses
creation when an active policy has no workspace root to evaluate against.

Adds test_create_directory_f4_denied_tier_blocks and
test_create_directory_f4_active_policy_no_root_fails_closed.
The prompt-tier pre-pass canonicalized the target to make the approval
decision, then each fs tool independently re-canonicalized it for the
actual I/O. A symlink swapped during the (human) approval window could
make the written/read target differ from the one that was approved.

execute_tool now resolves the canonical target once and threads it into
the single-path fs tools (file_read / file_write / file_list /
create_directory); each asserts its own resolution equals the
pre-validated path via workspace_sandbox::assert_prevalidated and fails
closed on mismatch. The pre-pass also now gates on is_active() (F2) so a
disabled per-agent override beneath an enabled global floor still
participates.

Scope (per security review): this closes the approval-window TOCTOU. A
narrow residual remains between the final resolution and the kernel
open() (the canonicalize->open window); eliminating it needs
openat/O_NOFOLLOW semantics std does not expose portably, documented as a
fast-follow rather than chased here.

Adds test_assert_prevalidated_toctou_guard.
…ller

Integration seam between feat/file-policy-core and feat/mcp-bridge-upstream-forwarding. The bridge leaf added a brand-new execute_tool caller (bridge_ipc.rs) that the file-policy leaf never saw, so rebasing the file-policy stack onto the bridge leaf left that caller one positional arg short of the widened signature.

Passes None for the new file_policy parameter, mirroring routes.rs and the HTTP /mcp path. FilePolicy enforcement is wired only on the kernel and agent-loop path for v1. The bridge IPC and /mcp surfaces remain uncovered, with a follow-up issue to extend FilePolicy to both MCP surfaces.

Lives only on integration/file-policy-on-bridge. feat/file-policy-core stays pristine on upstream/main for the upstream PR.
resolver_audit.rs drops two needless path borrows (already a Path ref) flagged by clippy needless_borrows_for_generic_args at the audit-file open and set_permissions sites. tool_runner.rs gets a rustfmt sweep collapsing the synth_attach assert_eq and a trailing blank line in tests. No behavior change. Closes the -D warnings and fmt --check gap before the upstream PR.
The const's only use site is a #[cfg(debug_assertions)] debug! call, so release builds (debug_assertions off) compiled the use away and flagged the const as dead_code. Gate the const to match its use site: defined+used in debug, absent in release. No behavior change (the release audit sink is resolver_audit::record_resolution, untouched). ANAI-55.
bridge_ipc dispatch_call hardcoded file_policy=None, so every file op
arriving over the MCP bridge fell into the legacy workspace-clamp arm
and skipped tier evaluation entirely. This silently downgraded any
agent whose policy was *tighter* than the workspace clamp (e.g.
default_tier=deny): over the bridge it got the broader workspace view.

Pass entry.manifest.file_policy.as_ref() -- the floor-resolved registry
manifest, same source effective_exec_policy already uses two lines up,
matching agent_loop's semantics. None when no [file_policy] is declared
(legacy clamp preserved), Some only when the operator wrote a policy.

sandbox_floor() already runs unconditionally before the tier match, so
the P0 credential floor was never affected -- only the configurable tier
layer was bypassed. Supersedes the deliberate None stub in 41460a8.
FileRule.path with a leading ~ or ~/ never expanded. own_tier_for joined
the non-absolute path onto the workspace root (Path::new("~/foo").is_absolute()
is false), so "~/work" became "<workspace>/~/work" and could never match an
absolute target. Under default_tier = deny this silently denied granted paths.

Normalize once at deserialize via a serde deserialize_with hook
(expand_tilde / expand_tilde_with), so own_tier_for is untouched and just
sees an absolute prefix. A "~user" form is left literal (no shell-style
lookup), and a missing home dir falls back to passthrough -- fail-safe,
never widens a grant.

Adds 8 tests including the regression tilde_rule_matches_target_after_expansion
(deserialize a ~/grant rule, assert Write on $HOME/grant/..., Deny on a sibling).
cargo test -p openfang-types --lib: 408 passed, 0 failed. clippy -D warnings clean.
…pprove [ANAI-81]

Pre-#2a, `resolve_approval_text` stamped `decided_by = "channel"` with no
identity check: any user on a bound channel could resolve another agent's
approval, and the audit trail recorded only the literal "channel". The push
notification (#2b) would have advertised that open door, so the authz fix
lands here in #2a, ahead of the buttons.

- auth: new `Action::ApproveExecution` (Admin floor) — a role-based proxy for
  binding-owner ownership until per-binding ownership lands as its own schema
  change.
- channel_bridge: identify + authorize the resolver before resolving; record
  `decided_by` as a stable `uid:name` (UUID-anchored audit, name as
  decoration); reject unrecognized and under-privileged users. RBAC-disabled
  is fail-open-but-recorded with a WARN, matching the bridge's
  allow-when-unconfigured posture (security-openfang verdict A).
- bridge: thread channel_type / user_id / display_name through handle_command
  and dispatch_message; update existing tests for the new param.
- channel echo shows the name only (no UserId UUID leak); the audit record
  keeps uid:name.
- tests: ApproveExecution admin-floor matrix + unrecognized-denied.

Follow-ups tracked: bridge-down explicit auto-deny (AC: no path lands an
auto-approve); per-binding ownership to replace the Admin-role proxy;
shared-vs-DM WARN discrimination; resolve-level integration tests + api-crate
kernel fixture.

Reviewed-by: security-openfang (M1/M2/M3 + WARN cleared to commit)
…ANAI-81]

The kernel-layer authorize() matrix (Admin floor, unrecognized-denied) was
already covered in openfang-kernel auth.rs. The bridge glue in
`resolve_approval_text` was not — that's the "resolve-level integration tests
+ api-crate kernel fixture" follow-up flagged in 238af8a.

Standing up a full OpenFangKernel in a unit test is impractical (no cheap test
constructor), so this extracts the #2a authz branching verbatim into a pure
`classify_approver(&AuthManager, channel_type, user_id, display)` helper and
tests that against a cheap AuthManager fixture. `resolve_approval_text` becomes
a thin caller — no behavior change; the gate logic and the RBAC-disabled
fail-open posture (security-openfang verdict A) are unchanged, just relocated
into a named, tested seam.

Tests (api crate, openfang-api):
- admin_floor_records_uid_name: Admin/Owner resolve; audit label is the stable
  `uid:name`; channel echo is name-only (asserts no UserId UUID leak).
- below_floor_denied: recognized user/viewer roles rejected ("Not authorized").
- unrecognized_denied: unknown channel identity rejected ("Unrecognized").
- rbac_disabled_fail_open_records_channel_identity: no users => no gate, but
  records `channel_type:user_id (display)` and echoes the display name.

No new deps. `cargo test -p openfang-api --lib` green (128 passed).
…s [ANAI-82]

Foundation for #2b (Discord interactive approvals). Adds the outbound
capability only — no emit site and no inbound handling yet, so existing
behavior is unchanged.

- types: add `ChannelContent::Interactive { text, buttons }`, the
  `InteractiveButton` / `ButtonStyle` types, and a `degrade_to_text` helper.
  `custom_id` is documented as opaque (request id + nonce) and explicitly
  never an authz carrier (capability-in-URL antipattern, per security review).
- types: add `ChannelAdapter::supports_interactive()` (default false) so the
  ~50 non-Discord adapters need no change; the dispatch path will degrade
  Interactive -> text on adapters that return false (wired in #2b commit 2).
- discord: render Interactive as a `components` action row (button type 2,
  style 1-4), `supports_interactive() = true`, single non-split message.
- bridge/telegram: exhaustiveness arms for the new variant; inbound sites
  treat it as its text body (outbound-only — never expected inbound),
  Telegram degrades to the self-sufficient text body.

Tests: Discord style mapping (API v10), action-row shape, 5-button chunking,
degrade_to_text. `cargo test -p openfang-channels --lib anai82` -> 4 passed.
Workspace check + clippy clean.
…AI-82]

Extract the adapter.send/send_in_thread chokepoint in send_parsed into a
single dispatch_content() helper that degrades ChannelContent::Interactive
to its text body when the target adapter does not support_interactive.
Keeps the ~50 non-Discord adapters from ever receiving a variant they
cannot render, and gives the Piece 2 approval-prompt emit path a safe sink
on every channel.

No behavior change on existing paths: send_parsed only ever builds Text or
Multipart today. Adds RecordingAdapter plus tests covering both branches of
the degrade guard (supported passes through, unsupported collapses to text).

Piece 1 of ANAI-82, decision-independent, no new outbound surface.
ANAI-82 / 2b Piece 2 Option Y -- step 1 of 5 plumbing.

Introduce ApprovalOrigin in openfang-types to carry the live origin of a
triggering run, so a downstream approval prompt can be pushed back to the
exact channel/conversation it came from. Add an optional origin field on
ApprovalRequest, serde-default so persisted/in-flight requests and existing
constructors stay backward-compatible. Validate bounds the channel_id,
thread_id and recipient lengths.

The recipient field is audit/targeting only and must never be used as an
authz carrier -- the clicker is re-authorized from the platform-attested
interaction identity per 2a.

Behavior-preserving: nothing reads origin yet. The three live struct-literal
sites set origin: None to keep the workspace compiling.

Tests: ApprovalOrigin serde round-trip, request with origin round-trip,
legacy JSON without origin deserializes to None, over-long channel_id
rejected, max-length fields accepted. cargo test -p openfang-types -- 413 passed.
ANAI-82 / 2b Piece 2 Option Y -- step 2 of 5 plumbing.

Add send_message_with_origin and send_message_with_blocks_and_origin to the
ChannelBridgeHandle trait, default-implemented to ignore origin and delegate
to the non-origin variants. This keeps the ~50 adapters and every existing
caller unchanged until the real kernel impl in openfang-api overrides the
method to thread origin into the run.

Behavior-preserving: only the path that overrides the default changes, and
nothing overrides it yet.

Tests: default method delegates to send_message unchanged.
ANAI-82 / 2b Piece 2 Option Y -- folds in security-openfang LOW from the
delta review of 5dd78b3..d3f7387.

validate() bounded channel_id / thread_id / recipient but not channel_type,
which is the same adapter-supplied-string surface. Add it to the same
256-char bound for a uniform contract -- defense-in-depth; no current adapter
can set it arbitrarily.

Test: over-long channel_type rejected.
…AI-82]

Option-gated signature threading so a future emit site (Piece 2) can read
where an approval-gated run originated. Behavior-preserving: origin is None
at every source caller and nothing reads it yet — the user-visible change
lands later behind security sign-off (step 5), same discipline as #2a.

Chain (all Option<&ApprovalOrigin>):
  run_agent_loop{,_streaming}  -> execute_tool  -> KernelHandle::request_approval
  -> OpenFangKernel impl sets ApprovalRequest.origin = origin.cloned()

- openfang-runtime: kernel_handle trait default + execute_tool signature;
  shell approval gate forwards real origin, file_policy prompt-tier passes
  None (local FS, no channel push); both run_agent_loop variants forward to
  their execute_tool call sites.
- openfang-kernel: real request_approval impl populates origin; both loop
  launch sites pass None for now.
- openfang-api: bridge IPC + MCP-HTTP tool paths pass None (no channel origin).

Clippy debt deliberately deferred: execute_tool is at 20 args
(#[allow(too_many_arguments)]); a follow-up folds the approval params into
ApprovalCtx<'_> as a separate refactor PR, per scope-B-Y-plumbing.md.

Verify:
  cargo check --workspace: green (13 crates)
  cargo test --workspace --no-run: all test targets compile
  cargo test -p openfang-runtime --lib: 1099 passed, 0 failed
  cargo test -p openfang-kernel --lib approval: 12 passed, 0 failed
…Y step 4)

§8 step 4 of the approval-surfacing Y plumbing: the bridge now stops
dropping the origin it already computes and carries it into the run.

- Add approval_origin_for(message, thread_id): builds ApprovalOrigin from
  the same single-source-of-truth as routing (binding_context_for) plus the
  resolved (possibly auto-created) thread. recipient = peer_id, audit/target
  only — never an authz carrier.
- Text path (dispatch_message): send_message -> send_message_with_origin,
  including the re-resolution retry.
- Blocks path (dispatch_with_blocks): send_message_with_blocks ->
  send_message_with_blocks_and_origin, including the retry.

Behavior-preserving: nothing reads ApprovalRequest.origin until the emit
site (step 5), which stays held for security-openfang. Every non-channel
caller (cron/API/agent_send) still passes None and is byte-for-byte
unchanged.

cargo check -p openfang-channels: green.
cargo test -p openfang-channels --lib: 548 passed.
…-82]

Make ApprovalManager an additive lifecycle event source so an out-of-band
surfacer (the channel bridge) can push approval prompts and post follow-ups.
This is the substrate half of §8 step 5; the live emit *consumer* (which reads
ApprovalRequest.origin to resolve a push destination) is held for
security-openfang's delta review and is NOT included here.

- New `ApprovalEvent` enum: Submitted / Resolved / TimedOut, each carrying the
  ApprovalRequest (origin included; audit/targeting only, never an authz carrier).
- `events: broadcast::Sender<ApprovalEvent>` on ApprovalManager + `subscribe()`.
- Emit on submit, resolve, and timeout.

Behavior-preserving: delivery is lossy and requires no subscribers. With no
listener, every send is a silent no-op and the approve/deny/timeout decision
path is byte-for-byte unchanged. Nothing reads `origin` yet — that line is the
security-gated emit consumer, staged next.

Tests (3 new, 15/15 approval-module green):
- submitted+resolved ordering and payload round-trip
- timed-out event after expiry (start_paused virtual time)
- origin carried verbatim on Submitted

cargo check --workspace green (all 13 crates).
…-82]

§8 step 5, the live half. This is the first and only place that reads
`ApprovalRequest.origin` to take an action: a bridge-side subscriber to the
ApprovalManager event stream (ca55242 substrate) that pushes a prompt back to
the channel/conversation the triggering run came from.

Design — pure side-effect, fail-closed:
- New `approval_push_target(req) -> Option<ApprovalPushTarget>` (pure, unit
  tested): returns None unless origin carries BOTH a non-empty channel_type
  and a concrete non-empty channel_id. None ⇒ no proactive push; the
  self-sufficient text `/approve <id>` body that already accompanied the
  request stands. Never best-efforts to a broader/ambiguous channel.
- `recipient` (the triggering peer_id) is deliberately absent from
  ApprovalPushTarget — audit/targeting only, never an authz or routing input.
  Structurally enforced: the target type has no recipient field.
- `OpenFangKernel::surface_approval_prompt` builds the prompt (still carrying
  `/approve <id>` so it degrades on non-interactive adapters) and pushes via
  the existing fail-closed `send_channel_message` (adapter-not-found and
  recipient-unresolved both Err → logged + swallowed). It NEVER resolves the
  request: approval stays on the text `/approve` and button paths, which
  re-authorize from the platform-attested interaction identity. Nothing here
  reaches `classify_approver`; Verdict A (RBAC-off) carries unchanged because
  resolution is untouched.
- Subscriber wired in `start_channel_bridge_with_config` after the bridge is
  up, and only when a bridge actually started. Lossy broadcast handled
  explicitly: RecvError::Lagged(n) logs a WARN and continues — a dropped/lagged
  event means "no proactive prompt", never "no approval"; the decision path
  (await/timeout in request_approval) is wholly independent. Closed ⇒ task
  exits. Task tied to BridgeManager shutdown (new `shutdown_receiver()` +
  `attach_task()`) so it is torn down on stop()/reload rather than leaking.

Tests (4 new, fail-closed decision):
- no target when origin absent
- no target when channel_type empty
- no target when channel_id missing or empty
- target carries conversation + thread but NOT recipient

Verified locally: cargo check --workspace (13 crates) ok; clippy
-p kernel,channels,api -D warnings clean; kernel --lib approval 19/19;
channels --lib 548/548. Additive only (+228, 0 deletions) — held for
security-openfang delta review before merge.
Closes the MEDIUM from security-openfang's emit-consumer review: the prompt
pushed by `surface_approval_prompt` is composed from agent-controlled fields
(agent_id, tool_name, action_summary) and sent via `send_channel_message` →
`outbound_attach::parse`. The parser strips `<openfang:attach …/>` markers and
appends their `caption=` values to the displayed text — so a requesting agent
could shape what the human approver sees (e.g. a reassuring "(dry-run only)"
caption) and make the rendered prompt diverge from the real action. That breaks
the informed-consent property the whole approval gate rests on.

(File exfiltration was already blocked: the surfacer passes workspace_root:
None → empty allow_roots → every path fails validation, with the ~/.openfang
hard-deny as backstop. This fix is about prompt integrity, not exfil.)

Fix (option 1, security's preferred):
- New `outbound_attach::neutralize_markers(text)` — HTML-escapes the `<` of any
  `<openfang:attach` opener (the transform this module's own parse doc notes
  "break[s] detection"), case-insensitive for defense in depth. The marker is
  no longer parsed but stays visible and faithful to the human. Shared so the
  pre-existing `/approvals` listing path can adopt it too.
- `surface_approval_prompt` runs agent_id / tool_name / action_summary through
  it before composing the prompt. Fields are sanitized independently, so cross-
  field concatenation cannot reform a marker either.

Tests (2 new): neutralize_breaks_marker_detection (well-formed marker +
caption → parse sees NoMarkers, nothing stripped/appended, opener escaped),
neutralize_is_noop_without_marker_and_case_insensitive.

FOLLOW-UP (not in this commit): the `/approvals` text listing
(list_approvals_text) routes through the same send_parsed path and likely has
the same exposure. Tracked separately; this commit covers the new push surface.

Verified: clippy -p kernel,channels,api -D warnings clean; outbound_attach
24/24; my lines fmt-clean. Held for security-openfang re-read before merge.
Folds in the live twin security-openfang traced during the ad01939 re-read.
`list_approvals_text` built the queue listing from raw agent-controlled fields
(agent_id, tool_name, action_summary) and returns through send_parsed →
outbound_attach::parse — the same sink as the proactive prompt. A pending
request carrying an `<openfang:attach …/>` marker in its summary would be
stripped-and-caption-promoted at the exact moment an operator runs /approvals
to decide. Same integrity hole, pull instead of push.

- Extracted the composition into a pure `render_pending_approvals(&[..])` (no
  kernel/IO) so the invariant is unit-testable, mirroring the kernel-side
  approval_push_target/surface split.
- Runs the three agent-controlled fields through the shared
  outbound_attach::neutralize_markers (landed in ad01939). id_short / risk_level
  / age are system-controlled and left as-is.
- list_approvals_text is now a one-line delegate.

Test (1 new): approvals_listing_neutralizes_attach_markers — a queued request
with a marker+caption in action_summary → render → parse asserts NoMarkers and
the opener stays escaped (verbatim, not stripped/promoted).

Both attach-marker surfaces (proactive push + pull listing) are now closed by
the one shared neutralizer.

Verified: clippy -p openfang-api -D warnings clean; the new test green; my
lines fmt-clean. Held for security-openfang's mechanical re-read before merge.
…art batching

Squashed reroll of topic/discord-outbound-attachments onto v0.6.9 (acf2587).
Replaces the 16-commit history that landed via local-main merge 2c32102.

Reroll context:
  - mime/size on ChannelContent::File and file:// in download_image_to_blocks
    already landed upstream via RightNow-AI#1143; dropped from this branch.
  - source_url-on-ContentBlock::Image is provided by feat/runtime-image-cache
    (PR RightNow-AI#1151); the three compactor source_url tests live there.
  - Net contribution here is the outbound Multipart feature plus its deps.

Feature scope:
  - Outbound for ChannelContent::File and Image URL arms (per-block resolver
    dispatch in mixed Multipart payloads).
  - N-attachment batching with greedy-pack chunker and aggregate per-chunk
    byte cap; parallel attachment fetch via try_join_all.
  - SSRF guard + redirect revalidation + log scrubbing for URL fetch
    (Fetcher trait abstraction).
  - Body-aware 429 retry on the multi-file path; structured partial-send
    WARN with grep-friendly event.
  - Transparent decompression disabled on image download (bridge.rs).

Verification:
  - cargo check --workspace: clean
  - cargo test -p openfang-channels -p openfang-runtime: 1002/1002

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@benhoverter benhoverter force-pushed the topic/discord-outbound-attachments branch from 5a2d39f to 135d07f Compare June 21, 2026 07:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant