
Tool-policy enforcement

A tool can declare a ToolPolicy — the risk class plus filesystem, network, and environment sub-policies (see permission-manifest.md for the declaration DSL and the manifest it feeds). This page is about what the runtime does to enforce that declaration.

Enforcement is delivered in two layers:

| Layer | What it protects | Applies to | Status |
|---|---|---|---|
| Layer 1: in-JVM policy gate | Filesystem-path arguments | every tool (in-process lambdas included) | shipped (#2890) |
| Layer 2: OS sandbox | The process itself (fs + network + env) | subprocess-shaped tools | macOS Seatbelt + Linux bwrap (#2906 / #2892); firejail fallback, Wasm/Docker planned (#1916) |

Layer 1 — the in-JVM filesystem policy gate

When a tool declares a filesystem stance, the framework checks every pending tool call against it before the executor runs. No hand-written interceptor is needed.

val uploads = tool("writeReport") {
    description("Write a report file")
    policy {
        risk = ToolRisk.Medium
        filesystem { write("/srv/uploads/**") }   // declared write surface
    }
    executor { args ->
        File(args["path"].toString()).writeText(render(args))
        "ok"
    }
}

With this declaration:

  • writeReport(path = "/srv/uploads/2026/r.txt") runs (inside the declared glob).
  • writeReport(path = "/etc/passwd") is denied before the executor runs. The denial surfaces exactly like any other blocked call:
    • onToolDenied { name, args, reason -> … }
    • PipelineEvent.ToolDenied (carries toolPolicyRisk and usedDeclaredCapability)
    • the executed-call hooks (onToolUse / PipelineEvent.ToolCalled) do not fire for it.

Exactly what the gate checks

  1. Opt-in by declaration. If a tool's filesystem stance is Unspecified for both read and write (i.e. it declared no filesystem policy), the gate does nothing. Existing tools are unaffected — enforcement only engages once you declare a stance.
  2. Absolute path arguments only. Each string-valued argument that is an absolute path is a candidate. The allowed set is the union of the declared read + write globs; a candidate that matches none of them denies the call.
  3. Normalization. Candidate paths are normalized before matching, so /srv/uploads/../../etc/passwd resolves to /etc/passwd and is denied — .. traversal cannot escape a declared glob.
  4. None means none. A filesystem { writeNone() } (or readNone()) stance is a declared stance with an empty allow-set, so any absolute path argument is denied.
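
The four rules above can be sketched as one pure function. This is a minimal illustration under stated assumptions, not the framework's implementation: `Stance`, `GateDecision`, and `gateFilesystemArgs` are hypothetical names, glob matching is delegated to the JDK's `PathMatcher`, and POSIX-style paths are assumed.

```kotlin
import java.nio.file.FileSystems
import java.nio.file.InvalidPathException
import java.nio.file.Paths

// Hypothetical stand-ins for a declared stance; not the framework's real types.
sealed interface Stance {
    object Unspecified : Stance                          // nothing declared
    data class Allow(val globs: List<String>) : Stance   // empty list models readNone()/writeNone()
}

sealed interface GateDecision {
    object Proceed : GateDecision
    data class Deny(val reason: String) : GateDecision
}

fun gateFilesystemArgs(read: Stance, write: Stance, args: Map<String, Any?>): GateDecision {
    // Rule 1: no declared stance on either side means the gate does nothing.
    if (read is Stance.Unspecified && write is Stance.Unspecified) return GateDecision.Proceed

    // Rule 2 (allow-set): the union of the declared read and write globs.
    val fs = FileSystems.getDefault()
    val allowed = listOf(read, write)
        .filterIsInstance<Stance.Allow>()
        .flatMap { it.globs }
        .map { fs.getPathMatcher("glob:$it") }

    for ((name, value) in args) {
        if (value !is String) continue                   // only string-valued arguments
        val path = try { Paths.get(value) } catch (e: InvalidPathException) { continue }
        if (!path.isAbsolute) continue                   // rule 2: absolute paths only
        val normalized = path.normalize()                // rule 3: collapse ".." before matching
        // Rule 4 falls out naturally: an empty allow-set matches nothing.
        if (allowed.none { it.matches(normalized) }) {
            return GateDecision.Deny("argument '$name' resolves to $normalized, outside the declared globs")
        }
    }
    return GateDecision.Proceed
}
```

In this sketch, `gateFilesystemArgs(Stance.Unspecified, Stance.Allow(listOf("/srv/uploads/**")), mapOf("path" to "/srv/uploads/../../etc/passwd"))` returns a `Deny`, because the candidate is normalized to `/etc/passwd` before it is matched.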

Turning it off

Enforcement is on by default. To restore the 0.6.0 declare-only (inert) behavior — for example if you prefer to enforce with your own onBeforeToolCall interceptor:

agent("myAgent") {
    enforceToolPolicies = false
    // … your own onBeforeToolCall { … } enforcement, if any
}

The built-in gate runs before user onBeforeToolCall interceptors and short-circuits on denial (matching the "first non-Proceed wins" chain semantics). An executed call still flows through your interceptors.
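
The ordering described here can be pictured as a plain loop over interceptors. The sketch below is an assumption-laden model, not the framework's code: `ChainVerdict`, `ToolInterceptor`, and `runInterceptorChain` are hypothetical names chosen for illustration.

```kotlin
// Hypothetical verdict type; the real chain may carry more outcomes than these two.
sealed interface ChainVerdict {
    object Proceed : ChainVerdict
    data class Deny(val reason: String) : ChainVerdict
}

typealias ToolInterceptor = (toolName: String, args: Map<String, Any?>) -> ChainVerdict

fun runInterceptorChain(
    builtInGate: ToolInterceptor,
    userInterceptors: List<ToolInterceptor>,
    toolName: String,
    args: Map<String, Any?>,
): ChainVerdict {
    // The built-in gate goes first; the first non-Proceed verdict ends the chain.
    for (interceptor in listOf(builtInGate) + userInterceptors) {
        val verdict = interceptor(toolName, args)
        if (verdict !is ChainVerdict.Proceed) return verdict
    }
    return ChainVerdict.Proceed
}
```

Under this model, a user interceptor never sees a call the gate has already denied, while a call the gate lets through reaches every user interceptor in order.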

Deliberate Layer-1 limitations

Layer 1 inspects arguments, not the running process. Two things it intentionally does not do — both covered by the Layer-2 OS sandbox:

  • Relative paths are not gated. The JVM has no reliable, side-effect-free way to bind a lambda's working directory, and treating every slash-bearing string as a path would false-deny ordinary content. Only absolute paths are checked. Pass absolute paths, or run the tool under the Layer-2 sandbox, when you need relative-path coverage.
  • network and environment are not enforced in-JVM. A plain in-process Kotlin lambda can open a socket or read an environment variable with no interception point (modern JDKs have no SecurityManager). Declaring network { denyAll() } documents intent and feeds the manifest, but the actual block requires the Layer-2 OS sandbox.

These boundaries are the reason for Layer 2: real process/network/env isolation for tools whose executor shells out to a subprocess. See the roadmap entry for #1916.
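
Because only absolute paths are gated, a caller that accepts relative input can resolve it against a known base directory before the tool call is built. A minimal sketch of that idea follows; `toAbsoluteArg` is a hypothetical helper, not part of the library, and POSIX-style paths are assumed.

```kotlin
import java.nio.file.Path
import java.nio.file.Paths

// Hypothetical helper: turn a possibly-relative argument into a normalized absolute path,
// so that the Layer-1 gate has something it can check.
fun toAbsoluteArg(baseDir: Path, raw: String): String {
    require(baseDir.isAbsolute) { "baseDir must be absolute: $baseDir" }
    // resolve() keeps an already-absolute `raw` as-is; normalize() collapses "." and "..".
    return baseDir.resolve(raw).normalize().toString()
}
```

For example, `toAbsoluteArg(Paths.get("/srv/uploads"), "../../etc/passwd")` yields `/etc/passwd`, which a gate declared with `write("/srv/uploads/**")` would then deny.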


Layer 2 — OS sandbox (macOS Seatbelt + Linux bwrap)

Layer 2 isolates the process, not just the arguments — so it holds even for paths a tool constructs itself. The first slice (#2906) is macOS write-confinement via Seatbelt:

val sandbox = ProcessSandbox(sandboxedFolder)        // folder is canonicalized (toRealPath)
val result = sandbox.run(listOf("/bin/sh", "-c", "")) // runs under sandbox-exec
// writes outside sandboxedFolder are blocked by the kernel; result.ok == false

agents_engine.sandbox.ProcessSandbox generates a Seatbelt profile that denies by default and allows file writes only under one canonical folder (reads + process exec stay allowed so the command can load). The convenience sandboxedEchoToFileTool(folder) is the simplest end-to-end example — a tool that echoes text into a path, OS-confined to folder.
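
To make "denies by default and allows file writes only under one canonical folder" concrete, here is a rough sketch of what such a profile could look like. It is not the profile ProcessSandbox actually generates: `sketchSeatbeltProfile` is a hypothetical function, and a working profile will likely need further allow rules beyond the ones shown.

```kotlin
// Hypothetical sketch of a deny-by-default Seatbelt (SBPL) profile.
// `canonicalFolder` is assumed to be already canonicalized, e.g. via Path.toRealPath().
fun sketchSeatbeltProfile(canonicalFolder: String): String {
    // Escape backslashes and quotes so the path stays a single SBPL string literal.
    val quoted = canonicalFolder.replace("\\", "\\\\").replace("\"", "\\\"")
    return """
        (version 1)
        (deny default)
        (allow process-exec*)
        (allow process-fork)
        (allow file-read*)
        (allow file-write* (subpath "$quoted"))
    """.trimIndent()
}
```

The only write rule is the `subpath` filter on the sandboxed folder, so a write anywhere else falls through to `(deny default)`.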

The ergonomic way to build a sandboxed subprocess tool is processTool, which auto-applies ProcessSandbox.forPolicy from the tool's declared policy — you never wire the sandbox by hand:

val grep = processTool("grep", policy = toolPolicy {
    risk = ToolRisk.Medium
    filesystem { read("/data/**"); write("/out/**") }
    network { denyAll() }
}) { args ->
    listOf("rg", args["pattern"].toString(), "/data")   // the command to run
}
// writes confined to /out, network blocked, stdout returned; ERROR (no run) if the
// platform has no OS sandbox (fail-closed).

Caveats / status:

  • Three backends, picked by OS (#2892): macOS Seatbelt (sandbox-exec); Linux bubblewrap (bwrap), which binds the root fs read-only, re-binds the write roots read-write, and passes --unshare-net unless network is opened; and Linux firejail, the setuid fallback (--read-only=/ + --read-write carve-outs + --net=none), which still confines on hosts where unprivileged user namespaces are restricted and bwrap can't start. isSupported() is true when any backend is present. On a host with none, run does not throw by default: it runs the command via a plain ProcessBuilder and prints a loud UNCONFINED warning (isSupported() is false, so a caller that requires enforcement can detect this and refuse). run(command, requireSandbox = true) (#4497) refuses instead: it throws IllegalStateException and the subprocess never starts, which brings the fail-closed stance to the low-level API. Wasm (#2894) and Docker (#2895) are follow-ups.
  • Write / network / env confinement, derived from policy. ProcessSandbox.forPolicy(policy) is the bridge from Layer-1 declaration to Layer-2 enforcement: write roots come from the filesystem.write globs (one or many); network is default-deny and opens only for network = AllowAll; and the child environment is derived from environment { }: allow("HOME") passes only the named vars, denyAll() gives an empty env, and an unspecified stance inherits the parent env. forWritableRoots(roots, env, workingDir) sets roots, env, and working directory directly. Reads stay broad, and selective allow-listing via network { allow(host) } needs the proxy (#2893; default-deny already ships). Read-confinement, the grants { } structure DSL, and the process { } DSL are the remaining #2891 work.
  • macOS's /tmp is a symlink to /private/tmp, and Seatbelt matches the canonical path — ProcessSandbox resolves the folder with toRealPath() before building the profile.
  • OS-gated integration tests are tagged mac_os_only / linux_only (+ @EnabledOnOs). The pure arg/profile generation is unit-tested on every platform; the kernel-level Linux tests auto-skip on macOS and run on CI's native Ubuntu runner. See testing.md.
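
The bwrap flags and the environment rules from the caveats above can be sketched as two pure functions. This is an illustration only: `bwrapArgv`, `EnvStance`, and `childEnv` are hypothetical names, the write roots are taken as given rather than derived from globs, and a real invocation will usually need more flags (for example device and proc mounts).

```kotlin
// Hypothetical sketch: build a bubblewrap command line from already-resolved write roots.
fun bwrapArgv(writeRoots: List<String>, networkOpen: Boolean, command: List<String>): List<String> =
    buildList {
        add("bwrap")
        addAll(listOf("--ro-bind", "/", "/"))                          // root fs, read-only
        for (root in writeRoots) addAll(listOf("--bind", root, root))  // write roots, read-write
        if (!networkOpen) add("--unshare-net")                         // default-deny network
        addAll(command)
    }

// Hypothetical model of the environment stance described above.
sealed interface EnvStance {
    object Unspecified : EnvStance
    object DenyAll : EnvStance
    data class Allow(val names: Set<String>) : EnvStance
}

fun childEnv(stance: EnvStance, parent: Map<String, String>): Map<String, String> = when (stance) {
    EnvStance.Unspecified -> parent                      // unspecified inherits
    EnvStance.DenyAll -> emptyMap()                      // denyAll() gives an empty env
    is EnvStance.Allow -> parent.filterKeys { it in stance.names }  // only the allowed vars
}
```

Keeping both functions pure mirrors the testing note above: argument and profile generation can be unit-tested on any platform, while the kernel-level behavior needs an OS-gated test.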

Relationship to the permission manifest

Declaration and enforcement are two sides of the same ToolPolicy:

  • the manifest (permission-manifest.md) is the build-time, reviewable view of what every tool may touch;
  • Layer 1 makes the filesystem part of that declaration bite at runtime;
  • Layer 2 (#1916, shipped 0.7.0) extends enforcement to the subprocess boundary for processTool-based tools; the remaining work (read confinement, the hostname-allowlist egress proxy (#2893), Docker/Wasm backends (#2894/#2895), and the grants { } DSL) is all slated for 0.8.

Declare-vs-do comparator + the exec capability (#2887)

ToolPolicy gains a fourth capability: exec — the declared subprocess stance (exec { allow() } / exec { deny() }; absent sections in legacy manifests parse as unspecified). The manifest verifier treats unspecified/deny → allow as a widening (tool.exec.widened).
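
The widening rule can be stated as a one-line predicate. The sketch below assumes a three-valued stance and uses hypothetical names (`ExecStance`, `isExecWidening`); it is not the manifest verifier's actual code.

```kotlin
// Hypothetical three-valued exec stance, mirroring allow / deny / absent.
enum class ExecStance { UNSPECIFIED, DENY, ALLOW }

// Moving from unspecified or deny to allow counts as a widening; nothing else does here.
fun isExecWidening(before: ExecStance, after: ExecStance): Boolean =
    before != ExecStance.ALLOW && after == ExecStance.ALLOW
```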

The static half is ToolPolicyCapabilityComparator in agents-kt-detekt: for a tool { policy { … }; executor { … } } declaration, the executor body's extracted capabilities (ToolCapabilityExtractor: FS_READ / FS_WRITE / NETWORK / ENVIRONMENT / EXEC) must be a subset of what the policy grants — using more than you declared fails the build with a widen-or-remove hint; declaring more than you use passes (over-declaration is a manifest-review concern, not a violation). Tools without a policy { } block are out of scope here (that's ToolBodyForbiddenApis' territory). Same honest limits as the extractor: syntactic and callee-name based — reflection and aliasing are invisible; the Layer-2 sandbox covers the residue.
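
At its core the comparator's rule is a set difference: whatever the executor uses but the policy does not grant is a violation. A minimal sketch, assuming the capability names listed above and a hypothetical `undeclaredCapabilities` helper (the real detekt rule works on the Kotlin syntax tree, which is not modeled here):

```kotlin
// Capability names as listed for ToolCapabilityExtractor; the enum itself is a stand-in.
enum class Capability { FS_READ, FS_WRITE, NETWORK, ENVIRONMENT, EXEC }

// Returns the capabilities the executor uses but the policy does not grant.
// An empty result means "used is a subset of granted", so the check passes.
fun undeclaredCapabilities(used: Set<Capability>, granted: Set<Capability>): Set<Capability> =
    used - granted
```

Over-declaration produces an empty result in this sketch, which matches the rule that declaring more than you use is not a violation.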

High-level vs low-level sandbox API. processTool(name, policy) { … } is the fail-closed path: no OS sandbox backend → the tool refuses to run. Raw ProcessSandbox.run is the low-level primitive — by default it falls back to a plain ProcessBuilder with a loud UNCONFINED warning; pass requireSandbox = true to make it fail closed too (#4497). Anything dangerous belongs on processTool.
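
The difference between the default fallback and the fail-closed mode comes down to one branch. The following sketch models that decision only; `LaunchMode` and `chooseLaunchMode` are hypothetical names, and no process is actually started.

```kotlin
enum class LaunchMode { CONFINED, UNCONFINED }

// Hypothetical model of the low-level decision: confine when a backend exists,
// otherwise either refuse (requireSandbox = true) or fall back with a warning.
fun chooseLaunchMode(sandboxSupported: Boolean, requireSandbox: Boolean = false): LaunchMode {
    if (sandboxSupported) return LaunchMode.CONFINED
    // check() throws IllegalStateException, so the caller never reaches process start.
    check(!requireSandbox) { "no OS sandbox backend available; refusing to run unconfined" }
    System.err.println("WARNING: running UNCONFINED, no OS sandbox backend available")
    return LaunchMode.UNCONFINED
}
```

In this model `chooseLaunchMode(sandboxSupported = false, requireSandbox = true)` throws, which is the behavior processTool gives you without asking for it.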

Usage constraints — constraints { } (#4490)

ToolPolicy declares what a tool may touch; ToolConstraints declare when and how often it may run within one invocation:

tool("commit") {
    policy { filesystem { write("/repo/**") } }
    constraints {
        maxInvocations = 3        // per agent invocation
        onlyAfter("fetch")        // prerequisite tools must have completed first
        // forbidden()            // quarantine: visible to code, never dispatchable by the model
    }
    executor { args, env -> … }
}

Violations deny through the standard auditable path (onToolDenied / PipelineEvent.ToolDenied / JSONL) and the model sees the reason as the tool result, so it can self-correct instead of dying. Counts are per invocation — a fresh tracker per run, nothing leaks across calls of a shared agent. Constraints appear in the permission manifest under each tool's constraints key. Deferred from the PRD sketch: ForceAtStep (prescriptive sequencing) and RequiresApproval (already first-class via humanApproval / HumanGateRegistry).
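
A per-invocation tracker for these constraints might look like the sketch below. It is illustrative only: `ToolLimits` and `InvocationTracker` are hypothetical names, and it assumes that maxInvocations counts completed calls, which the text above does not specify.

```kotlin
// Hypothetical mirror of the constraints { } block.
data class ToolLimits(
    val maxInvocations: Int? = null,
    val onlyAfter: Set<String> = emptySet(),
    val forbidden: Boolean = false,
)

// One instance per agent invocation, so counts never leak across runs.
class InvocationTracker {
    private val completed = mutableMapOf<String, Int>()

    // Returns a denial reason, or null when the call may proceed.
    fun denialReason(tool: String, limits: ToolLimits): String? {
        if (limits.forbidden) return "'$tool' is forbidden and cannot be dispatched by the model"
        val missing = limits.onlyAfter.filter { (completed[it] ?: 0) == 0 }
        if (missing.isNotEmpty()) return "'$tool' requires $missing to complete first"
        val max = limits.maxInvocations
        if (max != null && (completed[tool] ?: 0) >= max) return "'$tool' reached its limit of $max calls"
        return null
    }

    fun recordCompleted(tool: String) {
        completed[tool] = (completed[tool] ?: 0) + 1
    }
}
```

Returning the reason as a string fits the behavior described above, where the model sees the reason as the tool result and can adjust its next call.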