Skip to content

Centralize plan mode policy#5246

Merged
SivanCola merged 17 commits into
esengine:main-v2from
lifu963:pr/plan-mode-policy
Jun 24, 2026
Merged

Centralize plan mode policy#5246
SivanCola merged 17 commits into
esengine:main-v2from
lifu963:pr/plan-mode-policy

Conversation

@lifu963

@lifu963 lifu963 commented Jun 24, 2026

Copy link
Copy Markdown
Contributor

背景

Plan mode 目前的规则分散在 control、agent、boot/config 和 economy source 连接逻辑中:提示文案、实际工具拦截、bash 安全判断、plan_mode_allowed_tools override、connect_tool_source 各自维护一部分语义。这会让模型看到的能力说明和实际 gate 发生漂移。

本轮工作不只是重构,也包含少量语义收紧和偏移修复。因此辛苦重点核查下:这些收紧是否符合项目作者对 plan mode 的设计意图。

做了什么

  • 新增 internal/planmode,集中维护 plan mode 的阶段 policy。
  • 让 model-facing marker、agent tool gate、bash plan-mode safety、plan_mode_allowed_tools、economy connect_tool_source 都消费同一套 policy。
  • 修复 marker 曾声称 task 可用但执行层会拒绝的漂移。
  • complete_step、写工具、安装/记忆变更、进程控制、writer-capable delegation 等归入 plan mode 阻断范围。
  • 收紧 plan_mode_allowed_tools:它现在只表示额外只读 custom/external tools,不能解锁 known blocked tools 或 unsafe bash。
  • 补充配置渲染、指南和迁移文档,说明 override 新语义。
  • 补充覆盖 marker、agent gate、bash 参数、economy source、配置 warning 的测试。

设计理解与需要作者核查的点

我对 plan mode 的理解是:它是“计划阶段”的粗粒度 gate,核心语义是用户批准前只能研究、询问和组织计划,不能通过工具产生写入、持久状态变更、安装能力、危险 shell、后台进程或 writer-capable 子代理。

基于这个理解,本 PR 做了两类收紧:

  • plan_mode_allowed_tools = ["bash"] 不再绕过 bash 安全检查;安全 bash 仍可运行,但 rmgit commitnpm install、管道/重定向/后台等仍被阻断。
  • complete_step 虽然在 permission 层是 read-only,但它属于批准后执行阶段的 evidence sign-off,所以 plan mode 下会被阻断。

这些改动能减少 plan mode 的逃生口,并让提示、执行 gate、economy mode 行为保持一致。但如果项目作者原本希望 plan_mode_allowed_tools 能作为非常强的 escape hatch,或认为 complete_step 在计划阶段也应可用,请指出,我会按项目意图调整。

后续可选方向

如果项目希望 plan mode 下仍能使用 task / subagent 做更大范围的研究,我认为更合理的后续方向不是重新放开当前 writer-capable task,而是单独设计“只读 subagent”能力:子 agent 继承 plan mode 阶段约束,只暴露 read-only research 工具,禁止 unsafe bash、后台进程、安装、记忆变更和 writer-capable delegation。这样可以保留 plan mode 的只读边界,同时让复杂计划仍能借助隔离上下文做更深入的探索。

这不是本 PR 的实现范围;我把它列出来,是希望维护者评估是否符合项目后续设计方向。

预期结果与收益

  • 修改 plan mode 规则时只需要改一个 policy authority。
  • 模型看到的 plan-mode 说明与实际工具 gate 一致。
  • unsafe bash 和 writer-capable delegation 不再能通过宽泛 override 绕过计划阶段边界。
  • economy mode 与 normal tool gate 对 plan mode 的判断保持一致。
  • 未来新增工具或 source 时,更容易通过集中测试发现遗漏。

Cache 影响说明

Cache-impact: low - plan-mode marker 仍作为 user turn 前缀注入,不进入 system prompt;plan mode toggle 仍不改变 provider-visible tool schema。
Cache-guard: go test -count=1 ./internal/agent -run TestPlanModeDoesNotMutateSystemOrTools;另已运行 go test ./...
System-prompt-review: self-reviewed; no stable system prompt change intended. The changed boot/config/token_profile files route policy/config behavior, and the model-facing plan-mode instruction remains user-turn context.

验证

  • go test -count=1 ./internal/planmode ./internal/agent ./internal/boot
  • go test ./...
  • git merge-tree --write-tree HEAD origin/main-v2 无冲突
  • PR diff 排除检查:未包含 AGENTS.mdarchitecture/context/.gitignore

Follow-up commit:只读 subagent 落地 + plan-mode gate 加固

上文“后续可选方向”提出的“只读 subagent”已落地为 read_only_task / read_only_skill;在此基础上,本轮 follow-up commit 对 plan-mode gate 做了一轮可靠性加固——把 gate 对不可信工具的失败方向从 fail-open 收敛到 fail-closed,并补上防漂移测试。

解决的问题

原 gate 在 plan mode 下盲信工具自报的 ReadOnly()(if call.ReadOnly { allow })。但 ReadOnly() 的语义是“可并行 / 无主机副作用”,不等于“计划阶段安全”:complete_step 正是 ReadOnly()==true 的伪只读工具,仅靠手动列入 knownBlockedTools 才被拦,且无任何测试对账 builtin 全集——新增工具时容易静默漂移。

做了什么

  • 三态 PlanSafety + tool.PlanModeClassifier 可选接口:工具自报计划阶段立场。complete_step 自报 unsafe(与 knownBlockedTools 双保险),read_only_task / read_only_skill 自报 safe。
  • Decide 改为温和 fail-closed:ReadOnly()==false 的工具(plugin/MCP、写工具)默认拒绝,除非自报 plan-safe 或在 plan_mode_allowed_tools 声明;ReadOnly()==true 的内部工具仍受信任(除非自报 unsafe)。强制不变量 PlanSafe ⇒ ReadOnly
  • reconcile_test.go(新):遍历 tool.Builtins(),任何 builtin 缺少显式 plan-mode 立场即测试失败——把“新增工具忘登记”从静默漂移变成 CI 红(forcing function)。
  • marker round-trip 测试(新):把 model-facing Marker 文案点名的“可用 / 不可用”工具锁到 Decide,防文案与 gate 漂移。
  • plan_mode_allowed_tools 现在也是只读 MCP/plugin 工具的逃生阀:它们按接口契约 ReadOnly()==false,默认 fail-closed,需在此显式声明;文档(SPEC/GUIDE/GUIDE.zh-CN/MIGRATING)已更新。

需要 reviewer 重点确认的设计取舍

最初实现的是完全 fail-closed(删掉 if call.ReadOnly),但实测会让 8+ 个项目内部只读工具回归被拦(connect_tool_source / recall / lsp / history / slash_command / session 工具)。鉴于本 PR 的 Tool 接口契约已规定 plugin/MCP 工具必为 ReadOnly()==false(其副作用无法静态推断),改为温和 fail-closed:信任内部 ReadOnly()==true 工具,对不可信工具(必为 ReadOnly()==false)fail-closed——达到等价安全且零回归。请确认这个边界是否符合项目对 plan mode 的设计意图。

验证

  • go test ./...:plan-mode 相关全部包(planmode / tool / agent / boot / skill / config 等)全绿;go vet ./...go build ./... 通过。
  • forcing function 实测:临时移除某 builtin 的立场 → 对账测试立即变红并精确点名 → 恢复。
  • 注:internal/cli 存在与本改动无关的预存 flaky(TUI 渲染测试,单独跑通过、全量并行失败)。

@github-actions github-actions Bot added v2 Go rewrite (1.x) — main-v2 branch, active development skills Skill system (internal/skill, internal/tool) agent Core agent loop (internal/agent, internal/control) config Configuration & setup (internal/config) labels Jun 24, 2026
SivanCola and others added 4 commits June 24, 2026 23:11
Add read_only_task as an ephemeral subagent that exposes only read-only research tools plus plan-mode safe foreground bash. Wire it into token economy source loading and document the plan-mode boundary.

Co-authored-by: SivanCola <32437197+SivanCola@users.noreply.github.com>
Exclude connect_tool_source from read_only_task child registries so token economy source loading cannot reopen writer-capable or installer tool surfaces. Extend the economy plan-mode test to execute read_only_task and verify the child request stays within the read-only boundary.

Co-authored-by: SivanCola <32437197+SivanCola@users.noreply.github.com>
Co-authored-by: SivanCola <63166760+SivanCola@users.noreply.github.com>
Plan mode previously trusted any ReadOnly()==true tool while planning, so
a pseudo-read-only tool (e.g. complete_step) could slip through unless it
was manually denylisted, with no test pinning the built-in roster.

- Add tri-state PlanSafety + tool.PlanModeClassifier so tools self-report
  their plan-mode stance; complete_step reports unsafe, read_only_task /
  read_only_skill report safe.
- Make Decide fail-closed for non-read-only tools (plugin/MCP, writers)
  while still trusting in-process ReadOnly() tools; enforce PlanSafe =>
  ReadOnly. plan_mode_allowed_tools is the escape valve for read-only
  MCP/plugin tools.
- Add reconcile_test.go: walks tool.Builtins() and fails if any built-in
  lacks an explicit plan-mode stance (forcing function against drift).
- Add marker round-trip test pinning the model-facing Marker to the gate.
- Document the boundary in SPEC/GUIDE/GUIDE.zh-CN/MIGRATING.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@SivanCola

Copy link
Copy Markdown
Collaborator

@codex review

@SivanCola SivanCola left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this is directionally right and the central policy shape is useful, but I think one plan-mode boundary needs fixing before merge.

Blocking issue:

  • The new policy relies on the invariant that MCP/plugin tools are untrusted and therefore report ReadOnly()==false, so they fail closed unless explicitly declared through plan_mode_allowed_tools (internal/planmode/policy.go, around Decide). That invariant is not true in the current code: MCP remoteTool.ReadOnly() returns true when the server advertises annotations.readOnlyHint or when Spec.ReadOnlyToolNames marks the raw tool read-only (internal/plugin/plugin.go). Because Policy.Decide allows any call with ReadOnly()==true, a connected MCP server can make a tool callable in plan mode without being declared in plan_mode_allowed_tools. The same trust leak also reaches read_only_task / read_only_skill, because ReadOnlySubagentToolRegistry admits every parent tool whose ReadOnly() returns true.

This reopens the external-tool fail-closed boundary this PR is trying to establish. Please either make MCP/plugin calls plan-opaque unless the visible mcp__... tool is explicitly declared for plan mode, and add a regression test for an MCP tool with ReadOnly()==true being blocked by default, or change the design/docs/tests to explicitly say plan mode trusts MCP readOnlyHint/known overrides.

What I checked:

  • Plan-mode policy centralization and marker/gate alignment.
  • read_only_task / read_only_skill registry boundaries and economy source gating.
  • Security/cache impact: no per-turn tool-schema mutation found, but this PR does add stable tool schemas and changes the user-turn plan marker.
  • Related PRs: #5262 has a stack conflict in internal/skill/tools_test.go; #5256 and #5161 merge-tree cleanly with this PR.

Verified:

  • go test -count=1 ./internal/planmode ./internal/agent ./internal/boot ./internal/skill ./internal/config ./internal/control ./internal/plugin passed
  • git diff --check origin/main-v2...refs/remotes/origin/pr/5246 passed
  • git merge-tree --write-tree origin/main-v2 refs/remotes/origin/pr/5246 passed

Once the external-tool plan-mode boundary is aligned and verification stays green, I will consider merging this into main-v2.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 905977386e

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread internal/boot/token_profile.go
Comment thread internal/planmode/policy.go Outdated
Comment thread internal/control/input.go
The moderate fail-closed gate trusted any ReadOnly()==true tool, but an MCP
remoteTool's ReadOnly() can be true purely from a server-reported readOnlyHint —
an untrusted signal. Such a tool could enter plan mode (and read-only research
sub-agents) without a plan_mode_allowed_tools declaration.

- Add tool.PlanModeUntrustedReadOnly; plugin remoteTool/lazyTool implement it,
  returning true only when ReadOnly() came from a server readOnlyHint, not from a
  first-party Spec.ReadOnlyToolNames override (cache format unchanged).
- planmode.Call gains Untrusted; Decide no longer trusts an untrusted ReadOnly()
  on its fast path — such tools are gated like writers and must be declared in
  plan_mode_allowed_tools. Built-ins and first-party overrides stay trusted.
- ReadOnlySubagentToolRegistry / FilterReadOnlyRegistry exclude untrusted
  read-only tools from read-only research sub-agents and the planner registry.
- Tests across planmode/plugin/agent; docs corrected (the earlier "plugin/MCP
  contractually non-read-only" claim was wrong).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot added the mcp MCP servers / plugins (internal/plugin, codegraph) label Jun 24, 2026
@SivanCola

Copy link
Copy Markdown
Collaborator

Fixed in e3b4d2a1a. You're right — the "MCP/plugin tools are ReadOnly()==false" invariant was false (readOnlyHint and Spec.ReadOnlyToolNames both make ReadOnly() true), so the external-tool boundary leaked. I took a hybrid of your two options, split by trust source:

  • Untrusted readOnlyHint → plan-opaque. New tool.PlanModeUntrustedReadOnly; remoteTool/lazyTool return true only when ReadOnly() came from the server hint, not a first-party override (cache format unchanged). Policy.Decide no longer trusts such a tool on the read-only fast path — it's gated like a writer and must be declared in plan_mode_allowed_tools. ReadOnlySubagentToolRegistry / FilterReadOnlyRegistry exclude it too, closing the read_only_task / read_only_skill / planner leak.
  • Trusted first-party ReadOnlyToolNames (e.g. codeGraph) → still allowed, documented as the explicit exception.

Regression tests:

  • planmode: untrusted ReadOnly()==true call fails closed, then runs once declared — TestDecideUntrustedReadOnlyFailsClosedThenOverride.
  • agent: end-to-end MCP-shaped tool blocked by default, runs when declared — TestPlanModeUntrustedReadOnlyToolFailsClosed; and excluded from a read-only sub-agent — TestReadOnlySubagentToolRegistryExcludesUntrustedReadOnly.
  • plugin: remoteTool/lazyTool PlanModeUntrustedReadOnly truth table — TestPluginToolsPlanModeUntrustedReadOnly.

Docs corrected in SPEC/GUIDE/GUIDE.zh-CN/MIGRATING (the earlier "contractually non-read-only" wording was wrong).

go test ./... is green except a pre-existing, unrelated internal/cli TUI-rendering flake (fails only under the full parallel run, passes in isolation; reproduced on a clean base without this change).

@SivanCola SivanCola left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the quick update. The previous MCP readOnlyHint / ReadOnlyToolNames trust-boundary issue is fixed in e3b4d2a1a: the new PlanModeUntrustedReadOnly path is wired through the agent gate, planner/read-only subagent registries, lazy MCP tools, and the docs/tests now describe the trusted first-party override exception correctly.

I am still requesting changes before merge because the remaining bot findings expose real plan-mode contract holes:

  • P1: the bash safety guard still tokenizes with strings.Fields, so quoted dangerous arguments are missed. For example, find . '-delete' is accepted by the policy because the checked token is '-delete', but the shell removes the quotes before running find, making it the write-capable -delete action. Please parse shell words correctly or conservatively reject quoted guarded flags, and add regression coverage for quoted find/git grep/go guarded arguments.
  • P2: in token economy plan mode, connect_tool_source({"source":"mcp","name":"..."}) is blocked before the server can register tools, so a documented plan_mode_allowed_tools = ["mcp__server__tool"] escape hatch cannot ever take effect for MCP tools hidden behind the connector. Please thread the configured allow-list/server name into the source gate, or otherwise allow connecting MCP servers that have explicitly allowed plan-mode tools while still preserving the per-tool gate after registration.

The P3 legacy marker stripping comment is lower severity, but it is also valid as a compatibility cleanup: old persisted plan-mode turns can leak the previous marker into display/title fallback paths after this marker text change. Please either fix it in this PR or state why it is intentionally deferred.

What I checked on the updated head:

  • The original external-tool fail-closed blocker is addressed by PlanModeUntrustedReadOnly and the new regression tests.
  • Cache impact remains bounded to the intended stable tool additions and user-turn plan marker change; I did not find per-turn schema churn.
  • Security review: no new dependency, hidden network path, secret handling change, or command execution expansion beyond the plan-mode bash wrapper. The quoted-argument issue above is the remaining unsafe command-execution gap.
  • Related PRs: #5262 still conflicts with this PR in internal/skill/tools_test.go; #5217 now conflicts in internal/agent/agent.go; #5256 and #5161 merge-tree cleanly with this PR.

Verified:

  • go test -count=1 ./internal/planmode ./internal/agent ./internal/boot ./internal/skill ./internal/config ./internal/control ./internal/plugin passed
  • git diff --check origin/main-v2...refs/remotes/origin/pr/5246 passed
  • git merge-tree --write-tree origin/main-v2 refs/remotes/origin/pr/5246 passed

Once the P1/P2 issues are addressed and CI stays green, I will consider merging this into main-v2.

Honor shell quoting when screening plan-mode bash commands, allow explicitly declared MCP servers to connect in token economy plan mode, and strip the legacy plan marker from restored display text.

Co-authored-by: SivanCola <32437197+SivanCola@users.noreply.github.com>
@SivanCola SivanCola enabled auto-merge June 24, 2026 18:27
@SivanCola SivanCola disabled auto-merge June 24, 2026 18:28

@SivanCola SivanCola left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved. This is a well-scoped plan-mode policy improvement, and the follow-up fixes address the remaining blockers I raised.

Why this fits the architecture:

  • Plan mode still rides in the user turn and keeps the stable provider-visible prefix intact.
  • The MCP path remains fail-closed by default: token economy can connect an MCP server during planning only when plan_mode_allowed_tools declares a concrete visible tool name for that server.
  • The bash gate now screens shell fields after quote/escape handling, so read-only planning cannot bypass the argument denylist with quoted write-capable flags.

Additional confidence:

  • Legacy plan markers are stripped for restored display text without changing the current marker injection path.
  • The added regression tests cover the bash quoting bypass, the MCP allowlist source gate, and the legacy marker fallback.

Verified:

  • go test -count=1 ./internal/planmode ./internal/agent ./internal/boot ./internal/skill ./internal/config ./internal/control ./internal/plugin passed
  • git diff --check passed
  • GitHub CI is green on b1326ef

@SivanCola SivanCola merged commit ce3c510 into esengine:main-v2 Jun 24, 2026
14 checks passed
@lifu963 lifu963 deleted the pr/plan-mode-policy branch June 25, 2026 00:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agent Core agent loop (internal/agent, internal/control) config Configuration & setup (internal/config) mcp MCP servers / plugins (internal/plugin, codegraph) skills Skill system (internal/skill, internal/tool) v2 Go rewrite (1.x) — main-v2 branch, active development

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants