Commit 1161fea

Merge pull request #439 from Cai-Tang-www/fix/hook&verifier
feat: implement the runtime acceptance loop and verifier dual gating
2 parents aec39b1 + 16de689 commit 1161fea

73 files changed

Lines changed: 6783 additions & 116 deletions

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
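# Compatibility Fallback Lifecycle

## Why the fallback may only exist short-term
- The legacy completion-only path bypasses the verifier gate, which breaks dual gating and the single decision layer.
- Keeping it long-term creates two sources of truth and raises the risk of inconsistent state.

## Trigger condition
- The compatibility path is entered when `runtime.verification.enabled=false`.

## Event and logging requirements
- The compatibility path must emit a structured stop reason: `compatibility_fallback`.
- The acceptance event keeps an internal summary that marks the run as fallback behavior.
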
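The trigger and logging rules above can be sketched as a small branch in Go. This is a minimal illustration, not the commit's implementation: `AcceptanceEvent`, its fields, and `finalize` are hypothetical names; only the `compatibility_fallback` stop reason and the `enabled=false` trigger come from the text.

```go
package main

import "fmt"

// StopReason mirrors the string-valued stop reasons named in the docs.
type StopReason string

const StopCompatibilityFallback StopReason = "compatibility_fallback"

// AcceptanceEvent is a hypothetical event shape; the field names are assumptions.
type AcceptanceEvent struct {
	StopReason StopReason
	Fallback   bool   // marks the event as fallback behavior
	Summary    string // internal summary kept on the acceptance event
}

// finalize takes the compatibility path when verification is disabled and
// labels the event accordingly; otherwise it defers to the dual-gate path.
func finalize(verificationEnabled bool, runGates func() AcceptanceEvent) AcceptanceEvent {
	if !verificationEnabled {
		return AcceptanceEvent{
			StopReason: StopCompatibilityFallback,
			Fallback:   true,
			Summary:    "verification disabled; completion-only path used",
		}
	}
	return runGates()
}

func main() {
	gates := func() AcceptanceEvent { return AcceptanceEvent{StopReason: "accepted"} }
	ev := finalize(false, gates)
	fmt.Println(ev.StopReason, ev.Fallback)
}
```

Keeping the fallback label on the event itself means downstream consumers can tell a fallback completion from a verified one without inspecting configuration.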
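## Removal conditions
- Once acceptance and the verifier are stable on the main path, remove `enabled=false` as a regular run path.
- Keep only a short-lived switch for the gradual-rollout window.

## Why long-term dual tracks are prohibited
- Terminal-state semantics would split into two rule sets: "old completion-only" and "new dual-gate".
- TUI / runtime / persistence can hardly guarantee a uniform interpretation, which invites regressions.
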

docs/runtime-finalization-flow.md

Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
# Runtime Finalization Flow

## Old flow
- When the assistant returns a final message with no tool calls, the run could go straight to the completion path.

## New flow
- assistant final -> completion gate -> `beforeAcceptFinal` -> verifier engine -> acceptance decision -> decider stop reason -> runtime terminal state.

## Where beforeAcceptFinal is inserted
- In the runtime main loop, on the final-candidate branch where `len(tool_calls)==0`.
- Emit `verification_started` first, then run the acceptance engine.

## Branch behavior
- `accepted`: `verification_completed` -> `acceptance_decided` -> `agent_done`
- `continue`: `verification_finished` -> inject a runtime reminder -> next turn
- `incomplete`: `acceptance_decided` + `stop_reason_decided` -> `agent_done`
- `failed`: `verification_failed` + `acceptance_decided` + `stop_reason_decided` -> `agent_done`

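The four branches can be written down as a lookup from decision to event sequence. The sketch below is illustrative only: the `branch` struct and `branchFor` are hypothetical, and the event names are copied verbatim from the list above.

```go
package main

import "fmt"

// branch describes what the runtime does for one acceptance decision.
type branch struct {
	Events         []string
	InjectReminder bool // continue: inject a runtime reminder and run another turn
	Terminal       bool // true when the branch ends in agent_done
}

// branchFor maps an acceptance decision to the event sequence listed in the doc.
func branchFor(decision string) (branch, error) {
	switch decision {
	case "accepted":
		return branch{Events: []string{"verification_completed", "acceptance_decided", "agent_done"}, Terminal: true}, nil
	case "continue":
		return branch{Events: []string{"verification_finished"}, InjectReminder: true}, nil
	case "incomplete":
		return branch{Events: []string{"acceptance_decided", "stop_reason_decided", "agent_done"}, Terminal: true}, nil
	case "failed":
		return branch{Events: []string{"verification_failed", "acceptance_decided", "stop_reason_decided", "agent_done"}, Terminal: true}, nil
	default:
		return branch{}, fmt.Errorf("unknown acceptance decision %q", decision)
	}
}

func main() {
	for _, d := range []string{"accepted", "continue", "incomplete", "failed"} {
		b, _ := branchFor(d)
		fmt.Println(d, b.Events, b.Terminal)
	}
}
```

Only `continue` keeps the run alive; every other decision ends with `agent_done`.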
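## completion_gate vs verification_gate
- The completion gate only controls whether wrap-up may be attempted.
- The verification gate decides whether final completion is allowed.

## Decider position and source-of-truth relationship
- The decider emits `stop_reason_decided` once, when the run exits.
- Acceptance output is written to the runtime terminal-state snapshot, and the decider alone encodes the reason.
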
Lines changed: 31 additions & 0 deletions
@@ -0,0 +1,31 @@
# Stop Reason And Decision Priority

## Full set of StopReason values
- `user_interrupt`
- `fatal_error`
- `budget_exceeded`
- `max_turn_exceeded`
- `retry_exhausted`
- `verification_failed`
- `accepted`
- `todo_not_converged`
- `todo_waiting_external`
- `no_progress_after_final_intercept`
- `max_turn_exceeded_with_unconverged_todos`
- `max_turn_exceeded_with_failed_verification`
- `verification_config_missing`
- `verification_execution_denied`
- `verification_execution_error`
- `compatibility_fallback`

## Priority
- `user_interrupt` > `fatal_error` > `budget_exceeded` > `max_turn_exceeded` > `retry_exhausted` > `verification_failed` > `accepted`

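A priority chain like this reduces to "pick the first ranked reason that is present". The Go sketch below assumes candidates arrive as plain strings and uses a hypothetical `decide` helper. Reasons outside the documented chain (for example `todo_not_converged`) have no stated rank, so the sketch reports them as unranked rather than guessing.

```go
package main

import "fmt"

// priority lists stop reasons from highest to lowest, as given in the doc.
var priority = []string{
	"user_interrupt",
	"fatal_error",
	"budget_exceeded",
	"max_turn_exceeded",
	"retry_exhausted",
	"verification_failed",
	"accepted",
}

// decide returns the single highest-priority candidate. ok is false when no
// candidate appears in the documented chain.
func decide(candidates []string) (reason string, ok bool) {
	present := make(map[string]bool, len(candidates))
	for _, c := range candidates {
		present[c] = true
	}
	for _, r := range priority {
		if present[r] {
			return r, true
		}
	}
	return "", false
}

func main() {
	r, _ := decide([]string{"accepted", "verification_failed", "budget_exceeded"})
	fmt.Println(r)
}
```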
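## Decision exclusivity
- The decider returns a single stop reason.
- acceptance/verifier only provide input; they do not make the final ruling.

## Relationship to ErrorClass
- `ErrorClass` only describes the failure category (compile/test/lint/type/timeout/permission, etc.).
- The stop reason describes why the run terminated; the error class describes the kind of failure. The two do not duplicate each other.
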

docs/task-acceptance-design.md

Lines changed: 38 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,38 @@
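# Task Acceptance Design

## Background
- In the old flow, the runtime could complete directly when the model emitted a final message with no tool calls.
- This conflates a "textual final" with the task actually being done.

## Why a model final cannot directly mean completion
- A final only signals the model's own intent to stop; it does not show that required todos, file artifacts, or verification commands are satisfied.
- Real completion must be ruled on by the runtime acceptance layer.

## Distinguishing completion / verification / acceptance
- `completion_gate`: decides whether the current turn may attempt wrap-up (necessary but not sufficient).
- `verification_gate`: the verifier engine decides whether the task meets its acceptance conditions.
- `acceptance_decision`: aggregates both and outputs `accepted/continue/incomplete/failed`.

## Dual-gate model
- `completed = completion_gate.passed && verification_gate.passed`
- If either gate fails, the run cannot go directly to `agent_done`.
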
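The dual-gate formula is a plain conjunction. A minimal Go sketch, with a hypothetical `gate` type standing in for whatever the runtime actually carries:

```go
package main

import "fmt"

// gate is a hypothetical stand-in for a gate result; only Passed matters here.
type gate struct {
	Passed bool
	Reason string
}

// completed implements completed = completion_gate.passed && verification_gate.passed.
func completed(completion, verification gate) bool {
	return completion.Passed && verification.Passed
}

func main() {
	// The model said "final" (completion gate passes) but a test command failed.
	fmt.Println(completed(gate{Passed: true}, gate{Passed: false, Reason: "go test failed"}))
}
```

The completion gate alone is never sufficient: a passing completion gate with a failing verification gate still blocks `agent_done`.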
## State machine
- provider final -> `beforeAcceptFinal` -> verification -> acceptance_decided
- `accepted` -> `agent_done`
- `continue` -> inject a system reminder and keep reasoning
- `incomplete/failed` -> end the run and output a stop reason

## StopReason design
- Stop reasons are emitted solely by the controlplane decider.
- New enum values include `verification_failed`, `todo_not_converged`, and `retry_exhausted`.

## Relationship to todo / subagent / runtime
- Todos are verifier input and do not decide the terminal state directly.
- A finished subagent does not mean the main task is finished; it still has to pass the verifier gate.
- The runtime only consumes decider output and no longer determines the terminal state in parallel.

## Decider as the single decision layer
- The terminal state comes only from the decider.
- events / TUI / persistence all consume the decider's decision.


docs/todo-schema-migration.md

Lines changed: 20 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,20 @@
# Todo Schema Migration

## New fields
- `required`: whether the todo takes part in final wrap-up interception; defaults to `true`.
- `blocked_reason`: `internal_dependency / permission_wait / user_input_wait / external_resource_wait / unknown`.

## Compatibility strategy
- Legacy data missing `required` is treated as `true`.
- Legacy data missing `blocked_reason` is treated as `unknown`.
- Legacy blocked todos do not fail deserialization because the new fields are absent.

## Default-value semantics
- `required=nil` is treated as required=true (compatible with old sessions).
- `blocked_reason=""` is normalized to `unknown`.

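The default-value rules above map naturally onto a pointer field plus a normalization step. The sketch below is an assumption-heavy illustration: the `todo` struct, its JSON tags, and the use of JSON as the persisted format are hypothetical; only the `required=nil -> true` and `blocked_reason="" -> unknown` rules come from the text.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// todo is a hypothetical persisted shape; field names and tags are assumptions.
type todo struct {
	Content       string `json:"content"`
	Required      *bool  `json:"required,omitempty"`
	BlockedReason string `json:"blocked_reason,omitempty"`
}

// RequiredValue treats a missing (nil) required flag as true.
func (t todo) RequiredValue() bool { return t.Required == nil || *t.Required }

// normalize applies the documented default for blocked_reason.
func normalize(t todo) todo {
	if t.BlockedReason == "" {
		t.BlockedReason = "unknown"
	}
	return t
}

// loadTodo decodes a record and normalizes it, so legacy data without the
// new fields still loads.
func loadTodo(raw []byte) (todo, error) {
	var t todo
	if err := json.Unmarshal(raw, &t); err != nil {
		return todo{}, err
	}
	return normalize(t), nil
}

func main() {
	legacy, err := loadTodo([]byte(`{"content":"ship docs"}`))
	fmt.Println(legacy.RequiredValue(), legacy.BlockedReason, err)
}
```

Using `*bool` rather than `bool` is what lets an absent field be distinguished from an explicit `false`.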
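## Persistence migration notes
- `CurrentTodoVersion` is bumped to `5`.
- Normalization applies the default-value rules on both load and write.
- The tool-layer schema and patch support the new fields in step, to avoid runtime/tool protocol drift.
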
Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
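# Verifier Configuration And Policy

## Configuration sources
- Global: `runtime.verification` in `~/.neocode/config.yaml`.
- Reserved repository-level extension: `.neocode/verification.yaml` (this phase only keeps the interface and policy slot).

## Priority
- Repository level > global level > built-in defaults (the policy is already designed around this order).

## Command source
- Every command-based verifier reads its command from `runtime.verification.verifiers.<name>.command`.
- Hardcoding project commands inside a verifier is prohibited.

## Enable/disable rules
- Each verifier supports independent `enabled/required` settings.
- When no command is configured:
  - required=true -> return an explicit soft_block/fail
  - required=false -> skip (an explicit result, not a silent pass)
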
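The missing-command rule can be sketched as a small resolver. Everything named here (`verifierSettings`, `resolveMissingCommand`) is hypothetical. The doc allows either `soft_block` or `fail` for a required verifier without a command, and this sketch arbitrarily picks `soft_block`.

```go
package main

import "fmt"

// verifierSettings is a hypothetical per-verifier config; the real struct may differ.
type verifierSettings struct {
	Required bool
	Command  []string
}

// resolveMissingCommand returns an explicit status when no command is configured.
// run is true only when there is a command to execute.
func resolveMissingCommand(s verifierSettings) (status string, run bool) {
	if len(s.Command) > 0 {
		return "", true
	}
	if s.Required {
		// Assumption: soft_block is chosen over fail for illustration.
		return "soft_block", false
	}
	// Optional verifiers are skipped explicitly, never silently passed.
	return "skip", false
}

func main() {
	fmt.Println(resolveMissingCommand(verifierSettings{Required: true}))
	fmt.Println(resolveMissingCommand(verifierSettings{Required: false}))
	fmt.Println(resolveMissingCommand(verifierSettings{Command: []string{"go", "test", "./..."}}))
}
```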
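## required / optional behavior
- A failing required verifier blocks the final.
- An optional verifier may be skipped when it is not configured, but events and results are still recorded.

## non-interactive policy
- Verifier commands go through a separate `execution_policy`, not the regular ask permission chain.
- Commands are allowlisted by default (go/git/test/lint/typecheck, etc.).
- High-risk commands (such as `rm` and `sudo`) are explicitly rejected.
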

docs/verifier-engine-design.md

Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
# Verifier Engine Design

## Verifier interface
- `FinalVerifier{Name, VerifyFinal}`
- Input is `FinalVerifyInput` (session/run/task/workdir/messages/todos/runtime state/config).
- Output is `VerificationResult` (status/reason/error_class/evidence).

## Verifier categories
- P0: `todo_convergence`
- P1: `file_exists`, `content_match`, `command_success`, `git_diff`
- P1 code tasks: `build/test/lint/typecheck` (command-driven)

## Orchestrator flow
- Resolve the verifier list from the policy and run the verifiers sequentially.
- Aggregate into `VerificationGateDecision{Passed, Reason, Results}`.

## Aggregation rules
- Any `fail` -> `verification_failed`
- Otherwise any `hard_block` -> `todo_waiting_external` or `todo_not_converged`
- Otherwise any `soft_block` -> `todo_not_converged`
- All `pass` -> `accepted`

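These rules form a strict precedence: fail, then hard_block, then soft_block, then pass. A minimal Go sketch follows, with two labeled assumptions: the choice between `todo_waiting_external` and `todo_not_converged` for a hard block is driven by a hypothetical `waitingExternal` flag, and statuses outside the three blocking ones (for example `skip`) are treated as non-blocking.

```go
package main

import "fmt"

// aggregate folds per-verifier statuses into one outcome using the
// documented precedence. waitingExternal is a hypothetical input that
// selects which reason a hard_block maps to.
func aggregate(statuses []string, waitingExternal bool) string {
	seen := make(map[string]bool, len(statuses))
	for _, s := range statuses {
		seen[s] = true
	}
	switch {
	case seen["fail"]:
		return "verification_failed"
	case seen["hard_block"]:
		if waitingExternal {
			return "todo_waiting_external"
		}
		return "todo_not_converged"
	case seen["soft_block"]:
		return "todo_not_converged"
	default:
		// Assumption: anything else (pass, skip) does not block.
		return "accepted"
	}
}

func main() {
	fmt.Println(aggregate([]string{"pass", "soft_block", "fail"}, false))
	fmt.Println(aggregate([]string{"pass", "hard_block"}, true))
	fmt.Println(aggregate([]string{"pass", "pass"}, false))
}
```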
## Task policy mapping
- `unknown`: todo_convergence
- `create_file/docs`: todo_convergence + file_exists + content_match
- `config`: todo_convergence + file_exists + content_match + command_success
- `edit_code/fix_bug/refactor`: todo_convergence + git_diff + build/test/lint/typecheck (enabled or disabled per policy)

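The mapping reads as a static table from task type to verifier names. The Go sketch below encodes it directly. `verifiersFor` and its fallback to the `unknown` policy for unrecognized task types are assumptions, and the per-policy enable/disable step for build/test/lint/typecheck is left out.

```go
package main

import "fmt"

var codeVerifiers = []string{"todo_convergence", "git_diff", "build", "test", "lint", "typecheck"}
var fileVerifiers = []string{"todo_convergence", "file_exists", "content_match"}

// taskPolicy encodes the mapping from the doc, one entry per task type.
var taskPolicy = map[string][]string{
	"unknown":     {"todo_convergence"},
	"create_file": fileVerifiers,
	"docs":        fileVerifiers,
	"config":      {"todo_convergence", "file_exists", "content_match", "command_success"},
	"edit_code":   codeVerifiers,
	"fix_bug":     codeVerifiers,
	"refactor":    codeVerifiers,
}

// verifiersFor looks up the verifier list for a task type.
// Assumption: unrecognized types fall back to the "unknown" policy.
func verifiersFor(task string) []string {
	if v, ok := taskPolicy[task]; ok {
		return v
	}
	return taskPolicy["unknown"]
}

func main() {
	fmt.Println(verifiersFor("docs"))
	fmt.Println(verifiersFor("fix_bug"))
	fmt.Println(verifiersFor("something_else"))
}
```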

internal/config/runtime.go

Lines changed: 9 additions & 0 deletions
@@ -18,6 +18,7 @@ type RuntimeConfig struct {
 	MaxNoProgressStreak int `yaml:"max_no_progress_streak,omitempty"`
 	MaxRepeatCycleStreak int `yaml:"max_repeat_cycle_streak,omitempty"`
 	MaxTurns int `yaml:"max_turns,omitempty"`
+	Verification VerificationConfig `yaml:"verification,omitempty"`
 	Assets RuntimeAssetsConfig `yaml:"assets,omitempty"`
 }
 
@@ -33,6 +34,7 @@ func defaultRuntimeConfig() RuntimeConfig {
 		MaxNoProgressStreak: DefaultMaxNoProgressStreak,
 		MaxRepeatCycleStreak: DefaultMaxRepeatCycleStreak,
 		MaxTurns: DefaultMaxTurns,
+		Verification: defaultVerificationConfig(),
 		Assets: defaultRuntimeAssetsConfig(),
 	}
 }
@@ -51,6 +53,7 @@ func (c RuntimeConfig) Clone() RuntimeConfig {
 		MaxNoProgressStreak: c.MaxNoProgressStreak,
 		MaxRepeatCycleStreak: c.MaxRepeatCycleStreak,
 		MaxTurns: c.MaxTurns,
+		Verification: c.Verification.Clone(),
 		Assets: c.Assets.Clone(),
 	}
 }
@@ -69,6 +72,7 @@ func (c *RuntimeConfig) ApplyDefaults(defaults RuntimeConfig) {
 	if c.MaxTurns <= 0 {
 		c.MaxTurns = defaults.MaxTurns
 	}
+	c.Verification.ApplyDefaults(defaults.Verification)
 	c.Assets.ApplyDefaults(defaults.Assets)
 }
 
@@ -83,6 +87,11 @@ func (c RuntimeConfig) Validate() error {
 	if c.MaxTurns < 0 {
 		return errors.New("max_turns must be greater than or equal to 0")
 	}
+	verification := c.Verification.Clone()
+	verification.ApplyDefaults(defaultVerificationConfig())
+	if err := verification.Validate(); err != nil {
+		return err
+	}
 	if err := c.Assets.Validate(); err != nil {
 		return err
 	}

internal/config/runtime_test.go

Lines changed: 65 additions & 0 deletions
@@ -144,6 +144,27 @@ func TestRuntimeConfigValidate(t *testing.T) {
 	}).Validate(); err == nil {
 		t.Fatal("expected validation error for assets.max_session_assets_total_bytes=-1")
 	}
+
+	if err := (RuntimeConfig{
+		MaxNoProgressStreak: 1,
+		MaxRepeatCycleStreak: 1,
+		MaxTurns: 1,
+		Verification: VerificationConfig{
+			DefaultTaskPolicy: "unknown",
+			MaxNoProgress: 1,
+			MaxRetries: 0,
+			Verifiers: map[string]VerifierConfig{
+				"todo_convergence": {FailOpen: true, FailClosed: true},
+			},
+			ExecutionPolicy: VerificationExecutionPolicyConfig{
+				Mode: "non_interactive",
+				DefaultTimeout: 1,
+				DefaultOutputCap: 1,
+			},
+		},
+	}).Validate(); err == nil {
+		t.Fatal("expected validation error for invalid verification config")
+	}
 }
 
 func TestRuntimeAssetsConfigZeroValuesResolveToDefaults(t *testing.T) {
@@ -171,3 +192,47 @@ func TestRuntimeAssetsConfigZeroValuesResolveToDefaults(t *testing.T) {
 		)
 	}
 }
+
+func TestRuntimeConfigVerificationDefaultsApplied(t *testing.T) {
+	t.Parallel()
+
+	defaults := defaultRuntimeConfig()
+	cfg := RuntimeConfig{}
+	cfg.ApplyDefaults(defaults)
+	if !cfg.Verification.EnabledValue() {
+		t.Fatalf("expected verification enabled by default")
+	}
+	if !cfg.Verification.FinalInterceptValue() {
+		t.Fatalf("expected verification final intercept enabled by default")
+	}
+	if cfg.Verification.MaxNoProgress <= 0 {
+		t.Fatalf("expected max_no_progress > 0, got %d", cfg.Verification.MaxNoProgress)
+	}
+	if len(cfg.Verification.Verifiers) == 0 {
+		t.Fatal("expected default verifiers to be populated")
+	}
+}
+
+func TestRuntimeConfigVerificationExplicitFalsePreserved(t *testing.T) {
+	t.Parallel()
+
+	defaults := defaultRuntimeConfig()
+	cfg := RuntimeConfig{
+		Verification: VerificationConfig{
+			Enabled: boolPtrTest(false),
+			FinalIntercept: boolPtrTest(false),
+		},
+	}
+	cfg.ApplyDefaults(defaults)
+	if cfg.Verification.EnabledValue() {
+		t.Fatalf("expected explicit verification.enabled=false to be preserved")
+	}
+	if cfg.Verification.FinalInterceptValue() {
+		t.Fatalf("expected explicit verification.final_intercept=false to be preserved")
+	}
+}
+
+func boolPtrTest(value bool) *bool {
+	v := value
+	return &v
+}
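The two new tests rely on `Enabled` being a `*bool`, so that "unset" and "explicitly false" stay distinguishable when defaults are applied. The real `VerificationConfig` is not shown in this diff, so the sketch below uses a hypothetical `verificationSettings` type to show one way such behavior could work. The nil handling in `EnabledValue` is an assumption.

```go
package main

import "fmt"

// verificationSettings is a hypothetical stand-in for VerificationConfig.
type verificationSettings struct {
	Enabled *bool
}

// EnabledValue dereferences the flag. Assumption: nil reads as false here,
// since ApplyDefaults is expected to have filled it in beforehand.
func (s verificationSettings) EnabledValue() bool {
	return s.Enabled != nil && *s.Enabled
}

// ApplyDefaults fills Enabled only when it is unset, so an explicit false survives.
func (s *verificationSettings) ApplyDefaults(d verificationSettings) {
	if s.Enabled == nil && d.Enabled != nil {
		v := *d.Enabled // copy so the config does not alias the defaults
		s.Enabled = &v
	}
}

func boolPtr(v bool) *bool { return &v }

func main() {
	defaults := verificationSettings{Enabled: boolPtr(true)}

	unset := verificationSettings{}
	unset.ApplyDefaults(defaults)

	explicit := verificationSettings{Enabled: boolPtr(false)}
	explicit.ApplyDefaults(defaults)

	fmt.Println(unset.EnabledValue(), explicit.EnabledValue())
}
```

With a plain `bool` field, the zero value would be indistinguishable from an explicit `false`, and the default of `true` would silently override the user's choice.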
