Assessed "HookRegistry on spine" against the code first. Finding: hooks run in
the tool-dispatch layer (teaagent/tools.py::execute), not the runner the plan
named. PreToolUse/PostToolUse MUTATE in-flight arguments/result fed to the
handler — the spine has no channel to ferry a mutated payload back, so an
interceptor cannot replace them without losing mutation (same coupling that kept
approval inline, decision B). The 6 session-lifecycle hooks have no production
caller — nothing to strangle. So the enforcement bridge (M5-T002) is unsuitable;
M5 closes as observability-only.

Suitable, additive slice DONE (M2 pattern, zero behavior change):
- 5 hook audit events typed in RunEventType + bidirectional mapper:
  tool_hook_pre_mutation / _pre_mutation_blocked / _vetoed / _post_mutation /
  _post_failed. The M2-T001 reader now surfaces hook veto/mutation activity from
  the audit JSONL for the M6 fold.
- test_m5_hook_audit_events_are_typed_and_reader_surfaced; round-trip
  completeness now covers 31 members (7 M0 + 19 M2 + 5 M5).
- Plan §5 graph + §7 M5 row revised; M5-T001 DONE-as-taxonomy, T002 CLOSED
  (unsuitable), T003 DEFERRED. New work-log with full assessment.

Audit bytes unchanged; hook execution + mutation semantics unchanged.

Constraint: mapping/reader only; hook execution stays in dispatch layer; no public hook semantics changed; mutating/dispatch-coupled mechanisms stay where they are (same logic as approval B / budget B-analog).
Tested: tests/lifecycle/test_run_event_spine.py 22 passed (incl. new hook-taxonomy test + round-trip completeness 31==31); docs inventory --check passes.
Not-tested: full suite not run on 3.12 (hypothesis missing in 3.14 sandbox); pre-commit smoke covers governance core.
Confidence: high
Roadmap-Status: unchanged
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
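
For readers unfamiliar with the "typed + bidirectional mapper" pattern named above, here is a minimal sketch of how such a taxonomy and its round-trip check might look. It is an illustration under assumptions, not the repository's code: the five event names come from this commit, but the enum member names, the `event` JSON key, and the helpers `read_typed_events` and `round_trip_is_complete` are hypothetical.

```python
import json
from enum import Enum
from typing import Dict, Iterator, Tuple


class RunEventType(Enum):
    """Hypothetical stand-in holding only the five hook members from this commit."""

    TOOL_HOOK_PRE_MUTATION = "tool_hook_pre_mutation"
    TOOL_HOOK_PRE_MUTATION_BLOCKED = "tool_hook_pre_mutation_blocked"
    TOOL_HOOK_VETOED = "tool_hook_vetoed"
    TOOL_HOOK_POST_MUTATION = "tool_hook_post_mutation"
    TOOL_HOOK_POST_FAILED = "tool_hook_post_failed"


# Forward map: audit event name (as written to the JSONL) -> typed member.
AUDIT_TO_TYPE: Dict[str, RunEventType] = {m.value: m for m in RunEventType}
# Reverse map: typed member -> audit event name.
TYPE_TO_AUDIT: Dict[RunEventType, str] = {m: n for n, m in AUDIT_TO_TYPE.items()}


def read_typed_events(jsonl_text: str) -> Iterator[Tuple[RunEventType, dict]]:
    """Yield (type, record) for each recognised audit line; skip unknown events.

    Assumes each record stores its name under an "event" key (an assumption).
    The audit text is only read, never rewritten.
    """
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue
        record = json.loads(line)
        event_type = AUDIT_TO_TYPE.get(record.get("event"))
        if event_type is not None:
            yield event_type, record


def round_trip_is_complete() -> bool:
    """True when every member maps to a name and back to the same member."""
    same_size = len(AUDIT_TO_TYPE) == len(TYPE_TO_AUDIT) == len(RunEventType)
    return same_size and all(
        AUDIT_TO_TYPE[TYPE_TO_AUDIT[member]] is member for member in RunEventType
    )
```

Because the reader only maps names it finds in the existing audit lines, a change of this shape leaves the audit bytes untouched, which is the property the commit message relies on.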
+surfaced); hook EXECUTION stays in the tool-dispatch layer
+(it mutates in-flight args/results; the 6 session-lifecycle
+hooks are unwired in prod) — see m5-hooks-observability-only
 -> M6 evidence + receipt FOLD (was M2; corrected
 scope A) — now genuinely event-typed because M2 completed
 the taxonomy and M3/M4 preserved the decision-event contract
@@ -174,7 +177,7 @@ consumers by M6.
 | ADR-0032-M2 (REDEFINED, taxonomy-only §16) | Every audit event the evidence bundle reads is typed in `RunEventType` and mapped both directions, so the M2-T001 reader surfaces it **from the audit JSONL** (mapper is sufficient; emit-site migration is NOT in M2 — it is deferred to the component milestones, §16). Covers routes, git-sandbox, skills, tests, undo, provenance, approval/tool-call decision events, cancelled/pending lifecycle. Pure additive; zero behavior change. (Old M2 "evidence/receipt fold" moved to M6 — §14.) |
 | ADR-0032-M3 | Plan gate is an interceptor using `PlanValidator`, landed parity-first (§13.3): a shadow-parity test asserting interceptor==inline per reason code went green before the inline branch was deleted in a separate commit. Denials and reason codes match current behavior; adversarial and first-hour tests remain green. |
 | ADR-0032-M4 (CLOSED — owner decisions B + B-analog, 2026-06-13) | **No gate moves to an interceptor; approval AND budget enforcement both STAY INLINE.** Both proved runtime-stateful on assessment, a poor fit for the pure-interceptor model. **Approval** (decision B): live JIT/session state, tool handler, auto-mode-swappable policy — every coupling gap was invisible to a unit parity test (`docs/work-log/m4-approval-sliceB-blocked-2026-06-13.md`). **Budget** (decision B-analog): it is three mechanisms — only the global cost cap (`_assert_cost_budget`) is stateless; the phase budget (live `phase_tracker`) and the warning ladder (`_budget_warning_levels_emitted` + `BudgetMonitor._emitted_levels`/`_prompted` dedup sets + an interactive `on_prompt` side-effect handler — the same `assert_allowed` shadow-coexistence trap that blocked approval) are stateful, and even the cost cap is enforced at two evolving-cost points per iteration that do not map 1:1 to events (`docs/work-log/m4-budget-stays-inline-2026-06-13.md`). Both gates' observability is already provided by M2 (their audit events — `tool_call_*`, `approval_*`, `budget_warning`, `budget_prompt`, `phase_budget_warning` — are typed + reader-surfaced); the M6 fold reads them without owning enforcement. Approval/budget behavior unchanged. **Net: plan gate (M3) is the sole governance gate moved to an interceptor.** |
-| ADR-0032-M5 | HookRegistry subscribes through the spine; Claude-Code-compatible hook names remain aliases; public hook API docs and tests pass. |
+| ADR-0032-M5 (REVISED — observability-only, 2026-06-13) | **Hook OBSERVABILITY folds onto the spine; hook EXECUTION stays in the tool-dispatch layer.** Assessment found the planned "HookRegistry on spine" unsuitable for the same runtime-coupling reason as approval/budget: PreToolUse/PostToolUse run in `teaagent/tools.py::execute` and **mutate in-flight `arguments`/`result`** (the spine has no channel to ferry mutated payloads back to the dispatch site), and the 6 session-lifecycle hooks (SessionStart/End, UserPromptSubmit, PreCompact, Stop, SubagentStop) have **no production caller** — nothing to strangle; wiring them is feature work. Done: the 5 dispatch-layer hook audit events (`tool_hook_pre_mutation`, `tool_hook_pre_mutation_blocked`, `tool_hook_vetoed`, `tool_hook_post_mutation`, `tool_hook_post_failed`) are typed in `RunEventType` + mapped both directions, so the M2-T001 reader surfaces hook veto/mutation activity from the audit JSONL for the M6 fold. Mapping/reader only; audit bytes unchanged; hook execution + mutation semantics unchanged. See `docs/work-log/m5-hooks-observability-only-2026-06-13.md`. |
 | ADR-0032-M6 (was M2 fold; corrected scope A) | Evidence and receipts are folded from the typed event stream and equal the legacy builder on success/failure/pending fixtures (cancelled once emitted in M2); the fold reads the full stream (no fallback flag, per Q1); synthetic receipt-only fixtures are retired or relabeled legacy. Runs only after M2 coverage + M3/M4 decision events exist. |
 | ADR-0032-M7 (was M6) | ContextBus and webhook sinks consume the spine; inline emission paths are deleted; validator shows no orphaned eventing modules. |
 
@@ -552,7 +555,15 @@ commit once Slice A is green.
 - Parallelizable: no.
 - Human Review Required: no.
 
-### ADR32-M5-T001: Hook Alias Matrix
+> **M5 REVISED — observability-only (2026-06-13).** T001 landed as a *typed
+> taxonomy* for the 5 audit events the dispatch-layer HookRegistry actually
+> emits (not an 8-name public alias matrix — 6 lifecycle hooks are unwired in
+> prod). T002 (execution bridge) is CLOSED as unsuitable: hooks mutate in-flight
+> args/results and the spine cannot ferry mutated payloads back. T003 (public
+> hook API docs) is deferred — no public hook semantics changed. See the M5
+> row in §7 and `docs/work-log/m5-hooks-observability-only-2026-06-13.md`.
+
+### ADR32-M5-T001: Hook Alias Matrix [DONE as observability taxonomy]
 
 - Goal: define the stable mapping from public hook names to `RunEventType`.
+# M5 — HookRegistry: Observability Onto the Spine, Execution Stays in Dispatch
+
+> **Status:** observability slice DONE (taxonomy typed + reader-surfaced); the
+> enforcement-bridge half of the planned M5 is assessed UNSUITABLE for the same
+> runtime-coupling reason as approval (B) and budget (B-analog). Recommendation:
+> close M5 as **observability-only**. Owner decision pending on the bridge.
+
+## What the plan assumed vs. what the code shows
+
+The work-plan's M5 (§7 row, T002) assumed hooks run in the runner and that the
+migration would move **PreToolUse → spine interceptor**, **PostToolUse + session
+lifecycle → spine consumers**, touching `teaagent/runner/_core.py`. The
+observable code contradicts every part of that:
+
+| Hook | Production caller | Nature |
+|---|---|---|
+| PreToolUse |`teaagent/tools.py::ToolRegistry.execute` (~line 222) |**Mutates in-flight `arguments`** fed to `tool.handler`; can veto via `HookError`; destructive-tool mutation guard |
+| PostToolUse |`teaagent/tools.py::ToolRegistry.execute` (~line 291) |**Mutates the tool `result`** returned upstream |
+| SessionStart / SessionEnd / UserPromptSubmit / PreCompact / Stop / SubagentStop |**none** — only tests call `run_session_*` etc. | defined + unit-tested, **not wired into any production path**|
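
The coupling described in the table can be illustrated with a small sketch. This is not the code in `teaagent/tools.py`; it is a simplified model under assumptions. The `execute` signature, the hook call shapes, the helper names, and the exact points at which each audit event is recorded are all hypothetical. Only `HookError` and the event names are taken from the text above.

```python
from typing import Any, Callable, Dict, List


class HookError(Exception):
    """Raised by a pre-hook to veto a tool call (name taken from the table above)."""


def execute(
    tool_name: str,
    arguments: Dict[str, Any],
    handler: Callable[..., Any],
    pre_hooks: List[Callable[[str, Dict[str, Any]], Dict[str, Any]]],
    post_hooks: List[Callable[[str, Any], Any]],
    audit: List[str],
) -> Any:
    """Hypothetical dispatch: hooks sit in the data path, not beside it."""
    for hook in pre_hooks:
        try:
            # Each pre-hook gets a copy and returns the arguments to use next.
            mutated = hook(tool_name, dict(arguments))
        except HookError:
            audit.append("tool_hook_vetoed")
            raise
        if mutated != arguments:
            audit.append("tool_hook_pre_mutation")
        arguments = mutated  # The handler sees the mutated payload.
    result = handler(**arguments)
    for hook in post_hooks:
        mutated_result = hook(tool_name, result)
        if mutated_result != result:
            audit.append("tool_hook_post_mutation")
        result = mutated_result  # The caller sees the mutated result.
    return result


def clamp_limit(tool_name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Example pre-hook: cap an oversized limit before the handler runs."""
    arguments["limit"] = min(arguments["limit"], 10)
    return arguments


def redact(tool_name: str, result: str) -> str:
    """Example post-hook: rewrite the result before it is returned upstream."""
    return result.replace("secret", "[redacted]")


def run_query(limit: int) -> str:
    """Example tool handler."""
    return f"secret rows: {limit}"
```

In this model the audit list is a one-way record: a consumer that only receives those entries can report that a mutation or veto happened, but it has no way to hand a changed `arguments` or `result` back to `execute`. That is the gap the assessment identifies for a spine-based interceptor.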
+
+So hooks live at the **tool-dispatch layer**, plumbed via