Commit be4209a

johnteee and claude committed
feat: ADR 0032 M5 — hook observability onto spine (taxonomy); execution stays in dispatch
Assessed "HookRegistry on spine" against the code first. Finding: hooks run in the tool-dispatch layer (teaagent/tools.py::execute), not the runner the plan named. PreToolUse/PostToolUse MUTATE in-flight arguments/result fed to the handler — the spine has no channel to ferry a mutated payload back, so an interceptor cannot replace them without losing mutation (same coupling that kept approval inline, decision B). The 6 session-lifecycle hooks have no production caller — nothing to strangle. So the enforcement bridge (M5-T002) is unsuitable; M5 closes as observability-only.

Suitable, additive slice DONE (M2 pattern, zero behavior change):
- 5 hook audit events typed in RunEventType + bidirectional mapper: tool_hook_pre_mutation / _pre_mutation_blocked / _vetoed / _post_mutation / _post_failed. The M2-T001 reader now surfaces hook veto/mutation activity from the audit JSONL for the M6 fold.
- test_m5_hook_audit_events_are_typed_and_reader_surfaced; round-trip completeness now covers 31 members (7 M0 + 19 M2 + 5 M5).
- Plan §5 graph + §7 M5 row revised; M5-T001 DONE-as-taxonomy, T002 CLOSED (unsuitable), T003 DEFERRED. New work-log with full assessment.

Audit bytes unchanged; hook execution + mutation semantics unchanged.

Constraint: mapping/reader only; hook execution stays in dispatch layer; no public hook semantics changed; mutating/dispatch-coupled mechanisms stay where they are (same logic as approval B / budget B-analog).
Tested: tests/lifecycle/test_run_event_spine.py 22 passed (incl. new hook-taxonomy test + round-trip completeness 31==31); docs inventory --check passes.
Not-tested: full suite not run on 3.12 (hypothesis missing in 3.14 sandbox); pre-commit smoke covers governance core.
Confidence: high
Roadmap-Status: unchanged
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
1 parent 3e69ac0 commit be4209a

5 files changed

Lines changed: 166 additions & 10 deletions

docs/generated/docs-inventory.md

Lines changed: 3 additions & 2 deletions
@@ -6,7 +6,7 @@
 Generated by `python3 scripts/generate_docs_inventory.py`.
 Do not edit this file manually — regenerate instead.

-**Markdown files:** 588
+**Markdown files:** 589

 | Path | Tier | Bytes | SHA256 (12) |
 | --- | --- | ---: | --- |
@@ -414,7 +414,7 @@ Do not edit this file manually — regenerate instead.
 | `ops/security-hardening.md` | working | 11733 | `0a385c7dab82` |
 | `ops/troubleshooting.md` | working | 9127 | `4921b6d50f5c` |
 | `permission-and-approval-playbook.md` | working | 6560 | `813bc74bb156` |
-| `plans/adr-0032-m1-m6-work-plan-2026-06-13.md` | archive | 52091 | `ce92504ad57b` |
+| `plans/adr-0032-m1-m6-work-plan-2026-06-13.md` | archive | 54690 | `4a6ca4a1b9b6` |
 | `plans/agent-ecosystem-acceptance-roadmap-2026-05-31.md` | archive | 29099 | `7c4a4972cfeb` |
 | `plans/community-pain-points-response-plan-2026-06-05.md` | archive | 7276 | `571d010133ad` |
 | `plans/competitive-positioning-plan-2026-05-31.md` | archive | 8726 | `d16dfd2bdd99` |
@@ -591,6 +591,7 @@ Do not edit this file manually — regenerate instead.
 | `work-log/documentation-optimization-work-items-2026-06-04.md` | archive | 11750 | `9233b40b0bce` |
 | `work-log/m4-approval-sliceB-blocked-2026-06-13.md` | archive | 7347 | `3981ed82bc08` |
 | `work-log/m4-budget-stays-inline-2026-06-13.md` | archive | 5727 | `0e7a6ee74954` |
+| `work-log/m5-hooks-observability-only-2026-06-13.md` | archive | 5000 | `8a87eaee4d15` |
 | `work-log/operator-friction-log.md` | working | 2560 | `fe79899db10f` |
 | `work-log/p0-p1-governance-implementation-ledger-2026-06-11.md` | archive | 5212 | `0b72cd69de32` |
 | `work-log/parallel-phase-0-implementation-report-2026-06-04.md` | archive | 13181 | `098186167459` |

docs/plans/adr-0032-m1-m6-work-plan-2026-06-13.md

Lines changed: 28 additions & 5 deletions
@@ -98,7 +98,10 @@ M0 accepted ADR + dual-write spine [done]
 INLINE (owner decisions B + B-analog, 2026-06-13 — both are
 runtime-stateful; plan gate is the sole interceptor gate;
 see m4-approval-sliceB-blocked + m4-budget-stays-inline reports)
--> M5 HookRegistry on spine
+-> M5 hook OBSERVABILITY onto spine (taxonomy typed + reader-
+surfaced); hook EXECUTION stays in the tool-dispatch layer
+(it mutates in-flight args/results; the 6 session-lifecycle
+hooks are unwired in prod) — see m5-hooks-observability-only
 -> M6 evidence + receipt FOLD (was M2; corrected
 scope A) — now genuinely event-typed because M2 completed
 the taxonomy and M3/M4 preserved the decision-event contract
@@ -174,7 +177,7 @@ consumers by M6.
 | ADR-0032-M2 (REDEFINED, taxonomy-only §16) | Every audit event the evidence bundle reads is typed in `RunEventType` and mapped both directions, so the M2-T001 reader surfaces it **from the audit JSONL** (mapper is sufficient; emit-site migration is NOT in M2 — it is deferred to the component milestones, §16). Covers routes, git-sandbox, skills, tests, undo, provenance, approval/tool-call decision events, cancelled/pending lifecycle. Pure additive; zero behavior change. (Old M2 "evidence/receipt fold" moved to M6 — §14.) |
 | ADR-0032-M3 | Plan gate is an interceptor using `PlanValidator`, landed parity-first (§13.3): a shadow-parity test asserting interceptor==inline per reason code went green before the inline branch was deleted in a separate commit. Denials and reason codes match current behavior; adversarial and first-hour tests remain green. |
 | ADR-0032-M4 (CLOSED — owner decisions B + B-analog, 2026-06-13) | **No gate moves to an interceptor; approval AND budget enforcement both STAY INLINE.** Both proved runtime-stateful on assessment, a poor fit for the pure-interceptor model. **Approval** (decision B): live JIT/session state, tool handler, auto-mode-swappable policy — every coupling gap was invisible to a unit parity test (`docs/work-log/m4-approval-sliceB-blocked-2026-06-13.md`). **Budget** (decision B-analog): it is three mechanisms — only the global cost cap (`_assert_cost_budget`) is stateless; the phase budget (live `phase_tracker`) and the warning ladder (`_budget_warning_levels_emitted` + `BudgetMonitor._emitted_levels`/`_prompted` dedup sets + an interactive `on_prompt` side-effect handler — the same `assert_allowed` shadow-coexistence trap that blocked approval) are stateful, and even the cost cap is enforced at two evolving-cost points per iteration that do not map 1:1 to events (`docs/work-log/m4-budget-stays-inline-2026-06-13.md`). Both gates' observability is already provided by M2 (their audit events — `tool_call_*`, `approval_*`, `budget_warning`, `budget_prompt`, `phase_budget_warning` — are typed + reader-surfaced); the M6 fold reads them without owning enforcement. Approval/budget behavior unchanged. **Net: plan gate (M3) is the sole governance gate moved to an interceptor.** |
-| ADR-0032-M5 | HookRegistry subscribes through the spine; Claude-Code-compatible hook names remain aliases; public hook API docs and tests pass. |
+| ADR-0032-M5 (REVISED — observability-only, 2026-06-13) | **Hook OBSERVABILITY folds onto the spine; hook EXECUTION stays in the tool-dispatch layer.** Assessment found the planned "HookRegistry on spine" unsuitable for the same runtime-coupling reason as approval/budget: PreToolUse/PostToolUse run in `teaagent/tools.py::execute` and **mutate in-flight `arguments`/`result`** (the spine has no channel to ferry mutated payloads back to the dispatch site), and the 6 session-lifecycle hooks (SessionStart/End, UserPromptSubmit, PreCompact, Stop, SubagentStop) have **no production caller** — nothing to strangle; wiring them is feature work. Done: the 5 dispatch-layer hook audit events (`tool_hook_pre_mutation`, `tool_hook_pre_mutation_blocked`, `tool_hook_vetoed`, `tool_hook_post_mutation`, `tool_hook_post_failed`) are typed in `RunEventType` + mapped both directions, so the M2-T001 reader surfaces hook veto/mutation activity from the audit JSONL for the M6 fold. Mapping/reader only; audit bytes unchanged; hook execution + mutation semantics unchanged. See `docs/work-log/m5-hooks-observability-only-2026-06-13.md`. |
 | ADR-0032-M6 (was M2 fold; corrected scope A) | Evidence and receipts are folded from the typed event stream and equal the legacy builder on success/failure/pending fixtures (cancelled once emitted in M2); the fold reads the full stream (no fallback flag, per Q1); synthetic receipt-only fixtures are retired or relabeled legacy. Runs only after M2 coverage + M3/M4 decision events exist. |
 | ADR-0032-M7 (was M6) | ContextBus and webhook sinks consume the spine; inline emission paths are deleted; validator shows no orphaned eventing modules. |

@@ -552,7 +555,15 @@ commit once Slice A is green.
 - Parallelizable: no.
 - Human Review Required: no.

-### ADR32-M5-T001: Hook Alias Matrix
+> **M5 REVISED — observability-only (2026-06-13).** T001 landed as a *typed
+> taxonomy* for the 5 audit events the dispatch-layer HookRegistry actually
+> emits (not an 8-name public alias matrix — 6 lifecycle hooks are unwired in
+> prod). T002 (execution bridge) is CLOSED as unsuitable: hooks mutate in-flight
+> args/results and the spine cannot ferry mutated payloads back. T003 (public
+> hook API docs) is deferred — no public hook semantics changed. See the M5
+> row in §7 and `docs/work-log/m5-hooks-observability-only-2026-06-13.md`.
+
+### ADR32-M5-T001: Hook Alias Matrix [DONE as observability taxonomy]

 - Goal: define the stable mapping from public hook names to `RunEventType`.
 - Scope: SessionStart, UserPromptSubmit, PreToolUse, PostToolUse, PreCompact,
@@ -579,7 +590,14 @@ commit once Slice A is green.
 - Parallelizable: yes after M4.
 - Human Review Required: yes for public API wording.

-### ADR32-M5-T002: HookRegistry Consumer/Interceptor Bridge
+### ADR32-M5-T002: HookRegistry Consumer/Interceptor Bridge [CLOSED — unsuitable]
+
+> **CLOSED, not implemented (2026-06-13).** PreToolUse/PostToolUse mutate
+> in-flight `arguments`/`result` in `teaagent/tools.py::execute`; the spine has
+> no channel to carry a mutated payload back to that dispatch site, so an
+> interceptor cannot replace them without losing mutation (same coupling that
+> kept approval inline). Session-lifecycle hooks have no production caller —
+> nothing to migrate. Hook execution stays in the dispatch layer.

 - Goal: run HookRegistry through EventSpine while preserving hook veto and
 consumer semantics.
@@ -607,7 +625,12 @@ commit once Slice A is green.
 - Parallelizable: no.
 - Human Review Required: no unless hook public semantics change.

-### ADR32-M5-T003: Public Hook API Documentation
+### ADR32-M5-T003: Public Hook API Documentation [DEFERRED]
+
+> **DEFERRED (2026-06-13).** No public hook semantics changed (execution stays
+> in the dispatch layer; only observability typing was added), so there is no
+> new stability contract to document here. Revisit only if the session-lifecycle
+> hooks are ever wired (separate product decision).

 - Goal: document event-spine-backed hook lifecycle and stability contract.
 - Scope: public hook names, payload shapes, ordering, veto/isolation semantics.

docs/work-log/m5-hooks-observability-only-2026-06-13.md

Lines changed: 88 additions & 0 deletions
@@ -0,0 +1,88 @@
+# M5 — HookRegistry: Observability Onto the Spine, Execution Stays in Dispatch
+
+> **Status:** observability slice DONE (taxonomy typed + reader-surfaced); the
+> enforcement-bridge half of the planned M5 is assessed UNSUITABLE for the same
+> runtime-coupling reason as approval (B) and budget (B-analog). Recommendation:
+> close M5 as **observability-only**. Owner decision pending on the bridge.
+
+## What the plan assumed vs. what the code shows
+
+The work-plan's M5 (§7 row, T002) assumed hooks run in the runner and that the
+migration would move **PreToolUse → spine interceptor**, **PostToolUse + session
+lifecycle → spine consumers**, touching `teaagent/runner/_core.py`. The
+observable code contradicts every part of that:
+
+| Hook | Production caller | Nature |
+|---|---|---|
+| PreToolUse | `teaagent/tools.py::ToolRegistry.execute` (~line 222) | **Mutates in-flight `arguments`** fed to `tool.handler`; can veto via `HookError`; destructive-tool mutation guard |
+| PostToolUse | `teaagent/tools.py::ToolRegistry.execute` (~line 291) | **Mutates the tool `result`** returned upstream |
+| SessionStart / SessionEnd / UserPromptSubmit / PreCompact / Stop / SubagentStop | **none** — only tests call `run_session_*` etc. | defined + unit-tested, **not wired into any production path** |
+
+So hooks live at the **tool-dispatch layer**, plumbed via
+`tool_registry.hook_registry` ([chat_agent.py:502](../../teaagent/chat_agent.py),
+[run_contract.py:105](../../teaagent/integration/run_contract.py)) — not the
+runner the plan named.
+
+## Finding: the enforcement bridge is unsuitable (third consecutive case)
+
+1. **PreToolUse/PostToolUse are mutating, not pure decisions.** `run_pre_hooks`
+rewrites `arguments`; the rewritten args feed `tool.handler`. `run_post_hooks`
+rewrites `result`. An EventSpine interceptor can veto (raise) but the spine
+has **no channel to carry mutated args/results back** to the dispatch site
+that consumes them. This is the same shape that kept approval inline (a gate
+that mutates in-flight state, not a pure event decision) — moving it onto the
+spine would either lose the mutation capability or require the spine to ferry
+mutable payloads back into `tools.py`, the coupling we rejected for approval.
+
+2. **The session-lifecycle hooks have no inline path to strangle.** They are not
+invoked in production. "Move them to spine consumers" would move *nothing*;
+to make them do anything you would have to **newly wire** them — that is
+feature work, not a parity-preserving migration, and out of scope for the
+strangler arc.
+
+3. Even setting (1)-(2) aside, PreToolUse runs *inside* `tool.handler` dispatch,
+after the runner's plan interceptor and inline approval already allowed the
+call — a different layer than the spine's `TOOL_CALL_REQUESTED` point.
+
+## What IS suitable, and was done
+
+The genuinely spine-shaped value is **observability** — identical to M2 and to
+the approval/budget observability that already reaches the M6 fold. The
+HookRegistry bridge already emits 5 audit events; they were untyped. Done:
+
+- Added 5 members to `RunEventType` (`teaagent/runner/_events.py`):
+`TOOL_HOOK_PRE_MUTATION`, `TOOL_HOOK_PRE_MUTATION_BLOCKED`, `TOOL_HOOK_VETOED`,
+`TOOL_HOOK_POST_MUTATION`, `TOOL_HOOK_POST_FAILED`.
+- Added the 5 bidirectional mapper entries; the M2-T001 reader now surfaces hook
+veto/mutation activity **from the audit JSONL** for the M6 fold.
+- Test `test_m5_hook_audit_events_are_typed_and_reader_surfaced` +
+round-trip completeness (`len(mapper) == len(RunEventType)` now 31).
+
+Mapping/reader only. **Hook execution is unchanged** — it stays in the dispatch
+layer, with its mutation semantics and audit emission exactly as before. Audit
+bytes are unchanged.
+
+## Recommendation
+
+**Close M5 as observability-only.** The hook taxonomy is typed and folds into
+M6; hook *enforcement/mutation* stays in `tools.py` by the same evidenced logic
+as approval (B) and budget (B-analog): the spine/interceptor model fits
+stateless, non-mutating decisions; mutating/dispatch-coupled mechanisms stay
+where they are.
+
+Separately (and out of the migration's scope): the 6 unwired session-lifecycle
+hooks (SessionStart/End, UserPromptSubmit, PreCompact, Stop, SubagentStop) are
+**defined but dead** in production. Wiring them is a product decision, not a
+spine migration — flag for the backlog, do not bundle here.
+
+## Pattern across M3-M5
+
+- **M3 plan gate** → moved to interceptor cleanly (stateless decision).
+- **M4 approval** → stays inline (runtime-stateful: JIT/handler/swappable policy).
+- **M4 budget** → stays inline (dedup state + interactive handler + multi-point).
+- **M5 hooks** → execution stays in dispatch (mutating + dead lifecycle hooks);
+observability folds onto the spine.
+
+The spine's realized value is the **read side** (typed evidence → M6 fold → M7
+consumers), not wholesale relocation of enforcement. Plan gate is the one gate
+that genuinely belonged on the spine.
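
Reviewer note: the work-log's first finding (a veto-capable interceptor cannot stand in for a mutating pre-hook) can be illustrated with a minimal sketch. Every name below (`run_pre_hooks`, `execute`, `veto_only_interceptor`, `clamp_limit`, `search`, and this local `HookError`) is a hypothetical stand-in written for this note, not the `teaagent` API, and the veto-only interceptor shape is an assumption taken from the work-log's description. Only the shape matters: the dispatch path returns rewritten arguments to the handler, while the interceptor path can raise but has no return channel.

```python
from typing import Any, Callable

Args = dict[str, Any]
PreHook = Callable[[str, Args], Args]


class HookError(Exception):
    """Raised by a hook to veto a tool call (hypothetical stand-in)."""


def run_pre_hooks(hooks: list[PreHook], tool: str, arguments: Args) -> Args:
    """Dispatch-layer shape: each hook may return rewritten arguments."""
    for hook in hooks:
        arguments = hook(tool, dict(arguments))
    return arguments


def execute(handler: Callable[..., Any], hooks: list[PreHook],
            tool: str, arguments: Args) -> Any:
    # The handler receives the arguments after the pre-hooks rewrote them.
    return handler(**run_pre_hooks(hooks, tool, arguments))


def veto_only_interceptor(hooks: list[PreHook], tool: str, arguments: Args) -> None:
    """Assumed interceptor shape: may raise to veto, returns nothing."""
    for hook in hooks:
        hook(tool, dict(arguments))  # any rewrite is discarded here


def clamp_limit(tool: str, arguments: Args) -> Args:
    """Example hook: vetoes one tool, clamps `limit` for the others."""
    if tool == "delete_all":
        raise HookError("destructive tool vetoed")
    arguments["limit"] = min(arguments.get("limit", 10), 5)
    return arguments


def search(query: str, limit: int) -> str:
    return f"{query}:{limit}"


original = {"query": "spine", "limit": 50}

# Inline dispatch: the clamped limit reaches the handler.
inline_result = execute(search, [clamp_limit], "search", original)

# Veto-only interception: the hook runs, but the handler sees the raw arguments.
veto_only_interceptor([clamp_limit], "search", original)
intercepted_result = search(**original)
```

In this sketch the veto still works on both paths, but only the inline path delivers the clamped `limit` to the handler, which is the capability the work-log says would be lost.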

teaagent/runner/_events.py

Lines changed: 19 additions & 0 deletions
@@ -71,6 +71,18 @@ class RunEventType(str, Enum):
     RUN_CANCELLED = 'run_cancelled'
     RUN_PENDING_APPROVAL = 'run_pending_approval'

+    # M5 hook-observability taxonomy (ADR 0032 M5, owner decision 2026-06-13):
+    # the audit events emitted by the HookRegistry bridge in the tool-dispatch
+    # layer (teaagent/tools.py). Typed + mapped so the M6 fold can surface hook
+    # veto/mutation activity FROM the audit JSONL. Mapping/reader only — hook
+    # *execution* stays in the dispatch layer (it mutates in-flight args/results,
+    # so it is not a pure spine interceptor; see the M5 assessment work-log).
+    TOOL_HOOK_PRE_MUTATION = 'tool_hook_pre_mutation'
+    TOOL_HOOK_PRE_MUTATION_BLOCKED = 'tool_hook_pre_mutation_blocked'
+    TOOL_HOOK_VETOED = 'tool_hook_vetoed'
+    TOOL_HOOK_POST_MUTATION = 'tool_hook_post_mutation'
+    TOOL_HOOK_POST_FAILED = 'tool_hook_post_failed'
+
     # Planned (later phases): PLAN_RESOLVED, DECISION_RECEIVED,
     # CONTEXT_COMPACTED, BUDGET_CHECKPOINT, ITERATION_COMPLETED,
     # FINAL_VALIDATION, RECEIPT_EMITTED, SESSION_START, SESSION_END,
@@ -127,6 +139,13 @@ class RunEvent:
     RunEventType.APPROVAL_DENIED: 'approval_denied',
     RunEventType.RUN_CANCELLED: 'run_cancelled',
     RunEventType.RUN_PENDING_APPROVAL: 'run_pending_approval',
+    # M5 hook-observability taxonomy — mapping only; hook execution stays in
+    # the tool-dispatch layer (teaagent/tools.py), not spine-emitted.
+    RunEventType.TOOL_HOOK_PRE_MUTATION: 'tool_hook_pre_mutation',
+    RunEventType.TOOL_HOOK_PRE_MUTATION_BLOCKED: 'tool_hook_pre_mutation_blocked',
+    RunEventType.TOOL_HOOK_VETOED: 'tool_hook_vetoed',
+    RunEventType.TOOL_HOOK_POST_MUTATION: 'tool_hook_post_mutation',
+    RunEventType.TOOL_HOOK_POST_FAILED: 'tool_hook_post_failed',
 }


tests/lifecycle/test_run_event_spine.py

Lines changed: 28 additions & 3 deletions
@@ -487,9 +487,10 @@ def decide(context: dict[str, Any]):
 def test_all_run_event_types_round_trip_through_mappers() -> None:
     """Every RunEventType member round-trips through both mapper directions.

-    Covers all 26 members (7 M0 + 19 M2 evidence-event taxonomy) — forward
-    through run_event_to_audit_event_type then back through
-    audit_event_to_run_event_type, and inverse through the dict mappers.
+    Covers all 31 members (7 M0 + 19 M2 evidence-event taxonomy + 5 M5
+    hook-observability taxonomy) — forward through run_event_to_audit_event_type
+    then back through audit_event_to_run_event_type, and inverse through the
+    dict mappers.
     """
     for event_type in RunEventType:
         aud_type = run_event_to_audit_event_type(event_type)
@@ -503,6 +504,30 @@ def test_all_run_event_types_round_trip_through_mappers() -> None:
     assert len(_AUDIT_EVENT_TO_RUN_EVENT_TYPE) == len(RunEventType)


+def test_m5_hook_audit_events_are_typed_and_reader_surfaced() -> None:
+    """M5: the 5 hook-observability audit events emitted by the dispatch-layer
+    HookRegistry bridge are typed in RunEventType and mapped both directions, so
+    the M2-T001 reader surfaces them from the audit JSONL for the M6 fold.
+
+    Hook *execution* stays in teaagent/tools.py (it mutates in-flight
+    args/results); only its observability is folded onto the spine — the same
+    taxonomy-only shape as M2 (and the approval/budget observability under the
+    decision-B/B-analog findings).
+    """
+    expected = {
+        'tool_hook_pre_mutation': RunEventType.TOOL_HOOK_PRE_MUTATION,
+        'tool_hook_pre_mutation_blocked': RunEventType.TOOL_HOOK_PRE_MUTATION_BLOCKED,
+        'tool_hook_vetoed': RunEventType.TOOL_HOOK_VETOED,
+        'tool_hook_post_mutation': RunEventType.TOOL_HOOK_POST_MUTATION,
+        'tool_hook_post_failed': RunEventType.TOOL_HOOK_POST_FAILED,
+    }
+    for audit_type, run_type in expected.items():
+        # Reader maps the audit JSONL string -> typed event (non-None => surfaced).
+        assert audit_event_to_run_event_type(audit_type) == run_type
+        # Forward mapping is exact and lossless.
+        assert run_event_to_audit_event_type(run_type) == audit_type
+
+
 # ---------------------------------------------------------------------------
 # M3-T001: PlanGateInterceptor unit tests
 # ---------------------------------------------------------------------------
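
Reviewer note: the new test's comment "non-None => surfaced" describes a reader that turns audit JSONL strings into typed events. A minimal sketch of that idea follows. It is not the M2-T001 reader: `SketchEventType`, `read_typed_events`, and the `event_type` record field are all assumptions made for illustration, since the commit does not show the JSONL schema.

```python
import json
from enum import Enum
from typing import Any, Iterable, Iterator


class SketchEventType(str, Enum):
    """Hypothetical two-member subset of the hook taxonomy."""

    TOOL_HOOK_VETOED = 'tool_hook_vetoed'
    TOOL_HOOK_POST_MUTATION = 'tool_hook_post_mutation'


def read_typed_events(
    lines: Iterable[str],
) -> Iterator[tuple[SketchEventType, dict[str, Any]]]:
    """Yield (typed event, raw record) for each JSONL line with a typed event."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the log
        record = json.loads(line)
        try:
            # 'event_type' is an assumed field name, not a documented schema.
            event_type = SketchEventType(record.get('event_type'))
        except ValueError:
            continue  # untyped audit event: the reader does not surface it
        yield event_type, record


sample = [
    '{"event_type": "tool_hook_vetoed", "tool": "shell"}',
    '{"event_type": "some_untyped_event"}',
    '',
    '{"event_type": "tool_hook_post_mutation", "tool": "search"}',
]
surfaced = [event_type for event_type, _ in read_typed_events(sample)]
```

Under these assumptions, typing an audit event is what makes it visible to a downstream fold; the emit site and the bytes written to the log do not need to change, which matches the commit's "audit bytes unchanged" claim.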
