feat: close all review residuals (F1 hooks folded, F2 auto-guard, F3 pub/sub detection)
"Fix all remaining" — closes the three documented residuals from the post-migration
review end-to-end.
F1 (hooks now folded into evidence): added HookActivityRecord +
RunEvidenceBundle.hook_activity + extract_hook_activity(), wired into
_assemble_evidence_bundle. Hook veto/mutation activity now appears in the bundle/
receipt and folds through the typed stream (the M5 typing was the prerequisite).
Makes the original M5 claim true. Bundle gains a 'hook_activity' key; verified safe
across all consumers (no strict key-set assertion; receipt formatting unaffected).
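
A minimal sketch of the shape F1 describes, not the actual implementation. The names `HookActivityRecord` and `extract_hook_activity` and the five `tool_hook_*` event types come from this commit; the record fields, the `payload`/`tool_name` keys, and the raw-dict event shape are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Iterable, List, Mapping, Optional

# The five dispatch-layer hook audit event types named in the plan (M5 row).
HOOK_AUDIT_TYPES = frozenset({
    "tool_hook_pre_mutation",
    "tool_hook_pre_mutation_blocked",
    "tool_hook_vetoed",
    "tool_hook_post_mutation",
    "tool_hook_post_failed",
})


@dataclass(frozen=True)
class HookActivityRecord:
    # Field names are hypothetical; the real record may carry different data.
    event_type: str
    tool_name: Optional[str]
    created_at: Optional[str]


def extract_hook_activity(
    events: Iterable[Mapping[str, Any]],
) -> List[HookActivityRecord]:
    """Collect hook veto/mutation events, preserving stream order."""
    records: List[HookActivityRecord] = []
    for event in events:
        event_type = event.get("event_type")
        if event_type not in HOOK_AUDIT_TYPES:
            continue  # Not hook activity; other extractors handle it.
        payload = event.get("payload") or {}  # Assumed key; tolerate None.
        records.append(
            HookActivityRecord(
                event_type=event_type,
                tool_name=payload.get("tool_name"),
                created_at=event.get("created_at"),
            )
        )
    return records
```

Because the extractor only reads events and returns a new list, adding its output under a `hook_activity` key stays additive: existing bundle fields are untouched.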
F2 (cutover guard can no longer go stale): replaced the hand-maintained extractor-
type list with an AST auto-discovery in validate_event_spine_wiring.py
(check_evidence_extractor_types_typed) — it finds the event_type literals
run_evidence/proof_of_use compare against (==, inline `in {...}`, and `in NAME`
for module-level frozenset/set constants incl. annotated ones) and asserts each is
in RunEventType. Caught a real blind spot mid-implementation: annotated constants
(_HOOK_AUDIT_TYPES) were initially missed.
F3 (pub/sub bus shape detected): the orphan-bus scan now also flags the
subscribe+emit pair, so a RunEventStream-shaped bus is caught; RunEventStream added
to the allowlist. Validator docstring + ADR narrowed the documented limitation.
Validator now runs three checks (A taxonomy closure, B no-orphan-bus, C evidence-
type coverage); all in the check-event-spine-wiring pre-commit hook. Docs (plan §7
M5/M7 rows, FOLD/T003 tickets, ADR realized-architecture) updated to RESOLVED.
Constraint: F1 adds an additive bundle field (no behavior change to existing fields); F2/F3 are read-only static guards; remaining limitation (novel-naming buses, exotic dynamic event-type lookups) documented honestly.
Tested: tests/test_event_spine_wiring.py + test_run_evidence.py + lifecycle spine 53 passed; broad consumer regression 168 passed; validator exits 0, flags seeded subscribe+emit bus + seeded untyped evidence type.
Not-tested: full suite not run on 3.12 (hypothesis missing in 3.14 sandbox).
Confidence: high
Roadmap-Status: unchanged
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
docs/plans/adr-0032-m1-m6-work-plan-2026-06-13.md
+13 -11 (13 additions & 11 deletions)
@@ -177,9 +177,9 @@ consumers by M6.
| ADR-0032-M2 (REDEFINED, taxonomy-only §16) | Every audit event the evidence bundle reads is typed in `RunEventType` and mapped both directions, so the M2-T001 reader surfaces it **from the audit JSONL** (mapper is sufficient; emit-site migration is NOT in M2 — it is deferred to the component milestones, §16). Covers routes, git-sandbox, skills, tests, undo, provenance, approval/tool-call decision events, cancelled/pending lifecycle. Pure additive; zero behavior change. (Old M2 "evidence/receipt fold" moved to M6 — §14.) |
| ADR-0032-M3 | Plan gate is an interceptor using `PlanValidator`, landed parity-first (§13.3): a shadow-parity test asserting interceptor==inline per reason code went green before the inline branch was deleted in a separate commit. Denials and reason codes match current behavior; adversarial and first-hour tests remain green. |
| ADR-0032-M4 (CLOSED — owner decisions B + B-analog, 2026-06-13) | **No gate moves to an interceptor; approval AND budget enforcement both STAY INLINE.** Both proved runtime-stateful on assessment, a poor fit for the pure-interceptor model. **Approval** (decision B): live JIT/session state, tool handler, auto-mode-swappable policy — every coupling gap was invisible to a unit parity test (`docs/work-log/m4-approval-sliceB-blocked-2026-06-13.md`). **Budget** (decision B-analog): it is three mechanisms — only the global cost cap (`_assert_cost_budget`) is stateless; the phase budget (live `phase_tracker`) and the warning ladder (`_budget_warning_levels_emitted` + `BudgetMonitor._emitted_levels`/`_prompted` dedup sets + an interactive `on_prompt` side-effect handler — the same `assert_allowed` shadow-coexistence trap that blocked approval) are stateful, and even the cost cap is enforced at two evolving-cost points per iteration that do not map 1:1 to events (`docs/work-log/m4-budget-stays-inline-2026-06-13.md`). Both gates' observability is already provided by M2 (their audit events — `tool_call_*`, `approval_*`, `budget_warning`, `budget_prompt`, `phase_budget_warning` — are typed + reader-surfaced); the M6 fold reads them without owning enforcement. Approval/budget behavior unchanged. **Net: plan gate (M3) is the sole governance gate moved to an interceptor.** |
-
| ADR-0032-M5 (REVISED — observability-only, 2026-06-13) | **Hook OBSERVABILITY folds onto the spine; hook EXECUTION stays in the tool-dispatch layer.** Assessment found the planned "HookRegistry on spine" unsuitable for the same runtime-coupling reason as approval/budget: PreToolUse/PostToolUse run in `teaagent/tools.py::execute` and **mutate in-flight `arguments`/`result`** (the spine has no channel to ferry mutated payloads back to the dispatch site), and the 6 session-lifecycle hooks (SessionStart/End, UserPromptSubmit, PreCompact, Stop, SubagentStop) have **no production caller** — nothing to strangle; wiring them is feature work. Done: the 5 dispatch-layer hook audit events (`tool_hook_pre_mutation`, `tool_hook_pre_mutation_blocked`, `tool_hook_vetoed`, `tool_hook_post_mutation`, `tool_hook_post_failed`) are typed in `RunEventType` + mapped both directions, so the M2-T001 reader can surface them as typed RunEvents. **Correction (post-migration review F1):** this is typing + reader-visibility ONLY — it is NOT yet folded into evidence/receipts. No evidence extractor reads `tool_hook_*` and `RunEvidenceBundle` has no hooks field, so hook veto/mutation activity does not currently appear in any bundle/receipt. Surfacing it would need a new `RunEvidenceBundle` hooks field + extractor (backlog). Mapping/reader only; audit bytes unchanged; hook execution + mutation semantics unchanged. See `docs/work-log/m5-hooks-observability-only-2026-06-13.md`. |
+
| ADR-0032-M5 (REVISED — observability-only, 2026-06-13) | **Hook OBSERVABILITY folds onto the spine; hook EXECUTION stays in the tool-dispatch layer.** Assessment found the planned "HookRegistry on spine" unsuitable for the same runtime-coupling reason as approval/budget: PreToolUse/PostToolUse run in `teaagent/tools.py::execute` and **mutate in-flight `arguments`/`result`** (the spine has no channel to ferry mutated payloads back to the dispatch site), and the 6 session-lifecycle hooks (SessionStart/End, UserPromptSubmit, PreCompact, Stop, SubagentStop) have **no production caller** — nothing to strangle; wiring them is feature work. Done: the 5 dispatch-layer hook audit events (`tool_hook_pre_mutation`, `tool_hook_pre_mutation_blocked`, `tool_hook_vetoed`, `tool_hook_post_mutation`, `tool_hook_post_failed`) are typed in `RunEventType` + mapped both directions, so the M2-T001 reader can surface them as typed RunEvents. **Update (review F1 RESOLVED, 2026-06-14):** initially this was typing + reader-visibility only; the "fold" claim was hollow because no extractor read `tool_hook_*`. Now fixed end-to-end: added `HookActivityRecord` + `RunEvidenceBundle.hook_activity` + `extract_hook_activity()`, wired into `_assemble_evidence_bundle`, so hook veto/mutation activity now appears in the bundle (and folds through the typed stream — the M5 typing was the prerequisite). Audit bytes unchanged; hook execution + mutation semantics unchanged. See `docs/work-log/m5-hooks-observability-only-2026-06-13.md`. |
| ADR-0032-M6 (was M2 fold; corrected scope A) — **COMPLETE (FOLD-T001 + T002)** | Evidence and receipts are folded from the typed event stream and equal the legacy builder on success/failure/pending fixtures (cancelled once emitted in M2); the fold reads the full stream (no fallback flag, per Q1). **FOLD-T001**: `build_evidence_from_events()` parallel builder sharing `_assemble_evidence_bundle` with the legacy path (cannot drift; only the event *source* differs), parity-asserted (`tests/test_run_evidence.py::test_m6_fold_*`). Fixed a structural gap: the typed `RunEvent` was lossy — dropped top-level `created_at` (threaded into command/test/approval timestamps); added optional `RunEvent.created_at`, reader populates it. **FOLD-T002 (cutover DONE)**: `build_run_evidence_bundle` now routes production evidence THROUGH the typed reader + fold — the typed stream is the production path; the raw-dict assembly survives only as the shared helper (so the two cannot diverge). Suite-wide green (evidence/receipt/summary/5-min-proof/first-hour/adversarial + all bundle consumers, ~218 tests). **Finding: no synthetic receipt-only fixtures existed to retire** — the receipt/evidence path was already event-backed (`test_run_receipt.py` writes real RunStore events; `test_real_run_receipt_completeness_from_plan` validates a real run); direct `RunEvidenceBundle(...)` constructions are legitimate downstream-consumer/checker unit tests, not masking fixtures. The plan anticipated a gap that does not exist. Parity test re-anchored against `_assemble_evidence_bundle` (the raw-dict path) so it stays meaningful post-cutover. |
-
| ADR-0032-M7 (was M6) — **COMPLETE as guard + document, 2026-06-13** | Original goal ("ContextBus + webhook consume the spine; delete inline eventing") **NOT done — it is a regression or vacuous.** Webhook is an `audit.add_sink` already fed transitively by the M1 spine→audit consumer; a *direct* spine consumer would see only the spine-emitted subset (coverage regression). ContextBus + integration `RunEventStream` are **unwired in production** (no callers) — nothing to migrate. The inline `audit.record` calls are the **complete event record** (read by evidence/receipts/webhook), not redundant eventing to delete. **Done instead (owner: guard + document):** `scripts/validate_event_spine_wiring.py` + `tests/test_event_spine_wiring.py` enforce the realized invariant — one typed lifecycle path (EventSpine→audit consumer), an allowlist of sanctioned event-delivery surfaces so a NEW competing lifecycle bus fails the gate, and taxonomy closure (no RunEventType orphaned from the audit record). Added as a pre-commit hook. ADR 0032 "Realized architecture (M1–M7)" section documents the outcome. **MIGRATION COMPLETE.** |
+
| ADR-0032-M7 (was M6) — **COMPLETE as guard + document, 2026-06-13** | Original goal ("ContextBus + webhook consume the spine; delete inline eventing") **NOT done — it is a regression or vacuous.** Webhook is an `audit.add_sink` already fed transitively by the M1 spine→audit consumer; a *direct* spine consumer would see only the spine-emitted subset (coverage regression). ContextBus + integration `RunEventStream` are **unwired in production** (no callers) — nothing to migrate. The inline `audit.record` calls are the **complete event record** (read by evidence/receipts/webhook), not redundant eventing to delete. **Done instead (owner: guard + document):** `scripts/validate_event_spine_wiring.py` + `tests/test_event_spine_wiring.py` enforce the realized invariant with three checks — (A) taxonomy closure (no RunEventType orphaned from the audit record); (B) no orphaned event bus (allowlist of sanctioned surfaces; high-signal methods **plus** the subscribe+emit pub/sub pair so the RunEventStream shape is caught — review F3); (C) evidence-extractor type coverage (AST-discovers the event_type literals run_evidence/proof_of_use read and asserts each is typed, so the M6 cutover can't silently drop evidence — review F2). Added as a pre-commit hook. ADR 0032 "Realized architecture (M1–M7)" section documents the outcome. **MIGRATION COMPLETE.** |
## 8. Task Plan
@@ -725,15 +725,17 @@ commit once Slice A is green.