Commit 7cb78b8

feat(cedar-hitl): PreToolUse three-outcome REQUIRE_APPROVAL path

Wires the agent to the full §6.5 pseudocode: cap + rate-limit check, atomic TransactWriteItems for pending row + TaskTable AWAITING_APPROVAL, 2s→5s ConsistentRead poll, IMPL-24 VM-throttle race re-read, resume transition, scope propagation to allowlist, and denial-injection queue consumed at the next Stop seam. Completes §15.2 rows 26 + 27.

Hook control flow (three outcomes)
----------------------------------
- ALLOW / DENY: existing Phase 1 behavior, now switching on ``.outcome`` rather than ``.allowed``. Legacy Phase 1/2 tests still green because PolicyDecision preserves the ``.allowed`` shim.
- REQUIRE_APPROVAL (new): extracted into ``_handle_require_approval`` for readability. Delegates to ``task_state`` primitives and ``engine.*`` counter surfaces from the prior commits; no new DDB client construction here.

Key pieces:
- ``_compute_effective_timeout`` applies the §6.5 min(rule, default, lifetime) formula. The engine's ``_merge_annotations`` has already clipped decision.timeout_s against the task default; the hook adds the remaining-lifetime ceiling and floors at FLOOR_30S. ``clip_reason`` distinguishes ``rule_annotation`` (rule was tighter than task default) from ``maxLifetime_ceiling`` (task is late in its life) so ``approval_timeout_capped`` carries the right reason.
- ``_remaining_maxlifetime_s`` reads ``AGENTCORE_MAX_LIFETIME_S`` + ``TASK_STARTED_AT`` env vars (8h default). Returns ``None`` when the start timestamp is absent; the hook treats that as "unknown, don't clip" rather than pre-DENYing, so Phase 1 test paths that don't set the env var still see the old task-default behavior. Chunk 4/5 will wire these at task launch.
- ``_poll_for_decision`` uses 2s cadence for the first 30s then 5s (IMPL-12). All polls use ``ConsistentRead=True`` per IMPL-24. 3 consecutive GetItem failures emit ``approval_poll_degraded``; 10 consecutive failures fall through as TIMED_OUT with a specific reason (§13.2).
- ``_reconcile_late_decision`` implements IMPL-24 re-read: on a ConditionCheckFailed from the TIMED_OUT write, re-read with ConsistentRead. APPROVED → rebuild outcome, propagate scope to allowlist, run normal allow flow, emit ``approval_late_win``. DENIED → honor the user's sanitized reason. PENDING or row gone → fall through with TIMED_OUT (fail-closed, §13.12 last paragraph).

Cancel-wins semantics (finding aws-samples#2)
----------------------------------
``_denial_between_turns_hook`` is registered AFTER ``_nudge_between_turns_hook`` in ``between_turns_hooks`` so cancel short-circuits both. The hook re-checks ``_cancel_requested`` itself as belt-and-braces (matching the nudge hook) so a future reorder does not silently break cancel-wins. Denial queue is PRESERVED on cancel (not drained) so a denial still sitting on the queue when the task is being torn down does not leak across tasks (the engine is per-task per §IMPL-7).

``stop_hook`` threads ``engine`` into ``ctx`` so the denial hook can ``drain_denial_injections``. ``build_hook_matchers`` accepts a new ``user_id`` kwarg (§12.2) so approval rows carry caller identity for the REST side's ownership check.

``permissionDecisionReason`` guaranteed surface
-----------------------------------------------
The hook's deny return is the ONLY guaranteed surface the SDK emits to the agent; denial injection is best-effort (pre-empted by cancel). ``_deny_response`` pipes every reason through ``_strip_ansi`` + ``_truncate(500)``: ANSI sequences can never reach the model, and the line stays loggable. §12.7 requirement.

Tests: +24 agent hook tests (47 total in test_hooks.py). Run in 0.92s via a ``_fast_poll`` fixture that collapses ``asyncio.sleep`` to a no-op AND advances ``hooks.time.monotonic`` by the requested duration so the poll wall-clock deadline actually trips.

Happy paths:
- APPROVED + scope propagation to allowlist + milestones.
- APPROVED with scope=this_call does NOT grow allowlist.
- DENIED queues denial injection + populates recent-decision cache (next identical call auto-denies).
- TIMED_OUT writes TIMED_OUT row and emits approval_timed_out.

IMPL-24 race: four branches.
- APPROVED re-read → allow flow, approval_late_win milestone, scope propagated, resume succeeds.
- DENIED re-read → deny flow, approval_late_win milestone, user's reason is the permissionDecisionReason.
- Still-PENDING re-read → fail-closed fall-through (no late_win).
- Row-gone re-read → same fail-closed fall-through.

Cap / rate-limit / write failure / resume failure branches all:
- Short-circuit before any DDB write when the local guard fires (cap, rate limit).
- Emit the right approval_* milestone.
- Return DENY with a specific permissionDecisionReason.

Sanitization:
- ANSI stripped from deny reason.
- Deny reason truncated to ≤500 chars.

Timeout clipping:
- rule_annotation reason when a rule's approval_timeout_s is below the task default; matching_rule_ids populated.
- maxLifetime_ceiling reason when remaining lifetime is the tightest bound; matching_rule_ids is None.
- approval_ceiling_shrinking emits exactly once per task (IMPL-26 latch).

Denial injection hook (6 tests):
- Draining produces a <user_denial request_id=... decided_at=...> block with XML-escaped reason.
- Cancel short-circuit preserves the queue so the denial is not lost; just not injected into a dying agent.
- Hostile reason (</user_denial>...<user_nudge>) is XML-escaped so the envelope cannot be forged.
- No-engine ctx returns [] (Phase 1 call sites still work).
- Registered LAST in ``between_turns_hooks`` (invariant for §6.5 finding aws-samples#2).
- End-to-end via stop_hook: queued denial becomes ``decision=block`` + reason on the Stop return.

Carry-forward
-------------
- ``_remaining_maxlifetime_s`` returns None when TASK_STARTED_AT is unset; Chunk 4/5 will wire this at task launch. Tracked in §16.
- ``approval_gate_count`` lives on the engine (session-scoped) not on TaskTable in v1. §13.6 notes that the reconciler + approval_gate_cap still bound worst-case across container restarts. Chunk 7+ tracks persistence when telemetry justifies it.
- Denial injection emits a ``user_denial_injected`` milestone that is NOT in the §11.1 enumerated table. It mirrors ``nudge_acknowledged`` for stream visibility; keep the name distinct from the ``approval_*`` prefix so future §11.1 consumers can't confuse it with an approval outcome.
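
Illustrative sketch (not the code in this commit): the timeout clipping and poll cadence described under "Key pieces" could look roughly like the following. The function names, signatures, the ``EffectiveTimeout`` tuple, and the value 30 for ``FLOOR_30S`` are assumptions made for illustration; only the min(rule, default, lifetime) formula, the two ``clip_reason`` strings, the "None means don't clip" rule, and the 2s-then-5s cadence come from the message above. Treating exactly 30s elapsed as the start of the 5s cadence is also an assumption.

```python
from __future__ import annotations

from typing import NamedTuple, Optional

# Assumed value: the commit only names the constant FLOOR_30S.
FLOOR_30S = 30.0


class EffectiveTimeout(NamedTuple):
    """Hypothetical return shape: the timeout plus why it was clipped."""
    timeout_s: float
    clip_reason: Optional[str]


def compute_effective_timeout(
    rule_timeout_s: float,
    task_default_s: float,
    remaining_lifetime_s: Optional[float],
) -> EffectiveTimeout:
    """min(rule, default, lifetime), floored at FLOOR_30S."""
    timeout = min(rule_timeout_s, task_default_s)
    # The rule annotation was tighter than the task default.
    reason = "rule_annotation" if rule_timeout_s < task_default_s else None

    # None means the start timestamp is unknown, so do not clip.
    if remaining_lifetime_s is not None and remaining_lifetime_s < timeout:
        timeout = remaining_lifetime_s
        reason = "maxLifetime_ceiling"

    return EffectiveTimeout(max(timeout, FLOOR_30S), reason)


def poll_interval_s(elapsed_s: float) -> float:
    """2s cadence for the first 30s of polling, 5s afterwards."""
    return 2.0 if elapsed_s < 30.0 else 5.0


if __name__ == "__main__":
    # Rule tighter than the task default, lifetime unknown.
    print(compute_effective_timeout(120.0, 300.0, None))
    # Task late in its life: remaining lifetime is the tightest bound.
    print(compute_effective_timeout(300.0, 300.0, 90.0))
    print(poll_interval_s(10.0), poll_interval_s(45.0))
```

In this sketch a remaining lifetime below the floor still yields the floor value with ``maxLifetime_ceiling`` as the reason; whether the real hook reports it that way is not stated in the message.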
1 parent 3cd4b3b commit 7cb78b8

2 files changed

Lines changed: 1624 additions & 45 deletions
