Skip to content

Commit 4c51a16

Browse files
committed
feat(cedar-hitl): approval milestone writers + engine counters
Adds the 14 agent-side approval milestone writers (§11.1) on ``_ProgressWriter`` so Chunk 3's hook integration has a typed API instead of stringly-typed ``write_agent_milestone`` calls, and the per-task gate counter / per-container sliding-window rate limit / denial-injection queue on ``PolicyEngine`` that §6.5 requires. Why now: the hook work lands cleanly only after these surfaces exist — every code path in ``pre_tool_use_hook``'s REQUIRE_APPROVAL branch calls one of these helpers. Shipping them separately lets the hook commit be about the state machine, not the event-shape bookkeeping. Engine additions: - ``approval_gate_count`` / ``increment_approval_gate_count``: the per-task counter §12.9 bounds at ``approvalGateCap``. Session-scoped in v1; persistence tracked in §17. - ``approvals_in_last_minute`` / ``record_approval_gate_timestamp``: sliding-window rate limit (20/min/container, §12.9). Prune on read so callers see the current count without a separate tick. - ``queue_denial_injection`` / ``drain_denial_injections``: queue consumed by ``_denial_between_turns_hook`` at the next Stop seam (§6.5). Reason is pre-sanitized upstream by ``DenyTaskFn``. - ``mark_ceiling_shrinking_emitted``: emit-once latch for IMPL-26. - ``APPROVAL_RATE_LIMIT`` / ``APPROVAL_RATE_WINDOW_S`` module consts the hook imports rather than re-deriving. Milestone writers (§11.1 table, 14 agent-emitted of 15): - ``pre_approvals_loaded``, ``approval_requested``, ``approval_granted``, ``approval_denied``, ``approval_timed_out``, ``approval_stranded``, ``approval_write_failed``, ``approval_resume_failed``, ``approval_poll_degraded``, ``approval_timeout_capped``, ``approval_ceiling_shrinking``, ``approval_cap_exceeded``, ``approval_rate_limit_exceeded``, ``approval_late_win``. - ``approval_decision_recorded`` (Lambda audit) and ``approval_timeout_capped_at_submit`` (CreateTaskFn) stay on the Lambda side — Chunk 5 owns those. Each helper is a thin wrapper over ``_put_event("agent_milestone", ...)`` so the shared circuit-breaker + classifier path (finding aws-samples#6/aws-samples#8) continues to apply. Metadata keys mirror the §11.1 shapes verbatim (``maxLifetime_remaining_s`` preserves the design-doc spelling for downstream parsers). Tests: +29 total. 17 on ``TestApprovalMilestoneHelpers`` pin the DDB payload shape for each helper (including the two ``approval_timeout_capped`` reason variants — rule_annotation carries matching_rule_ids; maxLifetime_ceiling omits the field). 12 on the engine: counter monotonicity, rate-window prune semantics at window boundary, denial-queue FIFO + drain-clears, ceiling-shrinking latch idempotency. No caller changes — engine and writer surfaces are additive. Hook integration lands in commit C.
1 parent 6689ee0 commit 4c51a16

4 files changed

Lines changed: 568 additions & 1 deletion

File tree

agent/src/policy.py

Lines changed: 78 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -58,7 +58,7 @@
5858
import hashlib
5959
import json
6060
import time
61-
from collections import OrderedDict
61+
from collections import OrderedDict, deque
6262
from dataclasses import dataclass
6363
from enum import StrEnum
6464
from fnmatch import fnmatch
@@ -83,6 +83,8 @@
8383
CACHE_MAX_ENTRIES: int = 50 # §12.9: decoupled from approvalGateCap
8484
CACHE_TTL_S: float = 60.0 # §12.8 sliding-window TTL on DENIED/TIMED_OUT
8585
POLICIES_MAX_BYTES: int = 64 * 1024 # finding #12: reject blueprints > 64 KB
86+
APPROVAL_RATE_LIMIT: int = 20 # §12.9 per-container per-minute approval writes
87+
APPROVAL_RATE_WINDOW_S: float = 60.0 # sliding window paired with APPROVAL_RATE_LIMIT
8688

8789
_SEVERITY_ORDER = {"low": 0, "medium": 1, "high": 2}
8890
_DEFAULT_SEVERITY = "medium"
@@ -627,6 +629,20 @@ def __init__(
627629
)
628630
self._approval_gate_cap = approval_gate_cap
629631

632+
# §12.9 per-task gate counter + per-container sliding-window rate limit.
633+
# Both are session-scoped: the counter survives within the same task
634+
# but NOT across container restarts (the design carries-forward tracks
635+
# DDB persistence for the counter in Chunk 4/5; for now the §13.6
636+
# reconciler + approval_gate_cap bound the worst case).
637+
self._approval_gate_count: int = 0
638+
self._approvals_last_minute: deque[float] = deque()
639+
# §6.5 queue consumed by ``_denial_between_turns_hook``. Each entry
640+
# is ``{"request_id", "reason", "decided_at"}``; reason is already
641+
# sanitized by DenyTaskFn (§12.6).
642+
self._denial_injection_queue: list[dict] = []
643+
# IMPL-26: ``approval_ceiling_shrinking`` is emit-once per task.
644+
self._emitted_ceiling_shrinking: bool = False
645+
630646
# Validate task_type (non-fatal WARN to match Phase 1 behavior).
631647
from models import TaskType
632648

@@ -800,6 +816,67 @@ def allowlist(self) -> ApprovalAllowlist:
800816
def recent_decisions(self) -> RecentDecisionCache:
801817
return self._cache
802818

819+
# ---- Approval-gate counters + denial queue (§6.5, §12.9) --------------
820+
821+
@property
822+
def approval_gate_count(self) -> int:
823+
"""Session-scoped count of REQUIRE_APPROVAL gates emitted this task."""
824+
return self._approval_gate_count
825+
826+
def increment_approval_gate_count(self) -> None:
827+
"""Bump the per-task gate counter (called at row-write time)."""
828+
self._approval_gate_count += 1
829+
830+
def record_approval_gate_timestamp(self, now: float | None = None) -> None:
831+
"""Record a new approval-gate timestamp for the sliding rate-limit window."""
832+
ts = time.monotonic() if now is None else now
833+
self._approvals_last_minute.append(ts)
834+
self._prune_rate_window(ts)
835+
836+
def _prune_rate_window(self, now: float) -> None:
837+
"""Drop timestamps older than ``APPROVAL_RATE_WINDOW_S``."""
838+
cutoff = now - APPROVAL_RATE_WINDOW_S
839+
while self._approvals_last_minute and self._approvals_last_minute[0] < cutoff:
840+
self._approvals_last_minute.popleft()
841+
842+
@property
843+
def approvals_in_last_minute(self) -> int:
844+
"""Count of approval-gate writes in the last ``APPROVAL_RATE_WINDOW_S``.
845+
846+
Prunes the window before returning so callers see the current count.
847+
"""
848+
self._prune_rate_window(time.monotonic())
849+
return len(self._approvals_last_minute)
850+
851+
def queue_denial_injection(
852+
self, *, request_id: str, reason: str, decided_at: str | None
853+
) -> None:
854+
"""Append a denial-injection payload for ``_denial_between_turns_hook``.
855+
856+
Reason is expected to be pre-sanitized upstream (by ``DenyTaskFn``,
857+
§12.6). The hook is responsible for XML-escaping at injection time.
858+
"""
859+
self._denial_injection_queue.append(
860+
{"request_id": request_id, "reason": reason, "decided_at": decided_at}
861+
)
862+
863+
def drain_denial_injections(self) -> list[dict]:
864+
"""Pop and return the queued denial-injection payloads."""
865+
out = list(self._denial_injection_queue)
866+
self._denial_injection_queue.clear()
867+
return out
868+
869+
def mark_ceiling_shrinking_emitted(self) -> bool:
870+
"""Idempotency latch for ``approval_ceiling_shrinking`` (IMPL-26).
871+
872+
Returns ``True`` the first time it is called (caller should emit the
873+
milestone) and ``False`` on every subsequent call.
874+
"""
875+
if self._emitted_ceiling_shrinking:
876+
return False
877+
self._emitted_ceiling_shrinking = True
878+
return True
879+
803880
# ---- Probes + low-level evaluation ------------------------------------
804881

805882
def _probe_cedar(self, policies_text: str, tier_name: str) -> None:

agent/src/progress_writer.py

Lines changed: 202 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -608,3 +608,205 @@ def write_agent_error(self, error_type: str, message: str) -> None:
608608
"message_preview": self._preview(message),
609609
},
610610
)
611+
612+
# -- approval-gate milestones (§11.1, Chunk 3) -----------------------------
613+
#
614+
# Each helper is a thin wrapper over ``_put_event("agent_milestone", ...)``
615+
# carrying the structured metadata shape documented in §11.1. Centralizing
616+
# the milestone name + metadata keys here keeps the agent/Lambda/CLI
617+
# contract consistent — every approval_* producer goes through a named
618+
# method rather than stringly-typed ``write_agent_milestone`` calls.
619+
#
620+
# ``approval_decision_recorded`` is NOT emitted here: it is the
621+
# Lambda-side audit event (written by ApproveTaskFn / DenyTaskFn directly
622+
# to TaskEventsTable in Chunk 5). See IMPL-6.
623+
#
624+
# ``approval_timeout_capped_at_submit`` is emitted by CreateTaskFn and
625+
# also lives on the Lambda side (Chunk 5). Agent-side helpers cover the
626+
# 14 events the agent runtime emits.
627+
628+
def _put_approval_milestone(self, milestone: str, metadata: dict) -> None:
629+
"""Emit an ``agent_milestone`` with ``milestone`` + extra metadata.
630+
631+
Mirrors the shape of ``write_agent_milestone`` but preserves the
632+
structured metadata keys so consumers do not need to parse a
633+
free-form ``details`` string.
634+
"""
635+
payload: dict = {"milestone": milestone}
636+
payload.update(metadata)
637+
self._put_event("agent_milestone", payload)
638+
639+
def write_approval_pre_approvals_loaded(self, count: int, scopes: list[str]) -> None:
640+
"""Emit ``pre_approvals_loaded`` (agent init, §11.1)."""
641+
self._put_approval_milestone(
642+
"pre_approvals_loaded",
643+
{"count": count, "scopes": list(scopes)},
644+
)
645+
646+
def write_approval_requested(
647+
self,
648+
*,
649+
request_id: str,
650+
tool_name: str,
651+
input_preview: str,
652+
reason: str,
653+
severity: str,
654+
timeout_s: int,
655+
matching_rule_ids: list[str],
656+
) -> None:
657+
"""Emit ``approval_requested`` (PENDING row written, §11.1)."""
658+
self._put_approval_milestone(
659+
"approval_requested",
660+
{
661+
"request_id": request_id,
662+
"tool_name": tool_name,
663+
"input_preview": self._preview(input_preview),
664+
"reason": self._preview(reason),
665+
"severity": severity,
666+
"timeout_s": timeout_s,
667+
"matching_rule_ids": list(matching_rule_ids),
668+
},
669+
)
670+
671+
def write_approval_granted(
672+
self,
673+
*,
674+
request_id: str,
675+
scope: str,
676+
decided_at: str | None,
677+
) -> None:
678+
"""Emit ``approval_granted`` (user APPROVED, §11.1)."""
679+
self._put_approval_milestone(
680+
"approval_granted",
681+
{"request_id": request_id, "scope": scope, "decided_at": decided_at},
682+
)
683+
684+
def write_approval_denied(
685+
self,
686+
*,
687+
request_id: str,
688+
reason: str,
689+
decided_at: str | None,
690+
) -> None:
691+
"""Emit ``approval_denied`` (user DENIED, §11.1)."""
692+
self._put_approval_milestone(
693+
"approval_denied",
694+
{
695+
"request_id": request_id,
696+
"reason": self._preview(reason),
697+
"decided_at": decided_at,
698+
},
699+
)
700+
701+
def write_approval_timed_out(self, *, request_id: str, timeout_s: int) -> None:
702+
"""Emit ``approval_timed_out`` (timer expired, §11.1)."""
703+
self._put_approval_milestone(
704+
"approval_timed_out",
705+
{"request_id": request_id, "timeout_s": timeout_s},
706+
)
707+
708+
def write_approval_stranded(
709+
self,
710+
*,
711+
request_id: str,
712+
age_s: int,
713+
reason: str,
714+
) -> None:
715+
"""Emit ``approval_stranded`` (reconciler surface, §11.1)."""
716+
self._put_approval_milestone(
717+
"approval_stranded",
718+
{"request_id": request_id, "age_s": age_s, "reason": self._preview(reason)},
719+
)
720+
721+
def write_approval_write_failed(self, *, request_id: str | None, error: str) -> None:
722+
"""Emit ``approval_write_failed`` (TransactWriteItems failure, §11.1)."""
723+
self._put_approval_milestone(
724+
"approval_write_failed",
725+
{"request_id": request_id, "error": self._preview(error)},
726+
)
727+
728+
def write_approval_resume_failed(self, *, request_id: str, error: str) -> None:
729+
"""Emit ``approval_resume_failed`` (resume transaction failure, §11.1)."""
730+
self._put_approval_milestone(
731+
"approval_resume_failed",
732+
{"request_id": request_id, "error": self._preview(error)},
733+
)
734+
735+
def write_approval_poll_degraded(self, *, request_id: str, consecutive_failures: int) -> None:
736+
"""Emit ``approval_poll_degraded`` (3+ consecutive GetItem failures)."""
737+
self._put_approval_milestone(
738+
"approval_poll_degraded",
739+
{
740+
"request_id": request_id,
741+
"consecutive_failures": consecutive_failures,
742+
},
743+
)
744+
745+
def write_approval_timeout_capped(
746+
self,
747+
*,
748+
request_id: str,
749+
requested_timeout_s: int,
750+
effective_timeout_s: int,
751+
reason: str,
752+
matching_rule_ids: list[str] | None = None,
753+
) -> None:
754+
"""Emit ``approval_timeout_capped`` when min-wins clips the timeout.
755+
756+
``reason`` ∈ {rule_annotation, maxLifetime_ceiling, runtime_jwt_ceiling}.
757+
``matching_rule_ids`` populated when ``reason == "rule_annotation"``
758+
(IMPL-26).
759+
"""
760+
payload: dict = {
761+
"request_id": request_id,
762+
"requested_timeout_s": requested_timeout_s,
763+
"effective_timeout_s": effective_timeout_s,
764+
"reason": reason,
765+
}
766+
if matching_rule_ids is not None:
767+
payload["matching_rule_ids"] = list(matching_rule_ids)
768+
self._put_approval_milestone("approval_timeout_capped", payload)
769+
770+
def write_approval_ceiling_shrinking(
771+
self,
772+
*,
773+
request_id: str,
774+
max_lifetime_remaining_s: int,
775+
cleanup_margin_s: int,
776+
task_default_timeout_s: int,
777+
) -> None:
778+
"""Emit once-per-task ``approval_ceiling_shrinking`` (IMPL-26, §11.1)."""
779+
self._put_approval_milestone(
780+
"approval_ceiling_shrinking",
781+
{
782+
"request_id": request_id,
783+
"maxLifetime_remaining_s": max_lifetime_remaining_s,
784+
"cleanup_margin_s": cleanup_margin_s,
785+
"task_default_timeout_s": task_default_timeout_s,
786+
},
787+
)
788+
789+
def write_approval_cap_exceeded(self, *, request_id: str, count: int, cap: int) -> None:
790+
"""Emit ``approval_cap_exceeded`` (per-task gate-cap fired)."""
791+
self._put_approval_milestone(
792+
"approval_cap_exceeded",
793+
{"request_id": request_id, "count": count, "cap": cap},
794+
)
795+
796+
def write_approval_rate_limit_exceeded(self, *, request_id: str, rate: int, limit: int) -> None:
797+
"""Emit ``approval_rate_limit_exceeded`` (per-minute rate limit hit)."""
798+
self._put_approval_milestone(
799+
"approval_rate_limit_exceeded",
800+
{"request_id": request_id, "rate": rate, "limit": limit},
801+
)
802+
803+
def write_approval_late_win(self, *, request_id: str, outcome: str, reason: str) -> None:
804+
"""Emit ``approval_late_win`` (VM-throttle race mitigation, IMPL-24)."""
805+
self._put_approval_milestone(
806+
"approval_late_win",
807+
{
808+
"request_id": request_id,
809+
"outcome": outcome,
810+
"reason": self._preview(reason),
811+
},
812+
)

0 commit comments

Comments
 (0)