Commit 945b149

Author: bgagent
docs(cedar-hitl): restore and revise HITL gates design, fold adversarial findings
Design doc was accidentally removed in 0742ebe; restored from b34d7cd and substantially revised under a new filename. "Phase 3" framing dropped — this is the Cedar HITL approval gates feature.

- Renamed PHASE3_CEDAR_HITL.md → CEDAR_HITL_GATES.md; all "phase" gating removed (Phase 3a/3b → v1 / future work §17).
- Integrated 16 findings from 2026-05-06 adversarial review with realistic scenarios. Major structural changes:
  - Decision aws-samples#23 (new): cross-engine parity contract between cedarpy (agent, Python) and @cedar-policy/cedar-wasm@4.10.0 (Lambda, TS).
  - §11.2: SlackUserMappingTable with OAuth user-initiated mapping; severity-gated Slack approvals; admin has no write path.
  - §7.1/§12.3: ApproveTaskFn uses cross-table TransactWriteItems for atomicity.
  - §10.1: user_id-status-index GSI on TaskApprovalsTable; v1 not v-later.
  - §15.6: cedar-wasm as a Lambda layer shared across policy Lambdas.
- Gate-cap revision (2026-05-07): decision aws-samples#13 — default 50, blueprint-configurable via security.approvalGateCap (bounded 1–500), persisted on TaskTable. Cache memory bound decoupled: 50-entry LRU regardless of cap. IMPL-22 adds telemetry-driven re-evaluation criteria.
- Timeout adversarial+advocate pass (2026-05-07):
  - §6.5 VM-throttle race fix: re-read row on failed TIMED_OUT ConditionCheckFailed; honor APPROVED if user beat the timer. IMPL-24.
  - Sub-120s @approval_timeout_s emits blueprint-load WARN. IMPL-25.
  - User-visible timeout cap milestones (approval_timeout_capped_at_submit, approval_ceiling_shrinking). IMPL-26.
  - Runtime JWT: no refresh logic in agent/src/ (container uses IAM role); ceiling stays min(1h, maxLifetime_remaining - 120s). IMPL-27.
  - Three new CloudWatch metrics for timeout tuning. IMPL-28.
  - §14.8 new: off-hours trade-off section (fail-closed is the invariant).
  - §13.13 new: notification-delivery failure does NOT pause the timer (bypass-prevention).
- Added six mermaid diagrams: three-outcome decision flow, end-to-end round-trip, TaskApprovalsTable state machine, Slack user-mapping, fail-closed decision flow, cross-engine parity check.
- Cross-references updated in INTERACTIVE_AGENTS.md and SECURITY.md.
- Starlight mirror regenerated via docs/scripts/sync-starlight.mjs.

No code changes in this commit — design work only. Implementation lands in a follow-up PR per §15.2 task list.
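
The approval-timeout ceiling named in the commit message, `min(1h, maxLifetime_remaining - 120s)`, can be illustrated with a small sketch. This is not the implementation (the commit contains no code); the function names, the floor at zero, and the "capped" flag are assumptions made for illustration only.

```python
from __future__ import annotations

ONE_HOUR_S = 3600       # the "1h" term from the commit message
SAFETY_MARGIN_S = 120   # the "- 120s" term from the commit message


def approval_timeout_ceiling_s(max_lifetime_remaining_s: int) -> int:
    """Return min(1h, maxLifetime_remaining - 120s).

    Flooring the result at 0 is an assumption; the commit message does not
    say what happens when less than 120 s of lifetime remains.
    """
    return max(0, min(ONE_HOUR_S, max_lifetime_remaining_s - SAFETY_MARGIN_S))


def effective_timeout_s(requested_s: int, max_lifetime_remaining_s: int) -> tuple[int, bool]:
    """Clamp a requested timeout to the ceiling (hypothetical helper).

    The boolean reports whether the request was capped, which is the kind of
    condition a milestone such as approval_timeout_capped_at_submit might signal.
    """
    ceiling = approval_timeout_ceiling_s(max_lifetime_remaining_s)
    return min(requested_s, ceiling), requested_s > ceiling
```

With two hours of lifetime left the ceiling is the full hour; with ten minutes left it shrinks to 480 s, so a 900 s request would be capped.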
1 parent 4daf666 commit 945b149

7 files changed

Lines changed: 5106 additions & 1901 deletions

docs/design/CEDAR_HITL_GATES.md

Lines changed: 2544 additions & 0 deletions
Large diffs are not rendered by default.

docs/design/INTERACTIVE_AGENTS.md

Lines changed: 6 additions & 6 deletions
@@ -19,7 +19,7 @@ This document describes the interactivity surfaces layered on top of that model
 3. **Watch** — `bgagent watch <id>` polls `TaskEventsTable` with an adaptive interval (500 ms when events are arriving, back-off to 5 s when idle). Same endpoint used under the hood for foreground-block UX on `ask` and for HITL approval waits.
 4. **Nudge** — `bgagent nudge <id> "<text>"` writes a row into `TaskNudgesTable`. The agent reads pending nudges between turns, acknowledges with a `nudge_acknowledged` milestone event, and integrates the nudge on its next turn.
 5. **Ask** — `bgagent ask <id> "<question>"` (Phase 2) writes a question row. The agent answers at the next between-turns boundary; the answer surfaces as a `status_response` event. CLI default is foreground block-and-poll with a spinner; task and answer are both durable if the CLI disconnects.
-6. **Approval gates** — Phase 3 Cedar-driven hard gates. Agent emits `approval_requested`, waits for a decision from `bgagent approve` / `bgagent deny` or a Slack button-press. Detailed design in `PHASE3_CEDAR_HITL.md`.
+6. **Approval gates** — Phase 3 Cedar-driven hard gates. Agent emits `approval_requested`, waits for a decision from `bgagent approve` / `bgagent deny` or a Slack button-press. Detailed design in [`CEDAR_HITL_GATES.md`](./CEDAR_HITL_GATES.md).

 ### Core architectural choices
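
The adaptive interval described for `bgagent watch` in the hunk above (500 ms while events arrive, backing off to 5 s when idle) could look roughly like the following. The excerpt gives only the two endpoints, so the doubling schedule and every name here are assumptions, not the CLI's actual logic.

```python
MIN_INTERVAL_S = 0.5   # 500 ms while events are arriving (from the doc)
MAX_INTERVAL_S = 5.0   # idle ceiling of 5 s (from the doc)
BACKOFF_FACTOR = 2.0   # assumption: the doc does not state the back-off curve


def next_interval(current_s: float, got_events: bool) -> float:
    """Return the delay before the next poll.

    Any poll that returns events resets to the fast interval; an empty poll
    multiplies the delay until it reaches the idle ceiling.
    """
    if got_events:
        return MIN_INTERVAL_S
    return min(MAX_INTERVAL_S, current_s * BACKOFF_FACTOR)
```

Under these assumptions an idle task is polled at 0.5 s, 1 s, 2 s, 4 s, then every 5 s, and drops back to 0.5 s as soon as an event appears.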

@@ -222,7 +222,7 @@ Consumer: agent between-turns hook reads pending nudges, emits `nudge_acknowledg

 ### 3.7 TaskApprovalsTable (Phase 3)

-Phase 3 approval-request spine. Detailed schema in `PHASE3_CEDAR_HITL.md`. Semantics summary:
+Phase 3 approval-request spine. Detailed schema in [`CEDAR_HITL_GATES.md`](./CEDAR_HITL_GATES.md). Semantics summary:
 - Agent writes an approval row with the request context.
 - Agent transitions `RUNNING → AWAITING_APPROVAL` and enters a poll loop.
 - User responds via REST (`POST /tasks/{id}/approvals/{request_id}`) or via a Slack button dispatched by the notification plane.
@@ -409,7 +409,7 @@ Flags:

 ### 5.6 `bgagent approve` / `deny` / `pending` / `policies` (Phase 3)

-HITL approval commands. All flows are REST + DDB; no streaming. Detailed design in `PHASE3_CEDAR_HITL.md`. Summary:
+HITL approval commands. All flows are REST + DDB; no streaming. Detailed design in [`CEDAR_HITL_GATES.md`](./CEDAR_HITL_GATES.md). Summary:

 - Agent emits `approval_required` with the tool context.
 - Notification plane dispatches the event (Slack with action buttons, email, GitHub).
@@ -550,7 +550,7 @@ RUNNING ──▶ AWAITING_APPROVAL ──▶ RUNNING (approve or deny-with-s
 └──▶ FAILED (stranded reconciler catches abandoned approval)
 ```

-The `AWAITING_APPROVAL` state holds the user's concurrency slot (paused but alive). See `PHASE3_CEDAR_HITL.md` for full semantics.
+The `AWAITING_APPROVAL` state holds the user's concurrency slot (paused but alive). See [`CEDAR_HITL_GATES.md`](./CEDAR_HITL_GATES.md) for full semantics.

 ### 8.3 Write rules
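
The transitions visible in the hunk above (`RUNNING` to `AWAITING_APPROVAL`, then back to `RUNNING` or on to `FAILED`) can be expressed as a small guard table. This sketch models only what the excerpt shows; the real state machine has states and transitions not visible here, and the names `ALLOWED_TRANSITIONS`, `IllegalTransition`, and `transition` are hypothetical.

```python
from __future__ import annotations

# Only the edges that appear in this diff excerpt; the full table lives in
# the design doc and is not reproduced here.
ALLOWED_TRANSITIONS: dict[str, frozenset[str]] = {
    "RUNNING": frozenset({"AWAITING_APPROVAL"}),
    "AWAITING_APPROVAL": frozenset({"RUNNING", "FAILED"}),
}


class IllegalTransition(ValueError):
    """Raised when a status change is not in the allowed table."""


def transition(current: str, target: str) -> str:
    """Return the new status, or raise if the edge is not allowed."""
    if target not in ALLOWED_TRANSITIONS.get(current, frozenset()):
        raise IllegalTransition(f"{current} -> {target} is not allowed")
    return target
```

A guard of this shape keeps an abandoned approval from moving anywhere except back to `RUNNING` or to `FAILED`, which matches the stranded-reconciler note in the diagram.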

@@ -733,7 +733,7 @@ Opt-in per task: 4 KB previews + full trajectory to S3 with TTL.
 - Hard-gate approval gates with Cedar policy evaluation
 - `bgagent approve` / `deny` / `pending` / `policies`
 - `AWAITING_APPROVAL` state + orchestrator handling
-- Full design in `PHASE3_CEDAR_HITL.md`
+- Full design in [`CEDAR_HITL_GATES.md`](./CEDAR_HITL_GATES.md)

 ### Phase 4 — Dispatcher polish

@@ -746,7 +746,7 @@ Opt-in per task: 4 KB previews + full trajectory to S3 with TTL.
 ### Deferred

 - **LLM-synthesized status summary** — `bgagent ask` without targeting the agent; Lambda calls an LLM to narrate state. Cost + hallucination trade-offs; revisit if v1 feedback warrants.
-- **Cedar `effect: "advise"` tier** — non-blocking FYI policy tier for post-v1. Design sketch in `PHASE3_CEDAR_HITL.md`.
+- **Cedar `effect: "advise"` tier** — non-blocking FYI policy tier for post-v1. Design sketch in [`CEDAR_HITL_GATES.md`](./CEDAR_HITL_GATES.md).
 - **Outbound WebSocket from agent** — only if a concrete sub-200 ms latency requirement surfaces. Agent-initiated egress avoids dual-auth problems and works on any compute.
 - **Multi-user watch** — multiple users attached to the same task's live event stream (teams).

docs/design/SECURITY.md

Lines changed: 1 addition & 1 deletion
@@ -71,7 +71,7 @@ The blueprint framework ([REPO_ONBOARDING.md](./REPO_ONBOARDING.md)) allows per-

 **Deployment control** - Custom steps are defined in the `Blueprint` CDK construct and deployed via `cdk deploy`. Only principals with CDK deployment permissions can add or modify them. There is no runtime API for custom step CRUD.

-The **same deploy-only property extends to `Blueprint.security.cedarPolicies`** — user-authored Cedar policies live in the CDK source, are typed as `readonly string[]` on the construct, and reach `RepoTable` only through a CloudFormation custom resource invoked at deploy time. Phase 3 (Cedar-driven HITL approval gates see [`PHASE3_CEDAR_HITL.md`](./PHASE3_CEDAR_HITL.md)) is load-bearing on this property: the engine treats Cedar policies loaded at task start as trusted content. If the blueprint model ever changes to accept user-uploaded policy text via an API path, Phase 3's §12 trust model must be re-evaluated (add per-blueprint policy count cap, per-eval timeout, size cap).
+The **same deploy-only property extends to `Blueprint.security.cedarPolicies`** — user-authored Cedar policies live in the CDK source, are typed as `readonly string[]` on the construct, and reach `RepoTable` only through a CloudFormation custom resource invoked at deploy time. The Cedar-driven HITL approval gates feature (see [`CEDAR_HITL_GATES.md`](./CEDAR_HITL_GATES.md)) is load-bearing on this property: the engine treats Cedar policies loaded at task start as trusted content. If the blueprint model ever changes to accept user-uploaded policy text via an API path, the §12 trust model in that doc must be re-evaluated (add per-blueprint policy count cap, per-eval timeout, size cap).

 **Input filtering** - The framework strips credential ARNs (`github_token_secret_arn`) and networking configuration (`egress_allowlist`) from the config before passing it to custom Lambda steps. If a custom step needs secrets, it must declare them explicitly and the operator must grant IAM permissions.
