Commit ab8cb57

bgagent committed
chore(docs): update roadmap inconsistencies
1 parent ce774c4 commit ab8cb57

2 files changed

Lines changed: 24 additions & 24 deletions

docs/guides/ROADMAP.md

Lines changed: 12 additions & 12 deletions
@@ -11,7 +11,7 @@ What's shipped and what's coming next.
 - [x] **Durable orchestrator** - Lambda Durable Functions with checkpoint/resume; survives transient failures up to 9 hours
 - [x] **Task state machine** - SUBMITTED → HYDRATING → RUNNING → AWAITING_APPROVAL (Cedar HITL) → FINALIZING → COMPLETED / FAILED / CANCELLED / TIMED_OUT
 - [x] **Concurrency control** - Per-user limits (default 3) with atomic admission and automated drift reconciliation
-- [x] **Stranded task reconciler** - Scheduled Lambda detects tasks stuck in non-terminal states and drives them to failure with proper cleanup
+- [x] **Stranded task reconciler** - Scheduled Lambda detects tasks stuck in `SUBMITTED`, `HYDRATING`, or `AWAITING_APPROVAL` and drives them to failure with proper cleanup
 - [x] **Idempotency** - `Idempotency-Key` header on POST requests (24-hour TTL)
 
 ### Task types
@@ -22,9 +22,9 @@ What's shipped and what's coming next.
 
 ### Onboarding and customization
 
-- [x] **Blueprint construct** - Per-repo CDK configuration (model, turns, budget, prompt overrides, egress, GitHub token)
+- [x] **Blueprint construct** - Per-repo CDK configuration (model, max turns, prompt overrides, egress allowlist, GitHub token, Cedar policies, approval gate cap)
 - [x] **Repo-level project config** - Agent loads `CLAUDE.md`, `.claude/rules/`, `.claude/settings.json`, `.mcp.json`
-- [x] **Per-repo overrides** - Model ID, max turns, max budget, system prompt overrides, poll interval, dedicated token
+- [x] **Per-repo overrides** - Model ID, max turns, max budget (per-task request and/or `RepoTable`; Blueprint CDK default pending), system prompt overrides, poll interval, dedicated token
 
 ### Security
 
@@ -33,9 +33,10 @@ What's shipped and what's coming next.
 - [x] **Input guardrails** - Bedrock Guardrails screen task descriptions and PR/issue content (fail-closed)
 - [x] **Output screening** - Regex-based secret/PII scanner with PostToolUse hook redaction
 - [x] **Content sanitization** - HTML stripping, injection pattern neutralization, control character removal
-- [x] **Cedar policy engine and HITL gates** - Tool-call governance (allow / hard-deny / soft-deny requiring approval) with fail-closed default, per-repo Cedar policies, `AWAITING_APPROVAL` state, `bgagent approve` / `deny` / `pending` / `policies`, and REST approval APIs. See [CEDAR_HITL_GATES.md](../design/CEDAR_HITL_GATES.md)
+- [x] **Cedar policy engine and HITL gates** - Tool-call governance (allow / hard-deny / soft-deny requiring approval) with fail-closed default, per-repo Cedar policies, submit-time `initial_approvals`, `AWAITING_APPROVAL` state, `bgagent approve` / `deny` / `pending` / `policies`, and REST approval APIs. Stranded approvals in `AWAITING_APPROVAL` are cleared by the stranded-task reconciler. See [CEDAR_HITL_GATES.md](../design/CEDAR_HITL_GATES.md)
 - [x] **WAF** - Managed rule groups + rate-based rule (1,000 req/5 min/IP)
 - [x] **Pre-flight checks** - GitHub API reachability, repo access, token permissions (fail-closed)
+- [x] **Per-session IAM scoping** - Agent assumes a per-task **SessionRole** via `sts:AssumeRole` with session tags `{user_id, repo, task_id}` and refreshable credentials (1-hour role-chaining cap; tasks up to 8 h). Tenant-data DynamoDB access uses `dynamodb:LeadingKeys = ${aws:PrincipalTag/task_id}`; S3 traces/attachments use a `${aws:PrincipalTag/user_id}` prefix. Bedrock model invocation still uses the compute role (see **Bedrock IAM session-tag attribution** under What's next). See [SECURITY.md](../design/SECURITY.md)
 - [x] **Model invocation logging** - Full prompt/response audit trail (90-day retention)
 
 ### Memory and learning
@@ -116,7 +117,6 @@ Planned capabilities, grouped by theme. Items are independent and may ship in an
 
 | Capability | Description |
 |------------|-------------|
-| ~~**Per-session IAM scoping**~~ | **Implemented.** Each task's agent assumes a per-task **SessionRole** via `sts:AssumeRole` with session tags (`user_id`, `repo`, `task_id`) using a refreshable provider (1-hour role-chaining cap; tasks run to 8 h). Tenant-data access is moved off the long-lived compute role: DynamoDB item access on the four `task_id`-partitioned tables is gated by a `dynamodb:LeadingKeys = ${aws:PrincipalTag/task_id}` condition (the enforceable boundary — leading-keys binds to the base-table partition key, not a GSI), and S3 trace/attachment access is scoped to a `${aws:PrincipalTag/user_id}` prefix. Bedrock invocation is ARN-scoped on both the AgentCore runtime and the ECS task role. Backend-agnostic. Eliminates cross-task blast radius from a compromised agent session. See [SECURITY.md](../design/SECURITY.md). |
 | **Per-repo GitHub credentials** | GitHub App per org/repo via AgentCore Token Vault. Auto-refresh for long sessions. Sets the pattern for GitLab, Jira, Slack integrations. |
 | **Principal-to-repo authorization** | Map Cognito identities to allowed repository sets. Users can only trigger work on authorized repos. |
 | **End-to-end task attribution** | Propagate `task_id`, `user_id`, and trace context consistently across orchestrator logs, agent OpenTelemetry, GitHub/API calls, and `TaskEvents` so every downstream action is attributable in incident response (aligns with Zero Trust agent-identity guidance). |
@@ -180,16 +180,16 @@ Planned capabilities, grouped by theme. Items are independent and may ship in an
 | **Additional git providers** | GitLab (and optionally Bitbucket). Same workflow, provider-specific API adapters. |
 | **Slack notification polish** | Rich Block Kit for `agent_milestone` and `approval_requested` (today many map to generic fallback text); in-thread approve/deny buttons wired to HITL APIs. Should render **Smart progress updates** when that ships. |
 | **Control panel** | Web UI: task list, task detail with logs/traces, cancel, metrics dashboards, cost attribution. Task detail should show manager-style progress alongside raw events/traces. |
-| **Email notification dispatcher** | SES-based email notifications via the existing fanout pipeline. Stub exists today (logs only). |
+| **Email notification dispatcher** | SES-based email notifications via the existing fanout pipeline. Log-only stub ships today (see unchecked **Email dispatcher** under What's ready). |
 | **Per-user notification preferences** | DynamoDB (or equivalent) store for preferred channels, per-channel config, and event filters (`INPUT_GATEWAY.md`). |
 | **Browser extension channel** | Lightweight extension to open tasks from GitHub issue/PR pages using existing webhook or OAuth-issued JWT; same internal message contract as other channels. |
 
 ### Compute and performance
 
 | Capability | Description |
 |------------|-------------|
-| **Adaptive model router** | Per-turn model selection by complexity. Cheaper models for reads, Opus for complex reasoning. ~30-40% cost reduction. |
-| **Alternative compute** | ECS/Fargate or EKS via ComputeStrategy interface. For workloads exceeding AgentCore's 2 GB image limit or requiring GPU. |
+| **Adaptive model router** | Per-turn model selection by complexity. Cheaper models for reads, Opus for complex reasoning. ~30-40% cost reduction. Related: **Complexity-aware model router** under Cost governance. |
+| **Alternative compute** | ECS/Fargate or EKS via `ComputeStrategy` (`EcsComputeStrategy` exists; Agent stack wiring is commented out). For workloads exceeding AgentCore's 2 GB image limit or requiring GPU. |
 | **Environment pre-warming** | Pre-build container layers per repo. Snapshot-on-schedule (rebuild on push). Cold start from minutes to seconds. |
 | **S3-backed SDK session store (portable transcripts)** | Plumb the Claude Agent SDK `SessionStore` to S3 (dedicated bucket or prefix) with eager flush, IAM-scoped access, conditional part creates, checksums, adaptive retries, and structured logging. Emit metrics or alarms on transcript mirror failures; own graceful shutdown (`disconnect` on SIGTERM/cancel) so in-flight frames can flush. Persist `task_id` ↔ Claude session UUID (from the first `ResultMessage`) for resume on another worker; keep agent `cwd` stable so SDK-derived `project_key` paths stay predictable. Plan compaction when part count threatens resume latency; optional S3 Express One Zone when the fleet is single-AZ. Complements checked **Persistent session storage** (FUSE caches on `/mnt/workspace`) and end-of-task **trace** upload to `traces/...jsonl.gz`. |
 
@@ -210,12 +210,12 @@ Planned capabilities, grouped by theme. Items are independent and may ship in an
 
 | Capability | Description |
 |------------|-------------|
-| **Bedrock IAM session-tag attribution** | Route Bedrock inference through assumed credentials carrying `{user_id, repo, task_id}` session tags (extend the SessionRole / `aws_session.py` pattern from #209/#211). Enables native Cost Explorer and CUR 2.0 chargeback by user and repo. Operator must activate IAM principal cost allocation tags (see [Cost attribution guide](/getting-started/cost-attribution)). |
+| **Bedrock IAM session-tag attribution** | Route Bedrock **InvokeModel** through assumed credentials that carry `{user_id, repo, task_id}` session tags. **Per-session IAM scoping** (#209) already tags the SessionRole for DynamoDB/S3; model calls still use the AgentCore/ECS compute role today. Extend `aws_session.py` (or equivalent) so inference is chargeable in Cost Explorer / CUR 2.0 by principal tag. Operator must activate IAM principal cost allocation tags (see [COST_MODEL.md](../design/COST_MODEL.md#cost-attribution)). |
 | **Bedrock per-request metadata** | Pass `task_id`, `user_id`, and `repo` on each Bedrock call via request metadata / `X-Amzn-Bedrock-Request-Metadata` into model invocation logs. Complements IAM attribution; does not replace in-app `cost_usd`. Requires Claude Code / SDK support for metadata on InvokeModel. |
-| **Cost dashboard and export API** | Log Insights widgets on invocation logs; optional API or export for monthly spend roll-ups by `user_id` / `repo` from the task table. |
+| **Cost dashboard and export API** | Log Insights widgets on invocation logs; optional API or export for monthly spend roll-ups by `user_id` / `repo` from the task table. Operator dashboard today covers task-level cost aggregates, not Bedrock chargeback dimensions. |
 | **Optional tagged application inference profiles** | CDK-managed Bedrock application inference profiles per onboarded repo or environment; set `ANTHROPIC_MODEL` to tagged profile ARN for `resourceTags/*` billing when repo count is bounded. |
-| **Org and team budgets** | Per-user and per-team monthly token or USD budgets with alerting (e.g. 80%) and optional hard stop at 100%. |
-| **Complexity-aware model router** | Route each request to the most appropriate model based on task complexity (simple reads/edits to cheaper models, deeper reasoning to stronger models) while honoring budget and policy constraints. |
+| **Org and team budgets** | Per-user and per-team monthly token or USD budgets with alerting (e.g. 80%) and optional hard stop at 100%. Per-task `max_budget_usd` and turn caps ship today. |
+| **Complexity-aware model router** | Route each request to the most appropriate model based on task complexity (simple reads/edits to cheaper models, deeper reasoning to stronger models) while honoring budget and policy constraints. Related: **Adaptive model router** under Compute and performance. |
 
 ### Observability and safe deploy
 
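The first hunk narrows the **Stranded task reconciler** wording from "non-terminal states" to three named states. As a rough illustration of what that narrower selection could look like, here is a minimal Python sketch. It is not the project's code: the `Task` record, the `STUCK_AFTER` threshold, and the function names are all hypothetical, and the "proper cleanup" step mentioned in the roadmap is left out. Only the three state names and the `FAILED` target come from the roadmap text.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# States the updated roadmap line names as reconciler targets.
STRANDED_STATES = {"SUBMITTED", "HYDRATING", "AWAITING_APPROVAL"}

# Hypothetical threshold; the roadmap does not state one.
STUCK_AFTER = timedelta(minutes=30)


@dataclass
class Task:
    """Hypothetical task record; the real schema is not shown in this commit."""
    task_id: str
    status: str
    updated_at: datetime


def find_stranded(tasks, now, stuck_after=STUCK_AFTER):
    """Return tasks sitting in a targeted state for longer than stuck_after."""
    return [
        t for t in tasks
        if t.status in STRANDED_STATES and now - t.updated_at > stuck_after
    ]


def reconcile(tasks, now, stuck_after=STUCK_AFTER):
    """Mark each stranded task FAILED in place and return the affected ids."""
    stranded = find_stranded(tasks, now, stuck_after)
    for t in stranded:
        t.status = "FAILED"  # cleanup (locks, concurrency slots) omitted here
    return [t.task_id for t in stranded]
```

Under these assumptions a `RUNNING` task is never touched however old it is, which is the practical difference between the old and new wording of the roadmap line.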