Roadmap

This roadmap outlines multiple iterations for ABCA. Each iteration adds features incrementally and builds on the previous one. Delivering a working slice at the end of each iteration is the goal. Non–backward-compatible changes between iterations are acceptable (e.g. switching CLI auth from IAM to Cognito, or changing the orchestration model) when they simplify the design or align with the target architecture.

The order and scope of items may shift as we learn; the list below reflects current design docs (ARCHITECTURE.md and component docs in docs/design/).

Ongoing engineering practice (cross-iteration)

These practices apply continuously across iterations and are not treated as one-time feature milestones.

Property-based correctness testing for orchestration invariants — Complement example-based tests (Jest/pytest) with property-based testing (fast-check for TypeScript and hypothesis for Python) so randomized inputs and interleavings validate invariants over many runs. The goal is to verify safety properties that are timing-sensitive or hard to cover with scenario tests alone (for example, concurrent state transitions and lock/contention behavior).
Machine-readable property catalog — Maintain a versioned property set with explicit mapping from each property to enforcing code paths and tests. Initial properties include:
- P-ABCA-1 terminal-state immutability: tasks in COMPLETED / FAILED / CANCELLED / TIMED_OUT cannot transition further.
- P-ABCA-2 concurrency counter consistency: for each user, active_count equals the number of tasks in active states (SUBMITTED, HYDRATING, RUNNING, FINALIZING).
- P-ABCA-3 event ordering: TaskEvents are strictly monotonic by event_id (ULID order).
- P-ABCA-4 memory fallback guarantee: if task finalization sees memory_written = false, a fallback episode write is attempted and its result is observable.
- P-ABCA-5 branch-name uniqueness: simultaneous tasks for the same repo generate distinct branch names (ULID-based suffix).
Definition-of-done hook — New orchestrator/concurrency changes should include: updated property mappings, at least one property-based test where applicable, and invariant notes in ORCHESTRATOR.md to keep docs and executable checks aligned.
Memory extraction prompt versioning — Hash memory extraction prompts (in agent/memory.py: write_task_episode, write_repo_learnings) alongside system prompts so changes to extraction logic are tracked by prompt_version. This enables correlating memory quality changes with extraction prompt updates in the evaluation pipeline.
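To illustrate the property catalog above, here is a minimal sketch of a randomized check for P-ABCA-1 and P-ABCA-2. It uses the standard-library random module as a stand-in for hypothesis so that it runs without dependencies. The ALLOWED transition table is an assumption (the roadmap only names the happy path and the terminal states), and apply_transition and check_invariants are hypothetical helpers, not functions from the codebase.

```python
import random

ACTIVE = {"SUBMITTED", "HYDRATING", "RUNNING", "FINALIZING"}
TERMINAL = {"COMPLETED", "FAILED", "CANCELLED", "TIMED_OUT"}

# Assumed transition table: the happy path from the roadmap, plus the
# assumption that any active state may fail, be cancelled, or time out.
ALLOWED = {
    "SUBMITTED": {"HYDRATING"} | (TERMINAL - {"COMPLETED"}),
    "HYDRATING": {"RUNNING"} | (TERMINAL - {"COMPLETED"}),
    "RUNNING": {"FINALIZING"} | (TERMINAL - {"COMPLETED"}),
    "FINALIZING": set(TERMINAL),
}


def apply_transition(tasks, active_count, task_id, new_state):
    """Apply a transition if it is allowed; return the updated active_count."""
    current = tasks[task_id]
    if new_state not in ALLOWED.get(current, set()):
        return active_count  # rejected: terminal task or illegal transition
    tasks[task_id] = new_state
    if current in ACTIVE and new_state in TERMINAL:
        active_count -= 1
    return active_count


def check_invariants(seed, steps=200):
    """Drive one user's tasks with random events and assert both properties."""
    rng = random.Random(seed)
    tasks, active_count = {}, 0
    all_states = sorted(ACTIVE | TERMINAL)
    for step in range(steps):
        if not tasks or rng.random() < 0.2:
            tasks[f"task-{step}"] = "SUBMITTED"
            active_count += 1
        else:
            task_id = rng.choice(sorted(tasks))
            before = tasks[task_id]
            active_count = apply_transition(
                tasks, active_count, task_id, rng.choice(all_states)
            )
            # P-ABCA-1: a task in a terminal state never changes state.
            if before in TERMINAL:
                assert tasks[task_id] == before
        # P-ABCA-2: the counter equals the number of tasks in active states.
        assert active_count == sum(1 for s in tasks.values() if s in ACTIVE)
    return True
```

Running check_invariants over many seeds approximates what a property-based framework does with generated inputs. It does not give the shrinking or interleaving exploration that fast-check and hypothesis provide.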

Iteration 1 — First shippable slice (done)

Goal: An agent runs on AWS in an isolated environment; user submits a task from the CLI and gets a PR when done.

Agent on AWS — Agent runs in a sandboxed compute environment (AgentCore Runtime MicroVM or equivalent). Each task gets an isolated session (compute, memory, filesystem). Container/image has shell, filesystem, dev tooling; session isolation is built-in.
CLI trigger — User can submit a task via CLI (script or simple CLI): provide repo + task description (text and/or GitHub issue ref). Single entry path; no multi-channel yet.
Autonomous agent loop — Agent SDK runs with full tool access in headless mode (read, write, edit, bash, glob, grep; permissionMode: "bypassPermissions" or equivalent). No human prompts during execution.
Git workflow — Agent creates a branch, commits incrementally, pushes to GitHub, and creates a pull request when done. Branch naming convention: e.g. bgagent/<task-id>/<short-desc>.
GitHub only — Single git provider (GitHub). Agent clones repo, works on branch, opens PR via GitHub API (OAuth or token via AgentCore Identity).
Minimal orchestration — Task is created, execution is triggered (e.g. Lambda or direct invoke), agent runs to completion or failure. Platform infers outcome from GitHub (PR created or not) or from session end. No durable orchestration (e.g. no Step Functions / Durable Functions) required for this slice if we accept "fire-and-forget" plus polling.
Task state (minimal) — At least: task id, status (e.g. running / completed / failed), repo, and a way to poll or wait for completion. Persistence can be minimal (e.g. DynamoDB or single table).
API authentication — CLI authenticates to the API (e.g. IAM SigV4 or Cognito JWT). Prevents unauthorized task submission.
Scaling — Each task runs in its own isolated session; no shared mutable state so the system can scale with concurrent tasks (within runtime quotas).
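As an illustration of the branch naming convention mentioned in the Git workflow item (bgagent/<task-id>/<short-desc>), the following sketch derives a branch name from a task id and a free-text description. The slug rules (lowercase, hyphen-separated, 40-character cap, "task" fallback) are assumptions for the example, and branch_name is a hypothetical helper.

```python
import re


def branch_name(task_id: str, description: str, max_slug_len: int = 40) -> str:
    """Build bgagent/<task-id>/<short-desc> from a free-text description."""
    # Collapse every run of non-alphanumeric characters into a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    # Truncate, drop a trailing hyphen left by the cut, and fall back to "task".
    slug = slug[:max_slug_len].rstrip("-") or "task"
    return f"bgagent/{task_id}/{slug}"
```

Because the task id is part of the name, two tasks with the same description still get distinct branches as long as their task ids differ.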

Out of scope for Iteration 1: Repo onboarding (any repo the credentials can access is allowed), multiple channels, durable execution with checkpoint/resume, rich observability, memory/code attribution, webhook, Slack.

Iteration 2 — Production orchestrator, task management, and observability (done)

Goal: Robust task lifecycle, durable execution, security foundations, basic cost guardrails, and visibility into what's running. This iteration makes the platform production-grade for single-channel (CLI) usage.

Task management and API

Task management — Submit, list (e.g. my tasks), get status (per task), cancel (stop a running task). Clear task state machine (SUBMITTED → HYDRATING → RUNNING → FINALIZING → COMPLETED / FAILED / CANCELLED / TIMED_OUT). See ORCHESTRATOR.md.
API contract — Implement the external API: POST /v1/tasks, GET /v1/tasks, GET /v1/tasks/{id}, DELETE /v1/tasks/{id}, GET /v1/tasks/{id}/events. Consistent error format, pagination, idempotency. See API_CONTRACT.md.
Input gateway (single entry point) — All requests go through one gateway: verify auth, normalize payload to an internal message schema, validate (required fields, repo/issue refs), then dispatch to the task pipeline. The gateway is designed for extensibility — adding new channels later requires only new adapters, not core changes. In this iteration, CLI is the only channel; the gateway architecture is established so future channels (webhook, Slack) plug in cleanly. See INPUT_GATEWAY.md.
Idempotency — Task submit accepts an idempotency key (e.g. Idempotency-Key header); duplicate submits with the same key do not create a second task. Prevents duplicate work on retries. Keys are stored with a 24-hour TTL.
Improve CLI — Dedicated CLI package (@abca/cli in cli/) with commands: configure, login, submit, list, status, cancel, events. Cognito auth with token caching and auto-refresh, --wait mode that polls until completion, --output json for scripting, and --verbose for debugging.
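The idempotency behavior described above (same key, no second task, 24-hour TTL) can be sketched with an in-memory store. This is only an illustration: the roadmap does not say how keys are stored, and IdempotencyStore, submit, and the injected clock are hypothetical names.

```python
import time

TTL_SECONDS = 24 * 60 * 60  # keys are kept for 24 hours, as stated in the roadmap


class IdempotencyStore:
    """In-memory stand-in for the idempotency-key table (assumption)."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._entries = {}  # key -> (task_id, expires_at)

    def submit(self, key, create_task):
        """Return (task_id, created). Reuse the task while the key is live."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[1] > now:
            return entry[0], False  # duplicate submit: no second task
        task_id = create_task()
        self._entries[key] = (task_id, now + TTL_SECONDS)
        return task_id, True
```

A retry with the same key inside the TTL returns the original task id. After the TTL the key is treated as new.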

Orchestration and storage

Durable execution — Orchestrator on top of the agent using Lambda Durable Functions: checkpoint/resume, async session monitoring via DynamoDB polling, timeout recovery, idempotent step execution. Long-running sessions (hours) survive transient failures; agent commits regularly so work is not lost. See ORCHESTRATOR.md for the task state machine, execution model, failure modes, concurrency management, data model, and implementation strategy.
Storage — (1) Task and event storage — Tasks table, TaskEvents (audit log), UserConcurrency counters in DynamoDB. (2) Durable execution state — Lambda Durable Functions checkpoints (managed by the service). (3) Artifact storage (optional) — S3 bucket for future screenshot/video uploads.

Security and network

Threat model — Document the threat model for the current architecture using threat-composer. Cover: input validation, agent isolation, credential management, data flow, and trust boundaries. Update the threat model as new features land in future iterations. Threat modeling informs the security controls built in this and subsequent iterations — it must come before, not after, the production gateway and orchestrator.
Network isolation (basic) — Deploy the agent compute environment within a VPC. Restrict outbound egress to allowlisted endpoints: GitHub API, Amazon Bedrock, AgentCore services, and necessary AWS service endpoints (DynamoDB, CloudWatch, S3). No open internet access by default. This prevents a compromised or confused agent from reaching arbitrary endpoints. Fine-grained per-repo allowlisting and egress logging are deferred to Iteration 3a.

Cost and observability

Observability — Metrics: task duration, token usage (from agent SDK result), cold start, error rate, active task counts, and submitted backlog. Dashboards: active tasks, submitted backlog, completion rate, basic task list. Alarms: stuck tasks (e.g. RUNNING > 9 hours), sustained submitted backlog over threshold, orchestration failures, counter drift. Logs: Agent/runtime logs (e.g. CloudWatch) tied to task id. See OBSERVABILITY.md.
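The stuck-task alarm above (e.g. RUNNING > 9 hours) amounts to a simple filter over task records. The sketch below assumes each record carries task_id, status, and started_at fields. Those field names and find_stuck_tasks are hypothetical, and the real alarm may be a metric-based alarm rather than a scan.

```python
from datetime import datetime, timedelta, timezone

STUCK_AFTER = timedelta(hours=9)  # threshold taken from the roadmap example


def find_stuck_tasks(tasks, now):
    """Return ids of tasks that have been RUNNING longer than the threshold."""
    return [
        task["task_id"]
        for task in tasks
        if task["status"] == "RUNNING" and now - task["started_at"] > STUCK_AFTER
    ]
```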

Platform operations

Builds on Iteration 1: Same agent + git workflow; adds orchestrator, gateway, task CRUD, API contract, observability, security foundations, and cost guardrails.

Out of scope for Iteration 2: Webhook trigger (no second channel yet), multi-modal input (text-based tasks are sufficient), repo onboarding, memory, customization.

Iteration 3 (wip, we are here — 3a and 3b done)

Iteration 3a — Repo onboarding and access control

Goal: Only onboarded repos can receive tasks; per-repo credentials replace the single shared OAuth token; agent environment is customizable per repo.

Builds on Iteration 2: Gateway and orchestration stay; adds onboarding gate, webhook channel, DNS Firewall, better context hydration, turn caps, cost budget caps, prompt guide, data lifecycle, agent harness improvements (turn budget, default branch, safety net, lint verification), operator dashboard, WAF, model invocation logging, and input length limits.

Iteration 3b — Core memory and learning (done)

Goal: Agents learn from past interactions; memory Tier 1 (repository knowledge + task execution history) is operational; prompt versioning and commit attribution provide traceability.

Interaction memory / code attribution (Tier 1) — AgentCore Memory resource provisioned via CDK L2 construct (@aws-cdk/aws-bedrock-agentcore-alpha) with named semantic (SemanticKnowledge) and episodic (TaskEpisodes) extraction strategies using explicit namespace templates: /{actorId}/knowledge/ for semantic records, /{actorId}/episodes/{sessionId}/ for per-task episodes, and /{actorId}/episodes/ for episodic reflection (cross-task summaries). Events are written with actorId = repo ("owner/repo") and sessionId = taskId, so the extraction pipeline places records at /{repo}/knowledge/ and /{repo}/episodes/{taskId}/. Memory is loaded at task start during context hydration (two parallel RetrieveMemoryRecordsCommand calls using repo-derived namespace prefixes — /{repo}/knowledge/ for semantic, /{repo}/episodes/ for episodic) with a 5-second timeout and 2,000-token budget. Memory is written at task end by the agent (agent/memory.py: write_task_episode and write_repo_learnings via create_event). An orchestrator fallback (writeMinimalEpisode in orchestrator.ts) writes a minimal episode if the agent container crashes or times out. All memory operations are fail-open — failures never block task execution. See MEMORY.md and OBSERVABILITY.md (Code attribution). Implementation: src/constructs/agent-memory.ts, src/handlers/shared/memory.ts, agent/memory.py.
Insights and agent self-feedback — The agent writes structured summaries at the end of each task via write_task_episode (status, PR URL, cost, duration) and write_repo_learnings (codebase patterns and conventions). Agent self-feedback is captured via an "## Agent notes" section in the PR body, extracted post-task by the entrypoint (_extract_agent_notes in agent/entrypoint.py) and stored as part of the task episode. See MEMORY.md (Extraction prompts) and EVALUATION.md.
Prompt versioning — System prompts are hashed (SHA-256 of deterministic prompt parts, excluding memory context which varies per run) via computePromptVersion in src/handlers/shared/prompt-version.ts. The prompt_version is stored on the task record in DynamoDB during hydration, enabling future A/B comparison of prompt changes against task outcomes. See EVALUATION.md and ORCHESTRATOR.md (data model).
Per-prompt commit attribution — A prepare-commit-msg git hook (agent/prepare-commit-msg.sh) is installed during repo setup and appends Task-Id: <task_id> and Prompt-Version: <hash> trailers to every agent commit. The hook gracefully skips trailers when TASK_ID is unset (e.g. during manual commits). See MEMORY.md.
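The prompt versioning and commit attribution items above can be illustrated in a few lines. This is a Python sketch of the idea, not the project's computePromptVersion (which lives in TypeScript) or its shell hook. The length-prefixing of parts, the full 64-character hex digest, and the function names are assumptions.

```python
import hashlib


def compute_prompt_version(parts: list[str]) -> str:
    """SHA-256 over the deterministic prompt parts (memory context excluded)."""
    digest = hashlib.sha256()
    for part in parts:
        data = part.encode("utf-8")
        # Length-prefix each part so ["ab", "c"] and ["a", "bc"] hash differently.
        digest.update(len(data).to_bytes(8, "big"))
        digest.update(data)
    return digest.hexdigest()


def commit_trailers(task_id: str, prompt_version: str) -> str:
    """Return the trailers to append, or nothing when no task id is set."""
    if not task_id:
        return ""  # manual commit: skip trailers
    return f"Task-Id: {task_id}\nPrompt-Version: {prompt_version}\n"
```

Because memory context is excluded from the hashed parts, two runs with the same system prompt but different retrieved memory share one prompt_version. That is what makes the version usable as a grouping key when comparing task outcomes.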

Builds on Iteration 3a: Onboarding and per-repo config are in place; adds memory Tier 1 (repo knowledge, task episodes), insights, agent self-feedback, prompt versioning, and commit attribution. These are all write-at-end / read-at-start additions that do not change the orchestrator blueprint.

Iteration 3bis

Goal: Address architectural risks identified by external review before moving to new features. These are fixes to existing code, not new capabilities.

Follow-ups (identified during review, not blocking):

Reconciler batch error tracking — Added errors counter to reconcile-concurrency.ts. Incremented in the per-user catch block. Final log line now includes { scanned, corrected, errors }. Logs at ERROR if errors === scanned && scanned > 0 (systemic failure).
Test: decrementConcurrency CCF path — Added two tests in orchestrate-task.test.ts: one for ConditionalCheckFailedException (best-effort, no throw) and one for non-CCF errors (swallowed with warn log, no throw).
Test: reconciler non-CCF update failure — Added test in reconcile-concurrency.test.ts: two users with drift, user-1's UpdateItemCommand fails with non-CCF error, user-2 still corrected (per-user error isolation).
Consistent error serialization — Replaced all String(err) in error/warn log contexts with err instanceof Error ? err.message : String(err) across context-hydration.ts, orchestrator.ts, memory.ts, and repo-config.ts.
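The reconciler follow-ups above (per-user error isolation, an errors counter, and ERROR-level logging only on systemic failure) can be sketched as follows. This is a Python illustration of the described behavior, not the code in reconcile-concurrency.ts. The callables count_active_tasks and update_counter are hypothetical stand-ins for the DynamoDB operations.

```python
import logging

logger = logging.getLogger("reconcile-concurrency")


def reconcile(users, count_active_tasks, update_counter):
    """users maps user_id -> stored active_count. Returns the summary dict."""
    scanned = corrected = errors = 0
    for user_id, stored_count in users.items():
        scanned += 1
        try:
            actual = count_active_tasks(user_id)
            if actual != stored_count:
                update_counter(user_id, actual)
                corrected += 1
        except Exception as err:  # one user's failure must not stop the batch
            errors += 1
            logger.warning("reconcile failed for %s: %s", user_id, err)
    summary = {"scanned": scanned, "corrected": corrected, "errors": errors}
    # Every user failing is treated as a systemic failure and logged at ERROR.
    level = logging.ERROR if scanned > 0 and errors == scanned else logging.INFO
    logger.log(level, "reconcile complete: %s", summary)
    return summary
```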

Iteration 3c — Validation and new task types

Goal: Multi-layered validation catches errors, enforces code quality, and assesses change risk before PRs are created; the platform supports more than one task type; multi-modal input broadens what users can express.

Per-repo GitHub credentials (GitHub App + AgentCore Token Vault) — Replace the single shared OAuth token with a GitHub App installed per-organization or per-repository, using AgentCore Identity's Token Vault for credential management (recommended approach). Each onboarded repo is associated with a GitHub App installation that grants fine-grained permissions (read/write to that repo only). This eliminates the security gap where any authenticated user can trigger agent work against any repo the shared token can access.

Implementation approach — AgentCore Token Vault integration:
1. WorkloadIdentity resource — Create a CfnWorkloadIdentity in CDK representing the agent's identity, enabling token exchange with GitHub.
2. Token Vault credential provider — Register the GitHub App's credentials in the AgentCore Token Vault. For server-to-server authentication, the GitHub App uses a private key to sign JWTs that are exchanged for installation tokens via the GitHub API. For the user-authorization OAuth flow (acting on behalf of a user), the App's client ID and client secret are registered as an OAuth credential provider. The Token Vault handles token refresh automatically — no expiry issues for long-running tasks (sessions exceeding 1 hour).
3. Orchestrator token generation — At task hydration time, the orchestrator calls the GitHub API to generate an installation token (1-hour TTL, scoped to the target repo) and passes it to the agent at session start.
4. Agent-side token refresh — For tasks running longer than 1 hour, the agent calls GetWorkloadAccessToken (permissions already granted to the runtime execution role: bedrock-agentcore:GetWorkloadAccessToken, GetWorkloadAccessTokenForJWT, GetWorkloadAccessTokenForUserId) to obtain a fresh token from the Token Vault. No Secrets Manager reads needed at runtime.
5. Blueprint configuration — Extend Blueprint credentials with githubAppId, githubAppPrivateKeySecretArn, and githubAppInstallationId (per-org or per-repo).
6. Gateway integration (future) — Wire an AgentCore Gateway target for GitHub API calls with automatic credential injection, enabling audit trails and Cedar policy enforcement per request. Git transport (clone/push) still requires a token in the remote URL, so Gateway-mediated access applies to API operations only.
Why Token Vault over Secrets Manager: The runtime already has GetWorkloadAccessToken permissions (granted by the AgentCore Runtime construct). Token Vault is purpose-built for dynamic credential vending — it manages refresh automatically, supports arbitrary OAuth providers (GitHub, GitLab, Jira, Slack via the same pattern), and keeps credentials out of the sandbox as static secrets. This sets up the pattern for all future third-party integrations.
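The token lifetime concern in steps 3 and 4 (1-hour installation tokens versus sessions that run longer) can be illustrated with a small refresh-before-expiry wrapper. This sketch deliberately does not call any AgentCore or GitHub API: fetch_token is a hypothetical callable standing in for whatever vends a fresh token, and the 5-minute refresh margin is an assumption.

```python
import time


class RefreshingToken:
    """Cache a short-lived token and refetch it shortly before it expires."""

    def __init__(self, fetch_token, ttl_seconds=3600, refresh_margin=300,
                 clock=time.time):
        self._fetch = fetch_token      # hypothetical: returns a fresh token string
        self._ttl = ttl_seconds        # 1-hour TTL, as stated in the roadmap
        self._margin = refresh_margin  # assumed safety margin before expiry
        self._clock = clock
        self._token = None
        self._expires_at = 0.0

    def get(self):
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._margin:
            self._token = self._fetch()
            self._expires_at = now + self._ttl
        return self._token
```

Callers ask for the token right before each Git or API operation instead of holding on to a copy, so a long session never uses a token that is about to expire.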

Per-user identity flow (future, connects to SSO): With a GitHub App, installation tokens can be scoped per-repository and per-permission set. Combined with federated identity (SSO), the orchestrator can look up the user's GitHub identity and generate tokens scoped to the target repo with only the permissions that user would have. Git commits are attributed to the GitHub App acting on behalf of the user.

This is a prerequisite for any multi-user or multi-team deployment.
Orchestrator pre-flight checks (fail-closed) — Add a pre-flight step before start-session so doomed tasks fail fast without consuming AgentCore runtime. The orchestrator performs lightweight readiness checks with strict timeouts (for example, 5 seconds): verify GitHub API reachability, verify repository existence and credential access (GET /repos/{owner}/{repo} or equivalent), and optionally verify AgentCore Runtime availability when a status probe exists. If pre-flight fails, the task transitions to FAILED immediately with a clear terminal reason (GITHUB_UNREACHABLE, REPO_NOT_FOUND_OR_NO_ACCESS, RUNTIME_UNAVAILABLE), releases the concurrency slot, emits an event/notification, and does not invoke the agent. Unlike memory/context hydration (fail-open), pre-flight is explicitly fail-closed: inability to verify repo access blocks execution by design.
Persistent session storage (cache layer) — Enabled AgentCore Runtime persistent session storage (preview) for selective cache persistence across stop/resume. A per-session filesystem is mounted at /mnt/workspace via FilesystemConfigurations (CFN escape hatch on the L2 construct). The S3-backed FUSE mount does not support flock() (returns ENOTRECOVERABLE / os error 524), so only caches whose tools never call flock() go on the mount (npm_config_cache, CLAUDE_CONFIG_DIR). Caches for tools that use flock() stay on local ephemeral disk (MISE_DATA_DIR=/tmp/mise-data — mise's pipx backend delegates to uv which flocks inside installs/; UV_CACHE_DIR=/tmp/uv-cache). Repo clones stay on /workspace (local) for the same reason. The AGENT_WORKSPACE env var and {workspace} system prompt placeholder are wired for a future move to persistent repo clones if the mount adds flock() support. Each runtimeSessionId gets isolated storage (no cross-task leakage). 14-day TTL; data deleted on runtime version update. See COMPUTE.md.
Pre-execution task risk classification — Add a lightweight risk classifier at task submission (before orchestration starts) to drive proportional controls for agent execution. Initial implementation can be rule-based and Blueprint-configurable: prompt keywords (for example, database, auth, security, infrastructure), metadata from issue labels, and file/path signals when available (for example, **/migrations/**, **/.github/**, infra directories). Persist risk_level (low / medium / high / critical) on the task record and use it to set defaults and policy: model tier/cascade, turn and budget defaults, prompt strictness/conservatism, approval requirements before merge, and optional autonomous-execution blocks for critical tasks. This is intentionally pre-execution and complements (does not replace) post-execution PR risk/blast-radius analysis.
Principal-to-repository authorization mapping — Bind repository access to the requesting principal, not merely any authenticated user. Map Cognito identities to allowed repository sets so that User A cannot trigger agent work on User B's repositories. This is distinct from the credential mechanism (GitHub App tokens solve the credential blast radius but not the authorization problem). Implementation: add a user_id → repo[] authorization table (or extend onboarding config with authorized_users), check authorization in the orchestrator before session start, and return UNAUTHORIZED_REPO on mismatch. See SECURITY.md.
Tiered validation pipeline — Three tiers of post-agent validation run sequentially after the agent finishes but before finalization. Each tier can fail the PR independently, and failure output is fed back to the agent for a fix cycle (capped at 2 retries per tier to bound cost). If the agent still fails, the PR is created with a validation report (labels, comments, and a risk summary) so the reviewer knows. All three tiers are implemented via the blueprint framework's Layer 2 custom steps (phase: 'post-agent'). See REPO_ONBOARDING.md for the 3-layer customization model, ORCHESTRATOR.md for the step execution contract, and EVALUATION.md for the full design.
- Tier 1 — Tool validation (build, test, lint) — Run deterministic tooling: test suites, linters, type checkers, SAST scanners, or a custom script. This is the existing "deterministic validation" concept. Binary pass/fail; failures are concrete (test output, lint errors) and actionable by the agent in a fix cycle. Already partially implemented via the system prompt instructing the agent to run tests.
- Tier 2 — Code quality analysis — Static analysis of the agent's diff against code quality principles: DRY (duplicated code detection), SOLID violations, design pattern adherence, complexity metrics (cyclomatic, cognitive), naming conventions, and repo-specific style rules (from onboarding config). Implemented as an LLM-based review step or a combination of static analysis tools (e.g. SonarQube rules, custom linters) and LLM judgment. Produces structured findings (severity, location, rule, suggestion) that the agent can act on in a fix cycle. Findings below a configurable severity threshold are advisory (included in the PR as comments) rather than blocking.
- Tier 3 — Risk and blast radius analysis — Analyze the scope and impact of the agent's changes to detect unintended side effects in other parts of the codebase. Includes: dependency graph analysis (what modules/functions consume the changed code), change surface area (number of files, lines, and modules touched), semantic impact assessment (does the change alter public APIs, shared types, configuration, or database schemas), and regression risk scoring. Produces a risk level (low / medium / high / critical) attached to the PR as a label and included in the validation report. High-risk changes may require explicit human approval before merge (foundation for the HITL approval mode in Iteration 6). The risk level considers: number of downstream dependents affected, whether the change touches shared infrastructure or core abstractions, test coverage of the affected area, and whether the change introduces new external dependencies.
PR risk level and validation report — Every agent-created PR includes a structured validation report (as a PR comment or check run) summarizing: Tier 1 results (pass/fail per tool), Tier 2 findings (code quality issues by severity), Tier 3 risk assessment (risk level, blast radius summary, affected modules). The PR is labeled with the computed risk level (risk:low, risk:medium, risk:high, risk:critical). Risk level is persisted in the task record for evaluation and trending. See EVALUATION.md.
Other task types: PR review and PR-iteration — Support additional task types beyond "implement from issue". Iterate on pull request (pr_iteration) reads review comments and addresses them (implements changes, pushes updates, posts a summary). Review pull request (pr_review) is a read-only task type where the agent analyzes a PR's changes and posts structured review comments via the GitHub Reviews API. The pr_review agent runs without Write or Edit tools (defense-in-depth), skips ensure_committed and push, and treats build status as informational only. Each review comment uses a structured format: type (comment/question/issue/good_point), severity for issues (minor/medium/major/critical), title, description with memory attribution, proposed fix, and a ready-to-use AI prompt. The CLI exposes --review-pr <number> (mutually exclusive with --pr).
Input guardrail screening (Bedrock Guardrails) — Amazon Bedrock Guardrails screen task descriptions at submission time and assembled PR prompts during context hydration (pr_iteration, pr_review). Uses PROMPT_ATTACK content filter at HIGH strength. Fail-closed: Bedrock outages block tasks rather than letting unscreened content through. See SECURITY.md.
Guardrail screening for GitHub issue content (new_task) — Bedrock Guardrail screening now covers GitHub issue bodies and comments fetched during context hydration for new_task tasks. The assembled user prompt is screened through the PROMPT_ATTACK filter when issue content is present; when no issue content is fetched (task_description only), hydration-time screening is skipped because the task description was already screened at submission time. Same fail-closed pattern as PR tasks. See SECURITY.md.
Multi-modal input — Accept text and images (or other modalities) in the task payload; pass through to the agent. Gateway and schema support it; agent harness supports it where available. Primary use case: screenshots of bugs, UI mockups, or design specs attached to issues.
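The fix-cycle behavior of the tiered validation pipeline above (tiers run sequentially, failures are fed back to the agent, at most 2 retries per tier) can be sketched as a small loop. The names run_validation, TierResult, and request_fix are hypothetical. The sketch also assumes that later tiers still run after an earlier tier exhausts its retries, so the validation report is complete; the roadmap does not state this explicitly.

```python
from dataclasses import dataclass

MAX_FIX_RETRIES = 2  # per-tier cap from the roadmap, to bound cost


@dataclass
class TierResult:
    tier: str
    passed: bool
    attempts: int
    output: str = ""


def run_validation(tiers, request_fix):
    """tiers: list of (name, check), where check() returns (passed, output).

    request_fix(name, output) stands in for feeding failure output back to
    the agent for a fix cycle.
    """
    report = []
    for name, check in tiers:
        passed, output = check()
        attempts, retries = 1, 0
        while not passed and retries < MAX_FIX_RETRIES:
            request_fix(name, output)
            retries += 1
            passed, output = check()
            attempts += 1
        report.append(TierResult(name, passed, attempts, output))
    return report
```

The returned report carries pass/fail status and the last output for each tier, which is the raw material for the validation report attached to the PR.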

Scope note: Iteration 3c contains a wide range of items — from security-critical (GitHub App credentials, guardrail screening) to quality-improving (tiered validation, risk classification) to capability-expanding (multi-modal input). Items marked [x] are done. The remaining items can be delivered incrementally; the tiered validation pipeline and risk classification in particular can ship independently of per-repo credentials and multi-modal input.

Builds on Iteration 3b: Memory is operational; this iteration changes the orchestrator blueprint (tiered validation pipeline, new task types) and broadens the input schema. These changes can be tested independently of memory.

Iteration 3d — Review feedback loop and evaluation

Goal: The primary feedback loop (PR reviews → memory → future tasks) is operational; automated evaluation provides measurable quality signals; PR outcomes are tracked as feedback.

Post-execution output screening — Post-execution screening for secrets, PII, and unsafe content is enforced as a runtime control. Tool outputs are screened after each tool call completes via a PostToolUse hook (agent/src/hooks.py) backed by a regex-based output scanner (agent/src/output_scanner.py). Detected patterns include AWS access keys, AWS secret keys, GitHub tokens (PAT, OAuth, App, fine-grained), private keys (PEM blocks), Bearer tokens, and connection strings with embedded passwords. When sensitive content is found, the hook returns updatedMCPToolOutput with redacted content (steered enforcement — content is sanitized, not blocked). Findings emit OUTPUT_SCREENING telemetry events via agent/src/telemetry.py. This closes the gap where an agent could leak a .env value into a PR description or commit message — input-only guardrails cannot catch this. Informed by the ABCA Threat Model Matrix (Threat 7: Sensitive data disclosure, rated Medium-High; Priority 3). See SECURITY.md (Mid-execution enforcement).
Context hydration screening for untrusted content — Add Bedrock Guardrails screening of hydrated context (PR review comments, issue bodies, review feedback) at the point of injection into the agent prompt, not only at task submission time. The current guardrail screening happens at submission for task descriptions and during hydration for pr_iteration/pr_review task types, but if an attacker posts a malicious PR review comment after the task is created, it may be hydrated into context without screening when fetched during the review feedback memory loop (Iteration 3d). Implementation: extend context-hydration.ts to screen all externally-sourced content through the PROMPT_ATTACK filter before including it in the assembled prompt, with fail-closed semantics matching the existing guardrail pattern. Tag screened content with trust_level: untrusted-external metadata. Informed by the ABCA Threat Model Matrix (Threats 1 and 6: Agent goal hijack and Memory/context poisoning). See SECURITY.md.
Review feedback memory loop (Tier 2) — Capture PR review comments via GitHub webhook, extract actionable rules via LLM, and persist them as searchable memory so the agent internalizes reviewer preferences over time. This is the primary feedback loop between human reviewers and the agent — no shipping coding agent does this today. Requires a GitHub webhook → API Gateway → Lambda pipeline (separate from agent execution). Two types of extracted knowledge: repo-level rules ("don't use any types") and task-specific corrections. See MEMORY.md (Review feedback memory) and SECURITY.md (prompt injection via review comments).
PR outcome tracking — Track whether agent-created PRs are merged, revised, or rejected via GitHub webhooks (pull_request.closed events). A merged PR is a positive signal; closed-without-merge is a negative signal. These outcome signals feed into the evaluation pipeline and enable the episodic memory to learn which approaches succeed. See MEMORY.md (PR outcome signals) and EVALUATION.md.
Evaluation pipeline (basic) — Automated evaluation of agent runs: failure categorization (reasoning errors, missed instructions, missing tests, timeouts, tool failures). Results are stored and surfaced in observability dashboards. Basic version: rules-based analysis of task outcomes and agent responses. Track memory effectiveness metrics: first-review merge rate, revision cycles, CI pass rate on first push, review comment density, and repeated mistakes. Advanced version (ML-based trace analysis, A/B prompt comparison, feedback loop into prompts) is deferred to Iteration 5. See EVALUATION.md and OBSERVABILITY.md.
Behavioral circuit breaker specification — Define the concrete specification for mid-execution behavioral monitoring (currently listed as planned work in Iteration 5). The circuit breaker monitors aggregate agent behavior within a running session and triggers pause/terminate/alert actions when anomalous patterns are detected. Signals: tool-call frequency (calls per minute), cumulative session cost velocity, repeated failures on the same tool (>N consecutive), file mutation rate (files written per minute), anomalous file access patterns (reads outside the repo tree, access to sensitive paths like /etc/, ~/.ssh/), and memory write bursts (>N writes in a window). Actions: pause (suspend session, emit alert, await operator decision), terminate (stop session, transition to FAILED with CIRCUIT_BREAKER reason code), alert (continue but emit high-priority notification). Thresholds: configurable per-repo via Blueprint security.circuitBreaker with platform-wide defaults (e.g., >50 tool calls/minute, >$10 cumulative cost, >5 consecutive same-tool failures). The specification is delivered in this iteration as a design artifact; implementation ships in Iteration 5 as part of mid-execution behavioral monitoring. Informed by the ABCA Threat Model Matrix (Threats 2, 8, 9: Tool misuse, Runaway cost, and Rogue behavior). See SECURITY.md (Mid-execution enforcement).
Per-tool-call structured telemetry — Instrument the agent harness (agent/src/telemetry.py) to emit structured events for every tool call: tool name, input hash (SHA-256), output hash, duration, cost attribution, and result status. Events flow through the existing create_event path and are surfaced in CloudWatch. This is foundational for: (a) the evaluation pipeline (tool-call-level success/failure analysis), (b) the centralized policy framework Phase 1 (tool calls become PolicyDecisionEvent sources in Iteration 5), and (c) future mid-execution policy enforcement (tool-call interceptor in Iteration 5). Without per-tool-call telemetry, the platform can only observe sessions as opaque black boxes — model invocation logs capture LLM reasoning but not the tool execution that connects reasoning to action. Informed by the Guardian system's tool-call interception architecture (Hu et al. 2025). See OBSERVABILITY.md and SECURITY.md (Mid-execution enforcement).
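A per-tool-call event of the kind described above might be produced by a wrapper like the one below. This is a sketch under stated assumptions, not the contents of agent/src/telemetry.py: the field names, the JSON canonicalization used before hashing, and the emit callable (standing in for the create_event path) are all hypothetical, and cost attribution is left out because the roadmap does not say how it is computed.

```python
import hashlib
import json
import time


def sha256_hex(value) -> str:
    """Hash a JSON-serializable value in a stable, key-order-independent way."""
    canonical = json.dumps(value, sort_keys=True, separators=(",", ":"), default=str)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def instrument_tool_call(tool_name, tool_fn, tool_input, emit, clock=time.monotonic):
    """Run tool_fn(tool_input) and emit one structured event, even on failure."""
    start = clock()
    status, output = "success", None
    try:
        output = tool_fn(tool_input)
        return output
    except Exception as err:
        status, output = "error", repr(err)
        raise
    finally:
        emit({
            "tool": tool_name,
            "input_hash": sha256_hex(tool_input),
            "output_hash": sha256_hex(output),
            "duration_ms": round((clock() - start) * 1000),
            "status": status,
        })
```

Hashing inputs and outputs rather than logging them verbatim keeps events small and avoids copying potentially sensitive tool output into telemetry, while still letting identical calls be correlated.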

Prerequisite: 3e Phase 1 (input hardening) ships with this iteration. The review feedback memory loop writes attacker-controlled content (PR review comments) to persistent memory. Without content sanitization, provenance tagging, and integrity hashing (3e Phase 1), this creates a known attack vector — poisoned review comments stored as persistent rules that influence all future tasks on the repo. 3e Phase 1 items (memory content sanitization, GitHub issue input sanitization, source provenance on memory writes, content integrity hashing) must be implemented before or concurrently with the review feedback pipeline. See SECURITY.md (Prompt injection via PR review comments).

Builds on Iteration 3c: Validation and PR review task type are in place; this iteration adds new infrastructure (webhook → Lambda → LLM extraction pipeline), connects the feedback loop, and closes output screening and context hydration screening gaps identified by the ABCA Threat Model Matrix.

Iteration 3e — Memory security and integrity

Goal: Harden the memory system against both adversarial corruption (prompt injection into memory, poisoned tool outputs, experience grafting) and emergent corruption (hallucination crystallization, feedback loops, stale context accumulation). OWASP classifies this as ASI06 — Memory & Context Poisoning in the 2026 Top 10 for Agentic Applications.

Background

Deep research identified 9 memory-layer security gaps in the current architecture (see the Memory Security Analysis section in MEMORY.md). The platform has strong network-layer security (VPC isolation, DNS Firewall, HTTPS-only egress) but lacks memory content validation, provenance tracking, trust scoring, anomaly detection, and rollback capabilities. Research shows that MINJA-style attacks achieve 95%+ injection success rates against undefended agent memory systems, and that emergent self-corruption (hallucination crystallization, error compounding feedback loops) is equally dangerous because it lacks an external attacker signature.

Phase 1 — Input hardening (ships with Iteration 3d)

Phase 1 is a prerequisite for Iteration 3d's review feedback memory loop. Attacker-controlled PR review comments must not enter persistent memory without sanitization, provenance tagging, and integrity checking. These items ship concurrently with 3d, not after it.

Memory content sanitization — Add content validation in loadMemoryContext() (src/handlers/shared/memory.ts). Scan retrieved memory records for injection patterns (embedded instructions, system prompt overrides, command injection payloads) before including them in the agent's context. Implement a sanitizeMemoryContent() function that strips or flags suspicious patterns while preserving legitimate repository knowledge.
GitHub issue input sanitization — Add trust-boundary-aware sanitization in context-hydration.ts for GitHub issue bodies and comments. These are attacker-controlled inputs that currently flow into the agent's context without differentiation. Strip control characters, embedded instruction patterns, and known injection payloads. Tag the content source as untrusted-external in the hydrated context.
Source provenance on memory writes — Tag all memory writes with source provenance metadata. In memory.ts (writeMinimalEpisode) and agent/memory.py (write_task_episode, write_repo_learnings), add a source_type field to event metadata: agent_episode, agent_learning, orchestrator_fallback, github_issue, or review_feedback. This enables trust-differentiated retrieval in Phase 2.
Content integrity hashing — Add SHA-256 content hashing on all memory writes. Store the hash in event metadata. At read time, verify that content has not been modified between write and read. Implementation: compute hash before CreateEventCommand, store as content_hash metadata, verify on RetrieveMemoryRecordsCommand results.
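
The provenance and integrity items above could look roughly like the following on the Python side. This is a sketch under assumptions: `build_write_metadata` and `verify_record` are hypothetical helpers, and the actual calls to the memory service (CreateEventCommand, RetrieveMemoryRecordsCommand) are omitted. Only the `source_type` values and the `content_hash` metadata key come from the roadmap.

```python
import hashlib
import hmac

# source_type values listed in the roadmap.
ALLOWED_SOURCE_TYPES = {
    "agent_episode",
    "agent_learning",
    "orchestrator_fallback",
    "github_issue",
    "review_feedback",
}


def content_hash(content: str) -> str:
    """SHA-256 hex digest of the UTF-8 encoded memory content."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def build_write_metadata(content: str, source_type: str) -> dict:
    """Metadata to attach to a memory write: provenance plus integrity hash."""
    if source_type not in ALLOWED_SOURCE_TYPES:
        raise ValueError(f"unknown source_type: {source_type}")
    return {"source_type": source_type, "content_hash": content_hash(content)}


def verify_record(content: str, metadata: dict) -> bool:
    """Check a retrieved record against the hash stored at write time.

    Records without a content_hash are treated as unverified (an assumption;
    the migration policy for pre-existing records is not specified).
    """
    expected = metadata.get("content_hash")
    if expected is None:
        return False
    return hmac.compare_digest(expected, content_hash(content))
```

A plain hash stored next to the content detects modification of the content alone. It does not stop a writer who can replace both the content and its metadata, which is one reason the later phases add trust scoring and a provenance chain.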

Phase 2 — Trust-aware retrieval

Trust scoring at retrieval — Modify loadMemoryContext() to weight retrieved memories by temporal freshness, source type reliability, and pattern consistency with other memories. Memories from orchestrator_fallback and agent_episode sources receive higher trust than memories derived from external inputs. Entries below a configurable trust threshold are deprioritized or excluded from the 2,000-token budget.
Configurable temporal decay — Implement per-entry TTL with configurable decay rates. Unverified or externally-sourced memory entries decay faster (e.g., 30-day default) than agent-generated or human-confirmed entries (e.g., 365-day default). Add trust_tier and decay_rate to the memory metadata schema.
Memory validation Lambda — Add a lightweight validation function triggered on CreateEventCommand (via EventBridge rule on AgentCore events or as a post-write hook). The validator runs a classifier that checks whether new memory content looks like legitimate repository knowledge or could influence future agent behavior in unintended ways (the "guardian pattern"). Flag suspicious entries for operator review.
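
One way to combine source-type trust, temporal decay, and the token budget is sketched below. The numeric weights, the exponential decay form, the 0.2 threshold, and the record shape (`source_type`, `age_days`, `tokens`) are all assumptions for illustration. The 30-day and 365-day horizons and the 2,000-token budget are the figures quoted above, and pattern consistency with other memories is left out.

```python
import math

# Hypothetical base trust per source_type. The roadmap only states that
# orchestrator_fallback and agent_episode rank above externally derived input.
SOURCE_TRUST = {
    "orchestrator_fallback": 1.0,
    "agent_episode": 1.0,
    "agent_learning": 0.8,
    "github_issue": 0.4,
    "review_feedback": 0.4,
}
EXTERNAL_SOURCES = {"github_issue", "review_feedback"}


def trust_score(source_type: str, age_days: float) -> float:
    """Base trust scaled by exponential decay over the source's horizon."""
    base = SOURCE_TRUST.get(source_type, 0.0)  # unknown sources get no trust
    horizon_days = 30.0 if source_type in EXTERNAL_SOURCES else 365.0
    return base * math.exp(-max(age_days, 0.0) / horizon_days)


def select_memories(records, threshold=0.2, token_budget=2000):
    """Drop low-trust records, then fill the token budget highest trust first."""
    scored = [(trust_score(r["source_type"], r["age_days"]), r) for r in records]
    scored = [pair for pair in scored if pair[0] >= threshold]
    scored.sort(key=lambda pair: pair[0], reverse=True)

    selected, used = [], 0
    for _, record in scored:
        if used + record["tokens"] <= token_budget:
            selected.append(record)
            used += record["tokens"]
    return selected
```

With these assumed weights, a 60-day-old review comment scores about 0.05 and is excluded, while a 60-day-old agent episode still scores about 0.85.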

Phase 3 — Detection and response

Memory write anomaly detection — Instrument memory write operations with CloudWatch custom metrics: write frequency per repo, average content length, source type distribution. Add CloudWatch Alarms for anomalous patterns (e.g., burst of writes from a single task, unusually long content, writes with untrusted-external source type exceeding a threshold).
Circuit breaker in orchestrator — Add circuit breaker logic in orchestrator.ts: if the agent's tool invocation patterns or memory write patterns deviate from a baseline (e.g., sudden increase in memory writes, writes containing instruction-like patterns), pause the task and emit an alert. The circuit breaker transitions the task to a new MEMORY_REVIEW state that requires operator intervention.
Memory quarantine API — Expose an operator API endpoint (POST /v1/memory/quarantine, GET /v1/memory/quarantine) for flagging and isolating suspicious memory entries. Quarantined entries are excluded from retrieval but preserved for forensic analysis.
Memory rollback capability — Implement point-in-time memory snapshots. Before each task starts, snapshot the current memory state for the target repo (via the existing loadMemoryContext path, persisted to S3). If poisoning is detected post-task, operators can restore the repo's memory to the pre-task snapshot. Add POST /v1/memory/rollback endpoint.
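
The roadmap places write anomaly detection in CloudWatch metrics and alarms. The windowed-threshold logic behind a "burst of writes from a single task" alarm can still be illustrated in-process. The class below is a hypothetical sketch, not a replacement for the CloudWatch approach; it assumes timestamps are in seconds and arrive in non-decreasing order.

```python
from collections import deque


class WriteBurstDetector:
    """Flags when more than max_writes occur within a sliding time window."""

    def __init__(self, max_writes: int, window_seconds: float):
        self.max_writes = max_writes
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def record(self, timestamp: float) -> bool:
        """Record one memory write; return True if the window is over the limit."""
        self._timestamps.append(timestamp)
        cutoff = timestamp - self.window_seconds
        # Evict writes that have aged out of the window.
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps) > self.max_writes
```

One detector per task (or per repo) would feed the alert path; the limit and window are tuning parameters with no defaults given in the roadmap.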

Phase 4 — Advanced protections

Write-ahead validation (guardian model) — Route proposed memory writes through a smaller, cheaper model (e.g., Haiku) that evaluates whether the content is legitimate learned context or could be adversarial. Adds latency (~100-500ms per write) but catches sophisticated attacks that evade pattern-based sanitization. Configurable per-repo via Blueprint.
Cross-task behavioral drift detection — Compare agent reasoning patterns and tool invocation sequences across tasks for the same repo. Detect drift from established baselines that could indicate memory-influenced behavioral manipulation. Implemented as a post-task analysis step in the evaluation pipeline.
Cryptographic provenance chain — Implement Merkle tree-based provenance for memory entry chains per repo. Each new entry includes a hash of the previous entry, creating an append-only, tamper-evident chain. Enables cryptographic verification that no entries have been inserted, modified, or deleted between known-good checkpoints.
Red team validation — Red team the memory system using published attack methodologies: MINJA (query-based memory injection), AgentPoison (RAG retrieval poisoning), and experience grafting. Document results and adjust defenses. Add automated red team tests to the evaluation pipeline using the DeepTeam framework (OWASP ASI06 attack categories).
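
The cryptographic provenance chain item describes each entry carrying a hash of its predecessor. The sketch below implements exactly that as a linear hash chain, which is simpler than a full Merkle tree and is named as such here. The entry layout, the JSON canonicalization, and the `GENESIS` value are assumptions. Detecting deletion of the newest entries requires comparing against a known-good head hash, passed here as `expected_head`.

```python
import hashlib
import json
from typing import Optional

GENESIS = "0" * 64  # assumed placeholder hash for the first entry's predecessor


def entry_hash(prev_hash: str, content: str) -> str:
    """Hash that binds an entry's content to the previous entry's hash."""
    payload = json.dumps({"prev": prev_hash, "content": content}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_entry(chain: list, content: str) -> dict:
    """Append a new entry linked to the current head of the chain."""
    prev = chain[-1]["hash"] if chain else GENESIS
    entry = {"content": content, "prev_hash": prev, "hash": entry_hash(prev, content)}
    chain.append(entry)
    return entry


def verify_chain(chain: list, expected_head: Optional[str] = None) -> bool:
    """Recompute every link; optionally compare the head to a checkpoint."""
    prev = GENESIS
    for entry in chain:
        if entry["prev_hash"] != prev:
            return False  # an entry was inserted, removed, or reordered
        if entry["hash"] != entry_hash(prev, entry["content"]):
            return False  # content was modified after the write
        prev = entry["hash"]
    return expected_head is None or prev == expected_head
```

Storing the head hash of each repo's chain at known-good checkpoints (for example alongside the pre-task snapshots from Phase 3) is what turns the chain into evidence rather than just bookkeeping.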

Non-backward-compatible changes

Memory metadata schema changes (source_type, content_hash, trust_tier, decay_rate) require schema_version: "3" and are not readable by v2 code paths without migration.
The MEMORY_REVIEW task state is a new addition to the state machine (requires orchestrator, API contract, and observability updates).
Trust-scored retrieval changes the memory context budget allocation, which may affect prompt version hashing.

Builds on Iteration 3d: Review feedback memory and PR outcome tracking are in place; Phases 2–4 harden the memory system that those components write to. Phase 1 (input hardening) ships with 3d as a prerequisite — see Iteration 3d. The phased approach allows incremental deployment with measurable security improvement at each phase.

Iteration 4 — Integrations, visual proof, and control panel

Goal: Additional git providers; agent can run the app and attach visual proof; Slack integration; web dashboard for operators and users; real-time streaming.

Additional git providers — Support GitLab (and optionally Bitbucket or others). Same workflow (clone, branch, commit, push, PR/MR). Provider-specific APIs, auth, and webhook adapters. The gateway and task schema are already channel-agnostic (repo is owner/repo); this iteration adds a git_provider field and provider-specific adapters. Onboarding (Iter 3a) must support non-GitHub repos.
Live execution and visual proof — Agent can execute the application after build/tests, capture screenshots or videos as proof that changes work, and upload them (e.g. as PR attachments or to an S3 artifact store linked from the PR). Requires compute support: virtual display (Xvfb) or headless browser (Playwright/Puppeteer), capture scripts, and outbound upload. See COMPUTE.md (Visual proof). This may require a larger compute profile (more CPU/RAM/disk) or a dedicated "visual proof" step in the blueprint.
Slack channel — Slack adapter for the input gateway: users can submit tasks, check status, and receive notifications from Slack. Inbound: verify Slack signing secret, normalize Slack payload to the internal message schema. Outbound: render internal notifications as Slack Block Kit messages, post to the originating channel/thread. Requires a Slack→platform user mapping. See INPUT_GATEWAY.md.
Automated skills creation pipeline — Pipeline that creates or updates agent skills (or similar artifacts) from repo interaction or from onboarding. For example: the pipeline observes that a repo always requires npm run lint:fix before tests pass, and generates a skill or rule that the agent uses automatically. Builds on customization (Iter 3a) and memory (Iter 3b–3d).
User preference memory (Tier 3) — Per-user memory for PR style, commit conventions, test coverage expectations, and other execution preferences. Extracted from task descriptions (explicit) and review feedback patterns (implicit). Lower priority than repo-level and review feedback memory, but enables personalization when multiple users submit tasks. See MEMORY.md (User preference memory, Tier 3).
Control panel (web dashboard) — Web UI for operators and users: list tasks (with filters by status, repo, user), view task detail and status history, cancel tasks, link to agent logs, and show basic metrics (active tasks, submitted backlog, completion rate, error rate). Optional: submit a task from the UI (the panel becomes another channel via the input gateway). See CONTROL_PANEL.md. Tech stack TBD (e.g. React + AppSync or REST).
Real-time event streaming (WebSocket) — Replace or supplement the polling-based GET /v1/tasks/{id}/events with an API Gateway WebSocket API for real-time task status updates. WebSocket is chosen over SSE because multiplayer sessions (Iteration 6) and iterative feedback require bidirectional communication. This improves the experience for the control panel, Slack integration, and CLI --wait mode. Requires connection management (DynamoDB connection table). See API_CONTRACT.md (OQ1).
Live session replay and mid-task nudge — Extend WebSocket streaming with structured trajectory events (thinking steps, tool calls, cost, timing) for real-time session observation and post-hoc replay with timeline scrubbing. Add a "nudge" mechanism to inject one-shot course corrections between agent turns (via TaskNudges table and mid-session message injection). Structured streaming with cost telemetry provides better debugging and operational visibility than raw terminal logs. Requires bidirectional WebSocket (same as real-time streaming) plus agent harness support for consuming nudge messages.
Browser extension client — A lightweight Chrome/Firefox extension that lets users trigger tasks directly from the browser (e.g. while viewing a GitHub issue, click a button to submit it as a task). The extension calls the existing webhook API (Iteration 3a) with the current page's issue URL, requiring minimal new infrastructure — just a small client-side wrapper over the webhook endpoint. See INPUT_GATEWAY.md.

Builds on Iteration 3d: Onboarding, memory (Tiers 1–2), evaluation, and validation are in place; adds git providers, visual proof, Slack, skills pipeline, user preference memory, control panel, real-time streaming, and browser extension.

Iteration 5 — Scale, cost, and platform maturity

Goal: Faster cold start, multi-user/team, full cost management, guardrails, and alternative runtime support.

Automated container (devbox) from repo — Optionally derive or customize the agent container image from the repo (e.g. Dockerfile, dev container config, language-specific base images). Tied to onboarding: per-repo workload config. Reduces cold start for repos with known environments and ensures the agent has the right tools (compilers, SDKs, linters) pre-installed.
CI/CD pipeline — Automated deployment pipeline for the platform itself: source → build → test → synth → deploy to staging → deploy to production. Use CDK Pipelines or equivalent. The current ad-hoc CDK deploy workflow is not sufficient for a production orchestrator managing long-running tasks — deployments need to be safe (canary, rollback), auditable, and repeatable.
Environment pre-warming (snapshot-on-schedule) — Pre-build container layers or repo snapshots (code + deps pre-installed) per repo; store in ECR or equivalent. Reduces cold start from minutes to seconds for known repos. The onboarding pipeline (Iter 3a) can trigger pre-warming as part of repo setup or on a schedule. Periodically snapshot the onboarded repo's container image (code + deps) to ECR, rebuild on push to the default branch (via webhook or EventBridge), and use that as the base for new sessions. Optionally begin sandbox warming when a user starts composing a task (proactive warming). Snapshot-based session starts (if AgentCore supports it) further reduce startup time. See COMPUTE.md.
Multi-user / team support — Multiple users with shared task history, team-level visibility, and optionally shared approval queues or budgets. Adds a team_id or org_id to the task model. Team admins can view all tasks for their team, set team-level concurrency limits, and configure team-wide cost budgets. Builds on existing task model (user_id, filters) and adds authorization rules (team members can view each other's tasks).
Memory isolation for multi-tenancy — AgentCore Memory has no per-namespace IAM isolation. For multi-tenant deployments, private repo knowledge could leak cross-repo unless isolation is enforced. Options: silo model (separate memory resource per org — strongest), pool model (single resource with strict application-layer namespace scoping — sufficient for single-org), or shared model (intentional cross-repo learning — only for same-org repos). The onboarding pipeline should create or assign memory resources based on the isolation model. See SECURITY.md and MEMORY.md.
Full cost management — Per-user and per-team monthly budgets, cost attribution dashboards (cost per task, per repo, per user), and alerts when spend approaches budget limits. Token usage and compute cost are tracked per task and aggregated. The control panel (Iter 4) displays cost dashboards.
Adaptive model router with cost-aware cascade — Per-turn model selection via a lightweight heuristic engine. File reads and simple edits use a cheaper model (Haiku); multi-file refactors use Sonnet; complex reasoning escalates to Opus. Error escalation: if the agent fails twice on the same step, upgrade model for the retry. As the cost budget ceiling approaches, cascade down to cheaper models. Blueprint modelCascade config enables per-repo tuning. Potential 30-40% cost reduction on inference-dominated workloads. Requires agent harness changes to support mid-session model switching.
Advanced evaluation and feedback loop — Extend the basic evaluation pipeline from Iteration 3d: ML-based or LLM-based trace analysis (not just rules), A/B prompt comparison framework, automated feedback into prompt templates (e.g. "for repo X, always run tests before opening PR"), and per-repo or per-failure-type improvement tracking. Evaluation results can update the repo's agent configuration stored during onboarding. Optional patterns from adaptive teaching research (e.g. plan → targeted critique → execution; separate evaluator vs prompt/reflection roles; fitness from LLM judging plus efficiency metrics; evolution of teaching templates from failed trajectories with Pareto-style candidate sets for diverse failure modes) can inform offline or scheduled improvement of Blueprint prompts and checklists without replacing ABCA's core orchestrator.
Formal orchestrator verification (TLA+) — Add a formal specification of the orchestrator in TLA+ and verify it with TLC model checking. Scope includes the task state machine (8 states, valid transitions, terminal states), concurrency admission control (atomic increment + max check), cancellation races (cancel arriving during any orchestration step), reconciler/orchestrator interleavings (counter drift correction while tasks are active), and the polling loop (agent writes terminal status, orchestrator observes and finalizes). Define invariants such as valid-state progression, no illegal transitions, and repo-level safety constraints (for example, at most one active RUNNING task per repo when configured). Keep the spec aligned with src/constructs/task-status.ts and orchestrator docs so regressions surface as model-check counterexamples before production. Note: The TLA+ specification can be started earlier (e.g. during Iteration 3d) since the state machine and concurrency model are already stable. The spec is documentation that also catches bugs — writing it does not depend on Iteration 5 features. Consider starting the state machine and cancellation models as part of the ongoing engineering practice.
Guardrails (output and tool-call) with interceptor pattern — Extend Bedrock Guardrails from input screening (implemented in Iteration 3c) to output filtering and agent tool-call guardrails. Apply content filters to model responses during agent execution, restrict sensitive content generation, and enforce organizational policies (e.g. "do not modify files in /infrastructure"). Guardrails configuration can be per-repo (via onboarding) or platform-wide.

Tool-call interceptor (Guardian pattern) — pre- and post-execution stages implemented: A Cedar-based policy engine (agent/src/policy.py) with PreToolUse hooks and a regex-based output scanner (agent/src/output_scanner.py) with PostToolUse hooks (agent/src/hooks.py) intercept tool calls between the agent SDK's decision and actual execution. Pre-execution stage (implemented): Every tool call is evaluated against Cedar deny-list policies: pr_review agents are denied Write/Edit tools, writes to protected paths (.github/workflows/*, .git/*) are blocked, and destructive bash commands (rm -rf /, git push --force) are denied. The engine is fail-closed — if cedarpy is unavailable or evaluation errors occur, all tool calls are denied. Per-repo custom Cedar policies are supported via Blueprint security.cedarPolicies. Denied decisions emit POLICY_DECISION telemetry events via agent/src/telemetry.py. Post-execution stage (implemented): Tool outputs are screened for secrets and PII (AWS keys, GitHub tokens, private keys, connection strings, Bearer tokens) via output_scanner.py. When sensitive content is found, the PostToolUse hook returns updatedMCPToolOutput with redacted content (steered enforcement). Findings emit OUTPUT_SCREENING telemetry events. Remaining work: Cost threshold checks, bash command allowlist per capability tier, and Bedrock Guardrails-based output filtering (complementing the regex-based scanner). Combined with per-tool-call structured telemetry (Iteration 3d), every interceptor decision will be logged as a PolicyDecisionEvent. This pattern is informed by the Guardian system (Hu et al. 2025) — a "guardian agent" that monitors and can intercept tool calls before execution. See SECURITY.md (Mid-execution enforcement).
Mid-execution behavioral monitoring — Lightweight monitoring of agent behavior within a running session, filling the gap between input guardrails (pre-session) and validation (post-session). A behavioral circuit breaker in the agent harness tracks aggregate metrics: tool-call frequency (calls per minute), cumulative session cost, repeated failures on the same tool, and file mutation rate. When metrics exceed configurable thresholds (e.g. >50 tool calls/minute, >$10 cumulative cost, >5 consecutive failures on the same tool), the circuit breaker pauses or terminates the session and emits a circuit_breaker_triggered event. This catches runaway loops, cost explosions, and stuck agents before the hard session timeout. Thresholds are configurable per-repo via Blueprint security props. The circuit breaker operates within the existing agent harness — no sidecar process or external service required. For ABCA's single-agent-per-task model, embedded monitoring is simpler and more reliable than an external sidecar; sidecar architecture becomes relevant when multi-agent orchestration lands (Iteration 6). See SECURITY.md (Mid-execution enforcement).
Centralized policy framework — Consolidate the platform's distributed policy decisions into a unified policy framework and audit layer. Policy logic today is scattered across 20+ files (input validation in validation.ts and create-task-core.ts, admission control in orchestrator.ts, guardrail screening in context-hydration.ts, budget resolution across validation.ts/orchestrator.ts/agent/src/config.py, tool access in agent/src/policy.py + agent/src/hooks.py, network egress in dns-firewall.ts/agent.ts, state transitions in task-status.ts/orchestrator.ts). This fragmentation makes it difficult to audit what policies exist, verify consistency, or change policy behavior without touching multiple files. The agent-side Cedar policy engine (agent/src/policy.py) is a first step: it provides in-process tool-call governance with fail-closed semantics and per-repo custom policies. The full framework extends this to the TypeScript orchestrator side.

Phase 1 — Policy audit normalization: Define a stable PolicyDecisionEvent schema: decision_id (ULID), policy_name (e.g. admission.concurrency, budget.max_turns, guardrail.input_screening), policy_version, phase (submission | admission | pre_flight | hydration | session_start | session | finalization), input_hash (SHA-256 of the decision input for reproducibility), result (allow | deny | modify), reason_codes[], enforcement (enforced | observed | steered), and task_id. The three enforcement modes serve distinct purposes: enforced means the decision is binding (deny blocks, allow proceeds), observed means the decision is logged but not enforced (shadow mode for safe rollout), and steered means the decision modifies the input or output rather than blocking (redact PII, sanitize paths, mask secrets). New rules deploy in observed mode first; operators validate false-positive rates via PolicyDecisionEvent logs, then promote to enforced or steered. This observe-before-enforce workflow enables gradual rollout of security policies without risking false blocks on legitimate tasks. Emit a policy_decision event via emitTaskEvent at every existing enforcement point. Today, some decisions emit events (admission_rejected, preflight_failed, guardrail_blocked) while others silently return HTTP errors — normalize them all. This is pure instrumentation of existing code paths; no behavior change.
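
The PolicyDecisionEvent schema described above could be modeled as follows. The field names and allowed values are taken from the roadmap. The dataclass shape, the canonical-JSON hashing in `hash_input`, and the `is_blocking` helper are assumptions for illustration, and ULID generation for `decision_id` is left to the caller.

```python
import hashlib
import json
from dataclasses import dataclass

PHASES = {
    "submission", "admission", "pre_flight", "hydration",
    "session_start", "session", "finalization",
}
RESULTS = {"allow", "deny", "modify"}
ENFORCEMENT_MODES = {"enforced", "observed", "steered"}


@dataclass(frozen=True)
class PolicyDecisionEvent:
    decision_id: str       # ULID, supplied by the caller
    policy_name: str       # e.g. "admission.concurrency"
    policy_version: str
    phase: str
    input_hash: str        # SHA-256 of the decision input
    result: str
    enforcement: str
    task_id: str
    reason_codes: tuple = ()

    def __post_init__(self):
        if self.phase not in PHASES:
            raise ValueError(f"invalid phase: {self.phase}")
        if self.result not in RESULTS:
            raise ValueError(f"invalid result: {self.result}")
        if self.enforcement not in ENFORCEMENT_MODES:
            raise ValueError(f"invalid enforcement: {self.enforcement}")


def hash_input(decision_input: dict) -> str:
    """Hash a canonical JSON form so equal inputs always hash equally."""
    canonical = json.dumps(decision_input, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def is_blocking(event: PolicyDecisionEvent) -> bool:
    """Only an enforced deny stops the task; an observed deny is log-only."""
    return event.result == "deny" and event.enforcement == "enforced"
```

Promoting a rule from observed to enforced then changes only the `enforcement` value on the emitted events, which keeps the audit trail comparable before and after the promotion.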

Phase 2 — Cedar policy engine (partially implemented): Introduce Cedar (not OPA) as the single policy engine for both operational policy (budget/quota/tool-access resolution, tool-call interception rules) and authorization (extended for multi-tenant access control when multi-user/team support lands). Cedar is AWS-native, has formal verification guarantees, and integrates with AgentCore Gateway.

Current state: An in-process Cedar policy engine is implemented in the agent harness (agent/src/policy.py) using cedarpy for tool-call governance. The engine enforces a deny-list model: pr_review agents are forbidden from Write/Edit, writes to .github/workflows/* and .git/* are blocked, and destructive bash commands are denied. The engine is fail-closed (denies on error, cedarpy unavailability, or Cedar NoDecision). Per-repo custom Cedar policies can be injected via Blueprint security.cedarPolicies and are validated at initialization. Task types are validated against the TaskType enum (agent/src/models.py). Denied decisions emit POLICY_DECISION telemetry events.

Remaining work: Extend Cedar to the TypeScript orchestrator side. Cedar replaces the scattered budget/quota/tool-access merge logic (3-tier max_turns resolution, 2-tier max_budget_usd resolution, per-repo configuration merge in loadBlueprintConfig) with a unified policy evaluation. A thin policy.ts adapter module translates Cedar decisions into PolicyDecision objects (PolicyInput → Cedar evaluation → PolicyDecision with computed budgets, tool profile, risk tier, redaction directives) consumed by existing handlers — no new service, no network hop. Input validation (format checks, range checks) remains at the input boundary; Cedar handles resolution and policy composition. Migrate from in-process cedarpy to Amazon Verified Permissions for runtime-configurable policies.

Operational tool-call policies use a virtual-action classification pattern to support the three enforcement modes (enforced, observed, steered) within Cedar's binary permit/forbid model. Instead of asking Cedar "allow or deny?", the interceptor evaluates against multiple virtual actions (invoke_tool, invoke_tool_steered, invoke_tool_denied) and uses the first permitted action to determine the mode. For example: forbid(principal, action == Action::"invoke_tool", resource) when { resource.path like ".github/workflows/*" && principal.capability_tier != "elevated" } blocks the call, while permit(principal, action == Action::"invoke_tool_steered", resource) when { context.output_contains_pii } triggers PII redaction. This keeps Cedar doing what it does best (binary decisions with formal verification) while the interceptor interprets the combination of decisions as allow/steer/deny.
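
The interceptor side of the virtual-action pattern can be sketched without committing to a Cedar binding. Here `is_permitted` is a hypothetical callable standing in for a Cedar authorization request on one virtual action. The evaluation order is also an assumption: the roadmap says only that the first permitted action determines the mode, so this sketch checks the most restrictive action first and falls back to deny when nothing is permitted or evaluation fails, in line with the fail-closed behavior described for the engine.

```python
from typing import Callable

# (virtual action, resulting mode), checked in this order. The ordering is an
# assumption of this sketch, not something the roadmap specifies.
EVALUATION_ORDER = (
    ("invoke_tool_denied", "deny"),
    ("invoke_tool_steered", "steer"),
    ("invoke_tool", "allow"),
)


def classify(is_permitted: Callable[[str], bool]) -> str:
    """Map binary permit/forbid answers onto allow / steer / deny."""
    for action, mode in EVALUATION_ORDER:
        try:
            if is_permitted(action):
                return mode
        except Exception:
            # Fail closed: any evaluation error denies the tool call.
            return "deny"
    # No virtual action was permitted: fail closed.
    return "deny"
```

For example, a stub evaluator that permits only `invoke_tool` yields `allow`, while one that also permits `invoke_tool_steered` yields `steer`:

```python
permitted = {"invoke_tool", "invoke_tool_steered"}
mode = classify(lambda action: action in permitted)  # "steer"
```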

Authorization policies (extended with multi-user/team): When multi-user/team support lands, the same Cedar policy store expands to cover tenant-specific authorization: "users in team X can submit tasks to repos A, B, C", "team Y has a monthly budget of $500", "repos tagged critical require pr_review before new_task". This replaces the current single-dimensional ownership check (record.user_id !== userId) with multi-dimensional authorization (user, team, repo, action, risk level). No new policy engine — the same Cedar instance grows to cover authorization alongside operational policy.

Runtime-configurable policies: Cedar policies are stored in Amazon Verified Permissions and loaded at hydration/session-start time. Policy changes take effect without CDK redeployment — operators update policies via the Verified Permissions API, and the next task evaluation picks them up. Deployment-time invariants (schema validation, state machine transitions) remain in CDK code.

Policy versioning, rollback, and observe-before-enforce semantics carry forward from Phase 1. Cedar policies are evaluated at submission, admission, hydration, session (tool-call interception), and finalization.

Why not OPA: OPA uses Rego (a custom DSL) and runs as a sidecar or external service. ABCA's policies change at the same cadence as infrastructure (deployed via CDK). A separate service with a separate language adds operational burden without proportionate benefit for a single-tenant platform. Cedar is a better fit: it's a typed language with formal verification, it's AWS-native (used by Amazon Verified Permissions and AgentCore Gateway), and policies can be evaluated in-process via the Cedar SDK without a separate service. Unlike OPA/Rego (which can return arbitrary JSON), Cedar's binary decisions require the virtual-action pattern for steering — but this keeps policy evaluation formally verifiable, which OPA cannot guarantee.

What stays out of the policy framework: Schema validation (repo format, max_turns range, task description length) stays at the input boundary. State machine transitions stay in the orchestrator. DNS Firewall stays in CDK. These are infrastructure invariants, not policy decisions — they don't vary by tenant, user, or context.

See SECURITY.md (Policy enforcement and audit).
Capability-based security model — Fine-grained enforcement beyond Bedrock Guardrails, operating at three levels: (1) Tool-level capabilities — Bash command allowlist (git, npm, make permitted; curl, wget blocked), configurable per capability tier (standard / elevated / read-only). (2) File-system scope — Blueprint declares include/exclude path patterns; Write/Edit/Read tools are filtered to the declared scope. (3) Input trust scoring — Authenticated user input = trusted; external GitHub issues = untrusted; PR review comments entering memory = adversarial. Trust level selects the capability set. Essential once review feedback memory (Iter 3d) introduces attacker-controlled content into the agent's context. Blueprint security prop configures the capability profile per repo. Capability tiers become inputs to the centralized policy framework and are governed by Cedar policies (Phase 2).
Additional execution environment — Support an alternative to AgentCore Runtime (e.g. ECS/Fargate, EKS) behind the ComputeStrategy interface (see REPO_ONBOARDING.md). The orchestrator calls abstract methods (startSession, stopSession, pollSession); the implementation maps to AgentCore, Fargate, or EKS. Repos select the strategy via compute_type in their blueprint configuration. Reduces vendor lock-in and enables workloads that exceed AgentCore limits (e.g. GPU, larger images, longer sessions). The ComputeStrategy interface contract is defined in Iteration 3a; Iteration 5 adds alternative implementations.
Full web dashboard — Extend the control panel from Iteration 4: detailed dashboards (cost, performance, evaluation), reasoning trace viewer or log explorer (linked to OpenTelemetry traces from AgentCore), task submit/cancel from the UI, and admin views (system health, capacity, user management).
Customization (advanced) with tiered tool access — Agent can be extended with MCP servers, plugins, and skills beyond the basic prompt-from-repo customization in Iteration 3a. Composable tool sets per repo. MCP server discovery and lifecycle management. More tools increase behavioral unpredictability, so use a tiered tool access model: a minimal default tool set (bash allowlist, git, verify/lint/test) that all repos get, with MCP servers and plugins as opt-in per repo during onboarding. Per-repo tool profiles are stored in the onboarding config and loaded by the orchestrator. This balances flexibility with predictability. See SECURITY.md and REPO_ONBOARDING.md.

Builds on Iteration 4: Adds pre-warming, multi-user, cost management, guardrails, alternate runtime, and advanced customization with tiered tool access.

Iteration 6 — Learning, advanced workflows, and reuse

Goal: Skills learned from repo interaction; multi-repo tasks; iterative human-agent collaboration; reusable CDK constructs.

GitHub Actions integration — Publish a GitHub Action that triggers an ABCA task (e.g. on an issue label such as agent:fix, on flaky test detection, or on a PR comment command). The Action calls the webhook endpoint from Iteration 3a. Natural integration for GitHub-centric workflows.
Automated pipeline for learning skills from repo interaction — Pipeline that observes agent interactions with repositories and produces reusable skills (rules, prompts, tools) that improve future runs. Builds on memory, code attribution, and evaluation. Example: the pipeline notices that tasks on repo X frequently fail because of a missing env variable, and generates a rule that the agent always sets it.
Agent swarm orchestration — Planner-worker architecture for complex, multi-file tasks that overwhelm a single agent session. A lightweight planner decomposes the task into a DAG of subtasks with scope boundaries and interface contracts. Each subtask runs as an independent child task in its own AgentCore session. A merge orchestrator cherry-picks commits, resolves conflicts, and runs the full test suite before opening one consolidated PR. New DynamoDB fields: parent_task_id, child_task_ids[], subtask_contract. New blueprint steps: decompose-task, fan-out + wait-all, merge-and-verify. Naturally bounds PR size and enables work that no single-session agent can handle (large features, cross-cutting refactors, migrations).
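The fan-out + wait-all step can be pictured as scheduling the subtask DAG in waves, where each wave contains the subtasks whose dependencies have all finished. The sketch below uses Python's standard `graphlib` module. The `plan_waves` name and the subtask ids are hypothetical, and a real orchestrator would launch child tasks and wait on their terminal states rather than compute the waves up front.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set


def plan_waves(dependencies: Dict[str, Set[str]]) -> List[List[str]]:
    """Group subtasks into waves that can each run in parallel.

    `dependencies` maps a subtask id to the ids it depends on.
    Raises graphlib.CycleError if the planner output is not a DAG.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    waves: List[List[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # fan-out: every ready subtask starts together
        waves.append(ready)
        sorter.done(*ready)                 # wait-all: the next wave unlocks once these finish
    return waves
```

For a decomposition where `client` and `docs` both depend on `api`, and `integration-tests` depends on both, this yields three waves: `api`, then `client` and `docs` in parallel, then `integration-tests`.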
Multi-repo support — Tasks that span multiple repositories (e.g. change an API in repo A and update the consumer in repo B). Requires: multi-branch orchestration (one branch per repo), coordinated PR creation (linked PRs), cross-repo auth (GitHub App installations for both repos), and cross-repo testing. This is architecturally significant and needs a dedicated design doc before implementation.
Iterative feedback and multiplayer sessions — User can send follow-up instructions to a completed or running task (e.g. "also add tests for X" or "change the approach to use library Y"). For completed tasks, the platform starts a new session on the same branch with the follow-up context. For running tasks, this requires message injection into a live session — which depends on agent harness support for session persistence and message channels. Design the interaction model carefully: what happens to in-flight work when instructions change? Multiplayer extension: allow multiple authorized users to inject context into a running or follow-up session (e.g. team code reviews or collaborative debugging with the agent). Per-prompt commit attribution (Iter 3b) supports tracking which user's input led to which changes.
HITL approval mode — Optional mid-task approval gates for high-risk operations (e.g. "agent wants to delete 50 files — approve?"). The orchestrator pauses the task, emits a notification, and waits for user approval before continuing. Requires changes to the agent harness (pause/resume) and the orchestrator (a new AWAITING_APPROVAL state in the state machine).
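To make the state-machine change concrete, the sketch below adds AWAITING_APPROVAL to a transition table. The state names come from this roadmap; the specific allowed transitions, and the choice to cancel the task when approval is denied, are assumptions for illustration only and may differ from ORCHESTRATOR.md.

```python
TERMINAL = frozenset({"COMPLETED", "FAILED", "CANCELLED", "TIMED_OUT"})

# Assumed transition table; only the state names are taken from the roadmap.
TRANSITIONS = {
    "SUBMITTED": {"HYDRATING", "FAILED", "CANCELLED"},
    "HYDRATING": {"RUNNING", "FAILED", "CANCELLED"},
    "RUNNING": {"AWAITING_APPROVAL", "FINALIZING", "FAILED", "CANCELLED", "TIMED_OUT"},
    "AWAITING_APPROVAL": {"RUNNING", "CANCELLED", "TIMED_OUT"},
    "FINALIZING": {"COMPLETED", "FAILED", "TIMED_OUT"},
}


def transition(current: str, target: str) -> str:
    """Return the new state, or raise if the move is not allowed."""
    if current in TERMINAL:
        # Mirrors P-ABCA-1: terminal states cannot transition further.
        raise ValueError(f"{current} is terminal")
    if target not in TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition {current} -> {target}")
    return target


def resolve_approval(current: str, approved: bool) -> str:
    """Resume the task on approval; cancel it on rejection (assumed policy)."""
    if current != "AWAITING_APPROVAL":
        raise ValueError("task is not waiting for approval")
    return transition(current, "RUNNING" if approved else "CANCELLED")
```

Whether AWAITING_APPROVAL counts as an active state for the per-user concurrency counter (P-ABCA-2) is a design decision this sketch leaves open.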
Scheduled triggers — Cron or schedule-based task creation (e.g. "run dependency update every Monday", "check for flaky tests nightly"). Implemented as EventBridge Scheduler rules that call the task creation API. Schedules are configured per repo during onboarding or via the control panel.
CDK constructs — Publish reusable CDK constructs (e.g. BackgroundAgentStack, OnboardingPipelineStack, TaskOrchestrator) so other teams can compose the platform into their own CDK apps. Document construct APIs, publish to a construct library (e.g. Construct Hub), and version following semver.

Builds on Iteration 5: Leverages memory, evaluation, and customization to close the loop (learn → improve); adds advanced workflows and exposes the platform as constructs.

Summary and mapping to design

Iteration 1 — Core agent + git (isolated run, CLI submit, branch + PR, minimal task state).
Iteration 2 — Production orchestrator, API contract, task management (list/status/cancel), durable execution, observability, threat model, network isolation, basic cost guardrails, CI/CD.
Iteration 3a — Repo onboarding, DNS Firewall (domain-level egress filtering), webhook trigger (foundation for GitHub Actions integration in Iteration 6), per-repo customization (prompt from repo), data retention, turn/iteration caps, cost budget caps, user prompt guide, agent harness improvements (turn budget, default branch, safety net, lint, softened conventions), operator dashboard, WAF, model invocation logging, input length limits.
Iteration 3b ✅ — Memory Tier 1 (repo knowledge, task episodes), insights, agent self-feedback, prompt versioning, per-prompt commit attribution. CDK L2 construct with named semantic + episodic strategies using namespace templates (/{actorId}/knowledge/, /{actorId}/episodes/{sessionId}/), fail-open memory load/write, orchestrator fallback episode, SHA-256 prompt hashing, git trailer attribution.
Iteration 3c — Per-repo GitHub App credentials via AgentCore Token Vault (CfnWorkloadIdentity + Token Vault credential provider for automatic token refresh; agent uses GetWorkloadAccessToken for long-running sessions; sets pattern for GitLab/Jira/Slack integrations), principal-to-repository authorization mapping (Cognito identity → allowed repo sets, distinct from credential scoping — Threat Model Priority 1), orchestrator pre-flight checks (fail-closed before session start), persistent session storage for select caches (AgentCore Runtime /mnt/workspace mount for npm/Claude config; mise/uv/repo on local disk due to FUSE flock() limitation), pre-execution task risk classification (model/limits/approval policy selection), tiered validation pipeline (tool validation, code quality analysis, post-execution risk/blast radius analysis), PR risk level, PR review task type (pr_review — read-only structured review with tool restriction, defense-in-depth enforcement, CLI --review-pr flag), input guardrail screening (Bedrock Guardrails, fail-closed — including GitHub issue content for new_task), multi-modal input.
Iteration 3d — Post-execution output screening (done — regex-based secret/PII scanner in agent/src/output_scanner.py with PostToolUse hook in agent/src/hooks.py; screens AWS keys, GitHub tokens, private keys, connection strings, Bearer tokens; steered enforcement via updatedMCPToolOutput redaction; OUTPUT_SCREENING telemetry events), context hydration screening for untrusted content (PR review comments, issue bodies screened at injection point, not only at submission — Threats 1/6), behavioral circuit breaker specification (signal taxonomy, threshold defaults, action model — design artifact, implementation in Iteration 5 — Threats 2/8/9), review feedback memory loop (Tier 2), PR outcome tracking, evaluation pipeline (basic), per-tool-call structured telemetry (tool name, input/output hash, duration, cost — foundational for evaluation and Iteration 5 policy enforcement). Co-ships with 3e Phase 1 (memory input hardening: content sanitization, provenance tagging, integrity hashing) as a prerequisite for safely writing attacker-controlled content to memory.
Iteration 3e — Memory security and integrity: Phase 1 (input hardening — content sanitization, provenance tagging, integrity hashing) ships with 3d as a prerequisite; Phases 2–4 follow: trust-aware retrieval (trust scoring, temporal decay, guardian validation), detection and response (anomaly detection, circuit breaker, quarantine, rollback), advanced protections (write-ahead validation, behavioral drift detection, cryptographic provenance, red teaming). Addresses OWASP ASI06 (Memory & Context Poisoning).
Iteration 3bis (hardening) — Orchestrator IAM grant for Memory (was silently AccessDenied), memory schema versioning (schema_version: "2"), Python repo format validation, severity-aware error logging in Python memory, narrowed entrypoint try-catch, orchestrator fallback episode observability, conditional writes in agent task_state.py (ConditionExpression guards), orchestrator Lambda error alarm (CloudWatch, retryAttempts: 0), concurrency counter reconciliation (scheduled Lambda, drift correction), multi-AZ NAT documentation (already configurable), Python unit tests (pytest), entrypoint decomposition into agent/src/ modules (config, models, pipeline, runner, context, prompt_builder, hooks, policy, post_hooks, repo, shell, telemetry — with entrypoint.py as re-export shim), Cedar policy engine (in-process cedarpy, fail-closed deny-list for tool-call governance, PreToolUse hooks, per-repo custom policies via Blueprint security.cedarPolicies), TaskType enum with validation, dual prompt assembly deprecation docstring, graceful thread drain in server.py (shutdown hook + atexit), dead QUEUED state removal (8 states, 4 active).
Iteration 4 — Additional git providers, visual proof (screenshots/videos), Slack channel, skills pipeline, user preference memory (Tier 3), control panel (restrict CORS to dashboard origin), real-time event streaming (WebSocket), live session replay and mid-task nudge, browser extension client, MFA for production.
Iteration 5 — Automated container (devbox) from repo, CI/CD pipeline, snapshot-on-schedule pre-warming, multi-user/team, memory isolation for multi-tenancy, full cost management, adaptive model router with cost-aware cascade, advanced evaluation (optional adaptive-teaching / trajectory-driven prompt patterns), formal orchestrator verification with TLA+/TLC, Bedrock Guardrails output/tool-call with Guardian interceptor pattern (input screening ships in 3c; pre-execution stage implemented via Cedar in agent/src/policy.py + PreToolUse hooks; post-execution stage implemented via agent/src/output_scanner.py + PostToolUse hooks in agent/src/hooks.py; remaining: cost threshold checks, bash command allowlist per capability tier, Bedrock Guardrails-based output filtering complementing the regex scanner), mid-execution behavioral monitoring (tool-call frequency circuit breaker, cost runaway detection, aggregate behavioral bounds within agent harness), centralized policy framework (Phase 1: policy audit normalization with PolicyDecisionEvent schema across all enforcement points, three enforcement modes — enforced | observed | steered — with observe-before-enforce rollout workflow; Phase 2: Cedar partially implemented in agent harness with in-process cedarpy for tool-call governance; remaining: extend Cedar to TypeScript orchestrator for budget/quota resolution, migrate to Amazon Verified Permissions for runtime-configurable policies, virtual-action classification pattern for enforce/observe/steer, extended for multi-tenant authorization when multi-user/team lands), capability-based security model (tiers feed into policy framework), alternate runtime, advanced customization with tiered tool access (MCP/plugins via AgentCore Gateway), full dashboard, AI-specific WAF rules.
Iteration 6 — Agent swarm orchestration, skills learning, multi-repo, iterative feedback and multiplayer sessions, HITL approval, scheduled triggers, CDK constructs.

Design docs to keep in sync: ARCHITECTURE.md, ORCHESTRATOR.md, API_CONTRACT.md, INPUT_GATEWAY.md, REPO_ONBOARDING.md, MEMORY.md, OBSERVABILITY.md, COMPUTE.md, CONTROL_PANEL.md, SECURITY.md, EVALUATION.md.
