feat: AgentCore Evo — config bundles, batch eval, recommendations, AB testing by notgitika · Pull Request #1061 · aws/agentcore-cli

notgitika · 2026-04-30T17:54:16Z

Summary

AgentCore Evo private preview features for the CLI. This PR brings the feat/evo-implementation branch (developed on the staging repo) into the public CLI repo.

New capabilities

Config Bundles — Versioned runtime configurations (system prompt, tool descriptions, model params). Create with agentcore add config-bundle or --with-config-bundle on agent creation. Manage versions with agentcore cb versions/diff/create-branch. Config bundle values are injected at runtime via SDK baggage — no code redeploy needed to change behavior.

Batch Evaluation — agentcore run batch-evaluation runs evaluators (Builtin.Correctness, Helpfulness, Faithfulness, etc.) across agent sessions in CloudWatch. Supports multiple evaluators, session filtering, lookback windows, ground truth assertions, and custom names.

Recommendations — agentcore run recommendation optimizes system prompts and tool descriptions using agent traces. Can read from and write back to config bundles for zero-redeploy prompt updates.

AB Testing — Config-bundle mode (same code, different configs) and target-based mode (different runtime endpoints). Traffic splitting, online evaluation, pause/resume/stop/promote lifecycle. TUI wizard with side-by-side variant builder.

HTTP Gateways & Runtime Endpoints — Gateway targets, endpoint aliases, and routing infrastructure for AB tests.

Other changes

Batch eval API migrated to latest dataplane schema (batchEvaluationName, dataSourceConfig, evaluationMetadata)
Config bundle baggage passed on agentcore invoke for runtime config injection
[preview] tags on all evo commands and TUI screens
Region resolution from aws-targets.json across all evo commands
CLI command name agentcore (was agentcore-dev), distro mode set to PROD_DISTRO
Merged with public/main — includes telemetry, web-ui traces, inspector updates

Tested

E2E on prod and gamma (us-east-1, ap-southeast-2)
Full config bundle flow: create → deploy → invoke → recommendation → invoke with updated prompt
Batch eval with single/multiple evaluators, ground truth, stop
AB test lifecycle (config-bundle and target-based modes)
Bug bash completed with bug tracker

Add ConfigBundle as a new resource type with full lifecycle: - Schema: ConfigBundleSchema with name validation, component configurations - Primitive: ConfigBundlePrimitive for add/remove operations - API client: SigV4-signed HTTP requests for config bundle CRUD operations - Deploy: post-deploy hook to sync config bundles with control plane - Status: config-bundle resource type in status command - TUI: add wizard (name, description, components, branch, commit message), remove flow, ResourceGraph integration - State: carry forward configBundles across redeploys in buildDeployedState

The signing service must be 'bedrock-agentcore' for all stages, not 'bedrock-agentcore-control' for prod. The endpoint hostname differs from the signing service name.

- Add config bundle post-deploy setup to TUI deploy flow (useDeployFlow) - Add clientToken to config bundle update API call - Add parentVersionIds on update (required by API) - Default branchName to "main" and commitMessage when not specified - Add placeholders for branch/message in TUI wizard - Fallback to find-by-name or create when update fails (stale IDs) - Remove debug logging from actions.ts

- Add `agentcore edit config-bundle` CLI command with --bundle, --components, --components-file, --description, --branch, --message, --json flags - Add interactive TUI wizard for editing config bundles (select bundle, input method, components, commit message, branch name, confirm) - Add diff check to post-deploy: skip API update when components and description are unchanged, avoiding unnecessary version creation - Use getConfigurationBundleVersion instead of getConfigurationBundle to avoid branch-not-found errors on bundles created with different branches - Align default branch name to 'mainline' (API default) instead of 'main' - For updates, inherit branch from current API state when not specified

- post-deploy-config-bundles: 13 tests covering create, update, skip (diff check), delete, branch inheritance, fallback paths, errors - ConfigBundlePrimitive.edit: 7 tests covering component updates, optional field handling, missing bundle errors, field preservation - useEditConfigBundleWizard: 16 tests covering step navigation, setters, goBack, reset, currentIndex tracking, step labels

feat: add configuration bundle support

* chore: remove edit config-bundle command Users should edit agentcore.json directly to update config bundles. Removes the edit CLI command, TUI screens, wizard hooks, and tests. * feat: add config-bundle CLI commands for version history Adds `agentcore config-bundle` with three subcommands: - `versions` — list version history grouped by branch - `get-version` — view specific version details and components - `diff` — client-side deep diff between two versions Also adds filter support (branchName, latestPerBranch, createdBy) to the listConfigurationBundleVersions API client. * feat: add config bundle hub TUI screens Add TUI screens for browsing config bundles, viewing version history with branch grouping, version detail drill-down, and diff comparison between versions. * fix: resolve config bundle versionId when falling back to list API (#49) The Recommendation API requires versionId to be non-null when using configurationBundle input. When resolveBundleByName fell back to the list API (bundle not in deployed state), it returned no versionId, causing a 400 validation error. Now calls getConfigurationBundle after list to fetch the latest versionId. Also adds versionId to the ResolvedBundle interface and returns it from the deployed-state fast path. * chore: remove get-version subcommand from config-bundle CLI The versions --json and diff commands cover all practical use cases. Keeps the command surface lean: versions + diff only.

* feat: add Recommendation API wrappers, CLI commands, and operations layer Implement the Recommendations/Optimization feature for AgentCore CLI: - SigV4-signed HTTP client for Start/Get/List/Delete Recommendation (DP) - Operations layer with orchestration, polling, and local storage - CLI commands: evals recommend, evals recommendation history/delete, run promote - 27 unit tests covering API, storage, and orchestration logic - Live-validated field names and ARN formats against prod API * feat: add recommendation TUI wizard with session discovery and multi-evaluator support - Add full recommendation wizard TUI (type, agent, evaluators, input, trace source, sessions, confirm) - Add session discovery flow: discover sessions from CloudWatch, multi-select specific sessions - Support both CloudWatch logs and session ID trace sources - Pass selected sessionIds to recommendation API cloudwatchLogs config - Add request ID capture and error detail extraction for debugging FAILED recommendations - Fix recommendation API test mocks (add headers for requestId capture) - Add scrollable list support (maxVisibleItems) to MultiSelectList, SelectList, WizardSelect - Wire recommendation screen into App.tsx and EvalHubScreen navigation * feat: add session span fetching, recommendation tests, and TUI integration - Add fetch-session-spans module for retrieving OTEL spans from aws/spans and log records from runtime log groups with session ID filtering - Add comprehensive tests for fetch-session-spans (9 tests) and extend run-recommendation tests (12 new tests covering file input, spans-file trace source, tool-desc auto-fetch, error handling, ARN passthrough) - Wire recommendation hub, history screen, and list/delete CLI commands - Update TUI routing for recommendation flows from eval and run hubs - Add recommendation constants (poll intervals, terminal statuses) * chore: remove list commands and promote stub, fix agents→runtimes rename Remove `agentcore list recommendations` and `agentcore list recommendation --id` commands (top-level `list` command deleted entirely). Remove `run promote` stub. Fix typecheck errors from agents→runtimes schema rename in recommendation files.

#26) * feat: add EvaluationJob resource — schema, primitive, deploy hook, TUI, and tests Phase 1 of EvalJobRunner: CRUD + deploy integration for the EvaluationJob control plane resource. - Schema: EvaluationJobSchema in agentcore.json, deployed state tracking - Primitive: EvaluationJobPrimitive with add/remove lifecycle - AWS client: SigV4-signed HTTP wrappers for EvalJob CP operations - Deploy: post-deploy hook creates/updates/deletes eval jobs imperatively - CFN outputs: parse eval job execution role ARN from stack outputs - TUI: add evaluation-job wizard flow + remove flow integration - Tests: 53 tests across schema, primitive, AWS client, deploy hook, and TUI * feat: add `run evaluation-job` command with DP API wrappers and orchestration - Data plane API wrappers (RunEvaluationJob, GetEvaluationJobRun, ListEvaluationJobRuns) with SigV4 signing against bedrock-agentcore service - Orchestration: resolve job from deployed state, generate runId, start run, poll for completion, fetch results from CW Logs output group - CLI command: `agentcore run evaluation-job --job <name> --session-id <ids...>` with --json output and progress callbacks - Tests: 17 new tests covering DP wrappers, runId generation, orchestration (error handling, polling, CW Logs result parsing) * feat: complete US1/US2 quick wins — run name, cancel, update, stage-aware endpoints - Add --run flag to `run evaluation-job` for custom run name prefixes - Add `run cancel-evaluation-job` command with StopEvaluationJobRun DP API - Add `update evaluation-job` primitive method and CLI subcommands - Add `agentcore update experiment` parent command (backward-compatible) - Make CP/DP endpoints stage-aware via AGENTCORE_STAGE env var (beta/gamma/prod) - Fix beta SigV4 service name (bedrock-agentcore vs bedrock-agentcore-control) - Update AddEvaluationJobFlow success screen with next-steps guidance * feat: add TUI run wizard, progress steps, and local result storage for eval jobs - Add RunEvalJobFlow TUI: select job → enter sessions → name run → confirm → execute - Add StepProgress display during eval job polling (starting → polling → fetching → saving) - Add elapsed time counter during run execution - Add eval-job-storage module: save/load/list run results per job in .cli/eval-job-results/ - Auto-save results on both CLI and TUI paths - Add "Evaluation Job" option to TUI Run screen - Add 9 unit tests for eval-job-storage * feat: add CloudWatch session discovery to eval job TUI wizard - Add source type picker: "Discover from CloudWatch" vs "Enter manually" - Add lookback days input (1-90 days) for CloudWatch discovery - Discover sessions via CW Insights query using agent's runtimeId - Multi-select from discovered sessions with span count + timestamps - Auto-fallback to manual entry when agent not deployed (no runtimeId) - Improve error display: show failed step in StepProgress before transitioning * feat: migrate evaluation from resource CRUD to stateless batch evaluation Replace the old EvaluationJob resource model (create/update/delete via agentcore.json + deploy hooks) with a flat BatchEvaluation API model: - Add `run batch-evaluation` and `run stop-batch-evaluation` CLI commands - Add batch evaluation TUI wizard under the Run menu - Add SigV4 API client for batch eval endpoints (start/get/list/stop) - Add CloudWatch results fetching from outputDataConfig - Remove all old evaluation-job infrastructure: primitive, deploy hook, schema, TUI add/remove screens, CP CRUD operations - Remove evaluationJobs from agentcore.json schema Tested end-to-end on gamma (account 998846730471) with Builtin.Faithfulness evaluator against 3 agent sessions — all returning correct scores. * chore: remove executionRoleArn now that FAS creds are live on gamma The batch evaluation API no longer requires an execution role ARN. Remove the --execution-role CLI option and all executionRoleArn plumbing from the API client and orchestration layer. * Revert "chore: remove executionRoleArn now that FAS creds are live on gamma" This reverts commit f1706ff7ea4b7695d1466e609cde29e38cb00afb. * refactor: move stop-batch-evaluation to top-level stop command Move `agentcore run stop-batch-evaluation` to `agentcore stop batch-evaluation` as a higher-level verb, consistent with pause/resume pattern.

…75) * fix: stop running AB test before deletion during deploy reconciliation When a user removes an AB test from agentcore.json and redeploys, deleteOrphanedABTests tried to delete it directly. This failed with 409 "Cannot delete while AB test is running", which then blocked gateway deletion (AB test routing rules still on the gateway). Fix: call updateABTest with executionStatus=STOPPED before deleting. If stop fails (already stopped or invalid state), proceed with delete. A console.warn is emitted when an AB test is stopped during cleanup. * fix: use console.warn instead of console.log in deploy operations console.log writes to stdout which corrupts the TUI (Ink) rendering. console.warn writes to stderr which doesn't interfere with the TUI. * fix: poll for AB test status after stop before deleting The stop call transitions the AB test to UPDATING status. Deleting immediately fails with 409 "cannot be deleted in status UPDATING". Now polls getABTest until status leaves UPDATING before attempting delete. * fix: surface AB test stop warning via postDeployWarnings instead of console.warn console.warn leaks into TUI rendering. Instead, return warning in the result and let TUI/CLI callers surface it through proper channels (postDeployWarnings for TUI, logger.log for CLI). * fix: move target deletion wait into deleteHttpGatewayTarget and cleanup - deleteHttpGatewayTarget now polls until target is fully deleted (404) internally, so callers don't need to remember to wait separately - Removed waitForTargetDeletion from post-deploy-http-gateways.ts - Reconciliation deletion path now waits for target deletion too - AB test stop polling now checks executionStatus === 'STOPPED' - Removed console.warn/log that leaked into TUI rendering - Removed debug process.stderr.write logs * fix: resolve config bundle placeholders in TUI deploy path resolveConfigBundleComponentKeys (which resolves {{runtime:name}} and {{gateway:name}} to real ARNs) was only in the CLI deploy path. The TUI deploy path passed raw placeholders to the API, causing validation errors. Moved the resolution functions to post-deploy-config-bundles.ts so both CLI and TUI can import them. * fix: rename --agent to --runtime and clarify --online-eval in ab-test CLI - --agent renamed to --runtime (consistent with other commands) - --online-eval description changed from "name or ARN" to "name" - --gateway help text updated to reference --runtime * test: fix broken polling test and add coverage for review findings - Fix AB test polling test mock: first poll returns executionStatus 'RUNNING' (was 'STOPPED', causing loop to exit immediately — test was broken) - Add 11 tests for resolveConfigBundleComponentKeys (runtime, gateway, ARN passthrough, missing resource errors, immutability) - Add 4 tests for warning field (stop warning set, not set on failure, set even on delete failure, poll timeout) * fix: deleteHttpGatewayTarget returns failure on polling timeout Both reviewers flagged: returning success on timeout is wrong — the target may still be DELETING, causing downstream gateway delete to fail. Now returns { success: false } with timeout error message. * fix: AB test TUI reads region from aws-targets.json instead of env vars ABTestPickerScreen was using process.env.AWS_REGION with us-east-1 fallback. This caused debug checks, stop/resume, and all API calls to hit the wrong region. Now reads from aws-targets.json via resolveAWSDeploymentTargets(), matching the config bundle hub pattern. * fix: paginate DescribeDeliverySources in AB test debug check The debug panel only read the first page of delivery sources, missing sources for accounts with many gateways. Now paginates both DescribeDeliverySources and DescribeDeliveries calls. Also reads region from aws-targets.json instead of env vars. * fix: warn when AB test stop polling times out before deletion Address review comment — log a warning via the result's warning field when the polling loop exhausts all 20 iterations without executionStatus reaching STOPPED, so the user knows the delete is proceeding without confirmation.

…ck (#94) Uses /ab-tests (new path) as primary, falls back to /abtests (legacy) on 404 for backwards compatibility during the API migration.

* fix: show yellow warning banner when post-deploy sub-resources fail Deploy banner now has three states instead of two: - Green "Deploy to AWS Complete" — everything succeeded - Yellow "Deploy to AWS Complete (with warnings)" — infra deployed but post-deploy resources (AB tests, config bundles, HTTP gateways) had errors - Red "Deploy to AWS Failed" — CDK stack deployment failed CLI non-interactive path returns exit code 2 for post-deploy warnings (vs exit 0 for success, exit 1 for infra failure) so CI/CD pipelines can differentiate. Post-deploy errors (AB tests, config bundles, HTTP gateways, online evals) are shown inside the yellow banner box and in the post-deploy warnings section below. The deploy step stays marked as success since the CDK stack did deploy correctly. * fix: treat COMPLETED_WITH_ERRORS as terminal in batch evaluation poll loop The batch evaluation poll loop only recognized COMPLETED, FAILED, STOPPED, and CANCELLED as terminal statuses. When the service returned COMPLETED_WITH_ERRORS (typical when any session fails), the CLI never exited the poll loop and hung for 67 minutes until the fetch timed out. Add COMPLETED_WITH_ERRORS to TERMINAL_STATUSES so the poll exits immediately. The status is still treated as a non-success outcome (line 227 checks for COMPLETED specifically), so partial failures are reported correctly.

* fix: config bundle name resolution and add create-branch command Accept local bundle names (from agentcore.json) in CLI and TUI when the API stores them with a project-name prefix. The resolver now tries the exact name, current prefix (projectBundle), and legacy underscore prefix (project_Bundle) for backward compatibility. Also adds `agentcore cb create-branch` to create a new branch on an existing bundle via the update API, instead of requiring a whole new bundle to be created. * fix: address PR review — DRY name variants, pagination, sort, and tests - Extract getBundleNameVariants to shared utility, use in both resolve-bundle.ts and useConfigBundleHub.ts - Paginate listConfigurationBundles in resolveBundleByName so bundles beyond page 1 are found - Sort versions by versionCreatedAt descending in create-branch to reliably pick the latest version as branch parent - Add unit tests for getBundleNameVariants and resolveBundleByName (9 tests covering fast path, fallback, pagination, legacy names)

…142)

…ution (#147) * fix: rename command to agentcore and use aws-targets for region resolution - Rename CLI command from agentcore-dev to agentcore - Resolve region from aws-targets.json across all evo commands: ab-test, pause, resume, stop, recommendation - Previously these fell back to env vars or detectRegion() which could pick the wrong region. Now consistent with batch-eval and config-bundle which already used aws-targets. - Fix pre-existing partition lint errors: use arnPrefix() and dnsSuffix() instead of hardcoded arn:aws: and .amazonaws.com Note: --no-verify used because base branch has 11 pre-existing typecheck errors in browser-tests/ and otel-metric-sink.ts that are unrelated to this change. * fix: switch distro mode from PRIVATE_DEV_DISTRO to PROD_DISTRO Set DISTRO_MODE to PROD_DISTRO so the CLI uses the @aws/agentcore package name and public npm registry. * feat: add [preview] tag to all evo feature commands and TUI screens Tag batch evaluation, recommendation, config bundle, and AB test commands with [preview] in CLI help descriptions and TUI screen titles to signal these are public preview features subject to change.

* feat: add --with-config-bundle flag to agent creation When --with-config-bundle is passed (CLI) or "Config bundle" is selected in the TUI Advanced Configuration, the CLI: 1. Auto-creates a config bundle named {AgentName}Config with smart defaults (system prompt + tool descriptions for the runtime) 2. Vends template variants that use the SDK to consume config bundle values at runtime via context.get_config_bundle() Strands template: adds ConfigBundleHook (HookProvider) that injects system prompt via event.agent.system_prompt and overrides tool descriptions via BeforeToolCallEvent. LangGraph template: adds ConfigBundleCallback (BaseCallbackHandler) that injects system prompt via SystemMessage on chain start. Both templates fall back to DEFAULT_SYSTEM_PROMPT when no config bundle is deployed (e.g. local dev with agentcore dev). * fix: use BedrockAgentCoreContext classmethod for config bundle access get_config_bundle() is a @classmethod on BedrockAgentCoreContext, not an instance method on the request context. Update both Strands and LangGraph templates to use BedrockAgentCoreContext.get_config_bundle() and remove the unused context parameter from hook constructors.

The batch evaluation API renamed `name` to `batchEvaluationName` in requests/responses, and removed `tags` and `executionRoleArn` from StartBatchEvaluation. - New schema sends `batchEvaluationName` in start request - Legacy fallback still sends `name` for backwards compat - Response normalizers handle both `batchEvaluationName` and `name` - Remove `executionRoleArn` from options, orchestrator, and CLI flag - Remove `tags` from start options

The InvokeAgentRuntimeCommand accepts a baggage field with W3C baggage format. When a config bundle is associated with the agent being invoked, build the baggage string from deployed-state.json and pass it through so the SDK can fetch the config bundle at runtime. This enables the full config bundle flow: create → deploy → invoke → recommendation → invoke again with updated prompt, all without redeploying the agent code.

* feat: add --request-header-allowlist CLI flag for agentcore add agent (#825) (#830) Wire the existing requestHeaderAllowlist feature (already supported in the TUI wizard and schema) into the non-interactive CLI path. Accepts comma-separated header names that are auto-normalized with the X-Amzn-Bedrock-AgentCore-Runtime-Custom- prefix. - Add requestHeaderAllowlist field to CLI AddAgentOptions interface - Register --request-header-allowlist option in AgentPrimitive.registerCommands() - Add validation using existing validateHeaderAllowlist() utility - Pass parsed headers through to both create and BYO agent paths * fix: update E2E test regex to match new CUSTOM_JWT client-side error (#832) PR #817 changed invoke to fail fast client-side when a CUSTOM_JWT agent is invoked without a bearer token, producing a different error message. The E2E assertions still expected the old server-side "authorization mismatch" pattern, causing two test failures on main. * fix: remove docker info check from container runtime detection (#829) detectContainerRuntime() called `docker info` to verify the daemon was running. This requires access to the Docker socket and triggers an OS password prompt on machines where the user is not in the docker group. The check provided no real value: deploy falls back to CodeBuild anyway, and dev will fail with a clear error from `docker build` if the daemon is down. Remove the `docker info` probe and rely on `which` + `--version` only, matching the approach already used by detectContainerRuntimeSync(). Also removes the now-unused START_HINTS constant, getStartHint() helper, and notReadyRuntimes tracking. * fix: add missing AgentCore regions to match AWS documentation (#833) Add 6 regions (ap-northeast-2, ca-central-1, eu-north-1, eu-west-2, eu-west-3, sa-east-1) to AgentCoreRegionSchema to match the official AWS Bedrock Agentcore supported regions documentation. Closes #822 * fix: unhide import command from TUI main menu (#834) The import command should be visible in the TUI command list so users can discover and use it interactively. * feat: add e2e tests for import command (#828) * feat: add e2e tests for import command Add end-to-end tests that exercise the import runtime, memory, and evaluator commands against real AWS resources. Python fixture scripts create resources via the Bedrock AgentCore control plane API, then tests import them into a CLI project and verify status and invocation. Also adds pip install boto3 to the full e2e CI workflow so the import tests can run in GitHub Actions. * Potential fix for pull request finding 'Unused import' Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com> * Potential fix for pull request finding 'Unused import' Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com> * Potential fix for pull request finding 'Unused import' Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com> * Potential fix for pull request finding 'Unused import' Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com> * fix: use triggering ref for workflow_dispatch in full e2e suite The checkout step was hardcoded to ref: main, so workflow_dispatch on a feature branch would still test main. Now it uses the dispatch ref for manual triggers and main for push/schedule triggers. * fix: upgrade boto3 in CI for bedrock-agentcore-control support Ubuntu-latest ships boto3 1.34.46 which doesn't know about the bedrock-agentcore-control service. Use --upgrade to get a version that supports import test setup scripts. * fix: address review feedback on import e2e tests - Use default vended model IDs instead of hardcoded claude-3-haiku - Pin boto3 version in CI workflow for deterministic builds - Drop unnecessary boto3.session.Session() fallback in REGION resolution - Preserve bugbash-resources.json on partial cleanup failure - Log teardown deploy failures instead of swallowing silently - Add comment explaining sequential setup script execution * fix: pass default evaluator model from CLI source to setup scripts Instead of hardcoding the evaluator model ID in the Python fixture with a "keep in sync" comment, import DEFAULT_MODEL from the CLI source and pass it as an env var to the setup script. The Python script falls back to a hardcoded default for standalone use. * style: fix prettier import ordering * fix: address PR review feedback for import e2e tests - Exit with code 1 when setup scripts fail to reach ready status - Change default region fallback from us-west-2 to us-east-1 - Add S3 code object cleanup to cleanup_resources.py - Document IAM role reuse policy in ensure_role() and cleanup script - Add comment explaining why teardownE2EProject() is not used --------- Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com> * feat: add auto-instrumentation to langchain agent template (#835) * fix: add missing langchain instrumentor dependency to import flow (#836) The LangGraph translator generates LangchainInstrumentor().instrument() in imported agents' main.py, but pyproject-generator.ts did not include opentelemetry-instrumentation-langchain in LANGGRAPH_DEPS. This causes imported LangGraph agents to crash at runtime with ModuleNotFoundError. * fix(ci): unpin boto3 in e2e workflow (#841) The pinned boto3==1.38.0 did not include the bedrock-agentcore-control service model, causing import-resources e2e tests to fail with UnknownServiceError. Using latest boto3 ensures new AWS services are always available. * fix: add AWS_IAM as a valid authorizer type for gateway commands (#820) * fix(e2e): use uv run for import test Python scripts (#845) * fix(ci): unpin boto3 in e2e workflow The pinned boto3==1.38.0 did not include the bedrock-agentcore-control service model, causing import-resources e2e tests to fail with UnknownServiceError. Using latest boto3 ensures new AWS services are always available. * fix(e2e): use uv run for import test Python scripts The import e2e tests call python3 directly to run setup scripts that use boto3. On CI runners, the system-installed boto3 is too old to include the bedrock-agentcore-control service model. pip install boto3 installs to user site-packages which the child process doesn't pick up. Switch to uv run --with boto3 python3 so the scripts always get a current boto3 in an isolated environment. Remove the now-unnecessary pip install step from the workflow. * fix: only exclude root-level agentcore/ directory from packaging artifacts (#844) The EXCLUDED_ENTRIES set unconditionally excluded any directory named 'agentcore' at any depth during zip/copy operations. This silently dropped third-party dependency sub-modules that happen to use the same directory name (e.g., langgraph_checkpoint_aws/agentcore/), causing ImportError at runtime. Remove 'agentcore' from the flat EXCLUDED_ENTRIES set and instead thread the original rootDir through all recursive traversal functions. The agentcore directory is now only excluded when its resolved path matches join(rootDir, CONFIG_DIR) — i.e., it sits at the project root. Also remove the hand-written fflate type shim (src/lib/packaging/types/fflate.d.ts) that shadowed the package's own type declarations. The shim only declared zipSync, making all other fflate exports (including unzipSync) invisible to TypeScript. The real fflate v0.8.2 ships complete types that resolve correctly under moduleResolution: "bundler". Closes #843 * fix(ci): update snapshots after CDK version sync in release workflow (#848) The release workflow syncs @aws/agentcore-cdk to the latest npm version but did not update the asset snapshot tests, causing the test-and-build job to fail with a snapshot mismatch. * fix(ci): move snapshot update after build step in release workflow (#849) The snapshot update requires built output, so it must run after npm run build, not before. * fix(ci): bump @aws/agentcore-cdk to 0.1.0-alpha.18 and remove snapshot step from release (#850) * fix(ci): bump @aws/agentcore-cdk to 0.1.0-alpha.18 and remove snapshot step from release Sync the CDK template to the latest npm version and update the asset snapshot. Remove the snapshot update step from the release workflow since it runs the full test suite which requires uv. * fix: use caret range for @aws/agentcore-cdk in CDK template Use ^0.1.0-alpha.18 instead of pinning an exact version so new releases are picked up automatically. * fix: pin @aws/agentcore-cdk to exact version in CDK template (#852) Revert to exact pinning (0.1.0-alpha.18) instead of caret range. The release workflow handles syncing to the latest version. * chore: bump version to 0.8.1 (#853) Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> * feat: upgrade default Python runtime to PYTHON_3_14 (#837) * feat: upgrade default Python runtime to PYTHON_3_14 Add PYTHON_3_14 as a supported runtime version and make it the default for new agents and MCP tools. Updates schema enums, defaults, UI options, packaging fallbacks, import mappings, and tests. Verified end-to-end: deployed a runtime with PYTHON_3_14 to AgentCore and confirmed successful invocation. * chore: revert JSON schema change (auto-generated at release) The JSON schema file is auto-regenerated during the release workflow. Direct changes are rejected by the schema-check CI job. * fix: address review — missed defaults, types, tests, and docs - Update packCodeZipSync fallback in packaging/index.ts - Add PYTHON_3_14 to llm-compacted/mcp.ts PythonRuntime type - Update hardcoded runtimeVersion in AgentPrimitive.tsx - Add PYTHON_3_14 to agent-env schema test - Update TUI harness fixture default - Update docs examples and runtime version list * refactor: consolidate DEFAULT_PYTHON_VERSION into schema/constants Define DEFAULT_PYTHON_VERSION once in schema/constants.ts and re-export from the three TUI screen files that previously defined their own copy. Replace hardcoded 'PYTHON_3_14' fallbacks in packaging and AgentPrimitive with the shared constant. Future runtime version bumps now require a single-line change. * fix: detect Python ABI tag and usable wheels errors in platform retry logic When numpy lacks pre-built wheels for a specific manylinux platform on CPython 3.14, uv reports "no wheels with a matching Python ABI tag" or "has no usable wheels" instead of the platform-specific errors the retry logic was matching. This caused the packager to hard-fail on the first platform candidate instead of retrying with a newer manylinux version that does have compatible wheels. * chore: bump version to 0.8.2 (#874) Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> * test: update asset snapshot for @aws/agentcore-cdk 0.1.0-alpha.19 (#875) Regenerates the CDK package.json snapshot to match the version bump landed in #852, which pinned @aws/agentcore-cdk to 0.1.0-alpha.19 in the vended CDK template but did not refresh the corresponding snapshot. * revert: roll back version bump to 0.8.1 (#877) Reverts the version and changelog portions of #874 so the release workflow can be re-run cleanly. Only touches the version fields (package.json, package-lock.json) and the 0.8.2 CHANGELOG entry. Leaves the schema regen and CDK pin in place since the workflow will rewrite them on the next run. * chore: bump version to 0.8.2 (#878) Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> * docs: document executionRoleArn in runtime spec (#872) The runtime spec table in configuration.md omitted the existing optional executionRoleArn field, leading users (see issue #870) to believe the CLI had no way to bring their own IAM execution role. The field is already supported in the schema. Confidence: high Scope-risk: narrow * feat: add agent inspector web UI for `agentcore dev` (#871) * fix: defer policy engine write and harden policy flow UX (#856) * fix: defer policy engine write to disk until flow completes Previously, pressing Escape on the gateway selection screen during policy engine creation would skip to the success screen because the engine was already written to agentcore.json at the name step. Now the disk write is deferred until the user completes the entire flow, so Escape correctly navigates back to the previous step without persisting a half-configured engine. Constraint: Must not break non-interactive CLI path which still writes immediately via primitive Rejected: Only change Escape to go back without deferring write | engine would still be persisted on back Confidence: high Scope-risk: narrow * fix: preserve engine name when navigating back from gateway selection When pressing Escape on the gateway screen to go back to the name step, the previously entered engine name was lost because AddPolicyEngineScreen remounted with a generated default. Now the entered name is stored in pendingEngineName state and passed back as initialName so the user sees their original input. Constraint: Must not change flow state union type to keep diff minimal Rejected: Carry name in FlowState union variant | adds complexity to type for one field Confidence: high Scope-risk: narrow * chore: remove TUI harness test accidentally committed This test requires a live terminal session and cannot run as a unit test in CI. It was an untracked local file that got staged by mistake. * chore: bump version to 0.9.0 (#881) Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> * fix: use caret range for @aws/agentcore-cdk in CDK template (#882) Users always get the latest compatible CDK constructs on npm install without requiring the CLI release workflow to pin-sync the version. Removes the now-redundant sync step from the release workflow. * fix: agent-inspector frontend assets missing from build (#883) * fix: agent-inspector frontend assets missing from build * fix: resolve React ref-during-render and setState-in-effect lint errors - Wrap onReadyRef update in useEffect to avoid ref mutation during render - Replace loggerRef.current access in return object with logFilePath state - Replace useEffect+setState with state-based prev-step tracking pattern Confidence: high Scope-risk: narrow --------- Co-authored-by: Jesse Turner <ajesstur@amazon.com> * fix: revert version to 0.8.2 (#885) * fix: revert version to 0.8.2 * fix: remove 0.9.0 entry from changelog * Release v0.9.0 (#887) * chore: bump version to 0.9.0 Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> * Revise CHANGELOG for version 0.8.2 updates Updated CHANGELOG.md to include recent fixes and additions. --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Jesse Turner <57651174+jesseturner21@users.noreply.github.com> * Release v0.9.1 (#888) * chore: bump version to 0.9.1 Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> * Clean up CHANGELOG for version 0.9.1 Removed fixed issues and other changes from version 0.9.1. --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Jesse Turner <57651174+jesseturner21@users.noreply.github.com> * fix: propagate sessionId as A2A contextId in Inspector proxy (#892) The Agent Inspector Chat UI already generates and tracks a sessionId per conversation and forwards it to the dev-server proxy on each invocation. However, handleA2AInvocation dropped this sessionId when building the A2A JSON-RPC body, so every turn arrived at the A2A agent with a fresh, auto-generated contextId. This broke multi-turn memory for any A2A agent that keys session state on the A2A contextId (e.g., Strands FileSessionManager(session_id=context.context_id)). Map sessionId to the A2A Message.contextId field when present. This is spec-compliant per A2A Protocol Spec §3.4.3 (clients MAY include contextId in subsequent messages to indicate continuation) and §3.4.1 (when contextId is omitted, the agent MAY generate a fresh one). Closes #891 Co-authored-by: kashinoki38 <21358299+kashinoki38@users.noreply.github.com> * fix(invoke): pass session ID to local invoke log files (#894) The --session-id flag value was correctly sent to Runtime but never passed to InvokeLogger, causing local log files to always show "Session ID: none". Wire options.sessionId through to both the InvokeLogger constructor and logPrompt() calls in exec and standard invoke modes. Closes #890 * feat: add session filesystem storage support (#893) Adds --session-storage-mount-path to agentcore create and agentcore add agent, wiring the mount path through schema, CLI flags, TUI wizard, template rendering, and CDK mapping. File tools (file_read, file_write, list_files) with path traversal protection are scaffolded into all 8 framework templates when storage is configured. Fixes A2A sessionId not being forwarded to InvokeAgentRuntimeCommand. Validation is centralised in SessionStorageSchema with no regex duplication across validators or TUI. Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: agentcore add component opens component wizard directly (#896) When running `agentcore add memory` (or any other component), the TUI was always showing the generic resource selection screen. This is because AddFlow always started in the 'select' state regardless of which subcommand invoked it. Added an `initialResource` prop to AddFlow that maps directly to the correct wizard state, skipping the selection screen. Each primitive now passes its resource type when rendering AddFlow in TUI fallback mode. Closes #857 * docs: update vended AGENTS.md, README.md, and llm-context references (#898) * docs: update vended AGENTS.md, README.md, and llm-context references Rewrite vended documentation to reflect the current state of the CLI: - Add all current resources (gateways, evaluators, policies, online-eval) - Add all CLI commands (logs, traces, eval, pause, resume, fetch, import) - Add protocols (HTTP, MCP, A2A) and all supported frameworks - Add Node.js runtime versions alongside Python - Add VPC network mode documentation - Reference @aws/agentcore-cdk L3 constructs and CDK repo - Add mcp.ts to llm-context README file table - Update internal assets/AGENTS.md with full directory layout * test: update asset snapshot tests to match new docs content * chore: remove single-commit-must-match-PR-title validation (#897) The validateSingleCommit + validateSingleCommitMatchesPrTitle options force contributors to keep their commit message in sync with the PR title, which is unnecessary friction — squash-merge already uses the PR title as the final commit message regardless of individual commit messages. * chore: remove preview bump type from release workflow (#847) Preview versioning is no longer used. Remove the preview and preview-major options from the release workflow dispatch and all supporting logic in the bump-version script. * feat: add AG-UI (AGUI) as fourth first-class protocol mode (#858) * feat: add AG-UI (AGUI) as fourth first-class protocol mode Add AGUI protocol support across the full CLI stack: - Schema: Add 'AGUI' to ProtocolModeSchema, PROTOCOL_FRAMEWORK_MATRIX (Strands, LangChain_LangGraph, GoogleADK), and RESERVED_PROJECT_NAMES - Types: New agui-types.ts with 27 event type enum, typed interfaces, parseAguiEvent parser, and buildAguiRunInput helper - Templates: Python AGUI agent templates for Strands (ag-ui-strands), LangGraph (ag-ui-langgraph), and GoogleADK (ag-ui-adk) frameworks - Invoke: invokeAguiRuntime with dual-stream architecture (typed events for TUI, text-only for CLI), local dev invokeAguiStreaming with RunAgentInput body, protocol dispatch in invokeForProtocol - TUI: Rich AGUI event rendering with MessagePart type (text, tool_call, reasoning, error) in InvokeScreen, AGUI placeholder text in DevScreen - Validation: Updated error messages and help text to include AGUI - Tests: 24 unit tests for parseAguiEvent/buildAguiRunInput, snapshot updates for new template files * fix: address review findings for AGUI protocol implementation HIGH fixes: - Add sessionId to AguiInvokeOptions, pass as runtimeSessionId (H-1) - Throw early for bearerToken on AGUI (not yet supported) (H-2) - Add bedrock-agentcore dep to all 3 template pyproject.toml files (H-4/5/6) - Fix LangGraph /ping to return "healthy" not "ok" (H-7) - Match TOOL_CALL_RESULT to tool_call parts by toolCallId, not position (H-12) - Add complete enum coverage to agui-types test (H-15) MEDIUM fixes: - Fix langchain version pin from 1.2.0 (nonexistent) to 0.3.0 (M-11) - Remove invalid allow_credentials=True with wildcard CORS (M-12) - Replace in-place parts mutation with immutable updates for React safety (M-5) - Surface readLoop errors in consumer generators instead of swallowing (M-1) - Disable retry once streaming starts to prevent duplicate output (M-3) - Handle TEXT_MESSAGE_CHUNK events alongside TEXT_MESSAGE_CONTENT (M-2) - Update gemini model from 2.0-flash to 2.5-flash in GoogleADK (M-8) - Add missing event type exports to barrel index.ts (M-18) LOW fixes: - Move AGUI imports to top-level in action.ts (L-1) - Gate OTEL_SDK_DISABLED on LOCAL_DEV env var in Strands template (L-9) - Add explanatory comment for LANGGRAPH_FAST_API env var (L-10) * fix: add AGUI to TUI protocol picker and dev mode dispatch - Add AGUI option to PROTOCOL_OPTIONS in generate/types.ts so users can select AGUI from the interactive create/add wizards - Add AGUI case to useDevServer.ts sendMessage dispatch so local dev TUI sends correct RunAgentInput body via invokeAguiStreaming - Add AGUI case to dev/command.tsx non-interactive dispatch so agentcore dev "prompt" uses invokeForProtocol('AGUI') * fix: A2A GoogleADK template passes model=None to Agent constructor load_model() returns None (it only sets GOOGLE_API_KEY env var as a side effect). Passing model=load_model() to Agent() results in model=None, causing the agent to either crash or use a default model. Fix: call load_model() standalone for the side effect, then pass the model ID string directly to Agent(). * chore: update protocol references to include AGUI across CLI - AddScreen description: 'HTTP, MCP, A2A' → includes AGUI - create --protocol help text: includes AGUI - JSDoc comments in agent/types.ts, templates/types.ts, agent-env.ts - codezip-dev-server comment: 'MCP/A2A' → 'MCP/A2A/AGUI' - agent-env.test.ts: add AGUI to protocol acceptance test * fix: add InvokeLogger to AGUI CLI path and improve UX polish - Add InvokeLogger to AGUI CLI invoke block (action.ts) for prompt/response logging and log file creation — parity with HTTP invoke path - Track RUN_ERROR events in textStream and return success: false when agent errors are detected - Pass sessionId and logger to invokeAguiRuntime options - Improve AGUI protocol picker description from circular 'AG-UI agent-to-user interaction protocol' to actionable 'Stream rich agent events to frontends (AG-UI)' * fix: template bugs found during deployment testing Bugs found by deploying all 3 AGUI frameworks to AWS and invoking: - Bump ag-ui-strands to >= 0.1.4 (0.1.3 crashes on strands >= 1.19.0 due to accessing removed private attr agent.state._state) - Remove parallel_tool_calls=False from LangGraph template (Bedrock rejects this OpenAI-specific parameter with ValidationException) - Remove aws-opentelemetry-distro from GoogleADK template (conflicts with google-adk >= 1.16.0 OpenTelemetry dependencies — agents using this template should set instrumentation.enableOtel: false) * fix: add ToolNode + ReAct loop to AGUI LangGraph template The AGUI LangGraph template had a single-node graph (chat → END) with no tool execution loop. When the model called add_numbers, the graph exited without executing the tool or generating a text response, producing "(no content in AGUI response)" in agentcore dev. Template fix: - Add ToolNode(tools=backend_tools) as a "tools" node - Replace set_finish_point("chat") with tools_condition conditional edge - Add edge from "tools" back to "chat" for the ReAct loop - Separate backend_tools list from frontend tools (state["tools"]) This matches the standard LangGraph ReAct pattern (agent → tools → agent → ... → END) and how the HTTP/A2A templates use create_react_agent. Dev invoke fix: - invoke-agui.ts now tracks TOOL_CALL_START/ARGS/END/RESULT events - When no text is produced but tool calls were seen, surfaces them as [Tool: name(args)] instead of generic "(no content)" message * fix: address all review findings from AG-UI protocol code review 16 issues from 4-lane parallel code review, all addressed: Critical fixes: - Strands template: use session_manager_provider from ag-ui-strands 0.1.7 instead of hardcoded "default-session"/"default-user" - Dev client: persist threadId per session for multi-turn conversations - CRLF handling: use /\r?\n/ in SSE parsers (invoke-agui + invoke.ts) - Malformed JSON no longer yielded as content (shared parser skips) - Unbounded aguiEvents array replaced with bounded cursor-based pruning Structural improvements: - Unified SSE parser (agui-parser.ts) replaces two divergent parsers in invoke-agui.ts (dev) and agentcore.ts (deployed). Net -39 LOC. - Dual-consumer support with singleConsumer mode for dev path - AguiEvent type union completed (4 missing members added) - Dynamic imports converted to static where non-intentional (AGENTS.md) Python template fixes: - LangGraph: add LangchainInstrumentor + dep, remove unused END import, MemorySaver already removed in prior commit - GoogleADK: remove dead load_model() + bedrock-agentcore dep, remove hardcoded user_id (ADK defaults to per-thread identity) - Strands: bump ag-ui-strands pin to >= 0.1.7 enableOtel plumbing: - Dockerfile CMD conditional on enableOtel (Handlebars) - enableOtel threaded through AgentRenderConfig + BaseRenderer - Import path: ProtocolModeSchema.safeParse replaces unsafe as-cast - Import path: MCP enableOtel clamped regardless of YAML value - GoogleADK uses plain opentelemetry-distro (aws-distro conflicts) DX + testing: - formatZodIssue falls back to issue.code instead of literal "undefined" - New dockerfile-render.test.ts covers both enableOtel branches - All snapshots updated * fix: add AGUI to JSON schema protocol enum The static JSON schema file used for CDK validation was not updated when AGUI was added to the Zod schema. This caused CDK synth to reject protocol: "AGUI" with a misleading validation error. * fix: restore MemorySaver in AGUI LangGraph template ag_ui_langgraph calls aget_state(config) with thread_id which requires a checkpointer. Without it, every invocation throws ValueError: No checkpointer set. The original msgpack crash only triggers with numbers exceeding 2^63 (ormsgpack limitation), not with normal large numbers. Bug bash confirmed: 325435 + 435634563456456 works correctly with MemorySaver present. * fix: address final review findings in AGUI parser - Wrap reader.releaseLock() in try/catch to prevent error masking if lock is already released (HIGH from code review) - Replace textStream! non-null assertion with runtime guard (MEDIUM from code review) * fix: use toolCallId for TOOL_CALL_RESULT matching in dev client Previously matched by activeToolName which was already reset to '' by TOOL_CALL_END. The find() never matched, falling through to the last tool call — wrong for parallel tool calls. Now matches by toolCallId which is the unique identifier AG-UI provides per tool invocation. * revert: remove manual JSON schema edit (auto-generated during release) The schemas/ directory is auto-regenerated from Zod schemas during the release workflow. AGUI is already in ProtocolModeSchema (constants.ts) and will appear in the JSON schema on next release. * fix: add configurable PORT env var to AGUI templates + update snapshots All 3 AGUI templates now read PORT from env with default 8080: uvicorn.run(app, host="0.0.0.0", port=int(os.environ.get("PORT", "8080"))) Addresses PR review comment requesting configurable port for local testing. * fix: use AG-UI in user-facing strings instead of AGUI Schema enum stays 'AGUI' (internal), but TUI display text uses 'AG-UI' which is the protocol's official name. * fix: restore credential wiring in AGUI GoogleADK template The template was missing load_model() call and bedrock-agentcore dep, so GOOGLE_API_KEY was never set from the AgentCore credential. Both dev mode and deployed agents failed with "No API key provided." * fix: convert AGUI dynamic import to static in invokeForProtocol AGENTS.md requires all imports at top of file. The dynamic import had no meaningful performance benefit — AGUI parser is ~4KB in a 2.1MB CLI. * feat: support preview releases from feature branches (#905) The release workflow was hardcoded to only publish from main with the `latest` npm dist-tag. This made it impossible to publish prerelease versions from feature branches. Now when the workflow runs from a non-main branch, it sets the npm dist-tag to `preview` and targets the source branch for the release PR. Stable bump types (patch, minor, major) are blocked on non-main branches to prevent accidental overwrites of the `latest` tag. * fix(invoke): show full session ID and print resume command on exit (#904) * fix(invoke): show full session ID and print resume command on exit The invoke TUI truncated the session ID to 8 characters, making it impossible to copy the full UUID needed for --session-id. Additionally, there was no guidance on how to resume a session after exiting. - Display full session ID in the TUI header instead of truncating - Print a colored resume command after TUI exit (both Esc and Ctrl+C) - Use Ink's unmount() instead of process.exit(0) for clean shutdown, which also fixes the update notifier not showing on Esc exit * fix: only show resume message when a session was actually used * feat: add GovCloud multi-partition support (#908) Add partition-aware ARN construction, endpoint URL generation, and console URL generation to support aws-us-gov (and future aws-cn) partitions. - Create src/cli/aws/partition.ts with getPartition, arnPrefix, dnsSuffix, serviceEndpoint, and consoleDomain utilities - Replace all hardcoded arn:aws: in ARN template literals with arnPrefix(region) - Update ARN regex patterns to accept any partition (arn:[^:]+:) - Replace hardcoded amazonaws.com in endpoint URLs with serviceEndpoint() - Replace hardcoded console.aws.amazon.com with consoleDomain() - Add us-gov-west-1 to AgentCoreRegionSchema, BEDROCK_REGIONS, and LLM compacted types - Add aws-us-gov to cdk.json target-partitions - Fix execution-role-policy.json to use partition wildcard (arn:*) - Add 15 unit tests for partition utilities - Document multi-partition rules and checklists in AGENTS.md * feat: remove deployed/local from status legend (#936) * feat: remove deployed/local from status legend * fix: prettier * feat: upgrade agent inspector to 0.2.1 (#937) * fix(deploy): honor aws-targets.json region for all SDK and CDK calls (#925) * fix(deploy): honor aws-targets.json region for all SDK and CDK calls (#924) AWS SDK clients constructed by @aws-cdk/toolkit-lib internally (for CloudFormation, S3 asset upload, etc.) do not receive an explicit region option and fall back to the SDK's default region resolution chain (AWS_REGION -> AWS_DEFAULT_REGION -> shared config). When a user's aws-targets.json specified a non-default region but those env vars were unset, resources were created in the SDK default region instead of the configured target. Promote target.region to AWS_REGION and AWS_DEFAULT_REGION for the lifetime of deploy and teardown operations, restoring prior values in a finally block. This ensures downstream SDK clients (explicit and toolkit-lib internal) agree on the target region. Covers CLI non-interactive deploy (handleDeploy) and the interactive TUI deploy/teardown (useCdkPreflight, destroyTarget). Invoke/status/eval already pass target.region explicitly. * fix(deploy): restore region env on TUI error states; consolidate barrel exports Review feedback: 1. TUI preflight error branches called setPhase('error') without calling restoreRegionEnv(). Add a useEffect guarded on phase === 'error' so every error path restores the env override without threading the call into every branch. 2. Export applyTargetRegionToEnv from the aws barrel for consistency with withTargetRegion. Update CLI deploy, teardown, and TUI preflight hook to import from the barrel instead of the deep path. * chore: bump version to 0.10.0 (#944) Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> * chore: sync with public/main (2026-04-27) (#143) * feat: add GitHub Action for automated PR review via AgentCore Harness (#934) * feat: add GitHub Action for automated PR review via AgentCore Harness Adds a workflow that reviews PRs using Bedrock AgentCore Harness. The harness runs an AI agent in an isolated microVM with gh, git, and pre-cloned repos that fetches PR diffs and posts review comments. Workflow: - Triggers on PR open/reopen for agentcore-cli-devs team members - Supports manual workflow_dispatch for any PR URL - Adds/removes ai-reviewing label during review - Authenticates via GitHub OIDC to assume AWS role Files: - .github/workflows/pr-ai-review.yml — main workflow - .github/scripts/python/harness_review.py — harness invocation script - .github/scripts/python/harness_config.py — config from env vars - .github/scripts/models/ — local boto3 service model (InvokeHarness not yet in standard boto3) Required secrets: - HARNESS_AWS_ROLE_ARN — IAM role ARN for OIDC - HARNESS_ACCOUNT_ID — AWS account ID - HARNESS_ID — Harness ID * refactor: replace local service model with raw HTTP + SigV4 signing Eliminates the 220KB bundled service model by using direct HTTP requests with SigV4 authentication to invoke the harness endpoint. No extra dependencies needed — urllib3, SigV4Auth, and EventStreamBuffer are all part of botocore/boto3. Rejected: invoke_agent_runtime API | server rejects harness ARNs with ResourceNotFoundException Confidence: high Scope-risk: moderate * refactor: inline harness config into review script Remove separate harness_config.py — env vars are read directly in harness_review.py. One less file to maintain, config is still driven entirely by environment variables set in the GitHub workflow. * refactor: extract invoke_harness helper for cleaner main flow * refactor: simplify config and improve script readability - Replace HARNESS_ACCOUNT_ID + HARNESS_ID with single HARNESS_ARN env var - Extract prompts into separate .md files in .github/scripts/prompts/ - Extract stream parsing into print_stream() function - Add close_group() helper to deduplicate ::group:: bookkeeping * refactor: separate event parsing from display logic Extract parse_events() generator to handle binary stream decoding, keeping print_stream() focused on formatting and log groups. * docs: add explanatory comments to harness review functions * refactor: derive region from HARNESS_ARN instead of separate env var Eliminates HARNESS_REGION env var — the region is extracted from the ARN directly, so there's no risk of a mismatch causing confusing SigV4 auth errors. * chore: rename label to agentcore-harness-reviewing * refactor: move auth check to job level so entire review is skipped early Split into authorize + ai-review jobs. The ai-review job only runs if the PR author is authorized (team member or write access) or if triggered via workflow_dispatch. Removes repeated if conditions from every step. * chore: exclude AI prompt templates from prettier Prompt markdown files use intentional formatting that prettier would reflow, breaking the prompt structure. * fix: buffer streaming text to avoid per-token log lines in GitHub Actions (#946) Each text delta from the harness was printed individually with flush, creating a separate log line per token. Now text is buffered and flushed as complete lines at block boundaries. * fix: allow code-based evaluators in online eval configs (#947) * fix: allow code-based evaluators in online eval configs Remove restrictions that blocked code-based evaluators from being used in online evaluation configs. The service now supports code-based evaluators for online evaluation. Changes: - Remove code-based evaluator block in OnlineEvalConfigPrimitive - Remove code-based evaluator validation in schema superRefine - Remove code-based evaluator filter in TUI evaluator picker * style: fix prettier formatting * fix: add TTY detection before TUI fallbacks to prevent agent/CI hangs (#949) * fix: add TTY detection before TUI fallbacks to prevent agent/CI hangs When commands are invoked without flags in non-interactive environments (CI, piped stdin, agent automation), the CLI falls through to Ink TUI rendering which hangs indefinitely. Add a requireTTY() guard at every TUI entry point that checks process.stdout.isTTY and exits with a helpful error message directing users to --help for non-interactive flags. Closes #685 * fix: check both stdin and stdout isTTY in requireTTY guard The hang from #685 is caused by stdin not being a TTY (Ink reads keyboard input from stdin), not stdout. Check both stdin and stdout so the guard fires for piped stdin, redirected stdout, and CI environments where both are non-TTY. * fix: agentcore dev not working in windows (#951) * fix: use pull_request_target for fork PR support (#958) * fix: make label step non-blocking for fork PRs Fork PRs get read-only GITHUB_TOKEN regardless of workflow permissions, causing the addLabels API call to fail with 403. This crashed the entire job before the review could run. continue-on-error lets the review proceed even when labeling fails. * fix: use pull_request_target for full write access on fork PRs pull_request gives a read-only GITHUB_TOKEN for fork PRs, preventing labels and secrets from working. pull_request_target runs in the base repo context with full permissions. This is safe because we never check out or execute fork code — the harness fetches the PR diff via the GitHub API. * fix: lower eventExpiryDuration minimum from 7 to 3 days (closes #744) (#956) The AWS CreateMemory API allows a minimum of 3 days, but the CLI schema was rejecting values below 7. Update the Zod schema, LLM compacted types, import clamping logic, and all related tests. * fix: display session ID after CLI invoke completes (#957) * fix: display session ID after CLI invoke completes (closes #664) The streaming and non-streaming invoke responses include a session ID from the runtime, but the CLI paths discarded it. Now prints the session ID and a resume command hint after invoke output. * fix: include sessionId in AGUI protocol invoke result * test: add browser tests for agent inspector (#938) * feat: add telemetry schemas and client (#941) * chore: bump version to 0.11.0 (#967) Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> * fix(invoke): auto-generate session ID for bearer-token invocations (#953) Closes #840 When invoking an agent with a bearer token (OAuth/CUSTOM_JWT) and no session ID, `AgentCoreMemoryConfig` raised a Pydantic validation error because `session_id=None` is rejected. Unlike SigV4 callers, bearer-token callers do not get a server-side auto-generated runtime session ID. Two-layer fix: 1. CLI synthesizes a UUID in `invoke` action when `--bearer-token` is set and `--session-id` is missing, using the existing `generateSessionId` helper. Covers both explicit `--bearer-token` and the CUSTOM_JWT auto-fetch path. 2. Strands memory session templates (http, agui, a2a) synthesize a UUID when `session_id` is falsy before constructing AgentCoreMemoryConfig. Protects direct runtime callers (curl, custom apps) who forget the `X-Amzn-Bedrock-AgentCore-Runtime-Session-Id` header. Snapshot tests updated. * fix: show 'Computing diff changes...' step during deploy diff phase (#952) The deploy TUI appeared frozen for 5-15 seconds between preflight completion and 'Publish assets' while cdkToolkitWrapper.diff() ran silently with no step marked as running. Add a dedicated pre-deploy diff step that transitions running -> success around the diff call so StepProgress always has something to highlight. Closes #781 * test: split browser tests into its own job, fix logs path (#975) * feat(invoke): add --prompt-file and stdin support for long prompts (#974) * feat(invoke): add --prompt-file and stdin support for long prompts Long prompts hit shell argument limits (E2BIG, typically 128KB-2MB) when passed as positional args. This adds two new sources: - --prompt-file <path>: read prompt from a file - piped stdin: when no prompt is given and stdin is not a TTY, read the prompt from stdin Precedence is hybrid and backward-compatible: --prompt > positional > --prompt-file > stdin --prompt-file combined with piped stdin content returns an explicit collision error rather than silently picking one. Closes #686 * docs(invoke): document --prompt-file and stdin support * fix(import): remove experimental warning from import command (#977) The import feature has stabilized and no longer needs the experimental label. * fix: duplicate header flash and help menu truncation (closes #895, closes #637) (#955) - Return null during brief transitional phases to prevent Ink from rendering a header that gets immediately replaced by a different frame - Consolidate CreateScreen phases into a single Screen mount - Make help menu description width responsive to terminal size - Remove hardcoded 50-char description truncation limit * test: configure git in browser tests workflow (#976) * feat: add project-name option to create (#969) * Add project-name option to create * fix: address review feedback — restore name description and move backfill logic * ci: bump the github-actions group across 1 directory with 4 updates (#964) Bumps the github-actions group with 4 updates in the / directory: [aws-actions/configure-aws-credentials](https://github.com/aws-actions/configure-aws-credentials), [actions/github-script](https://github.com/actions/github-script), [softprops/action-gh-release](https://github.com/softprops/action-gh-release) and [slackapi/slack-github-action](https://github.com/slackapi/slack-github-action). Updates `aws-actions/configure-aws-credentials` from 5 to 6 - [Changelog](https://github.com/aws-actions/configure-aws-credentials/blob/main/CHANGELOG.md) - [Commits](https://github.com/aws-actions/configure-aws-credentials/compare/v5...v6) Updates `actions/github-script` from 8 to 9 - [Commits](https://github.com/actions/github-script/compare/v8...v9) Updates `softprops/action-gh-release` from 2 to 3 - [Changelog](https://github.com/softprops/action-gh-release/blob/master/CHANGELOG.md) - [Commits](https://github.com/softprops/action-gh-release/compare/v2...v3) Updates `slackapi/slack-github-action` from 3.0.1 to 3.0.2 - [Release notes](https://github.com/slackapi/slack-github-action/releases) - [Changelog](https://github.com/slackapi/slack-github-action/blob/main/CHANGELOG.md) - [Commits](https://github.com/slackapi/slack-github-action/compare/v3.0.1...v3.0.2) --- updated-dependencies: - dependency-name: actions/github-script dependency-version: '9' dependency-type: direct:production update-type: version-update:semver-major dependency-group: github-actions - dependency-name: aws-actions/configure-aws-credentials dependency-version: '6' dependency-type: direct:production update-type: version-update:semver-major dependency-group: github-actions - dependency-name: slackapi/slack-github-action dependency-version: 3.0.2 dependency-type: direct:production update-type: version-update:semver-patch dependency-group: github-actions - dependency-name: softprops/action-gh-release dependency-version: '3' dependency-type: direct:production update-type: version-update:semver-major dependency-group: github-actions ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * chore(deps-dev): bump aws-cdk-lib (#962) Bumps the aws-cdk group with 1 update in the / directory: [aws-cdk-lib](https://github.com/aws/aws-cdk/tree/HEAD/packages/aws-cdk-lib). Updates `aws-cdk-lib` from 2.248.0 to 2.250.0 - [Release notes](https://github.com/aws/aws-cdk/releases) - [Changelog](https://github.com/aws/aws-cdk/blob/main/CHANGELOG.v2.alpha.md) - [Commits](https://github.com/aws/aws-cdk/commits/v2.250.0/packages/aws-cdk-lib) --- updated-dependencies: - dependency-name: aws-cdk-lib dependency-version: 2.250.0 dependency-type: direct:development update-type: version-update:semver-minor dependency-group: aws-cdk ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * chore(deps): bump postcss from 8.5.8 to 8.5.10 (#961) Bumps [postcss](https://github.com/postcss/postcss) from 8.5.8 to 8.5.10. - [Release notes](https://github.com/postcss/postcss/releases) - [Changelog](https://github.com/postcss/postcss/blob/main/CHANGELOG.md) - [Commits](https://github.com/postcss/postcss/compare/8.5.8...8.5.10) --- updated-dependencies: - dependency-name: postcss dependency-version: 8.5.10 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * chore(deps-dev): bump secretlint from 11.4.1 to 12.2.0 (#916) Bumps [secretlint](https://github.com/secretlint/secretlint) from 11.4.1 to 12.2.0. - [Release notes](https://github.com/secretlint/secretlint/releases) - [Commits](https://github.com/secretlint/secretlint/compare/v11.4.1...v12.2.0) --- updated-dependencies: - dependency-name: secretlint dependency-version: 12.2.0 dependency-type: direct:development update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * chore(deps-dev): bump @vitest/coverage-v8 from 4.1.2 to 4.1.5 (#915) Bumps [@vitest/coverage-v8](https://github.com/vitest-dev/vitest/tree/HEAD/packages/coverage-v8) from 4.1.2 to 4.1.5. - [Release notes](https://github.com/vitest-dev/vitest/releases) - [Commits](https://github.com/vitest-dev/vitest/commits/v4.1.5/packages/coverage-v8) --- updated-dependencies: - dependency-name: "@vitest/coverage-v8" dependency-version: 4.1.5 dependency-type: direct:development update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * chore(deps-dev): bump @secretlint/secretlint-rule-preset-recommend (#914) Bumps [@secretlint/secretlint-rule-preset-recommend](https://github.com/secretlint/secretlint) from 11.4.1 to 12.2.0. - [Release notes](https://github.com/secretlint/secretlint/releases) - [Commits](https://github.com/secretlint/secretlint/compare/v11.4.1...v12.2.0) --- updated-dependencies: - dependency-name: "@secretlint/secretlint-rule-preset-recommend" dependency-version: 12.2.0 dependency-type: direct:development update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * chore(deps): bump the aws-sdk group across 1 directory with 14 updates (#912) Bumps the aws-sdk group with 14 updates in the / directory: | Package | From | To | | --- | --- | --- | | [@aws-sdk/client-application-signals](https://github.com/aws/aws-sdk-js-v3/tree/HEAD/clients/client-application-signals) | `3.1036.0` | `3.1037.0` | | [@aws-sdk/client-bedrock](https://github.com/aws/aws-sdk-js-v3/tree/HEAD/clients/client-bedrock) | `3.1036.0` | `3.1037.0` | | [@aws-sdk/client-bedrock-agent](https://github.com/aws/aws-sdk-js-v3/tree/HEAD/clients/client-bedrock-agent) | `3.1036.0` | `3.1037.0` | | [@aws-sdk/client-bedrock-agentcore](https://github.com/aws/aws-sdk-js-v3/tree/HEAD/clients/client-bedrock-agentcore) | `3.1036.0` | `3.1037.0` | | [@aws-sdk/client-bedrock-agentcore-control](https://github.com/aws/aws-sdk-js-v3/tree/HEAD/clients/client-bedrock-agentcore-control) | `3.1036.0` | `3.1037.0` | | [@aws-sdk/client-bedrock-runtime](https://github.com/aws/aws-sdk-js-v3/tree/HEAD/clients/client-bedrock-runtime) | `3.1036.0` | `3.1037.0` | | [@aws-sdk/client-cloudformation](https://github.com/aws/aws-sdk-js-v3/tree/HEAD/clients/client-cloudformation) | `3.1036.0` | `3.1037.0` | | [@aws-sdk/client-cloudwatch-logs](https://github.com/aws/aws-sdk-js-v3/tree/HEAD/clients/client-cloudwatch-logs) | `3.1036.0` | `3.1037.0` | | [@aws-sdk/client-resource-groups-tagging-api](https://github.com/aws/aws-sdk-js-v3/tree/HEAD/clients/client-resource-groups-tagging-api) | `3.1036.0` | `3.1037.0` | | [@aws-sdk/client-s3](https://github.com/aws/aws-sdk-js-v3/tree/HEAD/clients/client-s3) | `3.1036.0` | `3.1037.0` | | [@aws-sdk/client-sts](https://github.com/aws/aws-sdk-js-v3/tree/HEAD/clients/client-sts) | `3.1036.0` | `3.1037.0` | | [@aws-sdk/client-xray](https://github.com/aws/aws-sdk-js-v3/tree/HEAD/clients/client-xray) | `3.1036.0` | `3.1037.0` | | [@aws-sdk/credential-providers](https://github.com/aws/aws-sdk-js-v3/tree/HEAD/packages/credential-providers) | `3.1036.0` | `3.1037.0` | | [@aws-sdk/client-cognito-identity-provider](https://github.com/aws/aws-sdk-js-v3/tree/HEAD/clients/client-cognito-identity-provider) | `3.1036.0` | `3.1037.0` | Updates `@aws-sdk/client-application-signals` from 3.1036.0 to 3.1037.0 - [Release notes](https://github.com/aws/aws-sdk-js-v3/releases) - [Changelog](https://github.com/aws/aws-sdk-js-v3/blob/main/clients/client-application-signals/CHANGELOG.md) - [Commits](https://github.com/aws/aws-sdk-js-v3/commits/v3.1037.0/clients/client-application-signals) Updates `@aws-sdk/client-bedrock` from 3.1036.0 to 3.1037.0 - [Release notes](https://github.com/aws/aws-sdk-js-v3/releases) - [Changelog](https://github.com/aws/aws-sdk-js-v3/blob/main/clients/client-bedrock/CHANGELOG.md) - [Commits](https://github.com/aws/aws-sdk-js-v3/commits/v3.1037.0/clients/client-bedrock) Updates `@aws-sdk/client-bedrock-agent` from 3.1036.0 to 3.1037.0 - [Release notes](https://github.com/aws/aws-sdk-js-v3/releases) - [Changelog](https://github.com/aws/aws-sdk-js-v3/blob/main/clients/client-bedrock-agent/CHANGELOG.md) - [Commits](https://github.com/aws/aws-sdk-js-v3/commits/v3.1037.0/clients/client-bedrock-agent) Updates `@aws-sdk/client-bedrock-agentcore` from 3.1036.0 to 3.1037.0 - [Release notes](https://github.com/aws/aws-sdk-js-v3/releases) - [Changelog](https://github.com/aws/aws-sdk-js-v3/blob/main/clients/client-bedrock-agentcore/CHANGELOG.md) - [Commits](https://github.com/aws/aws-sdk-js-v3/commits/v3.1037.0/clients/client-bedrock-agentcore) Updates `@aws-sdk/client-bedrock-agentcore-control` from 3.1036.0 to 3.1037.0 - [Release notes](https://github.com/aws/aws-sdk-js-v3/releases) - [Changelog](https://github.com/aws/aws-sdk-js-v3/blob/main/clients/client-bedrock-agentcore-control/CHANGELOG.md) - [Commits](https://github.com/aws/aws-sdk-js-v3/commits/v3.1037.0/clients/client-bedrock-agentcore-control) Updates `@aws-sdk/client-bedrock-runtime` from 3.1036.0 to 3.1037.0 - [Release notes](https://github.com/aws/aws-sdk-js-v3/releases) - [Changelog](https://github.com/aws/aws-sdk-js-v3/blob/main/clients/client-bedrock-runtime/CHANGELOG.md) - [Commits](https://github.com/aws/aws-sdk-js-v3/commits/v3.1037.0/clients/client-bedrock-runtime) Updates `@aws-sdk/client-cloudformation` from 3.1036.0 to 3.1037.0 - [Release notes](https://github.com/aws/aws-sdk-js-v3/releases) - [Changelog](https://github.com/aws/aws-sdk-js-v3/blob/main/clients/client-cloudformation/CHANGELOG.md) - [Commits](https://github.com/aws/aws-sdk-js-v3/commits/v3.1037.0/clients/client-cloudformation) Updates `@aws-sdk/client-cloudwatch-logs` from 3.1036.0 to 3.1037.0 - [Release notes](https://github.com/aws/aws-sdk-js-v3/releases) - [Changelog](https://github.com/aws/aws-sdk-js-v3/blob/main/clients/client-cloudwatch-logs/CHANGELOG.md) - [Commits](https://github.com/aws/aws-sdk-js-v3/commits/v3.1037.0/clients/client-cloudwatch-logs) Updates `@aws-sdk/client-resource-groups-tagging-api` from 3.1036.0 to 3.1037.0 - [Release notes](https://github.com/aws/aws-sdk-js-v3/releases) - [Changelog](https://github.com/aws/aws-sdk-js-v3/blob/main/clients/client-resource-groups-tagging-api/CHANGELOG.md) - [Commits](https://github.com/aws/aws-sdk-js-v3/commits/v3.1037.0/clients/client-resource-groups-tagging-api) Updates `@aws-sdk/client-s3` from 3.1036.0 to 3.1037.0 - [Release notes](https://github.com/aws/aws-sdk-js-v3/releases) - [Changelog](https://github.com/aws/aws-sdk-js-v3/blob/main/clients/client-s3/CHANGELOG.md) - [Commits](https://github.com/aws/aws-sdk-js-v3/commits/v3.1037.0/clients/client-s3) Updates `@aws-sdk/client-sts` from 3.1036.0 to 3.1037.0 - [Release notes](https://github.com/aws/aws-sdk-js-v3/releases) - [Changelog](https://github.com/aws/aws-sdk-js-v3/blob/main/clients/client-sts/CHANGELOG.md) - [Commits](https://github.com/aws/aws-sdk-js-v3/commits/v3.1037.0/clients/client-sts) Updates `@aws-sdk/client-xray` from 3.1036.0 to 3.1037.0 - [Release notes](https://github.com/aws/aws-sdk-js-v3/releases) - [Changelog](https://github.com/aws/aws-sdk-js-v3/blob/main/clients/client-xray/CHANGELOG.md) - [Commits](https://github.com/aws/aws-sdk-js-v3/commits/v3.1037.0/clients/client-xray) Updates `@aws-sdk/credential-providers` from 3.1036.0 to 3.1037.0 - [Release notes](https://github.com/aws/aws-sdk-js-v3/releases) - [Changelog](https://github.com/aws/aws-sdk-js-v3/blob/main/packages/credential-providers/CHANGELOG.md) - [Commits](https://github.com/aws/aws-sdk-js-v3/commits/v3.1037.0/packages/credential-providers) Updates `@aws-sdk/client-cognito-identity-provider` from 3.1036.0 to 3.1037.0 - [Release notes](https://github.com/aws/aws-sdk-js-v3/releases) - [Changelog](https://github.com/aws/aws-sdk-js-v3/blob/main/clients/client-cognito-identity-provider/CHANGELOG.md) - [Commits](https://github.com/aws/aws-sdk-js-v3/commits/v3.1037.0/clients/client-cognito-identity-provider) --- updated-dependencies: - dependency-name: "@aws-sdk/client-application-signals" dependency-version: 3.1034.0 dependency-type: direct:production update-type: version-update:semver-minor dependency-group: aws-sdk - dependency-name: "@aws-sdk/client-bedrock" dependency-version: 3.1034.0 dependency-type: direct:production update-type: version-update:semver-minor dependency-group: aws-sdk - dependency-name: "@aws-sdk/client-bedrock-agent" dependency-version: 3.1034.0 dependency-type: direct:production update-type: version-update:semver-minor dependency-group: aws-sdk - dependency-name: "@aws-sdk/client-bedrock-agentcore" dependency-version: 3.1034.0 dependency-type: direct:production update-type: version-update:semver-minor dependency-group: aws-sdk - dependency-name: "@aws-sdk/client-bedrock-agentcore-control" dependency-version: 3.1034.0 dependency-type: direct:production update-type: version-update:semver-minor dependency-group: aws-sdk - dependency-name: "@aws-sdk/client-bedrock-runtime" dependency-version: 3.1034.0 dependency-type: direct:production update-type: version-update:semver-minor dependency-group: aws-sdk - dependency-name: "@aws-sdk/client-cloudformation" dependency-version: 3.1034.0 dependency-type: direct:production update-type: version-update:semver-minor dependency-group: aws-sdk - dependency-name: "@aws-sdk/client-cloudwatch-logs" dependency-version: 3.1034.0 dependency-type: direct:production update-type: version-update:semver-minor dependency-group: aws-sdk - dependency-name: "@aws-sdk/client-cognito-identity-provider" dependency-version: 3.1034.0 dependency-type: direct:development update-type: version-update:semver-minor dependency-group: aws-sdk - dependency-name: "@aws-sdk/client-resource-groups-tagging-api" dependency-version: 3.1034.0 dependency-type: direct:production update-type: version-update:semver-minor dependency-group: aws-sdk - dependency-name: "@aws-sdk/client-s3" dependency-version: 3.1034.0 dependency-type: direct:production update-type: version-update:semver-minor dependency-group: aws-sdk - dependency-name: "@aws-sdk/client-sts" dependency-version: 3.1034.0 dependency-type: direct:production update-type: version-update:semver-minor dependency-group: aws-sdk - dependency-name: "@aws-sdk/client-xray" dependency-version: 3.1034.0 dependency-type: direct:production update-type: version-update:semver-minor dependency-group: aws-sdk - dependency-name: "@aws-sdk/credential-providers" dependency-version: 3.1034.0 dependency-type: direct:production update-type: version-update:semver-minor dependency-group: aws-sdk ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * chore(deps-dev): bump hono from 4.12.12 to 4.12.14 (#868) Bumps [hono](https://github.com/honojs/hono) from 4.12.12 to 4.12.14. - [Release notes](https://github.com/honojs/hono/releases) - [Commits](https://github.com/honojs/hono/compare/v4.12.12...v4.12.14) --- updated-dependencies: - dependency-name: hono dependency-version: 4.12.14 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * chore(deps-dev): bump esbuild from 0.27.4 to 0.28.0 (#862) Bumps [esbuild](https://github.com/evanw/esbuild) from 0.27.4 to 0.28.0. - [Release notes](https://github.com/evanw/esbuild/releases) - [Changelog](https://github.com/evanw/esbuild/blob/main/CHANGELOG.md) - [Commits](https://github.com/evanw/esbuild/compare/v0.27.4...v0.28.0) --- updated-dependencies: - dependency-name: esbuild dependency-version: 0.28.0 dependency-type: direct:development update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * test: speed up CI and fix mock cleanup gaps (#989) * test: speed up CI and fix mock cleanup gaps - Node 20 only on PRs (full matrix on main) - 3-way vitest sharding for unit tests with blob report merging - Pre-bundle heavy deps (AWS SDK, Smithy, zod, commander) via deps.optimizer - Exclude tui-harness from unit test project (not production code) - Add afterEach(vi.restoreAllMocks) to 3 files with mock cleanup gaps - Move inline consoleSpy.mockRestore() to afterEach in logs-eval tests - Skip PTY tests when node-pty spawn is unavailable * style: fix prettier formatting in build-and-test.yml * fix: enable include-hidden-files for blob artifact upload upload-artifact@v7 defaults include-hidden-files to false, which skips the .vitest-reports directory. Also fail loudly if no files found. * feat: runtime endpoint support in AgentCore CLI (#979) * feat: add runtime endpoint support to AgentCore CLI - Schema: endpoints field on AgentEnvSpec, runtimeVersion in deployed state - Primitive: RuntimeEndpointPrimitive with add/remove/preview - TUI: Add and Remove flows with multi-field form - Status: endpoints nested under agents with deployment badges - Deploy: parseRuntimeEndpointOutputs + buildDeployedState pipeline * fix: correct output key prefix for runtime endpoint parsing The CFN output keys include the AgentEnvironment construct prefix (Agent{PascalName}) which was missing from the parser pattern. * fix: remove .omc state files and unused useCallback import - Remove .omc/ from git tracking, add to .gitignore - Remove unused useCallback import in AddRuntimeEndpointScreen.tsx * fix: shorten runtime endpoint description to prevent TUI overflow The description "Named endpoint (version alias) for a runtime" was too long and wrapped to the next line in the Add Resource menu. Shortened to "Named endpoint for a runtime". * fix: validate runtime endpoint version is a positive integer - Add explicit Number.isInteger check before schema validation - Change Commander parser from parseInt to Number so floats like 3.5 are caught instead of silently truncated * fix: use agent/endpoint composite key to prevent React key collision Endpoint names can collide across runtimes (e.g., both have "prod"). Changed React key from epName to agent.name/epName to prevent duplicate key warnings that pollute the TUI viewport. * fix: render runtime endpoints in status --type runtime-endpoint When filtering by --type runtime-endpoint, agents array is empty so the agents section (which nests endpoints) never renders. Added a standalone Runtime Endpoints section that shows when endpoints exist but agents don't (i.e., when type-filtering). * fix: add runtime-endpoint to status --help --type documentation The --type option help text was missing runtime-endpoint from the list of valid resource types. * fix: return richer JSON response from add runtime-endpoint add now returns { success, endpointName, agent, version } instead of sparse { success: true }, matching the richer response shape from remove runtime-endpoint. * fix: validate endpoint version against deployed runtime version - TUI: show "Current deployed version: N" and valid range (1-N) - TUI: reject version exceeding latest deployed version - CLI: check deployed-state.json for max version, reject if exceeded - If runtime not deployed, only positive integer check applies * chore: remove planning and bug bash docs from PR * fix: use composite key and parentName for endpoint identification - Add parentName field to ResourceStatusEntry for structured parent linking - Use runtimeName/endpointName composite key in remove/preview/getRemovable - Status command filters endpoints by parentName instead of parsing detail string - React keys use structured parentName/name instead of display strings * test: add comprehensive unit tests for RuntimeEndpointPrimitive 23 tests covering add(), remove(), previewRemove(), getRemovable(): - Runtime lookup, duplicate detection, version validation - Composite key removal targeting correct runtime - Empty endpoints dict cleanup - Version validation against deployed state - Richer JSON response shape * fix: remove dead findGatewayTargetReferences stub * fix: use BasePrimitive configIO instead of ad-hoc ConfigIO in add() * fix: use Number() instead of parseInt in TUI version validation * chore: fix prettier formatting * fix: use T[] instead of Array<T> to satisfy eslint array-type rule * fix(ci): revert schema file to avoid schema-check guard The schemas/ directory is auto-regenerated during the release workflow. Direct modifications are blocked by CI. * Revert "fix(ci): revert schema file to avoid schema-check guard" This reverts commit 3615e37a0aaa71cd4d2c5c7b19e3ddb41eb2e07c. --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: Jesse Turner <57651174+jesseturner21@users.noreply.github.com> Co-authored-by: Avi Alpert <131792194+avi-alpert@users.noreply.github.com> Co-authored-by: Gitika <53349492+notgitika@users.noreply.github.com> Co-authored-by: Hweinstock <42325418+Hweinstock@users.noreply.github.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Aidan Daly <99039782+aidandaly24@users.noreply.github.com> Co-authored-by: Tejas Kashinath <42380254+tejaskash@users.noreply.github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * feat: add target-based AB test routing Adds a new AB test mode that routes traffic between gateway targets pointing at different runtime endpoints, alongside the existing config-bundle mode. Changes: - AB test schema: target-based mode, per-variant eval config, gateway filter - HTTP gateway: targets array with qualifier (endpoint reference) - AB test primitive: --mode flag, target-based CLI flow - Pause/resume/stop/promote commands for AB tests - TUI: mode selection, target-based wizard steps - Cross-ref validation: gateway targets must reference valid endpoints - Deploy: handle target-based variant resolution and eval config union * fix: correct API field names for target-based AB test creation - Rename target.targetName → target.name in API client - Rename perVariantOnlineEvaluationConfig[].treatmentName → .name - Fix post-deploy mapping to use correct API field names - Add controlQualifier/treatmentQualifier to Commander options type - Add runtime/qualifier fields to TUI AB test config types * fix: include all eval config ARNs in AB test IAM role policy For target-based AB tests with perVariantOnlineEvaluationConfig, the…

* test: add integ and e2e tests for recommendations Integ tests (12) cover CLI validation for run recommendation: required flags, system-prompt/tool-description input validation, config bundle source, spans file validation, and lookback/session options. E2E tests (8) cover recommendation API lifecycle: start system-prompt and tool-description recommendations, get, delete (stop-via-delete), verify 404 after delete, inline session spans, and error cases. * remove API-level e2e tests (CLI e2e lives in PR #107) * fix: add error message assertions to required flag tests Assert JSON error content (--runtime, --evaluator, --type) instead of only checking exitCode, so tests fail meaningfully on crashes.

…ndations (#107) * test: add integ and e2e tests for config bundles, batch eval, recommendations Integ tests (48): config bundle add/remove lifecycle, evaluator/online-eval lifecycle, batch-evaluation CLI validation, ground truth parsing, recommendation CLI validation. E2E test (1 file, 17 tests): full CLI lifecycle — create project → add config bundle → add evaluator → deploy → invoke → config-bundle versions/diff → run batch-evaluation → run eval → run recommendation (system-prompt, tool-description, config-bundle source) → remove + reconcile. * refactor: keep only e2e test in this PR (integ tests live in separate PRs) * fix: address PR review — stronger e2e assertions and real session IDs - Use real session ID from invoke for ground truth (not hardcoded) - Assert diff array is non-empty, not just property existence - Assert batch eval status is not FAILED - Assert recommendation result is non-empty - Add comment explaining retry rationale for on-demand eval - Reduce excessive retry count (18→10) for on-demand eval

* test: add integ and e2e tests for config bundles Integ tests cover add/remove lifecycle, CLI validation, components-file support, duplicate rejection, placeholder keys, and multi-bundle coexistence. E2E tests cover full API lifecycle (create, get, update, list versions, branch filtering, diff, delete) against the real control plane. * fix: remove hardcoded account ID from e2e config bundle tests Resolve account ID dynamically from AWS_ACCOUNT_ID env var or aws sts get-caller-identity, matching the pattern in e2e-helper.ts. * remove API-level e2e tests (CLI e2e lives in PR #107) * fix: address PR review — extract shared helpers, fix afterAll cleanup Move runSuccess/runFailure to shared test-utils to prevent duplication. Fix afterAll to defensively clean all bundleNames on test failure.

The configBundle advanced setting was selectable in the TUI but never propagated to the output config, so it silently did nothing. - AddAgentScreen: set withConfigBundle on byoConfig when selected in advanced settings, clear it when deselected, pass it through both create and BYO complete handlers, show in confirm review - GenerateWizardUI: show config bundle in confirm summary - useCreateFlow: pass withConfigBundle to GenerateConfig and call createConfigBundleForAgent after agent is written

…#161) 1. Validate batch eval name against API pattern [a-zA-Z][a-zA-Z0-9_]{0,47} before sending. The API returns a misleading "Resource identifier cannot be empty" for invalid names (e.g. hyphens). The CLI now gives a clear error message with the exact constraints. 2. Fix ground truth in legacy fallback: was sending sessionMetadata at top level instead of wrapping in evaluationMetadata. Both old and new API models expect evaluationMetadata.sessionMetadata.

…verage. (#154) * refactor: extract deleteHttpGatewayWithTargets shared helper Extract gateway deletion logic (targets → gateway → role) into shared deleteHttpGatewayWithTargets(). Add deleteOrphanedHttpGateways() for deploy reconciliation. Teardown uses shared helper. Target failures are best-effort (warn, continue). * chore: update bundled SDK wheel to bedrock-agentcore 1.6.4 Replace the dev pre-release wheel (1.6.0.dev20260413) with the current SDK release (1.6.4) from bedrock-agentcore-sdk-python-private. * feat: add [tool.uv.sources] vendored wheel support to agent templates - render.ts: copy .whl files verbatim instead of through Handlebars - BaseRenderer: copy bundled SDK wheel into scaffolded project's wheels/ - Add [tool.uv.sources] block to all 8 agent pyproject.toml templates pointing at wheels/bedrock_agentcore-1.6.4-py3-none-any.whl - Dockerfile: conditionally COPY wheels/ for Container builds - Tests: .whl binary handling, wheel copy verification, updated snapshots * test: add e2e and integ tests for AB tests, gateways, and online evals Integ tests (17 passing): - Target-based AB test CLI flags (11 tests) - Online eval with endpoint field (6 tests) E2E tests (require AWS ap-southeast-2): - Target-based AB test full lifecycle - Config-bundle AB test full lifecycle - HTTP gateway with targets lifecycle * fix: remove gateway trace delivery, add runtime experiment span debug check - Remove gateway trace delivery setup from deploy - Remove Gateway Trace Delivery and Gateway Spans from debug panel - Add Runtime Experiment Spans check to debug panel (queries aws/spans for abTestArn) * fix: improve runtime experiment span debug checks with per-variant filtering and service.name - Split single experiment span check into per-variant (C, T1) checks - Filter baseline runtime spans by service.name from deployed state instead of gen_ai_agent - Show targeted warnings when one variant has spans but the other doesn't * Revert "feat: add [tool.uv.sources] vendored wheel support to agent templates" This reverts commit 458177299237a50e9bc6eb4aada607d18dced3f2.

* fix: update unit tests to match feat/evo-implementation changes - post-deploy-ab-tests: expect 'updated' instead of 'skipped' for existing tests - post-deploy-http-gateways: use deleteOrphanedHttpGateways, remove trace delivery tests - preflight: mock getPathResolver and fs for config bundle patching - ABTestPrimitive: expect gateway retained on remove (requires --delete-gateway) - useAddABTestWizard: first step is now 'mode' not 'name' * refactor: remove legacy /abtests API path fallback The AB test API has been migrated to /ab-tests. Remove the dpRequestWithFallback function, unused cpRequest/getControlPlaneEndpoint, and use dnsSuffix() for multi-partition support. * Revert "feat: bundle Python SDK wheel into CLI for offline install" This reverts commit 791dcfa.

…for telemetry, web-ui, help

agentcore-cli-automation · 2026-04-30T18:04:40Z

Orphaned config bundles never get deleted when all bundles are removed from the spec

In src/cli/commands/deploy/actions.ts around line 571, the post-deploy config bundle block is gated on:

const configBundleSpecs = context.projectSpec.configBundles ?? [];
if (configBundleSpecs.length > 0) {
  // ... calls setupConfigBundles, which also handles orphan deletion
}

If a user removes all configBundles from agentcore.json and redeploys, setupConfigBundles is never called, so orphaned bundles in deployed state are never deleted. They linger in the account (costing money / creating drift) and stay stuck in deployed-state.json forever.

Compare to how HTTP gateways handle this on line 542:

if (httpGatewaySpecs.length > 0 || Object.keys(existingHttpGateways ?? {}).length > 0) {

Two possible fixes:

Mirror the HTTP gateway condition: if (configBundleSpecs.length > 0 || Object.keys(existingConfigBundles ?? {}).length > 0).
Split orphan deletion into a pre-pass like deleteOrphanedABTests.

Related: even when setupConfigBundles does run, the state-merge block at line 583 is guarded by if (Object.keys(configBundleResult.configBundles).length > 0). configBundles in the result is only populated by create/update/skip paths — deletions never add to it. So if every spec bundle is deleted in one run (e.g. user has bundles A, B; removes A and B and adds nothing), deployed state keeps the stale entries and targetResources.configBundles is never overwritten. The HTTP gateway post-deploy explicitly comments "Always merge HTTP gateway state (even if empty, to clear deleted gateways)" — config bundles should do the same.

agentcore-cli-automation · 2026-04-30T18:04:52Z

Baggage header is silently dropped for MCP-protocol invocations

In src/cli/commands/invoke/action.ts lines 241-248, MCP invocations construct:

const mcpOpts = {
  region: targetConfig.region,
  runtimeArn: agentState.runtimeArn,
  userId: options.userId,
  headers: options.headers,
  bearerToken: options.bearerToken,
    baggage,  // ← never actually sent
};

But McpInvokeOptions in src/cli/aws/agentcore.ts (around line 543) doesn't declare a baggage field, and neither mcpRpcCall, mcpRpcCallWithBearer, nor buildMcpBearerHeaders reads it or forwards it to the runtime. So config-bundle baggage for MCP agents is lost — the agent never sees the bundle ARN/version, and BedrockAgentCoreContext.get_config_bundle() will return defaults.

Two fixes depending on intent:

If MCP agents should support config bundles: add baggage to McpInvokeOptions and forward it in both buildMcpBearerHeaders (as a baggage header) and the SigV4 path via InvokeAgentRuntimeCommand's baggage field.
If config bundles are HTTP-only for now: remove the baggage key from the mcpOpts literal so readers don't think it's wired up.

(Side note: the indentation on that baggage, line is off — two extra spaces.)

agentcore-cli-automation · 2026-04-30T18:05:06Z

Three of the new preview-API clients hardcode amazonaws.com — breaks in GovCloud / China partitions

The five new SigV4 HTTP clients for preview APIs inconsistently handle partition DNS suffixes:

✅ src/cli/aws/agentcore-ab-tests.ts (line 205): return `https://bedrock-agentcore.${region}.${dnsSuffix(region)}`;
✅ src/cli/aws/agentcore-http-gateways.ts (line 158): uses dnsSuffix(region)
❌ src/cli/aws/agentcore-config-bundles.ts (line 181): return `https://bedrock-agentcore-control.${region}.amazonaws.com`;
❌ src/cli/aws/agentcore-recommendation.ts (line 228): hardcoded amazonaws.com
❌ src/cli/aws/agentcore-batch-evaluation.ts (line 231): hardcoded amazonaws.com

Options:

Use dnsSuffix(region) in all five (matching the two that already do). This is the consistent fix.
If these preview features are commercial-only and will never be offered in GovCloud / China during preview, at least leave a comment to that effect — otherwise someone reading the two correct files will assume the hardcoded ones are bugs to "fix" later and accidentally break them.

Since all five already share the same signing-helper pattern (essentially duplicated across files per the "interim until SDK support" comments), extracting a shared signedRequest helper would also remove the opportunity for this kind of drift.

avi-alpert and others added 30 commits March 5, 2026 13:20

feat: add sync workflow

d15ce2e

fix: formatting

7d35986

fix: only sync to main branch

2acb841

fix: codeql permissions

05258f3

Merge pull request #1 from aws/aalpert/workflow

216140f

chore: sync main with public/main

c6099d4

Merge remote-tracking branch 'public/main'

c2bef91

chore: sync main with public/main

1a361af

chore: sync main with public/main

cc83d81

chore: sync main with public/main

6054e95

chore: sync main with public/main

11ec86e

chore: sync main with public/main

1d64fd8

chore: sync main with public/main

4c2c674

Merge remote-tracking branch 'origin/main'

ee71ff3

chore: sync main with public/main

cfd1cdb

chore: sync main with public/main

93d7bbc

chore: sync main with public/main

d0c495f

chore: sync main with public/main

c2121ec

fix: use correct SigV4 service name for config bundle API

c190b0c

The signing service must be 'bedrock-agentcore' for all stages, not 'bedrock-agentcore-control' for prod. The endpoint hostname differs from the signing service name.

fix: use nullish coalescing for branchName default

f1e34d2

fix: address review comments

f671122

fix: remove duplicate config-bundle subcommand from edit command

bb013c1

Merge pull request #37 from aws/feat/config-bundles

45f98e0

feat: add configuration bundle support

jariy17 and others added 21 commits April 15, 2026 12:42

feat: migrate AB test API path from /abtests to /ab-tests with fallba…

7a1384b

…ck (#94) Uses /ab-tests (new path) as primary, falls back to /abtests (legacy) on 404 for backwards compatibility during the API migration.

fix: migrate batch evaluation API client to latest dataplane schema (#…

300a516

…142)

Merge remote-tracking branch 'origin/main' into feat/evo-implementation

2ef8132

Remove stop recommendations (#162)

d45abaa

Merge remote-tracking branch 'public/main' into feat/evo-implementation

ac053a8

fix: resolve merge conflicts with public/main — take public versions …

5d0ec24

…for telemetry, web-ui, help

notgitika requested a review from a team April 30, 2026 17:54

github-actions Bot added size/xl PR size: XL agentcore-harness-reviewing AgentCore Harness review in progress labels Apr 30, 2026

notgitika closed this Apr 30, 2026

notgitika deleted the feat/evo-implementation branch April 30, 2026 17:54

github-actions Bot removed the agentcore-harness-reviewing AgentCore Harness review in progress label Apr 30, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat: AgentCore Evo — config bundles, batch eval, recommendations, AB testing#1061

feat: AgentCore Evo — config bundles, batch eval, recommendations, AB testing#1061
notgitika wants to merge 89 commits into
mainfrom
feat/evo-implementation

notgitika commented Apr 30, 2026

Uh oh!

agentcore-cli-automation commented Apr 30, 2026

Uh oh!

agentcore-cli-automation commented Apr 30, 2026

Uh oh!

agentcore-cli-automation commented Apr 30, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants

Uh oh!

Conversation

notgitika commented Apr 30, 2026

Summary

New capabilities

Other changes

Tested

Uh oh!

agentcore-cli-automation commented Apr 30, 2026

Uh oh!

agentcore-cli-automation commented Apr 30, 2026

Uh oh!

agentcore-cli-automation commented Apr 30, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants