feat: AgentCore Evo — config bundles, batch eval, recommendations, AB testing#1061

Closed
notgitika wants to merge 89 commits into main from feat/evo-implementation

Conversation

@notgitika

Contributor

Summary

AgentCore Evo private preview features for the CLI. This PR brings the feat/evo-implementation branch (developed on the staging repo) into the public CLI repo.

New capabilities

Config Bundles — Versioned runtime configurations (system prompt, tool descriptions, model params). Create with agentcore add config-bundle or --with-config-bundle on agent creation. Manage versions with agentcore cb versions/diff/create-branch. Config bundle values are injected at runtime via SDK baggage — no code redeploy needed to change behavior.

Batch Evaluation: agentcore run batch-evaluation runs evaluators (Builtin.Correctness, Helpfulness, Faithfulness, etc.) across agent sessions in CloudWatch. Supports multiple evaluators, session filtering, lookback windows, ground truth assertions, and custom names.

Recommendations: agentcore run recommendation optimizes system prompts and tool descriptions using agent traces. Can read from and write back to config bundles for zero-redeploy prompt updates.

AB Testing — Config-bundle mode (same code, different configs) and target-based mode (different runtime endpoints). Traffic splitting, online evaluation, pause/resume/stop/promote lifecycle. TUI wizard with side-by-side variant builder.

HTTP Gateways & Runtime Endpoints — Gateway targets, endpoint aliases, and routing infrastructure for AB tests.

Other changes

  • Batch eval API migrated to latest dataplane schema (batchEvaluationName, dataSourceConfig, evaluationMetadata)
  • Config bundle baggage passed on agentcore invoke for runtime config injection
  • [preview] tags on all evo commands and TUI screens
  • Region resolution from aws-targets.json across all evo commands
  • CLI command name agentcore (was agentcore-dev), distro mode set to PROD_DISTRO
  • Merged with public/main — includes telemetry, web-ui traces, inspector updates

Tested

  • E2E on prod and gamma (us-east-1, ap-southeast-2)
  • Full config bundle flow: create → deploy → invoke → recommendation → invoke with updated prompt
  • Batch eval with single/multiple evaluators, ground truth, stop
  • AB test lifecycle (config-bundle and target-based modes)
  • Bug bash completed with bug tracker

avi-alpert and others added 30 commits March 5, 2026 13:20
Add ConfigBundle as a new resource type with full lifecycle:
- Schema: ConfigBundleSchema with name validation, component configurations
- Primitive: ConfigBundlePrimitive for add/remove operations
- API client: SigV4-signed HTTP requests for config bundle CRUD operations
- Deploy: post-deploy hook to sync config bundles with control plane
- Status: config-bundle resource type in status command
- TUI: add wizard (name, description, components, branch, commit message),
  remove flow, ResourceGraph integration
- State: carry forward configBundles across redeploys in buildDeployedState
The signing service must be 'bedrock-agentcore' for all stages, not
'bedrock-agentcore-control' for prod. The endpoint hostname differs
from the signing service name.
- Add config bundle post-deploy setup to TUI deploy flow (useDeployFlow)
- Add clientToken to config bundle update API call
- Add parentVersionIds on update (required by API)
- Default branchName to "main" and commitMessage when not specified
- Add placeholders for branch/message in TUI wizard
- Fallback to find-by-name or create when update fails (stale IDs)
- Remove debug logging from actions.ts
- Add `agentcore edit config-bundle` CLI command with --bundle, --components,
  --components-file, --description, --branch, --message, --json flags
- Add interactive TUI wizard for editing config bundles (select bundle,
  input method, components, commit message, branch name, confirm)
- Add diff check to post-deploy: skip API update when components and
  description are unchanged, avoiding unnecessary version creation
- Use getConfigurationBundleVersion instead of getConfigurationBundle to
  avoid branch-not-found errors on bundles created with different branches
- Align default branch name to 'mainline' (API default) instead of 'main'
- For updates, inherit branch from current API state when not specified
- post-deploy-config-bundles: 13 tests covering create, update, skip
  (diff check), delete, branch inheritance, fallback paths, errors
- ConfigBundlePrimitive.edit: 7 tests covering component updates,
  optional field handling, missing bundle errors, field preservation
- useEditConfigBundleWizard: 16 tests covering step navigation,
  setters, goBack, reset, currentIndex tracking, step labels
feat: add configuration bundle support
* chore: remove edit config-bundle command

Users should edit agentcore.json directly to update config bundles.
Removes the edit CLI command, TUI screens, wizard hooks, and tests.

* feat: add config-bundle CLI commands for version history

Adds `agentcore config-bundle` with three subcommands:
- `versions` — list version history grouped by branch
- `get-version` — view specific version details and components
- `diff` — client-side deep diff between two versions

Also adds filter support (branchName, latestPerBranch, createdBy)
to the listConfigurationBundleVersions API client.
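
The `diff` subcommand above is described as a client-side deep diff between two versions. A minimal sketch of that idea follows, assuming version components are plain JSON values. The names `deepDiff` and `Change` are hypothetical and are not taken from the CLI source.

```typescript
// Hypothetical sketch: deep diff of two JSON-like values, reporting changed paths.
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

interface Change {
  path: string;
  before: Json | undefined;
  after: Json | undefined;
}

function isObject(value: Json | undefined): value is { [key: string]: Json } {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

function deepDiff(before: Json | undefined, after: Json | undefined, path = ""): Change[] {
  // Recurse only when both sides are plain objects; arrays and scalars are compared whole.
  if (isObject(before) && isObject(after)) {
    const changes: Change[] = [];
    for (const key of Object.keys({ ...before, ...after }).sort()) {
      const childPath = path === "" ? key : `${path}.${key}`;
      changes.push(...deepDiff(before[key], after[key], childPath));
    }
    return changes;
  }
  // Serialized comparison is a simple equality check for scalars and arrays.
  if (JSON.stringify(before) === JSON.stringify(after)) return [];
  return [{ path, before, after }];
}
```

Reporting dotted paths keeps the output readable for nested components such as model parameters. Arrays are treated as a single value here, so a reordered list shows up as one change.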

* feat: add config bundle hub TUI screens

Add TUI screens for browsing config bundles, viewing version history
with branch grouping, version detail drill-down, and diff comparison
between versions.

* fix: resolve config bundle versionId when falling back to list API (#49)

The Recommendation API requires versionId to be non-null when using
configurationBundle input. When resolveBundleByName fell back to the
list API (bundle not in deployed state), it returned no versionId,
causing a 400 validation error.

Now calls getConfigurationBundle after list to fetch the latest
versionId. Also adds versionId to the ResolvedBundle interface and
returns it from the deployed-state fast path.

* chore: remove get-version subcommand from config-bundle CLI

The versions --json and diff commands cover all practical use cases.
Keeps the command surface lean: versions + diff only.
* feat: add Recommendation API wrappers, CLI commands, and operations layer

Implement the Recommendations/Optimization feature for AgentCore CLI:
- SigV4-signed HTTP client for Start/Get/List/Delete Recommendation (DP)
- Operations layer with orchestration, polling, and local storage
- CLI commands: evals recommend, evals recommendation history/delete, run promote
- 27 unit tests covering API, storage, and orchestration logic
- Live-validated field names and ARN formats against prod API

* feat: add recommendation TUI wizard with session discovery and multi-evaluator support

- Add full recommendation wizard TUI (type, agent, evaluators, input, trace source, sessions, confirm)
- Add session discovery flow: discover sessions from CloudWatch, multi-select specific sessions
- Support both CloudWatch logs and session ID trace sources
- Pass selected sessionIds to recommendation API cloudwatchLogs config
- Add request ID capture and error detail extraction for debugging FAILED recommendations
- Fix recommendation API test mocks (add headers for requestId capture)
- Add scrollable list support (maxVisibleItems) to MultiSelectList, SelectList, WizardSelect
- Wire recommendation screen into App.tsx and EvalHubScreen navigation

* feat: add session span fetching, recommendation tests, and TUI integration

- Add fetch-session-spans module for retrieving OTEL spans from aws/spans
  and log records from runtime log groups with session ID filtering
- Add comprehensive tests for fetch-session-spans (9 tests) and extend
  run-recommendation tests (12 new tests covering file input, spans-file
  trace source, tool-desc auto-fetch, error handling, ARN passthrough)
- Wire recommendation hub, history screen, and list/delete CLI commands
- Update TUI routing for recommendation flows from eval and run hubs
- Add recommendation constants (poll intervals, terminal statuses)

* chore: remove list commands and promote stub, fix agents→runtimes rename

Remove `agentcore list recommendations` and `agentcore list recommendation --id`
commands (top-level `list` command deleted entirely). Remove `run promote` stub.
Fix typecheck errors from agents→runtimes schema rename in recommendation files.
#26)

* feat: add EvaluationJob resource — schema, primitive, deploy hook, TUI, and tests

Phase 1 of EvalJobRunner: CRUD + deploy integration for the EvaluationJob
control plane resource.

- Schema: EvaluationJobSchema in agentcore.json, deployed state tracking
- Primitive: EvaluationJobPrimitive with add/remove lifecycle
- AWS client: SigV4-signed HTTP wrappers for EvalJob CP operations
- Deploy: post-deploy hook creates/updates/deletes eval jobs imperatively
- CFN outputs: parse eval job execution role ARN from stack outputs
- TUI: add evaluation-job wizard flow + remove flow integration
- Tests: 53 tests across schema, primitive, AWS client, deploy hook, and TUI

* feat: add `run evaluation-job` command with DP API wrappers and orchestration

- Data plane API wrappers (RunEvaluationJob, GetEvaluationJobRun, ListEvaluationJobRuns)
  with SigV4 signing against bedrock-agentcore service
- Orchestration: resolve job from deployed state, generate runId, start run,
  poll for completion, fetch results from CW Logs output group
- CLI command: `agentcore run evaluation-job --job <name> --session-id <ids...>`
  with --json output and progress callbacks
- Tests: 17 new tests covering DP wrappers, runId generation, orchestration
  (error handling, polling, CW Logs result parsing)

* feat: complete US1/US2 quick wins — run name, cancel, update, stage-aware endpoints

- Add --run flag to `run evaluation-job` for custom run name prefixes
- Add `run cancel-evaluation-job` command with StopEvaluationJobRun DP API
- Add `update evaluation-job` primitive method and CLI subcommands
- Add `agentcore update experiment` parent command (backward-compatible)
- Make CP/DP endpoints stage-aware via AGENTCORE_STAGE env var (beta/gamma/prod)
- Fix beta SigV4 service name (bedrock-agentcore vs bedrock-agentcore-control)
- Update AddEvaluationJobFlow success screen with next-steps guidance

* feat: add TUI run wizard, progress steps, and local result storage for eval jobs

- Add RunEvalJobFlow TUI: select job → enter sessions → name run → confirm → execute
- Add StepProgress display during eval job polling (starting → polling → fetching → saving)
- Add elapsed time counter during run execution
- Add eval-job-storage module: save/load/list run results per job in .cli/eval-job-results/
- Auto-save results on both CLI and TUI paths
- Add "Evaluation Job" option to TUI Run screen
- Add 9 unit tests for eval-job-storage

* feat: add CloudWatch session discovery to eval job TUI wizard

- Add source type picker: "Discover from CloudWatch" vs "Enter manually"
- Add lookback days input (1-90 days) for CloudWatch discovery
- Discover sessions via CW Insights query using agent's runtimeId
- Multi-select from discovered sessions with span count + timestamps
- Auto-fallback to manual entry when agent not deployed (no runtimeId)
- Improve error display: show failed step in StepProgress before transitioning

* feat: migrate evaluation from resource CRUD to stateless batch evaluation

Replace the old EvaluationJob resource model (create/update/delete via
agentcore.json + deploy hooks) with a flat BatchEvaluation API model:

- Add `run batch-evaluation` and `run stop-batch-evaluation` CLI commands
- Add batch evaluation TUI wizard under the Run menu
- Add SigV4 API client for batch eval endpoints (start/get/list/stop)
- Add CloudWatch results fetching from outputDataConfig
- Remove all old evaluation-job infrastructure: primitive, deploy hook,
  schema, TUI add/remove screens, CP CRUD operations
- Remove evaluationJobs from agentcore.json schema

Tested end-to-end on gamma (account 998846730471) with Builtin.Faithfulness
evaluator against 3 agent sessions — all returning correct scores.

* chore: remove executionRoleArn now that FAS creds are live on gamma

The batch evaluation API no longer requires an execution role ARN.
Remove the --execution-role CLI option and all executionRoleArn
plumbing from the API client and orchestration layer.

* Revert "chore: remove executionRoleArn now that FAS creds are live on gamma"

This reverts commit f1706ff7ea4b7695d1466e609cde29e38cb00afb.

* refactor: move stop-batch-evaluation to top-level stop command

Move `agentcore run stop-batch-evaluation` to `agentcore stop batch-evaluation`
as a higher-level verb, consistent with pause/resume pattern.
jariy17 and others added 21 commits April 15, 2026 12:42
…75)

* fix: stop running AB test before deletion during deploy reconciliation

When a user removes an AB test from agentcore.json and redeploys,
deleteOrphanedABTests tried to delete it directly. This failed with
409 "Cannot delete while AB test is running", which then blocked
gateway deletion (AB test routing rules still on the gateway).

Fix: call updateABTest with executionStatus=STOPPED before deleting.
If stop fails (already stopped or invalid state), proceed with delete.
A console.warn is emitted when an AB test is stopped during cleanup.

* fix: use console.warn instead of console.log in deploy operations

console.log writes to stdout which corrupts the TUI (Ink) rendering.
console.warn writes to stderr which doesn't interfere with the TUI.

* fix: poll for AB test status after stop before deleting

The stop call transitions the AB test to UPDATING status. Deleting
immediately fails with 409 "cannot be deleted in status UPDATING".
Now polls getABTest until status leaves UPDATING before attempting delete.

* fix: surface AB test stop warning via postDeployWarnings instead of console.warn

console.warn leaks into TUI rendering. Instead, return warning in the
result and let TUI/CLI callers surface it through proper channels
(postDeployWarnings for TUI, logger.log for CLI).

* fix: move target deletion wait into deleteHttpGatewayTarget and cleanup

- deleteHttpGatewayTarget now polls until target is fully deleted (404)
  internally, so callers don't need to remember to wait separately
- Removed waitForTargetDeletion from post-deploy-http-gateways.ts
- Reconciliation deletion path now waits for target deletion too
- AB test stop polling now checks executionStatus === 'STOPPED'
- Removed console.warn/log that leaked into TUI rendering
- Removed debug process.stderr.write logs

* fix: resolve config bundle placeholders in TUI deploy path

resolveConfigBundleComponentKeys (which resolves {{runtime:name}} and
{{gateway:name}} to real ARNs) was only in the CLI deploy path. The TUI
deploy path passed raw placeholders to the API, causing validation errors.

Moved the resolution functions to post-deploy-config-bundles.ts so both
CLI and TUI can import them.

* fix: rename --agent to --runtime and clarify --online-eval in ab-test CLI

- --agent renamed to --runtime (consistent with other commands)
- --online-eval description changed from "name or ARN" to "name"
- --gateway help text updated to reference --runtime

* test: fix broken polling test and add coverage for review findings

- Fix AB test polling test mock: first poll returns executionStatus 'RUNNING'
  (was 'STOPPED', causing loop to exit immediately — test was broken)
- Add 11 tests for resolveConfigBundleComponentKeys (runtime, gateway,
  ARN passthrough, missing resource errors, immutability)
- Add 4 tests for warning field (stop warning set, not set on failure,
  set even on delete failure, poll timeout)

* fix: deleteHttpGatewayTarget returns failure on polling timeout

Both reviewers flagged: returning success on timeout is wrong — the target
may still be DELETING, causing downstream gateway delete to fail. Now
returns { success: false } with timeout error message.

* fix: AB test TUI reads region from aws-targets.json instead of env vars

ABTestPickerScreen was using process.env.AWS_REGION with us-east-1
fallback. This caused debug checks, stop/resume, and all API calls
to hit the wrong region. Now reads from aws-targets.json via
resolveAWSDeploymentTargets(), matching the config bundle hub pattern.

* fix: paginate DescribeDeliverySources in AB test debug check

The debug panel only read the first page of delivery sources, missing
sources for accounts with many gateways. Now paginates both
DescribeDeliverySources and DescribeDeliveries calls.

Also reads region from aws-targets.json instead of env vars.
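
The pagination fix above follows the usual token-based loop. A generic sketch, assuming each page exposes an item list and an optional `nextToken`; `Page` and `collectAllPages` are hypothetical names, and the fetcher is injected instead of calling a specific SDK client.

```typescript
// Hypothetical page shape; real API responses use their own field names.
interface Page<T> {
  items: T[];
  nextToken?: string;
}

// Keeps requesting pages until the service stops returning a continuation token.
async function collectAllPages<T>(
  fetchPage: (nextToken?: string) => Promise<Page<T>>
): Promise<T[]> {
  const all: T[] = [];
  let token: string | undefined;
  do {
    const page = await fetchPage(token);
    all.push(...page.items);
    token = page.nextToken;
  } while (token !== undefined);
  return all;
}
```

Reading only the first page is equivalent to calling `fetchPage` once and ignoring `nextToken`, which is the bug the commit describes.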

* fix: warn when AB test stop polling times out before deletion

Address review comment — log a warning via the result's warning field
when the polling loop exhausts all 20 iterations without executionStatus
reaching STOPPED, so the user knows the delete is proceeding without
confirmation.
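
Taken together, the AB test cleanup commits above describe a stop, poll, then delete sequence with a warning carried in the result. A sketch under stated assumptions: the `ABTestClient` interface, the `stopThenDelete` name, the warning text, and the poll interval are all hypothetical; only the 20-iteration cap and the `STOPPED` check come from the commit messages.

```typescript
// Hypothetical client shape; the CLI's actual API wrappers may differ.
interface ABTestClient {
  stop(id: string): Promise<void>;
  getExecutionStatus(id: string): Promise<string>;
  remove(id: string): Promise<void>;
}

interface DeleteResult {
  success: boolean;
  warning: string | undefined;
  error: string | undefined;
}

const MAX_POLLS = 20; // iteration cap mentioned in the commit message

async function stopThenDelete(
  client: ABTestClient,
  id: string,
  sleep: (ms: number) => Promise<void>,
  pollIntervalMs = 3000 // assumed interval, not from the source
): Promise<DeleteResult> {
  let warning: string | undefined;
  try {
    await client.stop(id);
    warning = `AB test ${id} was stopped before deletion`;
    let stopped = false;
    for (let i = 0; i < MAX_POLLS && !stopped; i++) {
      stopped = (await client.getExecutionStatus(id)) === "STOPPED";
      if (!stopped) await sleep(pollIntervalMs);
    }
    if (!stopped) {
      warning = `AB test ${id} did not reach STOPPED after ${MAX_POLLS} polls; deleting anyway`;
    }
  } catch {
    // Stopping can fail when the test is already stopped; fall through to delete.
  }
  try {
    await client.remove(id);
    return { success: true, warning, error: undefined };
  } catch (err) {
    const error = err instanceof Error ? err.message : String(err);
    return { success: false, warning, error };
  }
}
```

Returning the warning instead of logging it lets the caller decide where to surface it, which matches the move away from `console.warn` described above.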
…ck (#94)

Uses /ab-tests (new path) as primary, falls back to /abtests (legacy)
on 404 for backwards compatibility during the API migration.
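
The path fallback can be sketched as a small wrapper around the request function. This assumes a hypothetical `Send` function that resolves with an HTTP status instead of throwing; only the two path prefixes and the 404 trigger come from the commit message.

```typescript
// Hypothetical request abstraction: resolves with the HTTP status and parsed body.
interface HttpResponse {
  status: number;
  body: unknown;
}
type Send = (path: string) => Promise<HttpResponse>;

const PRIMARY_PREFIX = "/ab-tests";
const LEGACY_PREFIX = "/abtests";

async function requestWithLegacyFallback(send: Send, suffix: string): Promise<HttpResponse> {
  const primary = await send(`${PRIMARY_PREFIX}${suffix}`);
  // Only a 404 triggers the fallback; any other response is returned unchanged.
  if (primary.status !== 404) return primary;
  return send(`${LEGACY_PREFIX}${suffix}`);
}
```

One caveat with this shape: a 404 for a missing resource on the new path also triggers a second request against the legacy path. The source does not say how the CLI distinguishes the two cases.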
* fix: show yellow warning banner when post-deploy sub-resources fail

Deploy banner now has three states instead of two:
- Green "Deploy to AWS Complete" — everything succeeded
- Yellow "Deploy to AWS Complete (with warnings)" — infra deployed but
  post-deploy resources (AB tests, config bundles, HTTP gateways) had errors
- Red "Deploy to AWS Failed" — CDK stack deployment failed

CLI non-interactive path returns exit code 2 for post-deploy warnings
(vs exit 0 for success, exit 1 for infra failure) so CI/CD pipelines
can differentiate.

Post-deploy errors (AB tests, config bundles, HTTP gateways, online evals)
are shown inside the yellow banner box and in the post-deploy warnings
section below. The deploy step stays marked as success since the CDK
stack did deploy correctly.
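
The three banner states and exit codes above reduce to a small decision function. A sketch, with `DeployOutcome` and `summarizeDeploy` as hypothetical names; the exit codes and banner colors follow the commit message.

```typescript
interface DeployOutcome {
  stackDeployed: boolean; // did the CDK stack deployment succeed?
  postDeployErrors: string[]; // errors from AB tests, config bundles, HTTP gateways, online evals
}

type Banner = "green" | "yellow" | "red";

function summarizeDeploy(outcome: DeployOutcome): { banner: Banner; exitCode: 0 | 1 | 2 } {
  // Infra failure takes precedence over any post-deploy result.
  if (!outcome.stackDeployed) return { banner: "red", exitCode: 1 };
  if (outcome.postDeployErrors.length > 0) return { banner: "yellow", exitCode: 2 };
  return { banner: "green", exitCode: 0 };
}
```

A CI pipeline can then treat exit code 2 as "deployed, needs attention" rather than failing the whole job.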

* fix: treat COMPLETED_WITH_ERRORS as terminal in batch evaluation poll loop

The batch evaluation poll loop only recognized COMPLETED, FAILED,
STOPPED, and CANCELLED as terminal statuses. When the service returned
COMPLETED_WITH_ERRORS (typical when any session fails), the CLI never
exited the poll loop and hung for 67 minutes until the fetch timed out.

Add COMPLETED_WITH_ERRORS to TERMINAL_STATUSES so the poll exits
immediately. The status is still treated as a non-success outcome
(line 227 checks for COMPLETED specifically), so partial failures
are reported correctly.
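
The terminal-status fix can be illustrated as follows. `TERMINAL_STATUSES` is named in the commit message; the helper functions are hypothetical and only show how the terminal check and the success check stay separate.

```typescript
const TERMINAL_STATUSES = [
  "COMPLETED",
  "COMPLETED_WITH_ERRORS", // the status that was missing before the fix
  "FAILED",
  "STOPPED",
  "CANCELLED",
] as const;

function isTerminal(status: string): boolean {
  return (TERMINAL_STATUSES as readonly string[]).indexOf(status) !== -1;
}

// Success is deliberately narrower than "terminal": only COMPLETED counts.
function isSuccess(status: string): boolean {
  return status === "COMPLETED";
}

// Walks a sequence of observed statuses the way a poll loop would and
// reports where polling stops. Returns undefined if nothing terminal appears.
function firstTerminal(observed: string[]): { status: string; polls: number } | undefined {
  for (let i = 0; i < observed.length; i++) {
    const status = observed[i];
    if (status !== undefined && isTerminal(status)) return { status, polls: i + 1 };
  }
  return undefined;
}
```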
* fix: config bundle name resolution and add create-branch command

Accept local bundle names (from agentcore.json) in CLI and TUI when the
API stores them with a project-name prefix. The resolver now tries the
exact name, current prefix (projectBundle), and legacy underscore prefix
(project_Bundle) for backward compatibility.
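
The resolver order described above (exact name, current prefix, legacy underscore prefix) might look like the sketch below. `bundleNameVariants` and `resolveBundleName` are hypothetical names, and the real resolver may differ, for example in how it obtains the list of remote names.

```typescript
// Candidate names in lookup order: exact, current prefix, legacy underscore prefix.
function bundleNameVariants(projectName: string, bundleName: string): string[] {
  return [bundleName, `${projectName}${bundleName}`, `${projectName}_${bundleName}`];
}

// Returns the first candidate that exists remotely, or undefined if none match.
function resolveBundleName(
  projectName: string,
  localName: string,
  remoteNames: string[]
): string | undefined {
  for (const candidate of bundleNameVariants(projectName, localName)) {
    if (remoteNames.indexOf(candidate) !== -1) return candidate;
  }
  return undefined;
}
```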

Also adds `agentcore cb create-branch` to create a new branch on an
existing bundle via the update API, instead of requiring a whole new
bundle to be created.

* fix: address PR review — DRY name variants, pagination, sort, and tests

- Extract getBundleNameVariants to shared utility, use in both
  resolve-bundle.ts and useConfigBundleHub.ts
- Paginate listConfigurationBundles in resolveBundleByName so bundles
  beyond page 1 are found
- Sort versions by versionCreatedAt descending in create-branch to
  reliably pick the latest version as branch parent
- Add unit tests for getBundleNameVariants and resolveBundleByName
  (9 tests covering fast path, fallback, pagination, legacy names)
…ution (#147)

* fix: rename command to agentcore and use aws-targets for region resolution

- Rename CLI command from agentcore-dev to agentcore
- Resolve region from aws-targets.json across all evo commands:
  ab-test, pause, resume, stop, recommendation
- Previously these fell back to env vars or detectRegion() which
  could pick the wrong region. Now consistent with batch-eval and
  config-bundle which already used aws-targets.
- Fix pre-existing partition lint errors: use arnPrefix() and
  dnsSuffix() instead of hardcoded arn:aws: and .amazonaws.com

Note: --no-verify used because base branch has 11 pre-existing
typecheck errors in browser-tests/ and otel-metric-sink.ts that
are unrelated to this change.

* fix: switch distro mode from PRIVATE_DEV_DISTRO to PROD_DISTRO

Set DISTRO_MODE to PROD_DISTRO so the CLI uses the @aws/agentcore
package name and public npm registry.

* feat: add [preview] tag to all evo feature commands and TUI screens

Tag batch evaluation, recommendation, config bundle, and AB test
commands with [preview] in CLI help descriptions and TUI screen
titles to signal these are public preview features subject to change.
* feat: add --with-config-bundle flag to agent creation

When --with-config-bundle is passed (CLI) or "Config bundle" is selected
in the TUI Advanced Configuration, the CLI:

1. Auto-creates a config bundle named {AgentName}Config with smart
   defaults (system prompt + tool descriptions for the runtime)
2. Vends template variants that use the SDK to consume config bundle
   values at runtime via context.get_config_bundle()

Strands template: adds ConfigBundleHook (HookProvider) that injects
system prompt via event.agent.system_prompt and overrides tool
descriptions via BeforeToolCallEvent.

LangGraph template: adds ConfigBundleCallback (BaseCallbackHandler)
that injects system prompt via SystemMessage on chain start.

Both templates fall back to DEFAULT_SYSTEM_PROMPT when no config
bundle is deployed (e.g. local dev with agentcore dev).

* fix: use BedrockAgentCoreContext classmethod for config bundle access

get_config_bundle() is a @classmethod on BedrockAgentCoreContext, not
an instance method on the request context. Update both Strands and
LangGraph templates to use BedrockAgentCoreContext.get_config_bundle()
and remove the unused context parameter from hook constructors.
The batch evaluation API renamed `name` to `batchEvaluationName` in
requests/responses, and removed `tags` and `executionRoleArn` from
StartBatchEvaluation.

- New schema sends `batchEvaluationName` in start request
- Legacy fallback still sends `name` for backwards compat
- Response normalizers handle both `batchEvaluationName` and `name`
- Remove `executionRoleArn` from options, orchestrator, and CLI flag
- Remove `tags` from start options
The InvokeAgentRuntimeCommand accepts a baggage field with W3C baggage
format. When a config bundle is associated with the agent being
invoked, build the baggage string from deployed-state.json and pass
it through so the SDK can fetch the config bundle at runtime.

This enables the full config bundle flow: create → deploy → invoke
→ recommendation → invoke again with updated prompt, all without
redeploying the agent code.
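
W3C baggage is a comma-separated list of `key=value` members with percent-encoded values. A minimal sketch of building such a string; `buildBaggage` is a hypothetical name, and the keys in the usage note are placeholders, since the source does not list the keys the SDK expects.

```typescript
// Builds a W3C baggage header value from key/value pairs.
// Values are percent-encoded; keys are assumed to already be valid tokens.
function buildBaggage(entries: Array<[string, string]>): string {
  return entries.map(([key, value]) => `${key}=${encodeURIComponent(value)}`).join(",");
}
```

For example, `buildBaggage([["bundleId", "my bundle"], ["versionId", "v1"]])` yields `bundleId=my%20bundle,versionId=v1`. The key names here are illustrative only.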
* feat: add --request-header-allowlist CLI flag for agentcore add agent (#825) (#830)

Wire the existing requestHeaderAllowlist feature (already supported in the
TUI wizard and schema) into the non-interactive CLI path. Accepts
comma-separated header names that are auto-normalized with the
X-Amzn-Bedrock-AgentCore-Runtime-Custom- prefix.

- Add requestHeaderAllowlist field to CLI AddAgentOptions interface
- Register --request-header-allowlist option in AgentPrimitive.registerCommands()
- Add validation using existing validateHeaderAllowlist() utility
- Pass parsed headers through to both create and BYO agent paths
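
The header normalization can be sketched as below. The prefix comes from the commit message; `normalizeHeaderAllowlist` is a hypothetical name, and leaving already-prefixed names untouched is an assumption. The existing `validateHeaderAllowlist()` utility may apply further rules not shown here.

```typescript
const CUSTOM_HEADER_PREFIX = "X-Amzn-Bedrock-AgentCore-Runtime-Custom-";

// Splits a comma-separated flag value and prefixes each header name.
// Assumption: names that already carry the prefix (case-insensitive) are kept as-is.
function normalizeHeaderAllowlist(raw: string): string[] {
  const prefixLower = CUSTOM_HEADER_PREFIX.toLowerCase();
  return raw
    .split(",")
    .map((name) => name.trim())
    .filter((name) => name.length > 0)
    .map((name) =>
      name.toLowerCase().indexOf(prefixLower) === 0 ? name : `${CUSTOM_HEADER_PREFIX}${name}`
    );
}
```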

* fix: update E2E test regex to match new CUSTOM_JWT client-side error (#832)

PR #817 changed invoke to fail fast client-side when a CUSTOM_JWT agent
is invoked without a bearer token, producing a different error message.
The E2E assertions still expected the old server-side "authorization
mismatch" pattern, causing two test failures on main.

* fix: remove docker info check from container runtime detection (#829)

detectContainerRuntime() called `docker info` to verify the daemon was
running.  This requires access to the Docker socket and triggers an OS
password prompt on machines where the user is not in the docker group.

The check provided no real value: deploy falls back to CodeBuild anyway,
and dev will fail with a clear error from `docker build` if the daemon
is down.  Remove the `docker info` probe and rely on `which` + `--version`
only, matching the approach already used by detectContainerRuntimeSync().

Also removes the now-unused START_HINTS constant, getStartHint() helper,
and notReadyRuntimes tracking.

* fix: add missing AgentCore regions to match AWS documentation (#833)

Add 6 regions (ap-northeast-2, ca-central-1, eu-north-1, eu-west-2,
eu-west-3, sa-east-1) to AgentCoreRegionSchema to match the official
AWS Bedrock AgentCore supported regions documentation.

Closes #822

* fix: unhide import command from TUI main menu (#834)

The import command should be visible in the TUI command list so users
can discover and use it interactively.

* feat: add e2e tests for import command (#828)

* feat: add e2e tests for import command

Add end-to-end tests that exercise the import runtime, memory, and
evaluator commands against real AWS resources. Python fixture scripts
create resources via the Bedrock AgentCore control plane API, then
tests import them into a CLI project and verify status and invocation.

Also adds pip install boto3 to the full e2e CI workflow so the import
tests can run in GitHub Actions.

* Potential fix for pull request finding 'Unused import'

Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>

* Potential fix for pull request finding 'Unused import'

Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>

* Potential fix for pull request finding 'Unused import'

Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>

* Potential fix for pull request finding 'Unused import'

Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>

* fix: use triggering ref for workflow_dispatch in full e2e suite

The checkout step was hardcoded to ref: main, so workflow_dispatch
on a feature branch would still test main. Now it uses the dispatch
ref for manual triggers and main for push/schedule triggers.

* fix: upgrade boto3 in CI for bedrock-agentcore-control support

Ubuntu-latest ships boto3 1.34.46 which doesn't know about the
bedrock-agentcore-control service. Use --upgrade to get a version
that supports import test setup scripts.

* fix: address review feedback on import e2e tests

- Use default vended model IDs instead of hardcoded claude-3-haiku
- Pin boto3 version in CI workflow for deterministic builds
- Drop unnecessary boto3.session.Session() fallback in REGION resolution
- Preserve bugbash-resources.json on partial cleanup failure
- Log teardown deploy failures instead of swallowing silently
- Add comment explaining sequential setup script execution

* fix: pass default evaluator model from CLI source to setup scripts

Instead of hardcoding the evaluator model ID in the Python fixture
with a "keep in sync" comment, import DEFAULT_MODEL from the CLI
source and pass it as an env var to the setup script. The Python
script falls back to a hardcoded default for standalone use.

* style: fix prettier import ordering

* fix: address PR review feedback for import e2e tests

- Exit with code 1 when setup scripts fail to reach ready status
- Change default region fallback from us-west-2 to us-east-1
- Add S3 code object cleanup to cleanup_resources.py
- Document IAM role reuse policy in ensure_role() and cleanup script
- Add comment explaining why teardownE2EProject() is not used

---------

Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>

* feat: add auto-instrumentation to langchain agent template (#835)

* fix: add missing langchain instrumentor dependency to import flow (#836)

The LangGraph translator generates LangchainInstrumentor().instrument()
in imported agents' main.py, but pyproject-generator.ts did not include
opentelemetry-instrumentation-langchain in LANGGRAPH_DEPS. This causes
imported LangGraph agents to crash at runtime with ModuleNotFoundError.

* fix(ci): unpin boto3 in e2e workflow (#841)

The pinned boto3==1.38.0 did not include the bedrock-agentcore-control
service model, causing import-resources e2e tests to fail with
UnknownServiceError. Using latest boto3 ensures new AWS services are
always available.

* fix: add AWS_IAM as a valid authorizer type for gateway commands (#820)

* fix(e2e): use uv run for import test Python scripts (#845)

* fix(ci): unpin boto3 in e2e workflow

The pinned boto3==1.38.0 did not include the bedrock-agentcore-control
service model, causing import-resources e2e tests to fail with
UnknownServiceError. Using latest boto3 ensures new AWS services are
always available.

* fix(e2e): use uv run for import test Python scripts

The import e2e tests call python3 directly to run setup scripts that
use boto3. On CI runners, the system-installed boto3 is too old to
include the bedrock-agentcore-control service model. pip install boto3
installs to user site-packages which the child process doesn't pick up.

Switch to uv run --with boto3 python3 so the scripts always get a
current boto3 in an isolated environment. Remove the now-unnecessary
pip install step from the workflow.

* fix: only exclude root-level agentcore/ directory from packaging artifacts (#844)

The EXCLUDED_ENTRIES set unconditionally excluded any directory named
'agentcore' at any depth during zip/copy operations. This silently
dropped third-party dependency sub-modules that happen to use the same
directory name (e.g., langgraph_checkpoint_aws/agentcore/), causing
ImportError at runtime.

Remove 'agentcore' from the flat EXCLUDED_ENTRIES set and instead thread
the original rootDir through all recursive traversal functions. The
agentcore directory is now only excluded when its resolved path matches
join(rootDir, CONFIG_DIR) — i.e., it sits at the project root.
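
The root-only exclusion rule can be sketched like this. It is a simplified, POSIX-only string comparison so the example stays dependency-free; real code would more likely use `path.join` and `path.resolve`. The entries in `EXCLUDED_ENTRIES` are illustrative and not the CLI's actual list.

```typescript
const CONFIG_DIR = "agentcore";
// Names skipped at any depth (illustrative values only).
const EXCLUDED_ENTRIES = ["node_modules", ".git", "__pycache__"];

function stripTrailingSlash(p: string): string {
  return p.length > 1 && p.charAt(p.length - 1) === "/" ? p.slice(0, -1) : p;
}

// rootDir is threaded through the traversal so depth can be checked for CONFIG_DIR.
function shouldExclude(rootDir: string, entryPath: string): boolean {
  const root = stripTrailingSlash(rootDir);
  const entry = stripTrailingSlash(entryPath);
  const name = entry.slice(entry.lastIndexOf("/") + 1);
  if (EXCLUDED_ENTRIES.indexOf(name) !== -1) return true;
  // The config directory is excluded only when it sits directly under the project root.
  return entry === `${root}/${CONFIG_DIR}`;
}
```

With this check, a dependency sub-module such as `langgraph_checkpoint_aws/agentcore/` is kept because its path does not equal the root-level config directory.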

Also remove the hand-written fflate type shim
(src/lib/packaging/types/fflate.d.ts) that shadowed the package's own
type declarations. The shim only declared zipSync, making all other
fflate exports (including unzipSync) invisible to TypeScript. The real
fflate v0.8.2 ships complete types that resolve correctly under
moduleResolution: "bundler".

Closes #843

* fix(ci): update snapshots after CDK version sync in release workflow (#848)

The release workflow syncs @aws/agentcore-cdk to the latest npm version
but did not update the asset snapshot tests, causing the test-and-build
job to fail with a snapshot mismatch.

* fix(ci): move snapshot update after build step in release workflow (#849)

The snapshot update requires built output, so it must run after
npm run build, not before.

* fix(ci): bump @aws/agentcore-cdk to 0.1.0-alpha.18 and remove snapshot step from release (#850)

* fix(ci): bump @aws/agentcore-cdk to 0.1.0-alpha.18 and remove snapshot step from release

Sync the CDK template to the latest npm version and update the asset
snapshot. Remove the snapshot update step from the release workflow
since it runs the full test suite which requires uv.

* fix: use caret range for @aws/agentcore-cdk in CDK template

Use ^0.1.0-alpha.18 instead of pinning an exact version so new
releases are picked up automatically.

* fix: pin @aws/agentcore-cdk to exact version in CDK template (#852)

Revert to exact pinning (0.1.0-alpha.18) instead of caret range.
The release workflow handles syncing to the latest version.

* chore: bump version to 0.8.1 (#853)

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* feat: upgrade default Python runtime to PYTHON_3_14 (#837)

* feat: upgrade default Python runtime to PYTHON_3_14

Add PYTHON_3_14 as a supported runtime version and make it the default
for new agents and MCP tools. Updates schema enums, defaults, UI options,
packaging fallbacks, import mappings, and tests.

Verified end-to-end: deployed a runtime with PYTHON_3_14 to AgentCore
and confirmed successful invocation.

* chore: revert JSON schema change (auto-generated at release)

The JSON schema file is auto-regenerated during the release workflow.
Direct changes are rejected by the schema-check CI job.

* fix: address review — missed defaults, types, tests, and docs

- Update packCodeZipSync fallback in packaging/index.ts
- Add PYTHON_3_14 to llm-compacted/mcp.ts PythonRuntime type
- Update hardcoded runtimeVersion in AgentPrimitive.tsx
- Add PYTHON_3_14 to agent-env schema test
- Update TUI harness fixture default
- Update docs examples and runtime version list

* refactor: consolidate DEFAULT_PYTHON_VERSION into schema/constants

Define DEFAULT_PYTHON_VERSION once in schema/constants.ts and re-export
from the three TUI screen files that previously defined their own copy.
Replace hardcoded 'PYTHON_3_14' fallbacks in packaging and AgentPrimitive
with the shared constant. Future runtime version bumps now require a
single-line change.

* fix: detect Python ABI tag and usable wheels errors in platform retry logic

When numpy lacks pre-built wheels for a specific manylinux platform on
CPython 3.14, uv reports "no wheels with a matching Python ABI tag" or
"has no usable wheels" instead of the platform-specific errors the retry
logic was matching. This caused the packager to hard-fail on the first
platform candidate instead of retrying with a newer manylinux version
that does have compatible wheels.
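
A minimal sketch of the matching described above, assuming the retry logic
inspects uv's stderr text. The helper names and the result shape are
hypothetical; only the two quoted messages come from this commit.

```typescript
// Messages that indicate a wheel-compatibility failure worth retrying on a
// newer manylinux platform. Both strings are the ones named in this commit.
const RETRYABLE_WHEEL_ERRORS: readonly string[] = [
  "no wheels with a matching Python ABI tag",
  "has no usable wheels",
];

// Hypothetical helper: true when stderr contains one of the known messages.
function isRetryableWheelError(stderr: string): boolean {
  const text = stderr.toLowerCase();
  return RETRYABLE_WHEEL_ERRORS.some((msg) => text.includes(msg.toLowerCase()));
}

// Hypothetical retry loop: tries each platform candidate in order and stops
// at the first success, or at the first failure that is not retryable.
function pickPlatform(
  candidates: readonly string[],
  install: (platform: string) => { ok: boolean; stderr: string },
): string | undefined {
  for (const platform of candidates) {
    const result = install(platform);
    if (result.ok) return platform;
    if (!isRetryableWheelError(result.stderr)) return undefined;
  }
  return undefined;
}
```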

* chore: bump version to 0.8.2 (#874)

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* test: update asset snapshot for @aws/agentcore-cdk 0.1.0-alpha.19 (#875)

Regenerates the CDK package.json snapshot to match the version bump
landed in #852, which pinned @aws/agentcore-cdk to 0.1.0-alpha.19 in
the vended CDK template but did not refresh the corresponding snapshot.

* revert: roll back version bump to 0.8.1 (#877)

Reverts the version and changelog portions of #874 so the release
workflow can be re-run cleanly.

Only touches the version fields (package.json, package-lock.json) and
the 0.8.2 CHANGELOG entry. Leaves the schema regen and CDK pin in
place since the workflow will rewrite them on the next run.

* chore: bump version to 0.8.2 (#878)

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* docs: document executionRoleArn in runtime spec (#872)

The runtime spec table in configuration.md omitted the existing
optional executionRoleArn field, leading users (see issue #870) to
believe the CLI had no way to bring their own IAM execution role.
The field is already supported in the schema.

Confidence: high
Scope-risk: narrow

* feat: add agent inspector web UI for `agentcore dev` (#871)

* fix: defer policy engine write and harden policy flow UX (#856)

* fix: defer policy engine write to disk until flow completes

Previously, pressing Escape on the gateway selection screen during
policy engine creation would skip to the success screen because the
engine was already written to agentcore.json at the name step. Now
the disk write is deferred until the user completes the entire flow,
so Escape correctly navigates back to the previous step without
persisting a half-configured engine.

Constraint: Must not break non-interactive CLI path which still writes immediately via primitive
Rejected: Only change Escape to go back without deferring write | engine would still be persisted on back
Confidence: high
Scope-risk: narrow

* fix: preserve engine name when navigating back from gateway selection

When pressing Escape on the gateway screen to go back to the name step,
the previously entered engine name was lost because AddPolicyEngineScreen
remounted with a generated default. Now the entered name is stored in
pendingEngineName state and passed back as initialName so the user sees
their original input.

Constraint: Must not change flow state union type to keep diff minimal
Rejected: Carry name in FlowState union variant | adds complexity to type for one field
Confidence: high
Scope-risk: narrow

* chore: remove TUI harness test accidentally committed

This test requires a live terminal session and cannot run as a unit test
in CI. It was an untracked local file that got staged by mistake.

* chore: bump version to 0.9.0 (#881)

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* fix: use caret range for @aws/agentcore-cdk in CDK template (#882)

Users always get the latest compatible CDK constructs on npm install
without requiring the CLI release workflow to pin-sync the version.
Removes the now-redundant sync step from the release workflow.

* fix: agent-inspector frontend assets missing from build (#883)

* fix: agent-inspector frontend assets missing from build

* fix: resolve React ref-during-render and setState-in-effect lint errors

- Wrap onReadyRef update in useEffect to avoid ref mutation during render
- Replace loggerRef.current access in return object with logFilePath state
- Replace useEffect+setState with state-based prev-step tracking pattern

Confidence: high
Scope-risk: narrow

---------

Co-authored-by: Jesse Turner <ajesstur@amazon.com>

* fix: revert version to 0.8.2 (#885)

* fix: revert version to 0.8.2

* fix: remove 0.9.0 entry from changelog

* Release v0.9.0 (#887)

* chore: bump version to 0.9.0

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* Revise CHANGELOG for version 0.8.2 updates

Updated CHANGELOG.md to include recent fixes and additions.

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Jesse Turner <57651174+jesseturner21@users.noreply.github.com>

* Release v0.9.1 (#888)

* chore: bump version to 0.9.1

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* Clean up CHANGELOG for version 0.9.1

Removed fixed issues and other changes from version 0.9.1.

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Jesse Turner <57651174+jesseturner21@users.noreply.github.com>

* fix: propagate sessionId as A2A contextId in Inspector proxy (#892)

The Agent Inspector Chat UI already generates and tracks a sessionId per
conversation and forwards it to the dev-server proxy on each invocation.
However, handleA2AInvocation dropped this sessionId when building the
A2A JSON-RPC body, so every turn arrived at the A2A agent with a fresh,
auto-generated contextId. This broke multi-turn memory for any A2A
agent that keys session state on the A2A contextId (e.g., Strands
FileSessionManager(session_id=context.context_id)).

Map sessionId to the A2A Message.contextId field when present. This is
spec-compliant per A2A Protocol Spec §3.4.3 (clients MAY include
contextId in subsequent messages to indicate continuation) and §3.4.1
(when contextId is omitted, the agent MAY generate a fresh one).
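
A rough sketch of the mapping, not the proxy's actual code. The function
name, the type shapes, and the "message/send" method string are assumptions
for illustration; the only behavior taken from this commit is that sessionId
is copied to Message.contextId when present and omitted otherwise.

```typescript
// Hypothetical, simplified shapes for an A2A JSON-RPC request.
interface A2AMessage {
  role: "user";
  parts: { kind: "text"; text: string }[];
  messageId: string;
  contextId?: string;
}

interface A2ARequest {
  jsonrpc: "2.0";
  id: string;
  method: string;
  params: { message: A2AMessage };
}

function buildA2ABody(text: string, messageId: string, sessionId?: string): A2ARequest {
  const message: A2AMessage = {
    role: "user",
    parts: [{ kind: "text", text }],
    messageId,
    // Set contextId only when a session exists; leaving it out lets the
    // agent generate a fresh one.
    ...(sessionId ? { contextId: sessionId } : {}),
  };
  return { jsonrpc: "2.0", id: messageId, method: "message/send", params: { message } };
}
```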

Closes #891

Co-authored-by: kashinoki38 <21358299+kashinoki38@users.noreply.github.com>

* fix(invoke): pass session ID to local invoke log files (#894)

The --session-id flag value was correctly sent to Runtime but never
passed to InvokeLogger, causing local log files to always show
"Session ID: none". Wire options.sessionId through to both the
InvokeLogger constructor and logPrompt() calls in exec and standard
invoke modes.

Closes #890

* feat: add session filesystem storage support (#893)

Adds --session-storage-mount-path to agentcore create and agentcore add agent,
wiring the mount path through schema, CLI flags, TUI wizard, template rendering,
and CDK mapping. File tools (file_read, file_write, list_files) with path traversal
protection are scaffolded into all 8 framework templates when storage is configured.
Fixes A2A sessionId not being forwarded to InvokeAgentRuntimeCommand. Validation
is centralised in SessionStorageSchema with no regex duplication across validators
or TUI.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: agentcore add component opens component wizard directly (#896)

When running `agentcore add memory` (or any other component), the TUI
was always showing the generic resource selection screen. This is because
AddFlow always started in the 'select' state regardless of which
subcommand invoked it.

Added an `initialResource` prop to AddFlow that maps directly to the
correct wizard state, skipping the selection screen. Each primitive now
passes its resource type when rendering AddFlow in TUI fallback mode.

Closes #857

* docs: update vended AGENTS.md, README.md, and llm-context references (#898)

* docs: update vended AGENTS.md, README.md, and llm-context references

Rewrite vended documentation to reflect the current state of the CLI:
- Add all current resources (gateways, evaluators, policies, online-eval)
- Add all CLI commands (logs, traces, eval, pause, resume, fetch, import)
- Add protocols (HTTP, MCP, A2A) and all supported frameworks
- Add Node.js runtime versions alongside Python
- Add VPC network mode documentation
- Reference @aws/agentcore-cdk L3 constructs and CDK repo
- Add mcp.ts to llm-context README file table
- Update internal assets/AGENTS.md with full directory layout

* test: update asset snapshot tests to match new docs content

* chore: remove single-commit-must-match-PR-title validation (#897)

The validateSingleCommit + validateSingleCommitMatchesPrTitle options
force contributors to keep their commit message in sync with the PR
title, which is unnecessary friction — squash-merge already uses the
PR title as the final commit message regardless of individual commit
messages.

* chore: remove preview bump type from release workflow (#847)

Preview versioning is no longer used. Remove the preview and
preview-major options from the release workflow dispatch and
all supporting logic in the bump-version script.

* feat: add AG-UI (AGUI) as fourth first-class protocol mode (#858)

* feat: add AG-UI (AGUI) as fourth first-class protocol mode

Add AGUI protocol support across the full CLI stack:

- Schema: Add 'AGUI' to ProtocolModeSchema, PROTOCOL_FRAMEWORK_MATRIX
  (Strands, LangChain_LangGraph, GoogleADK), and RESERVED_PROJECT_NAMES
- Types: New agui-types.ts with a 27-member event type enum, typed interfaces,
  the parseAguiEvent parser, and the buildAguiRunInput helper
- Templates: Python AGUI agent templates for Strands (ag-ui-strands),
  LangGraph (ag-ui-langgraph), and GoogleADK (ag-ui-adk) frameworks
- Invoke: invokeAguiRuntime with dual-stream architecture (typed events
  for TUI, text-only for CLI), local dev invokeAguiStreaming with
  RunAgentInput body, protocol dispatch in invokeForProtocol
- TUI: Rich AGUI event rendering with MessagePart type (text, tool_call,
  reasoning, error) in InvokeScreen, AGUI placeholder text in DevScreen
- Validation: Updated error messages and help text to include AGUI
- Tests: 24 unit tests for parseAguiEvent/buildAguiRunInput, snapshot
  updates for new template files

* fix: address review findings for AGUI protocol implementation

HIGH fixes:
- Add sessionId to AguiInvokeOptions, pass as runtimeSessionId (H-1)
- Throw early for bearerToken on AGUI (not yet supported) (H-2)
- Add bedrock-agentcore dep to all 3 template pyproject.toml files (H-4/5/6)
- Fix LangGraph /ping to return "healthy" not "ok" (H-7)
- Match TOOL_CALL_RESULT to tool_call parts by toolCallId, not position (H-12)
- Add complete enum coverage to agui-types test (H-15)

MEDIUM fixes:
- Fix langchain version pin from 1.2.0 (nonexistent) to 0.3.0 (M-11)
- Remove invalid allow_credentials=True with wildcard CORS (M-12)
- Replace in-place parts mutation with immutable updates for React safety (M-5)
- Surface readLoop errors in consumer generators instead of swallowing (M-1)
- Disable retry once streaming starts to prevent duplicate output (M-3)
- Handle TEXT_MESSAGE_CHUNK events alongside TEXT_MESSAGE_CONTENT (M-2)
- Update gemini model from 2.0-flash to 2.5-flash in GoogleADK (M-8)
- Add missing event type exports to barrel index.ts (M-18)

LOW fixes:
- Move AGUI imports to top-level in action.ts (L-1)
- Gate OTEL_SDK_DISABLED on LOCAL_DEV env var in Strands template (L-9)
- Add explanatory comment for LANGGRAPH_FAST_API env var (L-10)

* fix: add AGUI to TUI protocol picker and dev mode dispatch

- Add AGUI option to PROTOCOL_OPTIONS in generate/types.ts so users
  can select AGUI from the interactive create/add wizards
- Add AGUI case to useDevServer.ts sendMessage dispatch so local dev
  TUI sends correct RunAgentInput body via invokeAguiStreaming
- Add AGUI case to dev/command.tsx non-interactive dispatch so
  agentcore dev "prompt" uses invokeForProtocol('AGUI')

* fix: A2A GoogleADK template passes model=None to Agent constructor

load_model() returns None (it only sets GOOGLE_API_KEY env var as a
side effect). Passing model=load_model() to Agent() results in
model=None, causing the agent to either crash or use a default model.

Fix: call load_model() standalone for the side effect, then pass the
model ID string directly to Agent().

* chore: update protocol references to include AGUI across CLI

- AddScreen description: 'HTTP, MCP, A2A' → includes AGUI
- create --protocol help text: includes AGUI
- JSDoc comments in agent/types.ts, templates/types.ts, agent-env.ts
- codezip-dev-server comment: 'MCP/A2A' → 'MCP/A2A/AGUI'
- agent-env.test.ts: add AGUI to protocol acceptance test

* fix: add InvokeLogger to AGUI CLI path and improve UX polish

- Add InvokeLogger to AGUI CLI invoke block (action.ts) for prompt/response
  logging and log file creation — parity with HTTP invoke path
- Track RUN_ERROR events in textStream and return success: false when
  agent errors are detected
- Pass sessionId and logger to invokeAguiRuntime options
- Improve AGUI protocol picker description from circular 'AG-UI
  agent-to-user interaction protocol' to actionable 'Stream rich agent
  events to frontends (AG-UI)'

* fix: template bugs found during deployment testing

Bugs found by deploying all 3 AGUI frameworks to AWS and invoking:

- Bump ag-ui-strands to >= 0.1.4 (0.1.3 crashes on strands >= 1.19.0
  because it accesses the removed private attribute agent.state._state)
- Remove parallel_tool_calls=False from LangGraph template (Bedrock
  rejects this OpenAI-specific parameter with ValidationException)
- Remove aws-opentelemetry-distro from GoogleADK template (conflicts
  with google-adk >= 1.16.0 OpenTelemetry dependencies — agents using
  this template should set instrumentation.enableOtel: false)

* fix: add ToolNode + ReAct loop to AGUI LangGraph template

The AGUI LangGraph template had a single-node graph (chat → END) with
no tool execution loop. When the model called add_numbers, the graph
exited without executing the tool or generating a text response,
producing "(no content in AGUI response)" in agentcore dev.

Template fix:
- Add ToolNode(tools=backend_tools) as a "tools" node
- Replace set_finish_point("chat") with tools_condition conditional edge
- Add edge from "tools" back to "chat" for the ReAct loop
- Separate backend_tools list from frontend tools (state["tools"])

This matches the standard LangGraph ReAct pattern (agent → tools →
agent → ... → END) and how the HTTP/A2A templates use create_react_agent.

Dev invoke fix:
- invoke-agui.ts now tracks TOOL_CALL_START/ARGS/END/RESULT events
- When no text is produced but tool calls were seen, surfaces them
  as [Tool: name(args)] instead of generic "(no content)" message

* fix: address all review findings from AG-UI protocol code review

16 issues from 4-lane parallel code review, all addressed:

Critical fixes:
- Strands template: use session_manager_provider from ag-ui-strands 0.1.7
  instead of hardcoded "default-session"/"default-user"
- Dev client: persist threadId per session for multi-turn conversations
- CRLF handling: use /\r?\n/ in SSE parsers (invoke-agui + invoke.ts)
- Malformed JSON no longer yielded as content (shared parser skips)
- Unbounded aguiEvents array replaced with bounded cursor-based pruning
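
The CRLF and malformed-JSON items above can be sketched as follows. This is
an illustration over an already-decoded payload, not the shared parser
itself; parseSseEvents is a hypothetical name and chunk reassembly is left
out.

```typescript
// Hypothetical parser: extracts JSON events from SSE "data:" lines.
function parseSseEvents(payload: string): unknown[] {
  const events: unknown[] = [];
  // /\r?\n/ accepts both LF and CRLF line endings.
  for (const line of payload.split(/\r?\n/)) {
    if (!line.startsWith("data:")) continue;
    const data = line.slice("data:".length).trim();
    if (data === "") continue;
    try {
      events.push(JSON.parse(data));
    } catch {
      // Malformed JSON is skipped instead of being yielded as content.
    }
  }
  return events;
}
```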

Structural improvements:
- Unified SSE parser (agui-parser.ts) replaces two divergent parsers
  in invoke-agui.ts (dev) and agentcore.ts (deployed). Net -39 LOC.
- Dual-consumer support with singleConsumer mode for dev path
- AguiEvent type union completed (4 missing members added)
- Unintentional dynamic imports converted to static (per AGENTS.md)

Python template fixes:
- LangGraph: add LangchainInstrumentor + dep, remove unused END import,
  MemorySaver already removed in prior commit
- GoogleADK: remove dead load_model() + bedrock-agentcore dep, remove
  hardcoded user_id (ADK defaults to per-thread identity)
- Strands: bump ag-ui-strands pin to >= 0.1.7

enableOtel plumbing:
- Dockerfile CMD conditional on enableOtel (Handlebars)
- enableOtel threaded through AgentRenderConfig + BaseRenderer
- Import path: ProtocolModeSchema.safeParse replaces unsafe as-cast
- Import path: MCP enableOtel clamped regardless of YAML value
- GoogleADK uses plain opentelemetry-distro (aws-distro conflicts)

DX + testing:
- formatZodIssue falls back to issue.code instead of literal "undefined"
- New dockerfile-render.test.ts covers both enableOtel branches
- All snapshots updated

* fix: add AGUI to JSON schema protocol enum

The static JSON schema file used for CDK validation was not updated
when AGUI was added to the Zod schema. This caused CDK synth to
reject protocol: "AGUI" with a misleading validation error.

* fix: restore MemorySaver in AGUI LangGraph template

ag_ui_langgraph calls aget_state(config) with thread_id which requires
a checkpointer. Without it, every invocation throws ValueError: No
checkpointer set. The original msgpack crash only triggers with numbers
exceeding 2^63 (ormsgpack limitation), not with normal large numbers.
Bug bash confirmed: 325435 + 435634563456456 works correctly with
MemorySaver present.

* fix: address final review findings in AGUI parser

- Wrap reader.releaseLock() in try/catch to prevent error masking
  if lock is already released (HIGH from code review)
- Replace textStream! non-null assertion with runtime guard
  (MEDIUM from code review)

* fix: use toolCallId for TOOL_CALL_RESULT matching in dev client

Previously, results were matched by activeToolName, which TOOL_CALL_END had
already reset to ''. The find() never matched and fell through to the last
tool call, which is wrong for parallel tool calls. Results are now matched by
toolCallId, the unique identifier AG-UI provides per tool invocation.
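
A small sketch of ID-based matching, assuming a simplified part shape. The
ToolCallPart type and applyToolResult name are hypothetical and stand in for
whatever the dev client uses.

```typescript
// Hypothetical, simplified tool-call part.
interface ToolCallPart {
  type: "tool_call";
  toolCallId: string;
  name: string;
  result?: string;
}

// Returns a new array with the result attached to the part whose toolCallId
// matches. Other parts, and the input array, are left untouched.
function applyToolResult(
  parts: readonly ToolCallPart[],
  toolCallId: string,
  result: string,
): ToolCallPart[] {
  return parts.map((part) =>
    part.toolCallId === toolCallId ? { ...part, result } : part,
  );
}
```

Because the lookup uses the ID rather than position or name, two parallel
calls to the same tool each receive their own result.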

* revert: remove manual JSON schema edit (auto-generated during release)

The schemas/ directory is auto-regenerated from Zod schemas during the
release workflow. AGUI is already in ProtocolModeSchema (constants.ts)
and will appear in the JSON schema on next release.

* fix: add configurable PORT env var to AGUI templates + update snapshots

All 3 AGUI templates now read PORT from env with default 8080:
  uvicorn.run(app, host="0.0.0.0", port=int(os.environ.get("PORT", "8080")))

Addresses PR review comment requesting configurable port for local testing.

* fix: use AG-UI in user-facing strings instead of AGUI

Schema enum stays 'AGUI' (internal), but TUI display text uses
'AG-UI' which is the protocol's official name.

* fix: restore credential wiring in AGUI GoogleADK template

The template was missing the load_model() call and the bedrock-agentcore dep,
so GOOGLE_API_KEY was never set from the AgentCore credential. Both
dev mode and deployed agents failed with "No API key provided."

* fix: convert AGUI dynamic import to static in invokeForProtocol

AGENTS.md requires all imports at top of file. The dynamic import had
no meaningful performance benefit — AGUI parser is ~4KB in a 2.1MB CLI.

* feat: support preview releases from feature branches (#905)

The release workflow was hardcoded to only publish from main with the
`latest` npm dist-tag. This made it impossible to publish prerelease
versions from feature branches.

Now when the workflow runs from a non-main branch, it sets the npm
dist-tag to `preview` and targets the source branch for the release PR.
Stable bump types (patch, minor, major) are blocked on non-main branches
to prevent accidental overwrites of the `latest` tag.
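
The branch rule above, sketched in TypeScript for illustration only. The
real logic lives in the release workflow and bump-version script; the
function name and error text here are hypothetical.

```typescript
// Stable bump types named in this commit.
const STABLE_BUMPS: readonly string[] = ["patch", "minor", "major"];

// Hypothetical helper: picks the npm dist-tag for a release run.
function resolveDistTag(branch: string, bumpType: string): "latest" | "preview" {
  if (branch === "main") return "latest";
  if (STABLE_BUMPS.includes(bumpType)) {
    // Blocking stable bumps off main protects the `latest` tag.
    throw new Error(`Bump type "${bumpType}" is only allowed on main`);
  }
  return "preview";
}
```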

* fix(invoke): show full session ID and print resume command on exit (#904)

* fix(invoke): show full session ID and print resume command on exit

The invoke TUI truncated the session ID to 8 characters, making it
impossible to copy the full UUID needed for --session-id. Additionally,
there was no guidance on how to resume a session after exiting.

- Display full session ID in the TUI header instead of truncating
- Print a colored resume command after TUI exit (both Esc and Ctrl+C)
- Use Ink's unmount() instead of process.exit(0) for clean shutdown,
  which also fixes the update notifier not showing on Esc exit

* fix: only show resume message when a session was actually used

* feat: add GovCloud multi-partition support (#908)

Add partition-aware ARN construction, endpoint URL generation, and
console URL generation to support aws-us-gov (and future aws-cn)
partitions.

- Create src/cli/aws/partition.ts with getPartition, arnPrefix,
  dnsSuffix, serviceEndpoint, and consoleDomain utilities
- Replace all hardcoded arn:aws: in ARN template literals with
  arnPrefix(region)
- Update ARN regex patterns to accept any partition (arn:[^:]+:)
- Replace hardcoded amazonaws.com in endpoint URLs with
  serviceEndpoint()
- Replace hardcoded console.aws.amazon.com with consoleDomain()
- Add us-gov-west-1 to AgentCoreRegionSchema, BEDROCK_REGIONS,
  and LLM compacted types
- Add aws-us-gov to cdk.json target-partitions
- Fix execution-role-policy.json to use partition wildcard (arn:*)
- Add 15 unit tests for partition utilities
- Document multi-partition rules and checklists in AGENTS.md
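
A minimal sketch of what such partition helpers might look like. The
function names match the list above, but the signatures, return formats, and
region-prefix detection are assumptions, not the contents of partition.ts.

```typescript
type Partition = "aws" | "aws-us-gov" | "aws-cn";

// Assumed detection rule: derive the partition from the region prefix.
function getPartition(region: string): Partition {
  if (region.startsWith("us-gov-")) return "aws-us-gov";
  if (region.startsWith("cn-")) return "aws-cn";
  return "aws";
}

// Assumed return format: "arn:<partition>", without a trailing colon.
function arnPrefix(region: string): string {
  return `arn:${getPartition(region)}`;
}

function dnsSuffix(region: string): string {
  return getPartition(region) === "aws-cn" ? "amazonaws.com.cn" : "amazonaws.com";
}

function serviceEndpoint(service: string, region: string): string {
  return `https://${service}.${region}.${dnsSuffix(region)}`;
}

// Accepts any partition, matching the arn:[^:]+: pattern from the list above.
const ANY_PARTITION_ARN = /^arn:[^:]+:/;
```

For example, `${arnPrefix("us-gov-west-1")}:iam::123456789012:role/example`
yields an ARN in the aws-us-gov partition instead of a hardcoded arn:aws one.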

* feat: remove deployed/local from status legend (#936)

* feat: remove deployed/local from status legend

* fix: prettier

* feat: upgrade agent inspector to 0.2.1 (#937)

* fix(deploy): honor aws-targets.json region for all SDK and CDK calls (#925)

* fix(deploy): honor aws-targets.json region for all SDK and CDK calls (#924)

AWS SDK clients constructed by @aws-cdk/toolkit-lib internally (for
CloudFormation, S3 asset upload, etc.) do not receive an explicit region
option and fall back to the SDK's default region resolution chain
(AWS_REGION -> AWS_DEFAULT_REGION -> shared config). When a user's
aws-targets.json specified a non-default region but those env vars were
unset, resources were created in the SDK default region instead of the
configured target.

Promote target.region to AWS_REGION and AWS_DEFAULT_REGION for the
lifetime of deploy and teardown operations, restoring prior values in a
finally block. This ensures downstream SDK clients (explicit and
toolkit-lib internal) agree on the target region.
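
The override-and-restore pattern can be sketched like this. It is an
illustration under assumptions: withRegionEnv and restoreVar are
hypothetical names, and the environment is passed in as a plain object
rather than read from the process.

```typescript
type Env = Record<string, string | undefined>;

// Restores a variable to its prior value, deleting it if it was unset.
function restoreVar(env: Env, key: string, value: string | undefined): void {
  if (value === undefined) delete env[key];
  else env[key] = value;
}

// Hypothetical wrapper: sets both region variables for the lifetime of fn
// and restores the prior values in a finally block, even if fn throws.
async function withRegionEnv<T>(env: Env, region: string, fn: () => Promise<T>): Promise<T> {
  const prevRegion = env["AWS_REGION"];
  const prevDefault = env["AWS_DEFAULT_REGION"];
  env["AWS_REGION"] = region;
  env["AWS_DEFAULT_REGION"] = region;
  try {
    return await fn();
  } finally {
    restoreVar(env, "AWS_REGION", prevRegion);
    restoreVar(env, "AWS_DEFAULT_REGION", prevDefault);
  }
}
```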

Covers CLI non-interactive deploy (handleDeploy) and the interactive TUI
deploy/teardown (useCdkPreflight, destroyTarget). Invoke/status/eval
already pass target.region explicitly.

* fix(deploy): restore region env on TUI error states; consolidate barrel exports

Review feedback:
1. TUI preflight error branches called setPhase('error') without calling
   restoreRegionEnv(). Add a useEffect guarded on phase === 'error' so every
   error path restores the env override without threading the call into
   every branch.
2. Export applyTargetRegionToEnv from the aws barrel for consistency with
   withTargetRegion. Update CLI deploy, teardown, and TUI preflight hook
   to import from the barrel instead of the deep path.

* chore: bump version to 0.10.0 (#944)

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* chore: sync with public/main (2026-04-27) (#143)

* feat: add GitHub Action for automated PR review via AgentCore Harness (#934)

* feat: add GitHub Action for automated PR review via AgentCore Harness

Adds a workflow that reviews PRs using Bedrock AgentCore Harness.
The harness runs an AI agent in an isolated microVM with gh, git,
and pre-cloned repos that fetches PR diffs and posts review comments.

Workflow:
- Triggers on PR open/reopen for agentcore-cli-devs team members
- Supports manual workflow_dispatch for any PR URL
- Adds/removes ai-reviewing label during review
- Authenticates via GitHub OIDC to assume AWS role

Files:
- .github/workflows/pr-ai-review.yml — main workflow
- .github/scripts/python/harness_review.py — harness invocation script
- .github/scripts/python/harness_config.py — config from env vars
- .github/scripts/models/ — local boto3 service model (InvokeHarness
  not yet in standard boto3)

Required secrets:
- HARNESS_AWS_ROLE_ARN — IAM role ARN for OIDC
- HARNESS_ACCOUNT_ID — AWS account ID
- HARNESS_ID — Harness ID

* refactor: replace local service model with raw HTTP + SigV4 signing

Eliminates the 220KB bundled service model by using direct HTTP requests
with SigV4 authentication to invoke the harness endpoint. No extra
dependencies needed — urllib3, SigV4Auth, and EventStreamBuffer are all
part of botocore/boto3.

Rejected: invoke_agent_runtime API | server rejects harness ARNs with ResourceNotFoundException
Confidence: high
Scope-risk: moderate

* refactor: inline harness config into review script

Remove separate harness_config.py — env vars are read directly in
harness_review.py. One less file to maintain, config is still
driven entirely by environment variables set in the GitHub workflow.

* refactor: extract invoke_harness helper for cleaner main flow

* refactor: simplify config and improve script readability

- Replace HARNESS_ACCOUNT_ID + HARNESS_ID with single HARNESS_ARN env var
- Extract prompts into separate .md files in .github/scripts/prompts/
- Extract stream parsing into print_stream() function
- Add close_group() helper to deduplicate ::group:: bookkeeping

* refactor: separate event parsing from display logic

Extract parse_events() generator to handle binary stream decoding,
keeping print_stream() focused on formatting and log groups.

* docs: add explanatory comments to harness review functions

* refactor: derive region from HARNESS_ARN instead of separate env var

Eliminates HARNESS_REGION env var — the region is extracted from the
ARN directly, so there's no risk of a mismatch causing confusing
SigV4 auth errors.

* chore: rename label to agentcore-harness-reviewing

* refactor: move auth check to job level so entire review is skipped early

Split into authorize + ai-review jobs. The ai-review job only runs
if the PR author is authorized (team member or write access) or if
triggered via workflow_dispatch. Removes repeated if conditions from
every step.

* chore: exclude AI prompt templates from prettier

Prompt markdown files use intentional formatting that prettier
would reflow, breaking the prompt structure.

* fix: buffer streaming text to avoid per-token log lines in GitHub Actions (#946)

Each text delta from the harness was printed individually with flush,
creating a separate log line per token. Now text is buffered and
flushed as complete lines at block boundaries.

* fix: allow code-based evaluators in online eval configs (#947)

* fix: allow code-based evaluators in online eval configs

Remove restrictions that blocked code-based evaluators from being used
in online evaluation configs. The service now supports code-based
evaluators for online evaluation.

Changes:
- Remove code-based evaluator block in OnlineEvalConfigPrimitive
- Remove code-based evaluator validation in schema superRefine
- Remove code-based evaluator filter in TUI evaluator picker

* style: fix prettier formatting

* fix: add TTY detection before TUI fallbacks to prevent agent/CI hangs (#949)

* fix: add TTY detection before TUI fallbacks to prevent agent/CI hangs

When commands are invoked without flags in non-interactive environments
(CI, piped stdin, agent automation), the CLI falls through to Ink TUI
rendering which hangs indefinitely. Add a requireTTY() guard at every
TUI entry point that checks process.stdout.isTTY and exits with a
helpful error message directing users to --help for non-interactive flags.

Closes #685

* fix: check both stdin and stdout isTTY in requireTTY guard

The hang from #685 is caused by stdin not being a TTY (Ink reads
keyboard input from stdin), not stdout. Check both stdin and stdout
so the guard fires for piped stdin, redirected stdout, and CI
environments where both are non-TTY.
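
A sketch of the two-stream check, with the streams injected so it can be
exercised without a terminal. The TtyStream type, the throw (the CLI exits
with a message instead), and the error wording are assumptions.

```typescript
// Minimal shape shared by stdin- and stdout-like streams.
interface TtyStream {
  isTTY?: boolean;
}

// Interactive only when both the input and the output side are TTYs.
function isInteractive(stdin: TtyStream, stdout: TtyStream): boolean {
  return Boolean(stdin.isTTY) && Boolean(stdout.isTTY);
}

// Hypothetical guard: this sketch throws rather than exiting the process.
function requireTTY(stdin: TtyStream, stdout: TtyStream): void {
  if (!isInteractive(stdin, stdout)) {
    throw new Error(
      "This command requires an interactive terminal. Run with --help to see non-interactive flags.",
    );
  }
}
```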

* fix: agentcore dev not working on Windows (#951)

* fix: use pull_request_target for fork PR support (#958)

* fix: make label step non-blocking for fork PRs

Fork PRs get read-only GITHUB_TOKEN regardless of workflow permissions,
causing the addLabels API call to fail with 403. This crashed the entire
job before the review could run. continue-on-error lets the review
proceed even when labeling fails.

* fix: use pull_request_target for full write access on fork PRs

pull_request gives a read-only GITHUB_TOKEN for fork PRs, preventing
labels and secrets from working. pull_request_target runs in the base
repo context with full permissions. This is safe because we never
check out or execute fork code — the harness fetches the PR diff via
the GitHub API.

* fix: lower eventExpiryDuration minimum from 7 to 3 days (closes #744) (#956)

The AWS CreateMemory API allows a minimum of 3 days, but the CLI schema
was rejecting values below 7. Update the Zod schema, LLM compacted
types, import clamping logic, and all related tests.

* fix: display session ID after CLI invoke completes (#957)

* fix: display session ID after CLI invoke completes (closes #664)

The streaming and non-streaming invoke responses include a session ID
from the runtime, but the CLI paths discarded it. Now prints the
session ID and a resume command hint after invoke output.

* fix: include sessionId in AGUI protocol invoke result

* test: add browser tests for agent inspector (#938)

* feat: add telemetry schemas and client (#941)

* chore: bump version to 0.11.0 (#967)

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* fix(invoke): auto-generate session ID for bearer-token invocations (#953)

Closes #840

When invoking an agent with a bearer token (OAuth/CUSTOM_JWT) and no
session ID, `AgentCoreMemoryConfig` raised a Pydantic validation error
because `session_id=None` is rejected. Unlike SigV4 callers, bearer-token
callers do not get a server-side auto-generated runtime session ID.

Two-layer fix:

1. CLI synthesizes a UUID in `invoke` action when `--bearer-token` is set
   and `--session-id` is missing, using the existing `generateSessionId`
   helper. Covers both explicit `--bearer-token` and the CUSTOM_JWT
   auto-fetch path.

2. Strands memory session templates (http, agui, a2a) synthesize a UUID
   when `session_id` is falsy before constructing AgentCoreMemoryConfig.
   Protects direct runtime callers (curl, custom apps) who forget the
   `X-Amzn-Bedrock-AgentCore-Runtime-Session-Id` header.

Snapshot tests updated.
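
The CLI-side rule (layer 1) might look roughly like this. The option shape
and resolveSessionId name are hypothetical, and the generator is injected
here to stand in for the existing generateSessionId helper.

```typescript
// Hypothetical subset of the invoke options.
interface InvokeOptions {
  bearerToken?: string;
  sessionId?: string;
}

// Keeps an explicit session ID, synthesizes one for bearer-token calls, and
// otherwise returns undefined so the server can assign one.
function resolveSessionId(
  options: InvokeOptions,
  generateSessionId: () => string,
): string | undefined {
  if (options.sessionId) return options.sessionId;
  return options.bearerToken ? generateSessionId() : undefined;
}
```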

* fix: show 'Computing diff changes...' step during deploy diff phase (#952)

The deploy TUI appeared frozen for 5-15 seconds between preflight
completion and 'Publish assets' while cdkToolkitWrapper.diff() ran
silently with no step marked as running.

Add a dedicated pre-deploy diff step that transitions running -> success
around the diff call so StepProgress always has something to highlight.

Closes #781

* test: split browser tests into its own job, fix logs path (#975)

* feat(invoke): add --prompt-file and stdin support for long prompts (#974)

* feat(invoke): add --prompt-file and stdin support for long prompts

Long prompts hit shell argument limits (E2BIG, typically 128KB-2MB)
when passed as positional args. This adds two new sources:

- --prompt-file <path>: read prompt from a file
- piped stdin: when no prompt is given and stdin is not a TTY,
  read the prompt from stdin

Precedence is hybrid and backward-compatible:
  --prompt > positional > --prompt-file > stdin

--prompt-file combined with piped stdin content returns an explicit
collision error rather than silently picking one.
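
The precedence and collision rule above could look something like the following sketch. The `PromptSources` shape and `resolvePrompt` name are hypothetical, and it assumes the collision check only applies when neither `--prompt` nor a positional prompt was given, which the commit message does not state explicitly.

```typescript
// Hypothetical input shape: a field is undefined when that source is absent.
interface PromptSources {
  promptFlag?: string; // value of --prompt
  positional?: string; // positional argument
  promptFile?: string; // contents already read from --prompt-file
  stdin?: string; // piped stdin contents (only read when stdin is not a TTY)
}

// Precedence: --prompt > positional > --prompt-file > stdin.
function resolvePrompt(sources: PromptSources): string | undefined {
  if (sources.promptFlag !== undefined) return sources.promptFlag;
  if (sources.positional !== undefined) return sources.positional;

  const hasStdin = sources.stdin !== undefined && sources.stdin.length > 0;
  if (sources.promptFile !== undefined && hasStdin) {
    // Assumed wording; the real error message is not shown in the source.
    throw new Error("Both --prompt-file and piped stdin were provided; use only one.");
  }
  return sources.promptFile ?? sources.stdin;
}
```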

Closes #686

* docs(invoke): document --prompt-file and stdin support

* fix(import): remove experimental warning from import command (#977)

The import feature has stabilized and no longer needs the experimental label.

* fix: duplicate header flash and help menu truncation (closes #895, closes #637) (#955)

- Return null during brief transitional phases to prevent Ink from
  rendering a header that gets immediately replaced by a different frame
- Consolidate CreateScreen phases into a single Screen mount
- Make help menu description width responsive to terminal size
- Remove hardcoded 50-char description truncation limit

* test: configure git in browser tests workflow (#976)

* feat: add project-name option to create (#969)

* Add project-name option to create

* fix: address review feedback — restore name description and move backfill logic

* ci: bump the github-actions group across 1 directory with 4 updates (#964)

Bumps the github-actions group with 4 updates in the / directory: [aws-actions/configure-aws-credentials](https://github.com/aws-actions/configure-aws-credentials), [actions/github-script](https://github.com/actions/github-script), [softprops/action-gh-release](https://github.com/softprops/action-gh-release) and [slackapi/slack-github-action](https://github.com/slackapi/slack-github-action).


Updates `aws-actions/configure-aws-credentials` from 5 to 6
- [Changelog](https://github.com/aws-actions/configure-aws-credentials/blob/main/CHANGELOG.md)
- [Commits](https://github.com/aws-actions/configure-aws-credentials/compare/v5...v6)

Updates `actions/github-script` from 8 to 9
- [Commits](https://github.com/actions/github-script/compare/v8...v9)

Updates `softprops/action-gh-release` from 2 to 3
- [Changelog](https://github.com/softprops/action-gh-release/blob/master/CHANGELOG.md)
- [Commits](https://github.com/softprops/action-gh-release/compare/v2...v3)

Updates `slackapi/slack-github-action` from 3.0.1 to 3.0.2
- [Release notes](https://github.com/slackapi/slack-github-action/releases)
- [Changelog](https://github.com/slackapi/slack-github-action/blob/main/CHANGELOG.md)
- [Commits](https://github.com/slackapi/slack-github-action/compare/v3.0.1...v3.0.2)

---
updated-dependencies:
- dependency-name: actions/github-script
  dependency-version: '9'
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: github-actions
- dependency-name: aws-actions/configure-aws-credentials
  dependency-version: '6'
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: github-actions
- dependency-name: slackapi/slack-github-action
  dependency-version: 3.0.2
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: github-actions
- dependency-name: softprops/action-gh-release
  dependency-version: '3'
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: github-actions
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* chore(deps-dev): bump aws-cdk-lib (#962)

Bumps the aws-cdk group with 1 update in the / directory: [aws-cdk-lib](https://github.com/aws/aws-cdk/tree/HEAD/packages/aws-cdk-lib).


Updates `aws-cdk-lib` from 2.248.0 to 2.250.0
- [Release notes](https://github.com/aws/aws-cdk/releases)
- [Changelog](https://github.com/aws/aws-cdk/blob/main/CHANGELOG.v2.alpha.md)
- [Commits](https://github.com/aws/aws-cdk/commits/v2.250.0/packages/aws-cdk-lib)

---
updated-dependencies:
- dependency-name: aws-cdk-lib
  dependency-version: 2.250.0
  dependency-type: direct:development
  update-type: version-update:semver-minor
  dependency-group: aws-cdk
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* chore(deps): bump postcss from 8.5.8 to 8.5.10 (#961)

Bumps [postcss](https://github.com/postcss/postcss) from 8.5.8 to 8.5.10.
- [Release notes](https://github.com/postcss/postcss/releases)
- [Changelog](https://github.com/postcss/postcss/blob/main/CHANGELOG.md)
- [Commits](https://github.com/postcss/postcss/compare/8.5.8...8.5.10)

---
updated-dependencies:
- dependency-name: postcss
  dependency-version: 8.5.10
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* chore(deps-dev): bump secretlint from 11.4.1 to 12.2.0 (#916)

Bumps [secretlint](https://github.com/secretlint/secretlint) from 11.4.1 to 12.2.0.
- [Release notes](https://github.com/secretlint/secretlint/releases)
- [Commits](https://github.com/secretlint/secretlint/compare/v11.4.1...v12.2.0)

---
updated-dependencies:
- dependency-name: secretlint
  dependency-version: 12.2.0
  dependency-type: direct:development
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* chore(deps-dev): bump @vitest/coverage-v8 from 4.1.2 to 4.1.5 (#915)

Bumps [@vitest/coverage-v8](https://github.com/vitest-dev/vitest/tree/HEAD/packages/coverage-v8) from 4.1.2 to 4.1.5.
- [Release notes](https://github.com/vitest-dev/vitest/releases)
- [Commits](https://github.com/vitest-dev/vitest/commits/v4.1.5/packages/coverage-v8)

---
updated-dependencies:
- dependency-name: "@vitest/coverage-v8"
  dependency-version: 4.1.5
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* chore(deps-dev): bump @secretlint/secretlint-rule-preset-recommend (#914)

Bumps [@secretlint/secretlint-rule-preset-recommend](https://github.com/secretlint/secretlint) from 11.4.1 to 12.2.0.
- [Release notes](https://github.com/secretlint/secretlint/releases)
- [Commits](https://github.com/secretlint/secretlint/compare/v11.4.1...v12.2.0)

---
updated-dependencies:
- dependency-name: "@secretlint/secretlint-rule-preset-recommend"
  dependency-version: 12.2.0
  dependency-type: direct:development
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* chore(deps): bump the aws-sdk group across 1 directory with 14 updates (#912)

Bumps the aws-sdk group with 14 updates in the / directory:

| Package | From | To |
| --- | --- | --- |
| [@aws-sdk/client-application-signals](https://github.com/aws/aws-sdk-js-v3/tree/HEAD/clients/client-application-signals) | `3.1036.0` | `3.1037.0` |
| [@aws-sdk/client-bedrock](https://github.com/aws/aws-sdk-js-v3/tree/HEAD/clients/client-bedrock) | `3.1036.0` | `3.1037.0` |
| [@aws-sdk/client-bedrock-agent](https://github.com/aws/aws-sdk-js-v3/tree/HEAD/clients/client-bedrock-agent) | `3.1036.0` | `3.1037.0` |
| [@aws-sdk/client-bedrock-agentcore](https://github.com/aws/aws-sdk-js-v3/tree/HEAD/clients/client-bedrock-agentcore) | `3.1036.0` | `3.1037.0` |
| [@aws-sdk/client-bedrock-agentcore-control](https://github.com/aws/aws-sdk-js-v3/tree/HEAD/clients/client-bedrock-agentcore-control) | `3.1036.0` | `3.1037.0` |
| [@aws-sdk/client-bedrock-runtime](https://github.com/aws/aws-sdk-js-v3/tree/HEAD/clients/client-bedrock-runtime) | `3.1036.0` | `3.1037.0` |
| [@aws-sdk/client-cloudformation](https://github.com/aws/aws-sdk-js-v3/tree/HEAD/clients/client-cloudformation) | `3.1036.0` | `3.1037.0` |
| [@aws-sdk/client-cloudwatch-logs](https://github.com/aws/aws-sdk-js-v3/tree/HEAD/clients/client-cloudwatch-logs) | `3.1036.0` | `3.1037.0` |
| [@aws-sdk/client-resource-groups-tagging-api](https://github.com/aws/aws-sdk-js-v3/tree/HEAD/clients/client-resource-groups-tagging-api) | `3.1036.0` | `3.1037.0` |
| [@aws-sdk/client-s3](https://github.com/aws/aws-sdk-js-v3/tree/HEAD/clients/client-s3) | `3.1036.0` | `3.1037.0` |
| [@aws-sdk/client-sts](https://github.com/aws/aws-sdk-js-v3/tree/HEAD/clients/client-sts) | `3.1036.0` | `3.1037.0` |
| [@aws-sdk/client-xray](https://github.com/aws/aws-sdk-js-v3/tree/HEAD/clients/client-xray) | `3.1036.0` | `3.1037.0` |
| [@aws-sdk/credential-providers](https://github.com/aws/aws-sdk-js-v3/tree/HEAD/packages/credential-providers) | `3.1036.0` | `3.1037.0` |
| [@aws-sdk/client-cognito-identity-provider](https://github.com/aws/aws-sdk-js-v3/tree/HEAD/clients/client-cognito-identity-provider) | `3.1036.0` | `3.1037.0` |



Updates `@aws-sdk/client-application-signals` from 3.1036.0 to 3.1037.0
- [Release notes](https://github.com/aws/aws-sdk-js-v3/releases)
- [Changelog](https://github.com/aws/aws-sdk-js-v3/blob/main/clients/client-application-signals/CHANGELOG.md)
- [Commits](https://github.com/aws/aws-sdk-js-v3/commits/v3.1037.0/clients/client-application-signals)

Updates `@aws-sdk/client-bedrock` from 3.1036.0 to 3.1037.0
- [Release notes](https://github.com/aws/aws-sdk-js-v3/releases)
- [Changelog](https://github.com/aws/aws-sdk-js-v3/blob/main/clients/client-bedrock/CHANGELOG.md)
- [Commits](https://github.com/aws/aws-sdk-js-v3/commits/v3.1037.0/clients/client-bedrock)

Updates `@aws-sdk/client-bedrock-agent` from 3.1036.0 to 3.1037.0
- [Release notes](https://github.com/aws/aws-sdk-js-v3/releases)
- [Changelog](https://github.com/aws/aws-sdk-js-v3/blob/main/clients/client-bedrock-agent/CHANGELOG.md)
- [Commits](https://github.com/aws/aws-sdk-js-v3/commits/v3.1037.0/clients/client-bedrock-agent)

Updates `@aws-sdk/client-bedrock-agentcore` from 3.1036.0 to 3.1037.0
- [Release notes](https://github.com/aws/aws-sdk-js-v3/releases)
- [Changelog](https://github.com/aws/aws-sdk-js-v3/blob/main/clients/client-bedrock-agentcore/CHANGELOG.md)
- [Commits](https://github.com/aws/aws-sdk-js-v3/commits/v3.1037.0/clients/client-bedrock-agentcore)

Updates `@aws-sdk/client-bedrock-agentcore-control` from 3.1036.0 to 3.1037.0
- [Release notes](https://github.com/aws/aws-sdk-js-v3/releases)
- [Changelog](https://github.com/aws/aws-sdk-js-v3/blob/main/clients/client-bedrock-agentcore-control/CHANGELOG.md)
- [Commits](https://github.com/aws/aws-sdk-js-v3/commits/v3.1037.0/clients/client-bedrock-agentcore-control)

Updates `@aws-sdk/client-bedrock-runtime` from 3.1036.0 to 3.1037.0
- [Release notes](https://github.com/aws/aws-sdk-js-v3/releases)
- [Changelog](https://github.com/aws/aws-sdk-js-v3/blob/main/clients/client-bedrock-runtime/CHANGELOG.md)
- [Commits](https://github.com/aws/aws-sdk-js-v3/commits/v3.1037.0/clients/client-bedrock-runtime)

Updates `@aws-sdk/client-cloudformation` from 3.1036.0 to 3.1037.0
- [Release notes](https://github.com/aws/aws-sdk-js-v3/releases)
- [Changelog](https://github.com/aws/aws-sdk-js-v3/blob/main/clients/client-cloudformation/CHANGELOG.md)
- [Commits](https://github.com/aws/aws-sdk-js-v3/commits/v3.1037.0/clients/client-cloudformation)

Updates `@aws-sdk/client-cloudwatch-logs` from 3.1036.0 to 3.1037.0
- [Release notes](https://github.com/aws/aws-sdk-js-v3/releases)
- [Changelog](https://github.com/aws/aws-sdk-js-v3/blob/main/clients/client-cloudwatch-logs/CHANGELOG.md)
- [Commits](https://github.com/aws/aws-sdk-js-v3/commits/v3.1037.0/clients/client-cloudwatch-logs)

Updates `@aws-sdk/client-resource-groups-tagging-api` from 3.1036.0 to 3.1037.0
- [Release notes](https://github.com/aws/aws-sdk-js-v3/releases)
- [Changelog](https://github.com/aws/aws-sdk-js-v3/blob/main/clients/client-resource-groups-tagging-api/CHANGELOG.md)
- [Commits](https://github.com/aws/aws-sdk-js-v3/commits/v3.1037.0/clients/client-resource-groups-tagging-api)

Updates `@aws-sdk/client-s3` from 3.1036.0 to 3.1037.0
- [Release notes](https://github.com/aws/aws-sdk-js-v3/releases)
- [Changelog](https://github.com/aws/aws-sdk-js-v3/blob/main/clients/client-s3/CHANGELOG.md)
- [Commits](https://github.com/aws/aws-sdk-js-v3/commits/v3.1037.0/clients/client-s3)

Updates `@aws-sdk/client-sts` from 3.1036.0 to 3.1037.0
- [Release notes](https://github.com/aws/aws-sdk-js-v3/releases)
- [Changelog](https://github.com/aws/aws-sdk-js-v3/blob/main/clients/client-sts/CHANGELOG.md)
- [Commits](https://github.com/aws/aws-sdk-js-v3/commits/v3.1037.0/clients/client-sts)

Updates `@aws-sdk/client-xray` from 3.1036.0 to 3.1037.0
- [Release notes](https://github.com/aws/aws-sdk-js-v3/releases)
- [Changelog](https://github.com/aws/aws-sdk-js-v3/blob/main/clients/client-xray/CHANGELOG.md)
- [Commits](https://github.com/aws/aws-sdk-js-v3/commits/v3.1037.0/clients/client-xray)

Updates `@aws-sdk/credential-providers` from 3.1036.0 to 3.1037.0
- [Release notes](https://github.com/aws/aws-sdk-js-v3/releases)
- [Changelog](https://github.com/aws/aws-sdk-js-v3/blob/main/packages/credential-providers/CHANGELOG.md)
- [Commits](https://github.com/aws/aws-sdk-js-v3/commits/v3.1037.0/packages/credential-providers)

Updates `@aws-sdk/client-cognito-identity-provider` from 3.1036.0 to 3.1037.0
- [Release notes](https://github.com/aws/aws-sdk-js-v3/releases)
- [Changelog](https://github.com/aws/aws-sdk-js-v3/blob/main/clients/client-cognito-identity-provider/CHANGELOG.md)
- [Commits](https://github.com/aws/aws-sdk-js-v3/commits/v3.1037.0/clients/client-cognito-identity-provider)

---
updated-dependencies:
- dependency-name: "@aws-sdk/client-application-signals"
  dependency-version: 3.1034.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: aws-sdk
- dependency-name: "@aws-sdk/client-bedrock"
  dependency-version: 3.1034.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: aws-sdk
- dependency-name: "@aws-sdk/client-bedrock-agent"
  dependency-version: 3.1034.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: aws-sdk
- dependency-name: "@aws-sdk/client-bedrock-agentcore"
  dependency-version: 3.1034.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: aws-sdk
- dependency-name: "@aws-sdk/client-bedrock-agentcore-control"
  dependency-version: 3.1034.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: aws-sdk
- dependency-name: "@aws-sdk/client-bedrock-runtime"
  dependency-version: 3.1034.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: aws-sdk
- dependency-name: "@aws-sdk/client-cloudformation"
  dependency-version: 3.1034.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: aws-sdk
- dependency-name: "@aws-sdk/client-cloudwatch-logs"
  dependency-version: 3.1034.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: aws-sdk
- dependency-name: "@aws-sdk/client-cognito-identity-provider"
  dependency-version: 3.1034.0
  dependency-type: direct:development
  update-type: version-update:semver-minor
  dependency-group: aws-sdk
- dependency-name: "@aws-sdk/client-resource-groups-tagging-api"
  dependency-version: 3.1034.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: aws-sdk
- dependency-name: "@aws-sdk/client-s3"
  dependency-version: 3.1034.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: aws-sdk
- dependency-name: "@aws-sdk/client-sts"
  dependency-version: 3.1034.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: aws-sdk
- dependency-name: "@aws-sdk/client-xray"
  dependency-version: 3.1034.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: aws-sdk
- dependency-name: "@aws-sdk/credential-providers"
  dependency-version: 3.1034.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: aws-sdk
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* chore(deps-dev): bump hono from 4.12.12 to 4.12.14 (#868)

Bumps [hono](https://github.com/honojs/hono) from 4.12.12 to 4.12.14.
- [Release notes](https://github.com/honojs/hono/releases)
- [Commits](https://github.com/honojs/hono/compare/v4.12.12...v4.12.14)

---
updated-dependencies:
- dependency-name: hono
  dependency-version: 4.12.14
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* chore(deps-dev): bump esbuild from 0.27.4 to 0.28.0 (#862)

Bumps [esbuild](https://github.com/evanw/esbuild) from 0.27.4 to 0.28.0.
- [Release notes](https://github.com/evanw/esbuild/releases)
- [Changelog](https://github.com/evanw/esbuild/blob/main/CHANGELOG.md)
- [Commits](https://github.com/evanw/esbuild/compare/v0.27.4...v0.28.0)

---
updated-dependencies:
- dependency-name: esbuild
  dependency-version: 0.28.0
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* test: speed up CI and fix mock cleanup gaps (#989)

* test: speed up CI and fix mock cleanup gaps

- Node 20 only on PRs (full matrix on main)
- 3-way vitest sharding for unit tests with blob report merging
- Pre-bundle heavy deps (AWS SDK, Smithy, zod, commander) via deps.optimizer
- Exclude tui-harness from unit test project (not production code)
- Add afterEach(vi.restoreAllMocks) to 3 files with mock cleanup gaps
- Move inline consoleSpy.mockRestore() to afterEach in logs-eval tests
- Skip PTY tests when node-pty spawn is unavailable

* style: fix prettier formatting in build-and-test.yml

* fix: enable include-hidden-files for blob artifact upload

upload-artifact@v7 defaults include-hidden-files to false, which
skips the .vitest-reports directory. Also fail loudly if no files found.

* feat: runtime endpoint support in AgentCore CLI (#979)

* feat: add runtime endpoint support to AgentCore CLI

- Schema: endpoints field on AgentEnvSpec, runtimeVersion in deployed state
- Primitive: RuntimeEndpointPrimitive with add/remove/preview
- TUI: Add and Remove flows with multi-field form
- Status: endpoints nested under agents with deployment badges
- Deploy: parseRuntimeEndpointOutputs + buildDeployedState pipeline

* fix: correct output key prefix for runtime endpoint parsing

The CFN output keys include the AgentEnvironment construct prefix
(Agent{PascalName}) which was missing from the parser pattern.

* fix: remove .omc state files and unused useCallback import

- Remove .omc/ from git tracking, add to .gitignore
- Remove unused useCallback import in AddRuntimeEndpointScreen.tsx

* fix: shorten runtime endpoint description to prevent TUI overflow

The description "Named endpoint (version alias) for a runtime" was too
long and wrapped to the next line in the Add Resource menu. Shortened
to "Named endpoint for a runtime".

* fix: validate runtime endpoint version is a positive integer

- Add explicit Number.isInteger check before schema validation
- Change Commander parser from parseInt to Number so floats like
  3.5 are caught instead of silently truncated
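
To illustrate why the parser change matters, here is a small sketch; `parseEndpointVersion` is a hypothetical name, not the CLI's actual parser. `parseInt("3.5", 10)` yields `3`, while `Number("3.5")` yields `3.5`, which the `Number.isInteger` check can then reject.

```typescript
// Hypothetical parser: convert with Number so fractional input stays
// fractional, then require a positive integer.
function parseEndpointVersion(raw: string): number {
  const value = Number(raw);
  if (!Number.isInteger(value) || value < 1) {
    throw new Error(`Version must be a positive integer, got "${raw}"`);
  }
  return value;
}
```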

* fix: use agent/endpoint composite key to prevent React key collision

Endpoint names can collide across runtimes (e.g., both have "prod").
Changed React key from epName to agent.name/epName to prevent
duplicate key warnings that pollute the TUI viewport.

* fix: render runtime endpoints in status --type runtime-endpoint

When filtering by --type runtime-endpoint, agents array is empty so
the agents section (which nests endpoints) never renders. Added a
standalone Runtime Endpoints section that shows when endpoints exist
but agents don't (i.e., when type-filtering).

* fix: add runtime-endpoint to status --help --type documentation

The --type option help text was missing runtime-endpoint from the
list of valid resource types.

* fix: return richer JSON response from add runtime-endpoint

add now returns { success, endpointName, agent, version } instead of
sparse { success: true }, matching the richer response shape from
remove runtime-endpoint.

* fix: validate endpoint version against deployed runtime version

- TUI: show "Current deployed version: N" and valid range (1-N)
- TUI: reject version exceeding latest deployed version
- CLI: check deployed-state.json for max version, reject if exceeded
- If runtime not deployed, only positive integer check applies
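
A rough sketch of the range check described above, assuming the latest deployed version has already been read from deployed state. `checkEndpointVersion` and its error wording are hypothetical.

```typescript
// Returns an error message, or undefined when the version is acceptable.
// latestDeployed is undefined when the runtime has not been deployed yet,
// in which case only the positive-integer check applies.
function checkEndpointVersion(version: number, latestDeployed?: number): string | undefined {
  if (!Number.isInteger(version) || version < 1) {
    return "Version must be a positive integer";
  }
  if (latestDeployed !== undefined && version > latestDeployed) {
    return `Version ${version} exceeds the latest deployed version (valid range: 1-${latestDeployed})`;
  }
  return undefined;
}
```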

* chore: remove planning and bug bash docs from PR

* fix: use composite key and parentName for endpoint identification

- Add parentName field to ResourceStatusEntry for structured parent linking
- Use runtimeName/endpointName composite key in remove/preview/getRemovable
- Status command filters endpoints by parentName instead of parsing detail string
- React keys use structured parentName/name instead of display strings
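
The composite-key idea can be sketched as follows. The `EndpointEntry` shape is a simplified stand-in for `ResourceStatusEntry`, and the helper names are hypothetical; the sketch also assumes runtime names never contain `/`.

```typescript
// Simplified stand-in for ResourceStatusEntry: only the fields the key needs.
interface EndpointEntry {
  parentName: string; // owning runtime
  name: string; // endpoint name, which may repeat across runtimes
}

// Build the runtimeName/endpointName key used for lookups and React keys.
function endpointKey(entry: EndpointEntry): string {
  return `${entry.parentName}/${entry.name}`;
}

// Look an endpoint up by its composite key rather than by name alone.
function findEndpoint(
  entries: readonly EndpointEntry[],
  key: string,
): EndpointEntry | undefined {
  return entries.find((entry) => endpointKey(entry) === key);
}
```

With two runtimes that each define a `prod` endpoint, the keys are distinct, so a remove or preview call targets only the intended runtime.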

* test: add comprehensive unit tests for RuntimeEndpointPrimitive

23 tests covering add(), remove(), previewRemove(), getRemovable():
- Runtime lookup, duplicate detection, version validation
- Composite key removal targeting correct runtime
- Empty endpoints dict cleanup
- Version validation against deployed state
- Richer JSON response shape

* fix: remove dead findGatewayTargetReferences stub

* fix: use BasePrimitive configIO instead of ad-hoc ConfigIO in add()

* fix: use Number() instead of parseInt in TUI version validation

* chore: fix prettier formatting

* fix: use T[] instead of Array<T> to satisfy eslint array-type rule

* fix(ci): revert schema file to avoid schema-check guard

The schemas/ directory is auto-regenerated during the release workflow.
Direct modifications are blocked by CI.

* Revert "fix(ci): revert schema file to avoid schema-check guard"

This reverts commit 3615e37a0aaa71cd4d2c5c7b19e3ddb41eb2e07c.

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Jesse Turner <57651174+jesseturner21@users.noreply.github.com>
Co-authored-by: Avi Alpert <131792194+avi-alpert@users.noreply.github.com>
Co-authored-by: Gitika <53349492+notgitika@users.noreply.github.com>
Co-authored-by: Hweinstock <42325418+Hweinstock@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Aidan Daly <99039782+aidandaly24@users.noreply.github.com>
Co-authored-by: Tejas Kashinath <42380254+tejaskash@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* feat: add target-based AB test routing

Adds a new AB test mode that routes traffic between gateway targets
pointing at different runtime endpoints, alongside the existing
config-bundle mode.

Changes:
- AB test schema: target-based mode, per-variant eval config, gateway filter
- HTTP gateway: targets array with qualifier (endpoint reference)
- AB test primitive: --mode flag, target-based CLI flow
- Pause/resume/stop/promote commands for AB tests
- TUI: mode selection, target-based wizard steps
- Cross-ref validation: gateway targets must reference valid endpoints
- Deploy: handle target-based variant resolution and eval config union

* fix: correct API field names for target-based AB test creation

- Rename target.targetName → target.name in API client
- Rename perVariantOnlineEvaluationConfig[].treatmentName → .name
- Fix post-deploy mapping to use correct API field names
- Add controlQualifier/treatmentQualifier to Commander options type
- Add runtime/qualifier fields to TUI AB test config types

* fix: include all eval config ARNs in AB test IAM role policy

For target-based AB tests with perVariantOnlineEvaluationConfig,
the…
* test: add integ and e2e tests for recommendations

Integ tests (12) cover CLI validation for run recommendation: required
flags, system-prompt/tool-description input validation, config bundle
source, spans file validation, and lookback/session options.

E2E tests (8) cover recommendation API lifecycle: start system-prompt
and tool-description recommendations, get, delete (stop-via-delete),
verify 404 after delete, inline session spans, and error cases.

* remove API-level e2e tests (CLI e2e lives in PR #107)

* fix: add error message assertions to required flag tests

Assert JSON error content (--runtime, --evaluator, --type) instead of
only checking exitCode, so tests fail meaningfully on crashes.
…ndations (#107)

* test: add integ and e2e tests for config bundles, batch eval, recommendations

Integ tests (48): config bundle add/remove lifecycle, evaluator/online-eval
lifecycle, batch-evaluation CLI validation, ground truth parsing,
recommendation CLI validation.

E2E test (1 file, 17 tests): full CLI lifecycle — create project → add
config bundle → add evaluator → deploy → invoke → config-bundle versions/diff
→ run batch-evaluation → run eval → run recommendation (system-prompt,
tool-description, config-bundle source) → remove + reconcile.

* refactor: keep only e2e test in this PR (integ tests live in separate PRs)

* fix: address PR review — stronger e2e assertions and real session IDs

- Use real session ID from invoke for ground truth (not hardcoded)
- Assert diff array is non-empty, not just property existence
- Assert batch eval status is not FAILED
- Assert recommendation result is non-empty
- Add comment explaining retry rationale for on-demand eval
- Reduce excessive retry count (18→10) for on-demand eval
* test: add integ and e2e tests for config bundles

Integ tests cover add/remove lifecycle, CLI validation, components-file
support, duplicate rejection, placeholder keys, and multi-bundle coexistence.

E2E tests cover full API lifecycle (create, get, update, list versions,
branch filtering, diff, delete) against the real control plane.

* fix: remove hardcoded account ID from e2e config bundle tests

Resolve account ID dynamically from AWS_ACCOUNT_ID env var or
aws sts get-caller-identity, matching the pattern in e2e-helper.ts.

* remove API-level e2e tests (CLI e2e lives in PR #107)

* fix: address PR review — extract shared helpers, fix afterAll cleanup

Move runSuccess/runFailure to shared test-utils to prevent duplication.
Fix afterAll to defensively clean all bundleNames on test failure.
The configBundle advanced setting was selectable in the TUI but never
propagated to the output config, so it silently did nothing.

- AddAgentScreen: set withConfigBundle on byoConfig when selected in
  advanced settings, clear it when deselected, pass it through both
  create and BYO complete handlers, show in confirm review
- GenerateWizardUI: show config bundle in confirm summary
- useCreateFlow: pass withConfigBundle to GenerateConfig and call
  createConfigBundleForAgent after agent is written
…#161)

1. Validate batch eval name against API pattern [a-zA-Z][a-zA-Z0-9_]{0,47}
   before sending. The API returns a misleading "Resource identifier cannot
   be empty" for invalid names (e.g. hyphens). The CLI now gives a clear
   error message with the exact constraints.

2. Fix ground truth in legacy fallback: was sending sessionMetadata at
   top level instead of wrapping in evaluationMetadata. Both old and new
   API models expect evaluationMetadata.sessionMetadata.
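
Both fixes can be illustrated with a short sketch. The regular expression is the pattern quoted above, anchored at both ends; the function names, the error wording, and the `unknown` payload type are assumptions for illustration only.

```typescript
// Pattern quoted in the commit message: a letter, then up to 47 letters,
// digits, or underscores (48 characters at most).
const BATCH_EVAL_NAME_PATTERN = /^[a-zA-Z][a-zA-Z0-9_]{0,47}$/;

// Returns an error message, or undefined when the name is valid.
function validateBatchEvalName(name: string): string | undefined {
  if (BATCH_EVAL_NAME_PATTERN.test(name)) return undefined;
  return (
    `Invalid batch evaluation name "${name}": must start with a letter, ` +
    "contain only letters, digits, and underscores, and be at most 48 characters."
  );
}

// Ground truth belongs under evaluationMetadata, not at the top level.
function wrapSessionMetadata(sessionMetadata: unknown): {
  evaluationMetadata: { sessionMetadata: unknown };
} {
  return { evaluationMetadata: { sessionMetadata } };
}
```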
…verage. (#154)

* refactor: extract deleteHttpGatewayWithTargets shared helper

Extract gateway deletion logic (targets → gateway → role) into
shared deleteHttpGatewayWithTargets(). Add deleteOrphanedHttpGateways()
for deploy reconciliation. Teardown uses shared helper.
Target failures are best-effort (warn, continue).

* chore: update bundled SDK wheel to bedrock-agentcore 1.6.4

Replace the dev pre-release wheel (1.6.0.dev20260413) with the
current SDK release (1.6.4) from bedrock-agentcore-sdk-python-private.

* feat: add [tool.uv.sources] vendored wheel support to agent templates

- render.ts: copy .whl files verbatim instead of through Handlebars
- BaseRenderer: copy bundled SDK wheel into scaffolded project's wheels/
- Add [tool.uv.sources] block to all 8 agent pyproject.toml templates
  pointing at wheels/bedrock_agentcore-1.6.4-py3-none-any.whl
- Dockerfile: conditionally COPY wheels/ for Container builds
- Tests: .whl binary handling, wheel copy verification, updated snapshots

* test: add e2e and integ tests for AB tests, gateways, and online evals

Integ tests (17 passing):
- Target-based AB test CLI flags (11 tests)
- Online eval with endpoint field (6 tests)

E2E tests (require AWS ap-southeast-2):
- Target-based AB test full lifecycle
- Config-bundle AB test full lifecycle
- HTTP gateway with targets lifecycle

* fix: remove gateway trace delivery, add runtime experiment span debug check

- Remove gateway trace delivery setup from deploy
- Remove Gateway Trace Delivery and Gateway Spans from debug panel
- Add Runtime Experiment Spans check to debug panel (queries aws/spans for abTestArn)

* fix: improve runtime experiment span debug checks with per-variant filtering and service.name

- Split single experiment span check into per-variant (C, T1) checks
- Filter baseline runtime spans by service.name from deployed state instead of gen_ai_agent
- Show targeted warnings when one variant has spans but the other doesn't

* Revert "feat: add [tool.uv.sources] vendored wheel support to agent templates"

This reverts commit 458177299237a50e9bc6eb4aada607d18dced3f2.
* fix: update unit tests to match feat/evo-implementation changes

- post-deploy-ab-tests: expect 'updated' instead of 'skipped' for existing tests
- post-deploy-http-gateways: use deleteOrphanedHttpGateways, remove trace delivery tests
- preflight: mock getPathResolver and fs for config bundle patching
- ABTestPrimitive: expect gateway retained on remove (requires --delete-gateway)
- useAddABTestWizard: first step is now 'mode' not 'name'

* refactor: remove legacy /abtests API path fallback

The AB test API has been migrated to /ab-tests. Remove the
dpRequestWithFallback function, unused cpRequest/getControlPlaneEndpoint,
and use dnsSuffix() for multi-partition support.

* Revert "feat: bundle Python SDK wheel into CLI for offline install"

This reverts commit 791dcfa.
@notgitika requested a review from a team April 30, 2026 17:54
@github-actions bot added the size/xl (PR size: XL) and agentcore-harness-reviewing (AgentCore Harness review in progress) labels Apr 30, 2026
@notgitika closed this Apr 30, 2026
@notgitika deleted the feat/evo-implementation branch April 30, 2026 17:54
@agentcore-cli-automation


Orphaned config bundles never get deleted when all bundles are removed from the spec

In src/cli/commands/deploy/actions.ts around line 571, the post-deploy config bundle block is gated on:

const configBundleSpecs = context.projectSpec.configBundles ?? [];
if (configBundleSpecs.length > 0) {
  // ... calls setupConfigBundles, which also handles orphan deletion
}

If a user removes all configBundles from agentcore.json and redeploys, setupConfigBundles is never called, so orphaned bundles in deployed state are never deleted. They linger in the account (costing money / creating drift) and stay stuck in deployed-state.json forever.

Compare to how HTTP gateways handle this on line 542:

if (httpGatewaySpecs.length > 0 || Object.keys(existingHttpGateways ?? {}).length > 0) {

Two possible fixes:

  1. Mirror the HTTP gateway condition: if (configBundleSpecs.length > 0 || Object.keys(existingConfigBundles ?? {}).length > 0).
  2. Split orphan deletion into a pre-pass like deleteOrphanedABTests.
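
A minimal sketch of fix 1, assuming deployed state exposes existing bundles as a keyed record. The type names and the `shouldRunConfigBundleSetup` helper are hypothetical and not taken from the PR:

```typescript
// Hypothetical shapes; the real types in deploy/actions.ts may differ.
interface ConfigBundleSpec {
  name: string;
}
type DeployedConfigBundles = Record<string, { bundleId: string }>;

/**
 * True when the post-deploy step has work to do: either there are bundles
 * in the spec to create/update, or there are deployed bundles that may now
 * be orphans and need deleting.
 */
function shouldRunConfigBundleSetup(
  specs: ConfigBundleSpec[] | undefined,
  existing: DeployedConfigBundles | undefined,
): boolean {
  return (specs ?? []).length > 0 || Object.keys(existing ?? {}).length > 0;
}
```

With this predicate, an empty spec plus non-empty deployed state still enters the block, which is the case the current gate skips.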

Related: even when setupConfigBundles does run, the state-merge block at line 583 is guarded by if (Object.keys(configBundleResult.configBundles).length > 0). configBundles in the result is only populated by create/update/skip paths — deletions never add to it. So if every spec bundle is deleted in one run (e.g. user has bundles A, B; removes A and B and adds nothing), deployed state keeps the stale entries and targetResources.configBundles is never overwritten. The HTTP gateway post-deploy explicitly comments "Always merge HTTP gateway state (even if empty, to clear deleted gateways)" — config bundles should do the same.
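
One way the unconditional merge could look, as a sketch only. `TargetResourcesSketch` and `mergeConfigBundleState` are illustrative names, and the real deployed-state shape is assumed rather than known:

```typescript
// Hypothetical shapes standing in for the deployed-state types.
type DeployedBundles = Record<string, { bundleId: string }>;
interface TargetResourcesSketch {
  configBundles?: DeployedBundles;
  [key: string]: unknown;
}

/**
 * Always overwrite configBundles with the latest result, even when the
 * result is empty, so bundles deleted in this run are cleared from state.
 */
function mergeConfigBundleState(
  target: TargetResourcesSketch,
  result: DeployedBundles,
): TargetResourcesSketch {
  return { ...target, configBundles: { ...result } };
}
```

The point is the absence of a length guard: an empty result is a meaningful value that has to replace the stale entries.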

@agentcore-cli-automation

Baggage header is silently dropped for MCP-protocol invocations

In src/cli/commands/invoke/action.ts lines 241-248, MCP invocations construct:

const mcpOpts = {
  region: targetConfig.region,
  runtimeArn: agentState.runtimeArn,
  userId: options.userId,
  headers: options.headers,
  bearerToken: options.bearerToken,
    baggage,  // ← never actually sent
};

But McpInvokeOptions in src/cli/aws/agentcore.ts (around line 543) doesn't declare a baggage field, and none of mcpRpcCall, mcpRpcCallWithBearer, or buildMcpBearerHeaders reads it or forwards it to the runtime. So config-bundle baggage for MCP agents is lost: the agent never sees the bundle ARN/version, and BedrockAgentCoreContext.get_config_bundle() will return defaults.

Two fixes depending on intent:

  1. If MCP agents should support config bundles: add baggage to McpInvokeOptions and forward it in both buildMcpBearerHeaders (as a baggage header) and the SigV4 path via InvokeAgentRuntimeCommand's baggage field.
  2. If config bundles are HTTP-only for now: remove the baggage key from the mcpOpts literal so readers don't think it's wired up.

(Side note: the indentation on that baggage, line is off — two extra spaces.)
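
For fix 1 on the bearer-token path, a rough sketch of what forwarding could look like. `McpInvokeOptionsSketch` and `buildMcpHeadersSketch` are hypothetical stand-ins for the real McpInvokeOptions and buildMcpBearerHeaders, and the sketch assumes baggage is already serialized as a W3C Baggage header value:

```typescript
// Hypothetical option shape; the real McpInvokeOptions has more fields.
interface McpInvokeOptionsSketch {
  bearerToken?: string;
  headers?: Record<string, string>;
  // Assumed to be a pre-serialized W3C Baggage value, e.g. "key1=value1,key2=value2".
  baggage?: string;
}

/** Builds request headers and forwards baggage when it is present. */
function buildMcpHeadersSketch(opts: McpInvokeOptionsSketch): Record<string, string> {
  const headers: Record<string, string> = {
    "Content-Type": "application/json",
    ...(opts.headers ?? {}),
  };
  if (opts.bearerToken) {
    headers["Authorization"] = `Bearer ${opts.bearerToken}`;
  }
  if (opts.baggage) {
    // "baggage" is the header name defined by the W3C Baggage specification.
    headers["baggage"] = opts.baggage;
  }
  return headers;
}
```

Declaring the field on the options type also means the mcpOpts literal type-checks for the right reason instead of passing a key nothing reads.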

@agentcore-cli-automation

Three of the new preview-API clients hardcode amazonaws.com — breaks in GovCloud / China partitions

The five new SigV4 HTTP clients for preview APIs handle partition DNS suffixes inconsistently:

  • src/cli/aws/agentcore-ab-tests.ts (line 205): return `https://bedrock-agentcore.${region}.${dnsSuffix(region)}`;
  • src/cli/aws/agentcore-http-gateways.ts (line 158): uses dnsSuffix(region)
  • src/cli/aws/agentcore-config-bundles.ts (line 181): return `https://bedrock-agentcore-control.${region}.amazonaws.com`;
  • src/cli/aws/agentcore-recommendation.ts (line 228): hardcoded amazonaws.com
  • src/cli/aws/agentcore-batch-evaluation.ts (line 231): hardcoded amazonaws.com

Options:

  1. Use dnsSuffix(region) in all five (matching the two that already do). This is the consistent fix.
  2. If these preview features are commercial-only and will never be offered in GovCloud / China during preview, at least leave a comment to that effect — otherwise someone reading the two correct files will assume the hardcoded ones are bugs to "fix" later and accidentally break them.

Since all five already share the same signing-helper pattern (essentially duplicated across files per the "interim until SDK support" comments), extracting a shared signedRequest helper would also remove the opportunity for this kind of drift.
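
As an illustration of the shared-helper idea, a sketch of a single endpoint builder that all five clients could call. `dnsSuffixSketch` is a stand-in for the repo's dnsSuffix(region) with only the China mapping filled in, and `agentCoreEndpoint` is a hypothetical name:

```typescript
type Plane = "data" | "control";

// Stand-in for the repo's dnsSuffix(region). Only the China partition is
// mapped here; the real helper presumably covers other partitions too.
function dnsSuffixSketch(region: string): string {
  return region.startsWith("cn-") ? "amazonaws.com.cn" : "amazonaws.com";
}

/** Builds the service endpoint for either plane from one place. */
function agentCoreEndpoint(region: string, plane: Plane): string {
  const service =
    plane === "control" ? "bedrock-agentcore-control" : "bedrock-agentcore";
  return `https://${service}.${region}.${dnsSuffixSketch(region)}`;
}
```

Routing every client through one builder means a partition fix lands once rather than in five near-identical files.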

@github-actions github-actions Bot removed the agentcore-harness-reviewing AgentCore Harness review in progress label Apr 30, 2026