fix(local-cli): Loosen the ARN-mode gate in `run eval` from `!!(runtimeAr... (#737) by aidandaly24 · Pull Request #42 · aidandaly24/agentcore-cli

aidandaly24 · 2026-06-25T06:11:41Z

Issues

run eval: --evaluator-arn naming is misleading and Builtin evaluators fail without a project aws/agentcore-cli#737 — agentcore run eval --runtime-arn ... --evaluator Builtin.Correctness (no project) is wrongly rejected with "No agentcore project found." despite Builtin evaluators being supported in ARN mode; the workaround --evaluator-arn Builtin.Correctness is non-obvious. The --evaluator-arn flag also misleadingly accepts non-ARN Builtin IDs.

Root cause

command.tsx:121 gates ARN mode with !!(runtimeArn && evaluatorArn), which requires both flags. Passing --evaluator Builtin.* alone therefore yields isArnMode=false and triggers requireProject() (project.tsx:88-91) before handleRunEval runs, even though resolveFromArn (run-eval.ts:76-86) supports Builtin.* in ARN mode. The flag name is also misleading: resolveEvaluatorArns (run-eval.ts:40-45) passes non-ARN values through verbatim. Both behaviors date from d41e14b (aws#706) and are unchanged at HEAD v0.20.2.
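The ordering problem can be modeled in isolation. The following is a minimal sketch, not the code in command.tsx: RunEvalOptions, requireProjectStub, runEval, and attempt are hypothetical stand-ins, and the gate is passed in as a parameter so the same flow can be run with the gate quoted above and with one keyed on runtimeArn alone.

```typescript
// Hypothetical subset of the parsed CLI options; field names follow the flags.
interface RunEvalOptions {
  runtimeArn?: string;
  evaluatorArn?: string[];
  evaluator?: string[];
}

type Gate = (opts: RunEvalOptions) => boolean;

// Gate as quoted in the root cause: both flags must be present.
const strictGate: Gate = (opts) => !!(opts.runtimeArn && opts.evaluatorArn);

// Gate keyed on --runtime-arn alone.
const looseGate: Gate = (opts) => !!opts.runtimeArn;

// Stand-in for requireProject(): always fails, as if run from a directory
// that contains no agentcore project.
function requireProjectStub(): never {
  throw new Error("No agentcore project found.");
}

// Models the ordering only: the project guard runs before the handler
// whenever the gate reports "not ARN mode".
function runEval(opts: RunEvalOptions, isArnMode: Gate): string {
  if (!isArnMode(opts)) {
    requireProjectStub();
  }
  return "reached handleRunEval";
}

const opts: RunEvalOptions = {
  runtimeArn: "arn:aws:bedrock-agentcore:us-east-1:123456789012:runtime/my-runtime-abc123",
  evaluator: ["Builtin.Correctness"],
};

// Runs the flow with a given gate and returns either the result or the error text.
function attempt(gate: Gate): string {
  try {
    return runEval(opts, gate);
  } catch (err) {
    return (err as Error).message;
  }
}

console.log(attempt(strictGate)); // No agentcore project found.
console.log(attempt(looseGate)); // reached handleRunEval
```

With only --runtime-arn and --evaluator set, the strict gate evaluates to false, so the guard fires before any evaluator resolution can happen.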

The fix

Loosen command.tsx:121 to const isArnMode = !!cliOptions.runtimeArn; (resolveFromArn already validates evaluators and errors cleanly). Also fix or rename the misleading --evaluator-arn flag at :86 (and at :198 for batch-eval). At minimum, correct the description to note that the flag accepts ARNs or Builtin.*/managed IDs; preferably, add --evaluator-id and keep --evaluator-arn as a deprecated alias. Open design decision: a hidden alias versus a breaking hard rename.
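For the alias option, one possible shape is to normalize both spellings into a single list right after parsing. This is a sketch under assumptions: --evaluator-id does not exist yet, and EvaluatorFlags, NormalizedEvaluators, normalizeEvaluatorFlags, and the warning text are hypothetical names chosen for illustration, not the CLI's API.

```typescript
// Hypothetical option bag: `evaluatorId` is the proposed flag and
// `evaluatorArn` is the existing flag kept as a deprecated alias.
interface EvaluatorFlags {
  evaluatorId?: string[];
  evaluatorArn?: string[];
}

interface NormalizedEvaluators {
  ids: string[];
  warnings: string[];
}

// Merges both spellings into one list, keeps first-seen order, drops
// duplicates, and records one warning if the deprecated flag was used.
function normalizeEvaluatorFlags(flags: EvaluatorFlags): NormalizedEvaluators {
  const warnings: string[] = [];
  if (flags.evaluatorArn && flags.evaluatorArn.length > 0) {
    warnings.push("--evaluator-arn is deprecated; use --evaluator-id instead.");
  }
  const merged = [...(flags.evaluatorId ?? []), ...(flags.evaluatorArn ?? [])];
  return { ids: Array.from(new Set(merged)), warnings };
}

const result = normalizeEvaluatorFlags({
  evaluatorId: ["Builtin.Correctness"],
  evaluatorArn: ["Builtin.Correctness", "my-managed-id"],
});

console.log(result.ids.length); // 2
console.log(result.warnings.length); // 1
```

Normalizing in one place would let the rest of the command read a single field, which keeps the alias cheap to remove later if a hard rename is chosen.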

Files touched: src/cli/commands/run/command.tsx:121 (isArnMode gate) and :86 (--evaluator-arn flag definition/description); :198 (batch-evaluation --evaluator-arn) for naming consistency. Behavior already supported in src/cli/operations/eval/run-eval.ts:76-96 (resolveFromArn) and :40-45 (resolveEvaluatorArns). Error origin: src/cli/tui/guards/project.tsx:84-92.

Validation evidence

The fix was verified by reproducing the original symptom and re-running after the change:

Original symptom reproduced and fixed at the real file src/cli/commands/run/command.tsx:122 (task description said src/actions/run-eval/command.tsx:121, but the logic matches exactly; project guard is src/cli/tui/guards/project.tsx:88-91). The fix is in the working tree on branch fix/737: gate changed from const isArnMode = !!(cliOptions.runtimeArn && cliOptions.evaluatorArn); to const isArnMode = !!cliOptions.runtimeArn; (plus a docstring tweak on --evaluator-arn) and a new CLI-level test file src/cli/commands/run/tests/run-eval-arn-gating.test.ts.

BEFORE (reverted gate to buggy !!(runtimeArn && evaluatorArn), rebuilt, ran from /tmp non-project dir):
node dist/cli/index.mjs run eval --runtime-arn arn:aws:bedrock-agentcore:us-east-1:123456789012:runtime/my-runtime-abc123 --evaluator Builtin.Correctness --region us-east-1 --json
=> printed No agentcore project found. / Run agentcore create to fix this. — requireProject() fired before handleRunEval/resolveFromArn. Symptom confirmed.

AFTER (restored fix, rebuilt OK -> dist/cli/index.mjs, same non-project dir):

Builtin.Correctness in ARN mode => NO project error; proceeded into resolveFromArn (Builtin.* accepted) and handleRunEval, returning {"success":false,"error":"No session spans found for agent "my-runtime-abc123" in the last 7 day(s). Has the agent been invoked?"}. Proves gate passed and Builtin evaluator treated as valid.
Custom evaluator my-custom-eval in ARN mode => {"success":false,"error":"Custom evaluator ... cannot be resolved in ARN mode"} (resolveFromArn error, not the project-missing error). Both new vitest cases pass.
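The two AFTER outcomes correspond to a simple classification of evaluator identifiers. The sketch below is illustrative only and is not resolveFromArn itself: classifyEvaluator, resolveInArnMode, the prefix rules, and the exact error wording are assumptions, and anything that is neither Builtin.* nor an ARN is treated as a custom name.

```typescript
type EvaluatorKind = "builtin" | "arn" | "custom";

type ResolveResult =
  | { success: true; ids: string[] }
  | { success: false; error: string };

// Assumed rules: a "Builtin." prefix marks a built-in evaluator and an
// "arn:" prefix marks a full ARN. The real checks may be stricter.
function classifyEvaluator(id: string): EvaluatorKind {
  if (id.startsWith("Builtin.")) return "builtin";
  if (id.startsWith("arn:")) return "arn";
  return "custom";
}

// Mirrors the observed behavior: built-ins and ARNs pass through, while a
// custom name produces a resolution error instead of a project error.
function resolveInArnMode(ids: string[]): ResolveResult {
  for (const id of ids) {
    if (classifyEvaluator(id) === "custom") {
      // Illustrative message; the real wording is elided in the PR text.
      return { success: false, error: `Custom evaluator "${id}" cannot be resolved in ARN mode` };
    }
  }
  return { success: true, ids };
}

console.log(resolveInArnMode(["Builtin.Correctness"]).success); // true
console.log(resolveInArnMode(["my-custom-eval"]).success); // false
```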

Test suite: green.

Staged on the fork as a draft for human review. Promote to aws/agentcore-cli after vetting.

…for Builtin evaluators

The ARN-mode gate required both --runtime-arn and --evaluator-arn, so `run eval --runtime-arn ... --evaluator Builtin.Correctness` was wrongly rejected with "No agentcore project found." even though resolveFromArn already supports Builtin.* evaluators in ARN mode. Loosen the gate to key off --runtime-arn alone, and clarify the --evaluator-arn description to steer Builtin.* IDs toward -e/--evaluator.

github-actions · 2026-06-25T07:41:49Z

Coverage Report

Status	Category	Percentage	Covered / Total
🔵	Lines	37.16%	13593 / 36577
🔵	Statements	36.43%	14452 / 39667
🔵	Functions	31.8%	2333 / 7336
🔵	Branches	31.1%	9000 / 28930

Generated in workflow #96 for commit bbcfb66 by the Vitest Coverage Report Action

github-actions bot added the size/s (PR size: S) and agentcore-harness-reviewing (AgentCore Harness review in progress) labels, then removed agentcore-harness-reviewing, on Jun 25, 2026.

aidandaly24 wants to merge 1 commit into main from fix/737
