fix(eval): migrate authored expected output to vars by christso · Pull Request #1657 · EntityProcess/agentv

christso · 2026-07-05T03:33:10Z

Summary

Authored Promptfoo-aligned YAML now treats expected_output as removed at the top level, in default_test, and in inline tests[] rows. Reference answers belong in vars.expected_output, where they are inert unless an authored assertion or grader explicitly consumes {{ expected_output }}.
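As a rough illustration of the rule above, the following sketch reports every authored location where a removed `expected_output` key still appears. The `EvalFile` and `TestRow` shapes and the `findRemovedExpectedOutput` helper are hypothetical simplifications for this example; they are not agentv's actual types or validator.

```typescript
// Hypothetical, simplified view of a parsed eval YAML file (not agentv's real types).
interface TestRow {
  vars?: Record<string, unknown>;
  expected_output?: unknown; // legacy key, no longer accepted in authored YAML
}

interface EvalFile {
  expected_output?: unknown; // legacy top-level key
  default_test?: TestRow;
  tests?: TestRow[]; // inline rows only
}

// Returns the path of every legacy expected_output key found in the three
// locations the summary names: top level, default_test, and inline tests[] rows.
function findRemovedExpectedOutput(file: EvalFile): string[] {
  const paths: string[] = [];
  if ("expected_output" in file) {
    paths.push("expected_output");
  }
  if (file.default_test && "expected_output" in file.default_test) {
    paths.push("default_test.expected_output");
  }
  (file.tests ?? []).forEach((row, index) => {
    if ("expected_output" in row) {
      paths.push(`tests[${index}].expected_output`);
    }
  });
  return paths;
}

// A legacy file is flagged in each location; the supported shape is not.
const legacy: EvalFile = {
  default_test: { expected_output: "42" },
  tests: [{ vars: { question: "6 * 7?" }, expected_output: "42" }],
};
const supported: EvalFile = {
  tests: [{ vars: { question: "6 * 7?", expected_output: "42" } }],
};
console.log(findRemovedExpectedOutput(legacy));
console.log(findRemovedExpectedOutput(supported));
```

In the supported shape the reference answer is just another variable, so nothing reads it unless an assertion or grader template mentions `{{ expected_output }}`.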

The hard-deprecation codemod now migrates legacy authored references into vars.expected_output, preserving existing assertion/criteria strategies and adding a reference-matching llm-rubric only when the legacy case had no explicit grading strategy. Examples and local fixtures have been migrated to the supported shape, and the generated eval schema reflects that authored YAML expected_output is no longer accepted.
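The migration rule described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in scripts/migrate-hard-deprecations: the `LegacyTest` shape, the `migrateExpectedOutput` name, and the rubric wording are all hypothetical.

```typescript
// Hypothetical shapes for illustration only (not agentv's real types).
type Assertion = { type: string; value?: string };

interface LegacyTest {
  vars?: Record<string, unknown>;
  expected_output?: unknown; // legacy authored reference answer
  assert?: Assertion[];
  criteria?: string;
}

type MigratedTest = Omit<LegacyTest, "expected_output">;

// Moves a legacy expected_output into vars.expected_output. An existing
// assert/criteria strategy is preserved untouched; a reference-matching
// llm-rubric is added only when the test had no explicit grading strategy.
function migrateExpectedOutput(test: LegacyTest): MigratedTest {
  const { expected_output, ...rest } = test;
  if (expected_output === undefined) {
    return rest; // nothing to migrate
  }

  const migrated: MigratedTest = {
    ...rest,
    // Assumption: the legacy value wins if vars.expected_output already exists.
    vars: { ...(rest.vars ?? {}), expected_output },
  };

  const hasStrategy =
    (rest.assert?.length ?? 0) > 0 || rest.criteria !== undefined;
  if (!hasStrategy) {
    migrated.assert = [
      {
        type: "llm-rubric",
        // Assumed rubric text; the real codemod's wording is not shown in this PR.
        value: "The response matches the reference answer: {{ expected_output }}",
      },
    ];
  }
  return migrated;
}

// No explicit strategy: a rubric that consumes {{ expected_output }} is added.
console.log(migrateExpectedOutput({ expected_output: "Paris" }));
// Explicit criteria: the reference moves to vars and grading is left as authored.
console.log(migrateExpectedOutput({ expected_output: "Paris", criteria: "Names the capital" }));
```

Adding the rubric only in the no-strategy case keeps previously graded references graded, while tests that already chose their own strategy keep it unchanged.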

The TypeScript SDK materialization bridge keeps carrying internal expected_output payloads for now, so this PR stays within the authored YAML slice and leaves SDK/API compatibility cleanup to av-kfik.28.4.

Related: av-kfik.28.2

Validation

bun --filter @agentv/core test
bun --filter agentv test
bun test packages/core/test/evaluation/loaders/jsonl-parser.test.ts packages/core/test/evaluation/validation/eval-validator.test.ts scripts/migrate-hard-deprecations.test.ts packages/core/test/evaluation/criteria-optional.test.ts packages/core/test/evaluation/suite-level-input.test.ts packages/core/test/evaluation/conversation-mode.test.ts packages/core/test/evaluation/validation/eval-schema-sync.test.ts packages/core/test/evaluation/validation/eval-file-schema.test.ts
bun test packages/core/test/evaluation/eval-inline-experiment.test.ts
bun test apps/cli/test/commands/prepare/prepare.test.ts apps/cli/test/eval.integration.test.ts
bun test apps/cli/test/commands/runs/rerun.test.ts apps/cli/test/commands/grade/grade-prepared.test.ts
bun --filter @agentv/core build
bun --filter @agentv/sdk build
bun run generate:schema
bun run validate:examples
bun run lint
git diff --check

Live provider dogfood was not run because this slice changes authored YAML loading, validation, schema, codemod, and examples; it does not change provider execution, LLM grader execution, or run artifact layout.

Post-Deploy Monitoring & Validation

No additional production monitoring required. This is a local authoring/schema/codemod change with no deployed service path; CI validation and example validation are the release signals.

cloudflare-workers-and-pages · 2026-07-05T03:33:50Z

Deploying agentv with Cloudflare Pages

Latest commit:	`b318d2f`
Status:	✅ Deploy successful!
Preview URL:	https://535e62c2.agentv.pages.dev
Branch Preview URL:	https://grading-expected-output-vars.agentv.pages.dev

christso force-pushed the grading-expected-output-vars branch from a77661f to b89d5ae · July 5, 2026 03:39

fix(eval): migrate authored expected output to vars

b318d2f

christso force-pushed the grading-expected-output-vars branch from b89d5ae to b318d2f · July 5, 2026 03:48

christso merged commit 6f92049 into main Jul 5, 2026
8 checks passed

christso deleted the grading-expected-output-vars branch July 5, 2026 03:59

christso merged 1 commit into main from grading-expected-output-vars