Fix SDK eval breadcrumbs for SDK/custom runs #4578
CodeRabbit walkthrough
This PR consolidates evaluation type handling in the EvalRunDetails component tree by replacing scattered string-literal unions with a shared …
Changes: Evaluation Type Standardization
Thank you, @Rajesh270712. The GIF is a local reproduction of the issue outside of Agenta. Please share a demo of a deployed version of Agenta where the issue is fixed. Unfortunately, we don't accept PRs from contributors who have not deployed a version of the software and tested their contributions.
Summary
This is a follow-up to #4552 after the maintainer requested the repository template and visual proof.
This PR fixes the SDK/custom eval run details breadcrumb for #4549.
SDK eval result pages could fall back to the `Auto Evals` breadcrumb because the eval run details route normalized or omitted the `custom` run kind. The change preserves the shared evaluation run kind through the details page, maps `custom` to `SDK Evals`, and keeps SDK/custom runs on the non-human metric-column path while preserving human eval behavior.
Fixes #4549
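The behavior described above can be illustrated with a minimal sketch. It is not the code in this PR: the type name `EvaluationRunKind`, the helpers `isRunKind`, `breadcrumbFor`, and `usesHumanMetricColumns`, the `Human Evals` label, and the `/evaluations` path are all hypothetical and exist only for illustration. Only the `custom` kind, the `Auto Evals` and `SDK Evals` labels, and the `kind=custom` back link come from the description.

```typescript
// Hypothetical shared run-kind type; the real shared type in the codebase may differ.
type EvaluationRunKind = "auto" | "human" | "custom";

const KNOWN_KINDS: readonly EvaluationRunKind[] = ["auto", "human", "custom"];

// "Auto Evals" and "SDK Evals" come from the PR description;
// "Human Evals" is an assumed label.
const BREADCRUMB_LABELS: Record<EvaluationRunKind, string> = {
    auto: "Auto Evals",
    human: "Human Evals",
    custom: "SDK Evals",
};

// Type guard so an unknown query-string value is narrowed rather than cast.
function isRunKind(value: unknown): value is EvaluationRunKind {
    return typeof value === "string" && (KNOWN_KINDS as readonly string[]).includes(value);
}

// Build the breadcrumb for a run details page. The kind is kept as-is when it
// is recognized, so "custom" is no longer collapsed into "auto".
// The "/evaluations" path is an assumption; only the kind parameter is from the source.
function breadcrumbFor(rawKind: string | null | undefined): {label: string; href: string} {
    const kind: EvaluationRunKind = isRunKind(rawKind) ? rawKind : "auto";
    return {label: BREADCRUMB_LABELS[kind], href: `/evaluations?kind=${kind}`};
}

// Only human evals take the human metric-column path; SDK/custom runs
// stay on the non-human path alongside auto evals.
function usesHumanMetricColumns(kind: EvaluationRunKind): boolean {
    return kind === "human";
}
```

In this sketch a missing or unrecognized kind still falls back to `auto`, which mirrors the fallback the description mentions; the difference is that a `custom` value now survives to the breadcrumb and to the back link.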
Testing
Verified locally
- `npx --yes pnpm@11.1.2 exec tsx --tsconfig oss/tsconfig.json --test oss/src/components/EvalRunDetails/utils/evaluationMetricColumns.test.ts` - passed, 2 tests, 0 failures.
- `npx --yes pnpm@11.1.2 --filter @agenta/oss lint` - passed with Node 26 vs expected Node 24 and Next lint deprecation warnings.
- `npx --yes pnpm@11.1.2 exec prettier --check <changed files>` - passed during QA.
- `git diff --check origin/main...HEAD` - passed.

Added or updated tests
`oss/src/components/EvalRunDetails/utils/evaluationMetricColumns.test.ts`, covering metric-column selection for SDK/custom and human evals.
QA follow-up
`npx --yes pnpm@11.1.2 --filter @agenta/oss types:check` still exits 1 on existing repository baseline diagnostics in this environment; no remaining focused diagnostic was tied to the SDK/custom metric helper path.
Demo
This proof shows the patched SDK/custom breadcrumb rendering as `SDK Evals` and linking back with `kind=custom`. It is local component-level proof because seeded local auth/API data was unavailable for a full product-session recording.
Checklist