Fix SDK eval breadcrumbs for SDK/custom runs by Rajesh270712 · Pull Request #4578 · Agenta-AI/agenta

Rajesh270712 · 2026-06-08T10:47:48Z

Summary

This is a follow-up to #4552 after the maintainer requested the repository template and visual proof.

This PR fixes the SDK/custom eval run details breadcrumb for #4549.

SDK eval result pages could fall back to the Auto Evals breadcrumb because the eval run details route normalized or omitted the custom run kind. The change preserves the shared evaluation run kind through the details page, maps custom to SDK Evals, and keeps SDK/custom runs on the non-human metric-column path while preserving human eval behavior.
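The mapping described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the members of `EvaluationRunKind` other than `custom` and `human`, the "Human Evals" label, and the helper names `BREADCRUMB_LABELS`, `isEvaluationRunKind`, and `breadcrumbLabelForKind` are all assumptions made for the example.

```typescript
// Hypothetical sketch: only "custom" -> "SDK Evals" and the "Auto Evals"
// fallback come from the PR description; everything else is assumed.
type EvaluationRunKind = "auto" | "human" | "custom";

const BREADCRUMB_LABELS: Record<EvaluationRunKind, string> = {
    auto: "Auto Evals",
    human: "Human Evals", // assumed label
    custom: "SDK Evals",
};

// Narrow an arbitrary route value to a known run kind without normalizing it.
function isEvaluationRunKind(value: unknown): value is EvaluationRunKind {
    return (
        typeof value === "string" &&
        Object.prototype.hasOwnProperty.call(BREADCRUMB_LABELS, value)
    );
}

// Preserve the kind when it is known; fall back to Auto Evals only when it is
// missing or unrecognized.
function breadcrumbLabelForKind(kind: unknown): string {
    return isEvaluationRunKind(kind) ? BREADCRUMB_LABELS[kind] : BREADCRUMB_LABELS.auto;
}
```

With a mapping like this, dropping or rewriting the `custom` kind before it reaches the details page is what would produce the Auto Evals fallback described above.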

Fixes #4549

Testing

Verified locally

npx --yes pnpm@11.1.2 exec tsx --tsconfig oss/tsconfig.json --test oss/src/components/EvalRunDetails/utils/evaluationMetricColumns.test.ts - passed, 2 tests, 0 failures.
npx --yes pnpm@11.1.2 --filter @agenta/oss lint - passed, with warnings about running Node 26 instead of the expected Node 24 and about Next lint deprecation.
npx --yes pnpm@11.1.2 exec prettier --check <changed files> - passed during QA.
git diff --check origin/main...HEAD - passed.

Added or updated tests

Added oss/src/components/EvalRunDetails/utils/evaluationMetricColumns.test.ts covering metric-column selection for SDK/custom and human evals.

QA follow-up

QA approved the branch as ready after branch, diff, identity, focused test, lint, and formatting checks.
The full npx --yes pnpm@11.1.2 --filter @agenta/oss types:check run still exits with code 1 in this environment because of pre-existing baseline diagnostics in the repository; none of the remaining diagnostics is tied to the SDK/custom metric helper path.

Demo

Local component-level proof GIF: https://gist.githubusercontent.com/Rajesh270712/bd10c47fd502a981c5db77d199963f91/raw/agenta-pr-4552-sdk-eval-breadcrumb-proof.gif

This proof shows the patched SDK/custom breadcrumb rendering as SDK Evals and linking back with kind=custom. It is local component-level proof because seeded local auth/API data was unavailable for a full product-session recording.
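For illustration, a back link that carries the run kind as a query parameter could be built along these lines. The function name `buildEvaluationsListHref` and the `/evaluations` path are hypothetical; only the `kind=custom` parameter comes from the description above.

```typescript
// Hypothetical sketch: the base path and function name are assumptions.
function buildEvaluationsListHref(basePath: string, kind: string): string {
    // URLSearchParams handles encoding of the query value.
    const params = new URLSearchParams({kind});
    return `${basePath}?${params.toString()}`;
}

const sdkEvalsHref = buildEvaluationsListHref("/evaluations", "custom");
```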

Checklist

I have included a video or screen recording for UI changes, or marked Demo as N/A
Relevant tests pass locally
Relevant linting and formatting pass locally
I have signed the CLA, or I will sign it when the bot prompts me

coderabbitai · 2026-06-08T10:48:20Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 4870a3b4-bcc8-4527-9463-f73aff49caae

📥 Commits

Reviewing files that changed from the base of the PR and between 98b8a9d and 2df47bf.

📒 Files selected for processing (14)

web/oss/src/components/EvalRunDetails/Table.tsx
web/oss/src/components/EvalRunDetails/atoms/metricProcessor.ts
web/oss/src/components/EvalRunDetails/atoms/table/types.ts
web/oss/src/components/EvalRunDetails/components/FocusDrawer.tsx
web/oss/src/components/EvalRunDetails/components/Page.tsx
web/oss/src/components/EvalRunDetails/components/columnVisibility/ColumnVisibilityPopoverContent.tsx
web/oss/src/components/EvalRunDetails/evaluationPreviewTableStore.ts
web/oss/src/components/EvalRunDetails/hooks/usePreviewColumns.tsx
web/oss/src/components/EvalRunDetails/state/evalType.ts
web/oss/src/components/EvalRunDetails/test.tsx
web/oss/src/components/EvalRunDetails/utils/buildPreviewColumns.tsx
web/oss/src/components/EvalRunDetails/utils/buildSkeletonColumns.ts
web/oss/src/components/EvalRunDetails/utils/evaluationMetricColumns.test.ts
web/oss/src/components/EvalRunDetails/utils/evaluationMetricColumns.ts

📝 Walkthrough

Summary by CodeRabbit

Release Notes

New Features
- Added support for "custom" evaluation type in evaluation run details.
Refactor
- Standardized metric column selection logic for consistent behavior across evaluation types.
Tests
- Added unit tests for metric column selection functionality.

Walkthrough

This PR consolidates evaluation type handling in the EvalRunDetails component tree by replacing scattered string-literal unions with a shared EvaluationRunKind type, introducing centralized metric column selection utilities, and refactoring components to use typed helpers instead of inline branching on evaluation type.

Changes

Evaluation Type Standardization

Layer / File(s)	Summary
Metric column selection utilities `web/oss/src/components/EvalRunDetails/utils/evaluationMetricColumns.ts`, `...test.ts`	New utility module exports `usesHumanMetricColumns`, `usesAutoMetricColumns`, and `selectStaticMetricColumnsForEvaluationType` to encapsulate metric column selection logic; tests verify selection for `custom` and `human` evaluation kinds.
Core type definitions `web/oss/src/components/EvalRunDetails/atoms/table/types.ts`, `state/evalType.ts`, `evaluationPreviewTableStore.ts`, `atoms/metricProcessor.ts`	`EvaluationTableColumn.visibleFor`, `PreviewEvaluationType`, `EvaluationPreviewMeta.evaluationType`, and metric processor options all adopt `EvaluationRunKind` instead of local string unions.
Top-level component props `web/oss/src/components/EvalRunDetails/components/Page.tsx`, `test.tsx`	`EvalRunPreviewPage` and `EvalRunTestPage` update their evaluation-type props to `EvaluationRunKind`; breadcrumb mapping adds support for the `custom` evaluation kind.
Column processing utilities and hooks `web/oss/src/components/EvalRunDetails/utils/buildPreviewColumns.tsx`, `buildSkeletonColumns.ts`, `hooks/usePreviewColumns.tsx`	Utility functions and hooks refactored to use `EvaluationRunKind` types and delegate metric selection to `selectStaticMetricColumnsForEvaluationType`, removing inline `evaluationType` branching.
Table and presentation components `web/oss/src/components/EvalRunDetails/Table.tsx`, `components/columnVisibility/ColumnVisibilityPopoverContent.tsx`, `components/FocusDrawer.tsx`	Table and column-visibility components now use `EvaluationRunKind` props and call metric selection helpers; FocusDrawer reorganizes imports to source `MetricColumnDefinition` from shared entities.
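As a rough sketch of how the helpers named in the table might fit together: the function names come from the walkthrough, but their signatures, bodies, the `MetricColumn` and `StaticMetricColumns` shapes, and the `auto` member of `EvaluationRunKind` are assumptions made for this example and may differ from the actual module.

```typescript
// Hypothetical sketch: signatures and types are assumed, not taken from the PR.
type EvaluationRunKind = "auto" | "human" | "custom";

interface MetricColumn {
    key: string;
    label: string;
}

interface StaticMetricColumns {
    auto: MetricColumn[];
    human: MetricColumn[];
}

// Only human evaluations use the human metric columns.
function usesHumanMetricColumns(kind: EvaluationRunKind): boolean {
    return kind === "human";
}

// Every non-human kind, including SDK/custom runs, stays on the auto path.
function usesAutoMetricColumns(kind: EvaluationRunKind): boolean {
    return !usesHumanMetricColumns(kind);
}

// Single place where callers choose a column set, instead of branching inline.
function selectStaticMetricColumnsForEvaluationType(
    kind: EvaluationRunKind,
    columns: StaticMetricColumns,
): MetricColumn[] {
    return usesHumanMetricColumns(kind) ? columns.human : columns.auto;
}
```

Centralizing the choice in one selector means a new run kind only needs to be handled in one helper rather than in each table, hook, and column builder.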

Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 60.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title 'Fix SDK eval breadcrumbs for SDK/custom runs' clearly and specifically summarizes the main change—fixing breadcrumb rendering for SDK/custom evaluation runs.
Description check	✅ Passed	The description is comprehensive and directly related to the changeset, explaining the problem (breadcrumb fallback), solution (preserving evaluation run kind), and testing/verification details.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

mmabrouk · 2026-06-08T10:57:17Z

Thank you @Rajesh270712

The GIF is a local reproduction of the issue outside of Agenta.

Please share a demo of a deployed version of Agenta where the issue is fixed. Unfortunately, we don't accept PRs from contributors who have not deployed a version of the software and tested their contributions on it.

Rajesh270712 added 3 commits June 5, 2026 00:23

Fix SDK eval details breadcrumb

84bced6

Use shared eval run kind in test route

998d863

Keep SDK eval metrics on auto path

2df47bf

dosubot Bot added labels Jun 8, 2026: size:L (This PR changes 100-499 lines, ignoring generated files.), bug (Something isn't working), Frontend tests

mmabrouk closed this Jun 8, 2026

Rajesh270712 wants to merge 3 commits into Agenta-AI:main from Rajesh270712:reputation/REP-PR-005-sdk-eval-breadcrumb
