feat(frontend): Run-on modes in the evaluator creation drawer (shared controls) by mmabrouk · Pull Request #4557 · Agenta-AI/agenta

mmabrouk · 2026-06-05T12:46:12Z

Why

The Run on selector (test case / app output / trace) was only wired into the full-page evaluator playground. The evaluator-creation drawer still hardcoded runDisabled={!hasAppConnected} and only showed the test-set dropdown after an app was connected — so in the drawer you were forced to pick an app even when you wanted to run the evaluator directly on a test case. The drawer had silently drifted out of sync with the page.

What

Rather than paste the run-on wiring into the drawer (a fourth copy), this extracts the logic the page and drawer were already duplicating and shares it:

useEvaluatorRunControls() — one hook for the app adapter, app-select handler, run-on mode + handlePickRunOn, and the run gate (runDisabled = runOnMode === "app" && !hasAppConnected).
EvaluatorRunControls — the run-on selector + app picker + disconnect affordance + test-set dropdown, as one cluster used by both the page header and the drawer header, so they can't diverge again.
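The difference between the drawer's old gate and the shared one can be sketched as two pure functions. This is a minimal illustration, not the hook from the PR: only the `"app"` mode value and the two gate expressions come from the description above, while the `"testcase"` and `"trace"` mode names, the `RunControlsInput` shape, and both function names are assumptions made for the example.

```typescript
// Hypothetical mode names: only "app" appears in the PR text; "testcase" and
// "trace" are assumed labels for the other two run-on modes.
type RunOnMode = "testcase" | "app" | "trace"

// Hypothetical input shape for the example.
interface RunControlsInput {
    runOnMode: RunOnMode
    hasAppConnected: boolean
}

// Gate the drawer used before this PR: it ignores the run-on mode entirely.
function legacyRunDisabled(input: RunControlsInput): boolean {
    return !input.hasAppConnected
}

// Gate described for the shared hook: an app is only required in "app" mode.
function sharedRunDisabled(input: RunControlsInput): boolean {
    return input.runOnMode === "app" && !input.hasAppConnected
}

const testCaseWithoutApp: RunControlsInput = {runOnMode: "testcase", hasAppConnected: false}

console.log(legacyRunDisabled(testCaseWithoutApp)) // true: run blocked, app demanded
console.log(sharedRunDisabled(testCaseWithoutApp)) // false: test-case mode can run
```

With the shared gate, the only combination that blocks a run is "app" mode with no app connected, which matches the "Select an app" empty state mentioned in the test plan.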

Result:

Page: behavior-preserving (just sources its controls from the shared hook/cluster).
Drawer: gains all three run-on modes, the run-on selector, a disconnect affordance, and an always-available test-set dropdown. Test-case mode now runs without forcing an app — the bug is fixed.
Removes the appWorkflowAdapter / handleAppSelect / evaluator-node-lookup triplication across the page body, drawer header, and drawer body.

Net: 218 insertions / 274 deletions across 5 files (2 new, 3 slimmed).

Notes

runOnMode stays persisted per project (shared by page and drawer); the per-evaluator question is tracked separately for a later PR, as discussed.
runDisabled only manifests where the run panel renders (the page and the expanded drawer); the collapsed/config-only drawer ignores it, unchanged.

Stacked on

Based on fe-fix/app-workflow-router-unification-regression-fix (the merged evaluator-playground branch, which already contains the page-side run-on feature from #4553).

Test plan

Open the New Evaluation flow → create-evaluator drawer → switch Run on to "Run directly on a test case": the test-case editor is usable and runs without selecting an app.
Switch to "Run on an app output" with no app: the run panel shows the "Select an app" empty state; pick an app → it runs.
Confirm the full-page evaluator playground is unchanged (modes, default, dark mode, disconnect).

The Run-on selector (test case / app output / trace) was only wired into the full-page evaluator playground. The evaluator-creation drawer still hardcoded `runDisabled={!hasAppConnected}` and only showed the test-set dropdown after an app was connected, so it forced the user to pick an app even when they wanted to run the evaluator directly on a test case.

Rather than copy the run-on wiring into the drawer (a fourth duplicate), extract the shared logic the page and drawer were already duplicating:

- useEvaluatorRunControls(): app adapter, app-select handler, run-on mode + handlePickRunOn, and the runDisabled gate (runOnMode === 'app' && !appConnected).
- EvaluatorRunControls: the run-on selector + app picker + disconnect + test-set cluster, shared by the page header and the drawer header so they can't drift.

The page is behavior-preserving; the drawer gains all three modes, the run-on selector, a disconnect affordance, and an always-available test-set dropdown. This also removes the adapter/handleAppSelect/evaluator-node triplication across the page body, drawer header, and drawer body.

vercel · 2026-06-05T12:46:18Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
agenta-documentation	Ready	Preview, Comment	Jun 8, 2026 12:10pm

coderabbitai · 2026-06-05T12:46:20Z

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.



github-actions · 2026-06-05T13:07:13Z

Railway Preview Environment


Preview URL	https://gateway-production-d7bf.up.railway.app/w
Project	`agenta-oss-pr-4557`
Image tag	`pr-4557-1bda40a`
Status	Deployed
Railway logs	Open logs
Workflow logs	View workflow run
Updated at 2026-06-05T13:54:26.050Z

The creation drawer renders inside EvaluationRunsTableStoreProvider, a scoped jotai store that mirrors only a handful of global atoms. The playground state, however, runs on the default store (the playground package uses getDefaultStore() throughout). So in the drawer the run-on mode was read/written in the scoped store while the playground lived in the default store. The two split, and switching to test-case mode never reached the run panel: it stayed stuck on the 'Select an app' empty state.

Read and write all run-on / playground atoms through getDefaultStore() in useEvaluatorRunControls, mirroring the existing workaround in usePreviewVariantConfig and TestsetCells. On the full page (no scoped store) this is a no-op; in the drawer it aligns run-on state with the playground so test-case mode shows the inputs/outputs as it does on the page.
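The store-split mechanism this commit message describes can be modelled without any library. The sketch below is a toy: the `Store` type, the `runOnMode` key, the `"testcase"` value, and the assumed default of `"app"` are all hypothetical, and none of it is jotai's API. It only shows why a write to one store is invisible to a reader of another. Note that the following commit message reports this hypothesis did not apply to the surface that was actually broken.

```typescript
// Toy stand-in for a state store: a plain Map, NOT jotai's store API.
type Store = Map<string, string>

const RUN_ON_KEY = "runOnMode" // hypothetical key name
const DEFAULT_MODE = "app" // assumed default when nothing has been written

// Read the run-on mode from a given store, falling back to the default.
function readMode(store: Store): string {
    return store.get(RUN_ON_KEY) ?? DEFAULT_MODE
}

const defaultStore: Store = new Map() // where the run panel reads from
const scopedStore: Store = new Map() // where a scoped provider would write

// Header inside the scoped provider switches to test-case mode...
scopedStore.set(RUN_ON_KEY, "testcase")
// ...but the run panel reads the default store and never sees the change.
const panelModeBefore = readMode(defaultStore)

// Workaround described in the commit: always go through the default store.
function setModeViaDefaultStore(mode: string): void {
    defaultStore.set(RUN_ON_KEY, mode)
}
setModeViaDefaultStore("testcase")
const panelModeAfter = readMode(defaultStore)

console.log(panelModeBefore, panelModeAfter) // app testcase
```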

The evaluator drawer rendered by WorkflowRevisionDrawerWrapper reimplemented the run panel gate as `runDisabled={!hasAppConnected}`, ignoring the run-on mode. So switching its Run-on selector to 'test case' updated the header while the panel kept showing the 'Select an app' empty state and demanding an app. The page and creation drawer respected the mode; only this third surface didn't.

Route it through the shared useEvaluatorRunControls hook (+ SelectAppEmptyState and the prop-less EvaluatorPlaygroundHeader), the same wiring the page and the creation drawer use, so the gate is `runOnMode === 'app' && !hasAppConnected` everywhere and the three surfaces can't drift again. Removes this drawer's duplicated app adapter / app-select / run-gate logic.

Also drop the getDefaultStore() patch from useEvaluatorRunControls: runtime debugging proved these surfaces are not in a scoped store (the drawer that was broken is WorkflowRevisionDrawerWrapper, not the scoped-store CreateEvaluator drawer), so the override was a no-op based on a wrong hypothesis.

dosubot Bot added the size:L label (this PR changes 100-499 lines, ignoring generated files) Jun 5, 2026

dosubot Bot added the Frontend label Jun 5, 2026

vercel Bot deployed to Preview June 5, 2026 12:46

dosubot Bot added the size:XL label (this PR changes 500-999 lines, ignoring generated files) and removed the size:L label Jun 5, 2026

vercel Bot deployed to Preview June 5, 2026 13:45 View deployment

ardaerzin self-requested a review June 8, 2026 08:52

ardaerzin approved these changes Jun 8, 2026


mmabrouk changed the base branch from fe-fix/app-workflow-router-unification-regression-fix to release/v0.102.0 June 8, 2026 09:08

Merge branch 'release/v0.102.0' into fe-feat/evaluator-drawer-run-on

3a6319e

mmabrouk marked this pull request as draft June 8, 2026 09:09

mmabrouk marked this pull request as ready for review June 8, 2026 09:10

vercel Bot deployed to Preview June 8, 2026 09:11 View deployment

vercel Bot deployed to Preview June 8, 2026 12:10 View deployment

mmabrouk merged commit fa7b3be into release/v0.102.0 Jun 8, 2026
19 of 20 checks passed

mmabrouk merged 4 commits into release/v0.102.0 from fe-feat/evaluator-drawer-run-on
