fix: show correct breadcrumb label for SDK evaluations #4567

Open
GanJiaKouN16 wants to merge 7 commits into Agenta-AI:main from GanJiaKouN16:fix/eval-breadcrumbs-sdk-type
Conversation

@GanJiaKouN16

Summary

Fixes #4549 — breadcrumbs always showed "Auto Evals" for SDK evaluations instead of "SDK Evals".

Root cause

  1. test.tsx normalized type="custom" to "auto" before passing to EvalRunPreviewPage, discarding the SDK type
  2. Page.tsx typeMap had no "custom" entry
  3. buildBreadcrumbs.ts hardcoded "auto evaluation" as the fallback label

Changes

| File | Change |
| --- | --- |
| EvalRunDetails/test.tsx | Remove custom → auto normalization; pass type through as-is |
| EvalRunDetails/components/Page.tsx | Add custom: {label: "SDK Evals", kind: "custom"} to typeMap (matches tab label in EvaluationsView.tsx) |
| lib/helpers/buildBreadcrumbs.ts | Change fallback from "auto evaluation" to "Evaluations" |

Closes #4549

Wire the existing GenerateResetLinkModal and PasswordResetLinkModal
into the Actions dropdown in the workspace members table.

- Add 'Reset password' menu item for workspace members (not self)
- Add resetPassword API function in profile service
- Show confirmation dialog before generating the reset link
- Display the generated password reset link with copy functionality

Closes Agenta-AI#2572

Several tables with row-level click navigation were missing the
shouldIgnoreRowClick guard, causing clicks on interactive elements
(checkboxes, dropdowns, buttons) to accidentally trigger row navigation.

Changes:
- Consolidate shouldIgnoreRowClick with broader selector list (merges
  EvaluationRunsTablePOC's extra selectors: [role='button'],
  [role='menuitem'], [role='checkbox'], .ant-btn, etc.)
- Export INTERACTIVE_ROW_SELECTORS constant for reuse
- Add guard to ObservabilityTable (traces)
- Add guard to SessionsTable
- Add guard to PromptsPage
- Add guard to TestcasesTableShell
- Add guard to EntityTable
- Replace partial data-ivt-stop-row-click check in ScenarioListView
  with full shouldIgnoreRowClick
- Update useEntityTableState to use consolidated selectors
- Remove duplicate shouldIgnoreRowClick from navigationActions.ts
- Update EvaluationRunsTablePOC to import from shared utility

Closes Agenta-AI#3254

The evaluation table was showing a generic 'too many requests' message
instead of the actual provider error because:

1. executeViaFetch never checked for body-level errors on HTTP 200.
   The Python SDK can return HTTP 200 with a non-200 status.code
   embedded in the response body (WorkflowBatchResponse.status.code).
   This path was silently treated as success.

2. Error stacktrace/type/code were not propagated through the pipeline.
   Even when the HTTP error path was taken, only the message was
   extracted — the SDK's status.type, status.code, and status.stacktrace
   were dropped.

Changes:
- executeViaFetch: detect body-level errors on HTTP 200 by checking
  responseData.status.code !== 200 and return an error result
- executeViaFetch: extract stacktrace (coercing string[] to string),
  type, and code from both HTTP-error and body-error paths
- Add stacktrace and type to ExecutionResult, RunResult, and
  ExecuteWorkflowRevisionResult error shapes
- runInvocationAction: pass stacktrace and type through to
  upsertStepResultWithInvocation
- upsertStepResultWithInvocation: accept type field in error param

No UI changes needed — InvocationCell already renders stepError.message
and stepError.stacktrace when present; extractStepError already reads
error.code, error.type, error.stacktrace from persisted step data.
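
The body-level check described in this commit can be sketched as follows. The actual change lives in the TypeScript executeViaFetch; this is only a minimal Python illustration of the logic, and the function name `extract_error`, the `status.message` field, and the fallback message text are assumptions made for the example.

```python
from typing import Any, Optional


def extract_error(http_status: int, body: Any) -> Optional[dict]:
    """Return an error dict if the call failed, else None.

    Covers the HTTP-error path and the body-level path, where the
    transport status is 200 but body["status"]["code"] is not.
    """
    status = body.get("status") if isinstance(body, dict) else None
    if not isinstance(status, dict):
        status = {}

    body_code = status.get("code")
    http_failed = not (200 <= http_status < 300)
    body_failed = body_code is not None and body_code != 200
    if not (http_failed or body_failed):
        return None

    # The SDK may send the stacktrace as a list of lines; coerce to one string.
    stacktrace = status.get("stacktrace")
    if isinstance(stacktrace, list):
        stacktrace = "\n".join(str(line) for line in stacktrace)

    return {
        # Assumption: the human-readable message sits next to code/type.
        "message": status.get("message") or f"Request failed with status {http_status}",
        "code": body_code if body_code is not None else http_status,
        "type": status.get("type"),
        "stacktrace": stacktrace,
    }
```

Returning type and stacktrace alongside message and code is what lets the later pipeline stages persist the provider's real error instead of a generic one.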

Closes Agenta-AI#3324

…iddleware

The vault middleware built env var names using f'{provider.upper()}_API_KEY'
which produces TOGETHER_AI_API_KEY for the 'together_ai' provider kind.
The actual env var is TOGETHERAI_API_KEY (no underscore), matching the
frontend (llmProviders.ts, transforms.ts), backend (env.py), and the
Daytona sandbox runner (daytona.py).

Add an explicit _PROVIDER_ENV_VAR_MAP dict (mirroring the Daytona runner
pattern) that maps each provider kind to its correct env var name, with
fallback to the original f-string pattern for any future providers.
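
A minimal sketch of the lookup-with-fallback described above. Only the together_ai entry is stated in this commit message; the openai entry and the helper name `env_var_for` are illustrative assumptions, not the SDK's actual contents.

```python
from typing import Dict

# Explicit map for provider kinds whose env var does not follow the
# "{PROVIDER}_API_KEY" pattern. Entries other than together_ai are
# illustrative assumptions.
_PROVIDER_ENV_VAR_MAP: Dict[str, str] = {
    "openai": "OPENAI_API_KEY",
    "together_ai": "TOGETHERAI_API_KEY",  # no underscore between TOGETHER and AI
}


def env_var_for(provider: str) -> str:
    """Return the env var name for a provider kind.

    Falls back to the original f-string pattern for unmapped providers.
    """
    return _PROVIDER_ENV_VAR_MAP.get(provider, f"{provider.upper()}_API_KEY")
```

With the fallback in place, a provider missing from the map behaves exactly as before the change, so only the known special cases need explicit entries.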

Closes Agenta-AI#3659

… drawer

The 'Open evaluator registry' button in the Trace Drawer's
EvaluatorDetailsPopover navigated human evaluators to the registry
page (/evaluators?tab=human&openEvaluator=...) instead of the
evaluator playground. Automatic evaluators already linked correctly.

Unify both evaluator types to navigate to the playground
(/evaluators/playground?revisions=...), consistent with how other
parts of the codebase link to evaluators (EvaluatorSection,
ConfigurationView). Update button text to 'Open evaluator playground'.

Closes Agenta-AI#4535

Replace the SendGrid-only email backend with SMTP support, keeping
SendGrid as a legacy fallback for existing deployments.

Changes:
- email_service.py: use smtplib for SMTP (priority), SendGrid as
  fallback, no-op when neither is configured
- env.py: add SmtpConfig (SMTP_HOST, SMTP_PORT, SMTP_USERNAME,
  SMTP_PASSWORD, SMTP_FROM_ADDRESS, SMTP_USE_TLS), keep SendgridConfig
  for backwards compatibility; update AuthFacade.email_method to check
  both
- OSS/EE organization_service.py: use env.smtp.enabled or
  env.sendgrid.enabled; use configured from_address instead of hardcoded email
- user_service.py: same email-enabled check update
- db_manager_ee.py: remove dead sendgrid import and unused sg client
- pyproject.toml: remove sendgrid dependency (imported lazily only when
  SENDGRID_API_KEY is set)
- env example files: add SMTP vars, mark SendGrid as legacy
- docs: add SMTP config table, mark SendGrid as legacy
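
The backend priority listed above (SMTP first, SendGrid as fallback, no-op otherwise) can be sketched roughly as below. This is not the PR's code: the real SmtpConfig is described as a pydantic model, whereas this sketch uses a plain dataclass, and the names `SmtpSettings`, `load_smtp_settings`, `pick_backend`, the default port 587, and the rule for `enabled` are all assumptions.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class SmtpSettings:
    host: Optional[str]
    port: int
    username: Optional[str]
    password: Optional[str]
    from_address: Optional[str]
    use_tls: bool

    @property
    def enabled(self) -> bool:
        # Assumption: SMTP counts as configured once a host and sender are set.
        return bool(self.host and self.from_address)


def load_smtp_settings(environ: Mapping[str, str] = os.environ) -> SmtpSettings:
    """Read the SMTP_* variables named in the commit message."""
    return SmtpSettings(
        host=environ.get("SMTP_HOST") or None,
        port=int(environ.get("SMTP_PORT") or 587),  # assumed default
        username=environ.get("SMTP_USERNAME") or None,
        password=environ.get("SMTP_PASSWORD") or None,
        from_address=environ.get("SMTP_FROM_ADDRESS") or None,
        use_tls=(environ.get("SMTP_USE_TLS") or "true").lower() in ("1", "true", "yes"),
    )


def pick_backend(smtp_enabled: bool, sendgrid_enabled: bool) -> Optional[str]:
    """SMTP takes priority, SendGrid is the legacy fallback, None means no-op."""
    if smtp_enabled:
        return "smtp"
    if sendgrid_enabled:
        return "sendgrid"
    return None
```

Keeping the decision in one small function makes the priority order easy to test without touching the network.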

Closes Agenta-AI#4536

The breadcrumb always showed 'Auto Evals' for SDK evaluations because:

1. test.tsx normalized type='custom' to 'auto' before passing to
   EvalRunPreviewPage, losing the SDK type information
2. Page.tsx typeMap had no 'custom' entry
3. buildBreadcrumbs.ts hardcoded 'auto evaluation' as fallback label

Fix:
- Remove the custom→auto normalization in test.tsx
- Add 'custom' → 'SDK Evals' entry to Page.tsx typeMap (matches the
  tab label in EvaluationsView.tsx)
- Change buildBreadcrumbs.ts fallback from 'auto evaluation' to
  'Evaluations'

Closes Agenta-AI#4549

@vercel

vercel Bot commented Jun 6, 2026

Someone is attempting to deploy a commit to the agenta projects Team on Vercel.

A member of the Team first needs to authorize it.

@dosubot dosubot Bot added the size:L label (This PR changes 100-499 lines, ignoring generated files.) Jun 6, 2026
@CLAassistant

CLAassistant commented Jun 6, 2026

CLA assistant check
All committers have signed the CLA.

@dosubot dosubot Bot added the bug (Something isn't working) and Frontend labels Jun 6, 2026
@coderabbitai

coderabbitai Bot commented Jun 6, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 8b7d0173-aa36-4163-a30d-31a0cbf5d105

📥 Commits

Reviewing files that changed from the base of the PR and between 6821d0f and ad6e140.

📒 Files selected for processing (37)
  • api/ee/src/services/db_manager_ee.py
  • api/ee/src/services/organization_service.py
  • api/oss/src/services/email_service.py
  • api/oss/src/services/organization_service.py
  • api/oss/src/services/user_service.py
  • api/oss/src/utils/env.py
  • api/pyproject.toml
  • docs/docs/self-host/02-configuration.mdx
  • hosting/docker-compose/ee/env.ee.dev.example
  • hosting/docker-compose/ee/env.ee.gh.example
  • hosting/docker-compose/oss/env.oss.dev.example
  • hosting/docker-compose/oss/env.oss.gh.example
  • sdks/python/agenta/sdk/middlewares/running/vault.py
  • web/oss/src/components/EvalRunDetails/atoms/runInvocationAction.ts
  • web/oss/src/components/EvalRunDetails/components/Page.tsx
  • web/oss/src/components/EvalRunDetails/test.tsx
  • web/oss/src/components/EvaluationRunsTablePOC/actions/navigationActions.ts
  • web/oss/src/components/EvaluationRunsTablePOC/components/EvaluationRunsTable/index.tsx
  • web/oss/src/components/InfiniteVirtualTable/hooks/useTableManager.tsx
  • web/oss/src/components/SharedDrawers/TraceDrawer/components/EvaluatorDetailsPopover.tsx
  • web/oss/src/components/SharedDrawers/TraceDrawer/hooks/useEvaluatorNavigation.ts
  • web/oss/src/components/TestcasesTableNew/components/TestcasesTableShell.tsx
  • web/oss/src/components/pages/observability/components/ObservabilityTable/index.tsx
  • web/oss/src/components/pages/observability/components/SessionsTable/index.tsx
  • web/oss/src/components/pages/prompts/PromptsPage.tsx
  • web/oss/src/components/pages/settings/WorkspaceManage/cellRenderers.tsx
  • web/oss/src/lib/helpers/buildBreadcrumbs.ts
  • web/oss/src/services/evaluations/invocations/api.ts
  • web/oss/src/services/profile/index.ts
  • web/packages/agenta-annotation-ui/src/components/AnnotationSession/ScenarioListView.tsx
  • web/packages/agenta-entities/src/runnable/types.ts
  • web/packages/agenta-entity-ui/src/shared/EntityTable.tsx
  • web/packages/agenta-playground/src/executeWorkflowRevision.ts
  • web/packages/agenta-playground/src/state/execution/executionRunner.ts
  • web/packages/agenta-playground/src/state/execution/types.ts
  • web/packages/agenta-ui/src/InfiniteVirtualTable/hooks/useEntityTableState.ts
  • web/packages/agenta-ui/src/InfiniteVirtualTable/hooks/useTableManager.tsx
💤 Files with no reviewable changes (3)
  • api/ee/src/services/db_manager_ee.py
  • api/pyproject.toml
  • web/oss/src/components/EvaluationRunsTablePOC/actions/navigationActions.ts
✅ Files skipped from review due to trivial changes (3)
  • web/oss/src/components/SharedDrawers/TraceDrawer/components/EvaluatorDetailsPopover.tsx
  • docs/docs/self-host/02-configuration.mdx
  • hosting/docker-compose/oss/env.oss.gh.example
🚧 Files skipped from review as they are similar to previous changes (28)
  • web/oss/src/lib/helpers/buildBreadcrumbs.ts
  • web/oss/src/services/evaluations/invocations/api.ts
  • hosting/docker-compose/oss/env.oss.dev.example
  • web/oss/src/services/profile/index.ts
  • web/oss/src/components/pages/observability/components/SessionsTable/index.tsx
  • web/oss/src/components/TestcasesTableNew/components/TestcasesTableShell.tsx
  • web/oss/src/components/pages/observability/components/ObservabilityTable/index.tsx
  • web/oss/src/components/EvalRunDetails/test.tsx
  • web/oss/src/components/EvalRunDetails/components/Page.tsx
  • web/packages/agenta-entity-ui/src/shared/EntityTable.tsx
  • web/oss/src/components/InfiniteVirtualTable/hooks/useTableManager.tsx
  • web/oss/src/components/pages/prompts/PromptsPage.tsx
  • api/oss/src/services/organization_service.py
  • hosting/docker-compose/ee/env.ee.gh.example
  • web/packages/agenta-entities/src/runnable/types.ts
  • api/ee/src/services/organization_service.py
  • sdks/python/agenta/sdk/middlewares/running/vault.py
  • web/packages/agenta-ui/src/InfiniteVirtualTable/hooks/useEntityTableState.ts
  • web/packages/agenta-annotation-ui/src/components/AnnotationSession/ScenarioListView.tsx
  • web/packages/agenta-ui/src/InfiniteVirtualTable/hooks/useTableManager.tsx
  • web/packages/agenta-playground/src/state/execution/executionRunner.ts
  • web/packages/agenta-playground/src/state/execution/types.ts
  • api/oss/src/services/user_service.py
  • web/oss/src/components/EvaluationRunsTablePOC/components/EvaluationRunsTable/index.tsx
  • hosting/docker-compose/ee/env.ee.dev.example
  • api/oss/src/utils/env.py
  • web/oss/src/components/pages/settings/WorkspaceManage/cellRenderers.tsx
  • api/oss/src/services/email_service.py

📝 Walkthrough

Summary by CodeRabbit

  • New Features

    • SMTP added as the primary email option; SendGrid marked as legacy
    • Password reset flow for workspace members (generate and copy reset links)
  • Improvements

    • Email sending now prefers SMTP, with smarter sender address selection and graceful fallback to invite links when email isn’t configured
    • Richer error reporting (type & stacktrace) across runs and executions
    • Table row clicks ignore interactive elements more reliably
    • Evaluator navigation unified to playground; custom eval labeling improved
  • Documentation

    • Added SMTP configuration guidance and updated environment examples

Walkthrough

The PR adds SMTP email support with SendGrid fallback, consolidates interactive-element click handling across tables, extends evaluation type support for custom SDK evals with richer error reporting, unifies evaluator navigation routes, introduces workspace member password reset, and improves SDK provider environment variable mapping.

Changes

Email System: SMTP Support with SendGrid Legacy Fallback

| Layer | File(s) | Summary |
| --- | --- | --- |
| SMTP Configuration Model and Environment Setup | api/oss/src/utils/env.py | New SmtpConfig pydantic model reads SMTP connection parameters from environment variables with enabled property. EnvironSettings extended to include smtp: SmtpConfig field. AuthFacade.email_method returns "otp" when either SMTP or SendGrid is enabled. |
| Email Service Backend Implementation: SMTP and SendGrid Dispatch | api/oss/src/services/email_service.py | Email service refactored with module-level flags determining backend availability. New internal helpers _send_via_smtp and _send_via_sendgrid handle sending. Dispatcher in send_email routes through appropriate backend or no-ops when both disabled. |
| Service Integration: Organization and User Services | api/oss/src/services/organization_service.py, api/oss/src/services/user_service.py, api/ee/src/services/organization_service.py | Organization and user services now check both env.smtp.enabled and env.sendgrid.enabled for email availability; sender address derived from configured sources with fallback. |
| Enterprise Email Configuration Cleanup | api/ee/src/services/db_manager_ee.py | EE-specific SendGrid import and client initialization removed. |
| Documentation and Environment Examples | docs/docs/self-host/02-configuration.mdx, hosting/docker-compose/*/*/env.*.example | Configuration documentation updated with SMTP section; SendGrid marked as legacy. Docker Compose examples include commented SMTP configuration and legacy SendGrid notes. |
| SendGrid Dependency Removal | api/pyproject.toml | sendgrid>=6,<7 package dependency removed from project configuration. |

Evaluation Runs: Custom SDK Eval Type and Richer Error Details

| Layer | File(s) | Summary |
| --- | --- | --- |
| Error Type Expansion Across Frontend and Backend | web/oss/src/components/EvalRunDetails/atoms/runInvocationAction.ts, web/oss/src/services/evaluations/invocations/api.ts, web/packages/agenta-entities/src/runnable/types.ts, web/packages/agenta-playground/src/executeWorkflowRevision.ts, web/packages/agenta-playground/src/state/execution/executionRunner.ts, web/packages/agenta-playground/src/state/execution/types.ts | Error objects expanded to include optional type and stacktrace fields alongside existing message and code across frontend atoms, service APIs, entity types, execution runner callbacks, and playground interfaces. |
| Custom Evaluation Type (SDK Evals) UI Support | web/oss/src/components/EvalRunDetails/test.tsx, web/oss/src/components/EvalRunDetails/components/Page.tsx | Evaluation test page stops normalizing custom eval type to auto; type propagates directly. Evaluation breadcrumb mapping extended to handle "custom" type with "SDK Evals" label and kind=custom query parameter. |
| Evaluator Navigation Unification | web/oss/src/components/SharedDrawers/TraceDrawer/components/EvaluatorDetailsPopover.tsx, web/oss/src/components/SharedDrawers/TraceDrawer/hooks/useEvaluatorNavigation.ts | Evaluator navigation unified to use /evaluators/playground?revisions=... route for both human and auto evaluators. Popover CTA button text updated to "Open evaluator playground". |
| Breadcrumb Label Updates | web/oss/src/lib/helpers/buildBreadcrumbs.ts | Evaluation breadcrumb label updated from "auto evaluation" to "Evaluations" for consistency. |

Interactive Row Click Handling: Centralization and Expansion

| Layer | File(s) | Summary |
| --- | --- | --- |
| Interactive Selector Centralization and Expansion | web/oss/src/components/InfiniteVirtualTable/hooks/useTableManager.tsx, web/packages/agenta-ui/src/InfiniteVirtualTable/hooks/useTableManager.tsx | New exported INTERACTIVE_ROW_SELECTORS CSS selector string consolidates interactive-element detection (button, menuitem, checkbox, select, data-interactive, Ant-specific classes). shouldIgnoreRowClick refactored to use target.closest(INTERACTIVE_ROW_SELECTORS) instead of hardcoded per-selector checks. |
| Table Components: Click Handler Updates | web/oss/src/components/EvaluationRunsTablePOC/actions/navigationActions.ts, web/oss/src/components/EvaluationRunsTablePOC/components/EvaluationRunsTable/index.tsx, web/oss/src/components/TestcasesTableNew/components/TestcasesTableShell.tsx, web/oss/src/components/pages/observability/components/ObservabilityTable/index.tsx, web/oss/src/components/pages/observability/components/SessionsTable/index.tsx, web/oss/src/components/pages/prompts/PromptsPage.tsx, web/packages/agenta-annotation-ui/src/components/AnnotationSession/ScenarioListView.tsx, web/packages/agenta-entity-ui/src/shared/EntityTable.tsx | Multiple table components import shouldIgnoreRowClick from InfiniteVirtualTable and gate row navigation/selection on click event via centralized check. Navigation-specific implementation removed from navigationActions and imported from shared location. |
| Selector Consumer Hook Consolidation | web/packages/agenta-ui/src/InfiniteVirtualTable/hooks/useEntityTableState.ts | useEntityTableState hook updated to derive DEFAULT_INTERACTIVE_SELECTORS from centralized INTERACTIVE_ROW_SELECTORS constant instead of maintaining local hardcoded array. |

Workspace Member Password Reset Feature

| Layer | File(s) | Summary |
| --- | --- | --- |
| Password Reset Profile Service and API Integration | web/oss/src/services/profile/index.ts | New resetPassword(userId: string) function in profile service posts to api/profile/reset-password endpoint with user_id query parameter and returns reset link string. |
| Workspace Password Reset UI Components | web/oss/src/components/pages/settings/WorkspaceManage/cellRenderers.tsx | WorkspaceManage extended with password-reset capability: adds Key icon import, imports resetPassword service and modal components, introduces state for modal visibility and reset link storage, adds async handleResetPassword handler with error messaging, wires "Reset password" dropdown action for workspace members, renders GenerateResetLinkModal and PasswordResetLinkModal with handlers. |

SDK Provider Environment Variable Mapping

| Layer | File(s) | Summary |
| --- | --- | --- |
| Provider Environment Variable Mapping | sdks/python/agenta/sdk/middlewares/running/vault.py | New _PROVIDER_ENV_VAR_MAP dictionary maps LLM provider kinds to API key environment variable names, handling special cases like together_ai → TOGETHERAI_API_KEY. get_secrets updated to use mapping with fallback to {PROVIDER}_API_KEY pattern. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Out of Scope Changes check | ⚠️ Warning | The changeset includes a large number of unrelated modifications (email/SMTP refactoring, SDK changes, table row click handlers, password reset functionality) that are completely out of scope for the breadcrumb label fix described in the PR objectives. | Remove all changes unrelated to the breadcrumb label fix: revert email service changes, SDK modifications, table click handler refactoring, password reset features, and other non-breadcrumb files to keep the PR focused on issue #4549. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'fix: show correct breadcrumb label for SDK evaluations' clearly describes the main change: fixing breadcrumb labels to correctly show 'SDK Evals' instead of 'Auto Evals' for SDK evaluations. |
| Description check | ✅ Passed | The description is directly related to the changeset, explaining the root cause (normalization of custom type, missing typeMap entry, hardcoded fallback) and listing the specific file changes made to fix the breadcrumb label issue. |
| Linked Issues check | ✅ Passed | The PR successfully addresses the coding requirements from issue #4549: removes the custom→auto normalization in test.tsx, adds the missing 'custom' entry to typeMap in Page.tsx, and updates the fallback label in buildBreadcrumbs.ts to ensure SDK evaluations display 'SDK Evals' instead of 'Auto Evals'. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 86.67% which is sufficient. The required threshold is 60.00%. |


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
web/oss/src/components/EvalRunDetails/components/Page.tsx (1)

26-30: ⚠️ Potential issue | 🔴 Critical

Fix evaluationType prop union to include "custom" in EvalRunPreviewPageProps.

Page.tsx types evaluationType as "auto" | "human" | "online" (excludes "custom"), but the caller/test defines EvalRunKind with "custom" and passes it via evaluationType={evaluationType}, so type="custom" will fail TS type-checking.

Suggested fix
 interface EvalRunPreviewPageProps {
     runId: string
-    evaluationType: "auto" | "human" | "online"
+    evaluationType: "auto" | "human" | "online" | "custom"
     projectId?: string | null
 }
🧹 Nitpick comments (3)
sdks/python/agenta/sdk/middlewares/running/vault.py (1)

40-60: ⚡ Quick win

Consider extracting the provider mapping to a shared constant.

This mapping is duplicated in at least two places within the SDK (vault.py and daytona.py per your comment on line 44 and the context snippets), which creates a maintenance burden. If a new provider is added or an existing mapping changes, all copies must be updated consistently.

Since both files are in the same SDK package, consider extracting _PROVIDER_ENV_VAR_MAP to a shared constants module (e.g., agenta/sdk/constants/providers.py) and importing it in both locations.

♻️ Example refactor to shared constant

Create agenta/sdk/constants/providers.py:

"""Shared provider-to-environment-variable mappings."""

PROVIDER_ENV_VAR_MAP = {
    "openai": "OPENAI_API_KEY",
    "cohere": "COHERE_API_KEY",
    # ... rest of mapping
    "together_ai": "TOGETHERAI_API_KEY",
    "gemini": "GEMINI_API_KEY",
}

Then in this file:

-_PROVIDER_ENV_VAR_MAP: Dict[str, str] = {
-    "openai": "OPENAI_API_KEY",
-    # ...
-}
+from agenta.sdk.constants.providers import PROVIDER_ENV_VAR_MAP as _PROVIDER_ENV_VAR_MAP

Apply the same import in daytona.py.

web/packages/agenta-playground/src/state/execution/executionRunner.ts (1)

714-742: 💤 Low value

Consider more defensive check for body-level error code.

Line 718 checks bodyStatus.code for truthiness before comparing to 200. While falsy error codes are rare in practice, the condition could miss edge cases like code: 0.

🛡️ Suggested defensive improvement
-        if (bodyStatus && typeof bodyStatus === "object" && bodyStatus.code && bodyStatus.code !== 200) {
+        if (bodyStatus && typeof bodyStatus === "object" && bodyStatus.code !== undefined && bodyStatus.code !== null && bodyStatus.code !== 200) {
api/oss/src/services/email_service.py (1)

47-47: ⚡ Quick win

Use keyword-only parameters for new helpers and keyword dispatch calls.

The new helper signatures and dispatch calls are positional; this is brittle for same-typed parameters and less readable.

Suggested refactor
-def _send_via_smtp(to_email: str, subject: str, html_content: str, from_email: str) -> None:
+def _send_via_smtp(
+    *,
+    # recipients
+    to_email: str,
+    from_email: str,
+    # content
+    subject: str,
+    html_content: str,
+) -> None:
...
-def _send_via_sendgrid(to_email: str, subject: str, html_content: str, from_email: str) -> None:
+def _send_via_sendgrid(
+    *,
+    # recipients
+    to_email: str,
+    from_email: str,
+    # content
+    subject: str,
+    html_content: str,
+) -> None:
...
-            _send_via_smtp(to_email, subject, html_content, from_email)
+            _send_via_smtp(
+                to_email=to_email,
+                from_email=from_email,
+                subject=subject,
+                html_content=html_content,
+            )
...
-            _send_via_sendgrid(to_email, subject, html_content, from_email)
+            _send_via_sendgrid(
+                to_email=to_email,
+                from_email=from_email,
+                subject=subject,
+                html_content=html_content,
+            )

As per coding guidelines, “Prefer keyword-only parameters using * in function signatures” and “Use grouped sections in function signatures/calls with # separators for readability.”

Also applies to: 74-74, 111-114

Source: Coding guidelines


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 0ba10a23-b942-4f39-a634-f94e8af871fe

📥 Commits

Reviewing files that changed from the base of the PR and between 98b8a9d and 6821d0f.

💤 Files with no reviewable changes (3)
  • api/pyproject.toml
  • web/oss/src/components/EvaluationRunsTablePOC/actions/navigationActions.ts
  • api/ee/src/services/db_manager_ee.py

Comment on lines +22 to +25
import sendgrid

_sg = sendgrid.SendGridAPIClient(api_key=env.sendgrid.api_key)
log.info("✓ Email enabled via SendGrid (legacy)")

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify whether sendgrid dependency is still declared in project manifests.
fd -a "pyproject.toml" | while read -r f; do
  echo "==> $f"
  rg -n "sendgrid" "$f" || true
done

Repository: Agenta-AI/agenta

Length of output: 321


🏁 Script executed:

#!/bin/bash
# 1) Find all occurrences of "sendgrid" and highlight likely dependency declarations.
rg -n --hidden --no-ignore-vcs --glob '!*node_modules/*' --glob '!*dist/*' \
  "sendgrid" . \
  || true

# 2) Specifically search common dependency files.
for f in $(fd -a "requirements*.txt" . || true); do echo "==> $f"; rg -n "sendgrid" "$f" || true; done
for f in $(fd -a "poetry.lock" . || true); do echo "==> $f"; rg -n "sendgrid" "$f" || true; done
for f in $(fd -a "Pipfile.lock" . || true); do echo "==> $f"; rg -n "sendgrid" "$f" || true; done
for f in $(fd -a "setup.cfg" . || true); do echo "==> $f"; rg -n "sendgrid" "$f" || true; done
for f in $(fd -a "setup.py" . || true); do echo "==> $f"; rg -n "sendgrid" "$f" || true; done

Repository: Agenta-AI/agenta

Length of output: 12693


🏁 Script executed:

#!/bin/bash
# Read-only: confirm SendGrid package presence in dependency lockfile(s).
rg -n "sendgrid" api/uv.lock || true

Repository: Agenta-AI/agenta

Length of output: 760


🏁 Script executed:

rg -n "sendgrid" api/uv.lock

Repository: Agenta-AI/agenta

Length of output: 760


🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE="api/oss/src/services/email_service.py"

echo "==> File head/tail context (lines 1-170)"
nl -ba "$FILE" | sed -n '1,170p'

echo
echo "==> Key searches"
rg -n "smtplib|SMTP\(|timeout=|_send_via_smtp|_send_via_sendgrid|def send_email|send_email\(|_send_via_sendgrid\(|_send_via_smtp\(" "$FILE"

Repository: Agenta-AI/agenta

Length of output: 162


🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE="api/oss/src/services/email_service.py"

echo "==> File lines 1-200"
awk 'NR>=1 && NR<=200 {printf "%6d:%s\n", NR, $0}' "$FILE"

echo
echo "==> Search hits"
rg -n "smtplib|SMTP\(|timeout=|_send_via_smtp|_send_via_sendgrid|def send_email|send_email\(|def _send_via_smtp|def _send_via_sendgrid" "$FILE"

Repository: Agenta-AI/agenta

Length of output: 4745


Fix email_service network robustness (SMTP timeout, async blocking, formatting) and make SendGrid import safer.

  • api/oss/src/services/email_service.py: _send_via_smtp creates smtplib.SMTP(smtp_host, smtp_port) without a timeout (lines 59 & 64); add timeout=... to avoid hangs.
  • send_email is async but calls _send_via_smtp / _send_via_sendgrid synchronously (lines 113–116), blocking the event loop; run the send in a thread (e.g., asyncio.to_thread) or use async clients.
  • Keyword-only + grouped-call guideline: _send_via_smtp, _send_via_sendgrid, and send_email use positional parameters and positional calls (lines 45–47, 74–76, 89–91, 114–116); switch to * keyword-only params and format calls with #-grouped blocks.
  • Legacy SendGrid client is created at module import time (lines 20–23). sendgrid is present in api/uv.lock (6.12.5), but defer import/client creation so enabling the config can’t break startup.
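
The event-loop point in the list above can be illustrated with a small sketch that moves a blocking send into a worker thread via asyncio.to_thread (available in Python 3.9+). The `_blocking_send` function is a hypothetical stand-in for the synchronous SMTP or SendGrid call, and the signatures here are simplified assumptions rather than the ones in email_service.py.

```python
import asyncio
import threading
import time


def _blocking_send(to_email: str) -> str:
    """Hypothetical stand-in for a synchronous SMTP/SendGrid call."""
    time.sleep(0.05)  # simulate network latency
    # Report which thread did the work so the offloading is observable.
    return threading.current_thread().name


async def send_email(to_email: str) -> str:
    # Run the blocking call in a worker thread so the event loop stays free
    # to serve other requests while the send is in flight.
    return await asyncio.to_thread(_blocking_send, to_email)


async def main() -> str:
    return await send_email("user@example.com")
```

Calling `asyncio.run(main())` returns the name of a worker thread rather than the main thread, which shows the send no longer runs on the event loop's thread.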

Comment on lines +59 to +64
    server = smtplib.SMTP(smtp_host, smtp_port)
    server.ehlo()
    server.starttls()
    server.ehlo()
else:
    server = smtplib.SMTP(smtp_host, smtp_port)

⚠️ Potential issue | 🟠 Major

Add an SMTP connection timeout to prevent request hangs.

api/oss/src/services/email_service.py creates smtplib.SMTP(smtp_host, smtp_port) in both branches without a timeout, so a stalled network connection can block indefinitely.

Suggested fix
-        server = smtplib.SMTP(smtp_host, smtp_port)
+        server = smtplib.SMTP(smtp_host, smtp_port, timeout=10)
...
-        server = smtplib.SMTP(smtp_host, smtp_port)
+        server = smtplib.SMTP(smtp_host, smtp_port, timeout=10)

Also applies to: 67-69
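One way to keep the timeout in a single place for both branches is sketched below. This is an illustration, not the PR's code: the helper `open_smtp`, the constant `SMTP_TIMEOUT_SECONDS`, and the `smtp_factory` parameter are hypothetical, and the 10-second value is simply the one used in the suggestion above.

```python
import smtplib

# Assumed default, taken from the suggested fix above.
SMTP_TIMEOUT_SECONDS = 10


def open_smtp(
    *,
    host: str,
    port: int,
    use_tls: bool,
    timeout: float = SMTP_TIMEOUT_SECONDS,
    smtp_factory=smtplib.SMTP,  # hypothetical seam so tests can pass a fake
):
    """Open an SMTP connection that cannot hang forever on a stalled network."""
    # The timeout covers the connection attempt and later blocking socket calls.
    server = smtp_factory(host, port, timeout=timeout)
    if use_tls:
        server.ehlo()
        server.starttls()
        server.ehlo()
    return server
```

Both the TLS and non-TLS paths then go through the same constructor call, so the timeout cannot be forgotten in one of them.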

Comment on lines +111 to +114
if _USE_SMTP:
    _send_via_smtp(to_email, subject, html_content, from_email)
else:
    _send_via_sendgrid(to_email, subject, html_content, from_email)

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE="api/oss/src/services/email_service.py"
echo "== File = $FILE =="
wc -l "$FILE"
echo

# show the relevant portion around lines 90-150 (covers 111-114)
nl -ba "$FILE" | sed -n '80,160p'

# show definitions/usages of the helper functions to see if they are sync and how they perform network I/O
echo
echo "== Look for _send_via_smtp definition =="
rg -n "def _send_via_smtp" "$FILE"
echo
nl -ba "$FILE" | sed -n '1,240p' | sed -n '1,240p' | head -n 40

echo
echo "== Look for _send_via_sendgrid definition =="
rg -n "def _send_via_sendgrid" "$FILE"

echo
# print near each helper definition block (best-effort: show 40 lines after match)
python3 - <<'PY'
import re, itertools
path="api/oss/src/services/email_service.py"
lines=open(path,'r',encoding='utf-8').read().splitlines()
# find line numbers for defs
for name in ["_send_via_smtp","_send_via_sendgrid","send_email"]:
    for i,l in enumerate(lines, start=1):
        if re.search(rf"def {re.escape(name)}\b", l):
            start=max(1,i-10); end=min(len(lines), i+120)
            print(f"\n== {name} around line {i} (show {start}-{end}) ==")
            for j in range(start,end+1):
                print(f"{j:4d}: {lines[j-1]}")
            break
PY

# run ruff (non-destructive) in api folder per guidelines to see if there are existing issues related to this code
echo
echo "== ruff format/check in api/ (targeting oss tree only if available) =="
ls api
ruff --version || true
ruff format api/oss/src/services/email_service.py
ruff check api/oss/src/services/email_service.py --fix

Repository: Agenta-AI/agenta

Length of output: 216


🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE="api/oss/src/services/email_service.py"
echo "== File = $FILE =="
wc -l "$FILE"
echo

# Line-numbered view of the relevant range (avoid `nl` which isn't available)
python3 - <<'PY'
path="api/oss/src/services/email_service.py"
start=80
end=160
with open(path,'r',encoding='utf-8') as f:
    lines=f.readlines()
for i in range(start, min(end, len(lines))):
    print(f"{i+1:4d}: {lines[i].rstrip()}")
PY

echo
echo "== Locate helper definitions =="
rg -n "def _send_via_smtp|def _send_via_sendgrid|async def send_email" "$FILE" || true

echo
echo "== Dump helper/send_email blocks (best-effort) =="
python3 - <<'PY'
import re
path="api/oss/src/services/email_service.py"
lines=open(path,'r',encoding='utf-8').read().splitlines()
def dump_at(name):
    for idx,l in enumerate(lines, start=1):
        if re.match(rf"\s*def {re.escape(name)}\b", l) or re.match(rf"\s*async def {re.escape(name)}\b", l):
            start=max(1, idx-10); end=min(len(lines), idx+200)
            print(f"\n== {name} starting at line {idx} (show {start}-{end}) ==")
            for j in range(start,end+1):
                print(f"{j:4d}: {lines[j-1]}")
            return
    print(f"\n== {name} not found ==")
for n in ["send_email","_send_via_smtp","_send_via_sendgrid"]:
    dump_at(n)
PY

echo
echo "== Try ruff on this file (if ruff exists) =="
command -v ruff >/dev/null 2>&1 && ruff --version || echo "ruff not found in sandbox"
if command -v ruff >/dev/null 2>&1; then
  ruff format "$FILE" || true
  ruff check "$FILE" --fix || true
fi

Repository: Agenta-AI/agenta

Length of output: 8421


🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE="api/oss/src/services/email_service.py"
python3 - <<'PY'
path="api/oss/src/services/email_service.py"
with open(path,'r',encoding='utf-8') as f:
    lines=f.readlines()
for i,l in enumerate(lines[:80], start=1):
    print(f"{i:4d}: {l.rstrip()}")
PY

Repository: Agenta-AI/agenta

Length of output: 2839


Offload SMTP/SendGrid blocking calls from async send_email

send_email is async, but it calls synchronous SMTP/SendGrid network operations directly at lines 111-114 (smtplib.SMTP(...).sendmail(...) and _sg.send(...)), which can block the event loop under load.

Suggested fix
+import asyncio
...
-        if _USE_SMTP:
-            _send_via_smtp(to_email, subject, html_content, from_email)
-        else:
-            _send_via_sendgrid(to_email, subject, html_content, from_email)
+        if _USE_SMTP:
+            await asyncio.to_thread(
+                _send_via_smtp,
+                to_email=to_email,
+                subject=subject,
+                html_content=html_content,
+                from_email=from_email,
+            )
+        else:
+            await asyncio.to_thread(
+                _send_via_sendgrid,
+                to_email=to_email,
+                subject=subject,
+                html_content=html_content,
+                from_email=from_email,
+            )
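To see why the offload matters, here is a small self-contained sketch, independent of the real email service. `blocking_send` is a hypothetical stand-in for the synchronous SMTP/SendGrid call, and `heartbeat` represents other work on the same event loop. `asyncio.to_thread` requires Python 3.9 or later.

```python
import asyncio
import time


def blocking_send(*, to_email: str, delay: float = 0.2) -> str:
    # Hypothetical stand-in for a synchronous SMTP/SendGrid network call.
    time.sleep(delay)
    return f"sent to {to_email}"


async def heartbeat(ticks: list) -> None:
    # Other coroutines keep running while the send is in progress.
    for _ in range(3):
        ticks.append(time.monotonic())
        await asyncio.sleep(0.02)


async def main():
    ticks: list = []
    # The blocking call runs in a worker thread; the event loop stays free.
    result, _ = await asyncio.gather(
        asyncio.to_thread(blocking_send, to_email="user@example.com"),
        heartbeat(ticks),
    )
    return result, ticks


if __name__ == "__main__":
    print(asyncio.run(main()))
```

If `blocking_send` were called directly inside a coroutine instead, `heartbeat` could not start ticking until the send had returned.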

Comment thread api/oss/src/utils/env.py
or os.getenv("AGENTA_AUTHN_EMAIL_FROM")
or os.getenv("AGENTA_SEND_EMAIL_FROM_ADDRESS")
)
use_tls: bool = os.getenv("SMTP_USE_TLS", "true").lower() in ("true", "1", "yes")

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Use shared truthy parsing for SMTP_USE_TLS for consistent behavior.

SMTP_USE_TLS=on|enabled|y|t is currently treated as false here, while other env booleans in this file accept those values via _TRUTHY.

Suggested fix
-    use_tls: bool = os.getenv("SMTP_USE_TLS", "true").lower() in ("true", "1", "yes")
+    use_tls: bool = (os.getenv("SMTP_USE_TLS") or "true").lower() in _TRUTHY
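The behavior difference can be checked with a short sketch. The contents of `_TRUTHY` below are an assumption based on the values named in this comment; the real set in `env.py` may differ. The helper `env_flag` is hypothetical.

```python
import os

# Assumed contents; the actual _TRUTHY in api/oss/src/utils/env.py may differ.
_TRUTHY = {"true", "1", "t", "y", "yes", "on", "enable", "enabled"}


def env_flag(name: str, *, default: str = "true") -> bool:
    """Parse a boolean environment variable with the shared truthy set."""
    # `or default` also covers a variable that is set but empty.
    return (os.getenv(name) or default).lower() in _TRUTHY


if __name__ == "__main__":
    os.environ["SMTP_USE_TLS"] = "on"
    # The old tuple check ("true", "1", "yes") would treat "on" as false.
    print(env_flag("SMTP_USE_TLS"))
```

Note that using `or "true"` rather than a `getenv` default means an empty `SMTP_USE_TLS=` falls back to the default instead of evaluating to false.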

Comment on lines +64 to +71
export const resetPassword = async (userId: string): Promise<string> => {
    const base = getBaseUrl()
    const url = new URL("api/profile/reset-password", base)
    url.searchParams.set("user_id", userId)
    const data = await fetchJson<string>(url, {
        method: "POST",
    })
    return data

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Locate the generated request type and the corresponding client method.
rg -n -C3 'interface\s+ResetUserPasswordRequest|resetUserPassword|getAgentaSdkClient' web

# Show the generated request shape for this endpoint.
sed -n '1,120p' web/packages/agenta-api-client/src/generated/api/resources/users/client/requests/ResetUserPasswordRequest.ts

Repository: Agenta-AI/agenta

Length of output: 14163


Route resetPassword through the Fern SDK client (no manual URL + fetchJson)

web/oss/src/services/profile/index.ts (lines 64–71) hardcodes api/profile/reset-password and builds user_id via URL.searchParams, bypassing the Fern-generated API boundary and duplicating the generated request contract (ResetUserPasswordRequest { user_id: string }). Replace the manual fetchJson call with the Fern client call (e.g., getAgentaSdkClient({host: getAgentaApiUrl()}).users.resetUserPassword({ user_id: userId })) so transport/typing stay in sync.

Source: Coding guidelines

@GanJiaKouN16 GanJiaKouN16 force-pushed the fix/eval-breadcrumbs-sdk-type branch from 6821d0f to ad6e140 Compare June 6, 2026 15:49

Labels

bug Something isn't working Frontend size:L This PR changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Fix SDK eval breadcrumbs showing Auto Evals

2 participants