refactor(eval): split progress reporter into strategy-based reporting package #1711

Open
Chibionos wants to merge 1 commit into feat/extract-uipath-eval-package from refactor/eval-reporting-strategy

@Chibionos

Summary

Reimplements the strategy-pattern reporting refactor from #1040 (closed as stale — it predated the monorepo migration and was 540 commits behind) on current main. Stacked on #1710 (uipath-eval package extraction); will be retargeted to main once that merges.

The 1475-line _progress_reporter.py monolith threaded is_coded booleans through every method to switch between the legacy and coded StudioWeb evaluation APIs. Those differences now live in dedicated strategy classes.

Design

packages/uipath/src/uipath/_cli/_evals/_reporting/:

| Module | Responsibility |
| --- | --- |
| _strategy_protocol.py | EvalReportingStrategy protocol: endpoint suffix, ID conversion, eval snapshot shape, result collection, update payload |
| _legacy_strategy.py | Legacy API: GUID ids (deterministic uuid5 for strings), assertionRuns + evaluatorScores, no path segment |
| _coded_strategy.py | Coded API: string ids unchanged, evaluatorRuns + scores, coded/ path segment |
| _strategies.py | Strategy selection (strategy_for, is_coded_evaluators) |
| _reporter.py | StudioWebProgressReporter: event handling, HTTP plumbing, per-execution state, resume flow |
| _models.py / _utils.py | Shared models (status enum, progress item, agent snapshot) and helpers (usage extraction, GUID conversion, error decorator) |
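
To make the split concrete, here is a minimal sketch of how a protocol plus two strategy classes could replace an is_coded flag. It is not the code in this PR: the protocol is reduced to two members, and the method name build_update_payload, the evalRunId key, the build_url helper, and the URL layout are all assumptions made for illustration. Only the payload key names and the coded/ path segment come from the table above.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class EvalReportingStrategy(Protocol):
    """Simplified, hypothetical protocol; the real one has more members."""

    endpoint_suffix: str

    def build_update_payload(
        self, eval_run_id: str, runs: list[dict[str, Any]], scores: list[dict[str, Any]]
    ) -> dict[str, Any]: ...


class LegacyStrategy:
    endpoint_suffix = ""  # legacy API: no extra path segment

    def build_update_payload(self, eval_run_id, runs, scores):
        # Key names taken from the PR description; "evalRunId" is assumed.
        return {"evalRunId": eval_run_id, "assertionRuns": runs, "evaluatorScores": scores}


class CodedStrategy:
    endpoint_suffix = "coded/"  # coded API: extra path segment

    def build_update_payload(self, eval_run_id, runs, scores):
        return {"evalRunId": eval_run_id, "evaluatorRuns": runs, "scores": scores}


def build_url(base: str, strategy: EvalReportingStrategy) -> str:
    """Hypothetical URL builder: the strategy alone decides the path segment."""
    return f"{base}/{strategy.endpoint_suffix}evalRun"


if __name__ == "__main__":
    for strategy in (LegacyStrategy(), CodedStrategy()):
        print(build_url("https://example.invalid/api", strategy))
        print(sorted(strategy.build_update_payload("run-1", [], [])))
```

With this shape the reporter never branches on a boolean; it asks whichever strategy it holds for the suffix and the payload.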

_progress_reporter.py remains as a compatibility shim re-exporting the public surface, so existing importers (cli_eval.py, tests, anything external reaching into _cli) are unaffected.
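
As an illustration of the shim idea, the sketch below builds two throwaway in-memory modules and shows that an import through the old path yields the very same class object as the new location. The module names reporting_demo and progress_reporter_demo are hypothetical stand-ins, and the real shim presumably re-exports more names than this one.

```python
import sys
import types

# Stand-in for the new _reporting package (hypothetical, reduced to one class).
reporting = types.ModuleType("reporting_demo")


class StudioWebProgressReporter:
    """Placeholder for the relocated reporter class."""


reporting.StudioWebProgressReporter = StudioWebProgressReporter
sys.modules["reporting_demo"] = reporting

# What a shim module body might contain: re-exports only, no logic.
SHIM_SOURCE = """
from reporting_demo import StudioWebProgressReporter

__all__ = ["StudioWebProgressReporter"]
"""

shim = types.ModuleType("progress_reporter_demo")
exec(SHIM_SOURCE, shim.__dict__)
sys.modules["progress_reporter_demo"] = shim

# An existing importer keeps its old import path and gets the same object.
from progress_reporter_demo import StudioWebProgressReporter as OldPathReporter

assert OldPathReporter is reporting.StudioWebProgressReporter
```

Because the shim holds no logic of its own, identity checks, isinstance checks, and monkeypatching through either path keep targeting the same class.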

Mixed coded/legacy eval sets keep their existing behavior: results are collected by both strategies (each skips evaluators it doesn't own) and the active strategy shapes the update payload.
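
The mixed-set rule can be sketched as follows. This is an assumption-laden illustration, not the PR's implementation: the EvaluatorResult dataclass, its is_coded flag, the owns and collect methods, and the evaluatorId field are invented here. Only the payload key names come from the description.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class EvaluatorResult:
    evaluator_id: str
    score: float
    is_coded: bool  # hypothetical ownership marker


class LegacyStrategy:
    scores_key = "evaluatorScores"

    def owns(self, result: EvaluatorResult) -> bool:
        return not result.is_coded

    def collect(self, results: list[EvaluatorResult]) -> list[dict[str, Any]]:
        # Skip evaluators this strategy does not own.
        return [
            {"evaluatorId": r.evaluator_id, "score": r.score}
            for r in results
            if self.owns(r)
        ]


class CodedStrategy(LegacyStrategy):
    scores_key = "scores"

    def owns(self, result: EvaluatorResult) -> bool:
        return result.is_coded


def build_update_payload(results, active, strategies) -> dict[str, Any]:
    """Every strategy collects its own results; the active one names the key."""
    collected: list[dict[str, Any]] = []
    for strategy in strategies:
        collected.extend(strategy.collect(results))
    return {active.scores_key: collected}


if __name__ == "__main__":
    mixed = [EvaluatorResult("legacy-eval", 1.0, False), EvaluatorResult("coded-eval", 0.5, True)]
    print(build_update_payload(mixed, CodedStrategy(), [LegacyStrategy(), CodedStrategy()]))
```

No result is dropped or duplicated, since each evaluator is owned by exactly one strategy, while the outer payload shape follows the active API.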

Behavior preservation

  • The existing 61-test test_progress_reporter.py suite passes without a single modification — payloads, endpoints, GUID conversion, resume flow, and env handling are byte-identical.
  • 24 new strategy unit tests cover ID conversion determinism, snapshot shapes, payload key differences, and protocol conformance.
  • Full package suite: 1213 tests pass; ruff, mypy (src + tests), and the custom httpx linter are clean.
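
For readers unfamiliar with the ID conversion being tested, a deterministic string-to-GUID mapping can be sketched with the standard library. The function names and the choice of uuid.NAMESPACE_DNS are assumptions; the PR does not state which namespace the real helper uses.

```python
import uuid


def legacy_id(value: str) -> str:
    """Return a GUID string: pass real GUIDs through, hash anything else.

    Assumption: NAMESPACE_DNS is used only for this example.
    """
    try:
        return str(uuid.UUID(value))
    except ValueError:
        # uuid5 is a name-based hash, so the same input always maps to the same GUID.
        return str(uuid.uuid5(uuid.NAMESPACE_DNS, value))


def coded_id(value: str) -> str:
    """Coded API: string ids are sent unchanged."""
    return value


if __name__ == "__main__":
    print(legacy_id("my-evaluator") == legacy_id("my-evaluator"))
    print(coded_id("my-evaluator"))
```

A determinism test then only needs to call the conversion twice with the same string and compare the results, and check that an existing GUID passes through unchanged.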

🤖 Generated with Claude Code

… package

Reimplements the design from #1040 (closed as stale) on current main:
the 1475-line _progress_reporter.py monolith threaded is_coded booleans
through every method to switch between the legacy and coded StudioWeb
evaluation APIs. The differences (endpoint routing, GUID conversion,
eval snapshot shape, result collection format, update payload keys) now
live in strategy classes under _cli/_evals/_reporting/:

- _strategy_protocol.py: EvalReportingStrategy protocol
- _legacy_strategy.py: GUID ids, assertionRuns, no path segment
- _coded_strategy.py: string ids, evaluatorRuns, coded/ segment
- _reporter.py: event handling, HTTP plumbing, per-execution state
- _models.py / _utils.py / _strategies.py: shared pieces + selection

_progress_reporter.py remains as a compatibility shim. Behavior is
unchanged: the existing 61-test progress reporter suite passes without
modification; 24 new strategy unit tests added.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings June 11, 2026 06:23
@github-actions (bot) added the test:uipath-langchain (triggers tests in the uipath-langchain-python repository) and test:uipath-integrations labels Jun 11, 2026
@Chibionos Chibionos requested a review from mjnovice June 11, 2026 06:23

Copilot AI left a comment

Pull request overview

Refactors StudioWeb evaluation progress reporting by extracting the legacy-vs-coded API differences into a strategy-based _reporting/ package, while keeping _progress_reporter.py as a compatibility shim so existing CLI imports continue to work.

Changes:

  • Adds strategy protocol + legacy/coded strategy implementations and strategy selection helpers under uipath/_cli/_evals/_reporting/.
  • Moves the StudioWebProgressReporter implementation into _reporting/_reporter.py and re-exports the previous public surface from _progress_reporter.py.
  • Adds unit tests for strategy behavior and bumps the uipath package version to 2.10.83.

Reviewed changes

Copilot reviewed 11 out of 12 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| packages/uipath/uv.lock | Bumps locked uipath version to 2.10.83. |
| packages/uipath/pyproject.toml | Bumps package version to 2.10.83. |
| packages/uipath/tests/cli/eval/test_reporting_strategies.py | Adds unit tests covering strategy selection, ID conversion, snapshot/payload shapes. |
| packages/uipath/src/uipath/_cli/_evals/_reporting/__init__.py | Defines the new reporting package exports (strategies, models, reporter, helpers). |
| packages/uipath/src/uipath/_cli/_evals/_reporting/_strategy_protocol.py | Introduces EvalReportingStrategy protocol describing API-shaping responsibilities. |
| packages/uipath/src/uipath/_cli/_evals/_reporting/_strategies.py | Adds singleton strategies + strategy_for / is_coded_evaluators selection utilities. |
| packages/uipath/src/uipath/_cli/_evals/_reporting/_coded_strategy.py | Implements coded API behavior (string IDs, coded/ routing, evaluatorRuns/scores). |
| packages/uipath/src/uipath/_cli/_evals/_reporting/_legacy_strategy.py | Implements legacy API behavior (GUID conversion, assertionRuns/evaluatorScores). |
| packages/uipath/src/uipath/_cli/_evals/_reporting/_models.py | Extracts shared models (status enum, progress item, agent snapshot). |
| packages/uipath/src/uipath/_cli/_evals/_reporting/_utils.py | Extracts shared helpers (error decorator, deterministic GUID, usage extraction, env parsing). |
| packages/uipath/src/uipath/_cli/_evals/_reporting/_reporter.py | New StudioWebProgressReporter implementation using the strategy package. |
| packages/uipath/src/uipath/_cli/_evals/_progress_reporter.py | Compatibility shim re-exporting the prior public API from _reporting. |

Comment on lines +562 to +571

    # Check if we already have an eval_run_id cached
    existing_eval_run_id = self.eval_run_ids.get(payload.execution_id)

    if existing_eval_run_id:
        # Already have eval_run_id (from previous fetch or creation)
        logger.info(
            f"Using cached eval_run_id={existing_eval_run_id} for execution_id={payload.execution_id} "
            f"(skipping backend fetch/create)"
        )
        return

Comment on lines +1121 to +1125

    Args:
        eval_set_id: The ID of the eval set
        eval_set_run_id: The ID of the eval set run
        evaluation_id: Optional evaluation ID to filter for a specific eval run
        is_coded: Whether this is a coded evaluation (vs legacy)