
Commit 34511d9

feat: add support for input overrides for multimodal evals (#1101)

Authored by AAgnihotry and claude
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
1 parent 13adc91

16 files changed: 2805 additions & 4 deletions

Lines changed: 206 additions & 0 deletions
@@ -0,0 +1,206 @@
---
allowed-tools: Read, Write, Bash, Glob, Grep, Edit
description: Create a new integration test case following the testcases/ pattern
argument-hint: <test-name> <description>
---

I'll help you create a new integration test case following the established pattern in the `testcases/` directory, similar to the `eval-input-overrides` example from PR #1101.

## Understanding the Test Structure

Based on the existing test pattern, each integration test should have:

- `run.sh` - Main test execution script
- `pyproject.toml` - Python dependencies
- `entry-points.json` - Entry point configuration
- `uipath.json` - UiPath configuration
- `src/` directory containing:
  - Evaluation set JSON files
  - Input/configuration JSON files
  - `assert.py` - Validation script

## Step 1: Gather Information

I need to understand what you're testing. Please provide:

1. **Test Name**: A descriptive name for your test (e.g., "eval-multimodal-inputs")
2. **Test Purpose**: What feature or scenario are you testing?
3. **Evaluation Set**: What evaluations will run?
4. **Expected Behavior**: What should the test verify?

Let me check the existing testcases structure:

!ls -1 testcases/

## Step 2: Create Test Directory Structure

Based on your test name `${test-name}`, I'll create:

```bash
testcases/${test-name}/
├── run.sh
├── pyproject.toml
├── entry-points.json
├── uipath.json
└── src/
    ├── eval-set.json
    ├── config.json (if needed)
    └── assert.py
```

Let me read the reference implementation to understand the pattern:

!cat testcases/eval-input-overrides/run.sh
!cat testcases/eval-input-overrides/pyproject.toml
!cat testcases/eval-input-overrides/src/assert.py

## Step 3: Create the Test Files

I'll create each file following the established pattern:

### 1. run.sh - Test Execution Script

```bash
#!/bin/bash
set -e

echo "Syncing dependencies..."
uv sync

echo ""
echo "Running ${test-name} integration test..."
echo ""

# Create output directory
mkdir -p __uipath

# Run evaluations
uv run uipath eval main src/eval-set.json \
  --no-report \
  --output-file __uipath/output.json

echo ""
echo "Test completed! Verifying results..."
echo ""

# Run assertion script to verify results
uv run python src/assert.py

echo ""
echo "${test-name} integration test completed successfully!"
```

### 2. pyproject.toml - Dependencies

```toml
[project]
name = "${test-name}"
version = "0.1.0"
requires-python = ">=3.11"
dependencies = [
    "uipath>=2.4.0",
]

[build-system]
requires = ["setuptools>=61.0"]
build-backend = "setuptools.build_meta"
```

### 3. entry-points.json - Entry Points Configuration

```json
{
    "main": "src/main.json"
}
```

### 4. uipath.json - UiPath Configuration

```json
{
    "name": "${test-name}",
    "version": "1.0.0"
}
```

### 5. src/eval-set.json - Evaluation Set

(You'll need to provide the specific evaluation configuration)

### 6. src/assert.py - Validation Script

```python
"""Assertions for ${test-name} testcase."""
import json
import os


def main() -> None:
    """Main assertion logic."""
    output_file = "__uipath/output.json"

    assert os.path.isfile(output_file), (
        f"Evaluation output file '{output_file}' not found"
    )
    print(f"✓ Found evaluation output file: {output_file}")

    with open(output_file, "r", encoding="utf-8") as f:
        output_data = json.load(f)

    print("✓ Loaded evaluation output")

    # Add your specific assertions here
    assert "evaluationSetResults" in output_data

    evaluation_results = output_data["evaluationSetResults"]
    assert len(evaluation_results) > 0, "No evaluation results found"

    print(f"✓ Found {len(evaluation_results)} evaluation result(s)")

    # Add test-specific validations

    print("\n✅ All assertions passed!")


if __name__ == "__main__":
    main()
```

## Step 4: Make run.sh Executable

!chmod +x testcases/${test-name}/run.sh

## Step 5: Test the Integration Test

Let's validate the test runs correctly:

!cd testcases/${test-name} && ./run.sh

## Step 6: Add to Documentation

Consider documenting your test in the project README or test documentation:

- What scenario it tests
- How to run it manually
- What it validates

---

## Summary

Your new integration test `${test-name}` has been created following the established pattern:

- **Directory Structure**: Matches the testcases/ pattern
- **Dependencies**: Configured in pyproject.toml
- **Test Script**: run.sh with proper error handling
- **Assertions**: Validation logic in assert.py
- **Configuration**: UiPath and entry points configured

## Next Steps

1. **Customize** the eval-set.json with your specific test data
2. **Update** assert.py with test-specific validations
3. **Run** the test: `cd testcases/${test-name} && ./run.sh`
4. **Document** the test purpose and usage
5. **Commit** the new test to version control

## Tips

- Keep tests focused on a single feature or scenario
- Use descriptive evaluation names in eval-set.json
- Add clear assertion messages for debugging
- Follow the existing echo statement pattern for progress output
- Ensure all JSON files are properly formatted

Need help customizing any specific part of the test? Just ask!
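The validation logic in the assert.py template can be smoke-tested without running a real evaluation by pointing the same checks at a synthetic output file. A minimal sketch — only the `evaluationSetResults` key comes from the template above; the per-result fields (`evalId`, `score`) are hypothetical:

```python
import json
import os
import tempfile

# Synthetic output shaped like the file assert.py expects. The fields
# inside each result ("evalId", "score") are invented for this demo.
fake_output = {"evaluationSetResults": [{"evalId": "eval-1", "score": 1.0}]}

with tempfile.TemporaryDirectory() as tmp:
    output_file = os.path.join(tmp, "output.json")
    with open(output_file, "w", encoding="utf-8") as f:
        json.dump(fake_output, f)

    # Same checks as the template's main()
    assert os.path.isfile(output_file), (
        f"Evaluation output file '{output_file}' not found"
    )
    with open(output_file, "r", encoding="utf-8") as f:
        output_data = json.load(f)
    assert "evaluationSetResults" in output_data
    results = output_data["evaluationSetResults"]
    assert len(results) > 0, "No evaluation results found"
    print(f"Found {len(results)} evaluation result(s)")
```

This keeps the assertion script honest before wiring it into run.sh, where the real `__uipath/output.json` is produced by `uipath eval`.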

pyproject.toml

Lines changed: 1 addition & 1 deletion
```diff
@@ -1,6 +1,6 @@
 [project]
 name = "uipath"
-version = "2.4.21"
+version = "2.4.22"
 description = "Python SDK and CLI for UiPath Platform, enabling programmatic interaction with automation services, process management, and deployment tools."
 readme = { file = "README.md", content-type = "text/markdown" }
 requires-python = ">=3.11"
```
Lines changed: 80 additions & 0 deletions
@@ -0,0 +1,80 @@
```python
"""Utility functions for applying input overrides to evaluation inputs."""

import copy
import logging
from typing import Any

logger = logging.getLogger(__name__)


def deep_merge(base: dict[str, Any], override: dict[str, Any]) -> dict[str, Any]:
    """Recursively merge override into base dictionary.

    Args:
        base: The base dictionary to merge into
        override: The override dictionary to merge from

    Returns:
        A new dictionary with overrides recursively merged into base
    """
    result = copy.deepcopy(base)
    for key, value in override.items():
        if key in result and isinstance(result[key], dict) and isinstance(value, dict):
            # Recursively merge nested dicts
            result[key] = deep_merge(result[key], value)
        else:
            # Direct replacement for non-dict or new keys
            result[key] = value
    return result


def apply_input_overrides(
    inputs: dict[str, Any],
    input_overrides: dict[str, Any],
    eval_id: str | None = None,
) -> dict[str, Any]:
    """Apply input overrides to inputs using direct field override.

    Format: Per-evaluation overrides (keys are evaluation IDs):
        {"eval-1": {"operator": "*"}, "eval-2": {"a": 100}}

    Deep merge is supported for nested objects:
        - {"filePath": {"ID": "new-id"}} - deep merges inputs["filePath"] with {"ID": "new-id"}

    Args:
        inputs: The original inputs dictionary
        input_overrides: Dictionary mapping evaluation IDs to their override values
        eval_id: The evaluation ID (required)

    Returns:
        A new dictionary with overrides applied
    """
    if not input_overrides:
        return inputs

    if not eval_id:
        logger.warning(
            "eval_id not provided, cannot apply input overrides. Input overrides require eval_id."
        )
        return inputs

    result = copy.deepcopy(inputs)

    # Check if there are overrides for this specific eval_id
    if eval_id not in input_overrides:
        logger.debug(f"No overrides found for eval_id='{eval_id}'")
        return result

    overrides_to_apply = input_overrides[eval_id]
    logger.debug(f"Applying overrides for eval_id='{eval_id}': {overrides_to_apply}")

    # Apply direct field overrides with recursive deep merge
    for key, value in overrides_to_apply.items():
        if key in result and isinstance(result[key], dict) and isinstance(value, dict):
            # Recursive deep merge for dict values
            result[key] = deep_merge(result[key], value)
        else:
            # Direct replacement for non-dict or new keys
            result[key] = value

    return result
```

src/uipath/_cli/_evals/_runtime.py

Lines changed: 14 additions & 2 deletions
```diff
@@ -67,6 +67,7 @@
 from .._utils._eval_set import EvalHelpers
 from .._utils._parallelization import execute_parallel
 from ._configurable_factory import ConfigurableRuntimeFactory
+from ._eval_util import apply_input_overrides
 from ._evaluator_factory import EvaluatorFactory
 from ._models._evaluation_set import (
     EvaluationItem,
@@ -262,6 +263,7 @@ class UiPathEvalContext:
     verbose: bool = False
     enable_mocker_cache: bool = False
     report_coverage: bool = False
+    input_overrides: dict[str, Any] | None = None
     model_settings_id: str = "default"


@@ -524,7 +526,10 @@ async def _execute_eval(
                 ),
             )
             agent_execution_output = await self.execute_runtime(
-                eval_item, execution_id, runtime
+                eval_item,
+                execution_id,
+                runtime,
+                input_overrides=self.context.input_overrides,
             )
         except Exception as e:
             if self.context.verbose:
@@ -759,6 +764,7 @@ async def execute_runtime(
         eval_item: EvaluationItem,
         execution_id: str,
         runtime: UiPathRuntimeProtocol,
+        input_overrides: dict[str, Any] | None = None,
     ) -> UiPathEvalRunExecutionOutput:
         log_handler = self._setup_execution_logging(execution_id)
         attributes = {
@@ -785,8 +791,14 @@ async def execute_runtime(

         start_time = time()
         try:
+            # Apply input overrides to inputs if configured
+            inputs_with_overrides = apply_input_overrides(
+                eval_item.inputs,
+                input_overrides or {},
+                eval_id=eval_item.id,
+            )
             result = await execution_runtime.execute(
-                input=eval_item.inputs,
+                input=inputs_with_overrides,
             )
         except Exception as e:
             end_time = time()
```

src/uipath/_cli/cli_eval.py

Lines changed: 10 additions & 0 deletions
```diff
@@ -1,6 +1,7 @@
 import ast
 import asyncio
 import os
+from typing import Any

 import click
 from uipath.core.tracing import UiPathTraceManager
@@ -113,6 +114,12 @@ def setup_reporting_prereq(no_report: bool) -> bool:
     default=20,
     help="Maximum concurrent LLM requests (default: 20)",
 )
+@click.option(
+    "--input-overrides",
+    cls=LiteralOption,
+    default="{}",
+    help='Input field overrides per evaluation ID: \'{"eval-1": {"operator": "*"}, "eval-2": {"a": 100}}\'. Supports deep merge for nested objects.',
+)
 def eval(
     entrypoint: str | None,
     eval_set: str | None,
@@ -126,6 +133,7 @@ def eval(
     model_settings_id: str,
     trace_file: str | None,
     max_llm_concurrency: int,
+    input_overrides: dict[str, Any],
 ) -> None:
     """Run an evaluation set against the agent.
@@ -141,6 +149,7 @@ def eval(
         model_settings_id: Model settings ID to override agent settings
         trace_file: File path where traces will be written in JSONL format
         max_llm_concurrency: Maximum concurrent LLM requests
+        input_overrides: Input field overrides mapping (direct field override with deep merge)
     """
     set_llm_concurrency(max_llm_concurrency)

@@ -178,6 +187,7 @@ def eval(
     eval_context.eval_ids = eval_ids
     eval_context.report_coverage = report_coverage
     eval_context.model_settings_id = model_settings_id
+    eval_context.input_overrides = input_overrides

     try:
```

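The flag arrives as a string on the command line; given the `ast` import above, `LiteralOption` presumably parses it via `ast.literal_eval` into the per-evaluation dict that `eval_context.input_overrides` receives — that parsing mechanism is an assumption here. A sketch of the step under that assumption:

```python
import ast

# What a user would pass on the command line:
#   uipath eval ... --input-overrides '{"eval-1": {"operator": "*"}, "eval-2": {"a": 100}}'
raw = '{"eval-1": {"operator": "*"}, "eval-2": {"a": 100}}'

# ast.literal_eval evaluates only Python literals, so the string becomes
# a dict without executing arbitrary code.
input_overrides = ast.literal_eval(raw)

print(input_overrides["eval-1"])
# → {'operator': '*'}

# The option's default of "{}" parses to an empty mapping, i.e. no overrides.
print(ast.literal_eval("{}"))
# → {}
```

The outer keys are evaluation IDs, matching the `eval_id` lookup performed by `apply_input_overrides` at runtime.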
0 commit comments