-The agent ships with a tool call order evaluator that verifies the ReAct node calls `get_current_time` **before** `web_search` when given a time-dependent query.
+The agent ships with a tool call order evaluator that verifies the ReAct node calls `get_current_time` **before** `get_weather` when given a time-and-weather query, and an LLM judge that checks weather output for semantic similarity.
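For readers unfamiliar with this kind of check, the ordering rule described above can be sketched in a few lines of Python. This is not the evaluator's actual implementation: the helper name `called_before` and the idea of representing a run as a flat list of tool names are assumptions made for illustration only.

```python
from typing import Sequence


def called_before(tool_calls: Sequence[str], first: str, second: str) -> bool:
    """Return True if both tools were called and `first` precedes `second`.

    `tool_calls` is a hypothetical flat trace of tool names in call order.
    """
    if first not in tool_calls or second not in tool_calls:
        # A missing call counts as a failure in this sketch.
        return False
    # Compare the positions of the earliest call to each tool.
    return tool_calls.index(first) < tool_calls.index(second)


trace = ["get_current_time", "get_weather"]
print(called_before(trace, "get_current_time", "get_weather"))  # True
```

Treating a missing call as a failure is a design choice of this sketch; a real evaluator might report it separately from an ordering violation.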
template/evaluations/eval-sets/evaluation-set-default.json (14 additions & 21 deletions)
@@ -4,37 +4,30 @@
   "name": "Default Evaluation Set",
   "evaluatorRefs": [
     "evaluator-llm-judge-output",
-    "evaluator-tool-call-order"
+    "evaluator-tool-call-order",
+    "evaluator-tool-call-arguments",
+    "evaluator-tool-call-count"
   ],
   "evaluations": [
     {
       "id": "ada5a2c1-976c-470b-964f-eb70a5e61eb4",
-      "name": "Current time tool should be called before web search",
+      "name": "Weather in Paris",
       "inputs": {
-        "query": "search the web using current time as input"
-      },
-      "evaluationCriterias": {
-        "evaluator-tool-call-order": null
-      },
-      "createdAt": "2026-03-23T17:10:17.394Z",
-      "updatedAt": "2026-03-23T17:10:40.659Z"
-    },
-    {
-      "id": "cf113553-5055-4cbf-a97f-e3801b05acd8",
-      "name": "UiPath details",
-      "inputs": {
-        "query": "tell me about UiPath"
+        "query": "Is it good weather for a walk in Paris?"
       },
       "evaluationCriterias": {
         "evaluator-llm-judge-output": {
-          "expectedOutput": "{\n \"response\": \"# UiPath Overview: **UiPath** is a global software company and a leading platform for **Robotic Process Automation (RPA)** and **Agentic Automation**. Here are the key details:\\\\n\\\\n## Company Background\\\\n- **Founded:** 2005 by Daniel Dines and Marius Tirca\\\\n- **Headquarters:** New York City, United States\\\\n- **Status:** Public company\\\\n- **Funding:** Raised $1.96B from investors like Accel, Coatue, and Alkeon Capital Management\\\\n\\\\n## What It Does\\\\nUiPath specializes in automating repetitive, routine business tasks by using software robots that mimic human actions across digital systems. The platform helps organizations:\\\\n- Automate end-to-end business processes\\\\n- Reduce manual work and improve efficiency\\\\n- Enhance enterprise digital transformation\\\\n\\\\n## Core Offerings\\\\n- **RPA (Robotic Process Automation):** Build, manage, and deploy bots to automate repetitive tasks\\\\n- **AI Integration:** Advanced AI capabilities through collaborations with NVIDIA, OpenAI, Google, Microsoft, and Snowflake\\\\n- **UiPath Studio Web:** Browser-based tool for building cross-platform automations with templates and collaboration features\\\\n- **Agentic Automation:** AI agents that can orchestrate workflows with robots and human effort\\\\n\\\\n## Business Model Evolution\\\\n- **Shifting to consumption-based pricing:** Moving from traditional \\\"per-user seat\\\" licensing to \\\"pay-per-action\\\" models\\\\n- This reflects the changing value proposition as AI agents become more capable\\\\n\\\\n## Recent Developments\\\\n- Announced acquisition of **WorkFusion** (a pioneer in AI agents for financial crime compliance)\\\\n- Expanding AI capabilities and partnerships with major tech companies\\\\n- Offering globally recognized certification programs for automation expertise\\\\n\\\\nUiPath is positioned as one of the leading enterprise automation platforms for digital transformation\"\n}"
-        }
+          "expectedOutput": "{\n\"response\": \"Yes, it's quite good weather for a walk in Paris! The current conditions are:**Temperature**: 18°C (64°F) - mild and pleasant spring weather **Wind**: 12 km/h (7 mph) - light, manageable breeze **Conditions**: Partly cloudy - perfect for a walk without being too sunny or too gray\"\n}"
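Because this change both extends `evaluatorRefs` and rewrites the per-evaluation `evaluationCriterias`, a quick consistency check can be useful when reviewing it. The sketch below assumes that every key under `evaluationCriterias` is expected to appear in `evaluatorRefs`; that rule is inferred from the shape of the JSON above, not from any documented schema, and `undeclared_evaluators` is a hypothetical helper.

```python
def undeclared_evaluators(eval_set: dict) -> set:
    """Return evaluator ids used in evaluationCriterias but absent from evaluatorRefs."""
    declared = set(eval_set.get("evaluatorRefs", []))
    used = set()
    for evaluation in eval_set.get("evaluations", []):
        # evaluationCriterias maps an evaluator id to its criteria (possibly null).
        used.update(evaluation.get("evaluationCriterias") or {})
    return used - declared


# Trimmed-down stand-in for the evaluation set shown in the diff.
example = {
    "evaluatorRefs": [
        "evaluator-llm-judge-output",
        "evaluator-tool-call-order",
        "evaluator-tool-call-arguments",
        "evaluator-tool-call-count",
    ],
    "evaluations": [
        {
            "id": "ada5a2c1-976c-470b-964f-eb70a5e61eb4",
            "evaluationCriterias": {"evaluator-llm-judge-output": {}},
        }
    ],
}
print(undeclared_evaluators(example))  # set()
```

In practice the dictionary would come from `json.load` on the evaluation set file; an empty result means every referenced evaluator is declared.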