* removed the last message check
* bumped the version
* simplified the condition check
* updated the message
---------
Co-authored-by: Ali Mahmoudzadeh <amah@microsoft.com>
assets/evaluators/builtin/coherence/spec.yaml: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 type: "evaluator"
 name: "builtin.coherence"
-version: 7
+version: 8
 displayName: "Coherence-Evaluator"
 description: "Evaluates how logically connected and consistent the response is. Ensures ideas flow naturally and make sense together. It’s best used for generative business writing such as summarizing meeting notes, creating marketing materials, and drafting emails."
assets/evaluators/builtin/customer_satisfaction/spec.yaml: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 type: "evaluator"
 name: "builtin.customer_satisfaction"
-version: 10
+version: 11
 displayName: "Customer-Satisfaction-Evaluator"
 description: "Evaluates the predicted customer satisfaction level of an AI agent interaction on a 1-5 Likert scale. This evaluator assesses whether the agent's response would likely result in a satisfied customer based on helpfulness, completeness, tone, and resolution of the user's needs. Useful for measuring customer support quality, chatbot effectiveness, and overall user experience."
assets/evaluators/builtin/groundedness/spec.yaml: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 type: "evaluator"
 name: "builtin.groundedness"
-version: 12
+version: 13
 displayName: "Groundedness-Evaluator"
 description: "Assesses whether the response stays true to the given context in a retrieval-augmented generation scenario. It’s best used for retrieval-augmented generation (RAG) scenarios, including question and answering and summarization. Use the groundedness metric when you need to verify that ai-generated responses align with and are validated by the provided context."
assets/evaluators/builtin/task_adherence/spec.yaml: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 type: "evaluator"
 name: "builtin.task_adherence"
-version: 11
+version: 12
 displayName: "Task-Adherence-Evaluator-(Preview)"
 description: "Evaluates whether the agent completed the task within the confines of the instructions given to the agentic system. Higher scores indicate better compliance with the instructions. This evaluator is useful when useful for end-to-end system-level task evaluation for agents. Example outputs include actions such as updating a database and textual responses such as writing a report."
description: "Evaluates whether an AI agent successfully completed the requested task end to end by analyzing the conversation history and agent response to determine if all task requirements were met, ignoring rule adherence or intent understanding. This evaluator is useful for assessing agent effectiveness in task-oriented scenarios, workflow automation, and goal-oriented AI interactions."