assets/evaluators/builtin/coherence/spec.yaml
+1 -1 (1 addition & 1 deletion)
@@ -1,6 +1,6 @@
 type: "evaluator"
 name: "builtin.coherence"
-version: 9
+version: 10
 displayName: "Coherence-Evaluator"
 description: "Evaluates how logically connected and consistent the response is. Ensures ideas flow naturally and make sense together. It’s best used for generative business writing such as summarizing meeting notes, creating marketing materials, and drafting emails."
assets/evaluators/builtin/fluency/spec.yaml
+1 -1 (1 addition & 1 deletion)
@@ -1,6 +1,6 @@
 type: "evaluator"
 name: "builtin.fluency"
-version: 8
+version: 9
 displayName: "Fluency-Evaluator"
 description: "Evaluates how natural and grammatically correct the response sounds. Higher scores indicate smoother and clearer language. It’s best used for generative business writing such as summarizing meeting notes, creating marketing materials, and drafting email."
assets/evaluators/builtin/groundedness/spec.yaml
+1 -1 (1 addition & 1 deletion)
@@ -1,6 +1,6 @@
 type: "evaluator"
 name: "builtin.groundedness"
-version: 14
+version: 15
 displayName: "Groundedness-Evaluator"
 description: "Assesses whether the response stays true to the given context in a retrieval-augmented generation scenario. It’s best used for retrieval-augmented generation (RAG) scenarios, including question and answering and summarization. Use the groundedness metric when you need to verify that ai-generated responses align with and are validated by the provided context."
assets/evaluators/builtin/retrieval/spec.yaml
+1 -1 (1 addition & 1 deletion)
@@ -1,6 +1,6 @@
 type: "evaluator"
 name: "builtin.retrieval"
-version: 10
+version: 11
 displayName: "Retrieval-Evaluator"
 description: "Measures how effectively the system retrieves relevant data or content. Higher scores mean better recall of useful information. It’s best used for the quality of search in information retrieval and retrieval augmented generation, when you don't have ground truth for chunk retrieval rankings. Use the retrieval score when you want to assess to what extent the context chunks retrieved are highly relevant and ranked at the top for answering your users' queries."
assets/evaluators/builtin/similarity/spec.yaml
+1 -1 (1 addition & 1 deletion)
@@ -1,6 +1,6 @@
 type: "evaluator"
 name: "builtin.similarity"
-version: 5
+version: 6
 displayName: "Similarity-Evaluator"
 description: "Measures how closely two pieces of text resemble each other in meaning. Higher scores indicate greater semantic similarity. It’s best used for NLP tasks with a user query. Use it when you want an objective evaluation of an AI model's performance, particularly in text generation tasks where you have access to ground truth responses."