Commit 68314c5

w-javed and Copilot authored
Backport version 1.16.8 (#47029)
* Bump azure-ai-evaluation version to 1.16.8

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Add 1.16.8 and 1.16.9 (Unreleased) changelog sections for azure-ai-evaluation

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Bump azure-ai-evaluation version to 1.16.9

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
1 parent 218656c commit 68314c5

2 files changed

Lines changed: 18 additions & 8 deletions

sdk/evaluation/azure-ai-evaluation/CHANGELOG.md

Lines changed: 17 additions & 7 deletions
@@ -1,6 +1,22 @@
 # Release History

-## 1.16.7 (Unreleased)
+## 1.16.9 (Unreleased)
+
+### Breaking Changes
+
+- Updated `EVALUATOR_NAME_METRICS_MAPPINGS` so `document_retrieval` and `rouge_score` report single primary metrics (`document_retrieval`, `rouge`), with previous sub-metrics now represented in each evaluator's `*_properties` payload.
+
+### Bugs Fixed
+
+- Fixed `format_llm_response` raising `UnboundLocalError` when `inputs` was not provided by ensuring `sample_input` is always initialized.
+
+## 1.16.8 (2026-05-19)
+
+### Features Added
+
+- App Insights logging now forwards arbitrary evaluator-specific keys from each event's `properties` payload as a single `gen_ai.evaluation.properties` JSON attribute (carried inside `internal_properties`). Previously only the four red-team keys (`attack_success`, `attack_technique`, `attack_complexity`, `attack_success_threshold`) were forwarded; structured outputs such as rubric `dimension_scores` were silently dropped. Payloads larger than 7500 characters are replaced with a valid JSON marker (`{"truncated": true, "original_size_bytes": <n>}`) so consumers can always `json.loads` the value. Non-dict `properties` payloads are now safely ignored instead of raising in the red-team forwarder.
+
+## 1.16.7 (2026-05-07)

 ### Features Added

@@ -9,11 +25,6 @@
 - Added `status` field (`"completed"`, `"error"`, `"skipped"`) on evaluation result items to indicate evaluator execution outcome.
 - Added `skipped` and `errored` counts to `result_counts` and `per_testing_criteria_results` in AOAI evaluation summaries.
 - Added `skipped` to `ResultCount` and `skipped`/`errored` to `PerTestingCriteriaResult` typed contracts.
-- App Insights logging now forwards arbitrary evaluator-specific keys from each event's `properties` payload as a single `gen_ai.evaluation.properties` JSON attribute (carried inside `internal_properties`). Previously only the four red-team keys (`attack_success`, `attack_technique`, `attack_complexity`, `attack_success_threshold`) were forwarded; structured outputs such as rubric `dimension_scores` were silently dropped. Payloads larger than 7500 characters are replaced with a valid JSON marker (`{"truncated": true, "original_size_bytes": <n>}`) so consumers can always `json.loads` the value. Non-dict `properties` payloads are now safely ignored instead of raising in the red-team forwarder.
-
-### Breaking Changes
-
-- Updated `EVALUATOR_NAME_METRICS_MAPPINGS` so `document_retrieval` and `rouge_score` report single primary metrics (`document_retrieval`, `rouge`), with previous sub-metrics now represented in each evaluator's `*_properties` payload.

 ### Bugs Fixed

@@ -27,7 +38,6 @@
 - Fixed `_get_metric_result` prefix matching where shorter metric names (e.g., `xpia`) could match before longer, more-specific ones (e.g., `xpia_manipulated_content`). Now sorts by length descending for correct longest-prefix matching.
 - Fixed non-dict `_properties` values from evaluators causing downstream issues. Values that are not dicts are now logged and dropped gracefully.
 - Fixed filename length error in `_inline_image` by catching OSError/ValueError during local path resolution and fall back to returning a text chunk instead of throwing.
-- Fixed `format_llm_response` raising `UnboundLocalError` when `inputs` was not provided by ensuring `sample_input` is always initialized.

 ### Other Changes

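Note on the `_get_metric_result` entry in the changelog above: the longest-prefix rule it describes can be illustrated with a small standalone sketch. This is not the SDK's code. The function name `match_longest_prefix` and its signature are hypothetical and exist only to show why sorting candidates by length, descending, prevents `xpia` from shadowing `xpia_manipulated_content`.

```python
from typing import Iterable, Optional


def match_longest_prefix(name: str, known_metrics: Iterable[str]) -> Optional[str]:
    """Return the longest known metric that is a prefix of `name`, or None.

    Hypothetical helper for illustration only; not part of azure-ai-evaluation.
    """
    # Try longer candidates first so that a more specific metric wins over a
    # shorter one that shares the same leading characters.
    for metric in sorted(known_metrics, key=len, reverse=True):
        if name.startswith(metric):
            return metric
    return None


metrics = ["xpia", "xpia_manipulated_content"]
print(match_longest_prefix("xpia_manipulated_content_score", metrics))  # xpia_manipulated_content
print(match_longest_prefix("xpia_score", metrics))  # xpia
```

Without the descending sort, iteration order would decide the outcome, and the first example could resolve to `xpia`.
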
sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_version.py

Lines changed: 1 addition & 1 deletion
@@ -3,4 +3,4 @@
 # ---------------------------------------------------------
 # represents upcoming version

-VERSION = "1.16.7"
+VERSION = "1.16.9"

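Note on the 1.16.8 changelog entry in this commit: it says oversized `properties` payloads are replaced with a JSON marker so that consumers can always call `json.loads` on the value. The sketch below shows one way that behavior could look. It is an assumption-laden illustration, not the SDK's implementation. The function name `serialize_properties`, the constant name, returning `None` for non-dict input, and the way `original_size_bytes` is computed are all hypothetical. Only the 7500-character limit and the marker's key names come from the changelog.

```python
import json
from typing import Any, Optional

# Limit quoted in the changelog entry; the constant name is hypothetical.
MAX_PROPERTIES_CHARS = 7500


def serialize_properties(properties: Any) -> Optional[str]:
    """Serialize a properties payload to JSON, or to a truncation marker.

    Hypothetical helper for illustration only; not part of azure-ai-evaluation.
    """
    # Assumption: non-dict payloads are ignored by returning None.
    if not isinstance(properties, dict):
        return None
    # json.dumps escapes non-ASCII characters by default, so the character
    # count of the result equals its UTF-8 byte count.
    payload = json.dumps(properties, default=str)
    if len(payload) > MAX_PROPERTIES_CHARS:
        # Emit a small, valid JSON document rather than a cut-off string.
        return json.dumps({"truncated": True, "original_size_bytes": len(payload)})
    return payload


small = serialize_properties({"dimension_scores": {"clarity": 4}})
large = serialize_properties({"blob": "x" * 8000})
print(json.loads(small))
print(json.loads(large))
```

Replacing the whole payload with a marker, instead of slicing the string at the limit, is what keeps the attribute parseable: a sliced JSON string would usually be invalid.
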