Commit 439f471
[evaluation] Fix multi-turn red team attacks broken by PyRIT 0.11 (#46444)
* [evaluation] Fix multi-turn red team attacks broken by PyRIT 0.11
PyRIT 0.11 introduced two bugs in RedTeamingAttack that prevent multi-turn
red team attacks from running:
1. RedTeamingAttack._setup_async() adds prepended_conversation messages to
the adversarial chat memory BEFORE calling set_system_prompt(). The
default PromptChatTarget.set_system_prompt then raises
`RuntimeError: Conversation already exists, system prompt needs to be
set at the beginning`. CrescendoAttack avoids this by embedding context
in the system prompt template, so it does not regress.
2. RedTeamingAttack._generate_next_prompt_async returns
context.next_message directly without calling .duplicate_message().
PromptNormalizer.send_prompt_async then deepcopies the message but
preserves the MessagePiece id, so re-inserting the same id into memory
raises sqlite3.IntegrityError: UNIQUE constraint failed:
PromptMemoryEntries.id. Notably PromptSendingAttack._build_message
uses .duplicate_message() — the intended pattern.
Together these two bugs cause Foundry red team eval runs with the
MultiTurn (and other RedTeamingAttack-based) strategies to silently fail
and surface only baseline results in the UI.
Workaround applied at the SDK level to avoid bumping PyRIT:
- Bug #1: instance-scoped patch of set_system_prompt on the adversarial
chat target. The patched version inserts the system message into memory
via add_message_to_memory when prior messages exist, instead of raising.
Scope is limited to the AzureRAIServiceTarget instance created by the
scan, so no global PyRIT class is mutated.
- Bug #2: module-level monkey-patch of
RedTeamingAttack._generate_next_prompt_async that wraps the returned
message in .duplicate_message(). The patch is idempotent and applied
once at SDK module load.
Verified locally with both a callback target and an Azure OpenAI model
target (gpt-4o-mini): multi-turn attacks now execute the full
conversation loop and produce expected results in
violence_multi_turn_results.jsonl and final_results.json.
Work item: https://msdata.visualstudio.com/Vienna/_workitems/edit/5166253
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Update CHANGELOG for multi-turn PyRIT fix
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Move changelog entry to 1.16.6 and bump version
1.16.5 is already released (2026-04-08). Put this fix under 1.16.6
(Unreleased) to match in-flight version increment PR #46222.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Address Copilot review: version-guard PyRIT 0.11 patches and add unit tests
- Add _is_affected_pyrit_version() check; both patches early-return for pyrit
versions other than 0.11.x so a future fix or signature change isn't masked.
- Add tests/unittests/test_redteam/test_pyrit_workarounds.py covering:
* patch is applied (marker on RedTeamingAttack._generate_next_prompt_async)
* patched method calls .duplicate_message() on the returned message
* None pass-through (no AttributeError)
* idempotent re-application (no double-wrapping)
The set_system_prompt instance patch remains covered by the local E2E reproductions
(callback target + gpt-4o-mini); unit testing it would require constructing an
AzureRAIServiceTarget with full memory plumbing and offers little value beyond E2E.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Strengthen multi-turn E2E test + re-record on PyRIT 0.11
Previous assertion (len(conversation) >= 2) was too weak to catch the PyRIT 0.11 set_system_prompt bug — any attack silently dropped from attack_details still passed. Now require multi_turn attack present + >= 4 messages per multi-turn conversation.
Re-recorded against patched SDK so playback validates the fix.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---------
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>1 parent 481e01f commit 439f471
6 files changed
Lines changed: 254 additions & 5 deletions
File tree
- sdk/evaluation/azure-ai-evaluation
- azure/ai/evaluation
- red_team
- tests
- e2etests
- unittests/test_redteam
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1 | 1 | | |
2 | 2 | | |
3 | | - | |
| 3 | + | |
| 4 | + | |
| 5 | + | |
| 6 | + | |
| 7 | + | |
| 8 | + | |
| 9 | + | |
| 10 | + | |
| 11 | + | |
| 12 | + | |
| 13 | + | |
| 14 | + | |
| 15 | + | |
4 | 16 | | |
5 | 17 | | |
6 | 18 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
2 | 2 | | |
3 | 3 | | |
4 | 4 | | |
5 | | - | |
| 5 | + | |
6 | 6 | | |
Lines changed: 1 addition & 1 deletion
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
3 | 3 | | |
4 | 4 | | |
5 | 5 | | |
6 | | - | |
| 6 | + | |
Lines changed: 117 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
25 | 25 | | |
26 | 26 | | |
27 | 27 | | |
| 28 | + | |
| 29 | + | |
| 30 | + | |
| 31 | + | |
| 32 | + | |
| 33 | + | |
| 34 | + | |
| 35 | + | |
| 36 | + | |
| 37 | + | |
| 38 | + | |
| 39 | + | |
| 40 | + | |
| 41 | + | |
| 42 | + | |
| 43 | + | |
| 44 | + | |
| 45 | + | |
| 46 | + | |
| 47 | + | |
| 48 | + | |
| 49 | + | |
| 50 | + | |
| 51 | + | |
| 52 | + | |
| 53 | + | |
| 54 | + | |
| 55 | + | |
| 56 | + | |
| 57 | + | |
| 58 | + | |
| 59 | + | |
| 60 | + | |
| 61 | + | |
| 62 | + | |
| 63 | + | |
| 64 | + | |
| 65 | + | |
| 66 | + | |
| 67 | + | |
| 68 | + | |
| 69 | + | |
| 70 | + | |
| 71 | + | |
| 72 | + | |
| 73 | + | |
| 74 | + | |
| 75 | + | |
| 76 | + | |
| 77 | + | |
| 78 | + | |
| 79 | + | |
| 80 | + | |
| 81 | + | |
| 82 | + | |
| 83 | + | |
| 84 | + | |
| 85 | + | |
| 86 | + | |
| 87 | + | |
| 88 | + | |
| 89 | + | |
| 90 | + | |
| 91 | + | |
| 92 | + | |
| 93 | + | |
| 94 | + | |
| 95 | + | |
| 96 | + | |
| 97 | + | |
| 98 | + | |
| 99 | + | |
| 100 | + | |
| 101 | + | |
| 102 | + | |
| 103 | + | |
| 104 | + | |
| 105 | + | |
| 106 | + | |
| 107 | + | |
| 108 | + | |
| 109 | + | |
| 110 | + | |
| 111 | + | |
| 112 | + | |
| 113 | + | |
| 114 | + | |
| 115 | + | |
| 116 | + | |
| 117 | + | |
| 118 | + | |
| 119 | + | |
| 120 | + | |
| 121 | + | |
| 122 | + | |
| 123 | + | |
| 124 | + | |
| 125 | + | |
| 126 | + | |
| 127 | + | |
| 128 | + | |
| 129 | + | |
| 130 | + | |
| 131 | + | |
| 132 | + | |
| 133 | + | |
| 134 | + | |
| 135 | + | |
| 136 | + | |
| 137 | + | |
| 138 | + | |
28 | 139 | | |
29 | 140 | | |
30 | 141 | | |
| |||
1765 | 1876 | | |
1766 | 1877 | | |
1767 | 1878 | | |
| 1879 | + | |
| 1880 | + | |
| 1881 | + | |
| 1882 | + | |
| 1883 | + | |
| 1884 | + | |
1768 | 1885 | | |
1769 | 1886 | | |
1770 | 1887 | | |
| |||
Lines changed: 21 additions & 2 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
910 | 910 | | |
911 | 911 | | |
912 | 912 | | |
| 913 | + | |
| 914 | + | |
| 915 | + | |
| 916 | + | |
| 917 | + | |
| 918 | + | |
| 919 | + | |
| 920 | + | |
| 921 | + | |
| 922 | + | |
| 923 | + | |
913 | 924 | | |
914 | 925 | | |
915 | 926 | | |
916 | | - | |
917 | | - | |
| 927 | + | |
| 928 | + | |
| 929 | + | |
| 930 | + | |
| 931 | + | |
| 932 | + | |
| 933 | + | |
| 934 | + | |
| 935 | + | |
| 936 | + | |
918 | 937 | | |
919 | 938 | | |
920 | 939 | | |
| |||
Lines changed: 101 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
| 1 | + | |
| 2 | + | |
| 3 | + | |
| 4 | + | |
| 5 | + | |
| 6 | + | |
| 7 | + | |
| 8 | + | |
| 9 | + | |
| 10 | + | |
| 11 | + | |
| 12 | + | |
| 13 | + | |
| 14 | + | |
| 15 | + | |
| 16 | + | |
| 17 | + | |
| 18 | + | |
| 19 | + | |
| 20 | + | |
| 21 | + | |
| 22 | + | |
| 23 | + | |
| 24 | + | |
| 25 | + | |
| 26 | + | |
| 27 | + | |
| 28 | + | |
| 29 | + | |
| 30 | + | |
| 31 | + | |
| 32 | + | |
| 33 | + | |
| 34 | + | |
| 35 | + | |
| 36 | + | |
| 37 | + | |
| 38 | + | |
| 39 | + | |
| 40 | + | |
| 41 | + | |
| 42 | + | |
| 43 | + | |
| 44 | + | |
| 45 | + | |
| 46 | + | |
| 47 | + | |
| 48 | + | |
| 49 | + | |
| 50 | + | |
| 51 | + | |
| 52 | + | |
| 53 | + | |
| 54 | + | |
| 55 | + | |
| 56 | + | |
| 57 | + | |
| 58 | + | |
| 59 | + | |
| 60 | + | |
| 61 | + | |
| 62 | + | |
| 63 | + | |
| 64 | + | |
| 65 | + | |
| 66 | + | |
| 67 | + | |
| 68 | + | |
| 69 | + | |
| 70 | + | |
| 71 | + | |
| 72 | + | |
| 73 | + | |
| 74 | + | |
| 75 | + | |
| 76 | + | |
| 77 | + | |
| 78 | + | |
| 79 | + | |
| 80 | + | |
| 81 | + | |
| 82 | + | |
| 83 | + | |
| 84 | + | |
| 85 | + | |
| 86 | + | |
| 87 | + | |
| 88 | + | |
| 89 | + | |
| 90 | + | |
| 91 | + | |
| 92 | + | |
| 93 | + | |
| 94 | + | |
| 95 | + | |
| 96 | + | |
| 97 | + | |
| 98 | + | |
| 99 | + | |
| 100 | + | |
| 101 | + | |
0 commit comments