Commit 214c831

Add SK AzureAIAgent notebook (#274)
* add notebook * fix unused arg * clean nb
1 parent d691e81 commit 214c831

1 file changed

Lines changed: 312 additions & 0 deletions

# Evaluate Semantic Kernel Azure AI Agents in Azure AI Foundry

## Objective

This sample demonstrates how to evaluate an AI agent (Azure AI Agent Service) on these important aspects of your agentic workflow:

- Intent Resolution: Measures how well the agent identifies the user's request, including how well it scopes the user's intent, asks clarifying questions, and reminds end users of its scope of capabilities.
- Tool Call Accuracy: Evaluates the agent's ability to select the appropriate tools and pass correct parameters from previous steps.
- Task Adherence: Measures how well the agent's response adheres to its assigned tasks, according to its system message and prior steps.

## Time

You can expect to complete this sample in approximately 20 minutes.

## Prerequisites

### Packages

- `semantic-kernel` installed (`pip install semantic-kernel`)
- `azure-ai-evaluation` SDK installed

Before running the sample:

```bash
pip install semantic-kernel azure-ai-projects azure-identity azure-ai-evaluation
```

### Azure Resources

- An Azure OpenAI resource with a deployment configured
- An Azure AI Foundry project

### Environment Variables

- For **Foundry Agent service**:
  - **`AZURE_AI_AGENT_ENDPOINT`** – Endpoint of your Azure AI Foundry project.
  - **`AZURE_AI_AGENT_MODEL_DEPLOYMENT_NAME`** – Deployment name of the model used by the Foundry Agent.

- For **evaluating agents**:
  - **`AZURE_OPENAI_ENDPOINT`** – Azure OpenAI endpoint used for evaluation.
  - **`AZURE_OPENAI_API_KEY`** – Azure OpenAI API key used for evaluation.
  - **`AZURE_OPENAI_CHAT_DEPLOYMENT_NAME`** – Deployment name of the chat model used for evaluation.
  - **`AZURE_OPENAI_API_VERSION`** – Azure OpenAI API version used for evaluation (e.g., `2024-05-01-preview`).

- For **Azure AI Foundry** (Bonus):
  - **`AZURE_AI_AGENT_ENDPOINT`** – Endpoint of your Azure AI Foundry project.
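As a quick sanity check before running the sample, you can verify that the environment variables listed above are set. A minimal sketch using only the standard library (`missing_vars` is a hypothetical helper, not part of any SDK):

```python
import os

REQUIRED_VARS = [
    "AZURE_AI_AGENT_ENDPOINT",
    "AZURE_AI_AGENT_MODEL_DEPLOYMENT_NAME",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_CHAT_DEPLOYMENT_NAME",
    "AZURE_OPENAI_API_VERSION",
]


def missing_vars(required, env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


missing = missing_vars(REQUIRED_VARS)
if missing:
    print("Missing environment variables:", ", ".join(missing))
```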

### Create an Azure AI Agent with a plugin - [reference](https://learn.microsoft.com/en-us/semantic-kernel/frameworks/agent/agent-types/azure-ai-agent?pivots=programming-language-python)

```python
from typing import Annotated

from azure.identity import DefaultAzureCredential

from semantic_kernel.agents import AzureAIAgent, AzureAIAgentSettings
from semantic_kernel.functions import kernel_function


# Define a sample plugin for the sample
class MenuPlugin:
    """A sample Menu Plugin used for the concept sample."""

    @kernel_function(description="Provides a list of specials from the menu.")
    def get_specials(self) -> Annotated[str, "Returns the specials from the menu."]:
        return """
        Special Soup: Clam Chowder
        Special Salad: Cobb Salad
        Special Drink: Chai Tea
        """

    @kernel_function(description="Provides the price of the requested menu item.")
    def get_item_price(
        self, menu_item: Annotated[str, "The name of the menu item."]
    ) -> Annotated[str, "Returns the price of the menu item."]:
        _ = menu_item  # This is just to simulate a function that uses the input.
        return "$9.99"


# Create an agent
creds = DefaultAzureCredential()
project_client = AzureAIAgent.create_client(credential=creds)

deployment_name = AzureAIAgentSettings().model_deployment_name
agent_definition = await project_client.agents.create_agent(
    model=deployment_name,
    name="Host",
    instructions="Answer questions about the menu.",
)

agent = AzureAIAgent(
    client=project_client,
    definition=agent_definition,
    plugins=[MenuPlugin()],
)
```
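Because the plugin methods are ordinary Python, they can be sanity-checked before being attached to the agent. A standalone sketch with the `@kernel_function` decorator omitted so it runs without `semantic-kernel` installed (`PlainMenuPlugin` is a hypothetical local variant, not part of the sample):

```python
class PlainMenuPlugin:
    """Decorator-free version of MenuPlugin, for local testing only."""

    def get_specials(self) -> str:
        return """
        Special Soup: Clam Chowder
        Special Salad: Cobb Salad
        Special Drink: Chai Tea
        """

    def get_item_price(self, menu_item: str) -> str:
        _ = menu_item  # simulated lookup: every item costs the same here
        return "$9.99"


plugin = PlainMenuPlugin()
assert "Clam Chowder" in plugin.get_specials()
assert plugin.get_item_price("Chai Tea") == "$9.99"
```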

### Invoke the agent

```python
USER_INPUTS = [
    "Hello",
    "What is the special soup?",
    "What is the special drink?",
    "How much is it?",
    "Thank you",
]

thread = None
for user_input in USER_INPUTS:
    print(f"## User: {user_input}")
    response = await agent.get_response(messages=user_input, thread=thread)
    print(f"## {response.name}: {response.content}")
    thread = response.thread
```

### Converter: Get data from agent

```python
from azure.ai.evaluation import AIAgentConverter
from azure.ai.projects import AIProjectClient

# Print the thread ID for reference
print(thread.id)

# The AIAgentConverter requires a sync project client
ai_agent_settings = AzureAIAgentSettings()
sync_project_client = AIProjectClient(
    endpoint=ai_agent_settings.endpoint,
    credential=DefaultAzureCredential(),
)

converter = AIAgentConverter(sync_project_client)

file_name = "evaluation_data.jsonl"
# Save the agent thread data to a JSONL file (all turns)
evaluation_data = converter.prepare_evaluation_data([thread.id], filename=file_name)
# print(json.dumps(evaluation_data, indent=4))
len(evaluation_data)  # number of turns in the thread
```
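Each line in the resulting `evaluation_data.jsonl` is one JSON object per conversation turn. The exact schema is determined by `AIAgentConverter`; the row below is a simplified, hypothetical illustration of how such a line can be parsed and inspected:

```python
import json

# Simplified, hypothetical shape of one converted turn; real rows from
# AIAgentConverter carry richer message and tool-call structure.
sample_line = json.dumps({
    "query": "What is the special soup?",
    "response": "The special soup is Clam Chowder.",
    "tool_calls": [{"name": "get_specials", "arguments": {}}],
})

row = json.loads(sample_line)
print(sorted(row))  # ['query', 'response', 'tool_calls']
print(len(row["tool_calls"]))  # 1
```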

### Setting up evaluators

We will select the following evaluators to assess the different aspects relevant for agent quality:

- [Intent resolution](https://aka.ms/intentresolution-sample): measures the extent to which an agent identifies the correct intent from a user query. Scale: integer 1-5. Higher is better.
- [Tool call accuracy](https://aka.ms/toolcallaccuracy-sample): evaluates the agent's ability to select the appropriate tools and pass correct parameters from previous steps. Scale: float 0-1. Higher is better.
- [Task adherence](https://aka.ms/taskadherence-sample): measures the extent to which an agent's final response adheres to the task based on its system message and a user query. Scale: integer 1-5. Higher is better.

```python
from pprint import pprint

from azure.ai.evaluation import (
    AzureOpenAIModelConfiguration,
    IntentResolutionEvaluator,
    TaskAdherenceEvaluator,
    ToolCallAccuracyEvaluator,
)

from semantic_kernel.connectors.ai.open_ai import AzureOpenAISettings

azure_openai_settings = AzureOpenAISettings()
if not azure_openai_settings.endpoint:
    raise ValueError("Azure OpenAI endpoint is not set in the environment variables.")
if not azure_openai_settings.api_key:
    raise ValueError("Azure OpenAI API key is not set in the environment variables.")
if not azure_openai_settings.chat_deployment_name:
    raise ValueError("Azure OpenAI chat deployment name is not set in the environment variables.")


model_config = AzureOpenAIModelConfiguration(
    azure_endpoint=str(azure_openai_settings.endpoint),
    api_key=azure_openai_settings.api_key.get_secret_value(),
    api_version=azure_openai_settings.api_version,
    azure_deployment=azure_openai_settings.chat_deployment_name,
)

intent_resolution = IntentResolutionEvaluator(model_config=model_config)
tool_call_accuracy = ToolCallAccuracyEvaluator(model_config=model_config)
task_adherence = TaskAdherenceEvaluator(model_config=model_config)
```
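Alongside the numeric score, these evaluators also report a pass/fail verdict against a configurable threshold (by default 3 on the 1-5 scales, to the best of our knowledge). A minimal sketch of that mapping, with a hypothetical `to_result` helper:

```python
def to_result(score: float, threshold: float = 3.0) -> str:
    """Map a quality score to the pass/fail verdict style the evaluators report."""
    return "pass" if score >= threshold else "fail"


print(to_result(4))  # pass
print(to_result(2))  # fail
```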

### Run the evaluators

```python
from azure.ai.evaluation import evaluate

response = evaluate(
    data=file_name,
    evaluators={
        "tool_call_accuracy": tool_call_accuracy,
        "intent_resolution": intent_resolution,
        "task_adherence": task_adherence,
    },
    azure_ai_project=ai_agent_settings.endpoint,
)
pprint(f"AI Foundry URL: {response.get('studio_url')}")
```

## Inspect results on Azure AI Foundry

Open the AI Foundry URL printed above to inspect the evaluation scores and the evaluators' reasoning in a rich visualization, which makes it easy to identify and fix issues in your agent.
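The `response` returned by `evaluate` can also be inspected locally; it includes per-row scores (under `rows`) and aggregate metrics. The sketch below averages a metric across rows whose keys follow the `outputs.<evaluator>.<field>` naming; the data and the `average` helper are illustrative, not part of the SDK:

```python
from statistics import mean

# Illustrative rows; real ones would come from response["rows"].
rows = [
    {"outputs.intent_resolution.intent_resolution": 5,
     "outputs.task_adherence.task_adherence": 4},
    {"outputs.intent_resolution.intent_resolution": 3,
     "outputs.task_adherence.task_adherence": 5},
]


def average(rows, key):
    """Average a metric across rows, skipping rows where it is missing."""
    values = [row[key] for row in rows if key in row]
    return mean(values) if values else None


print(average(rows, "outputs.intent_resolution.intent_resolution"))
```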
