Commit 7cde607

Copilot Studio Response Analysis Tool (#423)
* New solution Copilot Tests Added * New solution Copilot Tests Added * New solution Copilot Tests Added * New solution Copilot Tests Added * New solution Copilot Tests Added * New solution Copilot Tests Added * New solution - Copilot Response Tests * New solution - Copilot Response Tests * New solution - Copilot Response Tests * New solution - Copilot Response Tests * New solution - Copilot Response Tests * New solution - Copilot Response Tests * New solution - Copilot Response Tests * New solution - Copilot Response Tests * New solution - Copilot Response Tests * New solution - Copilot Response Tests * New solution - Copilot Response Tests * New solution - Copilot Response Tests * New solution - Copilot Response Tests * New solution - Copilot Response Analysis Tool.
1 parent 5d47efc commit 7cde607

18 files changed

Lines changed: 935 additions & 0 deletions
Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,4 @@
1+
COPILOTSTUDIOAGENT__ENVIRONMENTID=5e1afabf-33ce-eea5-9573-9ca969f96a0f
2+
COPILOTSTUDIOAGENT__SCHEMANAME=cr048_agent83vyYB
3+
COPILOTSTUDIOAGENT__TENANTID=8a235459-3d2c-415d-8c1e-e2fe133509ad
4+
COPILOTSTUDIOAGENT__AGENTAPPID=c2c612c5-7331-4382-9849-831869ca2af4
Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
1+
{}
Lines changed: 193 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,193 @@
1+
# Copilot Studio Response Analysis Tool :computer:
2+
3+
## Purpose:
4+
Provide a **lightweight, developer-friendly tool** to:
5+
- Measure Copilot Agent **response-time performance** and correlate it with output size/tokens. :telescope:
6+
- Get **metrics** (Mean, Median, Max, Min, Standard Deviation) and **visual charts** to understand Copilot Agent response time trends and variability. :movie_camera:
7+
- Aggregate **real-time metrics, charts and tables** to spot spikes, drift, and outliers across a single conversation. :bar_chart:
8+
- **Trace planner steps, tool invocations, and arguments** to view and validate dynamic plan composition. :computer:
9+
- Each planner step includes **Thought, Tool, and Arguments**, which together explain why the agent chose a path and how it executed it.
10+
- **Compare planner steps across queries** to highlight differences in tool calls and reasoning.
11+
- Automatically **generates a CSV file** containing all queries, their responses, and corresponding response times. :floppy_disk:
12+
13+
## Interpretation:
14+
15+
### 1. Statistics Tab :mag_right:
16+
Provides an overview of Copilot Agent performance metrics, including response time summaries (Mean, Median, Max, Min), variability (Standard Deviation), token correlation, and visual charts for trends and distribution.
17+
18+
<p align="center">
19+
<img src="img/Screen-Statistics.png" alt="StartTestRun">
20+
<br>
21+
</p>
22+
23+
#### <center>*Response Statistics:*</center> :chart_with_upwards_trend:
24+
| Metric | Description | Purpose |
25+
| :------- | :---------- | :---------- |
26+
| **Mean** | The average response time across all responses. | Gives an overall sense of typical performance but can be skewed by very high or low values.
27+
| **Median** | The middle value when all response times are arranged in ascending order. | Represents the central tendency and is less affected by outliers than the mean. Useful for understanding the “typical” response time.
28+
| **Max** | The longest response time recorded during the test run. | Highlights the worst-case scenario for latency, which is critical for identifying performance bottlenecks.
29+
| **Min** | The shortest response time recorded during the test run. | Shows the best-case performance and can indicate the system’s potential under optimal conditions.
30+
| **Standard Deviation** | Measures how much response times vary from the average. | Helps assess consistency—low SD means stable performance, high SD indicates fluctuating response times.
31+
| **Token Correlation** | The correlation between response time and the number of tokens in the response. | Indicates orchestrator efficiency: a strong positive correlation suggests that slower responses are associated with longer outputs.
32+
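The metrics in the table above can be reproduced with the Python standard library. The following is a minimal sketch, not the tool's actual implementation: the function names `summarize` and `pearson` and the sample data are hypothetical, and it assumes the sample (not population) standard deviation and a Pearson coefficient for the token correlation.

```python
import math
import statistics


def summarize(times):
    """Return the summary statistics from the table for a list of response times."""
    return {
        "mean": statistics.mean(times),
        "median": statistics.median(times),
        "max": max(times),
        "min": min(times),
        # Sample standard deviation (assumption: the tool may use the population form).
        "stdev": statistics.stdev(times) if len(times) > 1 else 0.0,
    }


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # undefined for constant data; returning 0.0 is a convention of this sketch
    return cov / (spread_x * spread_y)


# Hypothetical sample data: response times in seconds and response token counts.
response_times = [2.0, 4.0, 6.0]
token_counts = [100, 200, 300]

stats = summarize(response_times)
token_correlation = pearson(response_times, token_counts)
print(stats)
print(round(token_correlation, 3))
```

A coefficient close to 1 would suggest that latency grows with output size, while a value near 0 would point to other causes of slow responses.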
33+
#### <center>*Response Time Analysis:*</center> :chart_with_downwards_trend:
34+
| Chart | Description |
35+
| :------- | :---------- |
36+
| **Line Chart** | Shows how Copilot Agent response time changes across individual queries to identify spikes or trends. |
37+
| **Box Plot** | Summarizes overall response time distribution, highlighting consistency and outliers for performance benchmarking. |
38+
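The outliers a box plot highlights are commonly derived from the interquartile range (IQR). The sketch below shows that convention with the standard library; the 1.5 x IQR fences, the function name `box_plot_summary`, and the sample data are assumptions for illustration and may differ from how the tool's charting library draws its plot.

```python
import statistics


def box_plot_summary(times):
    """Compute quartiles, IQR fences and outliers for a list of response times."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, median, q3 = statistics.quantiles(times, n=4)
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [t for t in times if t < lower_fence or t > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "lower_fence": lower_fence,
        "upper_fence": upper_fence,
        "outliers": outliers,
    }


# Hypothetical response times in seconds; the last value is a latency spike.
sample_times = [1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 9.0]
summary = box_plot_summary(sample_times)
print(summary["outliers"])
```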
39+
### 2. Data Tab :calendar:
40+
Displays detailed per-query information, including the user’s prompt, Copilot Agent response, response time, output size, and step-by-step planner actions with tools and arguments—used for debugging and performance analysis.
41+
42+
#### Query Response / Time Data:
43+
A per‑query ledger linking the user prompt, the agent’s full reply, its latency, and output size to diagnose performance.
44+
45+
| Metric | Description |
46+
| :------- | :---------- |
47+
| **Serial** | Sequence number of the query in the run. |
48+
| **Query** | Exact user prompt/utterance. |
49+
| **Response** | Agent’s generated reply (full text). |
50+
| **Min** | Latency to produce the response (typically in seconds or ms). |
51+
| **Char** | Output length indicator (character count). |
52+
53+
<p align="center">
54+
<img src="img/Screen-Data-Query.png" alt="Screen-Data">
55+
<br>
56+
</p>
57+
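A per-query ledger like the one above can be built by timing each request and recording the response length. This is a minimal sketch under stated assumptions: `send` stands in for whatever call delivers an utterance to the agent, `fake_send` is a placeholder for testing, and the column name `ResponseTimeSec` is hypothetical (the tool's own column names may differ).

```python
import csv
import io
import time

# Hypothetical column names, loosely following the table above.
FIELDS = ["Serial", "Query", "Response", "ResponseTimeSec", "Char"]


def run_queries(send, queries):
    """Send each query, timing the call and recording the response length."""
    rows = []
    for serial, query in enumerate(queries, start=1):
        start = time.perf_counter()
        response = send(query)
        elapsed = time.perf_counter() - start
        rows.append({
            "Serial": serial,
            "Query": query,
            "Response": response,
            "ResponseTimeSec": round(elapsed, 4),
            "Char": len(response),
        })
    return rows


def write_csv(rows, stream):
    """Write the ledger rows to any text stream (file or in-memory buffer)."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


def fake_send(query):
    """Placeholder for the real agent call; simply echoes the query."""
    return "Echo: " + query


rows = run_queries(fake_send, ["How do I reset my account password?"])
buffer = io.StringIO()
write_csv(rows, buffer)
print(buffer.getvalue())
```

Using `time.perf_counter()` rather than `time.time()` avoids distortion from system clock adjustments when measuring short intervals.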
58+
#### LLM Planner Steps Data:
59+
A step‑by‑step trace of the agent’s planning, tools, and arguments that explains how each response was produced.
60+
61+
| Metric | Description |
62+
| :------- | :---------- |
63+
| **Serial** | Sequence number matching the query above. |
64+
| **Query** | Exact user prompt/utterance. |
65+
| **PlannerStep** | Named step decided by orchestrator. |
66+
| **Thought** | Model’s internal reasoning summary for the step (high-level). |
67+
| **Tool** | Action or connector invoked. |
68+
| **Arguments** | Parameters passed / received. |
69+
70+
<p align="center">
71+
<img src="img/Screen-Data-Planner.png" alt="Screen-Data">
72+
<br>
73+
</p>
74+
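To compare planner steps across queries, the rows of the table above can be modelled as records and grouped by query. The sketch below is illustrative only: the `PlannerStep` class, the `tools_by_query` helper, and the step, tool and argument values are hypothetical and do not reflect the tool's internal data model.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class PlannerStep:
    """One row of the planner-steps table (field names are assumptions)."""
    serial: int
    query: str
    planner_step: str
    thought: str
    tool: str
    arguments: dict = field(default_factory=dict)


def tools_by_query(steps):
    """Group the invoked tools by (serial, query), preserving step order."""
    grouped = defaultdict(list)
    for step in steps:
        grouped[(step.serial, step.query)].append(step.tool)
    return dict(grouped)


# Hypothetical trace: tool names and arguments are made up for illustration.
steps = [
    PlannerStep(1, "What's the status of INC0007001?", "LookupIncident",
                "The user asks about an incident.", "IncidentConnector",
                {"number": "INC0007001"}),
    PlannerStep(1, "What's the status of INC0007001?", "Summarize",
                "Summarize the incident record.", "Summarizer"),
    PlannerStep(2, "How do I recover a locked account?", "SearchKnowledge",
                "Look up account recovery guidance.", "KnowledgeSearch",
                {"topic": "locked account"}),
]

grouped = tools_by_query(steps)
for (serial, query), tools in grouped.items():
    print(serial, query, "->", tools)
```

Grouping by the serial number as well as the query text keeps repeated utterances in the same run apart, so their plans can be compared side by side.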
75+
## Prerequisite:
76+
77+
To set up this sample, you will need the following:
78+
79+
1. [Python](https://www.python.org/) version 3.9 or higher
80+
2. An Agent Created in Microsoft Copilot Studio or access to an existing Agent.
81+
3. Ability to create an application identity in Azure for a Public Client/Native App registration, or access to an existing Public Client/Native App registration with the `CopilotStudio.Copilots.Invoke` API permission assigned.
82+
83+
## Authentication:
84+
85+
The Copilot Studio Response Analysis Tool requires a User Token to operate. For this sample, we are using a user interactive flow to get the user token for the application ID created above. Other flows are allowed.
86+
87+
> [!Important]
88+
> The token is cached on the user's machine in `.local_token_cache.json`
89+
90+
## Step 1. Create an Agent in Copilot Studio.
91+
92+
1. Create an Agent in [Copilot Studio](https://copilotstudio.microsoft.com)
93+
1. Publish your newly created Copilot
94+
2. Go to Settings => Advanced => Metadata and copy the following values. You will need them later:
95+
1. Schema name
96+
2. Environment Id
97+
98+
## Step 2. Create an Application Registration in Entra ID.
99+
100+
This step will require permissions to create application identities in your Azure tenant. For this sample, you will create a Native Client Application Identity, which does not have secrets.
101+
102+
1. Open https://portal.azure.com
103+
2. Navigate to Entra ID
104+
3. Create a new App Registration in Entra ID
105+
1. Provide a Name
106+
2. Choose "Accounts in this organization directory only"
107+
3. In the "Select a Platform" list, choose "Public Client/native (mobile & desktop)"
108+
4. In the Redirect URI url box, type in `http://localhost` (**note: use HTTP, not HTTPS**)
109+
5. Then click register.
110+
4. In your newly created application
111+
1. On the Overview page, Note down for use later when configuring the example application:
112+
1. The Application (client) ID
113+
2. The Directory (tenant) ID
114+
2. Go to API Permissions in `Manage` section
115+
3. Click Add Permission
116+
1. In the side panel that appears, click the tab `APIs my organization uses`
117+
2. Search for `Power Platform API`.
118+
1. *If you do not see `Power Platform API` see the note at the bottom of this section.*
119+
3. In the *Delegated permissions* list, choose `CopilotStudio` and Check `CopilotStudio.Copilots.Invoke`
120+
4. Click `Add Permissions`
121+
4. (Optional) Click `Grant Admin consent for copilotsdk`
122+
123+
> [!TIP]
124+
> If you do not see `Power Platform API` in the list of APIs your organization uses, you need to add the Power Platform API to your tenant. To do that, go to [Power Platform API Authentication](https://learn.microsoft.com/power-platform/admin/programmability-authentication-v2#step-2-configure-api-permissions) and follow the instructions in Step 2 to add the Power Platform Admin API to your tenant.
125+
126+
## Step 3. Configure the Copilot Studio Response Analysis Tool.
127+
128+
With the above information, you can now run the `Copilot Studio Response Analysis Tool` sample.
129+
130+
1. Open the `env.TEMPLATE` file and rename it to `.env`.
131+
2. Configure the values based on what was recorded during the setup phase.
132+
133+
```bash
134+
COPILOTSTUDIOAGENT__ENVIRONMENTID="" # Environment ID of environment with the CopilotStudio App.
135+
COPILOTSTUDIOAGENT__SCHEMANAME="" # Schema Name of the Copilot to use
136+
COPILOTSTUDIOAGENT__TENANTID="" # Tenant ID of the App Registration used to login, this should be in the same tenant as the Copilot.
137+
COPILOTSTUDIOAGENT__AGENTAPPID="" # App ID of the App Registration used to login, this should be in the same tenant as the CopilotStudio environment.
138+
```
139+
140+
3. Run `pip install -r requirements.txt` to install all dependencies.
141+
142+
4. List test utterances sequentially in the `/data/input.txt` file. Mark the end of the file with "exit".
143+
144+
<p align="center">
145+
<img src="img/DataSet.png" alt="input data or utterances." width="600px">
146+
<br>
147+
</p>
148+
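The configuration steps above can be sketched in code: check that the four settings are present, then read utterances until the `exit` marker. This is a hedged illustration rather than the tool's own loader. The function names `load_settings` and `read_utterances` are hypothetical, and skipping blank lines and matching `exit` case-insensitively are assumptions.

```python
import os

REQUIRED_KEYS = (
    "COPILOTSTUDIOAGENT__ENVIRONMENTID",
    "COPILOTSTUDIOAGENT__SCHEMANAME",
    "COPILOTSTUDIOAGENT__TENANTID",
    "COPILOTSTUDIOAGENT__AGENTAPPID",
)


def load_settings(env=None):
    """Return the required settings, raising if any is missing or empty."""
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED_KEYS if not env.get(key, "").strip()]
    if missing:
        raise ValueError("missing settings: " + ", ".join(missing))
    return {key: env[key].strip() for key in REQUIRED_KEYS}


def read_utterances(lines):
    """Collect utterances in order, stopping at the 'exit' marker."""
    utterances = []
    for line in lines:
        text = line.strip()
        if not text:
            continue  # assumption: blank lines are ignored
        if text.lower() == "exit":
            break  # assumption: the marker is matched case-insensitively
        utterances.append(text)
    return utterances


# Placeholder values for illustration; real IDs come from Steps 1 and 2.
example_env = {key: "example-value" for key in REQUIRED_KEYS}
settings = load_settings(example_env)

example_lines = [
    "How do I create a new account?\n",
    "How can I reset my password?\n",
    "exit\n",
    "This line is never read.\n",
]
utterances = read_utterances(example_lines)
print(utterances)
```

With a real file, `read_utterances` can be given an open file object, for example `with open("data/input.txt", encoding="utf-8") as f: read_utterances(f)`.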
149+
## Step 4. Run the Copilot Studio Response Analysis Tool.
150+
151+
1. Run the Copilot Studio Response Analysis Tool using the command below. This should prompt you to log in and connect to the Copilot Studio hosted agent.
152+
153+
```sh
154+
python -m src.main
155+
```
156+
2. The command displays the local URL hosting the app.
157+
158+
<p align="center">
159+
<img src="img/RunningURL.png" alt="RunningURL" width="600px">
160+
<br>
161+
</p>
162+
163+
3. Copy the URL into a browser and load the application.
164+
165+
<p align="center">
166+
<img src="img/URLLoad.png" alt="URLLoad">
167+
<br>
168+
</p>
169+
170+
4. Click the `Start Test Run` button in the console. This initiates a test run for the test utterances. The tool uses the M365 Agent SDK to send each utterance in the `/data/input.txt` file, then receives the response and logs for display.
171+
172+
<p align="center">
173+
<img src="img/StartTestRun.png" alt="StartTestRun">
174+
<br>
175+
</p>
176+
177+
> [!Important]
178+
> Verify that the test utterances are listed sequentially in the `/data/input.txt` file.
179+
180+
> [!TIP]
181+
> If the tool is set up properly, `Process Status` displays the current state of processing, including the number of utterances analyzed and conversation identifiers.
182+
183+
> [!TIP]
184+
> The `Start Test Run` button is disabled until the session completes.
185+
186+
> [!TIP]
187+
> Utterances in the data file can be updated or altered after each session and the session re-executed.
188+
189+
> [!TIP]
190+
> After each test run, the tool automatically generates a CSV file containing all queries, their responses, and corresponding response times. The file is stored in the `/data/` directory for easy access.
191+
192+
> [!TIP]
193+
> If any utterances appear to be missing in the result, restart the tool and start a new session.
Lines changed: 58 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,58 @@
1+
How do I reset my account password?
2+
What are the steps to update my billing information?
3+
How can I troubleshoot connectivity issues with the XYZ system?
4+
What is the process to request a refund?
5+
How do I configure email notifications in the ABC platform?
6+
What are the system requirements for installing the DEF software?
7+
How can I export my data from the GHI application?
8+
What security measures does the JKL service implement?
9+
How do I integrate the MNO API with my existing system?
10+
Where can I find user manuals and documentation for the PQR product?
11+
What were the key revenue drivers for the last quarter?
12+
How did our sales performance compare to the previous year?
13+
What are the current trends in our market segment?
14+
What operational challenges did we face last month?
15+
Are there any significant risks that could impact our business in the next quarter?
16+
How are we addressing supply chain disruptions?
17+
What feedback have we received from our top customers?
18+
How is our customer satisfaction rating trending?
19+
What are the key achievements of our team this month?
20+
How do I create a new account?
21+
How can I reset my password?
22+
How do I update my email address?
23+
How do I delete my account?
24+
How can I enable two-factor authentication?
25+
How do I change my username?
26+
What should I do if I suspect unauthorized access to my account?
27+
How do I manage my notification preferences?
28+
How can I link multiple accounts?
29+
What are our strategic priorities for the next quarter?
30+
What resources do we need to achieve our goals?
31+
How can we improve our support for remote teams?
32+
How do I recover a locked account?
33+
IT Troubleshooting & Technical Support 11. How do I troubleshoot login issues?
34+
What should I do if the website is not loading?
35+
How can I clear my browser cache?
36+
How do I report a bug or technical problem?
37+
How do I update the software to the latest version?
38+
What are the system requirements for using the platform?
39+
How do I enable cookies and JavaScript?
40+
How can I reset my app settings?
41+
How do I check for service outages?
42+
How do I contact technical support?
43+
1. How do I use the dashboard?
44+
How can I customize my profile?
45+
What integrations are available?
46+
How do I export my data?
47+
How do I set up notifications and alerts?
48+
How can I place an order?
49+
What payment methods do you accept?
50+
Can I change or cancel my order?
51+
What is your return policy?
52+
How do I track my shipment?
53+
What happens if my product arrives damaged?
54+
Do you ship internationally?
55+
How long does delivery take?
56+
Can I purchase gift cards?
57+
Is there a warranty on products?
58+
exit
Lines changed: 21 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,21 @@
1+
What are our strategic priorities for the next quarter?
2+
What resources do we need to achieve our goals?
3+
What's the status of INC0007001?
4+
How can we improve our support for remote teams?
5+
What's the status of INC0007001?
6+
How do I recover a locked account?
7+
What's the status of INC0007001?
8+
What are the top 3 blockers?
9+
How do I create a new account?
10+
How can I reset my password?
11+
How do I update my email address?
12+
How do I delete my account?
13+
How can I enable two-factor authentication?
14+
How do I change my username?
15+
What should I do if I suspect unauthorized access to my account?
16+
How do I manage my notification preferences?
17+
How can I link multiple accounts?
18+
What are our strategic priorities for the next quarter?
19+
What resources do we need to achieve our goals?
20+
How can we improve our support for remote teams?
21+
exit
