|
| 1 | +# Copilot Studio Response Analysis Tool :computer: |
| 2 | + |
| 3 | +## Purpose: |
| 4 | +Provide a **lightweight, developer-friendly tool** to: |
| 5 | +- Measure Copilot Agent **response-time performance** and correlate it with output size/tokens. :telescope: |
| 6 | +- Get **metrics** (Mean, Median, Max, Min, Standard Deviation) and **visual charts** to understand Copilot Agent response time trends and variability. :movie_camera: |
| 7 | +- Aggregate **real-time metrics, charts and tables** to spot spikes, drift, and outliers across a single conversation. :bar_chart: |
| 8 | +- **Trace planner steps, tool invocations, and arguments** to view and validate dynamic plan composition. :computer: |
| 9 | + - Each planner step includes **Thought, Tool, and Arguments**, which together explain why the agent chose a path and how it executed it. |
| 10 | + - **Compare planner steps across queries** to highlight differences in tool calls and reasoning. |
| 11 | +- Automatically **generate a CSV file** containing all queries, their responses, and corresponding response times. :floppy_disk: |
| 12 | + |
| 13 | +## Interpretation: |
| 14 | + |
| 15 | +### 1. Statistics Tab :mag_right: |
| 16 | +Provides an overview of Copilot Agent performance metrics, including response time summaries (Mean, Median, Max, Min), variability (Standard Deviation), token correlation, and visual charts for trends and distribution. |
| 17 | + |
| 18 | + <p align="center"> |
| 19 | + <img src="img/Screen-Statistics.png" alt="Screen-Statistics"> |
| 20 | + <br> |
| 21 | + </p> |
| 22 | + |
| 23 | +#### <center>*Response Statistics:*</center> :chart_with_upwards_trend: |
| 24 | +| Metric | Description | Purpose | |
| 25 | +| :------- | :---------- | :---------- | |
| 26 | +| **Mean** | The average response time across all responses. | Gives an overall sense of typical performance but can be skewed by very high or low values. |
| 27 | +| **Median** | The middle value when all response times are arranged in ascending order. | Represents the central tendency and is less affected by outliers than the mean. Useful for understanding the “typical” response time. |
| 28 | +| **Max** | The longest response time recorded during the test run. | Highlights the worst-case scenario for latency, which is critical for identifying performance bottlenecks. |
| 29 | +| **Min** | The shortest response time recorded during the test run. | Shows the best-case performance and can indicate the system’s potential under optimal conditions. |
| 30 | +| **Standard Deviation** | Measures how much response times vary from the average. | Helps assess consistency—low SD means stable performance, high SD indicates fluctuating response times. |
| 31 | +| **Token Correlation** | Represents the correlation between response time and the number of tokens in the response. | Indicates orchestrator efficiency: a strong positive correlation suggests that longer responses account for slower response times. |
| 32 | + |
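The metrics in the table above can be reproduced by hand for a quick sanity check. The following is a minimal sketch using only the Python standard library; it is not the tool's own implementation. It assumes response times in seconds, a sample (not population) standard deviation, and a Pearson coefficient for the token correlation. The function names and the sample data are hypothetical.

```python
import math
import statistics


def summarize(times):
    """Summary metrics for a list of response times (assumed to be in seconds)."""
    return {
        "mean": statistics.mean(times),
        "median": statistics.median(times),
        "max": max(times),
        "min": min(times),
        # Assumption: sample standard deviation, which needs two or more values.
        "stdev": statistics.stdev(times) if len(times) > 1 else 0.0,
    }


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (two or more values)")
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # Assumption: report a constant series as uncorrelated.
    return covariance / (spread_x * spread_y)


# Illustrative values only, not measurements from a real run.
times = [1.2, 0.9, 3.4, 1.1, 1.6]
tokens = [120, 80, 410, 100, 170]
print(summarize(times))
print("token correlation:", pearson(times, tokens))
```

A coefficient close to 1 means response time grows with output size; a value near 0 suggests that latency is driven by something other than response length.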
| 33 | +#### <center>*Response Time Analysis:*</center> :chart_with_downwards_trend: |
| 34 | +| Chart | Description | |
| 35 | +| :------- | :---------- | |
| 36 | +| **Line Chart** | Shows how Copilot Agent response time changes across individual queries to identify spikes or trends. | |
| 37 | +| **Box Plot** | Summarizes overall response time distribution, highlighting consistency and outliers for performance benchmarking. | |
| 38 | + |
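To make the box plot's notion of an outlier concrete, the sketch below applies the common 1.5 × IQR rule to a list of response times. This is an assumption about how outliers are drawn; the tool's charting library may use a different rule or quartile method. The function name and sample data are hypothetical.

```python
import statistics


def iqr_outliers(times, factor=1.5):
    """Return the values outside the box-plot whiskers (the 1.5 x IQR rule)."""
    # The inclusive method treats the data as the whole population of the run.
    q1, _median, q3 = statistics.quantiles(times, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - factor * iqr
    upper_fence = q3 + factor * iqr
    return [t for t in times if t < lower_fence or t > upper_fence]


# Illustrative values only: one slow response among otherwise stable ones.
times = [1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 9.0]
print(iqr_outliers(times))
```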
| 39 | +### 2. Data Tab :calendar: |
| 40 | +Displays detailed per-query information, including the user’s prompt, Copilot Agent response, response time, output size, and step-by-step planner actions with tools and arguments—used for debugging and performance analysis. |
| 41 | + |
| 42 | +#### Query Response / Time Data: |
| 43 | +A per‑query ledger linking the user prompt, the agent’s full reply, its latency, and output size to diagnose performance. |
| 44 | + |
| 45 | +| Metric | Description | |
| 46 | +| :------- | :---------- | |
| 47 | +| **Serial** | Sequence number of the query in the run. | |
| 48 | +| **Query** | Exact user prompt/utterance. | |
| 49 | +| **Response** | Agent’s generated reply (full text). | |
| 50 | +| **Min** | Latency to produce the response (typically in seconds or ms). | |
| 51 | +| **Char** | Output length indicator (character count). | |
| 52 | + |
| 53 | +<p align="center"> |
| 54 | + <img src="img/Screen-Data-Query.png" alt="Screen-Data"> |
| 55 | + <br> |
| 56 | + </p> |
| 57 | + |
| 58 | +#### LLM Planner Steps Data: |
| 59 | +A step‑by‑step trace of the agent’s planning, tools, and arguments that explains how each response was produced. |
| 60 | + |
| 61 | +| Metric | Description | |
| 62 | +| :------- | :---------- | |
| 63 | +| **Serial** | Sequence number matching the query above. | |
| 64 | +| **Query** | Exact user prompt/utterance. | |
| 65 | +| **PlannerStep** | Named step decided by orchestrator. | |
| 66 | +| **Thought** | Model’s internal reasoning summary for the step (high-level). |
| 67 | +| **Tool** | Action or connector invoked. |
| 68 | +| **Arguments** | Parameters passed / received. |
| 69 | + |
| 70 | +<p align="center"> |
| 71 | + <img src="img/Screen-Data-Planner.png" alt="Screen-Data"> |
| 72 | + <br> |
| 73 | + </p> |
| 74 | + |
| 75 | +## Prerequisites: |
| 76 | + |
| 77 | +To set up this sample, you will need the following: |
| 78 | + |
| 79 | +1. [Python](https://www.python.org/) version 3.9 or higher |
| 80 | +2. An agent created in Microsoft Copilot Studio, or access to an existing agent. |
| 81 | +3. The ability to create an application identity in Azure for a Public Client/Native App registration, or access to an existing Public Client/Native App registration with the `CopilotStudio.Copilots.Invoke` API permission assigned. |
| 82 | + |
| 83 | +## Authentication: |
| 84 | + |
| 85 | +The Copilot Studio Response Analysis Tool requires a user token to operate. This sample uses an interactive user flow to get the user token for the application ID created in Step 2. Other flows are allowed. |
| 86 | + |
| 87 | +> [!Important] |
| 88 | +> The token is cached on the user's machine in `.local_token_cache.json`. |
| 89 | +
|
| 90 | +## Step 1. Create an Agent in Copilot Studio. |
| 91 | + |
| 92 | +1. Create an Agent in [Copilot Studio](https://copilotstudio.microsoft.com) |
| 93 | + 1. Publish your newly created Copilot |
| 94 | + 2. Go to Settings => Advanced => Metadata and copy the following values; you will need them later: |
| 95 | + 1. Schema name |
| 96 | + 2. Environment Id |
| 97 | + |
| 98 | +## Step 2. Create an Application Registration in Entra ID. |
| 99 | + |
| 100 | +This step will require permissions to create application identities in your Azure tenant. For this sample, you will create a Native Client Application Identity, which does not have secrets. |
| 101 | + |
| 102 | +1. Open https://portal.azure.com |
| 103 | +2. Navigate to Entra ID |
| 104 | +3. Create a new App Registration in Entra ID |
| 105 | + 1. Provide a Name |
| 106 | + 2. Choose "Accounts in this organizational directory only" |
| 107 | + 3. In the "Select a Platform" list, choose "Public Client/native (mobile & desktop)" |
| 108 | + 4. In the Redirect URI box, type in `http://localhost` (**note: use HTTP, not HTTPS**) |
| 109 | + 5. Then click Register. |
| 110 | +4. In your newly created application |
| 111 | + 1. On the Overview page, note down the following for later use when configuring the example application: |
| 112 | + 1. The Application (client) ID |
| 113 | + 2. The Directory (tenant) ID |
| 114 | + 2. Go to API Permissions in the `Manage` section |
| 115 | + 3. Click Add Permission |
| 116 | + 1. In the side panel that appears, click the tab `APIs my organization uses` |
| 117 | + 2. Search for `Power Platform API`. |
| 118 | + 1. *If you do not see `Power Platform API`, see the note at the bottom of this section.* |
| 119 | + 3. In the *Delegated permissions* list, choose `CopilotStudio` and Check `CopilotStudio.Copilots.Invoke` |
| 120 | + 4. Click `Add Permissions` |
| 121 | + 4. (Optional) Click `Grant Admin consent for copilotsdk` |
| 122 | + |
| 123 | +> [!TIP] |
| 124 | +> If you do not see `Power Platform API` in the list of APIs your organization uses, you need to add the Power Platform API to your tenant. To do that, go to [Power Platform API Authentication](https://learn.microsoft.com/power-platform/admin/programmability-authentication-v2#step-2-configure-api-permissions) and follow the instructions in Step 2 to add the Power Platform Admin API to your tenant. |
| 125 | +
|
| 126 | +## Step 3. Configure the Copilot Studio Response Analysis Tool. |
| 127 | + |
| 128 | +With the above information, you can now run the `Copilot Studio Response Analysis Tool` sample. |
| 129 | + |
| 130 | +1. Open the `env.TEMPLATE` file and rename it to `.env`. |
| 131 | +2. Configure the values based on what was recorded during the setup phase. |
| 132 | + |
| 133 | +```bash |
| 134 | + COPILOTSTUDIOAGENT__ENVIRONMENTID="" # Environment ID of environment with the CopilotStudio App. |
| 135 | + COPILOTSTUDIOAGENT__SCHEMANAME="" # Schema Name of the Copilot to use |
| 136 | + COPILOTSTUDIOAGENT__TENANTID="" # Tenant ID of the App Registration used to login, this should be in the same tenant as the Copilot. |
| 137 | + COPILOTSTUDIOAGENT__AGENTAPPID="" # App ID of the App Registration used to login, this should be in the same tenant as the CopilotStudio environment. |
| 138 | +``` |
| 139 | + |
| 140 | +3. Run `pip install -r requirements.txt` to install all dependencies. |
| 141 | + |
| 142 | +4. List test utterances sequentially in the `/data/input.txt` file. Mark the end of the file with "exit". |
| 143 | + |
| 144 | + <p align="center"> |
| 145 | + <img src="img/DataSet.png" alt="input data or utterances." width="600px"> |
| 146 | + <br> |
| 147 | + </p> |
| 148 | + |
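Before starting the tool, the configuration from Step 3 can be checked with a small pre-flight script. This is a sketch under stated assumptions, not part of the tool: `missing_settings` and `read_utterances` are hypothetical helper names, the `exit` marker is matched case-insensitively, blank lines are skipped, and `data/input.txt` is resolved relative to the repository root. The four setting names are the ones listed in the `.env` file above.

```python
import os
from pathlib import Path

# Setting names taken from the .env template described in Step 3.
REQUIRED_SETTINGS = (
    "COPILOTSTUDIOAGENT__ENVIRONMENTID",
    "COPILOTSTUDIOAGENT__SCHEMANAME",
    "COPILOTSTUDIOAGENT__TENANTID",
    "COPILOTSTUDIOAGENT__AGENTAPPID",
)


def missing_settings(env):
    """Return the required setting names that are absent or empty in env."""
    return [name for name in REQUIRED_SETTINGS if not env.get(name, "").strip()]


def read_utterances(lines):
    """Collect utterances in order, stopping at the 'exit' end marker."""
    utterances = []
    for line in lines:
        text = line.strip()
        if not text:
            continue  # Assumption: blank lines are not utterances.
        if text.lower() == "exit":  # Assumption: the marker is case-insensitive.
            break
        utterances.append(text)
    return utterances


if __name__ == "__main__":
    # os.environ only holds these values once the .env file has been loaded
    # or exported into the current shell.
    print("Missing settings:", missing_settings(os.environ))
    input_file = Path("data/input.txt")  # Assumption: relative to the repo root.
    if input_file.exists():
        with input_file.open(encoding="utf-8") as handle:
            print("Utterances:", read_utterances(handle))
```

An empty "Missing settings" list and the expected utterances in order suggest the sample is ready to run.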
| 149 | +## Step 4. Run the Copilot Studio Response Analysis Tool. |
| 150 | + |
| 151 | +1. Run the Copilot Studio Response Analysis Tool using the command below. This should prompt you to log in and connect to the Copilot Studio hosted agent. |
| 152 | + |
| 153 | +```sh |
| 154 | +python -m src.main |
| 155 | +``` |
| 156 | +2. The command displays the local URL hosting the app. |
| 157 | + |
| 158 | + <p align="center"> |
| 159 | + <img src="img/RunningURL.png" alt="RunningURL" width="600px"> |
| 160 | + <br> |
| 161 | + </p> |
| 162 | + |
| 163 | +3. Copy the URL into a browser and load the application. |
| 164 | + |
| 165 | + <p align="center"> |
| 166 | + <img src="img/URLLoad.png" alt="URLLoad"> |
| 167 | + <br> |
| 168 | + </p> |
| 169 | + |
| 170 | +4. Click the `Start Test Run` button in the console. This initiates a test run for the test utterances. The tool uses the M365 Agent SDK to send the utterances in the `/data/input.txt` file, then receives the responses and logs for display. |
| 171 | + |
| 172 | + <p align="center"> |
| 173 | + <img src="img/StartTestRun.png" alt="StartTestRun"> |
| 174 | + <br> |
| 175 | + </p> |
| 176 | + |
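Conceptually, a test run times each utterance and records the result as one row. The sketch below illustrates that loop with a stand-in agent; it does not call the M365 Agent SDK and is not the tool's actual code. `timed_run`, `write_csv`, and `fake_agent` are hypothetical names, and the column names (in particular `Seconds`) are assumptions that may differ from the generated CSV.

```python
import csv
import io
import time

# Assumed column names; the CSV produced by the tool may use different ones.
FIELDNAMES = ["Serial", "Query", "Response", "Seconds", "Char"]


def timed_run(utterances, send):
    """Send each utterance through `send` and record latency and output size."""
    rows = []
    for serial, query in enumerate(utterances, start=1):
        started = time.perf_counter()
        response = send(query)  # `send` stands in for the real agent call.
        elapsed = time.perf_counter() - started
        rows.append({
            "Serial": serial,
            "Query": query,
            "Response": response,
            "Seconds": round(elapsed, 3),
            "Char": len(response),
        })
    return rows


def write_csv(rows, handle):
    """Write the collected rows, with a header, to an open text handle."""
    writer = csv.DictWriter(handle, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)


def fake_agent(query):
    """Hypothetical stand-in that echoes the query instead of calling an agent."""
    return f"echo: {query}"


rows = timed_run(["Hello", "What can you do?"], fake_agent)
buffer = io.StringIO()
write_csv(rows, buffer)
print(buffer.getvalue())
```

Using `time.perf_counter()` rather than wall-clock time keeps the measurement monotonic, which matters when individual responses take only a fraction of a second.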
| 177 | +> [!Important] |
| 178 | +> Verify that the test utterances are listed sequentially in the `/data/input.txt` file. |
| 179 | +
|
| 180 | +> [!TIP] |
| 181 | +> If the tool is set up properly, `Process Status` displays the current state of processing, including the number of utterances analyzed and conversation identifiers. |
| 182 | +
|
| 183 | +> [!TIP] |
| 184 | +> The `Start Test Run` button is disabled until the session completes. |
| 185 | +
|
| 186 | +> [!TIP] |
| 187 | +> Utterances in the data file can be updated or altered after each session and the session re-executed. |
| 188 | +
|
| 189 | +> [!TIP] |
| 190 | +> After each test run, the tool automatically generates a CSV file containing all queries, their responses, and corresponding response times. The file is stored in the `/data/` directory for easy access. |
| 191 | +
|
| 192 | +> [!TIP] |
| 193 | +> If any utterances appear to be missing in the result, restart the tool and start a new session. |