Merge pull request #929 from MicrosoftDocs/main

moonbox3 · web-flow · commit 8e20a5ef87be · 2026-03-20T18:06:31.000+09:00
Merge main to live
diff --git a/agent-framework/agents/agent-pipeline.md b/agent-framework/agents/agent-pipeline.md
@@ -5,7 +5,7 @@ zone_pivot_groups: programming-languages
 author: eavanvalkenburg
 ms.topic: conceptual
 ms.author: edvan
-ms.date: 03/11/2026
+ms.date: 03/20/2026
 ms.service: agent-framework
 ---
 
@@ -45,8 +45,8 @@ The `Agent` class builds a pipeline through class composition with two main comp
 
 **ChatClient** (separate and interchangeable component):
 
-1. **Chat Middleware + Telemetry** - Optional middleware chain and instrumentation layers
-2. **FunctionInvocation** - Handles tool calling loop, invoking Function Middleware + Telemetry per tool call
+1. **FunctionInvocation** - Handles tool calling loop, invoking Function Middleware + Telemetry per tool call
+2. **Chat Middleware + Telemetry** - Optional middleware chain and instrumentation layers, running per model call
 3. **RawChatClient** - Provider-specific implementation (Azure OpenAI, OpenAI, Anthropic, etc.) that communicates with the LLM
 
 When you call `run()`, your request flows through the Agent layers, then into the ChatClient pipeline for LLM communication.
@@ -231,9 +231,9 @@ When you invoke an agent, the request flows through the pipeline:
 
 **ChatClient pipeline:**
 
-4. **Chat Middleware + Telemetry** executes (if configured)
-5. **FunctionInvocation** sends request to the LLM and handles tool calling loop
+4. **FunctionInvocation** manages the tool calling loop
    - For each tool call, **Function Middleware + Telemetry** executes
+5. **Chat Middleware + Telemetry** executes per model call (if configured)
 6. **RawChatClient** handles provider-specific LLM communication
 7. Response flows back through the same layers
 8. **Context providers** are notified of new messages for storage
diff --git a/agent-framework/agents/conversations/compaction.md b/agent-framework/agents/conversations/compaction.md
@@ -5,10 +5,26 @@ zone_pivot_groups: programming-languages
 author: crickman
 ms.topic: conceptual
 ms.author: crickman
-ms.date: 03/05/2026
+ms.date: 03/18/2026
 ms.service: agent-framework
 ---
 
+<!--
+  Feature parity table – highlight impactful SDK differences between C# and Python.
+  Keep in sync when features are added or removed.
+
+  | Feature                                      | C# | Python | Notes                                                                                                       |
+  |----------------------------------------------|:--:|:------:|-------------------------------------------------------------------------------------------------------------|
+  | Truncation strategy                          | ✅ |   ✅   |                                                                                                             |
+  | Sliding window strategy                      | ✅ |   ✅   |                                                                                                             |
+  | Tool-result collapse strategy                | ✅ |   ✅   |                                                                                                             |
+  | Summarization strategy                       | ✅ |   ✅   |                                                                                                             |
+  | Selective tool-call exclusion strategy       | ❌ |   ✅   | Python-only: fully drops older tool-call groups; C# ToolResultCompactionStrategy collapses them instead     |
+  | Trigger / target predicate system            | ✅ |   ❌   | C#-only: CompactionTrigger delegates control when each strategy fires and stops; Python strategies use internal parameters |
+  | Composed pipeline strategy                   | ✅ |   ✅   | C#: PipelineCompactionStrategy (trigger-driven, runs all children); Python: TokenBudgetComposedStrategy (token-budget-driven, early-stop) |
+  | Post-run compaction of persisted history     | ❌ |   ✅   | Python-only: CompactionProvider.after_strategy compacts stored history after each run; C# compacts in-flight context only |
+-->
+
 # Compaction
 
 As conversations grow, the token count of the chat history can exceed model context windows or drive up costs. Compaction strategies reduce the size of conversation history while preserving important context, so agents can continue functioning over long-running interactions.
diff --git a/agent-framework/agents/middleware/index.md b/agent-framework/agents/middleware/index.md
@@ -5,7 +5,7 @@ zone_pivot_groups: programming-languages
 author: dmytrostruk
 ms.topic: reference
 ms.author: dmytrostruk
-ms.date: 03/16/2026
+ms.date: 03/20/2026
 ms.service: agent-framework
 ---
 
@@ -300,6 +300,9 @@ Chat middleware intercepts chat requests sent to AI models. It uses the `ChatCon
 
 The `call_next` callback continues to the next middleware or sends the request to the AI service.
 
+> [!NOTE]
+> Chat middleware runs inside the function invocation loop. This means it executes for **each model call**, including calls that send tool results back to the model during a multi-turn tool calling sequence.
+
 ### Function-based
 
 ```python
diff --git a/agent-framework/agents/providers/azure-openai.md b/agent-framework/agents/providers/azure-openai.md
@@ -148,6 +148,9 @@ pip install agent-framework --pre
 
 ## Configuration
 
+> [!IMPORTANT]
+> The `AzureOpenAIChatClient` and `AzureOpenAIAssistantsClient` require an **Azure OpenAI resource** endpoint (format: `https://<myresource>.openai.azure.com`). The `AzureOpenAIResponsesClient` can use either an Azure OpenAI resource endpoint **or** a [Microsoft Foundry project](/azure/ai-foundry/what-is-ai-foundry) endpoint (format: `https://<your-project>.services.ai.azure.com/api/projects/<project-id>`). If you need to use the Foundry Agent Service instead, see the [Foundry Agents provider page](./azure-ai-foundry.md).
+
 Each client type uses different environment variables:
 
 # [Chat Completion](#tab/aoai-chat-completion)
@@ -228,7 +231,7 @@ asyncio.run(main())
 
 ### Responses Client with Microsoft Foundry project endpoint
 
-`AzureOpenAIResponsesClient` can also be created from a Foundry project endpoint:
+Instead of an Azure OpenAI resource endpoint, `AzureOpenAIResponsesClient` can also be created from a [Microsoft Foundry project](/azure/ai-foundry/what-is-ai-foundry) endpoint. Use a Foundry project endpoint when you want to access models deployed through a Microsoft Foundry project rather than a standalone Azure OpenAI resource:
 
 ```python
 from agent_framework.azure import AzureOpenAIResponsesClient
diff --git a/agent-framework/agents/providers/microsoft-foundry.md b/agent-framework/agents/providers/microsoft-foundry.md
@@ -98,6 +98,9 @@ For more information on how to run and interact with agents, see the [Agent gett
 
 ### Environment Variables
 
+> [!IMPORTANT]
+> `AzureAIAgentClient` (Foundry Agent Service v1) and `AzureAIClient` (Foundry Agent Service v2) both require an **Azure AI Foundry project** endpoint (format: `https://<your-project>.services.ai.azure.com/api/projects/<project-id>`), **not** an Azure OpenAI resource endpoint. You must have an [Azure AI Foundry project](/azure/ai-foundry/what-is-ai-foundry) to use this provider. If you have a standalone Azure OpenAI resource instead, see the [Azure OpenAI provider page](./azure-openai.md).
+
 Before using Foundry Agents, you need to set up these environment variables:
 
 ```bash
diff --git a/agent-framework/media/agent-pipeline-python.svg b/agent-framework/media/agent-pipeline-python.svg
@@ -51,23 +51,23 @@
   <rect x="475" y="30" width="330" height="285" rx="8" ry="8" fill="#e0f2f1" stroke="#00695c" stroke-width="2"/>
   <text x="640" y="52" text-anchor="middle" font-family="Segoe UI, sans-serif" font-size="14" font-weight="600" fill="#00695c">ChatClient</text>
   
-  <!-- Chat Middleware + Telemetry box -->
-  <rect x="490" y="65" width="130" height="80" rx="5" ry="5" fill="white" stroke="#00695c" stroke-width="1.5"/>
-  <text x="555" y="90" text-anchor="middle" font-family="Segoe UI, sans-serif" font-size="10" font-weight="600" fill="#00695c">Chat Middleware</text>
-  <text x="555" y="105" text-anchor="middle" font-family="Segoe UI, sans-serif" font-size="10" font-weight="600" fill="#00695c">+ Telemetry</text>
-  <text x="555" y="125" text-anchor="middle" font-family="Segoe UI, sans-serif" font-size="9" fill="#666">middleware=[]</text>
-  <text x="555" y="138" text-anchor="middle" font-family="Segoe UI, sans-serif" font-size="9" fill="#666">(optional)</text>
-  
-  <!-- Function Invocation box -->
-  <rect x="640" y="65" width="150" height="120" rx="5" ry="5" fill="#b2dfdb" stroke="#00695c" stroke-width="1.5"/>
-  <text x="715" y="88" text-anchor="middle" font-family="Segoe UI, sans-serif" font-size="10" font-weight="600" fill="#00695c">FunctionInvocation</text>
-  <text x="715" y="103" text-anchor="middle" font-family="Segoe UI, sans-serif" font-size="9" fill="#555">Tool calling loop</text>
+  <!-- FunctionInvocation box (left) -->
+  <rect x="490" y="65" width="150" height="120" rx="5" ry="5" fill="#b2dfdb" stroke="#00695c" stroke-width="1.5"/>
+  <text x="565" y="88" text-anchor="middle" font-family="Segoe UI, sans-serif" font-size="10" font-weight="600" fill="#00695c">FunctionInvocation</text>
+  <text x="565" y="103" text-anchor="middle" font-family="Segoe UI, sans-serif" font-size="9" fill="#555">Tool calling loop</text>
   
   <!-- Function Middleware nested inside Function Invocation -->
-  <rect x="652" y="115" width="126" height="55" rx="3" ry="3" fill="white" stroke="#00695c" stroke-width="1"/>
-  <text x="715" y="135" text-anchor="middle" font-family="Segoe UI, sans-serif" font-size="9" fill="#00695c">Function Middleware</text>
-  <text x="715" y="150" text-anchor="middle" font-family="Segoe UI, sans-serif" font-size="9" fill="#00695c">+ Telemetry</text>
-  <text x="715" y="163" text-anchor="middle" font-family="Segoe UI, sans-serif" font-size="8" fill="#666">(per tool call)</text>
+  <rect x="502" y="115" width="126" height="55" rx="3" ry="3" fill="white" stroke="#00695c" stroke-width="1"/>
+  <text x="565" y="135" text-anchor="middle" font-family="Segoe UI, sans-serif" font-size="9" fill="#00695c">Function Middleware</text>
+  <text x="565" y="150" text-anchor="middle" font-family="Segoe UI, sans-serif" font-size="9" fill="#00695c">+ Telemetry</text>
+  <text x="565" y="163" text-anchor="middle" font-family="Segoe UI, sans-serif" font-size="8" fill="#666">(per tool call)</text>
+  
+  <!-- Chat Middleware + Telemetry box (right) -->
+  <rect x="660" y="65" width="130" height="80" rx="5" ry="5" fill="white" stroke="#00695c" stroke-width="1.5"/>
+  <text x="725" y="88" text-anchor="middle" font-family="Segoe UI, sans-serif" font-size="10" font-weight="600" fill="#00695c">Chat Middleware</text>
+  <text x="725" y="103" text-anchor="middle" font-family="Segoe UI, sans-serif" font-size="10" font-weight="600" fill="#00695c">+ Telemetry</text>
+  <text x="725" y="123" text-anchor="middle" font-family="Segoe UI, sans-serif" font-size="9" fill="#666">middleware=[]</text>
+  <text x="725" y="136" text-anchor="middle" font-family="Segoe UI, sans-serif" font-size="9" fill="#666">(per model call)</text>
   
   <!-- RawChatClient box -->
   <rect x="490" y="200" width="300" height="100" rx="5" ry="5" fill="#b2dfdb" stroke="#00695c" stroke-width="1.5"/>
@@ -93,15 +93,15 @@
   <!-- Agent to ChatClient -->
   <line x1="445" y1="167" x2="470" y2="167" stroke="#333" stroke-width="2" marker-end="url(#arrowhead)"/>
   
-  <!-- Chat Middleware to FunctionInvocation -->
-  <line x1="620" y1="105" x2="635" y2="105" stroke="#333" stroke-width="1.5" marker-end="url(#arrowhead)"/>
+  <!-- FunctionInvocation to Chat Middleware -->
+  <line x1="640" y1="105" x2="655" y2="105" stroke="#333" stroke-width="1.5" marker-end="url(#arrowhead)"/>
   
-  <!-- FunctionInvocation to RawChatClient (down) -->
-  <line x1="715" y1="185" x2="715" y2="195" stroke="#333" stroke-width="1.5" marker-end="url(#arrowhead)"/>
+  <!-- Chat Middleware to RawChatClient (down) -->
+  <line x1="725" y1="145" x2="725" y2="195" stroke="#333" stroke-width="1.5" marker-end="url(#arrowhead)"/>
   
   <!-- Loop arrow for tool calling -->
-  <path d="M 790 140 Q 800 120 790 100" stroke="#00695c" stroke-width="1.5" fill="none" marker-end="url(#looparrow)"/>
-  <text x="810" y="120" font-family="Segoe UI, sans-serif" font-size="8" fill="#00695c">loop</text>
+  <path d="M 485 140 Q 475 120 485 100" stroke="#00695c" stroke-width="1.5" fill="none" marker-end="url(#looparrow)"/>
+  <text x="462" y="123" font-family="Segoe UI, sans-serif" font-size="8" fill="#00695c">loop</text>
   
   <!-- ChatClient to LLM -->
   <line x1="805" y1="250" x2="855" y2="180" stroke="#333" stroke-width="2" marker-end="url(#arrowhead)"/>
diff --git a/agent-framework/support/upgrade/python-2026-significant-changes.md b/agent-framework/support/upgrade/python-2026-significant-changes.md
@@ -4,7 +4,7 @@ description: Guide to significant changes in Python releases for Microsoft Agent
 author: eavanvalkenburg
 ms.topic: upgrade-and-migration-article
 ms.author: edvan
-ms.date: 03/13/2026
+ms.date: 03/20/2026
 ms.service: agent-framework
 ---
 # Python 2026 Significant Changes Guide
@@ -18,9 +18,27 @@ This document will be removed once we reach the 1.0.0 stable release, so please
 
 ---
 
-## python-1.0.0rc5 / python-1.0.0b260318 (March 18, 2026)
+## python-1.0.0rc5 / python-1.0.0b260319 (March 19, 2026)
 
-**Release:** Scheduled for March 18, 2026. `agent-framework-core` and `agent-framework-azure-ai` move to `1.0.0rc5`; the remaining Python packages align on the March 2026 build line (`1.0.0b260318`).
+### 🔴 Chat client pipeline reordered: FunctionInvocation now wraps ChatMiddleware
+
+**PR:** [#4746](https://github.com/microsoft/agent-framework/pull/4746)
+
+The ChatClient pipeline ordering has changed. `FunctionInvocation` is now the outermost layer and wraps `ChatMiddleware`, which means chat middleware runs **per model call** (including each iteration of the tool calling loop) instead of once around the entire function invocation sequence.
+
+**Old pipeline order:**
+```
+ChatMiddleware → FunctionInvocation → RawChatClient
+```
+
+**New pipeline order:**
+```
+FunctionInvocation → ChatMiddleware → ChatTelemetry → RawChatClient
+```
+
+If you have custom chat middleware that assumed it ran only once per agent invocation (wrapping the entire tool calling loop), update it to be safe for repeated execution. Chat middleware is now invoked for each individual LLM request, including requests that send tool results back to the model.
+
+Additionally, `ChatTelemetry` is now a separate layer from `ChatMiddleware` in the pipeline, running closest to `RawChatClient`.
 
 ### 🔴 Public runtime kwargs split into explicit buckets