MicrosoftDocs
diff --git a/agent-framework/TOC.yml b/agent-framework/TOC.yml
Lines changed: 2 additions & 0 deletions
diff --git a/agent-framework/agents/agent-pipeline.md b/agent-framework/agents/agent-pipeline.md
Lines changed: 28 additions & 14 deletions
diff --git a/agent-framework/agents/background-responses.md b/agent-framework/agents/background-responses.md
Lines changed: 6 additions & 8 deletions
diff --git a/agent-framework/agents/conversations/context-providers.md b/agent-framework/agents/conversations/context-providers.md
Lines changed: 13 additions & 4 deletions
diff --git a/agent-framework/agents/conversations/storage.md b/agent-framework/agents/conversations/storage.md
Lines changed: 26 additions & 0 deletions
diff --git a/agent-framework/agents/declarative.md b/agent-framework/agents/declarative.md
Lines changed: 48 additions & 17 deletions
@@ -180,6 +180,8 @@ items:
     href: integrations/m365.md
   - name: Neo4j GraphRAG Provider
     href: integrations/neo4j-graphrag.md
+  - name: Chat History Memory Provider
+    href: integrations/chat-history-memory-provider.md
   - name: Neo4j Memory Provider
     href: integrations/neo4j-memory.md
   - name: A2A Protocol
 
@@ -5,7 +5,7 @@ zone_pivot_groups: programming-languages
 author: eavanvalkenburg
 ms.topic: conceptual
 ms.author: edvan
-ms.date: 03/20/2026
+ms.date: 04/02/2026
 ms.service: agent-framework
 ---
 
@@ -40,13 +40,13 @@ The `Agent` class builds a pipeline through class composition with two main comp
 **Agent** (outer component):
 
 1. **Agent Middleware + Telemetry** - the `AgentMiddlewareLayer` and `AgentTelemetryLayer` classes handle middleware invocation and OpenTelemetry instrumentation
-2. **RawAgent** - Core agent logic that invokes context providers
-3. **Context Providers** - Unified `context_providers` list manages history and additional context
+2. **RawAgent** - Core agent logic that invokes context providers and collects provider-added middleware
+3. **Context Providers** - Unified `context_providers` list manages history, additional context, and per-run chat/function middleware
 
 **ChatClient** (separate and interchangeable component):
 
 1. **FunctionInvocation** - Handles tool calling loop, invoking Function Middleware + Telemetry per tool call
-2. **Chat Middleware + Telemetry** - Optional middleware chain and instrumentation layers, running per model call
+2. **Chat Middleware + Telemetry** - Optional middleware chain and instrumentation layers, including any chat middleware added by context providers, running per model call
 3. **RawChatClient** - Provider-specific implementation (Azure OpenAI, OpenAI, Anthropic, etc.) that communicates with the LLM
 
 When you call `run()`, your request flows through the Agent layers, then into the ChatClient pipeline for LLM communication.
@@ -144,6 +144,8 @@ agent = Agent(
 )
 ```
 
+Context providers can also attach chat or function middleware to a single invocation via `SessionContext.extend_middleware()`. The agent flattens those additions in provider order before entering the ChatClient pipeline.
+
 ::: zone-end
 
 For detailed context provider patterns, see [Context Providers](./conversations/context-providers.md).
@@ -157,9 +159,10 @@ The chat client layer handles the actual communication with the LLM service.
 `ChatClientAgent` uses an `IChatClient` instance, which can be decorated with additional middleware:
 
 ```csharp
-var chatClient = new AzureOpenAIClient(endpoint, credential)
-    .GetChatClient(deploymentName)
-    .AsIChatClient()
+var chatClient = new AIProjectClient(endpoint, credential)
+    .GetProjectOpenAIClient()
+    .GetProjectResponsesClient()
+    .AsIChatClient(deploymentName)
     .AsBuilder()
     .Use(CustomChatClientMiddleware)
     .Build();
@@ -170,9 +173,10 @@ var agent = new ChatClientAgent(chatClient, instructions: "You are helpful.");
 You can also use `AIContextProvider` as chat client middleware to enrich messages, tools, and instructions at the client level. This must be used within the context of a running `AIAgent`:
 
 ```csharp
-var chatClient = new AzureOpenAIClient(endpoint, credential)
-    .GetChatClient(deploymentName)
-    .AsIChatClient()
+var chatClient = new AIProjectClient(endpoint, credential)
+    .GetProjectOpenAIClient()
+    .GetProjectResponsesClient()
+    .AsIChatClient(deploymentName)
     .AsBuilder()
     .UseAIContextProviders(new MyContextProvider())
     .Build();
@@ -226,14 +230,14 @@ When you invoke an agent, the request flows through the pipeline:
 **Agent pipeline:**
 
 1. **Agent Middleware + Telemetry** executes middleware (if configured) and records spans
-2. **RawAgent** invokes context providers to load history and add context
+2. **RawAgent** invokes context providers to load history, add context, and collect provider-added chat/function middleware
 3. Request is passed to the ChatClient
 
 **ChatClient pipeline:**
 
 4. **FunctionInvocation** manages the tool calling loop
-   - For each tool call, **Function Middleware + Telemetry** executes
-5. **Chat Middleware + Telemetry** executes per model call (if configured)
+   - For each tool call, **Function Middleware + Telemetry** executes, including any function middleware added by context providers
+5. **Chat Middleware + Telemetry** executes per model call (if configured), including any chat middleware added by context providers
 6. **RawChatClient** handles provider-specific LLM communication
 7. Response flows back through the same layers
 8. **Context providers** are notified of new messages for storage
@@ -274,6 +278,16 @@ var copilotAgent = originalCopilotAgent
 ::: zone-end
 
 ::: zone pivot="programming-language-python"
+
+## Other agent types
+
+Not every Python agent uses the full `Agent` + `ChatClient` pipeline. `GitHubCopilotAgent`, for example, sends requests through the GitHub Copilot CLI instead of a local chat client.
+
+Even so, Python `GitHubCopilotAgent` still supports agent middleware and now runs `context_providers` around each invocation. Provider-added messages and instructions are included in the prompt sent to Copilot, and providers receive the matching `after_run` callback once a response is available.
+
+> [!NOTE]
+> Because `GitHubCopilotAgent` does not use a local chat client, chat client middleware still does not apply.
+
 ::: zone-end
 
 ## Next steps
@@ -285,4 +299,4 @@ var copilotAgent = originalCopilotAgent
 
 - [Middleware](./middleware/index.md) - Add cross-cutting behavior to your agents
 - [Context Providers](./conversations/context-providers.md) - Detailed patterns for history and context injection
-- [Running Agents](./running-agents.md) - How to invoke agents
+- [Running Agents](./running-agents.md) - How to invoke agents
@@ -55,11 +55,10 @@ Some agents may not allow explicit control over background responses. These agen
 For non-streaming scenarios, when you initially run an agent, it may or may not return a continuation token. If no continuation token is returned, it means the operation has completed. If a continuation token is returned, it indicates that the agent has initiated a background response that is still processing and will require polling to retrieve the final result:
 
 ```csharp
-AIAgent agent = new AzureOpenAIClient(
-    new Uri("https://<myresource>.openai.azure.com"),
+AIAgent agent = new AIProjectClient(
+    new Uri("<your-foundry-project-endpoint>"),
     new DefaultAzureCredential())
-    .GetResponsesClient("<deployment-name>")
-    .AsAIAgent();
+    .AsAIAgent(model: "<deployment-name>", instructions: "You are a helpful assistant.");
 
 AgentRunOptions options = new()
 {
@@ -100,11 +99,10 @@ Console.WriteLine(response.Text);
 In streaming scenarios, background responses work much like regular streaming responses - the agent streams all updates back to consumers in real-time. However, the key difference is that if the original stream gets interrupted, agents support stream resumption through continuation tokens. Each update includes a continuation token that captures the current state, allowing the stream to be resumed from exactly where it left off by passing this token to subsequent streaming API calls:
 
 ```csharp
-AIAgent agent = new AzureOpenAIClient(
-    new Uri("https://<myresource>.openai.azure.com"),
+AIAgent agent = new AIProjectClient(
+    new Uri("<your-foundry-project-endpoint>"),
     new DefaultAzureCredential())
-    .GetResponsesClient("<deployment-name>")
-    .AsAIAgent();
+    .AsAIAgent(model: "<deployment-name>", instructions: "You are a helpful assistant.");
 
 AgentRunOptions options = new()
 {
 
@@ -13,6 +13,9 @@ ms.service: agent-framework
 
 Context providers run around each invocation to add context before execution and process data after execution.
 
+> [!NOTE]
+> For a list of pre-built context providers you can use with your agent, see [Integrations](../../integrations/index.md).
+
 ## Built-in pattern
 
 :::zone pivot="programming-language-csharp"
@@ -282,10 +285,10 @@ internal sealed class AdvancedServiceMemoryProvider : AIContextProvider
 ```python
 from typing import Any
 
-from agent_framework import AgentSession, BaseContextProvider, SessionContext
+from agent_framework import AgentSession, ContextProvider, SessionContext
 
 
-class UserPreferenceProvider(BaseContextProvider):
+class UserPreferenceProvider(ContextProvider):
     def __init__(self) -> None:
         super().__init__("user-preferences")
 
@@ -314,6 +317,11 @@ class UserPreferenceProvider(BaseContextProvider):
                 state["favorite_food"] = text.split("favorite food is", 1)[1].strip().rstrip(".")
 ```
 
+> [!NOTE]
+> `ContextProvider` and `HistoryProvider` are the canonical Python base classes. `BaseContextProvider` and `BaseHistoryProvider` still exist as deprecated aliases for compatibility, but new providers should inherit from `ContextProvider` or `HistoryProvider`.
+>
+> Context providers can also add chat or function middleware for the current invocation by calling `context.extend_middleware(self.source_id, middleware)`. The agent flattens those additions with `context.get_middleware()` and applies them in provider order before invoking the chat client.
+
 :::zone-end
 
 :::zone pivot="programming-language-python"
@@ -326,10 +334,10 @@ History providers are context providers specialized for loading/storing messages
 from collections.abc import Sequence
 from typing import Any
 
-from agent_framework import BaseHistoryProvider, Message
+from agent_framework import HistoryProvider, Message
 
 
-class DatabaseHistoryProvider(BaseHistoryProvider):
+class DatabaseHistoryProvider(HistoryProvider):
     def __init__(self, db: Any) -> None:
         super().__init__("db-history", load_messages=True)
         self._db = db
@@ -365,6 +373,7 @@ class DatabaseHistoryProvider(BaseHistoryProvider):
 > [!IMPORTANT]
 > In Python, you can configure multiple history providers, but **only one** should use `load_messages=True`.
 > Use additional providers for diagnostics/evals with `load_messages=False` and `store_context_messages=True` so they capture context from other providers alongside input/output.
+> If you need local history to persist around each model call in a tool loop, see [Storage](./storage.md#per-service-call-local-history-persistence).
 >
 > Example pattern:
 >
 
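The note added to context-providers.md says that provider-added middleware is flattened in provider order before the chat client is invoked. The following is a minimal sketch of that ordering rule only. `ToySessionContext` is a hypothetical stand-in, not the framework's `SessionContext`, and it assumes that middleware is passed as a list and can be modeled as a plain function over a string.

```python
from collections.abc import Callable

# Assumption: middleware is modeled as a simple str -> str function.
Middleware = Callable[[str], str]


class ToySessionContext:
    """Hypothetical stand-in for SessionContext; not the framework class."""

    def __init__(self) -> None:
        # dicts preserve insertion order, so providers stay in call order.
        self._middleware: dict[str, list[Middleware]] = {}

    def extend_middleware(self, source_id: str, middleware: list[Middleware]) -> None:
        # Group additions by the provider that contributed them.
        self._middleware.setdefault(source_id, []).extend(middleware)

    def get_middleware(self) -> list[Middleware]:
        # Flatten the per-provider lists into one list, provider by provider.
        return [m for items in self._middleware.values() for m in items]


def tag(label: str) -> Middleware:
    """Build a toy middleware that appends a label to the text it sees."""
    return lambda text: f"{text}>{label}"


context = ToySessionContext()
context.extend_middleware("user-preferences", [tag("prefs")])
context.extend_middleware("db-history", [tag("audit"), tag("trace")])

result = "request"
for middleware in context.get_middleware():
    result = middleware(result)
print(result)
```

In this toy model the first provider's middleware runs before the second provider's, which mirrors the "provider order" wording in the note. How the real pipeline wraps the model call is not shown here.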
@@ -120,6 +120,32 @@ response = await agent.run("Continue this conversation.", session=session)
 
 :::zone-end
 
+## Per-service-call local history persistence
+
+Tool-calling runs can make multiple model calls before a single `agent.run()` completes. By default, local history providers persist once after the full run. If you want local history to mirror service-managed conversations more closely, set `require_per_service_call_history_persistence=True` so history providers run around each model call instead.
+
+:::zone pivot="programming-language-python"
+
+```python
+from agent_framework import Agent, InMemoryHistoryProvider
+from agent_framework.openai import OpenAIChatClient
+
+agent = Agent(
+    client=OpenAIChatClient(),
+    name="StorageAgent",
+    instructions="You are a helpful assistant.",
+    context_providers=[InMemoryHistoryProvider("memory", load_messages=True)],
+    require_per_service_call_history_persistence=True,
+)
+```
+
+> [!IMPORTANT]
+> Use this mode only for framework-managed local history. If the run is already bound to a service-managed conversation (for example via `session.service_session_id` or `options={"conversation_id": ...}`), Agent Framework raises an error instead of mixing the two persistence models.
+>
+> This mode is especially useful when middleware can terminate immediately after a tool call: persisting per model call keeps local history aligned with what a service-managed conversation would keep.
+
+:::zone-end
+
 ## Third-party/Custom storage pattern
 
 For database/Redis/blob-backed history, implement a custom history provider.
 
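The new "Per-service-call local history persistence" section in storage.md contrasts persisting once after the full run with persisting around each model call. The toy simulation below sketches only that difference in persistence frequency. `CountingStore` and `simulate_run` are hypothetical names, and nothing here calls Agent Framework or reflects its internals.

```python
class CountingStore:
    """Hypothetical history store that records each persistence batch."""

    def __init__(self) -> None:
        self.saves: list[list[str]] = []

    def save(self, messages: list[str]) -> None:
        # Copy the batch so later mutation of the caller's list is not visible.
        self.saves.append(list(messages))


def simulate_run(model_outputs: list[str], store: CountingStore, per_call: bool) -> None:
    """Simulate one run that makes one model call per entry in model_outputs."""
    pending: list[str] = []
    for output in model_outputs:
        pending.append(output)
        if per_call:
            # Per-service-call mode: persist after every model call.
            store.save(pending)
            pending.clear()
    if pending:
        # Default mode: persist once, after the full run.
        store.save(pending)


outputs = ["call 1: tool request", "call 2: tool request", "call 3: final answer"]

once = CountingStore()
simulate_run(outputs, once, per_call=False)

each = CountingStore()
simulate_run(outputs, each, per_call=True)

print(len(once.saves), len(each.saves))
```

Both modes end up storing the same messages in this sketch. The difference is that the per-call store has already saved the earlier calls by the time the later ones run.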
@@ -18,27 +18,58 @@ Declarative agents allow you to define agent configuration using YAML or JSON fi
 The following example shows how to create a declarative agent from a YAML configuration:
 
 ```csharp
-using System;
-using System.IO;
-using Azure.AI.OpenAI;
+using Azure.AI.Projects;
 using Azure.Identity;
 using Microsoft.Agents.AI;
-using Microsoft.Extensions.AI;
 
-// Load agent configuration from a YAML file
-var yamlContent = File.ReadAllText("agent-config.yaml");
-
-// Create the agent from the YAML definition
-AIAgent agent = AgentFactory.CreateFromYaml(
-    yamlContent,
-    new AzureOpenAIClient(
-        new Uri("https://<myresource>.openai.azure.com"),
-        new AzureCliCredential()));
-
-// Run the declarative agent
-Console.WriteLine(await agent.RunAsync("Why is the sky blue?"));
+// Create the chat client
+IChatClient chatClient = new AIProjectClient(
+    new Uri("<your-foundry-project-endpoint>"),
+    new DefaultAzureCredential())
+        .GetProjectOpenAIClient()
+        .GetProjectResponsesClient()
+        .AsIChatClient("gpt-4o-mini");
+
+// Define the agent using a YAML definition.
+var yamlDefinition =
+    """
+    kind: Prompt
+    name: Assistant
+    description: Helpful assistant
+    instructions: You are a helpful assistant. You answer questions in the language specified by the user. You return your answers in a JSON format.
+    model:
+        options:
+            temperature: 0.9
+            topP: 0.95
+    outputSchema:
+        properties:
+            language:
+                type: string
+                required: true
+                description: The language of the answer.
+            answer:
+                type: string
+                required: true
+                description: The answer text.
+    """;
+
+// Create the agent from the YAML definition.
+var agentFactory = new ChatClientPromptAgentFactory(chatClient);
+var agent = await agentFactory.CreateFromYamlAsync(yamlDefinition);
+
+// Invoke the agent and output the text result.
+Console.WriteLine(await agent!.RunAsync("Tell me a joke about a pirate in English."));
+
+// Invoke the agent with streaming support.
+await foreach (var update in agent!.RunStreamingAsync("Tell me a joke about a pirate in French."))
+{
+    Console.WriteLine(update);
+}
 ```
 
+> [!WARNING]
+> `DefaultAzureCredential` is convenient for development but requires careful consideration in production. In production, consider using a specific credential (e.g., `ManagedIdentityCredential`) to avoid latency issues, unintended credential probing, and potential security risks from fallback mechanisms.
+
 :::zone-end
 
 :::zone pivot="programming-language-python"
@@ -66,7 +97,7 @@ model:
   id: =Env.AZURE_OPENAI_MODEL
   connection:
     kind: remote
-    endpoint: =Env.AZURE_AI_PROJECT_ENDPOINT
+    endpoint: =Env.FOUNDRY_PROJECT_ENDPOINT
 """
     async with (
         AzureCliCredential() as credential,