Note: Agent capabilities in Foundry Local evolve—confirm support in the latest release notes before implementing advanced patterns.
Use Foundry Local to rapidly prototype agentic applications: system prompts, grounding, and orchestration patterns. When agent support is present, you can standardize on OpenAI-compatible function calling or use Azure AI Agents on the cloud side in hybrid designs.
🔄 Updated for Modern SDK: This module has been aligned with the latest Microsoft Foundry-Local repository patterns and matches the comprehensive implementation in samples/05/. The examples now use the modern foundry-local-sdk and OpenAI client instead of manual requests.
🏗️ Architecture Highlights:
- Specialist Agents: Retrieval, Reasoning, and Execution agents with distinct capabilities
- Coordinator Pattern: Orchestrates multi-agent workflows with feedback loops
- Modern SDK Integration: Uses FoundryLocalManager and OpenAI client
- Production Ready: Includes error handling, performance monitoring, and health checks
- Comprehensive Examples: Interactive Jupyter notebook with advanced features
📁 Local Implementation:
- samples/05/multi_agent_orchestration.ipynb - Interactive examples and benchmarks
- samples/05/agents/specialists.py - Agent implementations
- samples/05/agents/coordinator.py - Orchestration logic
References:
- Foundry Local docs: https://learn.microsoft.com/en-us/azure/ai-foundry/foundry-local/
- Azure AI Foundry Agents: https://learn.microsoft.com/en-us/azure/ai-services/agents/overview
- Function calling sample (Foundry Local samples): https://github.com/microsoft/Foundry-Local/tree/main/samples/python/functioncalling
- Design system prompts and grounding strategies for reliable behavior
- Implement function calling (tool use) patterns
- Orchestrate multi-agent workflows (local and hybrid)
- Plan for observability and safety
- Define strict roles, constraints, and output schemas
- Ground responses with local or enterprise data
- Enforce JSON outputs for downstream automation
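One way to enforce the last point is to validate the model's reply before anything downstream consumes it. The sketch below is a minimal, standard-library-only illustration. The `extract_json` and `parse_actions` helpers and the required field names are assumptions chosen to match the execution-agent schema shown later in this module; they are not part of any SDK.

```python
import json
from typing import Any, Dict, List

# Hypothetical schema: the keys each action item is expected to carry.
REQUIRED_ACTION_KEYS = {"step", "description", "priority", "timeline"}


def extract_json(text: str) -> str:
    """Return the outermost {...} span, tolerating prose around it."""
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model output")
    return text[start:end + 1]


def parse_actions(raw: str) -> List[Dict[str, Any]]:
    """Parse and validate a reply shaped like {"actions": [...]}."""
    data = json.loads(extract_json(raw))  # JSONDecodeError is a ValueError
    actions = data.get("actions")
    if not isinstance(actions, list) or not actions:
        raise ValueError("'actions' must be a non-empty list")
    for item in actions:
        if not isinstance(item, dict):
            raise ValueError("each action must be a JSON object")
        missing = REQUIRED_ACTION_KEYS - set(item)
        if missing:
            raise ValueError(f"action is missing keys: {sorted(missing)}")
    return actions


if __name__ == "__main__":
    reply = ('Sure! {"actions": [{"step": 1, "description": "Email leads", '
             '"priority": "high", "timeline": "week 1"}]}')
    print(parse_actions(reply)[0]["description"])
```

A caller can catch `ValueError` and either re-prompt the model with the error message or fall back to treating the reply as plain text.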
# tools.py
import json
from typing import List, Dict, Any


def get_weather(city: str) -> str:
    """Stub tool: returns a canned weather string for the given city."""
    return f"Weather in {city}: Sunny, 25C"


# Modern tools format for OpenAI API
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"}
                },
                "required": ["city"]
            }
        }
    }
]

# agent.py
import json
from typing import Dict, List

from foundry_local import FoundryLocalManager
from openai import OpenAI

from tools import TOOLS, get_weather

# Initialize Foundry Local Manager
alias = "phi-4-mini"
manager = FoundryLocalManager(alias)

# Create OpenAI client using Foundry Local endpoint
client = OpenAI(
    base_url=manager.endpoint,
    api_key=manager.api_key
)

SYSTEM_PROMPT = "You are a helpful assistant. Use tools when needed."


def process_function_call(messages: List[Dict], tools: List[Dict]) -> str:
    """Process function calling with modern OpenAI API."""
    try:
        response = client.chat.completions.create(
            model=manager.get_model_info(alias).id,
            messages=messages,
            tools=tools,
            tool_choice="auto"
        )
        message = response.choices[0].message
        if message.tool_calls:
            # Handle function calls
            messages.append(message)
            for tool_call in message.tool_calls:
                if tool_call.function.name == "get_weather":
                    args = json.loads(tool_call.function.arguments)
                    result = get_weather(args["city"])
                else:
                    # Reply to every tool call, including tools we do not implement
                    result = f"Unknown tool: {tool_call.function.name}"
                # Add function result to messages
                messages.append({
                    "role": "tool",
                    "tool_call_id": tool_call.id,
                    "content": result
                })
            # Get final response
            final_response = client.chat.completions.create(
                model=manager.get_model_info(alias).id,
                messages=messages
            )
            return final_response.choices[0].message.content
        else:
            return message.content
    except Exception as e:
        return f"Error: {str(e)}"


# Example usage
messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": "What's the weather in Paris?"}
]
result = process_function_call(messages, TOOLS)
print(result)

Run:
# Ensure Foundry Local is running with a model
foundry model run phi-4-mini
python agent.py

Design a coordinator that routes tasks to specialist agents (retrieval, reasoning, execution) using Foundry Local's OpenAI-compatible endpoint.
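Before wiring in a model, the coordinator's data flow can be exercised with stub agents. The sketch below is an illustration under stated assumptions: the `Agent` type alias and `run_pipeline` are hypothetical names, and the fixed retrieval → reasoning → execution order mirrors the pipeline this section builds rather than any routing feature of Foundry Local.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical shape: an agent is any callable from text to text.
Agent = Callable[[str], str]


def run_pipeline(goal: str, stages: List[Tuple[str, Agent]]) -> Dict[str, object]:
    """Feed the goal through each stage, passing the previous output forward."""
    outputs: Dict[str, str] = {}
    current = goal
    for name, agent in stages:
        current = agent(current)
        outputs[name] = current
    return {
        "goal": goal,
        "outputs": outputs,
        "agent_flow": [name for name, _ in stages],
    }


if __name__ == "__main__":
    # Stub agents stand in for model-backed specialists.
    stub_stages = [
        ("retrieval", lambda q: f"facts about: {q}"),
        ("reasoning", lambda ctx: f"decision based on [{ctx}]"),
        ("execution", lambda d: f"plan for: {d}"),
    ]
    result = run_pipeline("onboard 5 customers", stub_stages)
    print(result["agent_flow"])
```

The model-backed agents in this module take slightly different arguments (the reasoning agent also receives the original question), so treat this as a shape for testing orchestration logic, not a drop-in replacement.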
Step 1) Define specialist agents with modern SDK (see samples/05/agents/specialists.py)
# agents/specialists.py
from foundry_local import FoundryLocalManager
from openai import OpenAI
from typing import List, Dict, Any


class FoundryClient:
    """Shared client for all specialist agents."""

    def __init__(self, model_alias: str = "phi-4-mini"):
        self.client = None
        self.model_name = None
        self.model_alias = model_alias
        self._initialize_client()

    def _initialize_client(self):
        """Initialize OpenAI client with Foundry Local."""
        try:
            manager = FoundryLocalManager(self.model_alias)
            model_info = manager.get_model_info(self.model_alias)
            self.client = OpenAI(
                base_url=manager.endpoint,
                api_key=manager.api_key
            )
            self.model_name = model_info.id
            print(f"✅ Foundry Local initialized with model: {self.model_name}")
        except Exception as e:
            print(f"❌ Error initializing Foundry Local: {e}")
            raise

    def chat(self, messages: List[Dict[str, str]], max_tokens: int = 300, temperature: float = 0.4) -> str:
        """Send chat completion request to the model."""
        try:
            response = self.client.chat.completions.create(
                model=self.model_name,
                messages=messages,
                max_tokens=max_tokens,
                temperature=temperature
            )
            return response.choices[0].message.content
        except Exception as e:
            return f"Error generating response: {str(e)}"


# Global client instance
_client = FoundryClient()


class RetrievalAgent:
    """Agent specialized in retrieving relevant information from knowledge sources."""

    SYSTEM = """You are a specialized retrieval agent. Your job is to extract and retrieve
    the most relevant information from knowledge sources based on a given query. Focus on key facts,
    data points, and contextual information that would be useful for decision-making."""

    def run(self, query: str) -> str:
        """Retrieve relevant information based on the query."""
        messages = [
            {"role": "system", "content": self.SYSTEM},
            {"role": "user", "content": f"Query: {query}\n\nRetrieve the most relevant key facts, data points, and contextual information that would help answer this query or support decision-making around it."}
        ]
        return _client.chat(messages)


class ReasoningAgent:
    """Agent specialized in step-by-step analysis and reasoning."""

    SYSTEM = """You are a specialized reasoning agent. Your job is to analyze inputs
    step-by-step and produce structured, logical conclusions. Break down complex problems
    into manageable parts and provide clear reasoning for your conclusions."""

    def run(self, context: str, question: str) -> str:
        """Analyze context and question to produce structured conclusions."""
        messages = [
            {"role": "system", "content": self.SYSTEM},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}\n\nAnalyze this step-by-step and provide a structured, logical conclusion with clear reasoning."}
        ]
        return _client.chat(messages, max_tokens=400)


class ExecutionAgent:
    """Agent specialized in creating actionable execution plans."""

    SYSTEM = """You are a specialized execution agent. Your job is to transform decisions
    and conclusions into concrete, actionable steps. Always format your response as valid JSON
    with an array of action items. Each action should be specific, measurable, and achievable."""

    def run(self, decision: str) -> str:
        """Transform decision into actionable steps in JSON format."""
        messages = [
            {"role": "system", "content": self.SYSTEM},
            {"role": "user", "content": f"Decision/Conclusion:\n{decision}\n\nCreate 3-5 specific, actionable steps to implement this decision. Format as JSON with this structure:\n{{\"actions\": [{{\"step\": 1, \"description\": \"...\", \"priority\": \"high/medium/low\", \"timeline\": \"...\"}}]}}"}
        ]
        return _client.chat(messages, max_tokens=400, temperature=0.3)

Step 2) Build the coordinator with advanced features
# agents/coordinator.py
from .specialists import RetrievalAgent, ReasoningAgent, ExecutionAgent
from typing import Dict, Any
import time
import json


class Coordinator:
    """Multi-agent coordinator that orchestrates specialist agents to handle complex tasks."""

    def __init__(self):
        """Initialize the coordinator with specialist agents."""
        self.retrieval = RetrievalAgent()
        self.reasoning = ReasoningAgent()
        self.execution = ExecutionAgent()

    def handle(self, user_goal: str) -> Dict[str, Any]:
        """
        Orchestrate multiple agents to handle a complex user goal.

        Args:
            user_goal: The user's high-level goal or request

        Returns:
            Dictionary containing the goal, context, decision, and actions
        """
        print(f"🎯 **Coordinator:** Processing goal: {user_goal}")
        print("=" * 60)
        start_time = time.time()

        # Step 1: Retrieve relevant context
        print("📚 **Step 1:** Retrieving context...")
        context = self.retrieval.run(user_goal)
        print(f"   ✅ Context retrieved ({len(context)} chars)")

        # Step 2: Analyze and reason about the context
        print("🧠 **Step 2:** Analyzing and reasoning...")
        decision = self.reasoning.run(context, user_goal)
        print(f"   ✅ Analysis completed ({len(decision)} chars)")

        # Step 3: Create actionable execution plan
        print("⚡ **Step 3:** Creating execution plan...")
        actions = self.execution.run(decision)
        print(f"   ✅ Execution plan created ({len(actions)} chars)")

        end_time = time.time()
        processing_time = end_time - start_time
        result = {
            "goal": user_goal,
            "context": context,
            "decision": decision,
            "actions": actions,
            "agent_flow": ["retrieval", "reasoning", "execution"],
            "processing_time": processing_time,
            "timestamp": time.strftime("%Y-%m-%d %H:%M:%S")
        }
        print(f"✅ **Coordination Complete** (⏱️ {processing_time:.2f}s)")
        return result

    def handle_with_feedback(self, user_goal: str, feedback_rounds: int = 1) -> Dict[str, Any]:
        """
        Handle a goal with multiple feedback rounds for refinement.

        Args:
            user_goal: The user's high-level goal or request
            feedback_rounds: Number of feedback rounds to perform

        Returns:
            Dictionary containing the refined result
        """
        result = self.handle(user_goal)
        for round_num in range(feedback_rounds):
            print(f"\n🔄 **Feedback Round {round_num + 1}:**")
            print("-" * 40)
            # Use reasoning agent to refine the execution plan
            refinement_prompt = f"""
            Original Goal: {user_goal}
            Current Decision: {result['decision']}
            Current Actions: {result['actions']}
            Review the above and suggest improvements or refinements to make the execution plan more effective.
            """
            refined_decision = self.reasoning.run(result['context'], refinement_prompt)
            refined_actions = self.execution.run(refined_decision)
            result['decision'] = refined_decision
            result['actions'] = refined_actions
            result['refinement_rounds'] = round_num + 1
            print(f"   ✅ Round {round_num + 1} refinement completed")
        return result


def main():
    """Main function demonstrating the multi-agent coordinator."""
    print("🤖 **Multi-Agent Coordinator Demo**")
    print("=" * 50)

    # Create coordinator
    coord = Coordinator()

    # Example goals
    example_goals = [
        "Create a plan to onboard 5 new customers this month",
        "Develop a strategy to improve team productivity by 20%",
        "Design a customer feedback collection system"
    ]

    # Process example with feedback
    goal = example_goals[0]
    print(f"🎯 **Processing Goal:** {goal}")
    print("-" * 50)
    try:
        # Basic processing
        result = coord.handle(goal)
        # With feedback refinement (this runs the base pipeline again internally)
        refined_result = coord.handle_with_feedback(goal, feedback_rounds=1)
        print("\n📊 **Final Result:**")
        print("=" * 50)
        print(f"**Goal:** {refined_result['goal']}")
        print(f"**Processing Time:** {refined_result['processing_time']:.2f}s")
        # Try to parse actions as JSON
        try:
            actions_json = json.loads(refined_result['actions'])
            print(f"\n**Formatted Actions:**")
            print(json.dumps(actions_json, indent=2))
        except (json.JSONDecodeError, TypeError):
            print(f"\n**Actions:** {refined_result['actions']}")
    except Exception as e:
        print(f"❌ **Error:** {e}")
        print("\nPlease ensure Foundry Local is running with a model loaded.")


if __name__ == "__main__":
    main()

Step 3) Validate against Foundry Local and run samples
REM Confirm the local endpoint and model are available
foundry model list
foundry model run phi-4-mini
REM Port 8000 is an example; use the port your Foundry Local service reports
curl http://localhost:8000/v1/models
REM Run the coordinator from Module08 directory
cd Module08
python -m samples.05.agents.coordinator
REM Or explore the comprehensive Jupyter notebook
jupyter notebook samples/05/multi_agent_orchestration.ipynb

📚 Local Sample References:
- Main Implementation: samples/05/agents/specialists.py and samples/05/agents/coordinator.py
- Comprehensive Examples: samples/05/multi_agent_orchestration.ipynb
- Setup Instructions: samples/05/README.md
🔗 Related Foundry Local Samples:
Guidelines:
- Implement retries and timeouts between agents
- Add a small in-memory store (dict) for conversation/thread state
- Introduce rate-limiting when chaining multiple calls
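The second and third guidelines can be sketched with the standard library alone. This is a minimal illustration under stated assumptions: `ThreadStore` and `MinIntervalLimiter` are hypothetical names, the store lives only for the life of the process, and the limiter simply enforces a minimum gap between calls rather than a full token-bucket policy.

```python
import time
from collections import defaultdict
from typing import Callable, Dict, List, Optional


class ThreadStore:
    """In-memory conversation state keyed by thread id (lost on restart)."""

    def __init__(self, max_messages: int = 20):
        self._threads: Dict[str, List[Dict[str, str]]] = defaultdict(list)
        self.max_messages = max_messages  # assumed to be >= 1

    def append(self, thread_id: str, role: str, content: str) -> None:
        history = self._threads[thread_id]
        history.append({"role": role, "content": content})
        # Keep only the most recent messages so prompts stay bounded.
        del history[:-self.max_messages]

    def history(self, thread_id: str) -> List[Dict[str, str]]:
        """Return a copy so callers cannot mutate stored state."""
        return list(self._threads[thread_id])


class MinIntervalLimiter:
    """Ensure at least `min_interval` seconds separate consecutive calls."""

    def __init__(self, min_interval: float,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None

    def wait(self) -> float:
        """Sleep if needed; return how long the caller was delayed."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self.min_interval - (now - self._last))
        if delay:
            self._sleep(delay)
        self._last = now + delay
        return delay
```

In a chained pipeline, calling `limiter.wait()` before each agent call and passing `store.history(thread_id)` as the message list would be one way to combine the two. The injectable `clock` and `sleep` parameters exist only to make the limiter testable without real delays.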
Track prompts, responses, and errors locally, while enforcing data hygiene in your agent stack.
Step 1) Lightweight request logging (optional)
Note: The following helper is not included by default. Create infra/obs.py if you want local JSON logging for experiments.
# infra/obs.py
import json
import os
from datetime import datetime, timezone

LOG_DIR = os.getenv("FOUNDRY_AGENT_LOG_DIR", "./agent_logs")
os.makedirs(LOG_DIR, exist_ok=True)


def log_event(kind: str, payload: dict) -> None:
    """Write one JSON file per event, named <UTC timestamp>_<kind>.json."""
    # Microsecond precision keeps events of the same kind logged within
    # one second from overwriting each other.
    ts = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%S%fZ")
    path = os.path.join(LOG_DIR, f"{ts}_{kind}.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(payload, f, ensure_ascii=False, indent=2)

Integrate logging into agents (optional):
# in agents/specialists.py, inside FoundryClient.chat(...)
from infra.obs import log_event

# ... after self.client.chat.completions.create(...) returns `response`
log_event("chat_request", {"model": self.model_name, "message_count": len(messages)})
# model_dump() converts the OpenAI client's response object to a plain dict
log_event("chat_response", response.model_dump())
return response.choices[0].message.content

Step 2) Validate availability and basic health via CLI
REM Ensure Foundry Local is running a model
foundry model list
foundry model run phi-4-mini
REM Validate the OpenAI-compatible endpoint (port 8000 is an example; use the port your service reports)
curl http://localhost:8000/v1/models

Step 3) Redaction and PII hygiene
- Before sending messages to the model, strip or hash sensitive fields (emails, phone numbers, IDs)
- Keep raw source data on-device, only pass necessary context strings
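Where stripping a value would lose the ability to tell two records apart, hashing is an alternative. The sketch below is a minimal illustration using a keyed HMAC-SHA256 from the standard library. The `pseudonymize` and `hash_emails` names, the `REDACTION_KEY` environment variable, and the 12-character truncation are assumptions made for this example, not project conventions.

```python
import hashlib
import hmac
import os
import re

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")

# Assumption: the key comes from an environment variable.
# The fallback value is for local demos only.
KEY = os.getenv("REDACTION_KEY", "dev-only-key").encode("utf-8")


def pseudonymize(value: str, key: bytes = KEY) -> str:
    """Map a sensitive value to a stable token that cannot be reversed without the key."""
    digest = hmac.new(key, value.lower().encode("utf-8"), hashlib.sha256).hexdigest()
    return f"[ID_{digest[:12]}]"


def hash_emails(text: str, key: bytes = KEY) -> str:
    """Replace each email address with its pseudonym."""
    return EMAIL_RE.sub(lambda m: pseudonymize(m.group(0), key), text)


if __name__ == "__main__":
    print(hash_emails("Escalate to ana@example.com and cc ANA@example.com"))
```

Because the token is derived from the lowercased value, the same address always maps to the same token, so the model can still reason about "the same person" without seeing the address itself.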
Example redaction helper:
# infra/redact.py
import re

EMAIL_RE = re.compile(r"[\w\.-]+@[\w\.-]+")
PHONE_RE = re.compile(r"\+?\d[\d\s\-]{7,}\d")


def sanitize(text: str) -> str:
    """Replace email addresses and phone-like digit runs with placeholders."""
    text = EMAIL_RE.sub("[REDACTED_EMAIL]", text)
    text = PHONE_RE.sub("[REDACTED_PHONE]", text)
    return text

Use in agents:
from infra.redact import sanitize

# user_goal = sanitize(user_goal)
# context = sanitize(context)

Step 4) Circuit breakers and error handling
- Wrap each agent call with try/except and exponential backoff
- Short-circuit the pipeline on repeated failures
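The second bullet, short-circuiting on repeated failures, can be sketched as a small circuit breaker. This is a minimal illustration, not a library API: `CircuitBreaker`, `CircuitOpenError`, and the threshold and cooldown defaults are hypothetical choices.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(RuntimeError):
    """Raised when calls are being rejected without running."""


class CircuitBreaker:
    """Reject calls for `cooldown` seconds after `threshold` consecutive failures."""

    def __init__(self, threshold: int = 3, cooldown: float = 30.0,
                 clock: Callable[[], float] = time.monotonic):
        self.threshold = threshold
        self.cooldown = cooldown
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def call(self, func: Callable[[], T]) -> T:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.cooldown:
                raise CircuitOpenError("circuit open; skipping call")
            # Cooldown elapsed: allow one trial call (half-open state).
            # A single further failure reopens the circuit.
            self._opened_at = None
            self._failures = self.threshold - 1
        try:
            result = func()
        except Exception:
            self._failures += 1
            if self._failures >= self.threshold:
                self._opened_at = self._clock()
            raise
        self._failures = 0  # any success resets the failure count
        return result
```

A breaker only helps if the wrapped call raises on failure. `FoundryClient.chat` in this module returns an error string instead of raising, so it would need to re-raise (or the wrapper would need to inspect the return value) before this pattern has any effect. Backoff between attempts is handled by the retry helper that follows.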
import time


def with_retry(func, retries=3, base_delay=0.5):
    """Call func(), retrying with exponential backoff; re-raise after the last attempt."""
    for i in range(retries):
        try:
            return func()
        except Exception:
            if i == retries - 1:
                raise
            time.sleep(base_delay * (2 ** i))

Step 5) Local audit trail and export
- Store JSON logs under ./agent_logs
- Periodically compress and rotate logs
- Export summaries for reviews (counts, avg latency, error rates)
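A summary export along these lines can be sketched as follows. The function assumes the `<timestamp>_<kind>.json` file naming used by the optional logging helper in Step 1. The `latency_s` and `error` payload fields are hypothetical: the helper does not record them on its own, so they only contribute if your agents add them to the payload.

```python
import json
from collections import Counter
from pathlib import Path
from typing import Any, Dict, List


def summarize_logs(log_dir: str = "./agent_logs") -> Dict[str, Any]:
    """Aggregate event counts, mean latency, and error rate from JSON log files."""
    kinds: Counter = Counter()
    latencies: List[float] = []
    errors = 0
    total = 0
    for path in sorted(Path(log_dir).glob("*.json")):
        try:
            payload = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError):
            continue  # skip unreadable or partially written files
        total += 1
        # Everything after the first underscore in the file stem is the event kind.
        kinds[path.stem.split("_", 1)[-1]] += 1
        if isinstance(payload, dict):
            # Assumed optional fields (see lead-in).
            if isinstance(payload.get("latency_s"), (int, float)):
                latencies.append(payload["latency_s"])
            if payload.get("error"):
                errors += 1
    return {
        "events": total,
        "by_kind": dict(kinds),
        "avg_latency_s": sum(latencies) / len(latencies) if latencies else None,
        "error_rate": errors / total if total else 0.0,
    }


if __name__ == "__main__":
    print(json.dumps(summarize_logs(), indent=2))
```

The returned dictionary can be written to a file or pasted into a review document as-is.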
Step 6) Cross-check with Microsoft Learn docs
- Foundry Local serves an OpenAI-compatible API (validated with curl /v1/models)
- Use foundry model run <name> to confirm model availability
- Follow official guidance for client integration and sample apps (Open WebUI/how-tos)
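The same endpoint check can be scripted. The sketch below assumes only that the endpoint returns the OpenAI-style model listing shape (`{"data": [{"id": ...}]}`). `parse_model_ids` and `list_model_ids` are hypothetical helper names, and the base URL is a placeholder taken from the curl example that you would replace with your local endpoint.

```python
import json
import urllib.request
from typing import Any, Dict, List


def parse_model_ids(payload: Dict[str, Any]) -> List[str]:
    """Extract model ids from an OpenAI-style model listing."""
    return [m["id"] for m in payload.get("data", []) if "id" in m]


def list_model_ids(base_url: str, timeout: float = 5.0) -> List[str]:
    """GET <base_url>/models and return the ids it reports."""
    url = base_url.rstrip("/") + "/models"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_model_ids(json.load(resp))


if __name__ == "__main__":
    # Placeholder URL; adjust host and port to match your service.
    try:
        print(list_model_ids("http://localhost:8000/v1"))
    except OSError as exc:  # URLError and socket timeouts are OSError subclasses
        print(f"Endpoint not reachable: {exc}")
```

An empty list or a connection error at this point usually means no model is loaded or the service is not running, which is cheaper to discover here than in the middle of a multi-agent run.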
References
- Foundry Local Documentation: https://learn.microsoft.com/en-us/azure/ai-foundry/foundry-local/
- Azure AI Agents: https://learn.microsoft.com/en-us/azure/ai-services/agents/overview
- Local Samples:
  - Multi-Agent Orchestration: Module08/samples/05/multi_agent_orchestration.ipynb
  - Agent Implementation: Module08/samples/05/agents/
  - Sample README: Module08/samples/05/README.md
- Official Microsoft Samples:
  - Integration Examples: https://learn.microsoft.com/en-us/azure/ai-foundry/foundry-local/how-to/how-to-chat-application-with-open-web-ui
- Explore Azure AI Agents for cloud-hosted orchestration
- Add enterprise connectors (Microsoft Graph, Search, databases)