Section 02: Function Calling in Small Language Models (SLMs)

What is Function Calling?
How Function Calling Works
Application Scenarios
Setting Up Function Calling with Phi-4-mini and Ollama
Working with Qwen3 Function Calling
Foundry Local Integration
Best Practices and Troubleshooting
Advanced Examples

What is Function Calling?

Function calling lets Small Language Models (SLMs) interact with external tools, APIs, and services. Instead of being limited to their training data, SLMs can:

Connect to external APIs (weather services, databases, search engines)
Execute specific functions based on user requests
Retrieve real-time information from various sources
Perform computational tasks through specialized tools
Chain multiple operations together for complex workflows

This capability transforms SLMs from static text generators into dynamic AI agents that can perform real-world tasks.

How Function Calling Works

The function calling process follows a systematic workflow:

1. Tool Integration

External Tools: SLMs can connect to weather APIs, databases, web services, and other external systems
Function Definitions: Each tool is defined with specific parameters, input/output formats, and descriptions
API Compatibility: Tools are integrated through standardized interfaces (REST APIs, SDKs, etc.)

2. Function Definition

Functions are defined with three key components:

{
  "name": "function_name",
  "description": "Clear description of what the function does",
  "parameters": {
    "parameter_name": {
      "description": "What this parameter represents",
      "type": "data_type",
      "default": "default_value"
    }
  }
}

3. Intent Detection

Natural Language Processing: The SLM analyzes user input to understand intent
Function Matching: Determines which function(s) are needed to fulfill the request
Parameter Extraction: Identifies and extracts required parameters from the user's message

4. JSON Output Generation

The SLM generates structured JSON containing:

Function name to call
Required parameters with appropriate values
Execution context and metadata

5. External Execution

Parameter Validation: Ensures all required parameters are present and correctly formatted
Function Execution: The application executes the specified function with provided parameters
Error Handling: Manages failures, timeouts, and invalid responses

6. Response Integration

Result Processing: The function output is returned to the SLM
Context Integration: The SLM incorporates the results into its response
User Communication: Presents the information in a natural, conversational format
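
The steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: `get_weather`, the `TOOLS` registry, and the hard-coded `model_output` string are assumptions standing in for a real tool, a real registry, and a real model response.

```python
import json

def get_weather(location: str, units: str = "celsius") -> dict:
    """Hypothetical tool; a real one would call a weather API."""
    return {"location": location, "temperature": 22, "units": units}

# Hypothetical registry: function name -> handler and required parameters
TOOLS = {"get_weather": {"handler": get_weather, "required": ["location"]}}

def handle_model_output(raw: str) -> dict:
    """Steps 4-5: parse the model's JSON, validate it, and run the function."""
    try:
        call = json.loads(raw)
    except json.JSONDecodeError as exc:
        return {"error": f"Model output is not valid JSON: {exc}"}
    if not isinstance(call, dict):
        return {"error": "Expected a JSON object with 'name' and 'arguments'"}
    name = call.get("name")
    tool = TOOLS.get(name) if isinstance(name, str) else None
    if tool is None:
        return {"error": f"Unknown function: {name}"}
    args = call.get("arguments") or {}
    if not isinstance(args, dict):
        return {"error": "'arguments' must be a JSON object"}
    missing = [p for p in tool["required"] if p not in args]
    if missing:
        return {"error": f"Missing parameters: {missing}"}
    try:
        return {"result": tool["handler"](**args)}
    except Exception as exc:  # report tool failures instead of crashing
        return {"error": str(exc)}

# Step 4 output as a model might produce it
model_output = '{"name": "get_weather", "arguments": {"location": "Tokyo"}}'
outcome = handle_model_output(model_output)

# Step 6: append the result to the conversation for the model's next turn
messages = [{"role": "user", "content": "What's the weather in Tokyo?"}]
messages.append({"role": "tool", "content": json.dumps(outcome)})
print(messages[-1])
```

Returning errors as data, rather than raising, lets the application pass the failure back to the model so it can retry or explain the problem to the user.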

Application Scenarios

Data Retrieval

Convert natural language queries into structured API calls:

"Show my recent orders" → Database query with user ID and date filters
"What's the weather in Tokyo?" → Weather API call with location parameter
"Find emails from John last week" → Email service query with sender and date filters

Operation Execution

Transform user requests into specific function calls:

"Schedule a meeting for tomorrow at 2 PM" → Calendar API integration
"Send a message to the team" → Communication platform API
"Create a backup of my files" → File system operation

Computational Tasks

Handle complex mathematical or logical operations:

"Calculate compound interest on $10,000 at 5% for 10 years" → Financial calculation function
"Analyze this dataset for trends" → Statistical analysis tools
"Optimize this route for delivery" → Route optimization algorithms

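As a concrete case, the compound interest request could be backed by a small function like the one below. It is a sketch: the function name, its parameters, and the default of annual compounding are assumptions, since the request itself does not say how often interest compounds.

```python
def compound_interest(principal: float, annual_rate: float, years: int,
                      compounds_per_year: int = 1) -> dict:
    """Apply A = P * (1 + r/n) ** (n * t) and report the interest earned."""
    periods = compounds_per_year * years
    final_amount = principal * (1 + annual_rate / compounds_per_year) ** periods
    return {
        "final_amount": round(final_amount, 2),
        "interest_earned": round(final_amount - principal, 2),
    }

# Arguments the model would extract from:
# "Calculate compound interest on $10,000 at 5% for 10 years"
print(compound_interest(principal=10_000, annual_rate=0.05, years=10))
```

With annual compounding the balance grows to about $16,288.95, so the interest earned is about $6,288.95.
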
Data Processing Workflows

Chain multiple function calls for complex operations:

Retrieve data from multiple sources
Parse and validate the information
Transform data into required format
Store results in appropriate systems
Generate reports or visualizations
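
A chained workflow of this kind might look like the following sketch. Every function and record here is hypothetical; the storage step is left out to keep the example self-contained, and a real pipeline would read from and write to actual systems.

```python
import json

def retrieve() -> list:
    """Hypothetical source; stands in for API or database reads."""
    return ['{"sku": "A1", "qty": "3"}', "not json", '{"sku": "B2", "qty": "5"}']

def parse_and_validate(raw_records: list) -> list:
    """Keep only records that parse as JSON objects with the expected keys."""
    valid = []
    for raw in raw_records:
        try:
            record = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip malformed input instead of failing the whole run
        if isinstance(record, dict) and {"sku", "qty"} <= record.keys():
            valid.append(record)
    return valid

def transform(records: list) -> dict:
    """Convert records into a SKU -> integer quantity mapping."""
    return {record["sku"]: int(record["qty"]) for record in records}

def report(quantities: dict) -> str:
    """Produce a one-line summary of the transformed data."""
    return f"{len(quantities)} SKUs, {sum(quantities.values())} units"

# Each step consumes the previous step's output
summary = report(transform(parse_and_validate(retrieve())))
print(summary)
```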

UI/UX Integration

Enable dynamic interface updates:

"Show sales data on the dashboard" → Chart generation and display
"Update the map with new locations" → Geospatial data integration
"Refresh the inventory display" → Real-time data synchronization

Setting Up Function Calling with Phi-4-mini and Ollama

Microsoft's Phi-4-mini supports both single and parallel function calling through Ollama. Here's how to set it up:

Prerequisites

Ollama version 0.5.13 or higher
Phi-4-mini model (recommended: phi4-mini:3.8b-fp16)

Installation Steps

1. Install and Run Phi-4-mini

# Pull the model if it is not already present and start an interactive session (type /bye to exit)
ollama run phi4-mini:3.8b-fp16

# Verify the model is available
ollama list

2. Create Custom ModelFile Template

Because of current limitations in Ollama's default template for this model, function calling needs a custom Modelfile with the following template:

TEMPLATE """
{{- if .Messages }}
{{- if or .System .Tools }}<|system|>
{{ if .System }}{{ .System }}
{{- end }}
In addition to plain text responses, you can choose to call one or more of the provided functions.
Use the following rule to decide when to call a function:
* if the response can be generated from your internal knowledge (e.g., as in the case of queries like "What is the capital of Poland?"), do so
* if you need external information that can be obtained by calling one or more of the provided functions, generate function calls
If you decide to call functions:
* prefix function calls with functools marker (no closing marker required)
* all function calls should be generated in a single JSON list formatted as functools[{"name": [function name], "arguments": [function arguments as JSON]}, ...]
* follow the provided JSON schema. Do not hallucinate arguments or values. Do not blindly copy values from the provided samples
* respect the argument type formatting. E.g., if the type is number and format is float, write value 7 as 7.0
* make sure you pick the right functions that match the user intent
Available functions as JSON spec:
{{- if .Tools }}
{{ .Tools }}
{{- end }}<|end|>
{{- end }}
{{- range .Messages }}
{{- if ne .Role "system" }}<|{{ .Role }}|>
{{- if and .Content (eq .Role "tools") }}
{"result": {{ .Content }}}
{{- else if .Content }}
{{ .Content }}
{{- else if .ToolCalls }}
functools[
{{- range .ToolCalls }}{{ "{" }}"name": "{{ .Function.Name }}", "arguments": {{ .Function.Arguments }}{{ "}" }}
{{- end }}]
{{- end }}<|end|>
{{- end }}
{{- end }}<|assistant|>
{{ else }}
{{- if .System }}<|system|>
{{ .System }}<|end|>{{ end }}{{ if .Prompt }}<|user|>
{{ .Prompt }}<|end|>{{ end }}<|assistant|>
{{ end }}{{ .Response }}{{ if .Response }}<|user|>{{ end }}
"""

3. Create the Custom Model

# Save the template above as 'Modelfile', with the line 'FROM phi4-mini:3.8b-fp16' at the top, and run:
ollama create phi4-mini-fc:3.8b-fp16 -f ./Modelfile

Single Function Calling Example

import requests

# Define the tool in the OpenAI-style schema that Ollama's /api/chat endpoint accepts
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather information for a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city or location name"
                    },
                    "units": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "Temperature units (celsius or fahrenheit)"
                    }
                },
                "required": ["location"]
            }
        }
    }
]

messages = [
    {"role": "system", "content": "You are a helpful weather assistant"},
    {"role": "user", "content": "What's the weather like in London today?"}
]

# Pass the tools in the top-level "tools" field of the request, not inside a message;
# Ollama renders them into the template's .Tools variable
response = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "phi4-mini-fc:3.8b-fp16",
        "messages": messages,
        "tools": tools,
        "stream": False
    },
    timeout=120
)
response.raise_for_status()

# Parsed tool calls, if any, appear under message["tool_calls"];
# otherwise inspect message["content"]
print(response.json()["message"])
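
Once a response comes back, the application still has to run the requested function. The sketch below assumes the response carries structured calls under `message.tool_calls`, each with a `function.name` and `function.arguments`; the `get_weather` handler, the `REGISTRY` mapping, and the `sample` response are hypothetical stand-ins so the example runs without a server.

```python
import json

def get_weather(location: str, units: str = "celsius") -> dict:
    """Hypothetical stand-in for a real weather lookup."""
    return {"location": location, "units": units, "temperature": 18}

# Map the tool names the model may emit to local Python callables
REGISTRY = {"get_weather": get_weather}

def run_tool_calls(chat_response: dict) -> list:
    """Execute every tool call found in a chat response dictionary."""
    results = []
    for call in chat_response.get("message", {}).get("tool_calls") or []:
        function = call["function"]
        args = function.get("arguments") or {}
        if isinstance(args, str):  # some servers send arguments as a JSON string
            args = json.loads(args)
        handler = REGISTRY.get(function["name"])
        if handler is None:
            result = {"error": f"Unknown function: {function['name']}"}
        else:
            result = handler(**args)
        results.append({"name": function["name"], "result": result})
    return results

# Hand-written sample shaped like a chat response that contains one tool call
sample = {
    "message": {
        "role": "assistant",
        "content": "",
        "tool_calls": [
            {"function": {"name": "get_weather", "arguments": {"location": "London"}}}
        ],
    }
}
print(run_tool_calls(sample))
```

Each result can then be sent back to the model as a follow-up message so it can phrase the final answer.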

Parallel Function Calling Example

import requests

def make_tool(name, description, params):
    """Build an OpenAI-style tool entry; every parameter is a required string."""
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": {
                "type": "object",
                "properties": {
                    key: {"type": "string", "description": text}
                    for key, text in params.items()
                },
                "required": list(params)
            }
        }
    }

# Define multiple tools so the model can request them in one response
AGENT_TOOLS = [
    make_tool("booking_flight", "Book a flight ticket", {
        "departure": "Departure airport code",
        "destination": "Destination airport code",
        "outbound_date": "Departure date (YYYY-MM-DD)",
        "return_date": "Return date (YYYY-MM-DD)"
    }),
    make_tool("booking_hotel", "Book a hotel room", {
        "city": "City name for hotel booking",
        "check_in_date": "Check-in date (YYYY-MM-DD)",
        "check_out_date": "Check-out date (YYYY-MM-DD)"
    })
]

SYSTEM_PROMPT = "You are my travel agent with some tools available."

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {
        "role": "user",
        "content": "I need to travel from London to New York from March 21 2025 to March 27 2025. Please book both flight and hotel."
    }
]

# Tools go in the top-level "tools" field; the model can then generate
# both function calls in a single response
response = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "phi4-mini-fc:3.8b-fp16",
        "messages": messages,
        "tools": AGENT_TOOLS,
        "stream": False
    },
    timeout=120
)
response.raise_for_status()

print(response.json()["message"])
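
If the server returns the model's raw text instead of structured tool calls, the output follows the `functools[...]` convention set by the template earlier in this section. A minimal parser for that convention might look like this; the function name and the sample string are assumptions, and real model output may need more defensive handling.

```python
import json

MARKER = "functools"

def parse_functools(text: str) -> list:
    """Parse 'functools[{...}, ...]' into a list of call dicts ([] if absent)."""
    start = text.find(MARKER)
    if start == -1:
        return []
    payload = text[start + len(MARKER):].lstrip()
    try:
        # raw_decode reads one JSON value and ignores any trailing text
        calls, _ = json.JSONDecoder().raw_decode(payload)
    except json.JSONDecodeError:
        return []
    return calls if isinstance(calls, list) else [calls]

# Hand-written sample in the format the template asks the model to produce
raw_output = (
    'functools[{"name": "booking_flight", "arguments": {"departure": "LHR", '
    '"destination": "JFK", "outbound_date": "2025-03-21", "return_date": "2025-03-27"}}, '
    '{"name": "booking_hotel", "arguments": {"city": "New York", '
    '"check_in_date": "2025-03-21", "check_out_date": "2025-03-27"}}]'
)

for call in parse_functools(raw_output):
    print(call["name"], call["arguments"])
```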

Working with Qwen3 Function Calling

Qwen3 supports function calling, and the Qwen-Agent framework handles the tool-call prompt templates and output parsing for you. Here's how to implement it:

Using Qwen-Agent Framework

Qwen-Agent provides a high-level framework that simplifies function calling implementation:

Installation

pip install -U "qwen-agent[gui,rag,code_interpreter,mcp]"

Basic Setup

import os
from qwen_agent.agents import Assistant

# Configure the LLM
llm_cfg = {
    'model': 'Qwen3-8B',
    # Option 1: Use Alibaba Model Studio
    'model_type': 'qwen_dashscope',
    'api_key': os.getenv('DASHSCOPE_API_KEY'),
    
    # Option 2: Use local deployment
    # 'model_server': 'http://localhost:8000/v1',
    # 'api_key': 'EMPTY',
    
    # Optional configuration for thinking mode
    'generate_cfg': {
        'thought_in_content': True,  # Include reasoning in response
    }
}

# Define tools using MCP (Model Context Protocol)
tools = [
    {
        'mcpServers': {
            'time': {
                'command': 'uvx',
                'args': ['mcp-server-time', '--local-timezone=Asia/Shanghai']
            },
            'fetch': {
                'command': 'uvx', 
                'args': ['mcp-server-fetch']
            }
        }
    },
    'code_interpreter',  # Built-in code execution tool
]

# Create the assistant
bot = Assistant(llm=llm_cfg, function_list=tools)

# Example usage
messages = [
    {
        'role': 'user', 
        'content': 'What time is it now? Also, fetch the latest news from https://example.com/news'
    }
]

# Generate response with function calling
for response in bot.run(messages=messages):
    print(response)

Custom Function Implementation

You can also define custom functions for Qwen3:

import json
from qwen_agent.tools.base import BaseTool

class WeatherTool(BaseTool):
    name = 'get_weather'  # BaseTool requires a non-empty tool name
    description = 'Get weather information for a specific location'
    parameters = [
        {
            'name': 'location',
            'type': 'string', 
            'description': 'City or location name',
            'required': True
        },
        {
            'name': 'units',
            'type': 'string',
            'description': 'Temperature units (celsius or fahrenheit)',
            'required': False,
            'default': 'celsius'
        }
    ]
    
    def call(self, params: str, **kwargs) -> str:
        """Execute the weather lookup"""
        params_dict = json.loads(params)
        location = params_dict.get('location')
        units = params_dict.get('units', 'celsius')
        
        # Simulate weather API call
        weather_data = {
            'location': location,
            'temperature': '22°C' if units == 'celsius' else '72°F',
            'condition': 'Partly cloudy',
            'humidity': '65%'
        }
        
        return json.dumps(weather_data)

# Use the custom tool (Assistant and llm_cfg come from the Basic Setup example)
tools = [WeatherTool()]
bot = Assistant(llm=llm_cfg, function_list=tools)

messages = [{'role': 'user', 'content': 'What\'s the weather in Tokyo?'}]
response = bot.run(messages=messages)
print(list(response)[-1])

Advanced Qwen3 Features

Thinking Mode Control

Qwen3 supports dynamic switching between thinking and non-thinking modes:

# Enable thinking mode for complex reasoning
messages = [
    {
        'role': 'user',
        'content': '/think Solve this complex math problem: If a train travels 120 km in 1.5 hours, and another train travels 200 km in 2.5 hours, which train is faster and by how much?'
    }
]

# Disable thinking mode for simple queries
messages = [
    {
        'role': 'user', 
        'content': '/no_think What is the capital of France?'
    }
]
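
A small helper keeps the switch consistent across an application. This is only a sketch: the helper name is hypothetical, and it places the switch at the start of the message as in the examples above.

```python
def with_thinking(content: str, enabled: bool) -> dict:
    """Build a user message that carries Qwen3's /think or /no_think switch."""
    switch = "/think" if enabled else "/no_think"
    return {"role": "user", "content": f"{switch} {content}"}

messages = [
    with_thinking("Which is faster: 120 km in 1.5 hours or 200 km in 2.5 hours?", enabled=True),
]
print(messages[0]["content"])
```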

Multi-step Function Calling

Qwen3 excels at chaining multiple function calls:

# Complex workflow example
messages = [
    {
        'role': 'user',
        'content': '''
        I need to prepare for a business meeting:
        1. Check my calendar for conflicts tomorrow
        2. Get weather forecast for the meeting location (San Francisco)
        3. Find recent news about the client company (TechCorp)
        4. Calculate travel time from my office to their headquarters
        '''
    }
]

# Qwen3 will automatically determine the sequence of function calls needed

Foundry Local Integration

Microsoft's Foundry Local runs models on your own device and exposes an OpenAI-compatible API, so prompts and data stay on the machine.

Setup and Installation

Windows

Download the installer from the Foundry Local releases page and follow the installation instructions.

macOS

brew tap microsoft/foundrylocal
brew install foundrylocal

Basic Usage

import openai
from foundry_local import FoundryLocalManager

# Initialize with model alias
alias = "phi-3.5-mini"  # Or any supported model
manager = FoundryLocalManager(alias)

# Create OpenAI client pointing to local endpoint
client = openai.OpenAI(
    base_url=manager.endpoint,
    api_key=manager.api_key
)

# Define functions for the model
functions = [
    {
        "name": "calculate_tax",
        "description": "Calculate tax amount based on income and rate",
        "parameters": {
            "type": "object",
            "properties": {
                "income": {
                    "type": "number",
                    "description": "Annual income amount"
                },
                "tax_rate": {
                    "type": "number", 
                    "description": "Tax rate as decimal (e.g., 0.25 for 25%)"
                }
            },
            "required": ["income", "tax_rate"]
        }
    }
]

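# Make the function calling request; "tools" replaces the deprecated "functions" parameter
response = client.chat.completions.create(
    model=manager.get_model_info(alias).id,
    messages=[
        {
            "role": "user",
            "content": "Calculate the tax for someone earning $75,000 with a 22% tax rate"
        }
    ],
    tools=[{"type": "function", "function": fn} for fn in functions],
    tool_choice="auto"
)

# When the model calls a function, content is usually empty and the call is in tool_calls
message = response.choices[0].message
if message.tool_calls:
    for call in message.tool_calls:
        print(call.function.name, call.function.arguments)
else:
    print(message.content)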
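
The request above only asks the model which function to call; the application still has to run it. In OpenAI-compatible responses the arguments arrive as a JSON string, so a handler might look like the sketch below. `calculate_tax` and `dispatch` are hypothetical local implementations, and the argument string is a hand-written example of what a model could return.

```python
import json

def calculate_tax(income: float, tax_rate: float) -> dict:
    """Local implementation behind the 'calculate_tax' function definition."""
    return {"income": income, "tax_rate": tax_rate, "tax": round(income * tax_rate, 2)}

def dispatch(name: str, arguments_json: str) -> dict:
    """Run a function call given its name and its JSON-encoded arguments."""
    if name != "calculate_tax":
        return {"error": f"Unknown function: {name}"}
    try:
        args = json.loads(arguments_json)
        return calculate_tax(float(args["income"]), float(args["tax_rate"]))
    except (json.JSONDecodeError, KeyError, TypeError, ValueError) as exc:
        return {"error": f"Invalid arguments: {exc}"}

# Example argument string for the $75,000 at 22% request
print(dispatch("calculate_tax", '{"income": 75000, "tax_rate": 0.22}'))
```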

Advanced Foundry Local Features

Model Management

# List available models
foundry model list

# Download specific model
foundry model download phi-3.5-mini

# Run model interactively
foundry model run phi-3.5-mini

# Remove model from cache
foundry cache remove phi-3.5-mini

# Delete all cached models
foundry cache remove "*"

Performance Optimization

Foundry Local automatically selects the best model variant for your hardware:

CUDA GPU: Downloads GPU-optimized models
Qualcomm NPU: Uses NPU-accelerated variants
CPU-only: Selects CPU-optimized models

Best Practices and Troubleshooting

Function Definition Best Practices

1. Clear and Descriptive Naming

# Good
{
    "name": "get_stock_price",
    "description": "Retrieve current stock price for a given symbol"
}

# Avoid
{
    "name": "get_data", 
    "description": "Gets data"
}

2. Comprehensive Parameter Definitions

{
    "name": "send_email",
    "description": "Send an email message to specified recipients",
    "parameters": {
        "to": {
            "type": "array",
            "items": {"type": "string"},
            "description": "List of recipient email addresses",
            "required": True
        },
        "subject": {
            "type": "string",
            "description": "Email subject line",
            "required": True
        },
        "body": {
            "type": "string", 
            "description": "Email message content",
            "required": True
        },
        "priority": {
            "type": "string",
            "enum": ["low", "normal", "high"],
            "description": "Email priority level",
            "default": "normal",
            "required": False
        }
    }
}

3. Input Validation and Error Handling

import re

# Simple pattern for demonstration; it does not cover every valid address
EMAIL_PATTERN = re.compile(r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$')

def execute_function(function_name, parameters):
    try:
        # Validate required parameters
        if function_name == "send_email":
            missing = [key for key in ("to", "subject", "body") if not parameters.get(key)]
            if missing:
                return {"success": False, "error": f"Missing required parameters: {', '.join(missing)}"}

            # Validate email format
            for email in parameters["to"]:
                if not EMAIL_PATTERN.match(email):
                    return {"success": False, "error": f"Invalid email format: {email}"}

        # Execute function logic (perform_actual_function is your application's own dispatcher)
        result = perform_actual_function(function_name, parameters)
        return {"success": True, "data": result}

    except Exception as e:
        return {"success": False, "error": str(e)}

Common Issues and Solutions

Issue 1: Function Not Being Called

Symptoms: Model responds with text instead of calling the function

Solutions:

Check function description: Ensure it clearly matches the user's intent
Verify parameter definitions: Make sure all required parameters are properly defined
Review system prompt: Include clear instructions about when to use functions
Test with explicit requests: Try "Please use the weather function to get data for London"

Issue 2: Incorrect Parameters

Symptoms: Function called with wrong or missing parameters

Solutions:

Add parameter examples: Include sample values in parameter descriptions
Use enum constraints: Limit parameter values to specific options when possible
Implement fallback values: Provide sensible defaults for optional parameters

{
    "name": "book_restaurant",
    "parameters": {
        "cuisine": {
            "type": "string",
            "enum": ["italian", "chinese", "mexican", "american", "french"],
            "description": "Type of cuisine (example: 'italian' for Italian food)"
        },
        "party_size": {
            "type": "integer",
            "minimum": 1,
            "maximum": 20,
            "description": "Number of people (example: 4 for a family of four)"
        }
    }
}
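
Constraints in a definition only guide the model, so it is worth enforcing them again before executing the call. The sketch below checks arguments against the `enum`, `minimum`, and `maximum` fields of the definition above; the validator is a hypothetical helper that covers only these three constraint types, not a full JSON Schema implementation.

```python
BOOK_RESTAURANT = {
    "name": "book_restaurant",
    "parameters": {
        "cuisine": {
            "type": "string",
            "enum": ["italian", "chinese", "mexican", "american", "french"],
        },
        "party_size": {"type": "integer", "minimum": 1, "maximum": 20},
    },
}

def validate_arguments(schema: dict, args: dict) -> list:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    specs = schema["parameters"]
    for name, value in args.items():
        spec = specs.get(name)
        if spec is None:
            errors.append(f"Unexpected parameter: {name}")
            continue
        if "enum" in spec and value not in spec["enum"]:
            errors.append(f"{name} must be one of {spec['enum']}, got {value!r}")
        if spec.get("type") == "integer":
            # bool is a subclass of int in Python, so exclude it explicitly
            if isinstance(value, bool) or not isinstance(value, int):
                errors.append(f"{name} must be an integer, got {value!r}")
                continue
            if "minimum" in spec and value < spec["minimum"]:
                errors.append(f"{name} must be >= {spec['minimum']}")
            if "maximum" in spec and value > spec["maximum"]:
                errors.append(f"{name} must be <= {spec['maximum']}")
    return errors

print(validate_arguments(BOOK_RESTAURANT, {"cuisine": "thai", "party_size": 25}))
```

Returning the error list to the model gives it a chance to correct its own call on the next turn.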

Issue 3: Parallel Function Calling Failures

Symptoms: Only one function executes when multiple should run

Solutions:

Check model support: Ensure your model supports parallel function calling
Update system prompt: Include "some tools" or "multiple tools" in the system message
Use appropriate model versions: Phi-4-mini:3.8b-fp16 recommended for Ollama

Issue 4: Template Issues with Ollama

Symptoms: Function calling doesn't work with default Ollama setup

Solutions:

Use custom ModelFile: Apply the corrected template provided in this tutorial
Update Ollama: Ensure you're using version 0.5.13 or higher
Check model precision: Less aggressively quantized variants (Q8_0, fp16) tend to work better than heavily quantized versions

Performance Optimization

1. Efficient Function Design

Keep functions focused: Each function should have a single, clear purpose
Minimize external dependencies: Reduce API calls and network requests where possible
Cache results: Store frequently requested data to improve response times
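
Caching can be as simple as a dictionary with expiry times. The class below is an illustrative sketch, not a library API: `TTLCache`, its method names, and the `slow_weather_lookup` function are all hypothetical, and the five-minute lifetime is an arbitrary choice.

```python
import time

class TTLCache:
    """Cache function results for a fixed number of seconds."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries = {}   # key -> (stored_at, value)

    def get_or_compute(self, key, compute):
        """Return the cached value for key, calling compute() if it is missing or stale."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        value = compute()
        self._entries[key] = (now, value)
        return value

def slow_weather_lookup() -> dict:
    """Hypothetical expensive call, such as a request to a weather API."""
    return {"city": "Tokyo", "temperature": 22}

cache = TTLCache(ttl_seconds=300)
first = cache.get_or_compute("weather:Tokyo", slow_weather_lookup)
second = cache.get_or_compute("weather:Tokyo", slow_weather_lookup)  # served from cache
print(first is second)
```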

2. Batching and Async Operations

import asyncio
import aiohttp

async def batch_function_calls(function_calls):
    """Execute multiple function calls concurrently"""
    async with aiohttp.ClientSession() as session:
        tasks = []
        for call in function_calls:
            if call["name"] == "fetch_url":
                task = fetch_url_async(session, call["parameters"]["url"])
                tasks.append(task)
        
        results = await asyncio.gather(*tasks)
        return results

async def fetch_url_async(session, url):
    async with session.get(url) as response:
        return await response.text()

3. Resource Management

Connection pooling: Reuse database and API connections
Rate limiting: Implement proper rate limiting for external APIs
Timeout handling: Set reasonable timeouts for all external calls
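
Rate limiting in particular is easy to get wrong, so here is one possible shape for it: a sliding-window limiter that allows at most a fixed number of calls per period. The class name and parameters are assumptions for illustration; production systems often use a token bucket or a shared store instead.

```python
import time
from collections import deque

class SlidingWindowRateLimiter:
    """Allow at most max_calls within any window of period_seconds."""

    def __init__(self, max_calls: int, period_seconds: float, clock=time.monotonic):
        self._max_calls = max_calls
        self._period = period_seconds
        self._clock = clock    # injectable so the limiter can be tested without sleeping
        self._calls = deque()  # timestamps of recently accepted calls

    def allow(self) -> bool:
        """Record and accept the call if the window has room; otherwise reject it."""
        now = self._clock()
        # Drop timestamps that have aged out of the window
        while self._calls and now - self._calls[0] >= self._period:
            self._calls.popleft()
        if len(self._calls) < self._max_calls:
            self._calls.append(now)
            return True
        return False

limiter = SlidingWindowRateLimiter(max_calls=2, period_seconds=60)
print([limiter.allow() for _ in range(3)])
```

Checking `allow()` before each external call, and returning an error result when it is `False`, keeps a runaway model from exhausting an API quota.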

Advanced Examples

Multi-Agent Collaboration System

import asyncio
import json
from typing import List, Dict
from qwen_agent.agents import Assistant

class MultiAgentSystem:
    def __init__(self):
        # Research Agent
        self.research_agent = Assistant(
            llm={'model': 'Qwen3-8B', 'model_server': 'http://localhost:8000/v1'},
            function_list=[
                {'mcpServers': {'search': {'command': 'uvx', 'args': ['mcp-server-search']}}},
                {'mcpServers': {'fetch': {'command': 'uvx', 'args': ['mcp-server-fetch']}}}
            ]
        )
        
        # Analysis Agent
        self.analysis_agent = Assistant(
            llm={'model': 'Qwen3-8B', 'model_server': 'http://localhost:8000/v1'},
            function_list=['code_interpreter']
        )
        
        # Communication Agent
        self.comm_agent = Assistant(
            llm={'model': 'Qwen3-8B', 'model_server': 'http://localhost:8000/v1'},
            function_list=[self.create_email_tool(), self.create_slack_tool()]
        )
    
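    def create_email_tool(self):
        """Custom email sending tool"""
        from qwen_agent.tools.base import BaseTool

        class EmailTool(BaseTool):
            name = "send_email"
            description = "Send email to specified recipients"
            parameters = [
                {"name": "to", "type": "string", "description": "Recipient email", "required": True},
                {"name": "subject", "type": "string", "description": "Email subject", "required": True},
                {"name": "body", "type": "string", "description": "Email content", "required": True}
            ]

            def call(self, params, **kwargs):
                # Qwen-Agent passes the arguments as a JSON string
                args = json.loads(params) if isinstance(params, str) else params
                # Implement actual email sending logic
                return f"Email sent successfully to {args['to']}"

        return EmailTool()

    def create_slack_tool(self):
        """Custom Slack messaging tool"""
        from qwen_agent.tools.base import BaseTool

        class SlackTool(BaseTool):
            name = "send_slack"
            description = "Send message to Slack channel"
            parameters = [
                {"name": "channel", "type": "string", "description": "Slack channel", "required": True},
                {"name": "message", "type": "string", "description": "Message content", "required": True}
            ]

            def call(self, params, **kwargs):
                # Qwen-Agent passes the arguments as a JSON string
                args = json.loads(params) if isinstance(params, str) else params
                # Implement actual Slack API call
                return f"Message sent to {args['channel']}"

        return SlackTool()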
    
    async def process_complex_request(self, user_request: str):
        """Process complex multi-step requests using multiple agents"""
        
        # Step 1: Research phase
        research_prompt = f"Research the following topic and gather relevant information: {user_request}"
        research_results = []
        for response in self.research_agent.run([{'role': 'user', 'content': research_prompt}]):
            research_results.append(response)
        
        # Step 2: Analysis phase
        analysis_prompt = f"Analyze the following research data and provide insights: {research_results[-1]}"
        analysis_results = []
        for response in self.analysis_agent.run([{'role': 'user', 'content': analysis_prompt}]):
            analysis_results.append(response)
        
        # Step 3: Communication phase
        comm_prompt = f"Create a summary report and send it via email: {analysis_results[-1]}"
        comm_results = []
        for response in self.comm_agent.run([{'role': 'user', 'content': comm_prompt}]):
            comm_results.append(response)
        
        return {
            'research': research_results[-1],
            'analysis': analysis_results[-1], 
            'communication': comm_results[-1]
        }

# Usage example
async def main():
    system = MultiAgentSystem()
    
    request = """
    Analyze the impact of remote work on productivity in tech companies. 
    Research recent studies, analyze the data, and send a summary to our team.
    """
    
    results = await system.process_complex_request(request)
    print("Multi-agent processing complete:", results)

# Run the example
# asyncio.run(main())

Dynamic Tool Selection System

from typing import Dict, List

class DynamicToolSelector:
    def __init__(self):
        self.available_tools = {
            'weather': {
                'description': 'Get weather information',
                'domains': ['weather', 'temperature', 'forecast', 'climate'],
                'function': self.get_weather
            },
            'calculator': {
                'description': 'Perform mathematical calculations',
                'domains': ['math', 'calculate', 'compute', 'arithmetic'],
                'function': self.calculate
            },
            'web_search': {
                'description': 'Search the internet for information',
                'domains': ['search', 'find', 'lookup', 'research'],
                'function': self.web_search
            },
            'file_manager': {
                'description': 'Manage files and directories',
                'domains': ['file', 'directory', 'save', 'load', 'delete'],
                'function': self.manage_files
            }
        }
    
    def analyze_intent(self, user_input: str) -> List[str]:
        """Analyze user input to determine which tools might be needed"""
        user_words = user_input.lower().split()
        relevant_tools = []
        
        for tool_name, tool_info in self.available_tools.items():
            for domain in tool_info['domains']:
                if domain in user_words:
                    relevant_tools.append(tool_name)
                    break
        
        return relevant_tools
    
    def get_tool_definitions(self, tool_names: List[str]) -> List[Dict]:
        """Generate function definitions for selected tools"""
        definitions = []
        
        for tool_name in tool_names:
            if tool_name == 'weather':
                definitions.append({
                    'name': 'get_weather',
                    'description': 'Get current weather information',
                    'parameters': {
                        'location': {'type': 'string', 'description': 'City or location name'},
                        'units': {'type': 'string', 'enum': ['celsius', 'fahrenheit'], 'default': 'celsius'}
                    }
                })
            elif tool_name == 'calculator':
                definitions.append({
                    'name': 'calculate',
                    'description': 'Perform mathematical calculations',
                    'parameters': {
                        'expression': {'type': 'string', 'description': 'Mathematical expression to evaluate'},
                        'precision': {'type': 'integer', 'default': 2, 'description': 'Decimal places for result'}
                    }
                })
            # Add more tool definitions as needed
        
        return definitions
    
    def get_weather(self, location: str, units: str = 'celsius') -> Dict:
        """Mock weather function"""
        return {
            'location': location,
            'temperature': '22°C' if units == 'celsius' else '72°F',
            'condition': 'Sunny',
            'humidity': '60%'
        }
    
    def calculate(self, expression: str, precision: int = 2) -> Dict:
        """Evaluate a math expression (demo only: restricted eval is not a security boundary)"""
        try:
            # Simple evaluation for demo - in production, use a proper math parser
            import math
            allowed_names = {
                k: v for k, v in math.__dict__.items() if not k.startswith("__")
            }
            allowed_names.update({"abs": abs, "round": round})
            
            result = eval(expression, {"__builtins__": {}}, allowed_names)
            return {
                'expression': expression,
                'result': round(float(result), precision),
                'success': True
            }
        except Exception as e:
            return {
                'expression': expression,
                'error': str(e),
                'success': False
            }
    
    def web_search(self, query: str, max_results: int = 5) -> Dict:
        """Mock web search function"""
        return {
            'query': query,
            'results': [
                {'title': f'Result {i+1} for {query}', 'url': f'https://example{i+1}.com'}
                for i in range(max_results)
            ]
        }
    
    def manage_files(self, action: str, file_path: str, content: str = None) -> Dict:
        """Mock file management function"""
        return {
            'action': action,
            'file_path': file_path,
            'success': True,
            'message': f"Action '{action}' completed for file: {file_path}"
        }

# Usage example
def smart_assistant_with_dynamic_tools():
    selector = DynamicToolSelector()
    
    user_requests = [
        "What's the weather like in New York and calculate 15% tip on $50?",
        "Search for recent AI developments and save the results to a file",
        "Calculate the area of a circle with radius 10 and check weather in Tokyo"
    ]
    
    for request in user_requests:
        print(f"\nUser Request: {request}")
        
        # Analyze which tools might be needed
        relevant_tools = selector.analyze_intent(request)
        print(f"Relevant Tools: {relevant_tools}")
        
        # Get function definitions for the LLM
        tool_definitions = selector.get_tool_definitions(relevant_tools)
        print(f"Tool Definitions: {len(tool_definitions)} functions available")
        
        # In a real implementation, you would pass these to your LLM
        # The LLM would then decide which functions to call and with what parameters

### Enterprise Integration Example

```python
import asyncio
import json
from typing import Any, Dict, List, Optional
from dataclasses import dataclass
from datetime import datetime

@dataclass
class FunctionResult:
    """Standard result format for all function calls"""
    success: bool
    data: Any = None
    error: Optional[str] = None
    execution_time: float = 0.0
    timestamp: Optional[datetime] = None

class EnterpriseAIAgent:
    """Production-ready AI agent with comprehensive function calling capabilities"""
    
    def __init__(self, config: Dict):
        self.config = config
        self.functions = {}
        self.audit_log = []
        self.rate_limiters = {}
        
        # Initialize core business functions
        self._register_core_functions()
    
    def _register_core_functions(self):
        """Register all available business functions"""
        
        # CRM Functions
        self.register_function(
            name="get_customer_info",
            description="Retrieve customer information from CRM",
            parameters={
                "customer_id": {"type": "string", "required": True},
                "include_history": {"type": "boolean", "default": False}
            },
            handler=self._get_customer_info,
            rate_limit=100  # calls per minute
        )
        
        # Sales Functions
        self.register_function(
            name="create_sales_opportunity",
            description="Create a new sales opportunity",
            parameters={
                "customer_id": {"type": "string", "required": True},
                "product_id": {"type": "string", "required": True},
                "estimated_value": {"type": "number", "required": True},
                "expected_close_date": {"type": "string", "required": True}
            },
            handler=self._create_sales_opportunity,
            rate_limit=50
        )
        
        # Analytics Functions
        self.register_function(
            name="generate_sales_report",
            description="Generate sales performance report",
            parameters={
                # The handler has no default for period, so mark it as required
                "period": {"type": "string", "enum": ["daily", "weekly", "monthly", "quarterly"], "required": True},
                "region": {"type": "string", "required": False},
                "product_category": {"type": "string", "required": False}
            },
            handler=self._generate_sales_report,
            rate_limit=10
        )
        
        # Notification Functions
        self.register_function(
            name="send_notification",
            description="Send notification to team members",
            parameters={
                # The handler has no default for recipients, so mark it as required
                "recipients": {"type": "array", "items": {"type": "string"}, "required": True},
                "message": {"type": "string", "required": True},
                "priority": {"type": "string", "enum": ["low", "medium", "high"], "default": "medium"},
                "channel": {"type": "string", "enum": ["email", "slack", "teams"], "default": "email"}
            },
            handler=self._send_notification,
            rate_limit=200
        )
    
    def register_function(self, name: str, description: str, parameters: Dict, 
                         handler: callable, rate_limit: int = 60):
        """Register a new function with the agent"""
        self.functions[name] = {
            'description': description,
            'parameters': parameters,
            'handler': handler,
            'rate_limit': rate_limit,
            'call_count': 0,
            'last_reset': datetime.now()
        }
    
    async def execute_function(self, function_name: str, parameters: Dict) -> FunctionResult:
        """Execute a function with comprehensive error handling and logging"""
        start_time = datetime.now()
        
        try:
            # Validate function exists
            if function_name not in self.functions:
                return FunctionResult(
                    success=False,
                    error=f"Function '{function_name}' not found",
                    timestamp=start_time
                )
            
            # Check rate limits
            if not self._check_rate_limit(function_name):
                return FunctionResult(
                    success=False,
                    error=f"Rate limit exceeded for function '{function_name}'",
                    timestamp=start_time
                )
            
            # Validate parameters
            validation_result = self._validate_parameters(function_name, parameters)
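            if not validation_result.success:
                # Validation results are built without a timestamp; stamp them for consistency
                validation_result.timestamp = start_time
                return validation_result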
            
            # Execute function
            func_info = self.functions[function_name]
            handler = func_info['handler']
            
            if asyncio.iscoroutinefunction(handler):
                result_data = await handler(**parameters)
            else:
                result_data = handler(**parameters)
            
            execution_time = (datetime.now() - start_time).total_seconds()
            
            result = FunctionResult(
                success=True,
                data=result_data,
                execution_time=execution_time,
                timestamp=start_time
            )
            
            # Log successful execution
            self._log_function_call(function_name, parameters, result)
            
            return result
            
        except Exception as e:
            execution_time = (datetime.now() - start_time).total_seconds()
            result = FunctionResult(
                success=False,
                error=str(e),
                execution_time=execution_time,
                timestamp=start_time
            )
            
            # Log failed execution
            self._log_function_call(function_name, parameters, result)
            
            return result
    
    def _check_rate_limit(self, function_name: str) -> bool:
        """Check if function call is within rate limits"""
        func_info = self.functions[function_name]
        now = datetime.now()
        
        # Reset counter if a minute has passed
        # total_seconds() covers the whole interval; .seconds ignores the days component
        if (now - func_info['last_reset']).total_seconds() >= 60:
            func_info['call_count'] = 0
            func_info['last_reset'] = now
        
        # Check if under limit
        if func_info['call_count'] >= func_info['rate_limit']:
            return False
        
        func_info['call_count'] += 1
        return True
    
    def _validate_parameters(self, function_name: str, parameters: Dict) -> FunctionResult:
        """Validate function parameters"""
        func_params = self.functions[function_name]['parameters']
        
        # Check required parameters
        for param_name, param_info in func_params.items():
            if param_info.get('required', False) and param_name not in parameters:
                return FunctionResult(
                    success=False,
                    error=f"Missing required parameter: {param_name}"
                )
        
        # Validate parameter types and constraints
        for param_name, value in parameters.items():
            if param_name in func_params:
                param_info = func_params[param_name]
                
                # Type validation
                expected_type = param_info.get('type')
                if expected_type == 'string' and not isinstance(value, str):
                    return FunctionResult(
                        success=False,
                        error=f"Parameter '{param_name}' must be a string"
                    )
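                elif expected_type == 'number' and (
                    isinstance(value, bool) or not isinstance(value, (int, float))
                ):
                    # bool is a subclass of int, so reject it explicitly
                    return FunctionResult(
                        success=False,
                        error=f"Parameter '{param_name}' must be a number"
                    )
                elif expected_type == 'boolean' and not isinstance(value, bool):
                    return FunctionResult(
                        success=False,
                        error=f"Parameter '{param_name}' must be a boolean"
                    )
                elif expected_type == 'array' and not isinstance(value, list):
                    return FunctionResult(
                        success=False,
                        error=f"Parameter '{param_name}' must be an array"
                    )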
                
                # Enum validation
                if 'enum' in param_info and value not in param_info['enum']:
                    return FunctionResult(
                        success=False,
                        error=f"Parameter '{param_name}' must be one of: {param_info['enum']}"
                    )
        
        return FunctionResult(success=True)
    
    def _log_function_call(self, function_name: str, parameters: Dict, result: FunctionResult):
        """Log function call for audit purposes"""
        log_entry = {
            'timestamp': result.timestamp.isoformat(),
            'function_name': function_name,
            'parameters': parameters,
            'success': result.success,
            'execution_time': result.execution_time,
            'error': result.error if not result.success else None
        }
        
        self.audit_log.append(log_entry)
        
        # Optionally write to external logging system
        if self.config.get('enable_external_logging', False):
            self._write_to_external_log(log_entry)
    
    def _write_to_external_log(self, log_entry: Dict):
        """Write log entry to external logging system"""
        # Implementation would depend on your logging infrastructure
        # e.g., send to ELK stack, CloudWatch, etc.
        pass
    
    # Business Function Implementations
    async def _get_customer_info(self, customer_id: str, include_history: bool = False) -> Dict:
        """Retrieve customer information from CRM system"""
        # Simulate database/API call
        await asyncio.sleep(0.1)  # Simulate network delay
        
        customer_data = {
            'customer_id': customer_id,
            'name': 'John Doe',
            'email': 'john.doe@example.com',
            'phone': '+1-555-0123',
            'status': 'active',
            'tier': 'premium'
        }
        
        if include_history:
            customer_data['purchase_history'] = [
                {'date': '2024-01-15', 'product': 'Product A', 'amount': 1500},
                {'date': '2024-03-22', 'product': 'Product B', 'amount': 2300}
            ]
        
        return customer_data
    
    async def _create_sales_opportunity(self, customer_id: str, product_id: str, 
                                      estimated_value: float, expected_close_date: str) -> Dict:
        """Create a new sales opportunity"""
        # Simulate CRM API call
        await asyncio.sleep(0.2)
        
        opportunity_id = f"OPP-{datetime.now().strftime('%Y%m%d%H%M%S')}"
        
        return {
            'opportunity_id': opportunity_id,
            'customer_id': customer_id,
            'product_id': product_id,
            'estimated_value': estimated_value,
            'expected_close_date': expected_close_date,
            'status': 'open',
            'created_date': datetime.now().isoformat()
        }
    
    async def _generate_sales_report(self, period: str, region: str = None, 
                                   product_category: str = None) -> Dict:
        """Generate comprehensive sales report"""
        # Simulate data aggregation
        await asyncio.sleep(0.5)
        
        return {
            'report_id': f"RPT-{datetime.now().strftime('%Y%m%d%H%M%S')}",
            'period': period,
            'region': region,
            'product_category': product_category,
            'total_sales': 125000.00,
            'total_opportunities': 45,
            'conversion_rate': 0.67,
            'top_products': [
                {'product_id': 'PROD-001', 'sales': 45000},
                {'product_id': 'PROD-002', 'sales': 32000}
            ],
            'generated_at': datetime.now().isoformat()
        }
    
    async def _send_notification(self, recipients: List[str], message: str, 
                               priority: str = 'medium', channel: str = 'email') -> Dict:
        """Send notification through specified channel"""
        # Simulate notification service call
        await asyncio.sleep(0.1)
        
        notification_id = f"NOTIF-{datetime.now().strftime('%Y%m%d%H%M%S')}"
        
        return {
            'notification_id': notification_id,
            'recipients': recipients,
            'channel': channel,
            'priority': priority,
            'status': 'sent',
            'sent_at': datetime.now().isoformat()
        }
    
    def get_function_definitions(self) -> List[Dict]:
        """Get OpenAI-compatible function definitions for all registered functions"""
        definitions = []
        
        for func_name, func_info in self.functions.items():
            definition = {
                'name': func_name,
                'description': func_info['description'],
                'parameters': {
                    'type': 'object',
                    'properties': {},
                    'required': []
                }
            }
            
            for param_name, param_info in func_info['parameters'].items():
                definition['parameters']['properties'][param_name] = {
                    'type': param_info['type'],
                    'description': param_info.get('description', '')
                }
                
                # Copy optional JSON Schema keywords when present
                # ('items' is needed so array parameters describe their element type)
                for key in ('enum', 'default', 'items'):
                    if key in param_info:
                        definition['parameters']['properties'][param_name][key] = param_info[key]
                
                if param_info.get('required', False):
                    definition['parameters']['required'].append(param_name)
            
            definitions.append(definition)
        
        return definitions

# Usage Example for Enterprise Integration
async def enterprise_demo():
    """Demonstrate enterprise AI agent capabilities"""
    
    config = {
        'enable_external_logging': True,
        'max_concurrent_functions': 10,
        'default_timeout': 30
    }
    
    agent = EnterpriseAIAgent(config)
    
    # Example 1: Customer inquiry processing
    print("=== Customer Inquiry Processing ===")
    
    # Get customer information
    result = await agent.execute_function(
        'get_customer_info',
        {'customer_id': 'CUST-12345', 'include_history': True}
    )
    
    if result.success:
        print(f"Customer Info Retrieved: {result.data['name']}")
        print(f"Execution Time: {result.execution_time:.3f}s")
    
    # Example 2: Sales opportunity creation
    print("\n=== Sales Opportunity Creation ===")
    
    result = await agent.execute_function(
        'create_sales_opportunity',
        {
            'customer_id': 'CUST-12345',
            'product_id': 'PROD-001',
            'estimated_value': 15000.0,
            'expected_close_date': '2025-09-30'
        }
    )
    
    if result.success:
        print(f"Opportunity Created: {result.data['opportunity_id']}")
    
    # Example 3: Batch operations
    print("\n=== Batch Operations ===")
    
    tasks = [
        agent.execute_function('generate_sales_report', {'period': 'monthly'}),
        agent.execute_function('send_notification', {
            'recipients': ['manager@company.com'],
            'message': 'New opportunity created',
            'priority': 'high',
            'channel': 'email'
        })
    ]
    
    results = await asyncio.gather(*tasks)
    
    for i, result in enumerate(results):
        if result.success:
            print(f"Task {i+1} completed successfully")
        else:
            print(f"Task {i+1} failed: {result.error}")
    
    # Display audit log
    print(f"\n=== Audit Log ({len(agent.audit_log)} entries) ===")
    for entry in agent.audit_log[-3:]:  # Show last 3 entries
        print(f"{entry['timestamp']}: {entry['function_name']} - {'SUCCESS' if entry['success'] else 'FAILED'}")

# Run the enterprise demo
# asyncio.run(enterprise_demo())
```
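
The demo config above carries `max_concurrent_functions` and `default_timeout`, but the agent class as shown never reads them. One possible way to enforce such limits is sketched below using `asyncio.Semaphore` and `asyncio.wait_for`. `BoundedExecutor`, `quick_lookup`, and `slow_report` are hypothetical names introduced for this illustration and are not part of the agent above.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict


class BoundedExecutor:
    """Run async handlers with a concurrency cap and a per-call timeout (illustrative)."""

    def __init__(self, max_concurrent: int = 10, timeout: float = 30.0):
        self._semaphore = asyncio.Semaphore(max_concurrent)
        self._timeout = timeout

    async def run(self, handler: Callable[..., Awaitable[Any]], **kwargs: Any) -> Dict[str, Any]:
        # Wait for a free slot, then bound the handler's running time
        async with self._semaphore:
            try:
                data = await asyncio.wait_for(handler(**kwargs), timeout=self._timeout)
                return {"success": True, "data": data}
            except asyncio.TimeoutError:
                return {"success": False, "error": f"Timed out after {self._timeout}s"}


async def demo() -> list:
    # Create the executor inside the running event loop
    executor = BoundedExecutor(max_concurrent=2, timeout=0.05)

    async def quick_lookup(customer_id: str) -> Dict[str, str]:
        return {"customer_id": customer_id}

    async def slow_report() -> str:
        await asyncio.sleep(1)  # Longer than the timeout above
        return "report"

    return await asyncio.gather(
        executor.run(quick_lookup, customer_id="CUST-12345"),
        executor.run(slow_report),
    )


if __name__ == "__main__":
    for outcome in asyncio.run(demo()):
        print(outcome)
```

In this sketch the timeout covers only the handler's own running time, not the time spent waiting for a free slot. Whether queueing time should count toward the limit is a design choice that depends on the deployment.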

Conclusion

Function calling in Small Language Models represents a paradigm shift from static AI assistants to dynamic, capable agents that can interact with the real world. The main points of this tutorial are summarized below.

Key Takeaways

Foundation Understanding: Function calling enables SLMs to extend beyond their training data by connecting to external tools and services.
Implementation Flexibility: Multiple approaches exist, from low-level implementations with custom templates to high-level frameworks like Qwen-Agent and Foundry Local.
Production Considerations: Enterprise deployments require attention to error handling, rate limiting, security, and audit logging.
Performance Optimization: Proper function design, efficient execution, and smart caching can significantly improve response times.
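
As one illustration of the caching point above, repeated calls to a slow function with the same arguments can be served from a short-lived cache. The sketch below is a minimal time-to-live cache under simple assumptions (single-threaded use, hashable keys). `TTLCache`, `fetch_weather`, and the sample data are hypothetical and introduced only for this example.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Cache function results for a limited time (a sketch, not thread-safe)."""

    def __init__(self, ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: Dict[Hashable, Tuple[float, Any]] = {}

    def get_or_call(self, key: Hashable, func: Callable[[], Any]) -> Any:
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # Entry is still fresh: skip the call
        value = func()
        self._store[key] = (now, value)
        return value


call_counter = {"n": 0}


def fetch_weather() -> Dict[str, Any]:
    # Stand-in for a slow external API call; the data is made up
    call_counter["n"] += 1
    return {"city": "Tokyo", "temp_c": 21}


cache = TTLCache(ttl_seconds=300)
cache.get_or_call(("get_weather", "Tokyo"), fetch_weather)
cache.get_or_call(("get_weather", "Tokyo"), fetch_weather)
print(call_counter["n"])  # The second lookup is served from the cache
```

Caching suits read-only lookups such as weather or reference data. Functions with side effects, such as creating a sales opportunity or sending a notification, should not be cached this way.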

Future Directions

As SLM technology continues to evolve, we can expect:

Improved Function Calling Accuracy: Better intent detection and parameter extraction
Enhanced Parallel Processing: More sophisticated multi-function orchestration
Better Integration Standards: Standardized protocols for tool integration
Advanced Security Features: Enhanced authentication and authorization mechanisms
Expanded Ecosystem: Growing library of pre-built functions and integrations

Getting Started

To begin implementing function calling in your projects:

Start Simple: Begin with basic single-function scenarios
Choose Your Framework: Choose between a direct implementation (Ollama with Phi-4-mini) and a framework-assisted approach (Qwen-Agent)
Design Functions Carefully: Focus on clear, well-documented function definitions
Implement Error Handling: Build robust error handling from the beginning
Scale Gradually: Move from simple to complex scenarios as you gain experience

Function calling transforms SLMs from impressive text generators into practical AI agents capable of solving real-world problems. By following the patterns and practices outlined in this tutorial, you can build powerful, reliable AI systems that extend far beyond traditional chat interfaces.

Resources and References

Phi-4 Models: Hugging Face Collection
Qwen3 Documentation: Official Qwen Documentation
Ollama: Official Website
Foundry Local: GitHub Repository
Function Calling Best Practices: Hugging Face Guide

Remember that function calling is an evolving field, and staying updated with the latest developments in your chosen frameworks and models will help you build more effective AI agents.

➡️ What's next

03: Model Context Protocol (MCP) Integration
