# MCP Tool Learning

This example demonstrates how the Atlas SDK enables agents to learn efficient tool usage patterns through reinforcement learning. It uses the Model Context Protocol (MCP) to provide file operation tools to a LangGraph agent, showing measurable improvement in tool selection and task-completion efficiency over 25 progressive learning runs.

What this demonstrates:
- MCP server implementation with 5 file system tools
- LangGraph agent integration via `langchain-mcp-adapters`
- Progressive learning across 25 tasks (simple → complex)
- Tool usage optimization through reward signals
- Learning playbook generation and visualization
## Architecture

```
┌─────────────────────┐
│ learning_harness │ 25 progressive tasks
└──────────┬──────────┘
│
▼
┌─────────────────────┐
│ Atlas SDK Core │ Orchestration + reward system
└──────────┬──────────┘
│
▼
┌─────────────────────┐
│ mcp_agent.py │ LangGraph ReAct agent
└──────────┬──────────┘
│
▼
┌─────────────────────┐
│ MultiServerMCPClient│ langchain-mcp-adapters
└──────────┬──────────┘
│
▼
┌─────────────────────┐
│ mcp_server.py │ 5 file operation tools
└─────────────────────┘
```

## Installation

```bash
pip install arc-atlas langchain-mcp-adapters langchain-openai langgraph mcp anyio
```

## Setup

Export your API keys:

```bash
export OPENAI_API_KEY=sk-...
export GEMINI_API_KEY=...
```

Or create a `.env` file in the project root:

```
OPENAI_API_KEY=sk-...
GEMINI_API_KEY=...
```

Then start the telemetry backend:

```bash
atlas init  # Starts Docker + Postgres for telemetry persistence
```

## Files

| File | Purpose |
|---|---|
| `mcp_server.py` | MCP server with 5 file tools (read, write, list, search, run_command) |
| `mcp_agent.py` | LangGraph agent that connects to the MCP server and exposes its tools (sketched below) |
| `config.yaml` | Atlas configuration (Python adapter, LLM settings, reward system) |
| `learning_harness.py` | Runs 25 progressive learning tasks and tracks metrics |
| `sample_workspace/` | Test files for agent operations |
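For orientation, here is a minimal sketch of what the `create_agent` factory in `mcp_agent.py` could look like. The stdio transport and the model string are assumptions for illustration; the actual file may differ in detail:

```python
# mcp_agent.py (sketch) -- not the actual implementation.
from pathlib import Path

from langchain_mcp_adapters.client import MultiServerMCPClient
from langgraph.prebuilt import create_react_agent

SERVER_PATH = Path(__file__).parent / "mcp_server.py"


async def create_agent():
    # Spawn the local MCP server as a subprocess and import its tools.
    client = MultiServerMCPClient(
        {
            "files": {
                "command": "python",
                "args": [str(SERVER_PATH)],
                "transport": "stdio",
            }
        }
    )
    tools = await client.get_tools()
    # Wrap the MCP tools in a prebuilt LangGraph ReAct agent.
    # The model string is an assumption; see config.yaml for the real setting.
    return create_react_agent("openai:gpt-4.1-mini", tools)
```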
## Running the Learning Harness

```bash
cd examples/mcp_tool_learning
python learning_harness.py
```

This executes 25 learning runs with progressive complexity (a sketch of the harness loop follows the list):
- Phase 1 (tasks 1-5): Basic file operations (list, read, write)
- Phase 2 (tasks 6-10): Multi-step operations (copy, search, combine)
- Phase 3 (tasks 11-15): Complex workflows (batch operations, manifests)
- Phase 4 (tasks 16-20): Advanced scenarios (backups, reporting)
- Phase 5 (tasks 21-25): Edge cases and error handling
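At its core the harness is an async loop over a task list. A minimal sketch follows; the task strings and loop body are illustrative assumptions, since the real `learning_harness.py` routes each run through the Atlas SDK so rewards and telemetry are recorded:

```python
# learning_harness.py (simplified sketch, not the real harness).
import asyncio

from mcp_agent import create_agent

# Hypothetical entries -- the actual LEARNING_TASKS has 25 tasks
# spanning the five phases described above.
LEARNING_TASKS = [
    "List all files in sample_workspace",                       # Phase 1
    "Read notes.txt and summarize it",                          # Phase 1
    "Search every file for 'TODO' and write a summary report",  # Phase 3
    # ...continues through 25 tasks of increasing complexity
]


async def main():
    agent = await create_agent()
    for i, task in enumerate(LEARNING_TASKS, start=1):
        result = await agent.ainvoke({"messages": [("user", task)]})
        print(f"task {i}: {result['messages'][-1].content[:100]}")
        await asyncio.sleep(1)  # throttle between tasks to avoid rate limits


if __name__ == "__main__":
    asyncio.run(main())
```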
You can also run a single task directly:

```bash
atlas run --config examples/mcp_tool_learning/config.yaml \
  --task "List all files in sample_workspace and read notes.txt"
```

## Available Tools

The agent has access to 5 MCP tools (a sketch of how one is defined follows the table):
| Tool | Description | Example Use |
|---|---|---|
| `read_file` | Read file contents | Reading configuration or data files |
| `write_file` | Write/create files | Saving reports or backups |
| `list_files` | List directory contents | Discovering available files |
| `search_content` | Regex search in files | Finding specific patterns or keywords |
| `run_command` | Safe shell commands | Executing ls, grep, wc, etc. |
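As a sketch of how one such tool might be defined, assuming the `FastMCP` helper from the official `mcp` package (the actual `mcp_server.py` may differ):

```python
# mcp_server.py (sketch) -- one of the five tools, for illustration.
from pathlib import Path

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("file-tools")


@mcp.tool()
def read_file(path: str) -> str:
    """Read and return the contents of a text file."""
    return Path(path).read_text()


if __name__ == "__main__":
    # Serve over stdio so the agent can launch this file as a subprocess.
    mcp.run(transport="stdio")
```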
## What the Agent Learns

The agent learns to:
- Tool Selection: Choose the right tool for each operation (e.g., `list_files` before `read_file`)
- Efficiency: Minimize redundant operations (e.g., caching file lists instead of listing repeatedly; see the illustration after this list)
- Error Handling: Gracefully handle missing files and invalid operations
- Multi-Step Planning: Break complex tasks into efficient sequences
- Context Awareness: Understand when to search vs read, list vs execute
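The redundancy reduction typically looks like the following (illustrative traces, not recorded transcripts):

```python
# Early run -- redundant listing before every read:
#   list_files(".") -> read_file("a.txt") -> list_files(".") -> read_file("b.txt")
#
# Later run -- one listing, result reused for targeted reads:
#   list_files(".") -> read_file("a.txt") -> read_file("b.txt")
```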
## Viewing the Learning Playbook

After running the harness, view the synthesized learning playbook:

```bash
python -m atlas.cli.learning --project mcp-tool-learning
```

This shows:
- Tool usage patterns over time
- Reward progression across sessions
- Common failure modes and recoveries
- Synthesized best practices
## Exporting Traces

Export the session traces to JSONL:

```bash
arc-atlas --database-url postgresql://atlas:atlas@localhost:5433/atlas \
  --output mcp_traces.jsonl \
  --limit 25
```

Or inspect sessions directly in Postgres:

```bash
psql postgresql://atlas:atlas@localhost:5433/atlas
```

```sql
SELECT session_id, task, reward_score, created_at
FROM atlas_sessions
WHERE project_name = 'mcp-tool-learning'
ORDER BY session_id DESC
LIMIT 25;
```

## Expected Results

Early runs (tasks 1-5):
- More tool calls per task (trial and error)
- Lower reward scores (~0.6-0.7)
- Occasional incorrect tool selection
Later runs (tasks 15-25):
- Fewer tool calls per task (optimized)
- Higher reward scores (~0.8-0.9)
- Consistent correct tool selection
- Better error handling
Key Metrics:
- Tool call reduction: 30-40% fewer calls in later tasks
- Completion rate: 95%+ by task 25
- Reward progression: +0.2-0.3 average increase
- Cost per run: ~$0.05-0.10 with GPT-4.1-mini
## Configuration Highlights

The `config.yaml` uses a Python adapter to integrate the MCP agent:

```yaml
agent:
  type: python
  import_path: examples.mcp_tool_learning.mcp_agent
  attribute: create_agent
```

The reward system provides learning signals:

```yaml
rim:
  judge_prompt: |
    Reward effective tool usage:
    - Correct tool for each task
    - Minimal redundant operations
    - Proper error handling
```

## Troubleshooting

If you see connection errors, verify the MCP server path is correct:
```python
# In mcp_agent.py, check:
server_path = Path(__file__).parent / "mcp_server.py"
```

The agent uses async/await patterns. If you see event loop errors:

```bash
# Ensure you're running with proper async support
python learning_harness.py  # Not: python -i learning_harness.py
```

If you hit rate limits, add delays between tasks:

```python
# In learning_harness.py, increase the sleep duration:
await asyncio.sleep(2)  # Change from 1 to 2 seconds
```

## Next Steps

- Customize Tools: Modify `mcp_server.py` to add domain-specific tools
- Adjust Tasks: Edit `LEARNING_TASKS` in `learning_harness.py` for your use case
- Tune Rewards: Update the `judge_prompt` in `config.yaml` to reward different behaviors
- Export for Training: Use the exported traces to fine-tune your own models (see the sketch below)
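As a starting point for the last item, a sketch of loading the exported traces. The per-record schema is not documented here, so treat any field names as assumptions and inspect a line of the file first:

```python
# Load exported traces for analysis or fine-tuning data prep.
import json

with open("mcp_traces.jsonl") as f:
    sessions = [json.loads(line) for line in f]

print(f"loaded {len(sessions)} sessions")
print(sessions[0].keys())  # inspect the actual schema before relying on fields
```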
## Cost Estimate

Approximate costs for the full 25-run learning session:
| Component | Model | Cost per run | Total (25 runs) |
|---|---|---|---|
| Student (Agent) | GPT-4.1-mini | ~$0.03 | ~$0.75 |
| Teacher (Validator) | GPT-4.1-mini | ~$0.02 | ~$0.50 |
| Reward system | Gemini-2.5-Flash | ~$0.01 | ~$0.25 |
| Total | - | ~$0.06 | ~$1.50 |
Note: Actual costs vary based on:
- Task complexity and agent token usage
- Number of tool calls per task
- Retry attempts and error handling
- Reward model evaluation depth
## Related Examples

- `atlas quickstart` - Basic Atlas SDK learning demonstration (CLI command)

## License

Apache 2.0 - See the main repository LICENSE file.