# MCP Tool Learning

This example demonstrates how the Atlas SDK enables agents to learn efficient tool usage patterns through reinforcement learning. It uses the Model Context Protocol (MCP) to provide file operation tools to a LangGraph agent, showing measurable improvement in tool selection and task-completion efficiency over 25 progressive learning runs.

What this demonstrates:
- MCP server implementation with 5 file system tools
- LangGraph agent integration via `langchain-mcp-adapters`
- Progressive learning across 25 tasks (simple → complex)
- Tool usage optimization through reward signals
- Learning playbook generation and visualization
## Architecture

```
┌─────────────────────┐
│ learning_harness │ 25 progressive tasks
└──────────┬──────────┘
│
▼
┌─────────────────────┐
│ Atlas SDK Core │ Orchestration + reward system
└──────────┬──────────┘
│
▼
┌─────────────────────┐
│ mcp_agent.py │ LangGraph ReAct agent
└──────────┬──────────┘
│
▼
┌─────────────────────┐
│ MultiServerMCPClient│ langchain-mcp-adapters
└──────────┬──────────┘
│
▼
┌─────────────────────┐
│ mcp_server.py │ 5 file operation tools
└─────────────────────┘
```

## Installation

```bash
pip install arc-atlas langchain-mcp-adapters langchain-openai langgraph mcp anyio
```

## Setup

Export your API keys:

```bash
export OPENAI_API_KEY=sk-...
export GEMINI_API_KEY=...
```

Or create a `.env` file in the project root:

```
OPENAI_API_KEY=sk-...
GEMINI_API_KEY=...
```

Then start the telemetry backend:

```bash
atlas init  # Starts Docker + Postgres for telemetry persistence
```

## Files

| File | Purpose |
|---|---|
| `mcp_server.py` | MCP server with 5 file tools (read, write, list, search, run_command) |
| `mcp_agent.py` | LangGraph agent that connects to the MCP server and exposes its tools (sketched below) |
| `config.yaml` | Atlas configuration (Python adapter, LLM settings, reward system) |
| `learning_harness.py` | Runs 25 progressive learning tasks and tracks metrics |
| `sample_workspace/` | Test files for agent operations |
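For orientation, here is a minimal sketch of what the `create_agent` factory in `mcp_agent.py` could look like. The stdio transport and the model string are assumptions for illustration; the actual file may differ in detail:

```python
# mcp_agent.py (sketch) -- not the actual implementation.
from pathlib import Path

from langchain_mcp_adapters.client import MultiServerMCPClient
from langgraph.prebuilt import create_react_agent

SERVER_PATH = Path(__file__).parent / "mcp_server.py"


async def create_agent():
    # Spawn the local MCP server as a subprocess and import its tools.
    client = MultiServerMCPClient(
        {
            "files": {
                "command": "python",
                "args": [str(SERVER_PATH)],
                "transport": "stdio",
            }
        }
    )
    tools = await client.get_tools()
    # Wrap the MCP tools in a prebuilt LangGraph ReAct agent.
    # The model string is an assumption; see config.yaml for the real setting.
    return create_react_agent("openai:gpt-4.1-mini", tools)
```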
## Running the Learning Harness

```bash
cd examples/mcp_tool_learning
python learning_harness.py
```

This executes 25 learning runs with progressive complexity (a sketch of the harness loop follows the list):
- Phase 1 (tasks 1-5): Basic file operations (list, read, write)
- Phase 2 (tasks 6-10): Multi-step operations (copy, search, combine)
- Phase 3 (tasks 11-15): Complex workflows (batch operations, manifests)
- Phase 4 (tasks 16-20): Advanced scenarios (backups, reporting)
- Phase 5 (tasks 21-25): Edge cases and error handling
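At its core the harness is an async loop over a task list. A minimal sketch follows; the task strings and loop body are illustrative assumptions, since the real `learning_harness.py` routes each run through the Atlas SDK so rewards and telemetry are recorded:

```python
# learning_harness.py (simplified sketch, not the real harness).
import asyncio

from mcp_agent import create_agent

# Hypothetical entries -- the actual LEARNING_TASKS has 25 tasks
# spanning the five phases described above.
LEARNING_TASKS = [
    "List all files in sample_workspace",                       # Phase 1
    "Read notes.txt and summarize it",                          # Phase 1
    "Search every file for 'TODO' and write a summary report",  # Phase 3
    # ...continues through 25 tasks of increasing complexity
]


async def main():
    agent = await create_agent()
    for i, task in enumerate(LEARNING_TASKS, start=1):
        result = await agent.ainvoke({"messages": [("user", task)]})
        print(f"task {i}: {result['messages'][-1].content[:100]}")
        await asyncio.sleep(1)  # throttle between tasks to avoid rate limits


if __name__ == "__main__":
    asyncio.run(main())
```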
You can also run a single task directly:

```bash
atlas run --config examples/mcp_tool_learning/config.yaml \
  --task "List all files in sample_workspace and read notes.txt"
```

## Available Tools

The agent has access to 5 MCP tools (a sketch of how one is defined follows the table):
| Tool | Description | Example Use |
|---|---|---|
| `read_file` | Read file contents | Reading configuration or data files |
| `write_file` | Write/create files | Saving reports or backups |
| `list_files` | List directory contents | Discovering available files |
| `search_content` | Regex search in files | Finding specific patterns or keywords |
| `run_command` | Safe shell commands | Executing ls, grep, wc, etc. |
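As a sketch of how one such tool might be defined, assuming the `FastMCP` helper from the official `mcp` package (the actual `mcp_server.py` may differ):

```python
# mcp_server.py (sketch) -- one of the five tools, for illustration.
from pathlib import Path

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("file-tools")


@mcp.tool()
def read_file(path: str) -> str:
    """Read and return the contents of a text file."""
    return Path(path).read_text()


if __name__ == "__main__":
    # Serve over stdio so the agent can launch this file as a subprocess.
    mcp.run(transport="stdio")
```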
## What the Agent Learns

The agent learns to:
- Tool Selection: Choose the right tool for each operation (e.g., `list_files` before `read_file`)
- Efficiency: Minimize redundant operations (e.g., caching file lists instead of listing repeatedly; see the illustration after this list)
- Error Handling: Gracefully handle missing files and invalid operations
- Multi-Step Planning: Break complex tasks into efficient sequences
- Context Awareness: Understand when to search vs read, list vs execute
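The redundancy reduction typically looks like the following (illustrative traces, not recorded transcripts):

```python
# Early run -- redundant listing before every read:
#   list_files(".") -> read_file("a.txt") -> list_files(".") -> read_file("b.txt")
#
# Later run -- one listing, result reused for targeted reads:
#   list_files(".") -> read_file("a.txt") -> read_file("b.txt")
```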
## Viewing the Learning Playbook

After running the harness, view the synthesized learning playbook:

```bash
python -m atlas.cli.learning --project mcp-tool-learning
```

This shows:
- Tool usage patterns over time
- Reward progression across sessions
- Common failure modes and recoveries
- Synthesized best practices
## Exporting Traces

Export the session traces to JSONL:

```bash
arc-atlas --database-url postgresql://atlas:atlas@localhost:5433/atlas \
  --output mcp_traces.jsonl \
  --limit 25
```

Or inspect sessions directly in Postgres:

```bash
psql postgresql://atlas:atlas@localhost:5433/atlas
```

```sql
SELECT session_id, task, reward_score, created_at
FROM atlas_sessions
WHERE project_name = 'mcp-tool-learning'
ORDER BY session_id DESC
LIMIT 25;
```

## Expected Results

Early runs (tasks 1-5):
- More tool calls per task (trial and error)
- Lower reward scores (~0.6-0.7)
- Occasional incorrect tool selection
Later runs (tasks 15-25):
- Fewer tool calls per task (optimized)
- Higher reward scores (~0.8-0.9)
- Consistent correct tool selection
- Better error handling
Key Metrics:
- Tool call reduction: 30-40% fewer calls in later tasks
- Completion rate: 95%+ by task 25
- Reward progression: +0.2-0.3 average increase
- Cost per run: ~$0.05-0.10 with GPT-4.1-mini
## Configuration Highlights

The `config.yaml` uses a Python adapter to integrate the MCP agent:

```yaml
agent:
  type: python
  import_path: examples.mcp_tool_learning.mcp_agent
  attribute: create_agent
```

The reward system provides learning signals:

```yaml
rim:
  judge_prompt: |
    Reward effective tool usage:
    - Correct tool for each task
    - Minimal redundant operations
    - Proper error handling
```

## Troubleshooting

If you see connection errors, verify the MCP server path is correct:
```python
# In mcp_agent.py, check:
server_path = Path(__file__).parent / "mcp_server.py"
```

The agent uses async/await patterns. If you see event loop errors:

```bash
# Ensure you're running with proper async support
python learning_harness.py  # Not: python -i learning_harness.py
```

If you hit rate limits, add delays between tasks:

```python
# In learning_harness.py, increase the sleep duration:
await asyncio.sleep(2)  # Change from 1 to 2 seconds
```

## Next Steps

- Customize Tools: Modify `mcp_server.py` to add domain-specific tools
- Adjust Tasks: Edit `LEARNING_TASKS` in `learning_harness.py` for your use case
- Tune Rewards: Update the `judge_prompt` in `config.yaml` to reward different behaviors
- Export for Training: Use the exported traces to fine-tune your own models (see the sketch below)
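As a starting point for the last item, a sketch of loading the exported traces. The per-record schema is not documented here, so treat any field names as assumptions and inspect a line of the file first:

```python
# Load exported traces for analysis or fine-tuning data prep.
import json

with open("mcp_traces.jsonl") as f:
    sessions = [json.loads(line) for line in f]

print(f"loaded {len(sessions)} sessions")
print(sessions[0].keys())  # inspect the actual schema before relying on fields
```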
## Cost Estimate

Approximate costs for the full 25-run learning session:
| Component | Model | Cost per run | Total (25 runs) |
|---|---|---|---|
| Student (Agent) | GPT-4.1-mini | ~$0.03 | ~$0.75 |
| Teacher (Validator) | GPT-4.1-mini | ~$0.02 | ~$0.50 |
| Reward system | Gemini-2.5-Flash | ~$0.01 | ~$0.25 |
| Total | - | ~$0.06 | ~$1.50 |
Note: Actual costs vary based on:
- Task complexity and agent token usage
- Number of tool calls per task
- Retry attempts and error handling
- Reward model evaluation depth
## Related Examples

- `atlas quickstart` - Basic Atlas SDK learning demonstration (CLI command)

## License

Apache 2.0 - See the main repository LICENSE file.