feat(utils): add token budget enforcement for all LangGraph agents by adickinson72 · Pull Request #332 · cnoe-io/ai-platform-engineering

adickinson72 · 2025-10-02T15:47:07Z

Summary

Add opt-in token budget enforcement to all LangGraph agents via BaseLangGraphAgent._wrap_mcp_tools.

token_budget.py — TokenBudgetManager tracks estimated token consumption across tool calls, raises graceful exceptions when limits are exceeded
budget_aware_tool.py — BudgetAwareTool standalone wrapper for custom tool pipelines
base_langgraph_agent.py — Budget checks wired into all three tool wrapper paths (safe_coroutine, safe_run, safe_arun); reset per query in stream()

How it works

When ENABLE_TOKEN_BUDGET=true, each MCP tool call is checked against token and call-count limits before execution. If a limit is exceeded, the tool returns a partial-results message instead of executing — no exceptions propagate to LangGraph/A2A.

All budget operations are defensively wrapped so bugs in budget tracking never crash tool execution.
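The flow described above can be sketched roughly as follows. This is a minimal illustration and not the code in token_budget.py: the `BudgetExceeded` exception, the `check`/`record`/`reset` method names, the `run_tool_with_budget` helper, and the 4-characters-per-token estimate are all assumptions made for the example.

```python
from dataclasses import dataclass


class BudgetExceeded(Exception):
    """Hypothetical exception raised by the manager; callers catch it."""


@dataclass
class TokenBudgetManager:
    # Defaults mirror the documented env var defaults.
    max_tokens: int = 20000
    max_tool_calls: int = 8
    tokens_used: int = 0
    tool_calls: int = 0

    def reset(self) -> None:
        """Start a fresh budget, e.g. once per query."""
        self.tokens_used = 0
        self.tool_calls = 0

    def check(self) -> None:
        """Raise BudgetExceeded if either limit has been reached."""
        if self.tool_calls >= self.max_tool_calls:
            raise BudgetExceeded(f"tool-call limit of {self.max_tool_calls} reached")
        if self.tokens_used >= self.max_tokens:
            raise BudgetExceeded(f"token limit of {self.max_tokens} reached")

    def record(self, output: str) -> None:
        """Count one call and add a rough token estimate (assumed ~4 chars/token)."""
        self.tool_calls += 1
        self.tokens_used += len(output) // 4


def run_tool_with_budget(budget, tool, *args, **kwargs):
    """Check the budget, then run the tool; never let budget bugs break the tool."""
    try:
        budget.check()
    except BudgetExceeded as exc:
        # Return a message instead of raising, so nothing propagates upstream.
        return f"Budget exceeded ({exc}). Returning partial results gathered so far."
    except Exception:
        pass  # a bug in budget tracking must not block execution
    result = tool(*args, **kwargs)
    try:
        budget.record(str(result))
    except Exception:
        pass  # same defensive rule for bookkeeping after the call
    return result
```

With `max_tool_calls=2`, the third call through `run_tool_with_budget` returns the message string and the underlying tool is not invoked; calling `reset()` restores the budget for the next query.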

Configuration (env vars)

Variable	Default	Description
`ENABLE_TOKEN_BUDGET`	`false`	Opt-in toggle
`AGENT_MAX_TOKENS`	`20000`	Max estimated tokens per query
`AGENT_MAX_TOOL_CALLS`	`8`	Max tool calls per query
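
One way these variables might be read is sketched below. The variable names and defaults come from the table above; the `BudgetSettings` class, the `load_budget_settings` helper, and the rule that only the string "true" (case-insensitive) enables the feature are assumptions for illustration.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class BudgetSettings:
    enabled: bool
    max_tokens: int
    max_tool_calls: int


def load_budget_settings(env: Optional[Mapping[str, str]] = None) -> BudgetSettings:
    """Read the three documented variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return BudgetSettings(
        # Assumption: only "true" (any case) turns the feature on.
        enabled=env.get("ENABLE_TOKEN_BUDGET", "false").strip().lower() == "true",
        max_tokens=int(env.get("AGENT_MAX_TOKENS", "20000")),
        max_tool_calls=int(env.get("AGENT_MAX_TOOL_CALLS", "8")),
    )
```

Passing a plain dict instead of relying on `os.environ` keeps the helper easy to exercise in tests.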

What was removed

The original mas-agent-base agent directory was deleted — its base agent/executor duplicated BaseLangGraphAgent/BaseLangGraphAgentExecutor. Only the novel token budget utilities were kept and integrated.

Test plan

Verify all existing agents work unchanged (default: budget disabled)
Set ENABLE_TOKEN_BUDGET=true and AGENT_MAX_TOOL_CALLS=3, confirm agent stops after 3 tool calls with partial results message
Verify no exceptions propagate to A2A streams when budget is exceeded
Run existing test suite

github-actions · 2025-10-02T15:49:00Z

📊 Test Coverage Report

Main Tests Coverage

Metric	Coverage	Details
Lines	12.6%	245/1940 lines
Branches	0.0%	0/0 branches

RAG Tests Coverage

Metric	Coverage	Details
Lines	59.7%	462/774 lines
Branches	35.7%	70/196 branches

📁 Coverage Artifacts

Main tests: coverage-reports-main artifact
RAG tests: coverage-reports-rag artifact
Download artifacts to view detailed HTML coverage reports

github-actions · 2026-01-10T03:01:03Z

Thank you for your contribution! This PR has been automatically marked as stale because it has no recent activity in the last 90 days. It will be closed in 7 days, if no further activity occurs. If this pull request is still relevant, please leave a comment to let us know, and the stale label will be automatically removed.

sriaradhyula · 2026-04-02T06:35:09Z

Integration Analysis: `mas_agent_base` ↔ Template Agent

Thanks for sharing this reference implementation @adickinson72. This comment proposes concrete integration options for aligning mas_agent_base with the existing template agent and BaseLangGraphAgent infrastructure.

Current State

The template agent (ai_platform_engineering/agents/template/) and all production agents (argocd, github, jira, etc.) currently rely on:

Component	Location	Size
`BaseLangGraphAgent`	`utils/a2a_common/base_langgraph_agent.py`	~112 KB
`BaseLangGraphAgentExecutor`	`utils/a2a_common/base_langgraph_agent_executor.py`	~20 KB

The PetStoreAgent in the template extends BaseLangGraphAgent using an abstract-method override pattern:

class PetStoreAgent(BaseLangGraphAgent):
    def get_agent_name(self) -> str: ...
    def get_system_instruction(self) -> str: ...
    def get_mcp_config(self, server_path: str) -> dict: ...
    def get_mcp_http_config(self) -> dict | None: ...
    def get_tool_working_message(self) -> str: ...
    def get_tool_processing_message(self) -> str: ...

mas_agent_base proposes a constructor-injection pattern instead:

class PetstoreAgent(BaseAgent):
    def __init__(self):
        super().__init__(
            agent_name="petstore",
            system_instruction=SYSTEM_PROMPT,
            mcp_config=MCPConfig(server_name="petstore", server_path="...", required_env_vars=["PETSTORE_API_KEY"]),
        )

Feature Gap Analysis

Features present in BaseLangGraphAgent that are absent from mas_agent_base.BaseAgent:

Feature	`BaseLangGraphAgent`	`mas_agent_base.BaseAgent`
Persistent checkpointing (Redis / MongoDB / PostgreSQL)	✅ via `get_checkpointer()`	❌ `InMemorySaver` only
LangMem message summarization	✅ auto-compression	❌ not present
Context token management	✅ provider-aware limits	❌ not present
Custom CA bundle / TLS (`CUSTOM_CA_BUNDLE`, `SSL_VERIFY`)	✅ `_build_httpx_client_factory()`	❌ not present
Tool output chunking (>50 KB → temp file)	✅ `_chunk_large_output()`	❌ not present
Tool output truncation with refine-query guidance	✅ `_truncate_tool_output()`	❌ not present
ExceptionGroup recovery for Go MCP servers	✅ handled	❌ not present
Multi-server MCP config (pass full dict)	✅ supported	❌ single-server `MCPConfig` only
`get_additional_tools()` hook for non-MCP tools	✅ supported	❌ not present
Date injection in system prompt	✅ auto-prepended	❌ not present
Streaming artifact accumulation (partial chunks)	✅ in executor	❌ executor emits single-pass

Features in mas_agent_base that are absent or weaker in BaseLangGraphAgent:

Feature	`BaseLangGraphAgent`	`mas_agent_base.BaseAgent`
Per-request tool-call budget (hard cap)	❌ no per-request limit	✅ `TokenBudgetManager` + `BudgetAwareTool`
Bedrock prompt caching (`create_cache_point`)	❌ not present	✅ built-in
Lazy initialization / context-manager lifecycle	❌ eagerly initializes	✅ `initialize()` / `async with`
Clean dataclass MCP config (`MCPConfig`)	❌ abstract methods	✅ structured, validatable

Integration Options

Option A — Full Migration: Template Adopts `mas_agent_base`

Replace BaseLangGraphAgent / BaseLangGraphAgentExecutor with BaseAgent / BaseAgentExecutor from this PR as the template's base.

What changes in the template:

# Before (agent.py)
from ai_platform_engineering.utils.a2a_common.base_langgraph_agent import BaseLangGraphAgent

class PetStoreAgent(BaseLangGraphAgent):
    def get_mcp_config(self, server_path: str) -> dict:
        return {"command": "uv", "args": [...], "env": {...}, "transport": "stdio"}
    ...

# After (agent.py)
from ai_platform_engineering.agents.mas_agent_base import BaseAgent, MCPConfig

class PetStoreAgent(BaseAgent):
    def __init__(self):
        super().__init__(
            agent_name="petstore",
            system_instruction=SYSTEM_PROMPT,
            mcp_config=MCPConfig(
                server_name="petstore",
                server_path="path/to/mcp_petstore/__main__.py",
                required_env_vars=["PETSTORE_API_KEY"],
                transport="stdio",
            ),
        )

# Before (agent_executor.py)
from ai_platform_engineering.utils.a2a_common.base_langgraph_agent_executor import BaseLangGraphAgentExecutor

class PetStoreAgentExecutor(BaseLangGraphAgentExecutor):
    def __init__(self):
        super().__init__(PetStoreAgent())

# After (agent_executor.py) — nearly identical
from ai_platform_engineering.agents.mas_agent_base import BaseAgentExecutor

class PetStoreAgentExecutor(BaseAgentExecutor):
    def __init__(self):
        super().__init__(PetStoreAgent())

Trade-offs:


✅ Cleaner, more readable agent definitions
✅ `MCPConfig` is self-documenting and validatable
✅ Bedrock prompt caching for free
✅ Hard per-request tool-call budget
❌ Loses persistent checkpointing — sessions reset on pod restart
❌ Loses LangMem summarization — long conversations will hit context limits
❌ Loses TLS / CA bundle support — breaks enterprise proxies
❌ Loses tool output chunking — large API responses risk context overflow
❌ `MCPConfig` only handles single-server stdio or single HTTP endpoint

Verdict: Not recommended as a direct drop-in today. The feature regressions are too significant for production agents. Better suited as the base for new, simple agents that do not need persistent memory or complex MCP setups.

Option B — Selective Backport: Cherry-Pick Features Into Existing Base

Keep BaseLangGraphAgent as the production base, but pull specific innovations from mas_agent_base into it.

B1 — Adopt MCPConfig dataclass (replaces abstract get_mcp_config / get_mcp_http_config):

# utils/mcp_config.py — extend with MCPConfig dataclass
from dataclasses import dataclass

@dataclass
class MCPConfig:
    server_name: str
    required_env_vars: list[str]
    transport: str = "stdio"
    server_path: str | None = None
    http_url: str | None = None
    http_headers: dict[str, str] | None = None

    def validate_env_vars(self) -> None: ...
    def get_client_config(self) -> dict: ...

# BaseLangGraphAgent — add optional hook
def get_mcp_config_object(self) -> MCPConfig | None:
    return None  # subclasses opt in; existing abstract methods still work

This is backward-compatible — existing agents keep abstract methods, new agents use MCPConfig.
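
To make the validation idea concrete, here is a hedged sketch of what the elided `validate_env_vars` body could look like. The field names follow the dataclass above (with `http_headers` omitted for brevity); the `__post_init__` transport check, the optional `env` parameter, and the choice of `EnvironmentError` are assumptions, not the PR's implementation.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class MCPConfig:
    server_name: str
    required_env_vars: list[str]
    transport: str = "stdio"
    server_path: Optional[str] = None
    http_url: Optional[str] = None

    def __post_init__(self) -> None:
        # Assumed rule: stdio needs a script path, anything else needs a URL.
        if self.transport == "stdio" and not self.server_path:
            raise ValueError("stdio transport requires server_path")
        if self.transport != "stdio" and not self.http_url:
            raise ValueError(f"{self.transport} transport requires http_url")

    def validate_env_vars(self, env: Optional[Mapping[str, str]] = None) -> None:
        """Fail fast, naming every required variable that is unset or empty."""
        env = os.environ if env is None else env
        missing = [name for name in self.required_env_vars if not env.get(name)]
        if missing:
            raise EnvironmentError(
                f"{self.server_name}: missing required env vars: {', '.join(missing)}"
            )
```

Reporting all missing variables in one error, rather than stopping at the first, saves a restart cycle when several are absent.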

B2 — Adopt TokenBudgetManager + BudgetAwareTool (adds hard tool-call cap):

# In BaseLangGraphAgent._setup_mcp_and_graph():
if self.enable_token_budget:
    tools = self._wrap_tools_with_budget(tools)

This is additive: it is enabled with the ENABLE_TOKEN_BUDGET env var (disabled by default) and capped by AGENT_MAX_TOKENS and AGENT_MAX_TOOL_CALLS.
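
A rough sketch of the wrapping step for B2, using plain async callables in place of real LangChain tools. The standalone `wrap_tools_with_budget` function, the shared counter, and the wording of the returned message are assumptions; the actual `_wrap_tools_with_budget` method may differ.

```python
import asyncio
import functools


def wrap_tools_with_budget(tools, max_tool_calls):
    """Return guarded copies of async tools that share one call counter."""
    state = {"calls": 0}  # shared across all wrapped tools for one query

    def wrap(tool):
        @functools.wraps(tool)  # keep the tool's name and docstring visible
        async def guarded(*args, **kwargs):
            if state["calls"] >= max_tool_calls:
                # Return a message instead of raising, so the graph keeps running.
                return "Tool-call budget exhausted; answer with the partial results gathered so far."
            state["calls"] += 1
            return await tool(*args, **kwargs)

        return guarded

    return [wrap(tool) for tool in tools]
```

Because the counter is shared, the cap applies to the query as a whole and not to each tool separately.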

B3 — Adopt prompt caching (Bedrock create_cache_point):

# In BaseLangGraphAgent.__init__():
if enable_prompt_caching and hasattr(self.model, "create_cache_point"):
    self.cache_point = self.model.create_cache_point()

B4 — Adopt lazy initialization pattern (decouple MCP startup from object construction):

Currently, BaseLangGraphAgent initializes the graph synchronously at import time, while the actual MCP tool loading is deferred. Aligning with BaseAgent's explicit initialize() + async with pattern would improve testability and allow agents to start without live MCP servers available.
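
The initialize() + async with pattern could look roughly like this. `LazyAgent`, its `loader` argument, and the `close()` method are hypothetical names chosen for the sketch; only the idea of deferring tool loading until `initialize()` is taken from the description above.

```python
import asyncio


class LazyAgent:
    """Construction is cheap; MCP tools are loaded only on initialize()."""

    def __init__(self, loader):
        self._loader = loader  # coroutine function that returns the tool list
        self.tools = None
        self.initialized = False

    async def initialize(self) -> None:
        # Idempotent: repeated calls load the tools only once.
        if not self.initialized:
            self.tools = await self._loader()
            self.initialized = True

    async def close(self) -> None:
        self.tools = None
        self.initialized = False

    async def __aenter__(self) -> "LazyAgent":
        await self.initialize()
        return self

    async def __aexit__(self, exc_type, exc, tb) -> None:
        await self.close()
```

In a test, the loader can be a stub coroutine, so the agent object can be constructed and inspected without a live MCP server.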

Trade-offs:


✅ No regressions — all existing features preserved
✅ Each change is independently mergeable
✅ Aligns interfaces without rewriting ~130 KB of base classes
❌ Does not achieve full consolidation — two base classes still exist
❌ More incremental, slower to clean up the overall architecture

Verdict: Recommended near-term path. B1 + B2 are the highest-value changes and can be done in 2–3 focused PRs.

Option C — Parallel Base Classes: `mas_agent_base` as Lightweight Tier

Ship mas_agent_base alongside BaseLangGraphAgent as an explicitly lighter tier for agents that do not need persistent memory or advanced context management.

The template agent ships two variants:

agents/template/
  agent_petstore/               ← current (uses BaseLangGraphAgent, full-featured)
  agent_petstore_simple/        ← new (uses BaseAgent from mas_agent_base, lighter)

Document the trade-offs clearly so agent authors choose the right tier. mas_agent_base becomes the entry point for new agents; BaseLangGraphAgent is the upgrade path when you need persistence or context management.

Trade-offs:


✅ Both tiers coexist — no migrations required
✅ Simpler onboarding for new agent authors using the lightweight path
✅ `mas_agent_base` is immediately useful without full parity
❌ Two diverging base classes increase maintenance surface
❌ Feature gaps between tiers create confusion about when to upgrade
❌ Requires clear docs + governance to prevent fragmentation

Verdict: Viable if paired with a clear migration guide. Requires agreement on which features will eventually be unified and which will remain tier-specific.

Recommended Path

Short term (this PR + 2 follow-ups):

Merge mas_agent_base as-is, scoped to agents/mas_agent_base/ (Option C foundation).
Add MCPConfig dataclass to utils/mcp_config.py and make BaseLangGraphAgent accept it as an alternative to abstract methods (Option B1).
Port TokenBudgetManager + BudgetAwareTool into BaseLangGraphAgent as opt-in (Option B2).

Medium term:

Port Bedrock prompt caching into BaseLangGraphAgent (Option B3).
Update the template agent to demonstrate mas_agent_base as the lightweight path (Option C).
Add persistent checkpointing and TLS support to mas_agent_base.BaseAgent to close the gap with BaseLangGraphAgent.

Long term:

When mas_agent_base reaches feature parity, deprecate BaseLangGraphAgent in favor of the cleaner constructor-injection pattern.

Open Questions

Persistent memory: Is InMemorySaver-only acceptable for agents using mas_agent_base? Or should BaseAgent.__init__ accept an optional checkpointer parameter?
Multi-server MCP: MCPConfig currently models a single server. Several production agents (e.g. GitHub) use multi-server configs. Should MCPConfig support a list of servers, or should BaseAgent accept list[MCPConfig]?
Streaming executor: BaseAgentExecutor._handle_agent_event emits one artifact per completion event, while BaseLangGraphAgentExecutor accumulates chunks and streams progressively. Which behavior do downstream consumers (UI, supervisor) require?
Module location: Should mas_agent_base live under agents/ (current PR) or under utils/a2a_common/ where the other base classes live?
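
For the multi-server question, one possible shape of the list[MCPConfig] option is sketched below: each per-server config is merged into a single dict keyed by server name. `ServerConfig` is a simplified stand-in for `MCPConfig`, and `merge_server_configs` and the example config values are hypothetical; this is meant only to illustrate the trade-off, not to settle it.

```python
from dataclasses import dataclass


@dataclass
class ServerConfig:
    """Simplified stand-in for MCPConfig: a name plus its client config dict."""
    server_name: str
    client_config: dict


def merge_server_configs(configs):
    """Combine per-server configs into one dict keyed by server name."""
    merged = {}
    for cfg in configs:
        if cfg.server_name in merged:
            # Duplicate names would silently overwrite each other otherwise.
            raise ValueError(f"duplicate MCP server name: {cfg.server_name}")
        merged[cfg.server_name] = cfg.client_config
    return merged
```

Under this approach each config stays single-server and individually validatable, while the agent still hands one combined dict to its MCP client.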

Add MAS Agent Core shared base classes and utilities for building A2A agents, including BaseAgent, BaseAgentExecutor, TokenBudgetManager, and MCP config. Signed-off-by: Adam Dickinson <adickinson@demandbase.com>

…conventions Rename mas_agent_base to mas-agent-base folder, add pyproject.toml, Makefile, uv.lock, and Docker compose entries to match existing agent conventions. Register in .github/agents.json for CI builds. Signed-off-by: Sri Aradhyula <sraradhy@cisco.com> Assisted-by: Claude:claude-opus-4-6 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

github-actions · 2026-04-14T23:02:58Z