[FEATURE] Add optional security metadata to tool definitions #2154

@srbhsrkr

Problem Statement

The SDK's tool system has zero security classification. Every registered tool — whether it reads a file or deletes a database — is treated identically by the event loop. There is no way to declare a tool's safety profile, and no built-in permission gate before tool execution.

What exists today:

  • @tool decorator supports only: func, description, inputSchema, name, context. No security parameters.
  • ToolSpec TypedDict has only: description, inputSchema, name, outputSchema. No permission fields.
  • AgentTool base class has: tool_name, tool_spec, tool_type, supports_hot_reload, is_dynamic. No security properties.
  • BeforeToolCallEvent with cancel_tool field enables permission gating via hooks, but requires users to implement all classification logic externally.
  • MCPClient.ToolFilters filters tools at load time by name pattern, but performs no runtime permission checks.

What's missing:

  • No way for tools to self-declare: "I'm read-only", "I'm destructive", "I need confirmation".
  • No mechanism for hooks to reason about tool safety without hardcoding tool-name-to-permission mappings.
  • MCP tools have no runtime authorization beyond load-time name filtering.

Why this matters:

  • Tool misuse is OWASP #5 for agentic applications. The SDK already has the hook infrastructure (BeforeToolCallEvent + cancel_tool). What's missing is the metadata on tools for hooks to reason about, and a reference implementation that users can adopt.
  • Without metadata, every permission hook must hardcode tool-name lists. Metadata makes permission policies declarative and portable.
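
To make the contrast concrete, here is a minimal sketch of the two styles of gating. `FakeTool` and `FakeBeforeToolCallEvent` are hypothetical stand-ins written for this example, not SDK classes; only the `selected_tool` and `cancel_tool` field names are taken from the issue text.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


# Hypothetical stand-ins so the sketch runs without the SDK installed.
@dataclass
class FakeTool:
    tool_name: str
    metadata: dict[str, Any] = field(default_factory=dict)


@dataclass
class FakeBeforeToolCallEvent:
    selected_tool: FakeTool
    cancel_tool: Optional[str] = None


# Today: the hook has to know every dangerous tool by name.
DESTRUCTIVE_TOOL_NAMES = {"delete_file", "drop_table"}


def gate_by_name(event: FakeBeforeToolCallEvent) -> None:
    if event.selected_tool.tool_name in DESTRUCTIVE_TOOL_NAMES:
        event.cancel_tool = "Blocked by name list."


# Proposed: the hook reads what the tool declares about itself.
def gate_by_metadata(event: FakeBeforeToolCallEvent) -> None:
    if event.selected_tool.metadata.get("destructive", False):
        event.cancel_tool = "Blocked by declared metadata."


# A destructive tool that the name list has never heard of.
new_tool = FakeTool("purge_cache", metadata={"destructive": True})

by_name = FakeBeforeToolCallEvent(new_tool)
gate_by_name(by_name)

by_metadata = FakeBeforeToolCallEvent(new_tool)
gate_by_metadata(by_metadata)
```

In this sketch the name-based gate lets `purge_cache` through because nobody added it to the list, while the metadata-based gate blocks it without any per-project configuration.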

Proposed Solution

Updated based on feedback in the comments — generalized to a metadata dict so this interface also covers #1261 (tags) and #1598 (tool-type detection), and added MCP annotation mapping. Original proposal used security-specific fields only.

Phase 1: Generic tool metadata + typed accessors for safety (non-breaking)

Rather than adding security-specific fields, introduce a generic metadata dict on AgentTool with thin typed accessors for well-known keys. A single interface then covers this issue, #1261 (tags), and #1598 (tool-type detection).

AgentTool base class:

from abc import ABC
from typing import Any


class AgentTool(ABC):
    @property
    def metadata(self) -> dict[str, Any]:
        """Arbitrary tool metadata. Well-known keys have convenience accessors."""
        return {}

    # Typed accessors for well-known safety keys (safe defaults)
    @property
    def is_read_only(self) -> bool:
        return bool(self.metadata.get("read_only", False))

    @property
    def is_destructive(self) -> bool:
        return bool(self.metadata.get("destructive", False))

    @property
    def requires_confirmation(self) -> bool:
        return bool(self.metadata.get("requires_confirmation", False))
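
As a rough illustration of how subclasses would interact with these accessors, the following self-contained sketch uses a trimmed stand-in named `AgentToolSketch` (hypothetical, not the real SDK base class). The subclass names are also invented for the example.

```python
from abc import ABC
from typing import Any


class AgentToolSketch(ABC):
    """Trimmed stand-in for the proposed base class (not the SDK's AgentTool)."""

    @property
    def metadata(self) -> dict[str, Any]:
        # Default: no metadata, meaning the tool is unclassified.
        return {}

    @property
    def is_read_only(self) -> bool:
        return bool(self.metadata.get("read_only", False))

    @property
    def is_destructive(self) -> bool:
        return bool(self.metadata.get("destructive", False))


class ListFilesTool(AgentToolSketch):
    """Hypothetical tool that opts in by overriding `metadata`."""

    @property
    def metadata(self) -> dict[str, Any]:
        return {"read_only": True, "tags": ["filesystem"]}


class LegacyTool(AgentToolSketch):
    """Hypothetical existing subclass that knows nothing about metadata."""


listing = ListFilesTool()
legacy = LegacyTool()
```

The legacy subclass keeps working unchanged: its accessors simply report `False`, and its empty `metadata` dict is what lets a policy tell "unclassified" apart from "explicitly not read-only".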

@tool decorator — accepts either named safety params or a raw metadata dict (they populate the same underlying map):

@tool(read_only=True)
def list_files(directory: str) -> list[str]: ...

@tool(destructive=True, requires_confirmation=True)
def delete_file(path: str) -> str: ...

@tool(metadata={"read_only": True, "tags": ["filesystem"], "cost_tier": "low"})
def stat_file(path: str) -> dict: ...
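
One possible way the decorator could merge the two input styles is sketched below. `tool_sketch` and the `tool_metadata` attribute are hypothetical names used only here, and raising on a conflict between a named parameter and the raw dict is an assumption; the issue does not specify that behavior.

```python
from typing import Any, Callable, Optional


def tool_sketch(
    func: Optional[Callable[..., Any]] = None,
    *,
    read_only: Optional[bool] = None,
    destructive: Optional[bool] = None,
    requires_confirmation: Optional[bool] = None,
    metadata: Optional[dict[str, Any]] = None,
):
    """Hypothetical decorator: folds named safety params into one metadata map."""

    def decorate(f: Callable[..., Any]) -> Callable[..., Any]:
        merged = dict(metadata or {})  # copy so the caller's dict is not mutated
        named = {
            "read_only": read_only,
            "destructive": destructive,
            "requires_confirmation": requires_confirmation,
        }
        for key, value in named.items():
            if value is None:
                continue  # parameter not supplied; leave the key absent
            if key in merged and merged[key] != value:
                # Assumption: contradictory inputs are an error, not a silent override.
                raise ValueError(f"conflicting values for metadata key {key!r}")
            merged[key] = value
        f.tool_metadata = merged  # hypothetical attribute name
        return f

    # Support both @tool_sketch and @tool_sketch(...).
    return decorate(func) if func is not None else decorate


@tool_sketch(read_only=True)
def list_files(directory: str) -> list[str]:
    return []


@tool_sketch(destructive=True, requires_confirmation=True)
def delete_file(path: str) -> str:
    return path


@tool_sketch(metadata={"read_only": True, "tags": ["filesystem"], "cost_tier": "low"})
def stat_file(path: str) -> dict:
    return {}


@tool_sketch
def plain_tool(x: int) -> int:
    return x
```

Leaving unspecified keys out of the map, rather than writing `False`, keeps an undecorated tool distinguishable from one that was explicitly marked.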

MCP tools: MCPAgentTool maps the MCP spec's tool annotations (readOnlyHint, destructiveHint, idempotentHint, openWorldHint) into the same metadata dict, so hooks reason about MCP and native tools uniformly. Servers that don't provide annotations yield empty metadata — "unclassified," not silently "safe."
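
The mapping could look roughly like the sketch below. The function name is hypothetical, and the `idempotent` and `open_world` key names are assumptions, since the issue only names `read_only`, `destructive`, and `requires_confirmation` as well-known keys. The sketch copies only hints the server actually sent and does not apply any default values for missing hints.

```python
from typing import Any, Mapping, Optional

# MCP hint name -> metadata key. The last two target keys are assumed names.
_HINT_TO_KEY = {
    "readOnlyHint": "read_only",
    "destructiveHint": "destructive",
    "idempotentHint": "idempotent",
    "openWorldHint": "open_world",
}


def metadata_from_mcp_annotations(
    annotations: Optional[Mapping[str, Any]],
) -> dict[str, Any]:
    """Translate MCP tool annotations into the shared metadata dict.

    Missing annotations, or missing individual hints, produce no keys at all,
    so the tool stays "unclassified" instead of being treated as safe.
    """
    if not annotations:
        return {}
    return {
        key: bool(annotations[hint])
        for hint, key in _HINT_TO_KEY.items()
        if annotations.get(hint) is not None
    }


no_annotations = metadata_from_mcp_annotations(None)
read_only_tool = metadata_from_mcp_annotations({"readOnlyHint": True})
mixed = metadata_from_mcp_annotations({"destructiveHint": False, "title": "Search"})
```

Unrelated annotation fields such as `title` are ignored here; whether they should also be carried into metadata is left open.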

Classification model: Instead of two independent booleans (which leave (False, False) ambiguous), the initial well-known keys express the three-tier model cleanly: no flags = unclassified, read_only=True = no side effects, destructive=True = irreversible. requires_confirmation is an orthogonal gating hint.
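
The three tiers can be derived from the metadata dict with a small helper, sketched here. `SafetyTier` and `classify` are hypothetical names, and letting `destructive` win when both flags are set is an assumption chosen to err on the cautious side.

```python
from enum import Enum
from typing import Any, Mapping


class SafetyTier(Enum):
    UNCLASSIFIED = "unclassified"  # no flags declared
    READ_ONLY = "read_only"        # declared to have no side effects
    DESTRUCTIVE = "destructive"    # declared irreversible


def classify(metadata: Mapping[str, Any]) -> SafetyTier:
    """Collapse the well-known safety keys into one of three tiers."""
    # Assumption: contradictory flags resolve to the more dangerous tier.
    if metadata.get("destructive", False):
        return SafetyTier.DESTRUCTIVE
    if metadata.get("read_only", False):
        return SafetyTier.READ_ONLY
    return SafetyTier.UNCLASSIFIED


unflagged = classify({})
reader = classify({"read_only": True})
contradictory = classify({"read_only": True, "destructive": True})
```

`requires_confirmation` is deliberately not part of the tier, since the issue treats it as an orthogonal gating hint.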

Phase 2: Reference PermissionPolicy hook (optional, opt-in)

Provide a reference PermissionPolicy hook that uses the typed accessors:

class PermissionPolicy(HookProvider):
    """Reference implementation: gates destructive tools via cancel_tool."""

    def before_tool_call(self, event: BeforeToolCallEvent):
        tool = event.selected_tool
        if tool.is_destructive or tool.requires_confirmation:
            event.cancel_tool = "This tool requires confirmation."

Users opt in by adding hooks=[PermissionPolicy()] to their Agent.
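
Since unclassified is not meant to be read as safe, some users may want a stricter, allowlist-style variant that only lets explicitly read-only tools through. The sketch below uses hypothetical stand-ins (`StubTool`, `StubEvent`, `StrictPermissionPolicy`) instead of the SDK's hook classes, and it assumes `selected_tool` may be `None` when the requested tool cannot be resolved.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


# Hypothetical stand-ins so the sketch runs without the SDK installed.
@dataclass
class StubTool:
    tool_name: str
    metadata: dict[str, Any] = field(default_factory=dict)


@dataclass
class StubEvent:
    selected_tool: Optional[StubTool]
    cancel_tool: Optional[str] = None


class StrictPermissionPolicy:
    """Allowlist variant: only tools that declare read_only=True may run."""

    def before_tool_call(self, event: StubEvent) -> None:
        tool = event.selected_tool
        if tool is None:
            # Assumption: an unresolved tool is blocked.
            event.cancel_tool = "Unknown tool."
            return
        md = tool.metadata
        allowed = (
            md.get("read_only") is True
            and not md.get("destructive", False)
            and not md.get("requires_confirmation", False)
        )
        if not allowed:
            event.cancel_tool = f"Tool '{tool.tool_name}' is not declared read-only."


policy = StrictPermissionPolicy()

reader_event = StubEvent(StubTool("list_files", {"read_only": True}))
unclassified_event = StubEvent(StubTool("mystery_tool"))
destructive_event = StubEvent(StubTool("delete_file", {"destructive": True}))

for ev in (reader_event, unclassified_event, destructive_event):
    policy.before_tool_call(ev)
```

Here the unclassified tool is blocked alongside the destructive one, which is the behavioral difference from the reference hook above.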

Explicitly out of scope for this issue

  • Capability intersection across agent delegation — trust propagation in multi-agent graphs is a separate design.
  • Budget / cost enforcement — orthogonal to safety classification; belongs in a cost-control proposal.
  • Runtime policy hot-reload — deployment concern, not SDK surface.

Trade-offs

| Concern | Analysis |
| --- | --- |
| MCP tools can't use @tool decorator | MCPAgentTool maps MCP spec tool annotations (readOnlyHint, destructiveHint, idempotentHint, openWorldHint) into the shared metadata dict. No name heuristics needed. |
| Defaults matter | Typed accessors return False when the key is absent, but absence means "unclassified," not "safe." Policies should gate on explicit read_only=True, not on the absence of destructive=True. |
| Doesn't this duplicate hooks? | Hooks enable custom logic, but without metadata on the tool, every hook must hardcode tool-name-to-permission mappings. Metadata makes policies declarative and portable across projects. |
| Why a generic metadata dict instead of typed fields? | Keeps the interface extensible: #1261 (tags) and #1598 (tool-type detection) want different keys on the same surface. Typed accessors give IDE autocomplete and type-safety for the well-known keys without locking the schema. |
| Breaking change risk | Zero. All new fields are optional with backward-compatible defaults. Existing @tool functions and AgentTool subclasses work unchanged. |
