Tool System

Overview

Every action Claude Code can take is modeled as a Tool. The tool system provides a uniform interface for schema definition, permission checking, execution, and progress reporting. Tools are the bridge between the LLM's intent (expressed as tool_use blocks in the API response) and actual side effects on the user's system.

Tool Registry

src/tools.ts is the central registry. The primary entry point is getAllBaseTools(), which returns the exhaustive list of all tools that could be available in the current environment. Tools are conditionally included based on:

  • Feature flags (compile-time via bun:bundle): feature('PROACTIVE'), feature('KAIROS'), feature('AGENT_TRIGGERS'), feature('COORDINATOR_MODE'), feature('CONTEXT_COLLAPSE'), feature('TERMINAL_PANEL'), feature('WEB_BROWSER_TOOL'), feature('HISTORY_SNIP'), feature('UDS_INBOX'), feature('WORKFLOW_SCRIPTS'), etc. Dead code elimination removes entire tool imports when flags are off.
  • User type (process.env.USER_TYPE === 'ant'): Internal-only tools like REPLTool, ConfigTool, TungstenTool, SuggestBackgroundPRTool are gated by this check.
  • Environment variables: ENABLE_LSP_TOOL, CLAUDE_CODE_VERIFY_PLAN, CLAUDE_CODE_SIMPLE, NODE_ENV=test.
  • Runtime feature checks: isTodoV2Enabled(), isWorktreeModeEnabled(), isAgentSwarmsEnabled(), isPowerShellToolEnabled(), isReplModeEnabled(), hasEmbeddedSearchTools().

The getTools() function applies additional runtime filtering:

  1. Simple mode (CLAUDE_CODE_SIMPLE): Restricts to only BashTool, FileReadTool, FileEditTool (or REPLTool in REPL mode).
  2. Deny rules: filterToolsByDenyRules() removes tools that have blanket deny rules in the permission context.
  3. REPL mode: When REPL is active and available, primitive tools (REPL_ONLY_TOOLS) are hidden from direct use -- they are accessible inside the REPL VM context instead.
  4. isEnabled() check: Each tool's isEnabled() method is called; disabled tools are filtered out.

Tool Pool Assembly

assembleToolPool() is the single source of truth for combining built-in tools with MCP tools. It:

  1. Gets built-in tools via getTools().
  2. Filters MCP tools by deny rules.
  3. Sorts each partition (built-in and MCP) alphabetically by name for prompt-cache stability, keeping built-ins as a contiguous prefix.
  4. Deduplicates via uniqBy('name') with built-in tools taking precedence.

This ordering is critical: the API's cache policy places a breakpoint after built-in tools, so interleaving MCP tools into that prefix would invalidate downstream cache keys.
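
The assembly order above can be sketched as follows. This is a minimal illustration, not the actual implementation: the ToolLike shape, the function names, and the plain code-unit string comparison are assumptions (the real comparator and the uniqBy helper are not shown in this document).

```typescript
// Hypothetical minimal shape; the real Tool type has many more members.
interface ToolLike {
  name: string;
  isMcp: boolean;
}

// Sort a partition by name so the serialized tool list is stable across runs.
function sortByName(tools: ToolLike[]): ToolLike[] {
  return [...tools].sort((a, b) => (a.name < b.name ? -1 : a.name > b.name ? 1 : 0));
}

// Built-ins stay a contiguous prefix; the first occurrence of a name wins,
// so a built-in tool takes precedence over an MCP tool with the same name.
function assembleToolPoolSketch(builtIn: ToolLike[], mcp: ToolLike[]): ToolLike[] {
  const ordered = [...sortByName(builtIn), ...sortByName(mcp)];
  const seen = new Set<string>();
  const result: ToolLike[] = [];
  for (const tool of ordered) {
    if (seen.has(tool.name)) continue; // drop later duplicates
    seen.add(tool.name);
    result.push(tool);
  }
  return result;
}
```

Sorting each partition separately, rather than sorting the merged list, is what keeps MCP tools from being interleaved into the built-in prefix.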

Built-in Tools

Core File Operations

| Tool | Purpose | Concurrency Safe |
| --- | --- | --- |
| BashTool | Shell command execution with permission classifier | Only when read-only |
| FileReadTool | Read files (text, images, PDFs, Jupyter notebooks) | Yes |
| FileEditTool | Partial file modification (string replacement) | No |
| FileWriteTool | Create or overwrite files | No |
| GlobTool | File pattern matching (fast, any codebase size) | Yes |
| GrepTool | ripgrep-based content search with regex support | Yes |
| NotebookEditTool | Jupyter notebook cell editing | No |

Note: When hasEmbeddedSearchTools() is true (Anthropic-native builds with bfs/ugrep embedded in the Bun binary), GlobTool and GrepTool are omitted since find/grep in the shell are aliased to fast embedded equivalents.

Web and Search

| Tool | Purpose |
| --- | --- |
| WebFetchTool | Fetch and process URL content |
| WebSearchTool | Web search |
| LSPTool | Language Server Protocol integration (gated by ENABLE_LSP_TOOL) |

Agent and Task Management

| Tool | Purpose |
| --- | --- |
| AgentTool | Spawn sub-agents (local, worktree, remote, background) |
| SendMessageTool | Inter-agent messaging (lazy-loaded to break circular dependency) |
| TeamCreateTool | Create team of parallel agents (lazy-loaded, gated by agent swarms) |
| TeamDeleteTool | Remove agent teams (lazy-loaded, gated by agent swarms) |
| TaskCreateTool | Create tracked tasks (gated by isTodoV2Enabled()) |
| TaskUpdateTool | Update task status |
| TaskGetTool | Read task details |
| TaskListTool | List all tasks |
| TaskStopTool | Stop running tasks |
| TaskOutputTool | Read task output |

Planning and Workflow

| Tool | Purpose |
| --- | --- |
| EnterPlanModeTool | Switch to read-only planning mode |
| ExitPlanModeV2Tool | Exit plan mode with approval |
| EnterWorktreeTool | Create isolated git worktree |
| ExitWorktreeTool | Exit worktree, merge changes |
| SkillTool | Execute registered skills |
| ToolSearchTool | Discover deferred tools by keyword |
| TodoWriteTool | Write todo items |

Feature-Gated Tools

| Tool | Feature Flag | Purpose |
| --- | --- | --- |
| SleepTool | PROACTIVE or KAIROS | Wait in proactive mode |
| CronCreateTool | AGENT_TRIGGERS | Schedule recurring triggers |
| CronDeleteTool | AGENT_TRIGGERS | Remove scheduled triggers |
| CronListTool | AGENT_TRIGGERS | List scheduled triggers |
| RemoteTriggerTool | AGENT_TRIGGERS_REMOTE | Trigger remote agents |
| MonitorTool | MONITOR_TOOL | System monitoring |
| BriefTool | Always included | Brief notifications |
| PushNotificationTool | KAIROS or KAIROS_PUSH_NOTIFICATION | Send push notifications |
| SendUserFileTool | KAIROS | Send files to user |
| SubscribePRTool | KAIROS_GITHUB_WEBHOOKS | Subscribe to PR events |
| WebBrowserTool | WEB_BROWSER_TOOL | Browser automation |
| OverflowTestTool | OVERFLOW_TEST_TOOL | Test overflow handling |
| CtxInspectTool | CONTEXT_COLLAPSE | Inspect context state |
| TerminalCaptureTool | TERMINAL_PANEL | Capture terminal output |
| SnipTool | HISTORY_SNIP | Snip conversation history |
| ListPeersTool | UDS_INBOX | List peer agents |
| WorkflowTool | WORKFLOW_SCRIPTS | Execute workflow scripts |
| PowerShellTool | isPowerShellToolEnabled() | PowerShell execution (Windows) |

Internal-Only Tools (USER_TYPE=ant)

| Tool | Purpose |
| --- | --- |
| REPLTool | REPL-based execution that wraps primitive tools in a VM |
| ConfigTool | Configuration management |
| TungstenTool | Virtual terminal abstraction |
| SuggestBackgroundPRTool | Suggest background PR creation |

Agent-Disallowed Tools

The constants file (src/constants/tools.ts) defines which tools are restricted for sub-agents:

  • ALL_AGENT_DISALLOWED_TOOLS: Tools forbidden for all agents -- TaskOutputTool, ExitPlanModeV2Tool, EnterPlanModeTool, AskUserQuestionTool, TaskStopTool, WorkflowTool. AgentTool is also blocked for non-ant users to prevent recursive agent spawning.
  • ASYNC_AGENT_ALLOWED_TOOLS: Explicit allowlist for async agents -- file operations, web tools, shell tools, skill tool, worktree tools.
  • IN_PROCESS_TEAMMATE_ALLOWED_TOOLS: Additional tools for in-process teammates -- task CRUD, messaging, and cron tools.
  • COORDINATOR_MODE_ALLOWED_TOOLS: The coordinator sees only AgentTool, TaskStopTool, SendMessageTool, and SyntheticOutputTool.

Tool Interface

Every tool implements the Tool type defined in src/Tool.ts. The interface is comprehensive, covering execution, permissions, validation, rendering, and metadata. Tools are constructed via the buildTool() helper, which fills in safe defaults for commonly-stubbed methods.

Core Methods

| Method | Purpose | Default |
| --- | --- | --- |
| name | Unique identifier string | (required) |
| aliases | Backwards-compatible names for renamed tools | none |
| searchHint | 3-10 word capability phrase for ToolSearch keyword matching | none |
| description() | LLM-facing description (used for the API tool description field) | (required) |
| prompt() | Full prompt text for system prompt generation | (required) |
| inputSchema | Zod v4 schema defining parameters | (required) |
| inputJSONSchema | Optional direct JSON Schema (MCP tools bypass Zod conversion) | none |
| outputSchema | Zod schema for output validation | none |
| call() | Async execution function receiving validated input + ToolUseContext | (required) |
| maxResultSizeChars | Threshold for persisting large results to disk | (required) |

State Query Methods

| Method | Purpose | Default |
| --- | --- | --- |
| isEnabled() | Whether the tool should be included in the tool pool | true |
| isReadOnly() | Whether the tool only reads state | false (assume writes) |
| isConcurrencySafe() | Whether the tool can run concurrently with others | false (assume not safe) |
| isDestructive() | Whether the tool performs irreversible operations | false |
| isOpenWorld() | Whether the tool accesses external resources | none |
| shouldDefer | Whether the tool should be deferred for ToolSearch | none |
| alwaysLoad | Whether to never defer, even when ToolSearch is active | none |
| interruptBehavior() | What happens on user interrupt: 'cancel' or 'block' | 'block' |

Permission Methods

| Method | Purpose | Default |
| --- | --- | --- |
| validateInput() | Check if tool is allowed with given input/context | none (skip) |
| checkPermissions() | Tool-specific permission logic, called after validateInput() | {behavior: 'allow'} |
| preparePermissionMatcher() | Compile permission rule patterns for hook if conditions | none |

Rendering Methods

| Method | Purpose |
| --- | --- |
| renderToolUseMessage() | Render the tool invocation in the UI |
| renderToolResultMessage() | Render the tool result |
| renderToolUseProgressMessage() | Progress UI while tool runs |
| renderToolUseQueuedMessage() | UI when tool is queued |
| renderToolUseRejectedMessage() | UI when tool use is rejected |
| renderToolUseErrorMessage() | UI for tool errors |
| renderGroupedToolUse() | Render multiple parallel instances as a group |
| renderToolUseTag() | Optional metadata tag after tool use message |
| userFacingName() | Human-readable tool name for UI |
| getToolUseSummary() | Short string for compact views |
| getActivityDescription() | Present-tense description for spinner display |
| extractSearchText() | Flattened text for transcript search indexing |
| isResultTruncated() | Whether non-verbose rendering is truncated |

Classifier and Observability Methods

| Method | Purpose | Default |
| --- | --- | --- |
| toAutoClassifierInput() | Compact representation for security classifier | '' (skip) |
| isSearchOrReadCommand() | Classify as search/read/list for collapsible display | none |
| mapToolResultToToolResultBlockParam() | Convert output to API tool_result format | (required) |
| backfillObservableInput() | Add legacy/derived fields to copies for observers | none |
| inputsEquivalent() | Check if two inputs are semantically identical | none |

Input Validation

Tool inputs go through a multi-stage validation pipeline:

  1. LLM output parsing: Raw JSON from Claude API response is extracted from tool_use blocks.
  2. Zod schema validation: Input validated against the tool's Zod v4 inputSchema via safeParse(). Tools use lazySchema() to defer schema construction until first use.
  3. Input validation: If the tool defines validateInput(), it is called to check context-specific constraints (e.g., path restrictions, mode compatibility).
  4. Error feedback: If any validation fails, a descriptive error message is sent back to the LLM as a tool_result with is_error: true, wrapped in <tool_use_error> tags, enabling self-correction.

The formatZodValidationError() utility formats Zod errors into human-readable messages for the LLM.
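
The pipeline above can be sketched without Zod as follows. Everything here is a hypothetical stand-in: SketchTool, its parse() method (playing the role of inputSchema.safeParse()), and the exact error text are assumptions. Only the is_error: true flag and the <tool_use_error> wrapper come from the description above.

```typescript
type ValidationResult = { ok: true } | { ok: false; message: string };

interface ToolResultBlock {
  type: "tool_result";
  tool_use_id: string;
  content: string;
  is_error?: boolean;
}

interface SketchTool<I> {
  name: string;
  // Stand-in for schema validation: typed data or an error message.
  parse(raw: unknown): { success: true; data: I } | { success: false; error: string };
  // Optional context-specific check, mirroring validateInput().
  validateInput?(input: I): ValidationResult;
}

// Wrap a message so the model can recognize it as an error and self-correct.
function errorResult(toolUseId: string, message: string): ToolResultBlock {
  return {
    type: "tool_result",
    tool_use_id: toolUseId,
    content: `<tool_use_error>${message}</tool_use_error>`,
    is_error: true,
  };
}

// Stage 2 (schema) runs before stage 3 (validateInput); the first failure wins.
function validateToolInput<I>(
  tool: SketchTool<I>,
  toolUseId: string,
  raw: unknown,
): { input: I } | { error: ToolResultBlock } {
  const parsed = tool.parse(raw);
  if (!parsed.success) return { error: errorResult(toolUseId, parsed.error) };
  const check: ValidationResult = tool.validateInput?.(parsed.data) ?? { ok: true };
  if (!check.ok) return { error: errorResult(toolUseId, check.message) };
  return { input: parsed.data };
}

// Hypothetical tool: requires a string `path` and rejects relative paths.
const readSketchTool: SketchTool<{ path: string }> = {
  name: "ReadSketch",
  parse(raw) {
    if (typeof raw === "object" && raw !== null) {
      const path = (raw as { path?: unknown }).path;
      if (typeof path === "string") return { success: true, data: { path } };
    }
    return { success: false, error: "path: expected string" };
  },
  validateInput(input) {
    return input.path.startsWith("/")
      ? { ok: true }
      : { ok: false, message: "path must be absolute" };
  },
};
```

Returning the failure as a tool_result rather than throwing keeps the conversation going, which is what allows the model to retry with corrected input.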

Tool Execution Pipeline

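Tool execution is orchestrated by two systems, both in src/services/tools/: a batch orchestrator and a streaming executor. The handling of a single tool call is described separately below.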

Batch Orchestration (toolOrchestration.ts)

The runTools() generator function partitions a batch of tool calls and executes them:

  1. partitionToolCalls() groups consecutive tool calls into batches:

    • A run of consecutive concurrency-safe tools becomes one batch.
    • Each non-concurrency-safe tool becomes its own batch.
    • If isConcurrencySafe() throws (e.g., shell-quote parse failure), the tool is treated as non-concurrent (fail-closed).
  2. Concurrency-safe batches are dispatched to runToolsConcurrently(), which uses an all() utility that runs async generators in parallel with a configurable concurrency limit (CLAUDE_CODE_MAX_TOOL_USE_CONCURRENCY, default 10). Context modifiers from concurrent tools are queued and applied after all tools in the batch complete.

  3. Non-concurrent batches are dispatched to runToolsSerially(), which runs tools one at a time. Context modifiers are applied immediately after each tool completes.
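
The partitioning rule can be sketched as below. This is an illustration under assumptions: the ToolCall and Batch shapes and the callback signature are hypothetical, and only the grouping and fail-closed behavior follow the description above.

```typescript
interface ToolCall {
  id: string;
  name: string;
  input: unknown;
}

interface Batch {
  concurrent: boolean;
  calls: ToolCall[];
}

// Group consecutive concurrency-safe calls into one batch; every other call
// becomes its own serial batch. A throwing safety check counts as "not safe".
function partitionToolCallsSketch(
  calls: ToolCall[],
  isConcurrencySafe: (call: ToolCall) => boolean,
): Batch[] {
  const batches: Batch[] = [];
  for (const call of calls) {
    let safe: boolean;
    try {
      safe = isConcurrencySafe(call);
    } catch {
      safe = false; // fail-closed
    }
    const last = batches[batches.length - 1];
    if (safe && last !== undefined && last.concurrent) {
      last.calls.push(call); // extend the current concurrent run
    } else {
      batches.push({ concurrent: safe, calls: [call] });
    }
  }
  return batches;
}
```

Because only consecutive safe calls are merged, the original order of side effects is preserved: a write between two reads always splits them into separate batches.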

Streaming Executor (StreamingToolExecutor.ts)

The StreamingToolExecutor class handles tool execution as tool_use blocks stream in from the API (before the full response is available):

  • Tools are added via addTool() as their blocks stream in.
  • The executor tracks each tool's status: queued -> executing -> completed -> yielded.
  • canExecuteTool() enforces concurrency: a new tool can start only if no tools are executing, or if both the new tool and all executing tools are concurrency-safe.
  • Non-concurrent tools create a barrier: if a queued non-concurrent tool is encountered, the queue stops processing until all prior tools complete.
  • The executor supports discard() for streaming fallback scenarios where results should be abandoned.
  • A sibling abort controller fires when a Bash tool errors, killing sibling subprocesses without aborting the parent query.
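
The admission rule and the barrier can be sketched as follows. The TrackedTool shape and the nextStartable() helper are hypothetical; the real class also handles progress, discard, and abort signals, which are omitted here.

```typescript
type ToolStatus = "queued" | "executing" | "completed" | "yielded";

interface TrackedTool {
  id: string;
  status: ToolStatus;
  isConcurrencySafe: boolean;
}

// A tool may start if nothing is executing, or if it and everything
// currently executing are concurrency-safe.
function canExecuteToolSketch(tracked: TrackedTool[], candidateIsSafe: boolean): boolean {
  const executing = tracked.filter((t) => t.status === "executing");
  return executing.length === 0 || (candidateIsSafe && executing.every((t) => t.isConcurrencySafe));
}

// Walk the queue in order and return the ids that may start now.
// A blocked non-concurrent tool acts as a barrier for everything behind it.
function nextStartable(tracked: TrackedTool[]): string[] {
  const view = tracked.map((t) => ({ ...t })); // work on a copy
  const started: string[] = [];
  for (const tool of view) {
    if (tool.status !== "queued") continue;
    if (canExecuteToolSketch(view, tool.isConcurrencySafe)) {
      tool.status = "executing";
      started.push(tool.id);
    } else if (!tool.isConcurrencySafe) {
      break; // barrier: later tools wait until prior tools complete
    }
  }
  return started;
}
```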

Single Tool Execution (toolExecution.ts)

The runToolUse() generator function handles a single tool call:

  1. Tool lookup: First checks available tools, then falls back to getAllBaseTools() for deprecated alias resolution.
  2. Abort check: If the abort controller is already signaled, yields a cancel message immediately.
  3. Pre-tool hooks: runPreToolUseHooks() executes user-defined hooks before execution.
  4. Permission check: canUseTool() is called, which runs through the permission system.
  5. Execution: tool.call() is invoked with the validated input, context, and a progress callback.
  6. Post-tool hooks: runPostToolUseHooks() executes after successful completion; runPostToolUseFailureHooks() on failure.
  7. Result processing: Output is mapped via mapToolResultToToolResultBlockParam(), and large results exceeding maxResultSizeChars are persisted to disk with a preview sent to the LLM.
  8. Telemetry: Extensive event logging for tool use, errors, duration, and permission decisions.

Tool Result Storage

When a tool result exceeds its maxResultSizeChars threshold, the result is persisted to a file on disk and the LLM receives a preview plus the file path. This prevents context window bloat from large outputs.

  • FileReadTool sets maxResultSizeChars: Infinity because persisting its output would create a circular loop: the LLM would need FileReadTool again to read the persisted file. The tool already bounds its output via its own limits.
  • GrepTool uses a lower threshold (20,000 chars) since search results can be very large.
  • BashTool and PowerShellTool use 30,000 chars.
  • Most other tools default to 100,000 chars.
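
A minimal sketch of the threshold check, assuming a hypothetical persist callback in place of the real disk write. The preview length and the wording of the truncation notice are invented for illustration.

```typescript
interface StoredResult {
  content: string; // what the LLM receives
  persistedPath?: string; // set only when the full output was persisted
}

// Pass small results through; persist large ones and return a preview.
function processToolResult(
  output: string,
  maxResultSizeChars: number,
  persist: (full: string) => string, // hypothetical: stores output, returns its path
  previewChars = 200,
): StoredResult {
  // With maxResultSizeChars = Infinity this branch is always taken.
  if (output.length <= maxResultSizeChars) return { content: output };
  const path = persist(output);
  const preview = output.slice(0, previewChars);
  return {
    content: `${preview}\n[Output truncated: ${output.length} chars total. Full output saved to ${path}]`,
    persistedPath: path,
  };
}
```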

Tool Execution Concurrency

The concurrency model is central to performance. The isConcurrencySafe() method on each tool determines whether it can run in parallel:

Always concurrency-safe (return true):

  • FileReadTool, GlobTool, GrepTool -- pure reads, no side effects.
  • WebFetchTool, WebSearchTool -- external reads.
  • TaskGetTool, TaskListTool, TaskOutputTool, TaskUpdateTool -- task state queries.
  • ListMcpResourcesTool, ReadMcpResourceTool -- MCP resource reads.
  • ConfigTool, LSPTool, BriefTool, RemoteTriggerTool, AskUserQuestionTool -- stateless or external.
  • SyntheticOutputTool, ExitPlanModeV2Tool, AgentTool, TaskStopTool.

Conditionally concurrency-safe:

  • BashTool: Only when isReadOnly() returns true (delegates to checkReadOnlyConstraints()). Commands like ls, cat, grep are read-only; git push, rm, echo > file are not.
  • PowerShellTool: Same pattern as BashTool.

Never concurrency-safe (default false via buildTool):

  • FileEditTool, FileWriteTool, NotebookEditTool -- file writes must be serialized to prevent race conditions.
  • EnterPlanModeTool -- mode change.

Deferred Tool Loading (ToolSearch)

Not all tools are loaded into the system prompt upfront. The deferred loading system saves tokens by only including tool names initially, with full schemas loaded on demand.

Architecture

  1. isDeferredTool() (in src/tools/ToolSearchTool/prompt.ts) determines which tools are deferred: MCP tools and tools with shouldDefer: true.
  2. alwaysLoad overrides deferral -- a tool with alwaysLoad: true is never deferred, even when ToolSearch is active. MCP tools can set this via _meta['anthropic/alwaysLoad'].
  3. Deferred tools are sent to the API with defer_loading: true. Only their names appear in the system prompt (via formatDeferredToolLine()).
  4. When the LLM needs a deferred tool, it calls ToolSearchTool.

ToolSearchTool

The tool supports two query modes:

  • Direct selection: select:ToolName fetches a specific tool by exact name. Also supports select:Tool1,Tool2 for multiple tools.
  • Keyword search: Free-text queries are matched against tool names and descriptions using a scoring algorithm that considers:
    • Exact name match (fast path).
    • CamelCase and underscore splitting of tool names.
    • MCP tool prefix parsing (mcp__server__action).
    • searchHint fields on tools.
    • Word-boundary regex matching with pre-compiled patterns.

Results are returned as tool_reference content blocks in the tool result, which the API expands into full tool definitions in the model's context.
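
The two query modes and the name splitting can be sketched as below. This is not the real scoring algorithm: the function names are hypothetical and the score is a plain count of matching words, chosen only to show how name tokens and searchHint words feed keyword matching.

```typescript
type ParsedQuery =
  | { kind: "select"; names: string[] }
  | { kind: "keyword"; terms: string[] };

// "select:A,B" picks tools by exact name; anything else is a keyword query.
function parseToolSearchQuery(query: string): ParsedQuery {
  const trimmed = query.trim();
  if (trimmed.startsWith("select:")) {
    const names = trimmed
      .slice("select:".length)
      .split(",")
      .map((s) => s.trim())
      .filter((s) => s.length > 0);
    return { kind: "select", names };
  }
  const terms = trimmed.toLowerCase().split(/\s+/).filter((s) => s.length > 0);
  return { kind: "keyword", terms };
}

// Split CamelCase, snake_case, and mcp__server__action names into lowercase words.
function tokenizeToolName(name: string): string[] {
  return name
    .replace(/([a-z0-9])([A-Z])/g, "$1 $2") // insert a space at lower-to-upper boundaries
    .split(/[\s_]+/)
    .filter((s) => s.length > 0)
    .map((s) => s.toLowerCase());
}

// Illustrative score: number of query terms found among name and hint words.
function scoreToolSketch(name: string, searchHint: string | undefined, terms: string[]): number {
  const hintWords = (searchHint ?? "").toLowerCase().split(/\s+/);
  const words = new Set([...tokenizeToolName(name), ...hintWords]);
  return terms.filter((t) => words.has(t)).length;
}
```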

Tool Search Modes

Controlled by ENABLE_TOOL_SEARCH environment variable:

| Value | Mode | Behavior |
| --- | --- | --- |
| (unset) | tst | Always defer MCP and shouldDefer tools |
| true | tst | Always defer |
| auto | tst-auto | Defer only when tool descriptions exceed threshold |
| auto:N | tst-auto | Same, with N% of context window as threshold |
| auto:0 | tst | Always defer (0% threshold = always exceeded) |
| auto:100 | standard | Never defer (100% threshold = never exceeded) |
| false | standard | All tools exposed inline |

The auto threshold defaults to 10% of the context window. Both exact token counts (via API) and character-based heuristics (2.5 chars/token fallback) are used to measure deferred tool size.
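
The mode table and the fallback heuristic can be expressed as a small sketch. The function names are hypothetical, and two behaviors are assumptions not stated above: an empty or unrecognized value is treated like unset, and the threshold comparison is strict.

```typescript
type ToolSearchMode = "tst" | "tst-auto" | "standard";

interface ToolSearchSetting {
  mode: ToolSearchMode;
  thresholdPercent?: number; // only meaningful in tst-auto mode
}

const DEFAULT_AUTO_THRESHOLD_PERCENT = 10;

// Map an ENABLE_TOOL_SEARCH value to a mode, following the table above.
function parseToolSearchSetting(value: string | undefined): ToolSearchSetting {
  if (value === undefined || value === "" || value === "true") return { mode: "tst" };
  if (value === "false") return { mode: "standard" };
  if (value === "auto") {
    return { mode: "tst-auto", thresholdPercent: DEFAULT_AUTO_THRESHOLD_PERCENT };
  }
  const match = /^auto:(\d+)$/.exec(value);
  if (match) {
    const n = Number(match[1]);
    if (n <= 0) return { mode: "tst" }; // 0% is always exceeded
    if (n >= 100) return { mode: "standard" }; // 100% is never exceeded
    return { mode: "tst-auto", thresholdPercent: n };
  }
  return { mode: "tst" }; // assumption: unrecognized values behave like unset
}

// Character-based fallback: estimate tokens at 2.5 chars/token and compare
// against the given percentage of the context window.
function exceedsAutoThreshold(
  deferredToolChars: number,
  contextWindowTokens: number,
  thresholdPercent: number,
): boolean {
  const estimatedTokens = deferredToolChars / 2.5;
  return estimatedTokens > contextWindowTokens * (thresholdPercent / 100);
}
```

For example, with a 200,000-token window and the default 10% threshold, 60,000 characters of deferred tool descriptions estimate to 24,000 tokens, which exceeds the 20,000-token threshold.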

Model Compatibility

Tool search requires model support for tool_reference blocks. Models matching patterns in the unsupported list (default: haiku) have tool search disabled. The list is configurable via GrowthBook feature flag tengu_tool_search_unsupported_models.

Discovered Tools Across Compaction

When conversation compaction occurs, tool_reference blocks in older messages may be lost. The system handles this via:

  • extractDiscoveredToolNames() scans message history for all previously discovered tool names.
  • Compact boundary markers carry preCompactDiscoveredTools metadata.
  • Deferred tools delta attachments (getDeferredToolsDelta()) track which tools have been announced versus newly available.

BashTool Deep Dive

BashTool (src/tools/BashTool/BashTool.tsx) is the most complex tool due to security implications and the breadth of commands it can execute.

Permission System

BashTool has a multi-layered permission system defined in bashPermissions.ts:

  1. Read-only validation (readOnlyValidation.ts): Analyzes commands to determine if they are purely read-only. Used for plan mode restrictions and concurrency safety.

  2. Permission rule matching: The preparePermissionMatcher() method parses commands via the security AST (parseForSecurity) and produces a closure that matches against permission rule patterns. Compound commands (ls && git push) fire hooks if ANY subcommand matches. Leading VAR=val prefixes are stripped so FOO=bar git push matches Bash(git *).

  3. Bash classifier (bashClassifier.ts): An ML-based classifier that categorizes commands as safe/unsafe/ask. Controlled by isClassifierPermissionsEnabled().
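
The rule-matching behavior in step 2 can be illustrated with a deliberately naive sketch. It splits on shell operators with a regular expression and ignores quoting, so it is not a substitute for the security AST. The helper names and the prefix-match reading of a trailing " *" are assumptions.

```typescript
// Drop leading VAR=val assignments from a simple command.
function stripEnvPrefix(command: string): string {
  const tokens = command.trim().split(/\s+/);
  let i = 0;
  while (i < tokens.length && /^[A-Za-z_][A-Za-z0-9_]*=/.test(tokens[i] ?? "")) i++;
  return tokens.slice(i).join(" ");
}

// Naive split on &&, ||, ; and | (ignores quotes and subshells).
function splitSubcommands(command: string): string[] {
  return command
    .split(/&&|\|\||;|\|/)
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
}

// Assumed semantics: "git *" matches "git" followed by any arguments.
function matchesRulePattern(pattern: string, command: string): boolean {
  if (pattern.endsWith(" *")) {
    const prefix = pattern.slice(0, -2);
    return command === prefix || command.startsWith(prefix + " ");
  }
  return command === pattern;
}

// A compound command matches if ANY of its subcommands matches.
function anySubcommandMatches(pattern: string, command: string): boolean {
  return splitSubcommands(command).some((sub) => matchesRulePattern(pattern, stripEnvPrefix(sub)));
}
```

With these helpers, anySubcommandMatches("git *", "ls && FOO=bar git push") is true, mirroring the Bash(git *) example above.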

Security Analysis (bashSecurity.ts)

Deep security analysis checks for dangerous patterns:

  • Command substitution and expansion: $(), ${}, process substitution <() / >(), Zsh-specific expansions (=cmd, ~[, (e:, (+).
  • Zsh dangerous commands: zmodload (gateway to dangerous modules like zsh/mapfile, zsh/system, zsh/zpty, zsh/net/tcp), emulate -c (eval equivalent), module builtins (sysopen, sysread, syswrite).
  • Heredoc analysis: Extracts and validates heredocs, including detection of heredocs inside command substitutions.
  • Shell quote parsing: Validates against malformed tokens and known shell-quote library bugs.

Command Semantics (commandSemantics.ts)

Classifies bash commands for UI display:

  • Search commands: find, grep, rg, ag, ack, locate, which, whereis.
  • Read commands: cat, head, tail, less, more, wc, stat, file, strings, jq, awk, cut, sort, uniq, tr.
  • List commands: ls, tree, du.
  • Semantic-neutral commands: echo, printf, true, false, : -- these do not change the read/search nature of a compound pipeline.
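
A sketch of how these categories could combine over a compound command. The command lists come from above; the function name, the naive operator split, and the choice to return null for unknown or all-neutral commands are assumptions.

```typescript
type Semantic = "search" | "read" | "list";

const SEARCH_COMMANDS = new Set(["find", "grep", "rg", "ag", "ack", "locate", "which", "whereis"]);
const READ_COMMANDS = new Set([
  "cat", "head", "tail", "less", "more", "wc", "stat", "file",
  "strings", "jq", "awk", "cut", "sort", "uniq", "tr",
]);
const LIST_COMMANDS = new Set(["ls", "tree", "du"]);
const NEUTRAL_COMMANDS = new Set(["echo", "printf", "true", "false", ":"]);

// Return the set of semantics found, or null if any part is something else.
function classifyPipelineSketch(command: string): Set<Semantic> | null {
  const found = new Set<Semantic>();
  const parts = command
    .split(/&&|\|\||;|\|/) // naive: ignores quoting
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
  for (const part of parts) {
    const name = part.split(/\s+/)[0] ?? "";
    if (NEUTRAL_COMMANDS.has(name)) continue; // does not affect the result
    if (SEARCH_COMMANDS.has(name)) found.add("search");
    else if (READ_COMMANDS.has(name)) found.add("read");
    else if (LIST_COMMANDS.has(name)) found.add("list");
    else return null; // not a pure search/read/list command
  }
  return found.size > 0 ? found : null;
}
```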

Execution Model

  • Runs in a persistent shell session (working directory preserved between calls via shell state).
  • Configurable timeout: default 120 seconds (getDefaultTimeoutMs()), max 600 seconds (getMaxTimeoutMs()).
  • Background execution via run_in_background parameter -- spawns a shell task and returns immediately.
  • Output is captured with size limits; large outputs are truncated by an end-truncating accumulator.
  • Proactive/assistant mode has a blocking budget (ASSISTANT_BLOCKING_BUDGET_MS = 15_000, i.e. 15 seconds) after which commands auto-background.
  • Progress display kicks in after PROGRESS_THRESHOLD_MS = 2000 (2 seconds).

Sandbox Support

shouldUseSandbox() determines whether to run commands in a sandboxed environment. The SandboxManager provides isolation when enabled.

Special Behaviors

  • Sed edit detection: parseSedEditCommand() parses sed commands that modify files, enabling file history tracking for undo.
  • Git operation tracking: trackGitOperations() detects and tracks git commits, attributing them to Claude Code.
  • Image output handling: Detects and resizes image outputs from shell commands.
  • CWD reset: resetCwdIfOutsideProject() resets the working directory if a command navigates outside the project.
  • File change notifications: notifyVscodeFileUpdated() notifies VS Code when files change.

The buildTool() Pattern

All tools are constructed via buildTool(), which spreads safe defaults over the provided definition:

const TOOL_DEFAULTS = {
  isEnabled: () => true,
  isConcurrencySafe: (_input) => false,   // fail-closed
  isReadOnly: (_input) => false,           // assume writes
  isDestructive: (_input) => false,
  checkPermissions: (input) => Promise.resolve({ behavior: 'allow', updatedInput: input }),
  toAutoClassifierInput: (_input) => '',   // skip classifier
  userFacingName: (_input) => '',
}

This pattern ensures all tools have consistent defaults. The BuiltTool<D> type uses conditional type mapping to merge the definition's explicit types with defaults, preserving full type safety across all 60+ tools.
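
A simplified sketch of the spread-over-defaults idea. The types here are hypothetical and far smaller than the real Tool and BuiltTool<D> types, and the sketch uses an async arrow where the excerpt above uses Promise.resolve.

```typescript
interface PermissionResult<I> {
  behavior: "allow" | "deny" | "ask";
  updatedInput?: I;
}

// Methods that buildTool fills in when the definition omits them.
interface ToolDefaults<I> {
  isEnabled(): boolean;
  isConcurrencySafe(input: I): boolean;
  isReadOnly(input: I): boolean;
  isDestructive(input: I): boolean;
  checkPermissions(input: I): Promise<PermissionResult<I>>;
  toAutoClassifierInput(input: I): string;
  userFacingName(input: I): string;
}

// Members every definition must supply (reduced to two for the sketch).
interface ToolRequired<I, O> {
  name: string;
  call(input: I): Promise<O>;
}

type ToolDefinition<I, O> = ToolRequired<I, O> & Partial<ToolDefaults<I>>;
type BuiltToolSketch<I, O> = ToolRequired<I, O> & ToolDefaults<I>;

function buildToolSketch<I, O>(def: ToolDefinition<I, O>): BuiltToolSketch<I, O> {
  const defaults: ToolDefaults<I> = {
    isEnabled: () => true,
    isConcurrencySafe: () => false, // fail-closed
    isReadOnly: () => false, // assume writes
    isDestructive: () => false,
    checkPermissions: async (input) => ({ behavior: "allow", updatedInput: input }),
    toAutoClassifierInput: () => "", // skip classifier
    userFacingName: () => "",
  };
  // Later spread wins: explicit members in `def` override the defaults.
  return { ...defaults, ...def };
}

// Hypothetical read-only tool that overrides a single default.
const echoToolSketch = buildToolSketch<{ text: string }, string>({
  name: "EchoSketch",
  call: async (input) => input.text,
  isReadOnly: () => true,
});
```

The definition only states what differs from the defaults; here isReadOnly is overridden while isConcurrencySafe keeps its fail-closed value.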

Circular Dependency Management

Several tools require lazy loading to break circular dependencies:

  • TeamCreateTool and TeamDeleteTool: Loaded via getter functions (getTeamCreateTool()) because the tool registry imports from tool files which import back into the registry.
  • SendMessageTool: Same pattern with getSendMessageTool().
  • coordinatorModeModule: Conditionally required behind feature('COORDINATOR_MODE').
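
The getter approach can be sketched with a small memoizing helper. The lazy() helper and the placeholder tool object are hypothetical; the real getters wrap a deferred require() of the tool module, which is replaced here by an inline factory so the sketch stays self-contained.

```typescript
// Defer construction until the first call, then reuse the cached value.
function lazy<T>(factory: () => T): () => T {
  let cached: { value: T } | undefined;
  return () => {
    if (cached === undefined) cached = { value: factory() };
    return cached.value;
  };
}

// Counts how often the factory runs, for demonstration only.
const buildCounter = { built: 0 };

// Stand-in for a getter such as getTeamCreateTool(): nothing is constructed
// at module load time, so the registry can reference it without a cycle.
const getTeamCreateToolSketch = lazy(() => {
  buildCounter.built += 1;
  return { name: "TeamCreateSketch" };
});
```

Because the factory runs on first use rather than at import time, both modules finish loading before either needs the other's exports.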

Design Patterns

  • Strategy Pattern: Different checkPermissions() implementations define different permission logic per tool. Each tool's state queries (isReadOnly, isConcurrencySafe, isDestructive) add further dimensions to strategy selection.
  • Factory Pattern: buildTool() constructs tool instances with consistent defaults. getAllBaseTools() acts as a registry factory filtered by environment.
  • Observer Pattern: ToolCallProgress<P> callbacks enable reactive progress reporting. The progressAvailableResolve signal in StreamingToolExecutor wakes up result consumers when progress is available.
  • Chain of Responsibility: Input validation -> permission check -> pre-tool hooks -> execution -> post-tool hooks -> result processing.
  • Partition/Batch Pattern: partitionToolCalls() groups tool calls into concurrent and serial batches, maximizing throughput while preventing race conditions.
  • Lazy Initialization: lazySchema() defers Zod schema construction. Lazy require() calls break circular dependencies. memoize() caches tool descriptions for ToolSearch.
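  • Fail-Closed Defaults: buildTool() defaults to non-concurrent and non-read-only, the most restrictive assumptions for scheduling. The isDestructive default (false) and the checkPermissions default (allow) are permissive, so tools that need stricter behavior override them.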