Fix toolTokens over-counting when Anthropic Tool Search is enabled #4834

Merged
bhavyaus merged 1 commit into main from dev/bhavyau/tst-tool-token-fix on Mar 31, 2026

@bhavyaus bhavyaus commented Mar 30, 2026

Problem

When TST (Tool Search Tool) is enabled, most tools are sent to the Anthropic API with defer_loading: true and don't count against the context window until the model loads them via tool_search. However, the toolTokens calculation in agentIntent.ts counted ALL available tools (~40+ tools, ~40K tokens), over-estimating by ~25-30K tokens. This artificially reduced the message budget and inflated the context ratio, causing premature conversation compaction.
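
To make the effect on the budget concrete, here is a minimal sketch of the arithmetic. The formulas, the `BudgetInput` type, and every number are assumptions chosen for illustration (a 200K-token window, 120K tokens of messages, 40K vs 12K tool tokens); the actual computation in agentIntent.ts may differ.

```typescript
// Illustrative only: formulas, names, and numbers are assumptions,
// not the code in agentIntent.ts.
interface BudgetInput {
	modelMaxPromptTokens: number; // assumed size of the prompt window
	toolTokens: number;           // tokens reserved for tool schemas
	messageTokens: number;        // tokens currently used by messages
}

// Tokens left for conversation messages after reserving space for tools.
function messageBudget(input: BudgetInput): number {
	return Math.max(0, input.modelMaxPromptTokens - input.toolTokens);
}

// Fraction of the window considered "in use"; a higher ratio means
// compaction is triggered sooner.
function contextRatio(input: BudgetInput): number {
	return (input.messageTokens + input.toolTokens) / input.modelMaxPromptTokens;
}

const countingAllTools: BudgetInput = {
	modelMaxPromptTokens: 200000,
	toolTokens: 40000,
	messageTokens: 120000,
};
const countingNonDeferred: BudgetInput = { ...countingAllTools, toolTokens: 12000 };

// How much message budget is lost by counting every tool.
const budgetLost = messageBudget(countingNonDeferred) - messageBudget(countingAllTools);
console.log(budgetLost);
console.log(contextRatio(countingAllTools).toFixed(2));
console.log(contextRatio(countingNonDeferred).toFixed(2));
```

With these assumed inputs, counting every tool costs 28K tokens of message budget and moves the ratio from 0.66 to 0.80, which is the kind of shift the description blames for premature compaction.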

Fix

Filter availableTools to only non-deferred tools (via IToolDeferralService.isNonDeferredTool()) before calling countToolTokens() when TST is enabled. Non-deferred tools (~18 tools, ~10-15K tokens) are the ones that are always immediately available and actually charged against the context window by the API.
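
The filtering step can be sketched roughly as follows. This is not the code from the PR: `ToolInfo`, `ToolDeferralServiceLike`, `selectToolsForTokenCounting`, and `estimateToolTokens` are hypothetical stand-ins, the signature of `isNonDeferredTool()` is assumed, and the four-characters-per-token estimate merely replaces the real `countToolTokens()`.

```typescript
// Hypothetical shapes; the real tool type and service interface may differ.
interface ToolInfo {
	name: string;
	description: string;
}

// Stand-in for IToolDeferralService. Only the method name comes from the PR;
// taking a tool name and returning a boolean is an assumption.
interface ToolDeferralServiceLike {
	isNonDeferredTool(toolName: string): boolean;
}

// When Tool Search is on, keep only tools that are always sent fully loaded.
function selectToolsForTokenCounting(
	availableTools: readonly ToolInfo[],
	toolSearchEnabled: boolean,
	deferral: ToolDeferralServiceLike,
): readonly ToolInfo[] {
	if (!toolSearchEnabled) {
		return availableTools;
	}
	return availableTools.filter(tool => deferral.isNonDeferredTool(tool.name));
}

// Crude placeholder for countToolTokens(): roughly 4 characters per token.
function estimateToolTokens(tools: readonly ToolInfo[]): number {
	let chars = 0;
	for (const tool of tools) {
		chars += JSON.stringify(tool).length;
	}
	return Math.ceil(chars / 4);
}

// Usage with made-up tool names.
const nonDeferredNames = new Set(["read_file", "run_in_terminal"]);
const deferralStub: ToolDeferralServiceLike = {
	isNonDeferredTool: name => nonDeferredNames.has(name),
};
const sampleTools: ToolInfo[] = [
	{ name: "read_file", description: "Read a file" },
	{ name: "run_in_terminal", description: "Run a command" },
	{ name: "rare_tool_a", description: "Rarely used tool" },
	{ name: "rare_tool_b", description: "Another rarely used tool" },
];
const counted = selectToolsForTokenCounting(sampleTools, true, deferralStub);
console.log(counted.length, estimateToolTokens(counted));
```

Note that a later commit referencing this PR (Apr 6, 2026) states that deferred tools still count against the API context window and returns to counting all tools, so this filter should be read as a record of this PR's approach rather than the final behavior.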

Changes

  • agentIntent.ts: Import IToolDeferralService and isAnthropicToolSearchEnabled, inject the service, filter tools before token counting, enhance debug log to show total vs non-deferred tool counts
  • askAgentIntent.ts, editCodeIntent2.ts, notebookEditorIntent.ts: Thread IToolDeferralService through subclass constructors and super() calls
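
Threading a new dependency through subclass constructors follows a common pattern, sketched below with plain constructor injection. The class names, the `label` parameter, and `ToolDeferralServiceLike` are hypothetical; the real invocation classes take more parameters and are wired through the extension's own dependency injection, which this sketch does not attempt to reproduce.

```typescript
// Stand-in for IToolDeferralService; the method signature is assumed.
interface ToolDeferralServiceLike {
	isNonDeferredTool(toolName: string): boolean;
}

// Hypothetical base class that now requires the deferral service.
class BaseInvocationSketch {
	constructor(protected readonly toolDeferralService: ToolDeferralServiceLike) {}

	// Names of the tools that would be included in token counting.
	countableToolNames(names: readonly string[]): string[] {
		return names.filter(name => this.toolDeferralService.isNonDeferredTool(name));
	}
}

// Hypothetical subclass: it does not use the service itself, but must accept
// it and forward it to super() to satisfy the base-class constructor.
class AskInvocationSketch extends BaseInvocationSketch {
	constructor(
		private readonly label: string,
		toolDeferralService: ToolDeferralServiceLike,
	) {
		super(toolDeferralService);
	}

	describe(): string {
		return `${this.label} invocation`;
	}
}

const serviceStub: ToolDeferralServiceLike = {
	isNonDeferredTool: name => name === "read_file",
};
const invocation = new AskInvocationSketch("ask", serviceStub);
console.log(invocation.describe(), invocation.countableToolNames(["read_file", "rare_tool"]));
```

The cost of this design is that every subclass signature changes whenever the base class gains a dependency, which is why three otherwise untouched files appear in the change list.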

When TST (Tool Search Tool) is enabled, most tools are sent with
defer_loading: true and don't count against the context window until the
model loads them via tool_search. However, the toolTokens calculation in
agentIntent counted ALL available tools, over-estimating by ~25-30K
tokens and causing premature compaction.

Fix: filter availableTools to only non-deferred tools (via
IToolDeferralService) before calling countToolTokens() when TST is
enabled. This gives an accurate budget that reflects what the API
actually charges against the context window.
@bhavyaus bhavyaus force-pushed the dev/bhavyau/tst-tool-token-fix branch from 3f487b2 to bd00295 Compare March 30, 2026 23:02
@bhavyaus bhavyaus marked this pull request as ready for review March 30, 2026 23:31
Copilot AI review requested due to automatic review settings March 30, 2026 23:31
@bhavyaus bhavyaus enabled auto-merge March 30, 2026 23:33

Copilot AI left a comment

Pull request overview

Adjusts agent prompt budgeting to avoid over-counting tool schema tokens when Anthropic Tool Search defers most tools, preventing premature compaction and overly constrained message budgets.

Changes:

  • Inject IToolDeferralService into AgentIntentInvocation and filter out deferred tools before calling countToolTokens() when Anthropic Tool Search is enabled.
  • Thread IToolDeferralService through intent invocation subclass constructors (AskAgent, EditCode2, NotebookEditor) to satisfy the new base-class dependency.
  • Expand debug logging to include total vs non-deferred tool counts when Tool Search is enabled.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| src/extension/intents/node/agentIntent.ts | Filters tools used for token counting under Anthropic Tool Search and logs total vs non-deferred tool counts. |
| src/extension/intents/node/askAgentIntent.ts | Threads IToolDeferralService through to the base invocation. |
| src/extension/intents/node/editCodeIntent2.ts | Threads IToolDeferralService through to the base invocation. |
| src/extension/intents/node/notebookEditorIntent.ts | Threads IToolDeferralService through to the base invocation. |

Comment thread src/extension/intents/node/agentIntent.ts
@bhavyaus bhavyaus added this pull request to the merge queue Mar 31, 2026
Merged via the queue into main with commit 1f763d3 Mar 31, 2026
24 checks passed
@bhavyaus bhavyaus deleted the dev/bhavyau/tst-tool-token-fix branch March 31, 2026 02:43
bhavyaus added a commit that referenced this pull request Apr 6, 2026
…ools

Deferred tools (defer_loading: true) still count against the API context
window. The 3/30 change (#4834) excluded them from toolTokens, causing
the message budget to be ~31K tokens too generous and leading to
context_length_exceeded errors followed by summarization failures
("No messages provided").

- Count all tools in agentIntent budget calculation
- Reserve tool token budget in summarization prompt rendering
- Add modelMaxPromptTokens to summarization telemetry
- Add priority to summarization UserMessage

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
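
The failure mode described in this follow-up commit can be illustrated with a small overflow check. Everything here is an assumption made for illustration: the `RequestEstimate` type, the 200K window, and the 43K vs 12K tool-token split (picked so the gap matches the ~31K figure in the commit message). It takes the commit's claim that deferred tools still count against the window as given.

```typescript
// Illustrative only: names and numbers are assumptions.
interface RequestEstimate {
	modelMaxPromptTokens: number; // assumed window size
	messageTokens: number;        // tokens of conversation messages sent
	allToolTokens: number;        // tokens for ALL tools, deferred or not
}

// Tokens by which a request exceeds the window, assuming every tool
// schema is charged against it. Zero means the request fits.
function overflowTokens(request: RequestEstimate): number {
	const charged = request.messageTokens + request.allToolTokens;
	return Math.max(0, charged - request.modelMaxPromptTokens);
}

const windowSize = 200000;
const allToolTokens = 43000;
const nonDeferredToolTokens = 12000;

// Budget when only non-deferred tools are subtracted (the 3/30 behavior)...
const generousBudget = windowSize - nonDeferredToolTokens;
// ...versus subtracting every tool (the behavior this commit restores).
const fullBudget = windowSize - allToolTokens;

// Filling each budget completely and checking against the window.
console.log(overflowTokens({ modelMaxPromptTokens: windowSize, messageTokens: generousBudget, allToolTokens }));
console.log(overflowTokens({ modelMaxPromptTokens: windowSize, messageTokens: fullBudget, allToolTokens }));
```

Under these assumptions a conversation that fills the more generous budget exceeds the window by 31K tokens, while one that fills the full-count budget fits exactly.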
github-merge-queue Bot pushed a commit that referenced this pull request Apr 6, 2026
…ools (#4992)

Deferred tools (defer_loading: true) still count against the API context
window. The 3/30 change (#4834) excluded them from toolTokens, causing
the message budget to be ~31K tokens too generous and leading to
context_length_exceeded errors followed by summarization failures
("No messages provided").