Fix: count all tool tokens in budget including deferred tools by bhavyaus · Pull Request #4990 · microsoft/vscode-copilot-chat

bhavyaus · 2026-04-06T04:17:44Z

No description provided.

…ools

Deferred tools (defer_loading: true) still count against the API context window. The 3/30 change (#4834) excluded them from toolTokens, causing the message budget to be ~31K tokens too generous and leading to context_length_exceeded errors followed by summarization failures ("No messages provided").

- Count all tools in agentIntent budget calculation
- Reserve tool token budget in summarization prompt rendering
- Add modelMaxPromptTokens to summarization telemetry
- Add priority to summarization UserMessage

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Copilot

Pull request overview

This PR updates agent prompt budgeting to count all tool schema tokens (including deferred tools) against the model context window, and adjusts conversation summarization so its prompt rendering reserves token budget for tools. It also extends summarization telemetry and adds a unit test capturing a zero-messages rendering edge case.

Changes:

Revert tool token counting in AgentIntentInvocation to include deferred tools when computing the message budget.
Reserve tool token budget when rendering the summarization prompt in Full mode, and add modelMaxPromptTokens to summarization telemetry.
Add a unit test reproducing the “No messages provided” failure mode via an empty rendered prompt.
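The budgeting change can be illustrated with a minimal sketch. This is not the extension's code: `ToolSchema`, `schemaTokens`, `countToolTokens`, and `messageBudget` are hypothetical names, and `schemaTokens` stands in for whatever a real tokenizer would report. It only shows that filtering out tools flagged `defer_loading: true` overstates the room left for messages when those tools are still sent to the API.

```typescript
// Hypothetical shape for illustration; not the vscode-copilot-chat types.
interface ToolSchema {
	name: string;
	schemaTokens: number; // assumed pre-computed token cost of the tool schema
	defer_loading?: boolean;
}

// Sum the token cost of every tool schema that is passed in.
function countToolTokens(tools: readonly ToolSchema[]): number {
	return tools.reduce((sum, tool) => sum + tool.schemaTokens, 0);
}

// Message budget = context window minus the tool schemas sent with the request.
function messageBudget(modelMaxPromptTokens: number, tools: readonly ToolSchema[]): number {
	return Math.max(0, modelMaxPromptTokens - countToolTokens(tools));
}

const tools: ToolSchema[] = [
	{ name: 'read_file', schemaTokens: 100 },
	{ name: 'big_tool', schemaTokens: 300, defer_loading: true },
];

// Counting every tool, deferred or not.
const budgetAllTools = messageBudget(1000, tools);
// Skipping deferred tools yields a larger, overly generous budget.
const budgetNonDeferred = messageBudget(1000, tools.filter(tool => !tool.defer_loading));
console.log(budgetAllTools, budgetNonDeferred);
```

The gap between the two values corresponds to the over-allocation the commit message describes (about 31K tokens in the reported case).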

Show a summary per file

File	Description
src/extension/prompts/node/agent/test/summarization.spec.tsx	Adds a repro test where summarization prompt rendering produces zero messages under an extremely small token budget.
src/extension/prompts/node/agent/summarizedConversationHistory.tsx	Reserves message budget for tools in Full summarization mode; tweaks message priority; adds `modelMaxPromptTokens` to telemetry.
src/extension/intents/node/agentIntent.ts	Counts tool tokens across all available tools (no deferral filtering) and removes tool-deferral plumbing from the invocation.

Copilot's findings

Comments suppressed due to low confidence (1)

src/extension/prompts/node/agent/summarizedConversationHistory.tsx:689

After rendering the summarization prompt, summarizationPrompt can legitimately be empty (0 messages) when the token budget is too small (see the new repro test). The current code proceeds to makeChatRequest2 with messages=[], which will fail validation (“No messages provided”) and may prevent a clean fallback path while also producing noisy telemetry. Add an explicit guard after render (e.g. if summarizationPrompt.length===0, throw a BudgetExceededError or a dedicated error) to skip the request and force the intended fallback/handling.

		let summarizationPrompt: ChatMessage[];
		const associatedRequestId = this.props.promptContext.conversation?.getLatestTurn().id;
		try {
			summarizationPrompt = (await renderPromptElement(this.instantiationService, endpoint, ConversationHistorySummarizationPrompt, { ...propsInfo.props, simpleMode: mode === SummaryMode.Simple }, undefined, this.token)).messages;
			this.logInfo(`summarization prompt rendered in ${stopwatch.elapsed()}ms.`, mode);
		} catch (e) {
			const budgetExceeded = e instanceof BudgetExceededError;
			const outcome = budgetExceeded ? 'budget_exceeded' : 'renderError';
			this.logInfo(`Error rendering summarization prompt in mode: ${mode}. ${e.stack}`, mode);
			this.sendSummarizationTelemetry(outcome, '', this.props.endpoint.model, mode, stopwatch.elapsed(), undefined);
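The guard described in this finding might look like the following sketch. `EmptyPromptError`, `assertNonEmptyPrompt`, and the simplified `ChatMessage` shape are hypothetical; the finding itself leaves open whether to throw `BudgetExceededError` or a dedicated error, and the right choice depends on how the existing fallback path treats each.

```typescript
// Simplified stand-in for the real ChatMessage type (assumption).
interface ChatMessage {
	role: 'system' | 'user' | 'assistant';
	content: string;
}

// Hypothetical dedicated error for a prompt that rendered to zero messages.
class EmptyPromptError extends Error {
	constructor(mode: string) {
		super(`Summarization prompt rendered zero messages in mode: ${mode}`);
		this.name = 'EmptyPromptError';
	}
}

// Call right after rendering, before any request is made, so an empty
// prompt is routed to the fallback instead of failing API validation.
function assertNonEmptyPrompt(messages: readonly ChatMessage[], mode: string): void {
	if (messages.length === 0) {
		throw new EmptyPromptError(mode);
	}
}

try {
	assertNonEmptyPrompt([], 'Full');
} catch (e) {
	console.log((e as Error).message);
}
```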

Files reviewed: 3/3 changed files
Comments generated: 2

Copilot · 2026-04-06T04:27:51Z

@@ -379,7 +378,6 @@ export class AgentIntentInvocation extends EditCodeIntentInvocation implements I
 		@IExperimentationService private readonly expService: IExperimentationService,
 		@IAutomodeService private readonly automodeService: IAutomodeService,
 		@IOTelService override readonly otelService: IOTelService,


AgentIntentInvocation’s constructor no longer accepts IToolDeferralService, but several subclasses (e.g. AskAgentIntentInvocation/EditCode2IntentInvocation/NotebookEditorIntentInvocation) still inject toolDeferralService and pass it to super(...). This will fail TypeScript compilation due to an argument count mismatch. Update those subclasses to remove the extra DI parameter + super argument (and remove now-unused imports) to keep constructors consistent.

Suggested change

-		@IOTelService override readonly otelService: IOTelService,
+		@IOTelService override readonly otelService: IOTelService,
+		_toolDeferralService?: unknown,
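To show why the suggested optional parameter keeps subclasses compiling, here is a minimal sketch with hypothetical stand-in classes (`BaseInvocation`, `AskInvocation`) and plain constructor arguments instead of the decorator-based dependency injection the real code uses. It is a stopgap illustration; the comment's prose recommends removing the extra argument from the subclasses instead.

```typescript
// Hypothetical stand-ins; the real classes receive services via DI decorators.
class BaseInvocation {
	constructor(
		readonly otelService: string,
		// Accepted and ignored so that existing subclasses still compile.
		_toolDeferralService?: unknown,
	) {}
}

class AskInvocation extends BaseInvocation {
	constructor(otelService: string, toolDeferralService: object) {
		// Without the optional parameter on the base class, this call is a
		// compile error (TS2554: Expected 1 arguments, but got 2).
		super(otelService, toolDeferralService);
	}
}

const invocation = new AskInvocation('otel', {});
console.log(invocation.otelService);
```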

Copilot · 2026-04-06T04:27:51Z

+		// In Full mode, tools are sent alongside the summarization prompt with
+		// tool_choice: 'none'. Reserve budget for them so the rendered messages
+		// plus tools don't exceed the model's context window.
+		const tools = this.props.tools;
+		const toolTokens = mode === SummaryMode.Full && tools?.length
+			? await this.props.endpoint.acquireTokenizer().countToolTokens(tools)
+			: 0;
+		const endpoint = toolTokens > 0
+			? this.props.endpoint.cloneWithTokenOverride(
+				Math.max(1, Math.floor((this.props.endpoint.modelMaxPromptTokens - toolTokens) * 0.9)))
+			: this.props.endpoint;


modelMaxPromptTokens telemetry is documented as “the … budget used for the summarization prompt rendering”, but the value sent is this.props.endpoint.modelMaxPromptTokens (the pre-reservation budget). Since getSummary may clone the endpoint with a reduced token budget after reserving tool tokens, telemetry will be misleading. Consider reporting the effective budget actually used for rendering (e.g. the cloned endpoint’s modelMaxPromptTokens / computed message budget), or report both original and effective budgets.
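One way to report both values is sketched below. The arithmetic mirrors the diff above; `effectiveSummarizationBudget`, `BudgetTelemetry`, `effectivePromptTokens`, and `budgetTelemetry` are hypothetical names, not existing telemetry fields.

```typescript
// Same reservation arithmetic as the diff: subtract tool tokens, keep a
// 10% safety margin, and never go below 1 token.
function effectiveSummarizationBudget(modelMaxPromptTokens: number, toolTokens: number): number {
	if (toolTokens <= 0) {
		return modelMaxPromptTokens; // no reservation; endpoint is used as-is
	}
	return Math.max(1, Math.floor((modelMaxPromptTokens - toolTokens) * 0.9));
}

// Hypothetical telemetry payload carrying both budgets.
interface BudgetTelemetry {
	modelMaxPromptTokens: number;  // original endpoint budget
	effectivePromptTokens: number; // budget actually used for rendering
}

function budgetTelemetry(modelMaxPromptTokens: number, toolTokens: number): BudgetTelemetry {
	return {
		modelMaxPromptTokens,
		effectivePromptTokens: effectiveSummarizationBudget(modelMaxPromptTokens, toolTokens),
	};
}

console.log(budgetTelemetry(1000, 100));
```

Reporting both fields makes it possible to tell from telemetry alone whether a failure happened under the full budget or the reduced one.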

Copilot AI review requested due to automatic review settings April 6, 2026 04:17

Copilot started reviewing on behalf of bhavyaus April 6, 2026 04:19

vs-code-engineering Bot assigned bhavyaus Apr 6, 2026

bhavyaus closed this Apr 6, 2026

Copilot AI reviewed Apr 6, 2026

bhavyaus wants to merge 1 commit into main from dev/bhavyau/fix-summarization-empty-prompt

2 participants
