Update to Responses API compaction and telemetry #4931
Pull request overview
This PR refines how the OpenAI Responses API “context-management compaction” feature is enabled and observed, ensuring compaction-specific behavior only runs when compaction is actually enabled and adding outcome telemetry plus regression coverage.
Changes:
- Centralizes computation of the compaction threshold and uses it to gate request `context_management` and WebSocket stateful-marker slicing.
- Propagates the compaction threshold into Responses API stream processors (HTTP, WebSocket, and pass-through) and emits `responsesApi.compactionOutcome` telemetry only when compaction is enabled.
- Adds unit tests covering enabled/disabled compaction flows and telemetry emission behavior.
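
The gating described in the first two bullets can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `computeCompactionThreshold`, `buildRequestBody`, `getThresholdFromBody`, the option names, and the exact shape of `context_management` are all assumptions made for the example.

```typescript
// Hypothetical sketch: names and shapes below are illustrative assumptions,
// not the actual implementation in responsesApi.ts.
interface CompactionOptions {
    enabled: boolean;
    modelMaxPromptTokens: number;
    thresholdRatio: number; // assumed: fraction of the prompt budget
}

interface SketchRequestBody {
    model: string;
    // Assumed shape; present only when compaction is enabled.
    context_management?: { type: 'compaction'; compact_threshold: number }[];
}

// Compute the threshold once; `undefined` means compaction is disabled.
function computeCompactionThreshold(opts: CompactionOptions): number | undefined {
    if (!opts.enabled) {
        return undefined;
    }
    return Math.floor(opts.modelMaxPromptTokens * opts.thresholdRatio);
}

// The same threshold value decides whether `context_management` is sent at all.
function buildRequestBody(model: string, opts: CompactionOptions): SketchRequestBody {
    const threshold = computeCompactionThreshold(opts);
    const body: SketchRequestBody = { model };
    if (threshold !== undefined) {
        body.context_management = [{ type: 'compaction', compact_threshold: threshold }];
    }
    return body;
}

// Downstream code (e.g. a stream processor) can recover the threshold from the body.
function getThresholdFromBody(body: SketchRequestBody): number | undefined {
    return body.context_management?.find(c => c.type === 'compaction')?.compact_threshold;
}
```

Treating `undefined` as "compaction disabled" lets every call site use a single check instead of re-deriving the setting.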
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| src/platform/networking/node/chatStream.ts | Adds responsesApi.compactionOutcome telemetry emitter helper with GDPR annotation. |
| src/platform/endpoint/node/responsesApi.ts | Computes compaction threshold once, adjusts marker/compaction slicing, plumbs threshold into processor, and emits compaction outcome telemetry. |
| src/platform/endpoint/node/chatEndpoint.ts | Passes computed compaction threshold through to Responses API response processing. |
| src/extension/prompt/node/chatMLFetcher.ts | Ensures WebSocket Responses API processing receives compaction threshold parsed from request body. |
| src/extension/externalAgents/node/oaiLanguageModelServer.ts | Ensures pass-through Responses API processing receives compaction threshold parsed from request body. |
| src/platform/endpoint/node/test/responsesApi.spec.ts | Adds tests for request-body threshold extraction, WebSocket slicing behavior, and compaction telemetry emission gating. |
| .vscode/settings.json | Enables terminal tool sandbox in workspace settings. |
| .vscode/launch.json | Adds preLaunchTask: "compile" to the primary extension launch configuration. |
b86ad3b to c918766
    messages = messages.slice(statefulMarkerAndIndex.index + 1);
    if (latestCompactionMessageIndex !== undefined) {
        if (latestCompactionMessageIndex > statefulMarkerAndIndex.index) {
            messages = messages.slice(latestCompactionMessageIndex - (statefulMarkerAndIndex.index + 1));
We still set previousResponseId in this case?
This PR is no longer valid. I had to leave it, thought I had closed it, and made some changes in this commit: microsoft/vscode@5207a1a
Validated the changes; for WebSockets, compaction is being sent.
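
For readers following the slicing arithmetic in the quoted lines: once `messages` has been sliced at `statefulMarkerAndIndex.index + 1`, an index computed against the original array has to be shifted by the same amount. A small sketch with plain strings standing in for messages follows. The helper name `sliceFromMarkerAndCompaction` is hypothetical, and leaving the array untouched when the compaction message precedes the marker is an assumption; only the index arithmetic mirrors the quoted code.

```typescript
// Hypothetical helper: only the index arithmetic mirrors the quoted diff.
function sliceFromMarkerAndCompaction<T>(
    messages: T[],
    markerIndex: number,
    latestCompactionIndex: number | undefined
): T[] {
    // Drop everything up to and including the stateful marker.
    let result = messages.slice(markerIndex + 1);
    if (latestCompactionIndex !== undefined && latestCompactionIndex > markerIndex) {
        // The compaction index was computed against the original array,
        // so re-base it by the number of elements already dropped.
        result = result.slice(latestCompactionIndex - (markerIndex + 1));
    }
    return result;
}

const original = ['u0', 'a0 (marker)', 'u1', 'compaction', 'u2'];
const sliced = sliceFromMarkerAndCompaction(original, 1, 3);
// sliced is ['compaction', 'u2']
```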
    messages = messages.slice(latestCompactionMessageIndex);
    }

    const latestCompactionMessage = latestCompactionMessageIndex !== undefined ? createCompactionRoundTripMessage(messages[latestCompactionMessageIndex]) : undefined;
nit: looks like this could be moved into the if block below
Summary
Details
- `createResponsesRequestBody` now computes the compaction threshold once and uses it to gate both request `context_management` and WebSocket stateful-marker slicing behavior
- `OpenAIResponsesProcessor` call sites now receive the request compaction threshold from the request body
- `responsesApi.compactionOutcome` emits only when compaction is enabled and reports the outcome, request IDs, model, threshold, and token counts
- `completionId` or `compactionMessageId`

Testing
- `runTests` on `src/platform/endpoint/node/test/responsesApi.spec.ts` (16 passed)
- `start-watch-tasks` output showed no compile errors
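
The telemetry gating described under Details could look roughly like the sketch below. Only the event name `responsesApi.compactionOutcome` comes from the PR description; the `TelemetrySink` interface, the outcome values, the field names, and `reportCompactionOutcome` itself are hypothetical and stand in for whatever the real helper in `chatStream.ts` does.

```typescript
// Hypothetical sketch: the sink interface, outcome values, and field names are
// illustrative assumptions; only the event name comes from the PR description.
type CompactionOutcome = 'compacted' | 'notCompacted';

interface TelemetrySink {
    send(eventName: string, properties: Record<string, string>, measurements: Record<string, number>): void;
}

interface CompactionReport {
    outcome: CompactionOutcome;
    requestId: string;
    model: string;
    promptTokens: number;
    completionTokens: number;
}

// Emits the event only when a threshold exists, i.e. compaction was enabled
// for this request. Returns whether an event was sent.
function reportCompactionOutcome(
    sink: TelemetrySink,
    threshold: number | undefined,
    report: CompactionReport
): boolean {
    if (threshold === undefined) {
        return false;
    }
    sink.send(
        'responsesApi.compactionOutcome',
        { outcome: report.outcome, requestId: report.requestId, model: report.model },
        { compactionThreshold: threshold, promptTokens: report.promptTokens, completionTokens: report.completionTokens }
    );
    return true;
}

// A recording sink is enough to check the gating in a unit test.
class RecordingSink implements TelemetrySink {
    readonly events: string[] = [];
    send(eventName: string): void {
        this.events.push(eventName);
    }
}
```

A test for the disabled path then only needs to assert that the recording sink stays empty when the threshold is `undefined`.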