All notable changes to the "github-copilot-api-vscode" extension will be documented in this file.
Check Keep a Changelog for recommendations on how to structure this file.
- Comprehensive Telemetry: Expanded from 7 sparse events to 15 rich telemetry helpers, giving deep observability via Azure Application Insights.
- Activation context: VS Code version, OS platform/arch, Node version, Copilot Chat presence, extension version.
- Deactivation summary: Session duration bucket, total commands and requests served this session.
- Server start (extended): WebSocket enabled, MCP enabled, model family, max concurrency, API key presence, rate-limit presence.
- Server stop (extended): Uptime bucket, total requests, error rate bucket.
- Request details (extended): Endpoint bucket, token-in/out buckets, message-count bucket, streaming flag, tool use flag, system-prompt flag, finish reason.
- New `request.error` event: Per-request error category (auth / rate_limit / timeout / server_error / client_error).
- New `rateLimit.hit` event: Endpoint and limit type when a 429 is returned.
- New `model.switched` event: Old/new model family and trigger source (quickpick / api / config).
- New `config.changed` event: Setting key changed (never the value) and whether a server restart followed.
- New `ws.*` events: WebSocket `connected`, `message` (type only), `disconnected`, `error` lifecycle.
- New `uriHandler.invoked` event: Deep-link path used (`/dashboard`, `/start`, `/stop`).
- Performance heartbeat (`perf.heartbeat`): Fires every 5 minutes while the server is running, reporting heap MB bucket, uptime, RPM, error rate, and server state.
- All 3 chat commands now tracked: `openCopilotChat`, `askCopilot`, `askSelectionWithCopilot` (previously had no telemetry).
- Shared bucketing utilities: `durationBucket`, `tokenBucket`, `messageCountBucket`, `uptimeBucket`, `heapMbBucket`, `modelFamily` exported from `TelemetryService` to eliminate duplicated logic.
- `bucketEndpoint()` helper: Maps raw URL paths to stable, low-cardinality App Insights dimension labels.
- `telemetryServerStarted` props extended (`hostBucket`, `enableWebSocket`, `enableMcp`, `modelFamily`, `maxConcurrency`, `hasApiKey`, `hasRateLimit`).
- `telemetryServerStopped` now includes uptime and request-count context.
- `telemetryServerError` and `telemetryTunnelStarted` updated to use structured props objects.
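
The bucketing idea behind `tokenBucket` and `bucketEndpoint()` can be sketched roughly as follows. This is an illustration only: the thresholds and the label strings are assumptions, since this changelog does not state the values the extension actually uses.

```typescript
// Hypothetical thresholds: the real bucket boundaries are not given in this changelog.
function tokenBucket(tokens: number): string {
  if (tokens <= 0) return "0";
  if (tokens < 100) return "1-99";
  if (tokens < 1000) return "100-999";
  if (tokens < 10000) return "1k-9.9k";
  return "10k+";
}

// Collapse raw URL paths into a small, fixed set of labels so that query
// strings or IDs never end up as telemetry dimension values.
// The label names here are assumptions made for illustration.
function bucketEndpoint(rawPath: string): string {
  const queryStart = rawPath.indexOf("?");
  const path = (queryStart === -1 ? rawPath : rawPath.slice(0, queryStart)).toLowerCase();
  if (path.startsWith("/v1/chat/completions")) return "openai.chat";
  if (path.startsWith("/v1/responses")) return "openai.responses";
  if (path.startsWith("/v1/messages")) return "anthropic.messages";
  return "other";
}
```

Reporting a bucket label instead of the raw number or path keeps the set of distinct dimension values small, which is what makes the data cheap to aggregate.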
- `lint-staged` 17.0.2 → 17.0.4
- `typescript-eslint` 8.59.2 → 8.59.3
- `@types/node` 25.6.0 → 25.7.0
- `ws` 8.20.0 → 8.20.1 (security fix: uninitialized memory disclosure in `websocket.close()`)
- Release workflow: explicitly package the VSIX before uploading to GitHub Release, ensuring the release page actually contains the artifact.
- Release packaging: removed stale `.vsix` files from the repository root so the published release contains only the current version.
- Claude Code Full Compatibility: End-to-end support for Claude Code connecting via the Anthropic Messages API (`/v1/messages`).
- `x-api-key` Authentication: Now accepts the Anthropic SDK-style `x-api-key` header in addition to `Authorization: Bearer`, which is required for Claude Code with API key auth enabled.
- Tool Use in Anthropic Path: The streaming Anthropic handler now passes tools to the VS Code LM API and emits proper `tool_use` SSE content blocks (`content_block_start` → `input_json_delta` → `content_block_stop`) with `stop_reason: "tool_use"`.
- `system` Prompt as Array: Both streaming and non-streaming Anthropic handlers now accept `system` as an array of text blocks (`[{type:"text",text:"..."}]`) per the latest Anthropic SDK spec.
- `AnthropicContentBlock` Type: Added explicit TypeScript union type for `text`, `tool_use`, and `tool_result` content blocks.
- Extended `AnthropicMessageRequest`: Interface now includes `tools`, `tool_choice`, and array `system` fields.
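
As a rough illustration of the `tool_use` event sequence described above, the sketch below builds the SSE frames for a single tool call. The helper names (`sseFrame`, `toolUseFrames`) and the `ToolCall` shape are hypothetical and are not taken from the extension's source; the sketch also sends the whole tool input as one `input_json_delta` and omits the `usage` field that a full `message_delta` event would carry.

```typescript
// Hypothetical shape for a tool call produced by the model.
interface ToolCall {
  id: string;
  name: string;
  input: Record<string, unknown>;
}

// Format one server-sent event: an "event:" line, a "data:" line, then a blank line.
function sseFrame(event: string, data: object): string {
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}

// Build the frames for one tool_use content block at the given block index.
function toolUseFrames(call: ToolCall, index: number): string[] {
  return [
    sseFrame("content_block_start", {
      type: "content_block_start",
      index,
      content_block: { type: "tool_use", id: call.id, name: call.name, input: {} },
    }),
    // Simplification: the full input is emitted as a single partial_json chunk.
    sseFrame("content_block_delta", {
      type: "content_block_delta",
      index,
      delta: { type: "input_json_delta", partial_json: JSON.stringify(call.input) },
    }),
    sseFrame("content_block_stop", { type: "content_block_stop", index }),
    // Signals that the model stopped because it wants a tool to be run.
    sseFrame("message_delta", {
      type: "message_delta",
      delta: { stop_reason: "tool_use", stop_sequence: null },
    }),
  ];
}
```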
- Claude 500 Error ("Unexpected chat message content type llm 2"): Multi-part array message content (`[{type:"text",text:"..."}]`) was incorrectly `JSON.stringify`'d before being passed to the VS Code LM API. Now uses `flattenMessageContent()` across all three conversion paths (`processStreamingChatCompletion`, `invokeCopilotWithTools`, `processStreamingAnthropicMessages`).
- `tool_result` Content Blocks: `flattenMessageContent()` now correctly extracts text from `tool_result` blocks in message history, allowing multi-turn tool conversations.
- `tool_use` Content Blocks: Conversation history messages containing `tool_use` blocks are now formatted as human-readable summaries instead of being silently dropped.
- CORS Headers: Added `x-api-key`, `anthropic-version`, and `anthropic-beta` to `Access-Control-Allow-Headers` for full Anthropic SDK CORS compatibility.
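
The flattening behavior described in these fixes might look something like the sketch below. It is not the extension's `flattenMessageContent()`: the function name `flattenContentSketch`, the `ContentBlock` type, and the summary format used for `tool_use` blocks are all assumptions made for illustration.

```typescript
// Hypothetical union of the content block kinds mentioned above.
type ContentBlock =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string; name: string; input: unknown }
  | { type: "tool_result"; tool_use_id: string; content: string | ContentBlock[] };

// Turn string-or-array message content into plain text instead of
// passing a JSON.stringify'd array to the language model.
function flattenContentSketch(content: string | ContentBlock[]): string {
  if (typeof content === "string") return content;
  const parts = content.map((block): string => {
    if (block.type === "text") return block.text;
    // Assumed summary format; the real wording is not given in this changelog.
    if (block.type === "tool_use") {
      return `[tool_use ${block.name}: ${JSON.stringify(block.input)}]`;
    }
    // tool_result: recurse so nested text blocks are extracted too.
    return flattenContentSketch(block.content);
  });
  return parts.join("\n");
}
```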
- Modern Dashboard UI: Complete visual overhaul with glassmorphism-inspired cards, refined color palettes adapting to VS Code themes, and smooth interactive hover effects.
- Multi-Model Support Badge: Highlighted capability in the dashboard hero section noting the Gateway can fetch ANY language model detected in VS Code.
- Wiki Refinements: Redesigned documentation tabs to a modern pill-shaped style and improved code block typography.
- Live Feed & Activities: Resolved JavaScript reference errors that broke the real-time Live Log and Recent Activity features.
- Cloudflare Tunnel Status: Fixed a bug where starting a tunnel displayed the literal string "null" instead of a proper "Starting" state in the dashboard.
- Status Bar: Shows active model name, uptime counter, tunnel indicator (🌐), and enriched tooltip with full metrics table
- Quick Pick Menu: Copy API URL, Quick Test (sends live "Hello" request), Switch Model (lists all Copilot models), Edit System Prompt, Start/Manage Tunnel
- Sidebar: Animated pulsing status dot, clickable model name for switching, live uptime ticker (ticks every second), 4-stat grid (RPM, latency, total reqs, error rate), live request feed with flash animation, config status indicators (Auth/HTTPS/Tunnel)
- README: "Run as a Background Service" guide for macOS, Windows, and Linux
- Sidebar layout reorganized: stats and live feed now above action buttons for better at-a-glance monitoring
- Removed duplicate Swagger button event listener from sidebar
- OpenAI Chat Completions: `max_completion_tokens` parameter support (auto-normalized to `max_tokens` for GPT-5.x compatibility)
- OpenAI Chat Completions: `developer` role support (auto-mapped to `system` for 2025+ spec)
- OpenAI Chat Completions: `stream_options.include_usage` support (emits a usage chunk in streaming responses)
- OpenAI Chat Completions: `reasoning_effort` parameter support for o-series models (auto-mapped to `reasoning.effort`)
- OpenAI Responses API: `text.format` structured output pass-through (was hardcoded to `text`)
- OpenAI Responses API: `truncation` parameter pass-through
- OpenAI Responses API: Expanded `reasoning.effort` values with `minimal`, `none`, `xhigh` (Aug–Dec 2025 spec)
- OpenAI Responses API: Streaming events now pass through `text.format` and `truncation` from request
- Anthropic Messages API: `thinking`, `metadata` interface fields; `tool_use` stop reason; cache token usage
- Google Generative AI: `frequencyPenalty`, `presencePenalty`, `responseMimeType`, `responseSchema`, `safetySettings` interface
- OpenAPI Spec: Added `max_completion_tokens`, `reasoning_effort` to Chat Completions schema
- OpenAPI Spec: Added `developer` to Message role enum
- OpenAPI Spec: Added `reasoning`, `truncation`, `store`, `previous_response_id`, `tool_choice` to Responses API schema
- Updated branding to emphasize free, open-source, and trustworthy nature
- Upgraded TypeScript to 5.9.3, typescript-eslint to 8.55.0, @types/node to 25.2.3
- Cloudflare Tunnel binary: Now downloads the cloudflared binary automatically at runtime to extension storage. Works properly for marketplace-installed extensions.
- 🌐 Internet Access via Cloudflare Tunnels: Expose your API to the internet with a single click. Get a free public `*.trycloudflare.com` URL instantly, with no account required. Perfect for accessing from your phone, sharing with friends, or remote development.
- Network Access Guide: New "What's New" banner explaining the difference between localhost (127.0.0.1), LAN (0.0.0.0), and Cloudflare Tunnel access.
- Security Enforcement: Tunnel requires API key authentication to be enabled before going live.
- Dashboard UI improvements with better feature discovery.
- Real-time metrics: Sidebar and dashboard now update metrics (Req/Min, Latency, Tokens, Connections) in real-time
- Dashboard race condition: Fixed message listener timing issue that caused stale data on initial load
- Host/Port layout: Fixed overlap between host, port inputs and Apply button
- "Things you should read" button in sidebar linking to notes.suhaib.in
- Model Selection: Fixed critical issue where API endpoints were ignoring the requested model and defaulting to the first available model. Now all endpoints strictly validate and use the exact model specified in the request.
- Added model validation across all API endpoints (OpenAI, Anthropic, Google, Llama). Invalid models now return a 404 error with a list of available models.
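
A minimal sketch of the strict model check described above, assuming exact-match lookup against the list of available model IDs. The function name, the result type, and the error body shape are hypothetical; the changelog only states that invalid models return a 404 with a list of available models.

```typescript
// Hypothetical result shape for the lookup.
interface ModelLookup {
  status: 200 | 404;
  model?: string;
  error?: { message: string; available_models: string[] };
}

// Use exactly the requested model, or fail with a 404 that lists the alternatives.
// There is deliberately no fallback to the first available model.
function resolveRequestedModel(requested: string, availableIds: string[]): ModelLookup {
  if (availableIds.indexOf(requested) !== -1) {
    return { status: 200, model: requested };
  }
  return {
    status: 404,
    error: {
      message: `Model '${requested}' not found`,
      available_models: availableIds.slice(),
    },
  };
}
```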
- Updated README to be model-agnostic and expanded tool categories.
- Removed specific model name references to future-proof documentation.
- Fixed dashboard readability issues in Light and High Contrast themes by using VS Code theme variables.
- Initial release