Describe the bug
When multiple concurrent copilot processes use gpt-5.3-codex, they fail with unsupported_api_for_model because the CLI sends requests to /chat/completions instead of /responses.
The issue occurs when the listModels API call is rate-limited (HTTP 429). A single CLI instance works fine because the model list loads successfully.
Root cause (from inspecting the 0.0.415 bundle):
The responses API routing in KXe.getCompletionWithTools requires both conditions to be true:
U7e(settings, clientOptions) && model?.supported_endpoints?.includes("/responses")
- U7e() checks clientOptions.thinkingMode || featureFlag("copilot_swe_agent_enable_responses_api")
- model comes from chatClient.modelPromise, a model list fetch from the API
When the model list fetch fails (HTTP 429), model is null, so supported_endpoints?.includes("/responses") is falsy. The CLI falls back to /chat/completions, which rejects codex models. It then retries the same failing endpoint 5 times without re-attempting the model list fetch.
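The failure mode falls out of JavaScript's optional chaining: a null model short-circuits the whole routing expression to falsy. A minimal sketch of that behavior (the function name and types here are illustrative stand-ins, not the bundle's actual code):

```typescript
// Hypothetical reconstruction of the routing check. In the real bundle this
// lives in the minified KXe.getCompletionWithTools; names and types below
// are assumptions for illustration only.
interface Model {
  supported_endpoints?: string[];
}

function shouldUseResponsesApi(
  u7eResult: boolean, // result of U7e(settings, clientOptions)
  model: Model | null, // resolved from chatClient.modelPromise
): boolean {
  // If the listModels fetch 429s, model is null: the optional chain yields
  // undefined, the && short-circuits, and the CLI routes to
  // /chat/completions even when thinkingMode is set.
  return u7eResult && !!model?.supported_endpoints?.includes("/responses");
}

// With model metadata loaded, codex models route to /responses:
shouldUseResponsesApi(true, { supported_endpoints: ["/responses"] }); // true
// With the model list unavailable, routing silently degrades:
shouldUseResponsesApi(true, null); // false
```

This is why the bug only appears under rate limiting: the condition degrades silently rather than surfacing the model-list failure.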
Affected version
0.0.415
Steps to reproduce the behavior
- Run multiple concurrent copilot -p processes with --model gpt-5.3-codex
- The listModels endpoint gets rate-limited (429)
- All instances fall back to /chat/completions and fail with:
{"error":{"message":"model \"gpt-5.3-codex\" is not accessible via the /chat/completions endpoint","code":"unsupported_api_for_model"}}
Expected behavior
When the model list is unavailable, the CLI should either:
- Handle the unsupported_api_for_model error by retrying with the /responses endpoint
- Retry the model list fetch with backoff before giving up
- Treat thinkingMode alone (without model metadata) as sufficient to route to /responses
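The second option could look like the sketch below: retry the model list fetch with exponential backoff instead of failing once and then retrying the wrong endpoint. This is a proposal under stated assumptions, not the CLI's code; fetchWithBackoff and its parameters are hypothetical.

```typescript
// Sketch of a backoff wrapper for the model list fetch. The wrapped
// function (e.g. the listModels call) is passed in; everything here is a
// suggested shape, not the actual copilot CLI implementation.
async function fetchWithBackoff<T>(
  fetchFn: () => Promise<T>,
  maxAttempts = 5,
  baseDelayMs = 1000,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fetchFn();
    } catch (err) {
      lastError = err;
      // Exponential backoff with jitter: 1s, 2s, 4s, ... plus up to 250 ms,
      // so concurrent instances do not re-hit the rate limit in lockstep.
      const delay = baseDelayMs * 2 ** attempt + Math.random() * 250;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  throw lastError;
}
```

The jitter matters for this bug specifically: the 429 is triggered by many concurrent instances, so synchronized retries would all fail together again.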
Additional context
From the process logs:
[ERROR] Error loading models: Error: Failed to list models: 429
Followed by 6 consecutive failures on /chat/completions with unsupported_api_for_model, then exit code 1.