Commit 6380626

fix(capacity-retry): expand markers to match real upstream transient errors
Production dialog_logs 10-min sample (886 requests / 149 response.failed events = 17% transient failure rate) reveals the existing marker list missed 100% of real failures:

- "Our servers are currently overloaded. Please try again later." (90%)
- "An error occurred while processing your request. You can retry..." (10%)

The codex CLI cosmetic message "Selected model is at capacity. Please try a different model." that users see is a CLIENT-SIDE fallback rendering, not the real upstream payload. So the previous marker list ("at capacity" / "try a different mode|model") matched the CLI display but never the wire content.

Net effect: zero transparent retries in 24h despite ~17% upstream failure rate; every failure leaked to clients.

Fix: add markers covering both observed phrases plus the generic "try again later" tail. Negative test cases pinned for auth / context-length errors which must NOT retry.

Function kept named isCapacityError for minimal blast radius - all three call sites in handler.go (/v1/responses, /v1/chat/completions, /v1/compact) and handler_anthropic.go (/v1/messages) get the broader coverage automatically.
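The matching rule described above can be sketched as a case-insensitive substring check. This is a minimal illustration, not the project's code: the body of isCapacityError is not part of this diff, so `matchesTransientMarker` and `transientMarkers` are hypothetical names, and only the marker strings themselves are taken from the commit.

```go
package main

import (
	"fmt"
	"strings"
)

// transientMarkers mirrors the marker list added by this commit.
// All entries are lowercase because the message is lowercased before matching.
var transientMarkers = []string{
	"at capacity",
	"try a different mode",
	"try a different model",
	"currently overloaded",
	"servers are currently",
	"try again later",
	"an error occurred while processing your request",
}

// matchesTransientMarker is a hypothetical stand-in for isCapacityError.
// It reports whether any marker occurs as a substring of the lowercased message.
func matchesTransientMarker(msg string) bool {
	lower := strings.ToLower(msg)
	for _, m := range transientMarkers {
		if strings.Contains(lower, m) {
			return true
		}
	}
	return false
}

func main() {
	samples := []string{
		"Our servers are currently overloaded. Please try again later.",
		"An error occurred while processing your request. You can retry your request.",
		"Invalid authentication credentials",
		"This model's maximum context length is 128000 tokens",
	}
	for _, s := range samples {
		fmt.Printf("%-5v %s\n", matchesTransientMarker(s), s)
	}
}
```

Under this sketch the two production phrases match, while the auth and context-length messages do not, which is the behavior the new test cases pin down.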
1 parent e71f949 commit 6380626

2 files changed

Lines changed: 27 additions & 2 deletions

proxy/capacity_retry.go

Lines changed: 18 additions & 2 deletions
@@ -6,22 +6,38 @@ import (
 	"github.com/tidwall/gjson"
 )
 
-// capacityErrorMarkers lists keyword markers of "capacity exhausted" errors returned by upstream Codex (compared in lowercase).
+// capacityErrorMarkers lists keyword markers of "transient, retryable" errors returned by upstream Codex (compared in lowercase).
 //
 // Typical error messages (carried by the response.failed event of the upstream Responses SSE stream):
 //
 //   "Selected model is at capacity. Please try a different mode"
 //   "The model you requested is at capacity..."
+//   "Our servers are currently overloaded. Please try again later."
+//   "An error occurred while processing your request. You can retry your request..."
 //
-// A match on any marker classifies the error as a capacity error, allowing codex2api to transparently retry the request
+// A match on any marker classifies the error as transient and retryable, allowing codex2api to transparently retry the request
 // (with a different account), provided that no bytes of the response stream have been written to the downstream client yet.
 //
+// History: before v1.7.51 only the "at capacity" family was matched, but production
+// dialog_logs measurements show that the codex CLI's "Selected model is at capacity.
+// Please try a different model." is actually a client-side fallback message. The real
+// upstream text is "Our servers are currently overloaded" 90% of the time and "An error
+// occurred while processing your request" 10% of the time. The old markers matched
+// neither, so transparent retry was completely ineffective and every error leaked to clients.
+//
 // Reference: GitHub Issue openai/codex#17014. During the regional gpt-5.4 capacity crunch of April 2026,
 // upstream often emitted response.failed after the stream was established but before any content was generated.
 var capacityErrorMarkers = []string{
+	// Capacity (early OpenAI wording)
 	"at capacity",
 	"try a different mode",
 	"try a different model",
+	// Server overload (added in v1.7.51; the dominant error in production)
+	"currently overloaded",
+	"servers are currently",
+	"try again later",
+	// Generic transient error (upstream explicitly says "you can retry")
+	"an error occurred while processing your request",
 }
 
 // isCapacityError reports whether an error message matches the "upstream capacity exhausted" markers.

proxy/capacity_retry_test.go

Lines changed: 9 additions & 0 deletions
@@ -17,6 +17,15 @@ func TestIsCapacityError(t *testing.T) {
 		{"try a different model", "Please try a different model.", true},
 		{"rate limit is NOT capacity", "Rate limit exceeded", false},
 		{"quota is NOT capacity", "You exceeded your current quota", false},
+		// Markers added in v1.7.51 (samples measured from production dialog_logs)
+		{"servers overloaded (90% of samples)",
+			"Our servers are currently overloaded. Please try again later.", true},
+		{"generic upstream error (10% of samples)",
+			"An error occurred while processing your request. You can retry your request, or contact us through our help center at help.openai.com if the error persists.", true},
+		{"only 'try again later' tail", "Service unavailable. Try again later.", true},
+		// Negative cases guarding against false positives
+		{"auth error must NOT retry", "Invalid authentication credentials", false},
+		{"context length must NOT retry", "This model's maximum context length is 128000 tokens", false},
 	}
 	for _, c := range cases {
 		t.Run(c.name, func(t *testing.T) {
