fix(model-fallback): add HTTP statusCode check for GLM rate limit fallback #3773

Open
cailgarrisk-collab wants to merge 1 commit into code-yeongyu:dev from cailgarrisk-collab:fix/glm-rate-limit-fallback-statuscode

@cailgarrisk-collab cailgarrisk-collab commented May 3, 2026

Summary

isRetryableModelError() only checked error message strings for rate-limit patterns, so it missed cases where the provider returns HTTP 429 without matching text (e.g., Chinese GLM/Z.ai messages like "请求频率过高", i.e. "request frequency too high"). This fix adds HTTP status code checking as a parallel detection path.

Changes

src/shared/model-error-classifier.ts

  • ErrorInfo interface: added statusCode?: number
  • isRetryableModelError(): checks statusCode (429/503/529) after STOP patterns and before the message pattern fallback, so STOP patterns still keep quota/billing 429s from being retried
  • STOP_MESSAGE_PATTERNS: added GLM/Z.ai-specific quota patterns: "daily call limit", "in arrears", "fair use policy", "recharge and try", "usage limit reached for"
  • 400 intentionally excluded from statusCode check (permanent client error)

src/features/background-agent/error-classifier.ts

  • Added extractErrorStatusCode() — extracts HTTP status from error objects supporting: statusCode, status, code, response.status (number and string formats)
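A minimal sketch of how such an extractor could look, assuming the four fields named above are probed in that order. The helper names (`isRecord`, `toHttpStatus`, `STATUS_FIELDS`) and the 100-599 range filter are illustrative assumptions, not taken from the PR diff.

```typescript
// Hypothetical sketch; the real extractErrorStatusCode() in the PR may differ.
const STATUS_FIELDS = ["statusCode", "status", "code"] as const;

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null;
}

// Accept numbers and three-digit numeric strings in the HTTP range (100-599).
// The range filter is an assumption so that values like code: "ECONNRESET"
// or code: 11 are not mistaken for HTTP statuses.
function toHttpStatus(value: unknown): number | undefined {
  let parsed: number | undefined;
  if (typeof value === "number") {
    parsed = value;
  } else if (typeof value === "string" && /^\d{3}$/.test(value.trim())) {
    parsed = Number(value.trim());
  }
  if (parsed === undefined || !Number.isInteger(parsed)) return undefined;
  return parsed >= 100 && parsed <= 599 ? parsed : undefined;
}

function extractErrorStatusCode(error: unknown): number | undefined {
  if (!isRecord(error)) return undefined;
  // Top-level fields first: statusCode, status, code.
  for (const field of STATUS_FIELDS) {
    const found = toHttpStatus(error[field]);
    if (found !== undefined) return found;
  }
  // Then a nested response object, e.g. { response: { status: 429 } }.
  const response = error["response"];
  return isRecord(response) ? toHttpStatus(response["status"]) : undefined;
}
```

Returning `undefined` rather than throwing keeps the message pattern path available when an error carries no usable status.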

src/features/background-agent/fallback-retry-handler.ts

  • tryFallbackRetry() errorInfo accepts statusCode?: number

src/features/background-agent/manager.ts

  • All 3 errorInfo construction sites now extract statusCode via extractErrorStatusCode()
  • tryFallbackRetry + handleSessionErrorEvent signatures updated
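The propagation described above could be sketched roughly as follows. Only `statusCode?: number` comes from the PR text. `FallbackErrorInfo`, `StatusExtractor`, and `buildErrorInfo` are hypothetical names used for illustration, and the extractor is passed in as a parameter so the sketch stays self-contained.

```typescript
// Hypothetical shapes: only statusCode?: number is taken from the PR text.
interface FallbackErrorInfo {
  name?: string;
  message?: string;
  statusCode?: number;
}

type StatusExtractor = (error: unknown) => number | undefined;

// Build the errorInfo object at a call site so the HTTP status travels
// alongside name and message into the fallback decision.
function buildErrorInfo(
  error: unknown,
  extractStatus: StatusExtractor,
): FallbackErrorInfo {
  const info: FallbackErrorInfo = {};
  if (error instanceof Error) {
    info.name = error.name;
    info.message = error.message;
  } else if (typeof error === "string") {
    info.message = error;
  }
  // Leave statusCode absent (not set to undefined) when nothing was found.
  const statusCode = extractStatus(error);
  if (statusCode !== undefined) info.statusCode = statusCode;
  return info;
}
```

Funnelling every construction site through one helper like this is one way to keep the three sites from drifting apart. The PR itself may instead call the extractor inline at each site.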

src/shared/model-error-classifier.test.ts

  • 12 new tests covering: GLM 429 with Chinese message, 429 with no message, 503/529, 400/401 exclusion, GLM quota/arrears/fair-use STOP patterns, STOP > statusCode precedence, backward compat

Verification

bun test src/shared/model-error-classifier.test.ts → 34/34 PASS
bun test src/features/background-agent/error-classifier.test.ts → 65/65 PASS
npx tsc --noEmit → clean (0 errors)

Precedence Chain

NON_RETRYABLE_ERROR_NAMES → STOP_ERROR_NAMES → RETRYABLE_ERROR_NAMES → STOP_MESSAGE_PATTERNS → AUTO_RETRY_GATE → statusCode(429/503/529) → RETRYABLE_MESSAGE_PATTERNS

STOP patterns (quota, arrears, fair-use) always win over statusCode 429.
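The tail of this chain can be illustrated with a simplified sketch. It covers only the last three stages of message and status handling: the error-name checks and AUTO_RETRY_GATE are omitted. Case-insensitive substring matching and the two retryable phrases are assumptions, and every identifier ending in `Sketch` is hypothetical.

```typescript
// Hypothetical sketch of the documented precedence, not the PR's code.
interface ErrorInfoSketch {
  name?: string;
  message?: string;
  statusCode?: number;
}

// 400 and 401 are deliberately absent: they are treated as permanent errors.
const RETRYABLE_STATUS_CODES_SKETCH = new Set([429, 503, 529]);

// STOP phrases quoted in the PR description.
const STOP_MESSAGE_PATTERNS_SKETCH = [
  "daily call limit",
  "in arrears",
  "fair use policy",
  "recharge and try",
  "usage limit reached for",
];

// Placeholder retryable phrases; the real list is not shown in the PR.
const RETRYABLE_MESSAGE_PATTERNS_SKETCH = ["rate limit", "too many requests"];

function isRetryableSketch(error: ErrorInfoSketch): boolean {
  const message = (error.message ?? "").toLowerCase();
  // 1. STOP patterns win, even when the status code is 429.
  if (STOP_MESSAGE_PATTERNS_SKETCH.some((p) => message.includes(p))) {
    return false;
  }
  // 2. Status code path: independent of message language or presence.
  if (
    error.statusCode !== undefined &&
    RETRYABLE_STATUS_CODES_SKETCH.has(error.statusCode)
  ) {
    return true;
  }
  // 3. Message pattern fallback for errors without a usable status code.
  return RETRYABLE_MESSAGE_PATTERNS_SKETCH.some((p) => message.includes(p));
}
```

With this ordering, `{ statusCode: 429, message: "请求频率过高" }` is retryable, while a 429 whose message contains "daily call limit" is not.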


Summary by cubic

Fixes model fallback detection by checking HTTP status codes so GLM/Z.ai rate limits are caught even with non-English or missing messages. Prevents retries on quota/billing errors via new STOP patterns.

  • Bug Fixes
    • isRetryableModelError() checks statusCode 429/503/529 before message matching; STOP patterns still win.
    • Added GLM/Z.ai STOP phrases (quota/arrears/fair-use) to avoid retrying permanent limits.
    • Introduced extractErrorStatusCode() and passed statusCode through fallback and manager handlers.
    • Expanded tests for Chinese 429, no-message 429, 503/529, STOP precedence; 400 excluded.

Written for commit 61d2f11. Summary will update on new commits.

…lback

isRetryableModelError() now checks the HTTP status code (429/503/529)
in addition to existing message pattern matching. This ensures rate
limit errors trigger model fallback regardless of error message format
or language (e.g., Chinese GLM errors).

Changes:
- ErrorInfo interface extended with statusCode?: number
- isRetryableModelError() checks statusCode after STOP patterns, before
  message pattern fallback
- extractErrorStatusCode() added to error-classifier.ts (supports
  statusCode, status, code, response.status fields)
- GLM-specific STOP patterns added: daily call limit, in arrears,
  fair use policy, recharge and try — these prevent quota/billing 429s
  from being treated as transient rate limits
- statusCode propagated through tryFallbackRetry and manager.ts

400 intentionally excluded from statusCode check (permanent client error).

github-actions Bot commented May 3, 2026

Thank you for your contribution! Before we can merge this PR, we need you to sign our Contributor License Agreement (CLA).

To sign the CLA, please comment on this PR with:

I have read the CLA Document and I hereby sign the CLA

This is a one-time requirement. Once signed, all your future contributions will be automatically accepted.


You can retrigger this bot by commenting recheck in this Pull Request. Posted by the CLA Assistant Lite bot.

@cubic-dev-ai cubic-dev-ai Bot left a comment

No issues found across 5 files

Confidence score: 5/5

  • Automated review surfaced no issues in the provided summaries.
  • No files require special attention.

Auto-approved: Correctly implements status code checks with proper precedence to avoid regressions on quota errors; comprehensive tests provided.
