Skip to content

[codex] reconcile OpenAI admin test rate-limit state#1772

Open
KnowSky404 wants to merge 2 commits intoWei-Shaw:mainfrom
KnowSky404:fix/openai-test-state-reconciliation
Open

[codex] reconcile OpenAI admin test rate-limit state#1772
KnowSky404 wants to merge 2 commits intoWei-Shaw:mainfrom
KnowSky404:fix/openai-test-state-reconciliation

Conversation

@KnowSky404
Copy link
Copy Markdown
Contributor

@KnowSky404 KnowSky404 commented Apr 21, 2026

What changed

  • reconcile OpenAI admin test 429 usage_limit_reached results into persisted account runtime state
  • persist rate-limit state during the admin test path instead of only returning the error to the UI
  • clear stale status=error / old 403 error message when a retest shows the account is actually rate-limited instead of forbidden
  • add unit coverage for the OpenAI admin test reconciliation path

Why

We reproduced a production state drift where some OpenAI OAuth accounts stayed persisted as:

  • status=error
  • error_message = Access forbidden (403): account may be suspended or lack permissions

but live admin retesting later returned 429 usage_limit_reached instead.

That left stale 403 state in the database even though the real upstream condition had already changed.

Root cause

testOpenAIAccountConnection only persisted permanent error state for 401 responses.

For 429, the admin test path returned the error to the caller and persisted codex snapshot headers, but it did not reuse the existing rate-limit state handling path. As a result:

  • rate_limited_at / rate_limit_reset_at were not updated
  • stale status=error and old 403 error messages could remain in place

User impact

This makes the admin UI and stored account state align better with the actual upstream result during manual retests:

  • stale 403 error state no longer survives when the current result is 429
  • operators see the account as rate-limited rather than permanently broken
  • successful retests continue to use the existing recovery flow unchanged

Validation

  • GOCACHE=/tmp/go-build GOMODCACHE=/tmp/go-mod go test -tags=unit ./internal/service -run 'Test(AccountTestService_OpenAI|CalculateOpenAI429ResetTime|Handle429_OpenAIPersistsCodexSnapshotImmediately|RateLimitService_RecoverAccount(AfterSuccessfulTest|State))'

Related issues

Fixes #1770
Fixes #1771

@KnowSky404 KnowSky404 marked this pull request as ready for review April 21, 2026 00:59
@KnowSky404
Copy link
Copy Markdown
Contributor Author

Added another regression-test pass for the admin OpenAI account test path so the state transitions are covered more explicitly.\n\nNew test coverage now includes:\n- with codex headers -> persists rate-limit state and clears stale error state\n- with only body -> still persists rate-limit state and clears stale error state\n- on an already account -> persists rate-limit state without unnecessary error clearing\n- without any reset signal -> does not mutate runtime state\n- -> still keeps the permanent-auth-error behavior and does not mark rate-limited\n\nThis was added mainly because the production symptom is difficult to verify manually and we wanted the state reconciliation behavior locked down with focused unit tests.

@KnowSky404
Copy link
Copy Markdown
Contributor Author

Added another regression-test pass for the admin OpenAI account test path so the state transitions are covered more explicitly.

New test coverage now includes:

  • 429 with codex headers -> persists rate-limit state and clears stale 403 error state
  • 429 with only body resets_at -> still persists rate-limit state and clears stale error state
  • 429 on an already active account -> persists rate-limit state without unnecessary error clearing
  • 429 without any reset signal -> does not mutate runtime state
  • 401 -> still keeps the permanent-auth-error behavior and does not mark rate-limited

This was added mainly because the production symptom is difficult to verify manually and we wanted the state reconciliation behavior locked down with focused unit tests.

@Wei-Shaw Wei-Shaw force-pushed the main branch 4 times, most recently from 1e0d466 to 4d0483f Compare April 22, 2026 10:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

1 participant