
feat: add grok subscription support #3310

Open
heathermhuang wants to merge 12 commits into Wei-Shaw:main from heathermhuang:codex/grok-subscription-support

heathermhuang commented Jun 16, 2026

Summary

  • Adds first-class Grok / xAI subscription-backed account support using OAuth credentials and the existing Sub2API account, scheduler, refresh, usage, quota, and billing paths.
  • Adds xAI OAuth helpers, admin OAuth endpoints, token refresh/provider services, Grok account creation, and Grok usage display.
  • Routes OpenAI-compatible Grok Responses traffic to the configured xAI-compatible base URL with OAuth bearer tokens:
    • public /v1/responses, /responses, and /backend-api/codex/responses
    • non-streaming and streaming Responses behavior covered locally, with prior non-streaming live smoke
  • Keeps public Grok Chat Completions routes out of this PR's production scope:
    • /v1/chat/completions and /chat/completions intentionally return route-level unsupported responses for Grok groups
    • the lower-level raw Grok Chat Completions forwarder exists and is locally tested, but it is not exposed as a public Grok gateway contract until the route fence, route-level tests, and live QA are explicitly expanded
  • Adds Grok quota parity where xAI exposes usable signals:
    • active admin quota probe using a minimal safe upstream /responses request
    • durable observation metadata for headers_observed, last_probe_at, last_headers_seen_at, and last upstream status
    • explicit no_headers state when a probe succeeds but xAI returns no quota headers, without fabricating quota values
    • normalization/persistence of xAI x-ratelimit-*, retry-after, subscription, and entitlement headers
    • scheduler auto-pause for exhausted request/token windows, Retry-After, 401 reauth, and 403 entitlement/subscription failures
    • frontend quota probe/status display in the account usage area, including fresh accounts before passive snapshots exist
  • Adds admin routes:
    • GET /api/v1/admin/grok/accounts/:id/quota
    • POST /api/v1/admin/grok/accounts/:id/reset-quota
    • GET /api/v1/admin/grok/runtime-sanity
  • Quota reset intentionally returns 501 GROK_QUOTA_RESET_UNSUPPORTED; credits are not faked unless xAI exposes a real reset capability.
  • Hardens Grok OAuth with:
    • PKCE token-exchange code_challenge
    • HTTPS/host validation for OAuth endpoints and base URLs
    • explicit unsafe dev overrides
    • sanitized runtime sanity reporting
    • state-required callback exchange by default
    • account-scoped cache/error-redaction/refresh behavior
    • a concurrency cap of 1 unless explicitly overridden
  • Documents the safe base URL choices: default https://api.x.ai/v1 and explicit opt-in https://cli-chat-proxy.grok.com/v1 for CLI-proxy-style behavior.
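
For reviewers, the HTTPS/host validation and the two documented base URLs could be enforced roughly as follows. This is a minimal sketch under stated assumptions, not the code in this PR: `validateBaseURL`, `allowedHosts`, and the `allowUnsafe` flag are hypothetical names, and the allowlist holds only the two hosts named in the summary.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// allowedHosts holds the two hosts named in the summary.
// Hypothetical: the PR's real allowlist may be shaped differently.
var allowedHosts = map[string]bool{
	"api.x.ai":                true,
	"cli-chat-proxy.grok.com": true,
}

// validateBaseURL rejects base URLs that are not HTTPS or whose host is not
// allowlisted. allowUnsafe stands in for an explicit dev override (assumed).
func validateBaseURL(raw string, allowUnsafe bool) error {
	u, err := url.Parse(strings.TrimSpace(raw))
	if err != nil {
		return fmt.Errorf("parse base URL: %w", err)
	}
	if u.Host == "" {
		return errors.New("base URL has no host")
	}
	if u.User != nil {
		return errors.New("base URL must not embed credentials")
	}
	if allowUnsafe {
		// Scheme and host checks are skipped only on explicit opt-in.
		return nil
	}
	if u.Scheme != "https" {
		return errors.New("base URL must use https")
	}
	if !allowedHosts[strings.ToLower(u.Hostname())] {
		return fmt.Errorf("host %q is not allowlisted", u.Hostname())
	}
	return nil
}

func main() {
	for _, raw := range []string{
		"https://api.x.ai/v1",
		"https://cli-chat-proxy.grok.com/v1",
		"http://api.x.ai/v1",
		"https://example.com/v1",
	} {
		fmt.Printf("%-40s -> %v\n", raw, validateBaseURL(raw, false))
	}
}
```

The override is a separate boolean rather than a relaxed allowlist so that an unsafe configuration has to be requested explicitly.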

Scope Notes

  • Scope is text/reasoning Grok models through OpenAI-compatible Responses traffic.
  • Public Grok Chat Completions, image, video, TTS, transcription, browser automation, cookie scraping, and Grok web scraping are out of scope.
  • Quota values are not invented. The implementation records and acts on whitelisted xAI headers when xAI returns them; otherwise usage remains locally tracked with unknown upstream quota and a timestamped no_headers observation if an active probe saw no quota headers.
  • OAuth behavior was aligned against Hermes and OpenClaw reference behavior where applicable, while keeping bare-code fallback explicit instead of silent.
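
The "record what xAI returns, otherwise mark no_headers" rule can be illustrated with a small sketch. It is not the PR's implementation: `QuotaObservation`, `observeQuota`, the `"observed"` state label, and the specific header name `x-ratelimit-remaining-requests` are assumptions for illustration. Only the `x-ratelimit-*` prefix, `retry-after`, and the `no_headers` state come from the description above, and the subscription/entitlement headers are left out because their names are not given here.

```go
package main

import (
	"fmt"
	"net/http"
	"strconv"
	"strings"
)

// QuotaObservation is a hypothetical record of one probe or passive snapshot.
type QuotaObservation struct {
	State             string            // "no_headers" or "observed" (label assumed)
	Headers           map[string]string // whitelisted headers, lowercased names
	RemainingRequests *int64            // nil when upstream did not report a value
}

// observeQuota copies only whitelisted headers and never invents a value:
// a missing or unparsable header leaves RemainingRequests nil.
func observeQuota(h http.Header) QuotaObservation {
	obs := QuotaObservation{State: "no_headers", Headers: map[string]string{}}
	for name, values := range h {
		lower := strings.ToLower(name)
		if len(values) == 0 {
			continue
		}
		if strings.HasPrefix(lower, "x-ratelimit-") || lower == "retry-after" {
			obs.Headers[lower] = strings.TrimSpace(values[0])
		}
	}
	if len(obs.Headers) == 0 {
		return obs
	}
	obs.State = "observed"
	// Assumed header name, modeled on common OpenAI-compatible conventions.
	if v, ok := obs.Headers["x-ratelimit-remaining-requests"]; ok {
		if n, err := strconv.ParseInt(v, 10, 64); err == nil {
			obs.RemainingRequests = &n
		}
	}
	return obs
}

func main() {
	h := http.Header{}
	h.Set("X-RateLimit-Remaining-Requests", "0")
	h.Set("Content-Type", "application/json")
	fmt.Println(observeQuota(h).State)
	fmt.Println(observeQuota(http.Header{}).State)
}
```

Using a pointer for the remaining count keeps "upstream reported 0" distinct from "upstream reported nothing", which is the distinction the scope note relies on.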

Testing

Latest validation on current head 82da8a3816a78a817405ecd16327973cfd52015f:

  • cd backend && GOCACHE=/private/tmp/sub2api-go-cache /Users/heatherm/.cache/codex/toolchains/go1.26.4-darwin-arm64/go/bin/go test -tags=unit ./... -count=1
    • Passed with localhost listener permission for httptest/miniredis.
  • pnpm --dir frontend typecheck
  • pnpm --dir frontend lint:check
  • pnpm --dir frontend build
  • git diff --check

Additional focused Grok validation on current head 82da8a3816a78a817405ecd16327973cfd52015f:

  • cd backend && GOCACHE=/private/tmp/sub2api-go-cache /Users/heatherm/.cache/codex/toolchains/go1.26.4-darwin-arm64/go/bin/go test -tags=unit ./internal/pkg/xai ./internal/handler/admin ./internal/service ./internal/server/routes -run 'Test(ObserveQuota|ParseQuota|RuntimeSanity|GrokQuota|GrokOAuthHandler|GrokTokenProviderRefresh|HandleGrokAccountUpstreamError|ShouldAutoPauseGrok|ForwardGrokResponsesStreaming|ForwardAsChatCompletionsForGrok|GatewayRoutesGrok|NormalizeAccountConcurrencyCapsGrok|ValidateXAI)' -count=1

Additional validation on prior head 7f3638adb89199110f0beefed9bdb652b5436379:

  • cd backend && GOCACHE=/private/tmp/sub2api-go-cache /Users/heatherm/.cache/codex/toolchains/go1.26.4-darwin-arm64/go/bin/go test -tags=unit ./internal/server/routes -run TestGatewayRoutesGrokOnlyAllowsResponsesHTTP -count=1
  • git diff --check

Grok readiness validation on prior head 94866822ae08ffe92584780b03410f1c749666b4:

  • cd backend && GOCACHE=/private/tmp/sub2api-go-cache /Users/heatherm/.cache/codex/toolchains/go1.26.4-darwin-arm64/go/bin/go test -tags=unit ./internal/service -run 'Test(ForwardGrokResponsesStreamingUsesXAIResponsesAndSnapshots|ForwardAsChatCompletionsForGrokStreamingUsesRawXAIChatCompletions|GrokTokenProviderRefreshesExpiredTokenOnRequestPath|ForwardAsChatCompletionsForGrokUsesXAIChatCompletionsAndSnapshots|NormalizeAccountConcurrencyCapsGrokOAuthUnlessUnsafe)' -count=1
  • cd backend && GOCACHE=/private/tmp/sub2api-go-cache /Users/heatherm/.cache/codex/toolchains/go1.26.4-darwin-arm64/go/bin/go test -tags=unit ./internal/pkg/xai ./internal/handler/admin ./internal/service ./internal/server/routes -run 'Test(BuildGrok|PatchGrok|ParseQuota|GrokQuota|GrokOAuthHandler|ShouldAutoPauseGrok|ForwardAsChatCompletionsForGrok|ForwardGrokResponsesStreaming|GrokTokenProviderRefreshesExpiredToken|ValidateXAI|GatewayRoutesGrok|NormalizeAccountConcurrencyCapsGrok)' -count=1
  • git diff --check

Parent-session validation retained for the broader feature branch:

  • cd backend && PATH="$HOME/.cache/codex/toolchains/go1.26.4-darwin-arm64/go/bin:$PATH" make generate
  • cd backend && PATH="$HOME/.cache/codex/toolchains/go1.26.4-darwin-arm64/go/bin:$PATH" go test ./internal/service -count=1
  • cd backend && PATH="$HOME/.cache/codex/toolchains/go1.26.4-darwin-arm64/go/bin:$PATH" make test-unit
  • cd backend && PATH="$HOME/.cache/codex/toolchains/go1.26.4-darwin-arm64/go/bin:$HOME/.cache/codex/toolchains/bin:$PATH" golangci-lint run ./... --timeout=30m
  • make test-frontend
  • pnpm --dir frontend audit --prod --audit-level=high --json > <tmp> && python3.12 tools/check_pnpm_audit_exceptions.py --audit <tmp> --exceptions .github/audit-exceptions.yml

Validation caveats:

  • The repo declares go 1.26.4; local validation used the pinned toolchain under ~/.cache/codex/toolchains/go1.26.4-darwin-arm64/.
  • The first full backend unit attempt without listener permission failed only on sandbox-blocked httptest/miniredis binds. The same command passed after rerunning with localhost listener permission.
  • pnpm --dir frontend build passes, with existing Vite warnings about mixed dynamic/static imports, outdated Browserslist data, and chunks over 500 kB.
  • The repo secret_scan target could not be run because tools/secret_scan.py is missing. A touched-file scan found only expected field names and test placeholders.
  • Frontend Vitest had a local Node 22 runner hang in a prior session; latest-head typecheck, lint, and production build now pass locally.

Live QA

  • Real xAI OAuth login and callback exchange were exercised in a prior session after explicit approval using disposable local Postgres/Redis/server infra.
  • OAuth auth URL included state, code_challenge, and code_challenge_method=S256.
  • xAI OAuth callback returned both code and state; token exchange succeeded and created a Grok OAuth account.
  • Disposable Sub2API API key routed /v1/responses through the Grok OAuth account to xAI.
  • Minimal non-streaming smoke returned HTTP 200 for grok-4.3 with output qa-ok.
  • The first gateway smoke exposed a scheduler capability bug (no available accounts despite an available Grok account); fixed and verified with regression tests and live smoke.
  • Latest heads 94866822, 7f3638ad, and 82da8a38 added local tests/docs/readiness hardening only; real xAI OAuth, live xAI provider QA, and production deploy QA were not rerun on those heads.
  • All disposable OAuth QA containers, callback listener, local server, API keys, tokens, callback files, and temp data were cleaned up after validation.
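
For context on the `state`, `code_challenge`, and `code_challenge_method=S256` parameters checked above, the following sketch shows how an S256 challenge is derived from a verifier per RFC 7636 and attached to an authorization URL. The function names, the endpoint, and the client ID are placeholders, not xAI's real values or this PR's helpers; a real request would carry further parameters such as `redirect_uri`.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"encoding/base64"
	"fmt"
	"net/url"
)

// newVerifier returns a random PKCE code verifier (32 bytes, base64url, unpadded).
func newVerifier() (string, error) {
	buf := make([]byte, 32)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return base64.RawURLEncoding.EncodeToString(buf), nil
}

// challengeS256 computes BASE64URL(SHA256(verifier)) as defined by RFC 7636.
func challengeS256(verifier string) string {
	sum := sha256.Sum256([]byte(verifier))
	return base64.RawURLEncoding.EncodeToString(sum[:])
}

// authURL appends the PKCE and state parameters to a hypothetical endpoint.
func authURL(endpoint, clientID, state, challenge string) string {
	q := url.Values{}
	q.Set("response_type", "code")
	q.Set("client_id", clientID)
	q.Set("state", state)
	q.Set("code_challenge", challenge)
	q.Set("code_challenge_method", "S256")
	return endpoint + "?" + q.Encode()
}

func main() {
	verifier, err := newVerifier()
	if err != nil {
		panic(err)
	}
	// Placeholder endpoint and client ID, for illustration only.
	fmt.Println(authURL("https://auth.example.invalid/authorize", "demo-client", "demo-state", challengeS256(verifier)))
}
```

The verifier stays with the client and is sent only at token exchange, so an intercepted authorization code cannot be redeemed without it.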

Current GitHub State

  • Current head: 82da8a3816a78a817405ecd16327973cfd52015f.
  • PR is open, not draft, and mergeable from GitHub's perspective.
  • GitHub reports mergeStateStatus: UNSTABLE.
  • Visible checks on the latest head are still limited to CLA Assistant:
    • cla-check: success
    • cla-lock: skipped
  • Base-repo CI and Security Scan are still not visible on the latest head. If they remain absent, gated, or action-required for the fork PR, a maintainer will need to approve or rerun them before the PR can be merged with confidence.

Production Readiness

  • Core OAuth flow: go, based on prior live proof plus broad local regression coverage on the latest head; the flow was not rerun against xAI on the latest heads.
  • Gateway routing: go for the public Grok /v1/responses contract, with local streaming/non-streaming coverage and prior live non-streaming smoke. Public Grok Chat Completions is intentionally out of scope and should remain no-go until explicitly enabled and tested.
  • Quota/subscription parity: partial go. Active probe, no-header observation, header normalization, admin query UI, scheduler pause behavior, 401/403/429 state handling, and runtime sanity reporting are implemented and covered locally. Live quota/header/subscription parity is still not fully proven because xAI may omit quota headers and live parity probes were not rerun on the latest head.
  • CI/merge readiness: not final until current-head CI/Security status is visible and green or explicitly accepted by maintainers.
  • Production readiness: not final until CI/Security runs and the remaining controlled QA is completed, including streaming Responses live QA, refresh-after-expiry or forced refresh, small-concurrency behavior, quota/subscription signal checks, and production base-url/env sanity.
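
To make the 401/403/429 state handling mentioned above concrete, here is one way the status-to-pause mapping could look. It is a sketch under assumptions, not the scheduler code in this PR: `PauseDecision`, `decidePause`, and the reason strings are hypothetical, and only the delta-seconds form of `Retry-After` is parsed (the HTTP-date form is ignored).

```go
package main

import (
	"fmt"
	"net/http"
	"strconv"
	"time"
)

// PauseDecision is a hypothetical result of inspecting an upstream response.
type PauseDecision struct {
	Pause      bool
	Reason     string
	RetryAfter time.Duration // zero means no upstream hint was available
}

// decidePause maps upstream status codes to a pause decision.
func decidePause(status int, h http.Header) PauseDecision {
	switch status {
	case http.StatusUnauthorized:
		// Token is no longer accepted: pause until the account re-authenticates.
		return PauseDecision{Pause: true, Reason: "reauth_required"}
	case http.StatusForbidden:
		// Entitlement or subscription failure: retrying will not help.
		return PauseDecision{Pause: true, Reason: "entitlement_or_subscription"}
	case http.StatusTooManyRequests:
		d := PauseDecision{Pause: true, Reason: "rate_limited"}
		if secs, err := strconv.Atoi(h.Get("Retry-After")); err == nil && secs > 0 {
			d.RetryAfter = time.Duration(secs) * time.Second
		}
		return d
	}
	return PauseDecision{}
}

func main() {
	h := http.Header{}
	h.Set("Retry-After", "30")
	fmt.Printf("%+v\n", decidePause(http.StatusTooManyRequests, h))
	fmt.Printf("%+v\n", decidePause(http.StatusOK, nil))
}
```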

github-actions Bot commented Jun 16, 2026

Contributor

All contributors have signed the CLA. ✅
Posted by the CLA Assistant Lite bot.

@heathermhuang heathermhuang marked this pull request as draft June 16, 2026 11:01
@heathermhuang

Author

I have read the CLA Document and I hereby sign the CLA

github-actions Bot added a commit that referenced this pull request Jun 16, 2026
@heathermhuang

Author

Maintainer action requested: GitHub Actions for the latest head 24b6fdf9d988b441c0e6654b425ac79fed38d776 are still gated with action_required; both CI and Security Scan created runs with no jobs. Please approve/run the workflows when convenient so the PR can get base-repo CI evidence.

@heathermhuang heathermhuang marked this pull request as ready for review June 17, 2026 07:45
@heathermhuang

heathermhuang commented Jun 18, 2026

Author

@Wei-Shaw Latest head is 82da8a3816a78a817405ecd16327973cfd52015f.

Latest-head local validation is green:

  • backend: go test -tags=unit ./... -count=1
  • frontend: pnpm --dir frontend typecheck
  • frontend: pnpm --dir frontend lint:check
  • frontend: pnpm --dir frontend build
  • git diff --check

Public Grok scope remains /v1/responses only; Grok Chat Completions is intentionally fenced out of this PR's production scope.

Could you please review and confirm whether base-repo CI/Security checks need to be manually enabled for this fork PR? Currently the only visible checks on this head are CLA (cla-check passed, cla-lock skipped).
