feat(web-ui): proactive auth guard + SSE/WS token-expiry re-auth path (#651)#690
Conversation
…#651) Add a client-side route guard in AppLayout (the JWT lives in localStorage, so server/edge middleware can't gate it): unauthenticated visitors to a protected route see a neutral loader and are redirected to /login — the sidebar shell never renders, so there is no shell flicker. The /login page redirects an already-authenticated visitor to /. For SSE/WebSocket streams (which auth via ?token= and whose failures the axios 401 interceptor never sees), add a shared verifyAuthAfterStreamFailure() probe that hits /users/me via the axios client: a genuine expiry returns 401 and the existing interceptor clears the token and redirects to /login, while transient failures (200/network) don't falsely bounce the user. Wired into useEventSource (terminal CLOSED — covers task + stress-test SSE), useAgentChat (WS close 1008), and useTerminalSocket (WS close 4001). Proxy /users/* in next.config.js so the probe reaches the backend. Closes #651
…h probe (#651) Cross-family review (codex) found the guard and stream re-auth probe broke the supported CODEFRAME_AUTH_REQUIRED=false local-dev mode: - AppLayout redirected every protected route to /login whenever no JWT was stored, trapping token-less auth-disabled users on the login page. - the stream probe hit fastapi-users /users/me, which 401s without a JWT even when auth is disabled, redirecting closed streams to /login. Both now probe /api/v2/settings/keys, which is gated by the backend require_auth dependency and therefore RESPECTS CODEFRAME_AUTH_REQUIRED (returns 200 without a token when auth is disabled, 401 only when auth is required). AppLayout uses a new checkAuthAccess() (bare axios, no global-interceptor redirect) to decide allow/deny/fail-open; the stream probe keeps the intercepted client. The now-unneeded /users/* proxy is reverted. Also harden the WS re-auth: the backend denies WS auth before accepting the handshake, so browsers report expiry as close code 1006 (not 1008/4001) — fire the probe on any non-normal (!= 1000) close instead of specific codes; the probe self-filters so transient closes still reconnect. Addresses codex P1/P2 and internal-review Major.
) Cross-family review (codex round 2) found a regression: resetting retriesRef in useEventSource.onopen meant an endpoint that accepts then immediately closes without sending a message refunded its retry budget every cycle, so maxRetries was never reached — an infinite reconnect loop. Restore the original contract (reset the retry counter only on a received message, not on open); the auth-probe guard re-arms on the same message signal. Add a regression test proving the budget is exhausted even when each connection opens before closing.
|
Warning Review limit reached
More reviews will be available in 1 hour, 5 minutes, and 25 seconds. Learn how PR review limits work. Your organization has reached its usage spending cap. Adjust your spending cap in the billing tab. ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (14)
✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
|
Code Review for PR 690: proactive auth guard + SSE/WS token-expiry re-auth path This is a solid, well-scoped implementation. The two-probe design (shared intercepted client for stream failures, bare axios for route guard) is the right architecture for the problem. What is working well:
Issues and observations: Issue 1 - WS hooks can probe on every retry cycle (minor): useAgentChat and useTerminalSocket fire the probe on every non-normal close. With MAX_RETRIES=5 a repeated 1006 loop emits up to 5 probes before giving up. Functionally harmless (probe self-filters) but inconsistent with the SSE hook explicit once-per-sequence guard. Consider adding an authProbeFiredRef analog to the WS hooks, resetting on successful open. Low priority if intentional. Issue 2 - WS_NORMAL_CLOSURE_CODE = 1000 duplicated: defined independently in useAgentChat.ts and useTerminalSocket.ts. A shared constant would be one source of truth. Low-priority nit given the no-premature-abstractions rule in CLAUDE.md. Issue 3 - access state has no explicit denied/redirecting value: when checkAuthAccess returns denied, the component calls router.replace('/login') but access stays pending forever. UX is correct (loader shown during redirect), but the implicit terminal state could surprise a future reader. Consider a redirecting variant to make the state machine explicit. Not blocking. Issue 4 - Verify getToken is in scope in checkAuthAccess (confirm only): checkAuthAccess calls getToken() but the diff starts at line 165, so the earlier definition is not visible. This should already be a local helper in api.ts used by createApiClient. Just confirm it is not accidentally calling something undefined; if getToken only lives in auth.ts an explicit import is needed. Issue 5 - API_ORIGIN evaluated at module load: module-level, inlined at Next.js build time. Fine for this use case, consistent with the existing pattern. Summary: The core cases are handled correctly -- no shell flicker for unauthenticated visitors, auth-disabled mode preserved, and a re-auth path wired into all three stream types. Test coverage is thorough. Issue 1 (WS probe frequency) and Issue 3 (explicit denied state) are most worth considering before merge; the rest are confirmation or nit. Approve conditional on confirming Issue 4. |
Review triage (claude[bot])Thanks — triaged, no blocking findings. Summary:
CodeRabbit was rate-limited and posted no detailed findings. Demo (Phase 11): all three acceptance criteria verified with outcome evidence against a live stack (FastAPI :8000 + Next :3001 + headless browser):
|
Follow-up review (after triage)All five issues from the first pass were resolved or explicitly accepted — no prior findings remain open. Prior items — confirmed closed:
New observation (trivial, non-blocking): In Implementation quality: Nothing blocking. Ready to merge. |
|
Follow-up review (continuing from prior [claude] comment) Issue 4 (getToken in checkAuthAccess) — CONFIRMED CLEAR. Open items from prior review — still unresolved (no new commits observed):
Overall: The implementation is clean and the core auth guard architecture is sound. Issues 1–3 are all minor or stylistic. No correctness bugs found. This looks ready to merge when the author is comfortable with the open items. |
Summary
Implements #651: the web UI had no proactive auth guard — the only gate was a reactive axios 401 interceptor, so an unauthenticated visitor rendered the full shell before flickering to
/login, and SSE/WebSocket token expiry (which auth via?token=) died silently with no path back to re-auth.AppLayout(client-side, because the JWT lives inlocalStorageand a server/edgemiddleware.tscan't read it). A visitor with no token only ever sees a neutral loader — the sidebar/shell never renders for them, so there's no shell flicker./loginalready-authenticated redirect →/./api/v2/settings/keys(arequire_auth-gated endpoint that respectsCODEFRAME_AUTH_REQUIRED). On genuine expiry the existing 401 interceptor clears the token and redirects to/login; transient failures don't falsely bounce the user. Wired intouseEventSource(SSE; covers task + stress-test streams),useAgentChat(WS), anduseTerminalSocket(WS).CODEFRAME_AUTH_REQUIRED=falsethe probe returns 200 without a token, so token-less local-dev access still works (the guard usescheckAuthAccess(), a bare-axios probe that returns allow/deny/fail-open without triggering the global redirect).Acceptance Criteria
/loginwithout shell flicker./loginis redirected to/.Test Plan
npm run buildsucceeds (TypeScript clean)Known Limitations / Intentionally Deferred
accept(), so browsers report expiry as close code1006(abnormal), not the server's1008/4001. The hooks therefore fire the re-auth probe on any non-normal close (!= 1000); the probe self-filters (only a genuine 401 redirects), so transient closes still reconnect. The server-side close codes are left unchanged (out of scope).Implementation Notes
/api/v2/settings/keysrather than fastapi-users/users/mespecifically because the latter 401s even when auth is disabled; the former honorsrequire_auth/CODEFRAME_AUTH_REQUIRED.Closes #651