fix(gotrue): prevent stale token refresh from overwriting concurrent session changes#1351
fix(gotrue): prevent stale token refresh from overwriting concurrent session changes#1351brunovsiqueira wants to merge 2 commits intosupabase:mainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR addresses race conditions in packages/gotrue token refresh flow where an in-flight refresh could overwrite a newer session established concurrently (e.g., via signIn, signOut, or setSession), and improves refresh de-duplication behavior.
Changes:
- Add a monotonic session version counter to detect and discard stale refresh results.
- Replace the previous global refresh completer with a same-token de-dup + different-token serialization mechanism.
- Add a new test suite covering common token refresh race scenarios.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| packages/gotrue/lib/src/gotrue_client.dart | Introduces session version guarding and a serialized refresh queue with same-token de-duplication. |
| packages/gotrue/test/src/token_refresh_race_test.dart | Adds unit tests to validate refresh concurrency behavior and stale-result handling. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // They should have executed sequentially: A before B. | ||
| expect(completionOrder, ['A', 'B']); | ||
| }); | ||
|
|
There was a problem hiding this comment.
The new tests don’t cover the case where a different-token refresh is queued while token A is in-flight, and then token A is requested again before the first completes. With the current _activeRefresh implementation this can enqueue a second refresh for token A (defeating de-dup and potentially reintroducing the “Already Used” failure). Adding a test for: start refresh(A) with gate → call refresh(B) (queued) → call refresh(A) again → assert only one /token request for A and that the third call shares the first future would catch this regression.
| test( | |
| 'repeated token request reuses in-flight refresh ' | |
| 'while another token is queued', () async { | |
| mockClient.tokenGate = Completer<void>(); | |
| final completionOrder = <String>[]; | |
| final future1 = client.setSession('token-A').then((r) { | |
| completionOrder.add('A1'); | |
| return r; | |
| }); | |
| await Future<void>.delayed(Duration.zero); | |
| final future2 = client.setSession('token-B').then((r) { | |
| completionOrder.add('B'); | |
| return r; | |
| }); | |
| final future3 = client.setSession('token-A').then((r) { | |
| completionOrder.add('A2'); | |
| return r; | |
| }); | |
| // The second request for token-A must share the first in-flight refresh. | |
| expect(identical(future1, future3), isTrue); | |
| mockClient.tokenGate!.complete(); | |
| await Future.wait([future1, future2, future3]); | |
| // Only token-A and token-B should have triggered refresh requests. | |
| expect(mockClient.tokenRequestCount, 2); | |
| // The shared token-A future completes before the queued token-B refresh. | |
| expect(completionOrder, ['A1', 'A2', 'B']); | |
| }); |
| // Register immediately so that a synchronous second call with the same | ||
| // token hits the de-dup check above rather than creating a second entry | ||
| // in the serial queue. | ||
| _activeRefresh = (refreshToken: refreshToken, completer: completer); | ||
|
|
||
| // Chain onto the serial queue so different-token refreshes never overlap. | ||
| _pendingRefreshOperation = _pendingRefreshOperation | ||
| .then((_) => _executeRefresh(refreshToken, completer)) | ||
| .catchError((_) {}); |
There was a problem hiding this comment.
_activeRefresh is overwritten immediately for every new refresh request, even when a different-token refresh is only queued behind an in-flight one. This breaks same-token de-duplication in scenarios like: refresh(tokenA) in-flight → refresh(tokenB) queued (sets _activeRefresh=B) → refresh(tokenA) again (no longer de-dups; a second tokenA refresh gets queued/executed). It also means dispose() may only completeError the most recently queued completer, not the truly in-flight one. Consider tracking in-flight vs queued separately (e.g., only set _activeRefresh when _executeRefresh starts, or use a per-token map of pending refresh futures/completers so repeated calls for the same token always share the same future even if another token is queued).
| void _saveSession(Session session) { | ||
| _sessionVersion++; | ||
| _log.finest('Saving session: $session'); | ||
| _log.fine('Saving session'); | ||
| _currentSession = session; | ||
| } | ||
|
|
||
| void _removeSession() { | ||
| _sessionVersion++; | ||
| _log.fine('Removing session'); | ||
| _currentSession = null; |
There was a problem hiding this comment.
The session-version guard relies on all session mutations going through _saveSession/_removeSession, but this file still has direct _currentSession = ... writes (e.g. updateUser, setInitialSession). Those paths won’t bump _sessionVersion, so a stale refresh could still overwrite a concurrently updated session/user. To make the guard reliable, route all _currentSession mutations through _saveSession/_removeSession (or otherwise ensure they increment _sessionVersion).
| void _saveSession(Session session) { | |
| _sessionVersion++; | |
| _log.finest('Saving session: $session'); | |
| _log.fine('Saving session'); | |
| _currentSession = session; | |
| } | |
| void _removeSession() { | |
| _sessionVersion++; | |
| _log.fine('Removing session'); | |
| _currentSession = null; | |
| void _setCurrentSession(Session? session) { | |
| _sessionVersion++; | |
| if (session == null) { | |
| _log.fine('Removing session'); | |
| } else { | |
| _log.finest('Saving session: $session'); | |
| _log.fine('Saving session'); | |
| } | |
| _currentSession = session; | |
| } | |
| /// set currentSession and currentUser | |
| void _saveSession(Session session) { | |
| _setCurrentSession(session); | |
| } | |
| void _removeSession() { | |
| _setCurrentSession(null); |
| if (!completer.isCompleted) completer.completeError(error); | ||
| } catch (error, stack) { | ||
| _refreshTokenCompleter?.completeError(error); | ||
| if (!completer.isCompleted) completer.completeError(error); | ||
| notifyException(error, stack); |
There was a problem hiding this comment.
completeError is called without passing the captured stack/stackTrace. That loses the original stack for callers and makes refresh failures harder to debug. Use the completeError(error, stack) overload in both the on AuthException and generic catch paths.
| void dispose() { | ||
| _onAuthStateChangeController.close(); | ||
| _onAuthStateChangeControllerSync.close(); | ||
| _broadcastChannel?.close(); | ||
| _broadcastChannelSubscription?.cancel(); | ||
| final completer = _refreshTokenCompleter; | ||
| if (completer != null && !completer.isCompleted) { | ||
| completer.completeError(AuthException('Disposed')); | ||
| final active = _activeRefresh; | ||
| if (active != null && !active.completer.isCompleted) { | ||
| active.completer.completeError(AuthException('Disposed')); | ||
| } | ||
| _activeRefresh = null; | ||
| _autoRefreshTicker?.cancel(); | ||
| } |
There was a problem hiding this comment.
dispose() completes the active refresh completer with Disposed, but it does not prevent the in-flight/queued _executeRefresh from later calling _saveSession() / notifyAllSubscribers(). Since dispose() closes the stream controllers, any later notifyAllSubscribers will throw (adding to a closed StreamController) and may also mutate state after disposal. Consider adding a _isDisposed guard that prevents applying refresh results and emitting events once disposed, and ensure any queued refresh completers are completed/cancelled as well.
|
I'm starting to think if this is the correct solution to fix this. Couldn't we more adopt how auth-js does this? I haven't looked too much into their solution, but they are locking their session, which could be a more generally solution. |
Summary
Fixes race conditions in
GoTrueClient._callRefreshToken()where an in-flight token refresh could overwrite a newer session established by a concurrentsignIn,signOut, orsetSessioncall._saveSession/_removeSessionnow bump a monotonic_sessionVersioncounter. After the refresh network round-trip completes,_executeRefreshchecks whether the version changed — if so, the stale result is discarded instead of overwriting the newer session.Completerpattern ignored therefreshTokenparameter —_callRefreshToken(tokenB)would silently return tokenA's in-flight future. Now, same-token calls de-duplicate correctly while different-token calls are serialized via a Future chain (same pattern as the lifecycle reconnection in fix(supabase_flutter): simplify lifecycle reconnection with serial Future chain #1340)._autoRefreshTokenTick: reads_currentSessioninto a local variable once so thatrefreshTokenandexpiresAtcome from the same snapshot._removeSession()+ emitssignedOutif the session was replaced by a concurrent operation while the refresh was in-flight. Allcompleter.complete/completeErrorcalls are guarded with!completer.isCompletedfor dispose safety.Race conditions fixed
signOut()during in-flight refreshsignInWithPassword()during auto-refreshsignedOutevent emitted, new session removedRelated issues
Test plan
packages/gotrue/test/src/token_refresh_race_test.dartcovering all race conditions abovedart analyzeclean (only pre-existing deprecation infos)