Skip to content

fix(gotrue): prevent stale token refresh from overwriting concurrent session changes#1351

Open
brunovsiqueira wants to merge 2 commits intosupabase:mainfrom
brunovsiqueira:fix/gotrue-token-refresh-races
Open

fix(gotrue): prevent stale token refresh from overwriting concurrent session changes#1351
brunovsiqueira wants to merge 2 commits intosupabase:mainfrom
brunovsiqueira:fix/gotrue-token-refresh-races

Conversation

@brunovsiqueira
Copy link
Copy Markdown
Contributor

@brunovsiqueira brunovsiqueira commented Apr 13, 2026

Summary

Fixes race conditions in GoTrueClient._callRefreshToken() where an in-flight token refresh could overwrite a newer session established by a concurrent signIn, signOut, or setSession call.

  • Session version guard: _saveSession / _removeSession now bump a monotonic _sessionVersion counter. After the refresh network round-trip completes, _executeRefresh checks whether the version changed — if so, the stale result is discarded instead of overwriting the newer session.
  • Token-aware dedup: the old Completer pattern ignored the refreshToken parameter — _callRefreshToken(tokenB) would silently return tokenA's in-flight future. Now, same-token calls de-duplicate correctly while different-token calls are serialized via a Future chain (same pattern as the lifecycle reconnection in fix(supabase_flutter): simplify lifecycle reconnection with serial Future chain #1340).
  • TOCTOU fix in _autoRefreshTokenTick: reads _currentSession into a local variable once so that refreshToken and expiresAt come from the same snapshot.
  • Defensive error handling: a failed refresh no longer calls _removeSession() + emits signedOut if the session was replaced by a concurrent operation while the refresh was in-flight. All completer.complete/completeError calls are guarded with !completer.isCompleted for dispose safety.

Race conditions fixed

Scenario Before After
signOut() during in-flight refresh Refresh completes → restores session, undoing signOut Stale result discarded; session stays null
signInWithPassword() during auto-refresh Old refresh overwrites new signIn session Version mismatch detected; new session preserved
Two concurrent refreshes with different tokens Second caller gets first token's future (wrong result) Second refresh queued and executes after first completes
Refresh fails while new session was set signedOut event emitted, new session removed Version guard prevents removal; new session preserved

Related issues

Test plan

  • 7 new tests in packages/gotrue/test/src/token_refresh_race_test.dart covering all race conditions above
  • All existing gotrue unit tests pass
  • All supabase_flutter tests pass
  • dart analyze clean (only pre-existing deprecation infos)

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR addresses race conditions in packages/gotrue token refresh flow where an in-flight refresh could overwrite a newer session established concurrently (e.g., via signIn, signOut, or setSession), and improves refresh de-duplication behavior.

Changes:

  • Add a monotonic session version counter to detect and discard stale refresh results.
  • Replace the previous global refresh completer with a same-token de-dup + different-token serialization mechanism.
  • Add a new test suite covering common token refresh race scenarios.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

File Description
packages/gotrue/lib/src/gotrue_client.dart Introduces session version guarding and a serialized refresh queue with same-token de-duplication.
packages/gotrue/test/src/token_refresh_race_test.dart Adds unit tests to validate refresh concurrency behavior and stale-result handling.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

// They should have executed sequentially: A before B.
expect(completionOrder, ['A', 'B']);
});

Copy link

Copilot AI Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new tests don’t cover the case where a different-token refresh is queued while token A is in-flight, and then token A is requested again before the first completes. With the current _activeRefresh implementation this can enqueue a second refresh for token A (defeating de-dup and potentially reintroducing the “Already Used” failure). Adding a test for: start refresh(A) with gate → call refresh(B) (queued) → call refresh(A) again → assert only one /token request for A and that the third call shares the first future would catch this regression.

Suggested change
test(
'repeated token request reuses in-flight refresh '
'while another token is queued', () async {
mockClient.tokenGate = Completer<void>();
final completionOrder = <String>[];
final future1 = client.setSession('token-A').then((r) {
completionOrder.add('A1');
return r;
});
await Future<void>.delayed(Duration.zero);
final future2 = client.setSession('token-B').then((r) {
completionOrder.add('B');
return r;
});
final future3 = client.setSession('token-A').then((r) {
completionOrder.add('A2');
return r;
});
// The second request for token-A must share the first in-flight refresh.
expect(identical(future1, future3), isTrue);
mockClient.tokenGate!.complete();
await Future.wait([future1, future2, future3]);
// Only token-A and token-B should have triggered refresh requests.
expect(mockClient.tokenRequestCount, 2);
// The shared token-A future completes before the queued token-B refresh.
expect(completionOrder, ['A1', 'A2', 'B']);
});

Copilot uses AI. Check for mistakes.
Comment on lines +1403 to +1411
// Register immediately so that a synchronous second call with the same
// token hits the de-dup check above rather than creating a second entry
// in the serial queue.
_activeRefresh = (refreshToken: refreshToken, completer: completer);

// Chain onto the serial queue so different-token refreshes never overlap.
_pendingRefreshOperation = _pendingRefreshOperation
.then((_) => _executeRefresh(refreshToken, completer))
.catchError((_) {});
Copy link

Copilot AI Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_activeRefresh is overwritten immediately for every new refresh request, even when a different-token refresh is only queued behind an in-flight one. This breaks same-token de-duplication in scenarios like: refresh(tokenA) in-flight → refresh(tokenB) queued (sets _activeRefresh=B) → refresh(tokenA) again (no longer de-dups; a second tokenA refresh gets queued/executed). It also means dispose() may only completeError the most recently queued completer, not the truly in-flight one. Consider tracking in-flight vs queued separately (e.g., only set _activeRefresh when _executeRefresh starts, or use a per-token map of pending refresh futures/completers so repeated calls for the same token always share the same future even if another token is queued).

Copilot uses AI. Check for mistakes.
Comment on lines 1296 to 1306
void _saveSession(Session session) {
_sessionVersion++;
_log.finest('Saving session: $session');
_log.fine('Saving session');
_currentSession = session;
}

void _removeSession() {
_sessionVersion++;
_log.fine('Removing session');
_currentSession = null;
Copy link

Copilot AI Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The session-version guard relies on all session mutations going through _saveSession/_removeSession, but this file still has direct _currentSession = ... writes (e.g. updateUser, setInitialSession). Those paths won’t bump _sessionVersion, so a stale refresh could still overwrite a concurrently updated session/user. To make the guard reliable, route all _currentSession mutations through _saveSession/_removeSession (or otherwise ensure they increment _sessionVersion).

Suggested change
void _saveSession(Session session) {
_sessionVersion++;
_log.finest('Saving session: $session');
_log.fine('Saving session');
_currentSession = session;
}
void _removeSession() {
_sessionVersion++;
_log.fine('Removing session');
_currentSession = null;
void _setCurrentSession(Session? session) {
_sessionVersion++;
if (session == null) {
_log.fine('Removing session');
} else {
_log.finest('Saving session: $session');
_log.fine('Saving session');
}
_currentSession = session;
}
/// set currentSession and currentUser
void _saveSession(Session session) {
_setCurrentSession(session);
}
void _removeSession() {
_setCurrentSession(null);

Copilot uses AI. Check for mistakes.
Comment on lines +1461 to 1464
if (!completer.isCompleted) completer.completeError(error);
} catch (error, stack) {
_refreshTokenCompleter?.completeError(error);
if (!completer.isCompleted) completer.completeError(error);
notifyException(error, stack);
Copy link

Copilot AI Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

completeError is called without passing the captured stack/stackTrace. That loses the original stack for callers and makes refresh failures harder to debug. Use the completeError(error, stack) overload in both the on AuthException and generic catch paths.

Copilot uses AI. Check for mistakes.
Comment on lines 1364 to 1375
void dispose() {
_onAuthStateChangeController.close();
_onAuthStateChangeControllerSync.close();
_broadcastChannel?.close();
_broadcastChannelSubscription?.cancel();
final completer = _refreshTokenCompleter;
if (completer != null && !completer.isCompleted) {
completer.completeError(AuthException('Disposed'));
final active = _activeRefresh;
if (active != null && !active.completer.isCompleted) {
active.completer.completeError(AuthException('Disposed'));
}
_activeRefresh = null;
_autoRefreshTicker?.cancel();
}
Copy link

Copilot AI Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dispose() completes the active refresh completer with Disposed, but it does not prevent the in-flight/queued _executeRefresh from later calling _saveSession() / notifyAllSubscribers(). Since dispose() closes the stream controllers, any later notifyAllSubscribers will throw (adding to a closed StreamController) and may also mutate state after disposal. Consider adding a _isDisposed guard that prevents applying refresh results and emitting events once disposed, and ensure any queued refresh completers are completed/cancelled as well.

Copilot uses AI. Check for mistakes.
@Vinzent03
Copy link
Copy Markdown
Collaborator

I'm starting to think if this is the correct solution to fix this. Couldn't we more adopt how auth-js does this? I haven't looked too much into their solution, but they are locking their session, which could be a more generally solution.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Auto Refresh is not working

3 participants