Serialize OAuth token refresh per-account to prevent RTR logout. by brandonpage · Pull Request #2931 · forcedotcom/SalesforceMobileSDK-Android

brandonpage · 2026-06-17T00:18:55Z

No description provided.

wmathurin · 2026-06-17T02:28:11Z

+            // election so that a no-match path (e.g. account removed during refresh) never
+            // marks a RefreshState as refreshing — preserving the no deadlock fix and
+            // logout-during-refresh semantics.
+            final UserAccountManager userAccountManager = SalesforceSDKManager.getInstance().getUserAccountManager();


Should we add a helper function on UserAccountManager to return the matching user given a refresh token?

I considered that but it won't be needed when we make the API change to ClientManager's constructor to associate it with a UserAccount from the start.

wmathurin

The core concurrency model is solid — the winner/loser election, spurious-wakeup loop, single finally publish path, and live-token read in refreshStaleToken are all correct. A few issues below worth addressing before merging.

wmathurin · 2026-06-17T02:32:25Z

+            }
+            if (parked >= count) return
+            Thread.yield()
+        }


Thread.yield() in a tight deadline loop burns CPU and can be flaky on loaded CI machines where yield starvation delays state transitions. Thread.sleep(1) per iteration is cheaper and less sensitive to scheduler pressure.

I updated to Thread.sleep(50) as I don't think 1ms is even feasible w/ schedule overhead.

brandonpage · 2026-06-17T22:28:50Z

+    public synchronized void refreshAccessToken() throws IOException {
+        oAuthRefreshInterceptor.refreshAccessToken();
+    }


@wmathurin @JohnsonEricAtSalesforce The is simple but important new API.

wmathurin

Great fix for the RTR logout bug overall.
I kept digging with Claude to see if having the two separate synchronized(state.lock) could lead to unwanted edge cases.
It came up with one. It's very unlikely to happen but the fix makes sense, so we should probably adopt it. Below are the details.

Consecutive-cycle race: a loser can miss a result that was already published

The gap

Between Block 1 (election) and Block 2 (publish), the lock is intentionally released so the network call doesn't hold it. refreshing=true is the "slot taken" flag during that gap. This is correct.

The issue arises at the transition between two consecutive refresh cycles. After Winner A publishes in Block 2:

A: refreshing=false, newAuthToken=published, notifyAll(), releases lock
   ← RACE: Loser X (just woken) and Winner B (arriving fresh) both compete for state.lock →
B: wins lock in Block 1, sees refreshing=false, sets refreshing=true, newAuthToken=null
X: acquires lock, sees refreshing=true (Winner B!), parks again with whatever deadline time remains

Winner B clears state.newAuthToken=null as part of becoming winner — correctly, to avoid re-publishing a stale result next cycle. But this means Loser X, even if it was nanoseconds behind, finds nothing to adopt and is now on the hook for Winner B's full network round-trip, counted against a deadline that started ticking during Winner A's cycle.

If Winner A consumed most of the 30s budget and Winner B's call takes longer than what remains, Loser X times out and returns null — even though Winner A's refresh succeeded.

Proposed fix

Add a freshness check inside Block 1, before committing to a new winner election:

synchronized (state.lock) {
    if (state.refreshing) {
        // existing loser path — unchanged
    }
    // If a refresh completed very recently, adopt its result instead of starting a new cycle.
    // This closes the consecutive-cycle race where a loser woken by notifyAll() loses the
    // lock race to a new winner, which clears the just-published result before the loser
    // can read it.
    if (state.newAuthToken != null &&
            System.currentTimeMillis() - state.lastRefreshTime < RECENT_REFRESH_THRESHOLD_MILLIS) {
        adoptWinnerResult(state);
        return state.newAuthToken;
    }
    // Become winner — unchanged
    state.refreshing = true;
    state.newAuthToken = null;
    ...
}

RECENT_REFRESH_THRESHOLD_MILLIS can be small — a few seconds is enough, since the race window is the time between notifyAll() and a loser re-acquiring the lock, which is microseconds in practice. The threshold just needs to be larger than that window.

This makes state.newAuthToken serve dual purpose: election result for waiting losers, and short-lived cache for threads that arrive just after a cycle completes. Winner B becomes a no-op adopter rather than starting a redundant network call, and Loser X's deadline is no longer at risk from a cycle it had no connection to.

…essary refresh to be requested.

brandonpage · 2026-06-22T23:20:46Z

Great fix for the RTR logout bug overall. I kept digging with Claude to see if having the two separate synchronized(state.lock) could lead to unwanted edge cases. It came up with one. It's very unlikely to happen but the fix makes sense, so we should probably adopt it. Below are the details.

@wmathurin Great find. I originally had a different approach to fix this (adding a generational counter) that I think better guarantees correctness in the worst case, but was definitely less elegant in some situations. I was able to combine the best of both into this fix.

Serialize OAuth token refresh per-account to prevent RTR logout.

482f32c

wmathurin reviewed Jun 17, 2026

View reviewed changes

Add public API that allows apps to proactively refresh the access token.

f10e639

brandonpage marked this pull request as ready for review June 17, 2026 22:26

brandonpage commented Jun 17, 2026

View reviewed changes

brandonpage requested a review from wmathurin June 19, 2026 16:19

wmathurin reviewed Jun 22, 2026

View reviewed changes

Fix a rare timing issue with RTR that could cause an additional unnec…

34f0be0

…essary refresh to be requested.

wmathurin approved these changes Jun 22, 2026

View reviewed changes

brandonpage merged commit 25b42de into forcedotcom:dev Jun 23, 2026
5 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Serialize OAuth token refresh per-account to prevent RTR logout.#2931

Serialize OAuth token refresh per-account to prevent RTR logout.#2931
brandonpage merged 3 commits into
forcedotcom:devfrom
brandonpage:fix-concurrent-rtr-logout

brandonpage commented Jun 17, 2026

Uh oh!

wmathurin Jun 17, 2026

Uh oh!

brandonpage Jun 17, 2026 •

edited

Loading

Uh oh!

wmathurin left a comment

Uh oh!

Uh oh!

Uh oh!

wmathurin Jun 17, 2026

Uh oh!

brandonpage Jun 17, 2026

Uh oh!

brandonpage Jun 17, 2026

Uh oh!

wmathurin left a comment

Uh oh!

brandonpage commented Jun 22, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

brandonpage commented Jun 17, 2026

Uh oh!

wmathurin Jun 17, 2026

Choose a reason for hiding this comment

Uh oh!

brandonpage Jun 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

wmathurin left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

wmathurin Jun 17, 2026

Choose a reason for hiding this comment

Uh oh!

brandonpage Jun 17, 2026

Choose a reason for hiding this comment

Uh oh!

brandonpage Jun 17, 2026

Choose a reason for hiding this comment

Uh oh!

wmathurin left a comment

Choose a reason for hiding this comment

Uh oh!

brandonpage commented Jun 22, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

brandonpage Jun 17, 2026 •

edited

Loading