CDRIVER-6092 CDRIVER-6262 CDRIVER-6268 implement exponential backoff and jitter in retry loops #2240
Merged
connorsmacd merged 60 commits into mongodb:master on Mar 10, 2026
kevinAlbs
reviewed
Mar 6, 2026
Collaborator
kevinAlbs
left a comment
Sorry for the goofs in the suggested test. Posting drafted feedback since I expect the test failures might be a source of confusion. Will continue reviewing.
kevinAlbs
reviewed
Mar 6, 2026
Collaborator
kevinAlbs
left a comment
Only substantial remaining comment is suggestion to move token_bucket to the mongoc_topology_t.
kevinAlbs
reviewed
Mar 9, 2026
And test aggregate with write stage
Reference
CDRIVER-6092
CDRIVER-6262
CDRIVER-6268
CDRIVER-6092 is the primary task. CDRIVER-6262 and CDRIVER-6268 fix/augment some of the unified tests introduced in this PR.
Summary
This PR implements the exponential retry backoff algorithm defined by DRIVERS-3239.
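For readers unfamiliar with the technique, the following is a minimal sketch of exponential backoff with jitter. It is not the code from this PR: the type and function names are hypothetical, the "full jitter" form (a random fraction of a capped exponential ceiling) is an assumption, and the parameter values are placeholders rather than values taken from the spec.

```c
#include <assert.h>

/* Hypothetical parameter set; real values are defined by the relevant spec. */
typedef struct {
   double initial_ms;    /* backoff ceiling for the first retry */
   double max_ms;        /* cap on the backoff ceiling */
   double growth_factor; /* multiplier applied per additional retry, > 1 */
} backoff_params_t;

/* Ceiling of the backoff window for a 0-based retry index:
 * min(max_ms, initial_ms * growth_factor^retry). */
static double
backoff_ceiling_ms (const backoff_params_t *p, unsigned retry)
{
   assert (p->initial_ms >= 0.0 && p->growth_factor > 1.0);
   double d = p->initial_ms;
   /* Multiply step by step and stop once the cap is reached, so a large
    * retry index cannot overflow the double. */
   for (unsigned i = 0; i < retry && d < p->max_ms; i++) {
      d *= p->growth_factor;
   }
   return d < p->max_ms ? d : p->max_ms;
}

/* Assumed "full jitter" form: scale the ceiling by a factor in [0, 1).
 * The caller supplies the factor, which keeps this function deterministic. */
static double
backoff_with_jitter_ms (const backoff_params_t *p, unsigned retry, double jitter)
{
   assert (jitter >= 0.0 && jitter < 1.0);
   return jitter * backoff_ceiling_ms (p, retry);
}
```

With placeholder parameters of 100 ms initial, 1000 ms maximum, and a growth factor of 2, the ceilings for retries 0 through 4 are 100, 200, 400, 800, and 1000 ms. A caller could derive the jitter factor from any uniform random source, for example `rand () / (RAND_MAX + 1.0)` from `<stdlib.h>`.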
Refactor existing retry "gotos" into loops
The existing implementation of retryable reads and writes uses a simple goto label for iteration. To better align with the client backpressure implementation, I decided traditional loops would be more suitable. Therefore, I chose to first refactor the existing implementation of retryable reads and writes to use loops without making any backpressure-related changes. These changes are found in the following commits:
Token bucket (mongoc_token_bucket_t)
Client backpressure uses a token bucket as a simple rate limiting mechanism. The token bucket is only used when adaptiveRetries=True is specified as a URI option. Note that the C implementation of the token bucket closely matches the pseudocode found in the spec.
Common exponential retry backoff implementation for both retryable commands and withTransaction
#2198 introduced a very similar exponential retry backoff algorithm for withTransaction. Both algorithms use the same math to compute durations. However, retryable commands use a different growth factor, initial backoff duration, and maximum backoff duration.
For the purposes of code reuse, I introduced mongoc_retry_backoff_generator_t (renaming suggestions are welcome). This component encapsulates retry iteration to model a generator-like interface. Instead of using hardcoded values for the maximum retry attempt, mongoc_retry_backoff_generator_t computes these values programmatically. See: Refactor with_transaction to use backoff generator.
Note
I was initially considering putting these changes into a separate PR that only affected withTransaction, but I felt it was hard to contextualize the changes without seeing the retryable command algorithm as well.
First pass implementation
Before attempting to introduce a more generic retryable command interface, I first implemented exponential retry backoff in each existing retry loop. The first passes can be found in the following commits:
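As a rough illustration of what a loop-based retry with backoff can look like, here is a minimal sketch. It is not taken from the commits in this PR: every name is hypothetical, jitter is omitted so the delays are deterministic, the growth factor of 2 is a placeholder, and the token bucket policy (one token per retry, one token returned on success) is an assumption rather than the spec's values. The bucket pointer is NULL when adaptive retries are not enabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical token bucket: a capped counter of retry credits. */
typedef struct {
   double tokens;
   double capacity;
} token_bucket_t;

static bool
token_bucket_consume (token_bucket_t *b, double n)
{
   if (b->tokens < n) {
      return false; /* not enough credit: the caller should stop retrying */
   }
   b->tokens -= n;
   return true;
}

static void
token_bucket_deposit (token_bucket_t *b, double n)
{
   b->tokens += n;
   if (b->tokens > b->capacity) {
      b->tokens = b->capacity;
   }
}

/* Hypothetical callbacks standing in for command execution and sleeping. */
typedef bool (*attempt_fn) (void *ctx); /* returns true on success */
typedef void (*sleep_fn) (double ms, void *ctx);

/* Runs the attempt once, then retries up to max_retries times, sleeping with
 * a doubling (capped) backoff between attempts. */
static bool
run_with_retries (attempt_fn attempt, sleep_fn sleep_ms, void *ctx,
                  unsigned max_retries, double initial_ms, double max_ms,
                  token_bucket_t *bucket /* NULL: adaptive retries off */)
{
   double backoff_ms = initial_ms;
   for (unsigned retry = 0;; retry++) {
      if (attempt (ctx)) {
         if (bucket) {
            token_bucket_deposit (bucket, 1.0); /* assumed return policy */
         }
         return true;
      }
      if (retry == max_retries) {
         return false; /* retry budget exhausted */
      }
      if (bucket && !token_bucket_consume (bucket, 1.0)) {
         return false; /* rate limited: give up instead of retrying */
      }
      sleep_ms (backoff_ms, ctx);
      backoff_ms = backoff_ms * 2.0 < max_ms ? backoff_ms * 2.0 : max_ms;
   }
}

/* Demo helpers: an operation that fails a fixed number of times, and a
 * "sleep" that only records the requested delays. */
typedef struct {
   int failures_left;
   int attempts;
   double total_sleep_ms;
} demo_ctx_t;

static bool
demo_attempt (void *ctx)
{
   demo_ctx_t *d = (demo_ctx_t *) ctx;
   d->attempts++;
   if (d->failures_left > 0) {
      d->failures_left--;
      return false;
   }
   return true;
}

static void
demo_sleep (double ms, void *ctx)
{
   ((demo_ctx_t *) ctx)->total_sleep_ms += ms;
}
```

With the demo helpers, an operation that fails twice and then succeeds is attempted three times and accumulates 100 ms + 200 ms of recorded backoff when called with an initial backoff of 100 ms and a cap of 1000 ms. Passing the sleep function as a callback is a design choice of this sketch that keeps the loop testable without real delays.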
Important
The first-pass implementations above were based on a previous version of the spec where the token bucket was used unconditionally. With the most recent version of the spec, the token bucket is only used if the adaptiveRetries=True URI option is passed.
Generic retryable command interface
To increase code reuse and better align with the spec, I refactored the implementation described above to use a generic retry loop component called mongoc_retryable_cmd_t. The function mongoc_retryable_cmd_run models, as closely as possible, the pseudocode found in the spec. The retry loops differ from each other in how the command is executed and how the retry server is selected. To account for this, mongoc_retryable_cmd_t has a v-table-like construct that allows each implementation to customize these steps of the algorithm.
Tip
For reviewers, it may be helpful to compare the commits from the first pass implementation to understand how the various parts of the loops were broken up to use mongoc_retryable_cmd_t.
Other changes
ASSERT_CMPDURATION, which was introduced in CDRIVER-6084 CDRIVER-6189 CDRIVER-6206 implement withTransaction exponential retry backoff #2198: