Skip to content

RUBY-3770 Implement makeTimeoutError semantics in withTransaction#3006

Merged
comandeo-mongo merged 8 commits intomongodb:masterfrom
comandeo-mongo:3770-backoff-csot-with-transaction
Mar 27, 2026
Merged

RUBY-3770 Implement makeTimeoutError semantics in withTransaction#3006
comandeo-mongo merged 8 commits intomongodb:masterfrom
comandeo-mongo:3770-backoff-csot-with-transaction

Conversation

@comandeo-mongo
Copy link
Copy Markdown
Contributor

@comandeo-mongo comandeo-mongo commented Mar 27, 2026

No description provided.

comandeo-mongo and others added 8 commits March 26, 2026 15:38
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Mongo::Error::PoolTimeout does not exist; replace with StandardError
which covers all connection-related errors (SocketError, pool checkout
timeouts, NoServerAvailable, etc.).
Per DRIVERS-3391 / transactions-convenient-api spec, withTransaction must
propagate a TimeoutError (wrapping the last transient error as .cause) when
the CSOT deadline is exhausted, instead of re-raising the raw error.

Changes to lib/mongo/session.rb:
- Callback raises TransientTransactionError + deadline expired: raise
  TimeoutError in CSOT mode, re-raise original in non-CSOT mode.
- Commit raises UnknownTransactionCommitResult + deadline expired: same.
- Commit raises TransientTransactionError + deadline expired: same.
- Backoff (regular and overload) would exceed deadline: use new
  make_timeout_error_from helper — CSOT wraps last_error with cause,
  non-CSOT raises last_error directly (fixes incorrect TimeoutError
  that was raised in non-CSOT mode before this change).
- Commit overload backoff would exceed deadline: raise TimeoutError in
  CSOT mode, re-raise in non-CSOT mode.
- Add make_timeout_error_from private helper implementing the
  makeTimeoutError(lastError) pseudocode from the spec. Uses an inner
  begin/rescue to wire .cause when called outside a rescue block.

New spec/mongo/session/with_transaction_timeout_spec.rb covers all eight
code paths as unit tests with stubbed time control (no real server needed).
The test expected TimeoutError from a non-CSOT with_transaction call.
Per the spec, non-CSOT mode re-raises the original error directly.
Add timeout_ms: 5000 to enable CSOT so the test matches its intent.
@comandeo-mongo comandeo-mongo marked this pull request as ready for review March 27, 2026 12:35
@comandeo-mongo comandeo-mongo requested a review from a team as a code owner March 27, 2026 12:35
@comandeo-mongo comandeo-mongo requested review from Copilot and jamis March 27, 2026 12:35
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Implements transactions-convenient-api makeTimeoutError semantics for Mongo::Session#with_transaction, and updates/extends specs to cover CSOT vs non-CSOT timeout behavior and several SDAM/CMAP-related scenarios.

Changes:

  • Update with_transaction to wrap deadline-expiration cases into a Mongo::Error::TimeoutError only when CSOT is enabled; otherwise re-raise the last error.
  • Add new prose-style specs for retry-timeout/backoff-deadline enforcement and add/adjust unified SDAM minPoolSize tests.
  • Add an integration prose test for connection pool backpressure, and adjust a couple of existing specs for stability.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
lib/mongo/session.rb Adds make_timeout_error_from and uses it at key timeout decision points in with_transaction.
spec/mongo/session/with_transaction_timeout_spec.rb New prose tests verifying timeout wrapping vs re-raise behavior (CSOT vs non-CSOT).
spec/mongo/session_transaction_spec.rb Adjusts an existing timeout-related transaction spec to run with CSOT enabled.
spec/spec_tests/data/sdam_unified/pool-clear-min-pool-size-error.yml Adds new unified test coverage around pool-clearing behavior during minPoolSize population errors.
spec/spec_tests/data/sdam_unified/minPoolSize-error.yml Adjusts an existing unified test to use a server error code and polling monitoring mode.
spec/mongo/retryable/token_bucket_spec.rb Makes the thread-safety assertion deterministic by avoiding floor/ceiling effects.
spec/integration/secondary_reads_spec.rb Loosens read-counter tolerances in secondary read routing assertions.
spec/integration/sdam_prose_spec.rb Adds an integration test exercising checkout failures under ingress connection rate limiting.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

# In non-CSOT mode re-raises last_error directly.
def make_timeout_error_from(last_error, timeout_message)
if @with_transaction_timeout_ms
raise Mongo::Error::TimeoutError, "#{timeout_message}: #{last_error}"
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The helper creates a new TimeoutError but doesn’t explicitly attach the underlying error as the exception cause. In some call sites (e.g., the retry backoff deadline checks) this method is invoked outside of a rescue, so Ruby won’t auto-populate TimeoutError#cause, and we lose the spec-required ability to surface the underlying error. Consider raising the TimeoutError with cause: last_error (or constructing the exception and passing cause:) so callers can inspect the original failure via Exception#cause.

Suggested change
raise Mongo::Error::TimeoutError, "#{timeout_message}: #{last_error}"
raise Mongo::Error::TimeoutError.new("#{timeout_message}: #{last_error}"), cause: last_error

Copilot uses AI. Check for mistakes.
Comment on lines +31 to 32
end_stats[:secondary].should be_within(50).of(start_stats[:secondary])
end_stats[:primary].should >= start_stats[:primary] + 30
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Widening the tolerance to 50 makes this test non-discriminating given it only performs 30 reads: if reads are accidentally routed to the wrong server, the “wrong” counter can still increase by ~30 and remain within 50 of the starting value, causing a false pass. To address flakiness without losing signal, consider increasing the number of reads (so expected delta dwarfs noise) and/or tightening the bound or asserting on deltas between start/end counters.

Copilot uses AI. Check for mistakes.
Comment on lines +53 to 54
end_stats[:primary].should be_within(50).of(start_stats[:primary])
end_stats[:secondary].should >= start_stats[:secondary] + 30
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With only 30 operations, allowing the primary counter to vary by up to 50 can also let a complete misrouting (primary increasing by ~30 during secondary reads) pass as “within 50”, weakening the regression check. Consider increasing the number of reads and/or using a tighter/noise-aware assertion (e.g., compare deltas rather than absolute within(50)).

Copilot uses AI. Check for mistakes.
end

describe 'Connection Pool Backpressure' do
min_server_fcv '8.2'
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

min_server_fcv '8.2' may be stricter than necessary for this test: MongoDB server docs indicate the ingressConnectionEstablishmentRateLimiter* parameters are also present in 8.1.1 and 8.0.12. If the intent is to exercise the feature wherever it exists, consider gating on a lower FCV/version (or probing for the parameter via getParameter and skipping if missing) so the test runs on more CI configurations.

Suggested change
min_server_fcv '8.2'
min_server_fcv '8.0'

Copilot uses AI. Check for mistakes.
@comandeo-mongo comandeo-mongo merged commit 40ce1b6 into mongodb:master Mar 27, 2026
197 checks passed
@comandeo-mongo comandeo-mongo deleted the 3770-backoff-csot-with-transaction branch March 27, 2026 16:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants