RUBY-3770 Implement makeTimeoutError semantics in withTransaction#3006
Conversation
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Mongo::Error::PoolTimeout does not exist; replace with StandardError which covers all connection-related errors (SocketError, pool checkout timeouts, NoServerAvailable, etc.).
Per DRIVERS-3391 / transactions-convenient-api spec, withTransaction must propagate a TimeoutError (wrapping the last transient error as .cause) when the CSOT deadline is exhausted, instead of re-raising the raw error. Changes to lib/mongo/session.rb: - Callback raises TransientTransactionError + deadline expired: raise TimeoutError in CSOT mode, re-raise original in non-CSOT mode. - Commit raises UnknownTransactionCommitResult + deadline expired: same. - Commit raises TransientTransactionError + deadline expired: same. - Backoff (regular and overload) would exceed deadline: use new make_timeout_error_from helper — CSOT wraps last_error with cause, non-CSOT raises last_error directly (fixes incorrect TimeoutError that was raised in non-CSOT mode before this change). - Commit overload backoff would exceed deadline: raise TimeoutError in CSOT mode, re-raise in non-CSOT mode. - Add make_timeout_error_from private helper implementing the makeTimeoutError(lastError) pseudocode from the spec. Uses an inner begin/rescue to wire .cause when called outside a rescue block. New spec/mongo/session/with_transaction_timeout_spec.rb covers all eight code paths as unit tests with stubbed time control (no real server needed).
The test expected TimeoutError from a non-CSOT with_transaction call. Per the spec, non-CSOT mode re-raises the original error directly. Add timeout_ms: 5000 to enable CSOT so the test matches its intent.
There was a problem hiding this comment.
Pull request overview
Implements transactions-convenient-api makeTimeoutError semantics for Mongo::Session#with_transaction, and updates/extends specs to cover CSOT vs non-CSOT timeout behavior and several SDAM/CMAP-related scenarios.
Changes:
- Update
with_transactionto wrap deadline-expiration cases into aMongo::Error::TimeoutErroronly when CSOT is enabled; otherwise re-raise the last error. - Add new prose-style specs for retry-timeout/backoff-deadline enforcement and add/adjust unified SDAM minPoolSize tests.
- Add an integration prose test for connection pool backpressure, and adjust a couple of existing specs for stability.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| lib/mongo/session.rb | Adds make_timeout_error_from and uses it at key timeout decision points in with_transaction. |
| spec/mongo/session/with_transaction_timeout_spec.rb | New prose tests verifying timeout wrapping vs re-raise behavior (CSOT vs non-CSOT). |
| spec/mongo/session_transaction_spec.rb | Adjusts an existing timeout-related transaction spec to run with CSOT enabled. |
| spec/spec_tests/data/sdam_unified/pool-clear-min-pool-size-error.yml | Adds new unified test coverage around pool-clearing behavior during minPoolSize population errors. |
| spec/spec_tests/data/sdam_unified/minPoolSize-error.yml | Adjusts an existing unified test to use a server error code and polling monitoring mode. |
| spec/mongo/retryable/token_bucket_spec.rb | Makes the thread-safety assertion deterministic by avoiding floor/ceiling effects. |
| spec/integration/secondary_reads_spec.rb | Loosens read-counter tolerances in secondary read routing assertions. |
| spec/integration/sdam_prose_spec.rb | Adds an integration test exercising checkout failures under ingress connection rate limiting. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| # In non-CSOT mode re-raises last_error directly. | ||
| def make_timeout_error_from(last_error, timeout_message) | ||
| if @with_transaction_timeout_ms | ||
| raise Mongo::Error::TimeoutError, "#{timeout_message}: #{last_error}" |
There was a problem hiding this comment.
The helper creates a new TimeoutError but doesn’t explicitly attach the underlying error as the exception cause. In some call sites (e.g., the retry backoff deadline checks) this method is invoked outside of a rescue, so Ruby won’t auto-populate TimeoutError#cause, and we lose the spec-required ability to surface the underlying error. Consider raising the TimeoutError with cause: last_error (or constructing the exception and passing cause:) so callers can inspect the original failure via Exception#cause.
| raise Mongo::Error::TimeoutError, "#{timeout_message}: #{last_error}" | |
| raise Mongo::Error::TimeoutError.new("#{timeout_message}: #{last_error}"), cause: last_error |
| end_stats[:secondary].should be_within(50).of(start_stats[:secondary]) | ||
| end_stats[:primary].should >= start_stats[:primary] + 30 |
There was a problem hiding this comment.
Widening the tolerance to 50 makes this test non-discriminating given it only performs 30 reads: if reads are accidentally routed to the wrong server, the “wrong” counter can still increase by ~30 and remain within 50 of the starting value, causing a false pass. To address flakiness without losing signal, consider increasing the number of reads (so expected delta dwarfs noise) and/or tightening the bound or asserting on deltas between start/end counters.
| end_stats[:primary].should be_within(50).of(start_stats[:primary]) | ||
| end_stats[:secondary].should >= start_stats[:secondary] + 30 |
There was a problem hiding this comment.
With only 30 operations, allowing the primary counter to vary by up to 50 can also let a complete misrouting (primary increasing by ~30 during secondary reads) pass as “within 50”, weakening the regression check. Consider increasing the number of reads and/or using a tighter/noise-aware assertion (e.g., compare deltas rather than absolute within(50)).
| end | ||
|
|
||
| describe 'Connection Pool Backpressure' do | ||
| min_server_fcv '8.2' |
There was a problem hiding this comment.
min_server_fcv '8.2' may be stricter than necessary for this test: MongoDB server docs indicate the ingressConnectionEstablishmentRateLimiter* parameters are also present in 8.1.1 and 8.0.12. If the intent is to exercise the feature wherever it exists, consider gating on a lower FCV/version (or probing for the parameter via getParameter and skipping if missing) so the test runs on more CI configurations.
| min_server_fcv '8.2' | |
| min_server_fcv '8.0' |
No description provided.