
Add BatchSize parameter to MongoDB source to prevent cursor timeouts on large collections #234

Open
Copilot wants to merge 6 commits into main from copilot/fix-mongo-cursor-error

Conversation

Contributor

Copilot AI commented Jan 8, 2026

Migrating MongoDB collections with 250k+ documents fails with MongoCursorNotFoundException because cursors time out before all documents are fetched.

Changes

  • MongoSourceSettings: Added optional BatchSize property to control documents per batch
  • MongoRepository: Updated FindAsync to apply BatchSize via FindOptions when specified
  • MongoDataSourceExtension: Modified both filtered and unfiltered collection reads to use FindAsync with batch size support (previously unfiltered reads used AsQueryable() which doesn't support batching). This change from AsQueryable() to FindAsync(empty filter) provides async-correctness improvements (truly async, no thread-pool hop, supports server-side options) that apply even when BatchSize is not configured.
  • Validation: Added a warning log when an invalid BatchSize (<= 0) is specified, to prevent silent misconfiguration
  • Documentation: Added BatchSize parameter details and examples to MongoDB extension README, including clarification that BatchSize mitigates cursor timeouts but does not disable them (as NoCursorTimeout would, though it's not honored on Azure Cosmos DB)
  • Tests: Added comprehensive unit tests (MongoRepositoryTests.cs) verifying that FindOptions.BatchSize is correctly passed to the MongoDB driver

Usage

{
    "ConnectionString": "mongodb://localhost:27017",
    "DatabaseName": "mydb",
    "Collection": "large_collection",
    "BatchSize": 1000
}

If not specified, MongoDB's default batch size applies. Invalid values (<= 0) trigger a warning log and are ignored to prevent migration failures.
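As a sketch of how this setting might be plumbed into the driver (the PR names MongoRepository, FindAsync, and FindOptions, but the exact method shape below is an assumption for illustration):

```csharp
// Sketch, not the repository's exact code: applying an optional BatchSize
// from settings to a full-collection read via FindOptions.
using System.Threading.Tasks;
using MongoDB.Bson;
using MongoDB.Driver;

public static class MongoReadSketch
{
    public static async Task<IAsyncCursor<BsonDocument>> FindAllAsync(
        IMongoCollection<BsonDocument> collection, int? batchSize)
    {
        var options = new FindOptions<BsonDocument>();
        if (batchSize is > 0)
        {
            // Smaller server-side batches mean the cursor's idle timer is
            // reset on every fetch, mitigating the 10-minute cursor timeout.
            options.BatchSize = batchSize;
        }

        // An empty filter reads the whole collection; unlike AsQueryable(),
        // FindAsync is truly async and accepts server-side options.
        return await collection.FindAsync(
            FilterDefinition<BsonDocument>.Empty, options);
    }
}
```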

Validation

Invalid BatchSize values are logged with a warning:

Ignoring invalid BatchSize {Value} for collection '{Collection}'; must be > 0. Cursor timeout issues may occur on large collections.
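A sketch of the guard that could produce this warning (the behavior is described in this PR; the helper's signature here is an assumption):

```csharp
// Sketch of the BatchSize validation guard behind the warning above.
using Microsoft.Extensions.Logging;

public static class BatchSizeValidation
{
    // Returns the batch size to use, or null when the configured value is
    // absent or invalid (<= 0), logging a warning in the invalid case.
    public static int? Validate(int? configured, string collection, ILogger logger)
    {
        if (configured is null)
            return null;  // not configured: use MongoDB's default batch size

        if (configured <= 0)
        {
            logger.LogWarning(
                "Ignoring invalid BatchSize {Value} for collection '{Collection}'; " +
                "must be > 0. Cursor timeout issues may occur on large collections.",
                configured, collection);
            return null;  // ignored rather than failing the migration
        }

        return configured;
    }
}
```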
Original prompt

This section details the original issue you should resolve

<issue_title>MongoCursorNotFoundException: Cursor not found on server using connection.</issue_title>
<issue_description>Hello!

I am performing a data migration from an un-sharded Azure Cosmos MongoDB (source) account to a sharded Azure Cosmos MongoDB account (sink). One of my collections in the source has ~250k documents. When running the dmt tool, I am running into a cursor not found exception when migrating this particular collection:

Data transfer failed
MongoDB.Driver.MongoCursorNotFoundException: Cursor <id> not found on server <fqdn:10255> using connection <connection_id>

The community forum has a few recommendations, as mentioned here. I have tried reducing BatchSize, but to no avail. I would like the ability to disable cursor timeouts. Is this possible with the current implementation of the C# driver?

In addition, I've noticed that this exception is thrown at exactly the 18,000th record (I have reproduced this 3 times). That said, I am not sure it's really due to the 10-minute cursor timeout...

Please advise. Your support is appreciated :)</issue_description>

<agent_instructions>I started working on implementing what I believe is the fix of adding BatchSize to the Source but I never finished. Take a look at this branch: https://github.com/philnach/data-migration-desktop-tool/tree/MongoDBBatchSize for the work that I started. Please add BatchSize to the MongoDB source and make sure documentation and tests are updated accordingly. Also review your code for accuracy.</agent_instructions>

Comments on the Issue (you are @copilot in this section)

@philnach I believe the issue is that the sink doesn't consider the ```BatchSize``` only the source. I'm working on plumbing ```BatchSize``` in (https://github.com/philnach/data-migration-desktop-tool/tree/MongoDBBatchSize).


Copilot AI and others added 3 commits January 8, 2026 20:13
…rrors

Co-authored-by: philnach <19275540+philnach@users.noreply.github.com>
Copilot AI changed the title [WIP] Fix MongoCursorNotFoundException during data migration Add BatchSize parameter to MongoDB source to prevent cursor timeouts on large collections Jan 8, 2026
Copilot AI requested a review from philnach January 8, 2026 20:21
@philnach
Collaborator

philnach commented Jan 8, 2026

@copilot Is this a good solution for large MongoDB collections? Does the MongoDB driver/SDK support this scenario already?

Contributor Author

Copilot AI commented Jan 8, 2026

@copilot Is this a good solution for large MongoDB collections? Does the MongoDB driver/SDK support this scenario already?

Yes, this is a good solution for large MongoDB collections. The MongoDB C# driver (v2.30.0) natively supports the BatchSize parameter through FindOptions, which is exactly what we're using in the implementation.

How it works:

  • The FindOptions.BatchSize property controls how many documents the server returns in each batch when iterating through a cursor
  • This is a standard MongoDB feature that prevents cursor timeouts on large result sets
  • The driver automatically manages cursor lifecycle and fetches new batches as needed

Why this solves the problem:

  • Without BatchSize, cursors can time out (default: 10 minutes) if the client doesn't fetch documents quickly enough
  • For 250k+ document collections, setting BatchSize: 1000 ensures documents are fetched in manageable chunks, keeping the cursor active
  • This is the recommended MongoDB approach for handling large datasets

Alternative approaches considered:

  1. Using AsQueryable() - doesn't support batch size control, can time out
  2. Increasing cursor timeout - not configurable from client side in Azure Cosmos DB
  3. Pagination with skip/limit - inefficient for large offsets

The implementation follows MongoDB best practices and leverages native driver functionality.
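The cursor mechanics described above can be sketched as follows (the collection and batch size are illustrative, not the tool's actual read loop):

```csharp
// Sketch: how the driver fetches a large collection in batches.
// Each MoveNextAsync() is one server round-trip for the next batch, so the
// cursor never sits idle long enough to hit the server-side timeout.
using System.Threading.Tasks;
using MongoDB.Bson;
using MongoDB.Driver;

public static class BatchIterationSketch
{
    public static async Task<long> CountAllAsync(
        IMongoCollection<BsonDocument> collection)
    {
        var options = new FindOptions<BsonDocument> { BatchSize = 1000 };
        long total = 0;

        using var cursor = await collection.FindAsync(
            FilterDefinition<BsonDocument>.Empty, options);

        while (await cursor.MoveNextAsync())        // fetch next batch
        {
            foreach (var doc in cursor.Current)     // up to 1000 docs per batch
                total++;
        }
        return total;
    }
}
```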

@philnach philnach marked this pull request as ready for review January 10, 2026 19:47
@philnach
Collaborator

@markjbrown / @bowencode, this is an older issue, but it enables a scenario with large data coming from MongoDB. It allows the user to specify BatchSize on the source; dmt already supported it on the sink.

Collaborator

@philnach philnach left a comment


Thanks for picking this up — the overall approach (using FindOptions.BatchSize to mitigate cursor timeouts on large MongoDB collections) is the right MongoDB primitive, and CI is green. Before merging I'd like to see three things addressed; details inline.

Blocking

  1. Silent-ignoring BatchSize <= 0 hides misconfiguration. A user typo like "BatchSize": -1 will look configured but the cursor-timeout failure that motivated this PR will reappear unchanged. Please either log a warning when an invalid value is supplied, or validate at the settings layer with [Range(1, int.MaxValue)] so the user gets a clear error.
  2. The new tests don't actually verify the plumbing. They exercise the auto-property setter on MongoSourceSettings, but not the real contract — that when BatchSize is configured, FindOptions.BatchSize is what gets passed to the driver. A small unit test on MongoRepository.FindAsync using a Mock<IMongoCollection<BsonDocument>> capturing the FindOptions argument would close that gap.
  3. The basic README example now suggests "BatchSize": 1000 as a default. Users will read it as recommended/required. Please drop BatchSize from the basic example and rely on the dedicated subsection (which already explains it well) to introduce it.

Non-blocking suggestions

  • Call out the AsQueryable() → FindAsync(empty filter) switch in the PR description. It's a real (and probably good) async-correctness improvement that applies even when BatchSize isn't set — worth flagging for reviewers and future archaeology.
  • LogBatchSizeIfSpecified is invoked from both branches of EnumerateCollectionAsync; could be hoisted up to a single call there.
  • The comment in MongoRepository.FindAsync ("prevent exceptions during data migration") overstates the driver's behavior on BatchSize <= 0 — please reword or remove.
  • One sentence in the README would help set expectations: BatchSize mitigates cursor timeouts (smaller batches reset the idle timer between fetches) but does not disable them; NoCursorTimeout is the actual disable, and isn't always honored on Azure Cosmos for MongoDB.

Nice-to-haves aside, I think items 1–3 are worth landing before merge. Happy to re-review quickly once those are addressed.
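A contract test along the lines of blocking item 2 might look like the following Moq-based sketch (the repository-under-test is stubbed out; only the FindOptions-capture pattern is the point here):

```csharp
// Sketch of the suggested contract test: capture the FindOptions passed to
// the mocked driver collection and assert BatchSize was plumbed through.
using System.Threading;
using System.Threading.Tasks;
using MongoDB.Bson;
using MongoDB.Driver;
using Moq;
using Xunit;

public class BatchSizePlumbingSketch
{
    [Fact]
    public async Task FindAsync_PassesBatchSizeToDriver()
    {
        FindOptions<BsonDocument, BsonDocument> captured = null;
        var collection = new Mock<IMongoCollection<BsonDocument>>();

        collection
            .Setup(c => c.FindAsync(
                It.IsAny<FilterDefinition<BsonDocument>>(),
                It.IsAny<FindOptions<BsonDocument, BsonDocument>>(),
                It.IsAny<CancellationToken>()))
            .Callback<FilterDefinition<BsonDocument>,
                      FindOptions<BsonDocument, BsonDocument>,
                      CancellationToken>((f, o, t) => captured = o)
            .ReturnsAsync(Mock.Of<IAsyncCursor<BsonDocument>>());

        // In the real test, invoke MongoRepository.FindAsync with
        // BatchSize = 1000; the direct call below stands in for that.
        await collection.Object.FindAsync(
            FilterDefinition<BsonDocument>.Empty,
            new FindOptions<BsonDocument, BsonDocument> { BatchSize = 1000 });

        Assert.Equal(1000, captured?.BatchSize);
    }
}
```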

Comment thread Extensions/Mongo/README.md
Copilot AI requested a review from philnach May 1, 2026 22:32
@philnach
Collaborator

philnach commented May 1, 2026

Thanks for the quick turnaround on b861bf4 — verified all five threads and resolved them. Quick confirmation of what landed:

  • MongoDataSourceExtension.LogBatchSizeIfSpecified now logs a warning when BatchSize <= 0 ("Ignoring invalid BatchSize ... must be > 0. Cursor timeout issues may occur on large collections."). Misleading comment in MongoRepository.cs removed.
  • MongoRepositoryTests.cs adds 4 mock-based tests that capture FindOptions<BsonDocument, BsonDocument> and assert BatchSize is set for positive values and not set for null/0/-1. Exactly the contract test we wanted.
  • ✅ Basic Source JSON example in the README is back to the minimal form; BatchSize is introduced only in the dedicated subsection.
  • ✅ PR description now calls out the AsQueryable() → FindAsync(empty filter) async-correctness improvement.
  • ✅ README "How it works" section explains BatchSize mitigates (does not disable) cursor timeouts and notes NoCursorTimeout is often unsupported on Cosmos DB for MongoDB.

From my side this is good to go pending CI. Both workflows on b861bf4 are sitting in action_required — looks like the build/test runs need a maintainer (@markjbrown / @bowencode) to approve the workflows, or a re-trigger. I'll hold off on dismissing my changes-requested vote until CI is green.

Collaborator

@philnach philnach left a comment


CI passed.



Development

Successfully merging this pull request may close these issues.

MongoCursorNotFoundException: Cursor not found on server using connection.

2 participants