feat(cosmos): enforce maximum fan-out limit in query pipeline #4615

Open
Copilot wants to merge 7 commits into main from copilot/enforce-maximum-fan-out

Conversation

Copilot AI commented Jun 17, 2026

Contributor

Cross-partition queries targeting a large number of physical partitions are a common source of performance problems. This adds a configurable cap: if a query would fan out to more physical partitions than the limit, plan_operation returns an error (HTTP 400 / sub-status 20307) rather than silently executing an expensive scatter-gather. The default limit is 100.

New error code

  • SubStatusCode::CLIENT_QUERY_FAN_OUT_LIMIT_EXCEEDED (20307) and CosmosStatus::CLIENT_QUERY_FAN_OUT_LIMIT_EXCEEDED (HTTP 400 / 20307)

Driver API (azure_data_cosmos_driver)

  • New PlanOptions struct groups the continuation token and fan-out cap into a single parameter; CosmosDriver::plan_operation now accepts Option<PlanOptions> as its final argument (None applies all defaults). This follows the same extensible pattern used elsewhere in the SDK, avoiding call-site churn when new planning options are added.
  • Fan-out check runs after topology resolution, before execution; the error message explicitly names QueryOptions::with_max_fan_out() so users know how to raise the cap
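
As a rough illustration of the `Option<PlanOptions>` pattern described above, here is a minimal sketch. The field names (`continuation`, `max_fan_out`), the `String` token type, and the simplified return value are assumptions made for this example, not the driver's actual definitions; only the `unwrap_or_default()` handling and the default of 100 come from the PR text.

```rust
/// Hypothetical stand-in for the driver's `PlanOptions`; field names are assumed.
#[derive(Clone, Debug, Default)]
pub struct PlanOptions {
    /// Continuation token from a previous page, if resuming (type assumed).
    pub continuation: Option<String>,
    /// Per-call fan-out cap; `None` means "use the default".
    pub max_fan_out: Option<usize>,
}

/// Default cap stated in the PR description.
const DEFAULT_MAX_FAN_OUT: usize = 100;

/// Simplified stand-in for `plan_operation`: it returns the effective
/// settings instead of building a real query plan.
pub fn plan_operation(options: Option<PlanOptions>) -> (Option<String>, usize) {
    // `None` is treated exactly like `PlanOptions::default()`.
    let options = options.unwrap_or_default();
    let fan_out_limit = options.max_fan_out.unwrap_or(DEFAULT_MAX_FAN_OUT);
    (options.continuation, fan_out_limit)
}
```

With this shape, a caller that needs no special behavior passes `None`, and adding a further planning option later only means adding a field with a sensible `Default`.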

Public API (azure_data_cosmos)

  • FeedOptions::max_fan_out: Option<usize> + FeedOptions::with_max_fan_out(n) builder method
  • QueryOptions::with_max_fan_out(n) convenience shortcut
```rust
let pages = container
    .query_items("SELECT * FROM c", partition_key::all(), None)?
    .with_max_fan_out(250)   // raise the limit for this query
    .into_stream();
```

The limit only applies to fresh query plans; resumed continuation tokens are not re-checked.
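
The builder methods listed above could look roughly like the following. This is a sketch under assumptions: the struct layouts are hypothetical (in particular, `QueryOptions` is assumed to wrap a `FeedOptions`), and only the `max_fan_out: Option<usize>` field and the `with_max_fan_out(n)` method names come from the PR description.

```rust
/// Hypothetical stand-in for `FeedOptions`; only the field from this PR is modeled.
#[derive(Clone, Debug, Default)]
pub struct FeedOptions {
    /// `None` leaves the driver's default cap in place.
    pub max_fan_out: Option<usize>,
}

impl FeedOptions {
    /// Consuming builder: sets the cap and returns the options for chaining.
    pub fn with_max_fan_out(mut self, n: usize) -> Self {
        self.max_fan_out = Some(n);
        self
    }
}

/// Hypothetical stand-in for `QueryOptions`, assumed here to wrap `FeedOptions`.
#[derive(Clone, Debug, Default)]
pub struct QueryOptions {
    pub feed: FeedOptions,
}

impl QueryOptions {
    /// Convenience shortcut that forwards to `FeedOptions::with_max_fan_out`.
    pub fn with_max_fan_out(mut self, n: usize) -> Self {
        self.feed = self.feed.with_max_fan_out(n);
        self
    }
}
```

Keeping the field an `Option<usize>` lets the SDK distinguish "caller did not set a cap" from an explicit value, so the default can stay in the driver.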

Copilot AI and others added 2 commits June 17, 2026 17:23
Adds a configurable maximum fan-out limit for cross-partition queries.
When a query would fan out to more physical partitions than the limit,
plan_operation returns a CLIENT_QUERY_FAN_OUT_LIMIT_EXCEEDED error
(HTTP 400 / sub-status 20307). The default cap is 100 and can be
raised via FeedOptions::with_max_fan_out / QueryOptions::with_max_fan_out.

Fixes #4453

Co-authored-by: analogrelay <7574+analogrelay@users.noreply.github.com>
- Rename effective_max_fan_out to fan_out_limit in planner.rs
- Fix incorrect doc comment reference in feed.rs

Co-authored-by: analogrelay <7574+analogrelay@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Enforce maximum fan-out limit in query pipeline" to "feat(cosmos): enforce maximum fan-out limit in query pipeline" Jun 17, 2026
Copilot AI requested a review from analogrelay June 17, 2026 17:26
Comment thread sdk/cosmos/azure_data_cosmos/CHANGELOG.md Outdated
Comment thread sdk/cosmos/azure_data_cosmos_driver/src/driver/cosmos_driver.rs Outdated
Comment thread sdk/cosmos/azure_data_cosmos_driver/CHANGELOG.md Outdated
Group continuation token and max_fan_out into a PlanOptions struct so
plan_operation can accept Option<PlanOptions> as its final parameter.
This avoids signature churn when adding new planning options in future.

- Add PlanOptions struct (driver/plan_options.rs), exported from the
  driver crate
- Change CosmosDriver::plan_operation to accept Option<PlanOptions>
  with unwrap_or_default() at the start
- Update all SDK callers (container_client, cosmos_client,
  database_client) to construct PlanOptions from FeedOptions
- Fix private intra-doc link on plan_operation (was referencing the
  private planner::DEFAULT_MAX_FAN_OUT constant, breaking cargo doc)
- Update CHANGELOG entries to reference PR #4615 (not issue #4453)

Co-authored-by: analogrelay <7574+analogrelay@users.noreply.github.com>
Copilot AI requested a review from analogrelay June 17, 2026 20:53
analogrelay marked this pull request as ready for review June 17, 2026 22:21
analogrelay requested a review from a team as a code owner June 17, 2026 22:21
Copilot AI review requested due to automatic review settings June 17, 2026 22:21
Comment thread sdk/cosmos/azure_data_cosmos_driver/src/driver/plan_options.rs Outdated
Co-authored-by: Ashley Stanton-Nurse <github@analogrelay.net>

Copilot AI left a comment

Contributor

Pull request overview

This PR adds a configurable maximum fan-out cap for cross-partition queries so that queries targeting “too many” physical partitions fail fast (HTTP 400 / sub-status 20307) instead of silently triggering expensive scatter-gather execution. It wires the cap from the public azure_data_cosmos surface down into the azure_data_cosmos_driver planner, and introduces a new sub-status/status code to represent this client-side policy violation.

Changes:

  • Added FeedOptions::max_fan_out + with_max_fan_out and a QueryOptions::with_max_fan_out shortcut to configure the cap from the public SDK.
  • Introduced azure_data_cosmos_driver::PlanOptions and updated CosmosDriver::plan_operation to accept Option<PlanOptions>, using PlanOptions::max_fan_out to enforce the cap during planning.
  • Added new error codes (SubStatusCode/CosmosStatus 20307) plus unit/integration test updates for the new planner parameter.
| File | Description |
| --- | --- |
| sdk/cosmos/azure_data_cosmos/src/options/feed.rs | Adds max_fan_out to feed/query options and builder conveniences. |
| sdk/cosmos/azure_data_cosmos/src/clients/container_client.rs | Propagates public max_fan_out + continuation into driver planning via PlanOptions. |
| sdk/cosmos/azure_data_cosmos/CHANGELOG.md | Documents the new public query fan-out cap option. |
| sdk/cosmos/azure_data_cosmos_driver/src/lib.rs | Re-exports PlanOptions at the driver crate root. |
| sdk/cosmos/azure_data_cosmos_driver/src/error/cosmos_status.rs | Adds new 20307 sub-status + 400/20307 CosmosStatus constant and name mapping. |
| sdk/cosmos/azure_data_cosmos_driver/src/driver/plan_options.rs | Introduces the new PlanOptions struct for plan_operation. |
| sdk/cosmos/azure_data_cosmos_driver/src/driver/mod.rs | Wires plan_options module and re-exports PlanOptions. |
| sdk/cosmos/azure_data_cosmos_driver/src/driver/dataflow/planner.rs | Enforces default/custom fan-out limit and adds planner tests. |
| sdk/cosmos/azure_data_cosmos_driver/src/driver/dataflow/integration_tests/query_resume.rs | Updates integration tests for the new build_sequential_drain signature. |
| sdk/cosmos/azure_data_cosmos_driver/src/driver/cosmos_driver.rs | Updates plan_operation to use PlanOptions and pass max_fan_out into the planner. |
| sdk/cosmos/azure_data_cosmos_driver/CHANGELOG.md | Documents new planning options + new status codes (needs categorization tweak). |

Copilot's findings

  • Files reviewed: 11/11 changed files
  • Comments generated: 7

Comment on lines +192 to +205
```rust
let fan_out_limit = max_fan_out.unwrap_or(DEFAULT_MAX_FAN_OUT);
if request_nodes.len() > fan_out_limit {
    return Err(crate::error::CosmosError::builder()
        .with_status(crate::error::CosmosStatus::CLIENT_QUERY_FAN_OUT_LIMIT_EXCEEDED)
        .with_message(format!(
            "cross-partition query would fan out to {} physical partitions, \
             which exceeds the maximum of {}; use \
             QueryOptions::with_max_fan_out() to raise the limit if this \
             level of fan-out is intentional",
            request_nodes.len(),
            fan_out_limit,
        ))
        .build());
}
```

analogrelay commented Jun 18, 2026

Member

Maybe, but continuation tokens aren't authenticated or secured, so my concern would be this could open someone up to crafting a token with huge fan-out. I think we should leave this as-is. If there is a continuation token and resuming it with the current topology would put it over the max fan-out limit, I think we should fail.
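
One way to read the behavior suggested in this comment is that the check runs against the current topology whether or not a continuation token is present. The sketch below illustrates that reading only; the `FanOutLimitExceeded` type, the `check_fan_out` function, and the `_resuming` flag are hypothetical stand-ins (the real code builds a `CosmosError` with the 400 / 20307 status), and it is not necessarily what the PR ends up implementing.

```rust
/// Default cap stated in the PR description.
const DEFAULT_MAX_FAN_OUT: usize = 100;

/// Simplified, hypothetical stand-in for the 400 / 20307 error.
#[derive(Debug, PartialEq, Eq)]
pub struct FanOutLimitExceeded {
    pub partitions: usize,
    pub limit: usize,
}

/// Checks the fan-out of a plan against the cap.
///
/// `partitions` is the number of physical partitions the plan targets under
/// the current topology. `_resuming` is deliberately ignored: a resumed
/// continuation gets the same check as a fresh plan, because the token is
/// caller-supplied and cannot be trusted to describe a small fan-out.
pub fn check_fan_out(
    partitions: usize,
    max_fan_out: Option<usize>,
    _resuming: bool,
) -> Result<(), FanOutLimitExceeded> {
    let limit = max_fan_out.unwrap_or(DEFAULT_MAX_FAN_OUT);
    // Strictly greater than: a plan that hits exactly the limit is allowed.
    if partitions > limit {
        return Err(FanOutLimitExceeded { partitions, limit });
    }
    Ok(())
}
```

Under this reading, a crafted or stale token cannot bypass the cap, at the cost of failing a resume when the topology has grown past the limit since the first page.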

Comment on lines +31 to +33
```rust
/// Users who truly need more can increase the limit via
/// [`FeedOptions::max_fan_out`](crate::options::FeedOptions::max_fan_out) /
/// `QueryOptions::with_max_fan_out`.
```

Member

@copilot Just remove the reference to the max_fan_out option from this doc comment. This is the constant, it's crate-internal.

Comment thread sdk/cosmos/azure_data_cosmos_driver/src/error/cosmos_status.rs
Comment on lines +2231 to +2234
```rust
/// The query would require contacting more physical partitions than
/// the configured maximum. Raise the limit via
/// `FeedOptions::max_fan_out` / `QueryOptions::with_max_fan_out` if
/// this level of fan-out is intentional.
```

Comment on lines +10 to +14
```rust
/// Group both continuation-token resumption and the fan-out cap into a single
/// struct so the signature of `plan_operation` can grow without forcing a
/// change at every call site. Callers that need no special behavior can pass
/// `None`; `plan_operation` treats `None` as `PlanOptions::default()`.
#[derive(Default)]
```

Member

@copilot Derive Clone and Debug as recommended, but don't make the suggested doc comment updates, they're too verbose.

Contributor Author

Done in commit 9c4e9f8 — derived Clone and Debug on PlanOptions.

Comment thread sdk/cosmos/azure_data_cosmos/src/clients/container_client.rs

### Features Added

- Added `PlanOptions` struct to `CosmosDriver::plan_operation`, grouping continuation-token and fan-out cap into a single parameter, making the API extensible without future call-site churn. ([#4615](https://github.com/Azure/azure-sdk-for-rust/pull/4615))

Member

@copilot Agreed here, put the change to the plan_operation signature in the driver CHANGELOG's breaking changes section. It's NOT breaking to SDK users though.

Contributor Author

Done in commit 9c4e9f8 — moved the plan_operation signature change to the Breaking Changes section.

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
- Derive Clone and Debug on PlanOptions
- Move plan_operation signature change to breaking changes section
- Remove FeedOptions reference from internal constant doc

Co-authored-by: analogrelay <7574+analogrelay@users.noreply.github.com>