Hedge cold container-metadata reads across regions (#4253) #4608
Open
NaluTripician wants to merge 1 commit into Azure:main from NaluTripician:nalutripician/cosmos-metadata-structural
@@ -30,16 +30,24 @@ use crate::{
 /// this constant is the upper bound.
 const DEFAULT_THRESHOLD_CAP: Duration = Duration::from_millis(1000);
 
-/// Resource types eligible for cross-region hedging in the current phase.
+/// Resource types eligible for cross-region hedging.
 ///
-/// Subsequent phases widen this single constant — no other change to
-/// [`should_hedge`] is required.
-const HEDGEABLE_RESOURCE_TYPES: &[ResourceType] = &[ResourceType::Document];
+/// `Document` covers data-plane reads. `DocumentCollection` covers the
+/// control-plane container/collection metadata read that warms a cold
+/// container cache — hedging it across regions keeps a slow or unhealthy
+/// preferred region from stalling the read (and any operation blocked on it)
+/// past the caller's timeout, which is the cross-region-failover-preemption
+/// scenario in issue #4253. Both are idempotent reads.

Comment on lines +39 to +40
Member
We probably don't need to mention the relevant issue here

+///
+/// Subsequent phases may widen this single constant further — no other change
+/// to [`should_hedge`] is required.
+const HEDGEABLE_RESOURCE_TYPES: &[ResourceType] =
+    &[ResourceType::Document, ResourceType::DocumentCollection];
 
 /// Operation types eligible for cross-region hedging in the current phase.
 ///
 /// Future phases will append feed-style operations
-/// (`Query` / `ReadFeed` / `QueryPlan`) and metadata reads.
+/// (`Query` / `ReadFeed` / `QueryPlan`).
 const HEDGEABLE_OPERATION_TYPES: &[OperationType] = &[OperationType::Read];
 
 /// Returns `true` when the operation is eligible for cross-region hedging.
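The two constants in this hunk act as allow-lists that together decide eligibility. The following is a minimal sketch of that idea, not the crate's actual `should_hedge`: the enums are trimmed stand-ins, `is_hedge_eligible` is a hypothetical helper name, and the `Create` variant is assumed for illustration. The real function also takes the hedging strategy, account state, and excluded regions, which are omitted here.

```rust
// Trimmed stand-ins for the crate's enums (assumption: the real enums
// have more variants and live elsewhere in the crate).
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ResourceType {
    Database,
    DocumentCollection,
    Document,
}

#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum OperationType {
    Create,
    Read,
    ReadFeed,
}

// Allow-lists mirroring the constants in the diff above.
const HEDGEABLE_RESOURCE_TYPES: &[ResourceType] =
    &[ResourceType::Document, ResourceType::DocumentCollection];
const HEDGEABLE_OPERATION_TYPES: &[OperationType] = &[OperationType::Read];

/// Hypothetical helper: an operation is eligible only when BOTH its
/// resource type and its operation type appear in the allow-lists.
fn is_hedge_eligible(resource: ResourceType, operation: OperationType) -> bool {
    HEDGEABLE_RESOURCE_TYPES.contains(&resource)
        && HEDGEABLE_OPERATION_TYPES.contains(&operation)
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn eligibility_matrix() {
        // Point reads of documents and of container metadata are eligible.
        assert!(is_hedge_eligible(ResourceType::Document, OperationType::Read));
        assert!(is_hedge_eligible(ResourceType::DocumentCollection, OperationType::Read));
        // Writes and feed reads on containers are not.
        assert!(!is_hedge_eligible(ResourceType::DocumentCollection, OperationType::Create));
        assert!(!is_hedge_eligible(ResourceType::DocumentCollection, OperationType::ReadFeed));
        // Reads on resource types outside the allow-list are not.
        assert!(!is_hedge_eligible(ResourceType::Database, OperationType::Read));
    }
}
```

Under this sketch, adding `DocumentCollection` to the resource list is enough to admit container point reads, while container creates and list-containers reads stay excluded because the operation list still contains only `Read`. That matches the cases the new tests in this PR exercise.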
@@ -324,6 +332,26 @@ mod tests {
         CosmosOperation::read_database(db)
     }
 
+    fn read_container_operation() -> CosmosOperation {
+        CosmosOperation::read_container(fake_container_reference())
+    }
+
+    fn create_container_operation() -> CosmosOperation {
+        let account = AccountReference::with_master_key(
+            Url::parse("https://acct.documents.azure.com/").unwrap(),
+            "k",
+        );
+        CosmosOperation::create_container(DatabaseReference::from_name(account, "db"))
+    }
+
+    fn read_all_containers_operation() -> CosmosOperation {
+        let account = AccountReference::with_master_key(
+            Url::parse("https://acct.documents.azure.com/").unwrap(),
+            "k",
+        );
+        CosmosOperation::read_all_containers(DatabaseReference::from_name(account, "db"))
+    }
+
     fn enabled_strategy() -> HedgingStrategy {
         HedgingStrategy::new(HedgeThreshold::new(Duration::from_millis(500)).unwrap())
     }
@@ -381,12 +409,45 @@ mod tests {
 
     #[test]
     fn should_hedge_non_document() {
-        // Reads against non-Document resource types are excluded in Phase 1.
+        // Reads against non-hedgeable resource types (e.g. Database) are excluded.
         let state = account_state_with_regions(&[Region::EAST_US, Region::WEST_US_2]);
         let op = read_database_operation();
         assert!(!should_hedge(Some(&enabled_strategy()), &op, &state, &[],));
     }
 
+    /// Regression for issue #4253: a cold container/collection metadata point-read
+    /// is hedged across regions so a slow or unhealthy preferred region cannot stall
+    /// the read (and any operation blocked on warming the container cache) past the
+    /// caller's timeout. This replaces the earlier detached-task approach with the
+    /// crate's structural cross-region mechanism (no detached tasks).
+    #[test]
+    fn should_hedge_container_read() {
+        let state = account_state_with_regions(&[Region::EAST_US, Region::WEST_US_2]);
+        let op = read_container_operation();
+        assert_eq!(op.resource_type(), ResourceType::DocumentCollection);
+        assert!(should_hedge(Some(&enabled_strategy()), &op, &state, &[]));
+    }
+
+    /// Container writes (create/replace/delete) must never hedge — only idempotent
+    /// reads are eligible.
+    #[test]
+    fn should_not_hedge_container_write() {
+        let state = account_state_with_regions(&[Region::EAST_US, Region::WEST_US_2]);
+        let op = create_container_operation();
+        assert_eq!(op.resource_type(), ResourceType::DocumentCollection);
+        assert!(!should_hedge(Some(&enabled_strategy()), &op, &state, &[]));
+    }
+
+    /// Feed-style container reads (`ReadFeed`, e.g. list containers) are not hedged
+    /// in this phase — only point reads of a container's metadata.
+    #[test]
+    fn should_not_hedge_container_feed_read() {
+        let state = account_state_with_regions(&[Region::EAST_US, Region::WEST_US_2]);
+        let op = read_all_containers_operation();
+        assert_eq!(op.resource_type(), ResourceType::DocumentCollection);
+        assert!(!should_hedge(Some(&enabled_strategy()), &op, &state, &[]));
+    }
+
     #[test]
     fn should_hedge_disabled_override() {
         // `None` represents Disabled at any layer — short-circuits before
Agreed with Copilot. Also, the issue is being referenced, not the PR link.