Skip to content

Commit 09d85ce

Browse files
Emit x-ms-cosmos-hub-region-processing-only header (#4389)
Implements [`HUB_REGION_PROCESSING_HEADER_SPEC.md`](https://github.com/Azure/azure-sdk-for-rust/blob/release/azure_data_cosmos-previews/sdk/cosmos/azure_data_cosmos/docs/HUB_REGION_PROCESSING_HEADER_SPEC.md) (merged in #4320) for single-master data-plane Cosmos operations. On the retry triggered by the first `404 / 1002 (READ_SESSION_NOT_AVAILABLE)` response, the driver now latches a per-operation flag and emits the `x-ms-cosmos-hub-region-processing-only: True` request header on every subsequent transport attempt for that operation. This asks the backend to route only to a region that has caught up to the requested LSN, reducing the chance of a follow-up retry hitting another region whose session is also behind. The latch is sticky for the remainder of the operation and gated on three conditions, which together scope the change exactly to the case the spec calls out: - data-plane scope (metadata-pipeline ops are out of scope per spec §1.5) - single-master accounts (multi-master accounts already have a different recovery path) - first 1002 within the operation (subsequent 1002s preserve the latch but don't re-arm it) Mirrors .NET parity ([Azure/azure-cosmos-dotnet-v3#5447](Azure/azure-cosmos-dotnet-v3#5447)). Forward-compatible by design: backends that ignore the header preserve the existing single-master 1002 retry behavior, so this lands safely ahead of the server-side rollout. ## Changes by file - **`cosmos_headers.rs`** — adds `HUB_REGION_PROCESSING_ONLY` const inside `request_header_names`. - **`components.rs`** — adds `is_dataplane: bool` and `hub_region_processing_only: bool` fields to `OperationRetryState` with LOAD-BEARING doc comments; `initial()` defaults both to `false`; invariant note added to `session_token_retry_count` because the latch trigger reads it pre-increment. `advance_*` propagates the flags automatically via `..self` — no new positional parameters. - **`operation_pipeline.rs`** — sets `retry_state.is_dataplane = pipeline_type.is_data_plane()` once at the production call site (uses the accessor because `PipelineType` is `#[non_exhaustive]` — a future variant must not silently bypass an equality gate); extracts `apply_hub_region_header(..)` as a small free function so the emission rule can be tested without driving the full pipeline; updates 5 existing exhaustive `OperationRetryState` struct-literal tests for the two new fields. - **`retry_evaluation.rs`** — extracts `build_session_retry_state(..)` that latches `hub_region_processing_only` when all four trigger conditions hold; gate placed after the existing `>= 2` abort check so it only runs on the SessionRetry path; updates the one exhaustive struct-literal test for the two new fields. - **`azure_data_cosmos_driver/CHANGELOG.md`** and **`azure_data_cosmos/CHANGELOG.md`** — entries under the unreleased version with a back-link to #4320 and the spec doc. ## Tests added (10) Trigger-side coverage in `retry_evaluation::tests` (8): single-master/data-plane first 1002 latches (AC-1); multi-master never latches (AC-4); latch stays set across subsequent 1002s (AC-2); 200/410/503 never latch (AC-5); count boundary at exhaustion (AC-3); metadata-pipeline never latches (AC-8); independent operations don't share state (AC-6); latch survives failover via `..self` propagation. Emission-side coverage in `operation_pipeline::tests` (2): header set to `True` when latched; header omitted when not latched. All 876 lib tests + 57 in-memory-emulator tests + 40 emulator tests pass; `cargo fmt` and `cargo clippy --all-features --tests` are clean. ## Acceptance criteria AC-1 through AC-6 and AC-8 from the spec are covered by the tests above. AC-7 (long-tail recovery / per-region telemetry) is deferred per the spec's ALT-5 — no client-side scope. --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
1 parent 0627369 commit 09d85ce

6 files changed

Lines changed: 486 additions & 5 deletions

File tree

sdk/cosmos/azure_data_cosmos/CHANGELOG.md

Lines changed: 1 addition & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -5,12 +5,10 @@
55
### Features Added
66

77
### Breaking Changes
8-
- Removed the `request_url()` accessor (gated on the `fault_injection` feature) from `ItemResponse`/`ResourceResponse`/`BatchResponse`. Driver-routed operations never populated it, so it always returned `None` in current usage.
98

9+
- Removed the `request_url()` accessor (gated on the `fault_injection` feature) from `ItemResponse`/`ResourceResponse`/`BatchResponse`. Driver-routed operations never populated it, so it always returned `None` in current usage.
1010
- `CosmosClientBuilder::with_user_agent_suffix` (and `CosmosClientOptions::with_user_agent_suffix`) now take `UserAgentSuffix` instead of `impl Into<String>`. Callers passing a `&str` or `String` must construct the value explicitly via `UserAgentSuffix::new` (panics on invalid input) or `UserAgentSuffix::try_new` (returns `Option`). Validation rules (max 25 characters, HTTP-header-safe) are now enforced at the construction site instead of being applied silently inside the builder. ([#4368](https://github.com/Azure/azure-sdk-for-rust/pull/4368))
11-
1211
- Replaced `CosmosDiagnostics` with `CosmosDiagnosticsContext` (a re-export of `azure_data_cosmos_driver::diagnostics::DiagnosticsContext`). All response types now return `Arc<CosmosDiagnosticsContext>` from `diagnostics()` (the returned `Arc` derefs transparently to `CosmosDiagnosticsContext` for read-only inspection, and can be retained alongside a consumed response body). The previous `activity_id() -> Option<&str>` and `server_duration_ms() -> Option<f64>` accessors on `CosmosDiagnostics` are replaced by `CosmosDiagnosticsContext::activity_id() -> &ActivityId` and per-request server timing via `CosmosDiagnosticsContext::requests()[i].server_duration_ms()`.
13-
1412
- Removed `azure_data_cosmos::constants::SubStatusCode` and its `new`/`value`/`from_header_value`/`From`/`Display`/`Debug` API. The SDK no longer maintains a parallel sub-status-code type — fault-injection (the only remaining consumer) now uses `azure_data_cosmos_driver::models::SubStatusCode` directly. Callers that referenced the SDK type should switch to the driver re-export.
1513

1614
### Bugs Fixed

sdk/cosmos/azure_data_cosmos_driver/CHANGELOG.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -4,6 +4,7 @@
44

55
### Features Added
66

7+
- Added support for the `x-ms-cosmos-hub-region-processing-only` request header on retries after a `404 / 1002 (READ_SESSION_NOT_AVAILABLE)` response on single-master data-plane Cosmos operations. The header asks the backend to route only to a region that has caught up to the requested LSN, reducing the chance of a follow-up retry hitting a region whose session is also behind. The header is scoped to single-master accounts (multi-master accounts already have a different recovery path) and to data-plane operations (metadata-pipeline operations are out of scope per the design spec). Once latched on the first 1002 within an operation, the header is emitted on every subsequent retry for that operation. ([#4389](https://github.com/Azure/azure-sdk-for-rust/pull/4389))
78
- Added local query-plan generator scaffolding under `crate::query` (lexer, parser, AST, planner, and in-memory evaluator). The scaffolding is **not wired into the production query path** yet — production callers still issue Gateway query-plan requests via `CosmosOperation::query_plan`. The `__internal_testing` cargo feature exposes `query::__test_only_generate_query_plan_for_pk_paths`, `query::__TEST_ONLY_SUPPORTED_QUERY_FEATURES`, and `CosmosOperation::query_plan` for cross-crate gateway-comparison tests; this feature is intentionally unstable and **not covered by SemVer**.
89
- Added per-partition automatic failover (PPAF) for writes on single-master accounts. On 403/3 WriteForbidden, 503 ServiceUnavailable, 429/3092 SystemResourceUnavailable, 410/1022 Gone, or 408 RequestTimeout from a region, the affected partition is failed over to the next preferred region; subsequent writes for that partition skip the failed region. ([#4156](https://github.com/Azure/azure-sdk-for-rust/pull/4156))
910
- Added per-partition circuit breaker (PPCB) for reads (any account) and writes (multi-master accounts). Tracks failure counts per `(partition_key_range_id, region)` and routes to an alternate region once the threshold (default 10 reads, 5 writes) is exceeded. A background failback loop probes the original region for recovery. ([#4156](https://github.com/Azure/azure-sdk-for-rust/pull/4156))

sdk/cosmos/azure_data_cosmos_driver/src/driver/pipeline/components.rs

Lines changed: 56 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -75,13 +75,67 @@ pub(crate) struct OperationRetryState {
7575
/// Number of operation-level failover retries attempted.
7676
pub failover_retry_count: u32,
7777
/// Number of session-token retries attempted.
78+
///
79+
/// **Invariant:** This counter is incremented only by
80+
/// [`Self::advance_session_retry`] on the 404/1002 path. The
81+
/// hub-region-processing-only latch trigger in
82+
/// `retry_evaluation::try_handle_read_session_not_available` reads
83+
/// `== 0` to detect the first 1002 within an operation; if a future
84+
/// refactor adds another increment site, that trigger gate must be
85+
/// re-validated.
7886
pub session_token_retry_count: u32,
7987
/// Maximum failover retries.
8088
pub max_failover_retries: u32,
8189
/// Maximum session retries.
8290
pub max_session_retries: u32,
8391
/// Whether multiple write locations can be used.
8492
pub can_use_multiple_write_locations: bool,
93+
/// Whether this operation is on the data-plane pipeline (vs metadata).
94+
///
95+
/// Set once at the production call site in `execute_operation_pipeline`
96+
/// from `pipeline_type.is_data_plane()`. Used to gate the
97+
/// hub-region-processing-only latch so metadata-pipeline operations
98+
/// (which ride the same `execute_operation_pipeline` but are scoped out
99+
/// of the spec per HUB_REGION_PROCESSING_HEADER_SPEC.md §1.5) never
100+
/// emit the header.
101+
///
102+
/// LOAD-BEARING for the metadata-pipeline scope gate (AC-8).
103+
/// The production call site MUST use the
104+
/// `PipelineType::is_data_plane()` accessor — NOT `==` matching —
105+
/// because `PipelineType` is `#[non_exhaustive]` and a future variant
106+
/// would silently bypass an equality gate.
107+
pub is_dataplane: bool,
108+
/// Hub-region-processing-only latch.
109+
///
110+
/// Sticky within a single operation. Set on the retry triggered by the
111+
/// FIRST `404 / 1002 (READ_SESSION_NOT_AVAILABLE)` on a single-master
112+
/// data-plane account; once set, every subsequent transport attempt
113+
/// for this operation emits the
114+
/// `x-ms-cosmos-hub-region-processing-only: True` header.
115+
///
116+
/// Bounded by operation lifetime — `OperationRetryState` is
117+
/// constructed fresh per call to `execute_operation_pipeline` (single
118+
/// production call site), so the latch never leaks across operations.
119+
///
120+
/// LOAD-BEARING for SE-003 mitigation — see
121+
/// HUB_REGION_PROCESSING_HEADER_SPEC.md AG-1..AG-4. If a future
122+
/// refactor adds a second production construction site for
123+
/// `OperationRetryState`, the SE-003 mitigation argument needs to be
124+
/// re-validated.
125+
///
126+
/// **Cross-region hedging coordination.** The orchestrator added in
127+
/// `azure_data_cosmos_driver/docs/HEDGING_SPEC.md` §9.5 constructs an
128+
/// `OperationRetryState` *per hedge*, so this per-state latch is
129+
/// per-hedge by default. The hedging spec requires augmenting this
130+
/// state with a `shared_hub_region_latch: Option<Arc<AtomicBool>>`
131+
/// that is `Some` only when running inside `execute_with_hedging()`,
132+
/// is CAS-set alongside this field in `build_session_retry_state`,
133+
/// and is OR'd into the emission decision in `apply_hub_region_header`.
134+
/// This mirrors .NET v3's `CrossRegionAvailabilityContext` shared
135+
/// object introduced by azure-cosmos-dotnet-v3#5815. Any change to
136+
/// the latch trigger or emission rule here MUST update both call
137+
/// sites and §9.5 of the hedging spec.
138+
pub hub_region_processing_only: bool,
85139
/// Regions excluded for this operation.
86140
pub excluded_regions: Vec<Region>,
87141
/// Session-retry routing override for read operations.
@@ -164,6 +218,8 @@ impl OperationRetryState {
164218
max_failover_retries,
165219
max_session_retries,
166220
can_use_multiple_write_locations,
221+
is_dataplane: false,
222+
hub_region_processing_only: false,
167223
excluded_regions,
168224
session_retry_routing: SessionRetryRouting::PreferredEndpoints,
169225
partition_key_range_id: None,

sdk/cosmos/azure_data_cosmos_driver/src/driver/pipeline/operation_pipeline.rs

Lines changed: 124 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -140,6 +140,18 @@ pub(crate) async fn execute_operation_pipeline(
140140
.per_partition_circuit_breaker_enabled
141141
&& location_snapshot.account.preferred_read_endpoints.len() > 1;
142142

143+
// HUB_REGION_PROCESSING_HEADER_SPEC.md §1.5: gate the
144+
// `x-ms-cosmos-hub-region-processing-only` latch on data-plane scope
145+
// so metadata-pipeline operations (which ride the same
146+
// `execute_operation_pipeline`) never emit the header.
147+
//
148+
// Use the `PipelineType::is_data_plane()` accessor — NOT `==` matching
149+
// — because `PipelineType` is `#[non_exhaustive]` and a future variant
150+
// would silently bypass an equality gate. Equivalently
151+
// `!pipeline_type.is_metadata()` (the metadata pipeline is the only
152+
// current variant that is out of spec scope).
153+
retry_state.is_dataplane = pipeline_type.is_data_plane();
154+
143155
let deadline = options
144156
.end_to_end_latency_policy()
145157
.map(|p| Instant::now() + p.timeout());
@@ -153,7 +165,7 @@ pub(crate) async fn execute_operation_pipeline(
153165
operation,
154166
&retry_state,
155167
&location,
156-
pipeline_type == PipelineType::DataPlane,
168+
pipeline_type.is_data_plane(),
157169
location_state_store.endpoint_unavailability_ttl(),
158170
);
159171

@@ -185,6 +197,14 @@ pub(crate) async fn execute_operation_pipeline(
185197
};
186198
let mut transport_request = build_transport_request(operation, custom_headers, &ctx)?;
187199

200+
// HUB_REGION_PROCESSING_HEADER_SPEC.md §3 / public-spec §3.4:
201+
// Emit the `x-ms-cosmos-hub-region-processing-only: True` header
202+
// when the latch is set. The latch is flipped in
203+
// `try_handle_read_session_not_available` on the first 1002 of a
204+
// single-master data-plane operation, and is sticky for the
205+
// remainder of the operation's transport attempts (AC-1, AC-2).
206+
apply_hub_region_header(&mut transport_request, &retry_state);
207+
188208
apply_optional_request_headers(&mut transport_request, operation, options);
189209

190210
tracing::trace!(
@@ -923,6 +943,40 @@ fn compute_execution_context(retry_state: &OperationRetryState) -> ExecutionCont
923943
}
924944
}
925945

946+
/// Conditionally emits the `x-ms-cosmos-hub-region-processing-only: True`
947+
/// header on the outbound transport request when the latch on
948+
/// `retry_state` is set.
949+
///
950+
/// Extracted as a free function (rather than left inline at the call site
951+
/// in `execute_operation_pipeline`) so that the emission rule can be
952+
/// exercised by unit tests without spinning up the full pipeline. The
953+
/// production call site is the loop iteration after `build_transport_request`
954+
/// and before `apply_optional_request_headers`.
955+
///
956+
/// HUB_REGION_PROCESSING_HEADER_SPEC.md §3 / public-spec §3.4. See
957+
/// `try_handle_read_session_not_available` for the latch trigger.
958+
///
959+
/// **Hedging coordination (future).** Per HEDGING_SPEC.md §9.5, the
960+
/// emission decision MUST OR in a `shared_hub_region_latch:
961+
/// Option<Arc<AtomicBool>>` field added to `OperationRetryState`, read
962+
/// with `Acquire` ordering. That field is set from `build_session_retry_state`
963+
/// the first time any hedge in the same `execute_with_hedging()`
964+
/// fan-out observes 1002 and is what makes the other (still-latch-clean)
965+
/// hedges immediately emit the header — the Rust counterpart of .NET v3's
966+
/// `CrossRegionAvailabilityContext.ShouldAddHubRegionProcessingOnlyHeader`
967+
/// from azure-cosmos-dotnet-v3#5815.
968+
fn apply_hub_region_header(
969+
transport_request: &mut TransportRequest,
970+
retry_state: &OperationRetryState,
971+
) {
972+
if retry_state.hub_region_processing_only {
973+
transport_request.headers.insert(
974+
HeaderName::from_static(request_header_names::HUB_REGION_PROCESSING_ONLY),
975+
HeaderValue::from_static("True"),
976+
);
977+
}
978+
}
979+
926980
/// Applies operation-options-driven request headers that are only known
927981
/// after the request has been built by `build_transport_request`.
928982
///
@@ -1333,6 +1387,8 @@ mod tests {
13331387
max_failover_retries: 3,
13341388
max_session_retries: 2,
13351389
can_use_multiple_write_locations: false,
1390+
is_dataplane: false,
1391+
hub_region_processing_only: false,
13361392
excluded_regions: Vec::new(),
13371393
session_retry_routing:
13381394
crate::driver::pipeline::components::SessionRetryRouting::PreferredWriteEndpoints,
@@ -1387,6 +1443,8 @@ mod tests {
13871443
max_failover_retries: 3,
13881444
max_session_retries: 2,
13891445
can_use_multiple_write_locations: false,
1446+
is_dataplane: false,
1447+
hub_region_processing_only: false,
13901448
excluded_regions: Vec::new(),
13911449
session_retry_routing:
13921450
crate::driver::pipeline::components::SessionRetryRouting::PreferredEndpoints,
@@ -1441,6 +1499,8 @@ mod tests {
14411499
max_failover_retries: 3,
14421500
max_session_retries: 2,
14431501
can_use_multiple_write_locations: false,
1502+
is_dataplane: false,
1503+
hub_region_processing_only: false,
14441504
excluded_regions: Vec::new(),
14451505
session_retry_routing:
14461506
crate::driver::pipeline::components::SessionRetryRouting::PreferredEndpoints,
@@ -1502,6 +1562,8 @@ mod tests {
15021562
max_failover_retries: 3,
15031563
max_session_retries: 3,
15041564
can_use_multiple_write_locations: true,
1565+
is_dataplane: false,
1566+
hub_region_processing_only: false,
15051567
excluded_regions: Vec::new(),
15061568
session_retry_routing:
15071569
crate::driver::pipeline::components::SessionRetryRouting::PreferredEndpoints,
@@ -1761,6 +1823,8 @@ mod tests {
17611823
max_failover_retries: 3,
17621824
max_session_retries: 2,
17631825
can_use_multiple_write_locations: false,
1826+
is_dataplane: false,
1827+
hub_region_processing_only: false,
17641828
excluded_regions: vec!["westus2".into()],
17651829
session_retry_routing:
17661830
crate::driver::pipeline::components::SessionRetryRouting::PreferredEndpoints,
@@ -2736,6 +2800,65 @@ mod tests {
27362800
));
27372801
}
27382802

2803+
// ── apply_hub_region_header ──────────────────────────────────────
2804+
//
2805+
// See HUB_REGION_PROCESSING_HEADER_SPEC.md §3.4 / public-spec §4.2.
2806+
// The emission logic itself is a 4-line conditional; these tests
2807+
// exercise both branches so AC-1/AC-5 don't drift on a refactor.
2808+
2809+
fn build_minimal_transport_request() -> super::TransportRequest {
2810+
let operation = CosmosOperation::read_all_databases(test_account());
2811+
let routing = test_routing();
2812+
let activity_id = ActivityId::from_string("hub-region-test".to_string());
2813+
let ctx = TransportRequestContext {
2814+
routing: &routing,
2815+
activity_id: &activity_id,
2816+
execution_context: ExecutionContext::Initial,
2817+
deadline: None,
2818+
resolved_session_token: None,
2819+
throughput_control: None,
2820+
};
2821+
build_transport_request(&operation, None, &ctx).expect("request should build")
2822+
}
2823+
2824+
/// T-6 — When the latch is set on `retry_state`, the helper emits
2825+
/// `x-ms-cosmos-hub-region-processing-only: True` on the transport
2826+
/// request (AC-1).
2827+
#[test]
2828+
fn transport_request_emits_hub_region_header_when_latched() {
2829+
let mut request = build_minimal_transport_request();
2830+
let mut state = super::OperationRetryState::initial(0, false, Vec::new(), 3, 1);
2831+
state.is_dataplane = true;
2832+
state.hub_region_processing_only = true;
2833+
2834+
super::apply_hub_region_header(&mut request, &state);
2835+
2836+
let value = request.headers.get_optional_str(&HeaderName::from_static(
2837+
request_header_names::HUB_REGION_PROCESSING_ONLY,
2838+
));
2839+
assert_eq!(value, Some("True"));
2840+
}
2841+
2842+
/// T-7 — When the latch is NOT set, the helper does not emit the
2843+
/// header. Covers AC-5 / cross-operation isolation guarantee at the
2844+
/// emission layer.
2845+
#[test]
2846+
fn transport_request_omits_hub_region_header_when_not_latched() {
2847+
let mut request = build_minimal_transport_request();
2848+
let state = super::OperationRetryState::initial(0, false, Vec::new(), 3, 1);
2849+
assert!(!state.hub_region_processing_only);
2850+
2851+
super::apply_hub_region_header(&mut request, &state);
2852+
2853+
let value = request.headers.get_optional_str(&HeaderName::from_static(
2854+
request_header_names::HUB_REGION_PROCESSING_ONLY,
2855+
));
2856+
assert!(
2857+
value.is_none(),
2858+
"hub-region header must not be present when latch is unset, got {value:?}",
2859+
);
2860+
}
2861+
27392862
// ── apply_failover_delay ──────────────────────────────────────────
27402863

27412864
#[tokio::test]

0 commit comments

Comments
 (0)