
Commit 2ad6e08

Build PartitionKeyRangeCache and CollectionRoutingMap to Support Caching of PK-Ranges (#4007)
## Summary

Implements the `PartitionKeyRangeCache` and `CollectionRoutingMap` in the `azure_data_cosmos_driver` crate to support caching and resolution of partition key ranges. This is the foundational layer required for **partition-level failover** (PPAF/PPCB), enabling the driver to resolve a user-supplied partition key into the concrete partition key range ID that owns it.

## Motivation

Cosmos DB distributes data across physical partitions, each owning a contiguous range of the hash space described by a `[minInclusive, maxExclusive)` pair. To support partition-level failover, the driver must be able to:

1. Compute the **effective partition key (EPK)** from user-supplied partition key values.
2. Resolve the EPK to the owning **partition key range ID** via binary search.
3. Cache the routing map per collection and refresh it incrementally via the change feed protocol when partitions split.

## What's Included

### New Models (`src/models/`)

| File | Description |
|------|-------------|
| `range.rs` | Generic `Range<T>` with `contains()`, `check_overlapping()`, `MinComparer`, `MaxComparer`, and boundary semantics (`[min, max)`) |
| `partition_key_range.rs` | `PartitionKeyRange` model with 14 serde-annotated fields, `to_range()` conversion, `PartitionKeyRangeStatus` enum (Online/Splitting/Offline/Split), custom `PartialEq`/`Hash` on identity fields |
| `service_identity.rs` | `ServiceIdentity` model for direct-mode routing (federation ID, service name, master flag) |
| `murmur_hash.rs` | MurmurHash3 x64-128 and x86-32 implementations (ported from the Cosmos DB SDK / Austin Appleby's SMHasher) |
| `effective_partition_key.rs` | EPK computation engine: dispatches to V1 (MurmurHash3-32 + binary-encoded components) or V2 (MurmurHash3-128 + reversed bytes) based on partition key kind and version |

### Partition Key Value Extensions (`src/models/partition_key.rs`)

- Made `PartitionKeyValue` `pub` (was `pub(crate)`) for EPK hashing access.
- Added `Infinity` variant to `InnerPartitionKeyValue` for EPK boundary calculations.
- Added `write_for_hashing_v1()`, `write_for_hashing_v2()`, `write_for_binary_encoding_v1()` methods for EPK computation.
- Added `PartitionKey::values()` accessor for the EPK engine.

### Cache Components (`src/driver/cache/`)

| File | Description |
|------|-------------|
| `collection_routing_map.rs` | `CollectionRoutingMap`: sorted routing map with O(log n) EPK→range binary search, O(1) ID lookup via `HashMap<String, (PartitionKeyRange, Option<ServiceIdentity>)>`, gone-parent filtering, completeness validation, `try_combine()` for incremental merge, `get_overlapping_ranges()`, and `highest_non_offline_pk_range_id` |
| `partition_key_range_cache.rs` | `PartitionKeyRangeCache`: async cache keyed by collection RID, backed by `AsyncCache<String, CollectionRoutingMap>`. Supports lazy fetch, change-feed incremental refresh (If-None-Match/304), `force_refresh` with ETag-based deduplication, and transport-decoupled `fetch_fn` callback |

### Design Spec (`docs/PARTITION_KEY_RANGE_CACHE_SPEC.md`)

Comprehensive specification covering:

- Architectural overview and component design
- EPK computation (V1/V2 hash algorithms)
- `CollectionRoutingMap` data structure, construction, validation, and lookup algorithms
- Cache lifecycle (lazy fetch, steady state, incremental refresh, invalidation)
- Error handling and fallback strategies
- Performance characteristics
- Cross-SDK feature matrix comparison (.NET, Java, Rust driver, Rust SDK)
- Testing strategy with existing and recommended tests
- SDK deprecation and migration plan
- Future work priorities

## Architecture Highlights

### Transport Decoupling

The cache accepts a generic `Fn(String, Option<String>) -> Future<Output = Option<PkRangeFetchResult>>` callback instead of holding a direct reference to the transport pipeline. This keeps the cache fully unit-testable without a live endpoint and supports both gateway and direct mode.
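The callback-injection idea can be illustrated with a minimal, dependency-free sketch. It is not the code from this commit: the callback here is synchronous rather than returning a `Future`, `FetchResult` and `RangeCache` are hypothetical stand-ins for `PkRangeFetchResult` and `PartitionKeyRangeCache`, and the meaning assigned to the two callback arguments (collection RID, previous ETag) and to a `None` result ("nothing new") is an assumption.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for `PkRangeFetchResult`; the real fields are not
/// shown in this commit message.
#[derive(Debug, Clone, PartialEq)]
pub struct FetchResult {
    pub range_ids: Vec<String>,
    pub etag: String,
}

/// Simplified cache that is generic over its fetch callback, so a test can
/// inject a fake instead of a real transport pipeline.
pub struct RangeCache<F>
where
    F: Fn(String, Option<String>) -> Option<FetchResult>,
{
    fetch_fn: F,
    entries: HashMap<String, FetchResult>,
}

impl<F> RangeCache<F>
where
    F: Fn(String, Option<String>) -> Option<FetchResult>,
{
    pub fn new(fetch_fn: F) -> Self {
        Self { fetch_fn, entries: HashMap::new() }
    }

    /// Returns the cached range IDs, calling the callback only on first access.
    pub fn resolve(&mut self, collection_rid: &str) -> Option<&[String]> {
        if !self.entries.contains_key(collection_rid) {
            // Assumption: no previous ETag means "fetch everything".
            let fetched = (self.fetch_fn)(collection_rid.to_string(), None)?;
            self.entries.insert(collection_rid.to_string(), fetched);
        }
        self.entries.get(collection_rid).map(|e| e.range_ids.as_slice())
    }

    /// Calls the callback again with the previous ETag. Returns `true` if the
    /// entry was replaced. Assumption: `None` from the callback means nothing
    /// changed. A real implementation would merge instead of replacing.
    pub fn refresh(&mut self, collection_rid: &str) -> bool {
        let previous_etag = self.entries.get(collection_rid).map(|e| e.etag.clone());
        match (self.fetch_fn)(collection_rid.to_string(), previous_etag) {
            Some(fresh) => {
                self.entries.insert(collection_rid.to_string(), fresh);
                true
            }
            None => false,
        }
    }
}
```

Because the cache only sees a closure, a unit test can pass one that counts invocations and returns canned data, which is the property the paragraph above relies on.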
### ServiceIdentity Support

The `CollectionRoutingMap` stores `(PartitionKeyRange, Option<ServiceIdentity>)` tuples per range, enabling future direct-mode routing. Currently populated as `None` when fetched via gateway; will be populated when the driver has direct connectivity information.

### Change Feed Protocol

The cache follows the Cosmos DB change feed pattern:

1. Initial fetch with `If-None-Match: None` returns all current ranges.
2. Subsequent incremental refreshes pass the previous `etag` as continuation.
3. HTTP 304 Not Modified signals no changes since last fetch.
4. New ranges are merged via `try_combine()` with gone-parent filtering and completeness validation.

### Single-Pending-I/O Semantics

Backed by `AsyncCache` + `AsyncLazy`, concurrent requests for the same collection share a single in-flight fetch. Post-initialization reads are lock-free (`Arc` clone).

## Cross-SDK Feature Parity

The implementation achieves near-complete parity with .NET and Java SDKs:

- ✅ EPK → range binary search
- ✅ Range ID → range HashMap lookup + ServiceIdentity
- ✅ Overlapping range resolution
- ✅ `forceRefresh` with ETag-based deduplication
- ✅ Change feed incremental refresh with `try_combine`
- ✅ Gone parent filtering
- ✅ `isGone(rangeId)` check
- ✅ `HighestNonOfflinePkRangeId` for split detection
- ✅ `PartitionKeyRangeStatus` enum
- ✅ Completeness validation (distinct `OverlappingRanges` vs `IncompleteRanges` errors)

## Integration Status

The cache is currently **standalone**: it is `pub(crate)` with `#[allow(unused_imports)]` on its re-export. It will be wired into the operation pipeline when pre-flight partition key range resolution lands (tracked separately). See the spec §8.2 for a sample `fetch_pk_ranges` implementation showing the planned integration pattern.
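For a concrete picture of the lookup that pre-flight resolution would call into, here is a minimal sketch of EPK-to-range binary search with a completeness check over `[min, max)` ranges. It is an illustration, not the code in `collection_routing_map.rs`: `PkRange`, `find_range`, and `is_complete` are hypothetical names, and the sketch assumes EPKs are hex strings compared lexicographically with `""` and `"FF"` as the outer bounds.

```rust
/// Simplified partition key range owning the EPK interval
/// [min_inclusive, max_exclusive). Hypothetical type for illustration.
#[derive(Debug, Clone, PartialEq)]
pub struct PkRange {
    pub id: String,
    pub min_inclusive: String,
    pub max_exclusive: String,
}

/// Finds the range that owns `epk` in O(log n).
/// Assumes `ranges` is sorted by `min_inclusive` and non-overlapping.
pub fn find_range<'a>(ranges: &'a [PkRange], epk: &str) -> Option<&'a PkRange> {
    // Number of ranges whose lower bound is <= epk; the owner, if any,
    // is the last of those.
    let idx = ranges.partition_point(|r| r.min_inclusive.as_str() <= epk);
    let candidate = ranges.get(idx.checked_sub(1)?)?;
    // The upper bound is exclusive, so epk must be strictly below it.
    if epk < candidate.max_exclusive.as_str() {
        Some(candidate)
    } else {
        None
    }
}

/// Checks that sorted ranges cover the whole space with no gaps or overlaps.
/// Assumption: the full space runs from "" (inclusive) to "FF" (exclusive).
pub fn is_complete(ranges: &[PkRange]) -> bool {
    let (first, last) = match (ranges.first(), ranges.last()) {
        (Some(f), Some(l)) => (f, l),
        _ => return false,
    };
    first.min_inclusive.is_empty()
        && last.max_exclusive == "FF"
        && ranges
            .windows(2)
            .all(|w| w[0].max_exclusive == w[1].min_inclusive)
}
```

With ranges `["", "3F")`, `["3F", "7F")`, and `["7F", "FF")`, an EPK equal to `"3F"` resolves to the second range because the lower bound is inclusive and the upper bound is exclusive.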
## Testing

**30+ unit tests** across all new modules:

- `collection_routing_map.rs`: 13 tests covering construction, binary search, overlapping ranges, gone filtering, validation errors, empty input
- `partition_key_range_cache.rs`: 4 tests covering end-to-end resolve, empty PK short-circuit, force refresh with incremental merge, response parsing
- `partition_key_range.rs`: 5 tests covering construction, `to_range()`, equality, serialization, overlap
- `range.rs`: 11 tests covering creation, contains, overlap, edge cases, comparers, serialization, string ranges
- `effective_partition_key.rs`: 5 tests covering empty PK, V2 string/bool hashes against cross-SDK reference values
- `murmur_hash.rs`: 2 tests covering 128-bit float hash, 32-bit basic
- `service_identity.rs`: 3 tests covering creation, application name, display

## Files Changed (11)

| File | Change |
|------|--------|
| `docs/PARTITION_KEY_RANGE_CACHE_SPEC.md` | **New**: Design specification (+1,013) |
| `src/driver/cache/collection_routing_map.rs` | **New**: Routing map with binary search (+560) |
| `src/driver/cache/mod.rs` | **Modified**: Register new modules and re-exports (+7) |
| `src/driver/cache/partition_key_range_cache.rs` | **New**: Async PK range cache (+430) |
| `src/models/effective_partition_key.rs` | **New**: EPK computation engine (+175) |
| `src/models/mod.rs` | **Modified**: Register new model modules, export `PartitionKeyValue` (+6/−2) |
| `src/models/murmur_hash.rs` | **New**: MurmurHash3 x64-128 and x86-32 (+206) |
| `src/models/partition_key.rs` | **Modified**: PK value hashing and encoding methods (+132/−1) |
| `src/models/partition_key_range.rs` | **New**: Partition key range model (+228) |
| `src/models/range.rs` | **New**: Generic range with boundary semantics (+350) |
| `src/models/service_identity.rs` | **New**: Service identity for direct-mode routing (+114) |

## Issue

Fixes #3999

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
1 parent 7e11672 commit 2ad6e08

18 files changed

Lines changed: 4132 additions & 2 deletions

Cargo.lock

Lines changed: 1 addition & 0 deletions

sdk/cosmos/.cspell.json

Lines changed: 4 additions & 0 deletions
@@ -22,6 +22,7 @@
 "centralindia",
 "centralus",
 "centraluseuap",
+"changefeed",
 "chinaeast",
 "chinanorth",
 "cloneable",
@@ -65,6 +66,7 @@
 "hotfixes",
 "idents",
 "IMDS",
+"inclusivity",
 "japaneast",
 "japanwest",
 "keepalive",
@@ -117,6 +119,7 @@
 "PPCB",
 "purgeable",
 "pushback",
+"qname",
 "qself",
 "RAII",
 "reactivations",
@@ -147,6 +150,7 @@
 "sysinfo",
 "TEAMPROJECTID",
 "testcontainer",
+"testdata",
 "testdb",
 "uaecentral",
 "uaenorth",

sdk/cosmos/azure_data_cosmos_driver/Cargo.toml

Lines changed: 1 addition & 0 deletions
@@ -37,6 +37,7 @@ uuid = { workspace = true, features = ["v4", "fast-rng"] }

 [dev-dependencies]
 azure_identity.workspace = true
+quick-xml.workspace = true
 serde = { workspace = true, features = ["derive"] }
 tokio = { workspace = true, features = [
 "rt-multi-thread",
