[Cosmos] Fix pkranges fetch: use container name for URL, RID for cache key #4047
Merged
tvaron3 merged 6 commits on Mar 27, 2026
The pkranges fetch URL was using the collection RID in a name-based link hierarchy (e.g., dbs/perfdb/colls/<RID>/pkranges), which Cosmos DB rejects with 404 because mixed name/RID addressing is not supported. This fix passes the container name for URL construction (matching how all other SDK operations work) while keeping the RID as the cache key and for the resolved_collection_rid request context.

Changes:
- partition_key_range_cache: add collection_name parameter to try_lookup, get_routing_map_for_collection, resolve_partition_key_range_by_id, and resolve_overlapping_ranges; use .item(name) for URL, RID for cache key
- container_connection: extract container name alongside RID, pass both to pk_range_cache methods
- Updated unit tests to verify name-based URL and RID-based cache key
- Added fault injection integration test for pkrange resolution

Fixes: #4031

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
API Change Check
APIView identified API level changes in this PR and created the following API reviews
Add passthrough status tracking to FaultInjectionRule so spy rules can verify real service responses without injecting faults. Replace error-injection test with spy-rule test that verifies the pkranges request returns 200, proving the URL is correct.

- Add passthrough_statuses field to FaultInjectionRule
- Record response status in FaultClient for true spy rules only
- Normalize collection_resource_id → collection_rid param naming
- Route test through fault client pipeline for correct interception

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pull request overview
Fixes Cosmos partition-key-ranges (pkranges) fetch failures caused by mixed name/RID addressing by using the container name for URL construction while continuing to use the container RID for cache keys and request context.
Changes:
- Updates `PartitionKeyRangeCache` APIs to accept `collection_name` for building name-based pkranges URLs while keeping RID-based cache keys.
- Updates `ContainerConnection` to pass both container name and RID into pk-range cache lookups.
- Extends fault injection to support "spy" passthrough status recording and adds an emulator test to assert the pkranges path is exercised and succeeds.
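The split between "name for the URL, RID for the cache key" can be sketched in a few lines. This is a minimal illustration, not the SDK's implementation: `PkRangeCacheSketch`, `RoutingMap`, `pkranges_link`, and the `fetched_links` log are hypothetical names, the network fetch is replaced by a stub, and the `db_name` parameter is an assumption added to keep the example self-contained.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the SDK's routing map.
#[derive(Debug, Clone, PartialEq)]
pub struct RoutingMap {
    pub range_ids: Vec<String>,
}

/// Builds a fully name-based pkranges link (no RID in the path).
pub fn pkranges_link(db_name: &str, collection_name: &str) -> String {
    format!("dbs/{}/colls/{}/pkranges", db_name, collection_name)
}

/// Hypothetical cache: entries are keyed by collection RID,
/// while fetches are addressed by container name.
#[derive(Default)]
pub struct PkRangeCacheSketch {
    by_rid: HashMap<String, RoutingMap>,
    /// Links that a real client would have requested (stubbed here).
    pub fetched_links: Vec<String>,
}

impl PkRangeCacheSketch {
    pub fn try_lookup(
        &mut self,
        db_name: &str,
        collection_name: &str,
        collection_rid: &str,
    ) -> RoutingMap {
        // Cache hit: keyed by RID, so no request is issued.
        if let Some(map) = self.by_rid.get(collection_rid) {
            return map.clone();
        }
        // Cache miss: address the fetch by name, then store under the RID.
        let link = pkranges_link(db_name, collection_name);
        self.fetched_links.push(link);
        let map = RoutingMap {
            range_ids: vec!["0".to_string()], // stubbed service response
        };
        self.by_rid.insert(collection_rid.to_string(), map.clone());
        map
    }
}
```

In this sketch, two lookups with the same RID produce a single fetch, while a lookup for the same name with a different RID (for example, a container that was deleted and recreated) triggers a fresh fetch. That is one plausible reason to keep the RID as the key.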
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| sdk/cosmos/azure_data_cosmos/src/routing/partition_key_range_cache.rs | Build pkranges links with container name (not RID) while keeping RID as cache key/context; update unit tests accordingly. |
| sdk/cosmos/azure_data_cosmos/src/handler/container_connection.rs | Pass container name + RID into pk-range cache calls during request routing. |
| sdk/cosmos/azure_data_cosmos/src/fault_injection/rule.rs | Add storage/access for passthrough response status codes on matched “spy” rules. |
| sdk/cosmos/azure_data_cosmos/src/fault_injection/http_client.rs | Record real service response status codes for matched passthrough (spy) rules. |
| sdk/cosmos/azure_data_cosmos/tests/emulator_tests/cosmos_fault_injection.rs | Add emulator test covering the pkranges readfeed path via a spy rule and validating it returns 200. |
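The "spy" behavior described for the fault injection files can be illustrated roughly as follows. Only the `passthrough_statuses` field name comes from this PR; `SpyRuleSketch`, `RuleAction`, and `apply` are hypothetical, and the real `FaultInjectionRule` and `FaultClient` may be structured quite differently.

```rust
use std::sync::Mutex;

/// Hypothetical rule behavior: either inject a fault or just observe.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RuleAction {
    /// Replace the service response with this status code.
    InjectStatus(u16),
    /// Spy: let the request through and record the real status.
    Passthrough,
}

/// Hypothetical rule carrying a record of observed statuses.
pub struct SpyRuleSketch {
    action: RuleAction,
    passthrough_statuses: Mutex<Vec<u16>>,
}

impl SpyRuleSketch {
    pub fn new(action: RuleAction) -> Self {
        Self {
            action,
            passthrough_statuses: Mutex::new(Vec::new()),
        }
    }

    /// Given the status the real service returned, yields the status the
    /// caller sees. Statuses are recorded for passthrough rules only.
    pub fn apply(&self, real_status: u16) -> u16 {
        match self.action {
            RuleAction::InjectStatus(code) => code,
            RuleAction::Passthrough => {
                self.passthrough_statuses
                    .lock()
                    .unwrap()
                    .push(real_status);
                real_status
            }
        }
    }

    /// Snapshot of the statuses recorded so far.
    pub fn passthrough_statuses(&self) -> Vec<u16> {
        self.passthrough_statuses.lock().unwrap().clone()
    }
}
```

A test can then assert that the recorded list contains 200 for the pkranges request, which checks the URL against the real service without altering the response.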
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
simorenoh approved these changes on Mar 27, 2026
Summary
Fixes #4031: the pkranges fetch was using the collection RID in a name-based URL hierarchy (`dbs/perfdb/colls/<RID>/pkranges`), which Cosmos DB rejects with 404 because mixed name/RID addressing is not supported.

The previous fix (#4041) corrected URL encoding (`.item()` → `.item_by_rid()`) but did not fix the fundamental mixed-addressing issue. This PR resolves it by passing the container name for URL construction while keeping the RID as the cache key and request context value.

Root Cause
PR #4005 changed `container_connection.rs::send()` to pass `self.container_ref.rid()` to `pk_range_cache.try_lookup()`. The cache used this RID to build the pkranges URL, which produced a mixed link of the form `dbs/perfdb/colls/<RID>/pkranges`.

All other SDK and driver operations use name-based URLs. The pkranges fetch was the only code path using a RID in a name-based link hierarchy.
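One way to see why the mixed link is a bug is to model the two addressing modes as separate variants, so that a name-based database segment cannot be combined with a RID-based collection segment. This is a hypothetical illustration only: `ContainerAddress` is not an SDK type, and the RID-based link shape shown here is an assumption based on the same `dbs/.../colls/...` hierarchy.

```rust
/// Hypothetical model of the two addressing modes. Each variant carries
/// identifiers of one kind only, so a mixed link cannot be built.
pub enum ContainerAddress<'a> {
    ByName {
        db_name: &'a str,
        collection_name: &'a str,
    },
    ByRid {
        db_rid: &'a str,
        collection_rid: &'a str,
    },
}

impl ContainerAddress<'_> {
    /// Builds the pkranges link for this container.
    pub fn pkranges_link(&self) -> String {
        match self {
            ContainerAddress::ByName {
                db_name,
                collection_name,
            } => format!("dbs/{}/colls/{}/pkranges", db_name, collection_name),
            ContainerAddress::ByRid {
                db_rid,
                collection_rid,
            } => format!("dbs/{}/colls/{}/pkranges", db_rid, collection_rid),
        }
    }
}
```

The PR itself takes the simpler route of passing the container name into the cache methods; the enum above only shows the invariant that the fix restores.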
Impact (observed on continuous benchmarks)
`try_lookup` → `Ok(routing_map.ok())`

Changes
`partition_key_range_cache.rs`
- Added `collection_name: &str` parameter to `try_lookup`, `get_routing_map_for_collection`, `resolve_partition_key_range_by_id`, and `resolve_overlapping_ranges`
- `.item_by_rid(collection_rid)` → `.item(collection_name)` for pkranges URL construction
- Cache key remains the RID (`collection_rid.to_string()`)
- `resource_id` on the request remains the RID
- `tracing::warn!` to include both `collection_name` and `collection_rid`

`container_connection.rs`
- `collection_name` from `self.container_ref.name()` alongside existing `collection_rid`
- `collection_name` to all `pk_range_cache` method calls
- `resolved_collection_rid` on request context still uses the RID (unchanged)

`cosmos_fault_injection.rs`
- `fault_injection_pkrange_readfeed_is_exercised` integration test
- `MetadataPartitionKeyRangesReadFeed` with `hit_limit=1`

Test Results