[DO NOT REVIEW] Cosmos: Support addressing databases and containers by RID#4613
[DO NOT REVIEW] Cosmos: Support addressing databases and containers by RID#4613simorenoh wants to merge 11 commits into
Conversation
Adds the ability to obtain `DatabaseClient` and `ContainerClient` instances using a Cosmos resource ID (RID) in addition to a name, per issue #4513. Public crate (`azure_data_cosmos`): - New `ResourceId` newtype and `#[non_exhaustive]` `ResourceIdentity` enum (owned, no lifetime parameter). `CosmosClient::database_client` and `DatabaseClient::container_client` now take `impl Into<ResourceIdentity>`, so a `&str`/`String` selects name addressing and a `ResourceId` selects RID addressing. - `DatabaseClient` stores the identity, exposes `identity()`/`name()`/ `rid()`, and skips the extra database read when resolving throughput for a RID-addressed database. - `ContainerClient::new` enforces matching addressing modes (no mixing name and RID) and rejects a container RID whose derived parent database does not match the addressed database. Driver crate (`azure_data_cosmos_driver`): - `ContainerReference` is now addressing-mode aware: name-based fields are grouped behind an optional `NameAddressing`, equality/hashing key on account + container RID only, and `base_path()` selects the name- or RID-based path so item/sub-resource links keep name-based leaves over a RID-based container path. - Added `resolve_container_by_rid`/`fetch_container_by_rid` (deriving the parent database RID from the container RID's bytes and validating container-level RID length), `read_container_by_rid`, a RID-aware container cache, and the `CLIENT_INVALID_RESOURCE_ID` / `CLIENT_MIXED_NAME_RID_ADDRESSING` statuses. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Cold-cache resolution by RID (e.g. container_client(rid)) failed against real Cosmos with 401 Unauthorized because base64 RIDs were placed raw in the request URL path. The gateway canonicalizes the path, derives a resource id different from the one we signed, and rejects the request. Encode the reserved characters in each path segment (notably the = padding in RIDs) when building the request URL, while the authorization signature continues to use the raw resource link via the same paths buffer. Separators and unreserved characters are preserved; the helper borrows its input when no encoding is needed. Verified live: cold container_client(rid) resolve and container.read() by RID now return 200 (previously 401). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
RID-addressed data-plane requests were always signed over the full name-style resource link and had their RID percent-encoded in the URL, so the gateway rejected them (401 for database/container reads by RID). Match the service's `is_name_based = false` rule instead: when a reference is RID-addressed, sign over the lowercased resource RID (the leaf for point reads, the parent for feeds) and send the RID raw in the request path. Name-addressed paths keep full-link signing and percent-encoding. - cosmos_resource_reference.rs: add `ResourcePaths.rid_based` plus a `signing_override` populated for RID resources via a new `rid_signing_override` helper; `is_rid_addressed` drives raw paths. Generalizes the existing offers signing-override pattern. - operation_pipeline.rs: skip `encode_path_segments` for RID paths. - Supersedes the percent-encoding band-aid; `encode_path_segments` is now name-path-only. Validated live (db/container read by RID, query_containers under a RID db, read_throughput by RID, query_items single- and cross-partition on a RID container) plus name-addressed controls. Item point-ops by item RID remain a follow-up. Fixes #4513 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
⚠️ Not ready to approve
There are confirmed correctness/compilation blockers in the updated signing logic and a non-compiling Send assertion that must be fixed before approval.
Pull request overview
This PR adds first-class support for addressing Cosmos databases/containers by RID (in addition to name) in the azure_data_cosmos public API, and updates the azure_data_cosmos_driver request-path + signing logic so RID-addressed operations authenticate and route correctly.
Changes:
- Introduces
ResourceId+ResourceIdentityand updatesCosmosClient::database_client/DatabaseClient::container_clientto accept either name or RID identity. - Updates driver resource reference/path computation to sign RID-addressed requests over the lowercased RID and to send RID paths raw (while still percent-encoding name paths).
- Adds RID container resolution path (
resolve_container_by_rid) plus related caching/keying adjustments.
File summaries
| File | Description |
|---|---|
| sdk/cosmos/azure_data_cosmos/tests/framework/test_client.rs | Adjusts tests to pass container/database identifiers as &str to match the new Into<ResourceIdentity> API. |
| sdk/cosmos/azure_data_cosmos/tests/emulator_tests/cosmos_containers.rs | Updates emulator container CRUD test to use &str for container identity. |
| sdk/cosmos/azure_data_cosmos/src/resource_identity.rs | Adds the new public ResourceId + ResourceIdentity types and unit tests for conversions. |
| sdk/cosmos/azure_data_cosmos/src/lib.rs | Wires the new module and re-exports ResourceId / ResourceIdentity. |
| sdk/cosmos/azure_data_cosmos/src/clients/database_client.rs | Switches DatabaseClient to store ResourceIdentity, updates container_client signature, and adds identity accessors + RID shortcut for throughput lookups. |
| sdk/cosmos/azure_data_cosmos/src/clients/cosmos_client.rs | Updates database_client to accept impl Into<ResourceIdentity> and documents name vs RID usage. |
| sdk/cosmos/azure_data_cosmos/src/clients/container_client.rs | Enforces no-mix name/RID addressing when constructing a ContainerClient and adds RID resolution path. |
| sdk/cosmos/azure_data_cosmos/CHANGELOG.md | Documents the new identity API and the RID signing/path routing fix. |
| sdk/cosmos/azure_data_cosmos_driver/src/models/resource_reference.rs | Makes ContainerReference optionally carry name-addressing data, adds base_path(), and updates equality/hash semantics for cache keying. |
| sdk/cosmos/azure_data_cosmos_driver/src/models/mod.rs | Re-exports encode_path_segments for use by the pipeline. |
| sdk/cosmos/azure_data_cosmos_driver/src/models/cosmos_resource_reference.rs | Generalizes RID signing overrides and tracks whether paths must be sent raw vs percent-encoded. |
| sdk/cosmos/azure_data_cosmos_driver/src/models/cosmos_operation.rs | Adds a read_container_by_rid factory method. |
| sdk/cosmos/azure_data_cosmos_driver/src/error/cosmos_status.rs | Adds client-side sub-status codes for invalid RIDs and mixed name/RID addressing. |
| sdk/cosmos/azure_data_cosmos_driver/src/driver/routing/session_container.rs | Switches session-token keying to use ContainerReference::base_path(). |
| sdk/cosmos/azure_data_cosmos_driver/src/driver/pipeline/operation_pipeline.rs | Applies percent-encoding only for name-based paths; sends RID paths raw. |
| sdk/cosmos/azure_data_cosmos_driver/src/driver/cosmos_driver.rs | Implements container metadata fetch/resolve by RID (including decoding/validation). |
| sdk/cosmos/azure_data_cosmos_driver/src/driver/cache/container_cache.rs | Adds by-RID fetch path and avoids populating the by-name index for RID-only references. |
| sdk/cosmos/azure_data_cosmos_driver/CHANGELOG.md | Documents the RID signing + raw-path routing bug fix. |
Copilot's findings
- Files reviewed: 18/18 changed files
- Comments generated: 4
Note
Your feedback helps us improve the quality of this feature.
Please use 👍 or 👎 to tell us whether this assessment is correct.
| async fn resource_id(&self) -> crate::Result<String> { | ||
| if let Some(rid) = self.rid() { | ||
| return Ok(rid.as_str().to_owned()); | ||
| } | ||
| let db = self.read(None).await?.into_model()?; | ||
| Ok(db | ||
| .system_properties | ||
| .resource_id | ||
| .expect("service should always return a '_rid' for a database")) | ||
| } |
| fn assert_send<T: Send>(_: T) {} | ||
| let client: &DatabaseClient = todo!(); | ||
| assert_send(client.container_client(todo!())); | ||
| assert_send(client.container_client(todo!() as ResourceIdentity)); | ||
| assert_send(client.read(todo!())); | ||
| assert_send(client.query_containers(Query::from("SELECT * FROM c"), todo!())); |
There was a problem hiding this comment.
It should be a valid cast, it'll just fail at runtime (and this is just testing compile-time conditions), but we should use something explicit rather than as ResourceIdentity.
| /// Returns the lowercased RID to sign against when this reference is | ||
| /// RID-addressed, or `None` when it is name-addressed (in which case the | ||
| /// full, case-preserved resource link is used for signing). | ||
| /// | ||
| /// For RID-based requests the Cosmos master-key signature is computed over | ||
| /// the lowercased RID of the **signing resource** only — the leaf for point | ||
| /// operations and the parent for feed operations — not the full resource | ||
| /// link. This mirrors the `is_name_based = false` path in the service SDKs. | ||
| /// | ||
| /// `is_feed` selects the parent (feed) vs. leaf (point op) signing resource. | ||
| fn rid_signing_override(&self, is_feed: bool) -> Option<String> { | ||
| if is_feed { | ||
| // Feed/parent-signed: parent is the account (no RID), the database, | ||
| // or the container, depending on the child resource type. | ||
| match self.resource_type { | ||
| ResourceType::DocumentCollection => self | ||
| .database | ||
| .as_ref() | ||
| .and_then(|db| db.rid()) | ||
| .map(str::to_lowercase), |
| /// Reads a container's properties by database and container RID. | ||
| /// | ||
| /// Like [`read_container_by_name`](Self::read_container_by_name) but addresses | ||
| /// the container by RID. `database` must itself be a RID-based | ||
| /// [`DatabaseReference`] so the request path is fully RID-based | ||
| /// (`/dbs/{db_rid}/colls/{container_rid}`). | ||
| pub fn read_container_by_rid( | ||
| database: DatabaseReference, | ||
| container_rid: impl Into<std::borrow::Cow<'static, str>>, | ||
| ) -> Self { | ||
| let resource_ref: CosmosResourceReference = CosmosResourceReference::from(database) | ||
| .with_resource_type(ResourceType::DocumentCollection) | ||
| .with_rid(container_rid.into()); | ||
| Self::new(OperationType::Read, resource_ref, None) | ||
| } |
There was a problem hiding this comment.
Since this is an infallible API, I think Copilot has it right here. We should either make this API fallible and have it fail if it's given a non-RID database reference, or change the API to stop that. The debug assert is good, but we need to consider what happens in production.
There was a problem hiding this comment.
hmm I didn't consider this since I figured the network request for resolution would fail regardless. will reinforce this _rid path - my take is changing the signature to include account reference so that the factory builds the RID-based DatabaseReference itself from raw db_rid/container_rid strings, instead of trusting the caller to hand in a correctly-addressed one. main concern with this was the asymmetry on signatures but the rid-path is meant to be used as a fallback from name-based mostly so should be ok in my opinion
Tightens two latent issues surfaced in PR review (no behavior change for existing call sites): - Unify the "is this request RID-addressed?" decision. `is_rid_addressed()` now also returns true when the leaf `id` carries a RID, so the raw-path routing can never diverge from `rid_signing_override` (which signs over the leaf RID). A reference signed RID-based is now always routed raw, preventing a latent signed-RID/name-encoded-path mismatch (opaque 401). Adds a `debug_assert!` in `read_container_by_rid` requiring a RID-based `DatabaseReference`, plus a regression test. - Replace the `.expect()` panic in `DatabaseClient::resource_id()` with a typed `CosmosError`. Adds the `SERVICE_RETURNED_DATABASE_WITHOUT_RID` status (500 / 20306), mirroring `SERVICE_RETURNED_OFFER_WITHOUT_ID`, so a broken-server-invariant response fails cleanly instead of panicking. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Integrate the Public API Reorganization (#4512) and runtime/options restructure (#4588) from main with the RID-addressing work on this branch. Conflict resolutions: - lib.rs / cosmos_client.rs / container_client.rs: adopt main's reorganized module layout and import grouping, re-adding the RID surface (resource_identity module, ResourceId/ResourceIdentity, ResourceIdentity imports). - database_client.rs: keep the RID-aware resource_id() short-circuit; route throughput methods through it and delegate the missing-_rid error path to main's resource_id_or_error helper. - cosmos_status.rs: drop the duplicate 20306 status; main's generic SERVICE_RETURNED_OBJECT_WITHOUT_RID is now canonical. - cosmos_driver.rs: keep fetch_container_by_rid; adopt main's fallible CosmosDriver::new signature. - CHANGELOGs: combine both crates' Unreleased entries. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add a Breaking Changes entry for azure_data_cosmos noting that database_client/container_client moved from &str to impl Into<ResourceIdentity>, a compile-time-only source change affecting deref-coercion call sites. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
analogrelay
left a comment
There was a problem hiding this comment.
Blocking stuff is minor. Some non-blocking thoughts on pushing ResourceIdentity down.
| /// Under the no-mix addressing rule a RID database implies a RID container, | ||
| /// so checking the container and database covers database, container, and | ||
| /// item operations. The leaf `id` is also checked so the raw-path decision | ||
| /// can never diverge from [`rid_signing_override`](Self::rid_signing_override), | ||
| /// which signs over the leaf RID when `id` carries one: a request signed as | ||
| /// RID-based must always be routed RID-based (raw), otherwise the gateway | ||
| /// rejects it with an opaque `401`. Offers are handled separately and are | ||
| /// not considered RID-addressed for path-encoding purposes. |
There was a problem hiding this comment.
Can we debug_assert! this somewhere? That if a reference is created with RIDs, all the pieces are RIDs and if it's created with Names, all the pieces are Names?
| /// Reads a container's properties by database and container RID. | ||
| /// | ||
| /// Like [`read_container_by_name`](Self::read_container_by_name) but addresses | ||
| /// the container by RID. `database` must itself be a RID-based | ||
| /// [`DatabaseReference`] so the request path is fully RID-based | ||
| /// (`/dbs/{db_rid}/colls/{container_rid}`). | ||
| pub fn read_container_by_rid( | ||
| database: DatabaseReference, | ||
| container_rid: impl Into<std::borrow::Cow<'static, str>>, | ||
| ) -> Self { | ||
| let resource_ref: CosmosResourceReference = CosmosResourceReference::from(database) | ||
| .with_resource_type(ResourceType::DocumentCollection) | ||
| .with_rid(container_rid.into()); | ||
| Self::new(OperationType::Read, resource_ref, None) | ||
| } |
There was a problem hiding this comment.
Since this is an infallible API, I think Copilot has it right here. We should either make this API fallible and have it fail if it's given a non-RID database reference, or change the API to stop that. The debug assert is good, but we need to consider what happens in production.
|
|
||
| ### Bugs Fixed | ||
|
|
||
| - RID-addressed data-plane requests are now authenticated correctly. The driver signs RID-addressed databases, containers, and their feeds/children over the lowercased resource RID (matching the service's `is_name_based = false` rule) and sends the RID raw in the request URL instead of percent-encoding it. Previously the driver always signed the full name-style resource link and percent-encoded the RID, which made the gateway reject RID reads (e.g. `401` for a database or container read by RID). Reading a database or container by RID, listing/querying containers under a RID database, and querying items, reading throughput, and creating items on a RID-addressed container now work end-to-end. ([#4513](https://github.com/Azure/azure-sdk-for-rust/issues/4513)) |
There was a problem hiding this comment.
CHANGELOG entries should reference the PR, not the issue
| fn assert_send<T: Send>(_: T) {} | ||
| let client: &DatabaseClient = todo!(); | ||
| assert_send(client.container_client(todo!())); | ||
| assert_send(client.container_client(todo!() as ResourceIdentity)); | ||
| assert_send(client.read(todo!())); | ||
| assert_send(client.query_containers(Query::from("SELECT * FROM c"), todo!())); |
There was a problem hiding this comment.
It should be a valid cast, it'll just fail at runtime (and this is just testing compile-time conditions), but we should use something explicit rather than as ResourceIdentity.
| /// Returns the identifier of the Cosmos database. | ||
| /// Returns the identifier used to construct this client: the database name | ||
| /// when addressed by name, or the RID string when addressed by RID. | ||
| pub fn id(&self) -> &str { |
There was a problem hiding this comment.
If this could be a name or a RID, we should return &ResourceIdentity.
| } | ||
|
|
||
| /// Returns the identity (name or RID) used to construct this client. | ||
| pub fn identity(&self) -> &ResourceIdentity { |
There was a problem hiding this comment.
Remove this and use id to return the ResourceIdentity reference.
We can break this API pre-1.0, especially since name() gives the same results as the old id().
| /// allocates an owned copy. | ||
| #[derive(Clone, Debug, PartialEq, Eq, Hash)] | ||
| #[non_exhaustive] | ||
| pub enum ResourceIdentity { |
There was a problem hiding this comment.
Maybe this belongs in the driver, and DatabaseReference (et al) can accept it instead of having separate _from_name and _from_rid constructors.
There was a problem hiding this comment.
That would also allow you to have one API in the driver resolve_container instead of two variants.
…sing A4/A5: DatabaseClient::id() now returns &ResourceIdentity and the redundant identity() accessor is removed. Add From<&ResourceIdentity> for ResourceIdentity so round-tripping a client's id() back into database_client/container_client preserves the addressing mode (previously a RID id() round-tripped as a name). A3/C2: replace the 'todo!() as ResourceIdentity' cast in the Send assertion with an explicit typed binding. A1: add a debug-only addressing-consistency assert in CosmosResourceReference::compute_paths that rejects a database/container parent chain (and database/container leaf id) mixing name and RID addressing. Item/sub-resource leaf ids stay exempt. Split the prior regression test into a release-mode safety-net check and a should_panic debug check. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Per repo convention, CHANGELOG entries link the PR (#4613), not the issue. Also corrects the feature entry to describe id() returning the ResourceIdentity (the identity() accessor was removed) and adds the missing PR link to the driver RID signing bug-fix entry. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Make mixed name/RID addressing unrepresentable and reject it at the driver boundary, not just the SDK layer. read_container_by_rid now takes (account, db_rid, container_rid) and builds the RID-based DatabaseReference itself, so a name-addressed parent can no longer be passed; the sole caller already decodes db_rid, so the call site is simpler and the debug_assert is gone. CosmosResourceReference gains validate_addressing(), a release-mode counterpart to the debug-time addressing assertion. CosmosDriver::execute_operation calls it once per operation, turning a mixed reference into a deterministic CLIENT_MIXED_NAME_RID_ADDRESSING error in every build instead of an opaque gateway 401. The check is an in-memory field comparison and a no-op for consistent references, so request flow and network calls are unchanged. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Resolve CHANGELOG conflicts by relocating the #4613 RID entries from the now-released 0.36.0/0.5.0 sections into the current 0.37.0/0.6.0 Unreleased sections, keeping main's released bug-fix entries in place. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Cover the full RID-addressing flow against the emulator: create a database and container by name, capture their service-assigned _rids, then re-address both purely by RID and verify container read, read_throughput, item create/read, single- and cross-partition item queries, and query_containers all resolve to the same resources. Add negative cases asserting that mixing name and RID addressing is rejected (CLIENT_MIXED_NAME_RID_ADDRESSING) and that a container RID from another database is rejected (CLIENT_INVALID_RESOURCE_ID). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Summary
Adds the ability to address Azure Cosmos databases and containers by resource ID (RID) in addition to name, and fixes the driver so RID-addressed requests actually authenticate against the gateway.
Fixes #4513
What changed
Public API (
azure_data_cosmos)ResourceIdnewtype andResourceIdentityenum.database_client(...)andcontainer_client(...)acceptimpl Into<ResourceIdentity>, so a plain&str/Stringselects by name andResourceId::from(...)selects by RID.Driver protocol fix (
azure_data_cosmos_driver)The earlier approach signed every request over the full name-style resource link and percent-encoded the RID in the URL, so RID reads were rejected with
401 Unauthorized. This PR matches the service'sis_name_based = falserule:signing_override(ResourcePaths.rid_based+rid_signing_override/is_rid_addressedhelpers, generalizing the existing offers pattern).encode_path_segmentsis now name-path-only).validate_addressing) in addition to the debug assertion, so an inconsistent reference can never reach the wire.Validation
cargo fmt+cargo clippy --all-features --all-targetsclean on both crates.main.query_containersunder a RID database,read_throughputby RID,query_itemssingle- and cross-partition by RID, plus name-addressed controls.Out of scope (follow-up)
Item point-operations (
read_item/replace_item/delete_item/patch_item) by item_ridon RID-addressed containers — all SDK item methods address items by name today. By design we add no client-side guard for the mixed shape and let the service reject it. Tracked in #4612.