Cosmos: RID-aware resource references and raw-path protocol (1/3)#4640
Cosmos: RID-aware resource references and raw-path protocol (1/3)#4640simorenoh wants to merge 6 commits into
Conversation
38005bf to
f6199a7
Compare
There was a problem hiding this comment.
Pull request overview
This is slice 1 of a 3-PR stack (epic #4637) that re-lands "address Cosmos databases/containers by RID" in small, reviewable pieces. This slice is driver-internal only with no public API surface: it makes ContainerReference RID-capable and implements the wire protocol for RID addressing — RID-based requests are signed over the lowercased RID and sent with the path raw, while name-based paths continue to be percent-encoded. The work feeds the later slices that add RID resolution (slice 2) and the SDK public API (slice 3).
Changes:
ContainerReferencenow carries name-based addressing in an optionalNameAddressinggroup;database_name()/name_based_path()returnOption, newbase_path()/is_by_rid()/new_by_rid()are added, and equality/hash key only on account + container RID.CosmosResourceReferencegains the raw-path / lowercased-RID signing protocol (is_rid_addressed,rid_signing_override,ResourcePaths::is_rid_based,encode_path_segments) plus debug-only mixed-addressing assertions, and the transport pipeline now sends RID paths raw vs. percent-encoding name paths.- Session-token indexing and the container name-cache shim are updated for the
Option-returning accessors; a driver CHANGELOG entry is added under a new0.6.0 (Unreleased)section.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
models/resource_reference.rs |
Makes ContainerReference RID-capable (NameAddressing, new_by_rid, base_path, is_by_rid); database_name/name_based_path now Option; equality/hash on account+container RID; sub-resource links use base_path(); new unit tests. |
models/cosmos_resource_reference.rs |
Adds rid_based to ResourcePaths, RID signing override, is_rid_addressed, encode_path_segments, debug-only mixed-addressing checks, and extensive path/signing tests. |
models/mod.rs |
Exports encode_path_segments to the driver crate. |
driver/pipeline/operation_pipeline.rs |
Sends RID-based paths raw and percent-encodes name-based paths via encode_path_segments. |
driver/routing/session_container.rs |
Keys the session-token index on base_path() so RID- and name-addressed containers index consistently. |
driver/cache/container_cache.rs |
Minimal compile shim for the Option-returning database_name() (uses unwrap_or_default(); test asserts wrapped in Some(...)). |
CHANGELOG.md |
Adds 0.6.0 (Unreleased) Bugs Fixed entry describing RID signing/routing fix. |
A couple of points to highlight for the human reviewer: offers now percent-encode their base64 = padding (previously sent raw via Url::set_path), which by the PR's own RID/is_name_based=false rationale could regress read_offer/replace_offer and is only verifiable against a live/emulator account; and the central raw-vs-encoded pipeline branch lacks a build_transport_request test despite that module's strong path-assertion test conventions. This is the epic's explicitly "highest-risk" protocol-correctness work and touches request signing.
f6199a7 to
9317b7f
Compare
Slice 1 of 3 (stacked, epic #4637) of the Cosmos RID-addressing work. This slice is driver-internal only and has no public API surface. Add RID-capable ContainerReference (new_by_rid, base_path, is_by_rid; database_name/name_based_path now return Option) and the raw-path / lowercased-RID signing protocol in CosmosResourceReference (compute_paths, rid_signing_override, is_rid_addressed, ResourcePaths::is_rid_based, encode_path_segments). build_transport_request sends RID-based paths raw and percent-encodes name-based paths; session_container indexes on base_path. validate_addressing and the CLIENT_MIXED_NAME_RID_ADDRESSING status are intentionally deferred to Slice 2; addressing_conflict/parent_chain_is_rid are debug-only here. A minimal compile shim in container_cache.rs adapts to the new Option-returning database_name. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
9317b7f to
357f0db
Compare
Address review findings on Slice 1 of the RID-addressing series:
- Offers are signed over the lowercased RID (is_name_based=false), so
their /offers/{rid} path must be sent raw. The offer branch produced
rid_based=false, which percent-encoded reserved base64 characters in
the RID and caused 401s for any offer whose RID contained +, /, or =
padding. Flip the offer branch to rid_based=true so offers route raw,
matching main's proven behavior. Verified against a live account: a
raw RID read returns 200 while the same percent-encoded RID returns
401.
- Add build_transport_request tests asserting the final Url::path():
RID-addressed paths (incl. = padding and +) stay raw, name-addressed
paths are percent-encoded, and offers route raw.
- Correct the encode_path_segments doc: RIDs are not URL-safe base64;
they use a variant that maps / to - (so splitting on / is safe) but
can still contain + and = padding, which is why RID paths go raw.
- Add a CHANGELOG Breaking Changes entry for the resource-reference
accessors that now return Option, and note offers now route raw.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The RID test-fixture words (balxjyz, QBAA, QBAOW) were added to the repo-wide .vscode/cspell.json. They are Cosmos-specific, so move them to sdk/cosmos/.cspell.json (which imports the root config) to keep the repo-wide dictionary free of service-specific noise. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Non-functional polish from the Slice 1 re-review (no behavior change): - Rename session_container's name_path helper to index_path: it now returns base_path(), which is the RID-based path for RID-addressed containers, so name_path was misleading. - Reword is_rid_addressed doc to state only the guaranteed direction of the routing/signing invariant (RID-signed implies raw-routed); the converse does not hold, so the two helpers must not be derived from each other. - Call debug_assert_addressing_consistent in compute_feed_paths for symmetry with compute_paths (debug-only). - Fix an encode_path_segments doc remnant that said the function splits on '/'; it iterates bytes and special-cases '/'. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Document the latent by-name cache-key collision at the Slice-1/2 shim: database_name() is None for RID-addressed containers, so unwrap_or_default collapses the key to {endpoint, empty, name}. Harmless today (nothing emits RID references yet); Slice 2's resolution layer must handle it before caching RID-addressed containers.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pilchie
left a comment
There was a problem hiding this comment.
LGTM, but would like another set of 👀 on it too.
Slice 1 of 3 (stacked) — epic #4637
This is the bottom of a 3-PR stack that re-lands the validated "address Cosmos databases/containers by RID" change as small, reviewable slices. Slice 1 is driver-internal only and has no public API surface. It is fully unit-tested.
What's in this slice
Driver references + the wire protocol for RID addressing:
models/resource_reference.rs—ContainerReferencebecomes RID-capable. Newnew_by_ridconstructor andbase_path()/is_by_rid()accessors;database_name()andname_based_path()now returnOption(absent when the container was resolved purely by RID). Equality/hash key on account + container RID so name- and RID-resolved references for the same physical container collapse to one key.models/cosmos_resource_reference.rs— the raw-path / lowercased-RID signing protocol:compute_pathsraw-path handling,rid_signing_override,is_rid_addressed,ResourcePaths::is_rid_based,encode_path_segments, plusaddressing_conflict/debug_assert_addressing_consistent(debug-only) and the full path/signing/round-trip unit tests.models/mod.rs— exportencode_path_segmentsto the driver crate.driver/pipeline/operation_pipeline.rs—build_transport_requestsends RID-based paths raw and percent-encodes name-based paths (encoding the=padding of a base64 RID would make the gateway treat it as a name and reject the RID-based signature).driver/routing/session_container.rs— session-token index keys onbase_path()so RID- and name-addressed containers index consistently.Deliberately deferred to later slices
validate_addressing+ theCLIENT_MIXED_NAME_RID_ADDRESSINGstatus → Slice 2 (resolution). Because of that,addressing_conflictandparent_chain_is_ridare#[cfg(debug_assertions)]here (debug-only consumers only) and will be un-gated by Slice 2.Deviations to keep this slice self-contained
driver/cache/container_cache.rs:database_name()now returnsOption, so the name-key builder usesunwrap_or_default()and two test asserts wrap inSome(...). No Slice-2 RID-cache logic is pulled in; behavior is unchanged because Slice 1 never produces a RID-addressedContainerReferencein any real flow.resolve_container_by_rid) in a doc comment is reduced to plain prose to avoid a broken link.Validation
cargo fmt,cargo build --all-features(debug + release),cargo clippy --all-features --all-targets— all clean, zero warnings.cargo test --all-features --lib— 1998 passed, 0 failed, 64 ignored.Slices 2 (resolution) and 3 (SDK public API) build on top of this.