Cosmos: Add ContainerClient::patch_item via driver-side RMW#4386
Open
NaluTripician wants to merge 19 commits into
Open
Cosmos: Add ContainerClient::patch_item via driver-side RMW#4386NaluTripician wants to merge 19 commits into
NaluTripician wants to merge 19 commits into
Conversation
Logic Spren stages per-work-item handoff JSONs and a handoff README under .coding-harness/ in the working tree as part of the Phase 1 -> Phase 2 bridge to the Coding Agent Harness. These files are intentionally local to the implementer's checkout and must never enter the PR diff. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Implement PATCH driver-side as a Read-Modify-Write loop behind a new `OperationType::Patch` variant. The handler is the only code path in the driver allowed to deserialize a data plane response body — every other pipeline stage continues to treat payloads as opaque `Vec<u8>`. Added: - `PatchOp`, `IncrValue`, `PatchSpec` model types (add/set/replace/ remove/increment/move ops). `IncrValue` preserves i64 fidelity. - `CosmosOperation::patch_item()` + `with_patch_max_attempts()` setter so SDK wrappers can cap the RMW loop (default 5). - `driver::pipeline::patch_eval` — RFC 6901 JSON Pointer evaluator hand-rolled to avoid an external dependency (~440 LOC + 19 unit tests). - `driver::pipeline::patch_handler` — RMW loop with partition-key path pre-flight validation, ETag capture, `If-Match`-guarded Replace, and retry on 412 PreconditionFailed. - `driver::pipeline::from_local_body` — single helper that synthesizes the final `CosmosResponse` from the locally-merged body plus the Replace's transport metadata. - `CosmosDriver::execute_operation` early-dispatches PATCH ops via `Box::pin` (the handler re-enters `execute_operation` for its internal Read/Replace pair). Docs: - `ARCHITECTURE.md` documents the body-opacity exception scoped to the patch handler. - `docs/PATCH_HANDLER_SPEC.md` describes the full algorithm, error surface, and invariants. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add the idiomatic SDK wrapper around `OperationType::Patch`. Mirrors the shape of `replace_item` but accepts a `PatchSpec` (a list of add/set/replace/remove/increment/move ops) and a dedicated `PatchItemOptions` so callers can cap the driver's RMW retry budget. Notes: - `patch_item` never exposes `Precondition` — the handler manages `If-Match` internally. Only `session_token` and `operation` are plumbed through `apply_item_options`. - `PatchSpec`, `PatchOp`, `IncrValue` are re-exported at the crate root (and through `models`) so callers don't have to reach into the driver crate. - `PatchItemOptions::max_attempts` is `Option<NonZeroU8>` to avoid the `u8 == 0` footgun. - README hero example shows a basic `set` + `increment` PATCH. - CHANGELOG entry under 0.34.0 Features Added. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Stub out emulator tests for scenarios A5/A6/A7/A8/A9/A12/A13 from the original RMW spec. All are `#[ignore]`'d because the driver test harness does not yet expose a high-level `patch_item` helper; wiring them up is tracked in the harness's implementation-state manifest as follow-up work. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Extends DriverTestClient with two helpers needed by the Phase 4 PATCH emulator tests: * `create_container_with_pk_paths` lets a test create a hierarchical partition-key container (MultiHash). The pre-existing `create_container` is now a thin wrapper that delegates with a single PK path. * `patch_item` mirrors `read_item` but builds the operation body from a `PatchSpec` and forwards an optional `patch_max_attempts` override. The synthetic response from the RMW handler is returned verbatim so callers can assert on the locally-merged post-image. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Rewrites driver_patch.rs from Phase 3 `Ok(())` stubs into real emulator-backed bodies for the canonical Logic Spren A-mapping: * **A5** `cosmos_patch_basic_set` -- Set(/deleted, true) round-trips through the RMW handler and is observed in both the synthetic patch response and a follow-up read. * **A6** `cosmos_patch_pk_guard` + `cosmos_patch_pk_guard_hierarchical` -- preflight rejects Set/Replace/Remove/Add/Move on partition-key paths for both flat and hierarchical containers, never issuing a network call. * **A9** `cosmos_patch_compare` -- iterates a 34-row `PatchCompareCase` catalog (traceable to .NET ItemPatchSuccessTest / ItemPatchFailureTest / PatchOperationTests, Java itemPatchSuccessForNullValue, plus rust-derived RFC 6901/6902 + i64 fidelity rows) and asserts each post-image or error substring. A7/A8/A12/A13 stay `#[ignore]` with explicit TODOs citing the missing fault-injection primitive (412 fault type, response-header strip, EndToEndOperationLatencyPolicy plumbing). The coverage rollup lives in `cosmos_patch_compare_coverage.md`. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Adds the `_assert_futures_are_send` compile-time backstop used elsewhere in the cosmos crates so future regressions that accidentally drop the `Send` bound on the `patch_item` future (e.g. by capturing an `Rc` or `RefCell` across an `.await`) fail at `cargo build` time instead of surfacing as spawn-time runtime errors in caller code. The body is never executed -- the `todo!()` placeholders are unreachable. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replaces the placeholder `cosmos-patch-ref` plan with a concrete contract for the deferred .NET PATCH oracle: * stdin/stdout JSON schema (compatible with `PatchCompareCase` rows), * error-kind taxonomy required for cross-SDK matching, * harness wiring (`tokio::process`, `COSMOS_PATCH_REF_BIN` env, Test-Setup.ps1 append-only), * acceptance checklist for the future PR that lands the binary. The .NET helper itself remains deferred per .coding-harness/ implementation-state.json; only the README ships. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Address Phase 5 review findings F1, F2, F10 on PR Azure#4386. F1 — typed PATCH return body: ContainerClient::patch_item is now generic over the response model (`patch_item<T>(...) -> ItemResponse<T>`), matching the read_item<T> pattern. The T: DeserializeOwned bound only kicks in at .into_model(), so callers that don't care about the body stay zero-cost. The rustdoc hero example and the README example were rewritten to deserialize the synthesized post-image into a user struct via .into_model(). A new `# Response Body` section calls out that the body is always synthesized locally — the SDK does not rely on the server's `content_response_on_write` setting. F2 — session-token RMW (SE-004 mitigation): The PATCH RMW loop now (a) captures the caller's session token once before the loop and threads it through the internal Read so the read sees a session-consistent view, then (b) overrides that token with the Read response's session token on the internal Replace so the write commits against the same replica we just read from. The Read also forwards the caller's OperationOptions (consistency level, latency budget, throttling settings) instead of defaulting them, matching what an explicit Read by the caller would have done. The sub-op construction was extracted into `build_read_sub_op` / `build_replace_sub_op` and is covered by four new unit tests. F10 — exhaustion error chains the last 412: After `max_attempts` 412s, patch_item now returns `ErrorKind::HttpResponse { status: PreconditionFailed, .. }` with a custom message (`"patch_item: ETag conflict after N attempts"`) and `Error::with_error` chains the per-attempt 412 as the source so callers can downcast or trace it. Validated locally: cargo fmt --check, cargo build, cargo clippy `-D warnings` (lib + tests), cargo test `-p azure_data_cosmos_driver --lib` — 794 passed (baseline 790 + 4 new patch_handler tests). Refs PR Azure#4386 review feedback (iteration 1). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Address Phase 5 review findings F4, F5, F6, F7 on PR Azure#4386. F4 — token-level descendant check in `move_op`: Compare parsed JSON-Pointer tokens instead of raw byte prefixes when rejecting `move` into a descendant. The byte-prefix form happens to be correct for canonical RFC 6901 encodings (only '/' and '~' are escaped), but it relied on a fragile trailing-'/' guard. The new token-slice compare makes the invariant explicit and immune to any future encoding edge cases. F5 — atomic `move_op` (clone-stage-commit): The previous `move_op` ran `remove` directly on the live document before `add_or_set`. If the destination's parent was missing, the source was already gone and the caller observed a partial state. We now stage both steps on a clone and only commit `*doc = staged` on success — a failed move leaves the document unchanged. F6 — `#[non_exhaustive]` on `PatchOp` and `IncrValue`: Both enums are pre-1.0 and may need to grow (e.g. when Cosmos ships the conditional-PATCH server feature, or for new increment variants). Marking them `#[non_exhaustive]` reserves forward-compatible headroom without affecting internal exhaustive matches. F7 — `Set` on a numeric array index replaces in place: Backend Cosmos PATCH semantics: "set is similar to add… in an array, Set replaces an existing element value". The previous `add_or_set` treated Set identically to Add for arrays (insert+shift). We now branch: numeric-token Set ⇒ `arr[idx] = value` (no shift, fails on out-of-range), trailing-'-' Set ⇒ append (matches Add), Add ⇒ insert+shift as before. Three new unit tests cover the matrix (replace, append, out-of-range). 8 new patch_eval unit tests (27 total, was 19) + 0 driver lib regressions. Validated locally: cargo fmt, cargo build, cargo clippy `-D warnings` (lib + tests), cargo test `-p azure_data_cosmos_driver --lib` — 800 passed. Refs PR Azure#4386 review feedback (iteration 1). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Address Phase 5 review findings F8, F12, F13 (and document the Q3 / F3 override + F11 deferral in the harness state). F8 — # Latency rustdoc on PatchItemOptions: Calls out the 2× single-RTT floor (PATCH is at minimum Read+Replace) and recommends sizing the end-to-end latency budget at >= 2x the p99 budget for a plain Replace, plus headroom for 412 retries. Without this guidance a too-tight budget can cancel the RMW between the Read and the Replace and produce a timeout error even on a healthy service. F12 — unimplemented!() in deferred emulator test bodies: The four deferred PATCH emulator tests (cosmos_patch_412_retry, cosmos_patch_412_exhaustion, cosmos_patch_no_etag_returns_error, cosmos_patch_e2e_latency_budget) previously returned Ok(()) — which would silently certify behavior we have not exercised the moment someone removes the `#[ignore]`. They now call unimplemented!() with citations to the matching `implementation-state.json#known_issues` rows (fault-injection-412, fault-injection-header-strip, e2e-latency-policy). The `#[ignore]` attributes are unchanged. F13 — TODO(A9-deferred) inline in cosmos_patch_compare: The A9 fixture catalog's expected_props are hand-curated against the documented Cosmos PATCH semantics. A parallel run against the .NET SDK would catch silent semantic drift. Added a TODO(A9-deferred) comment immediately above `let cases = fixtures();` anchoring the deferred work to the exact site that needs the swap. Complements the broader contract at `tests/tools/cosmos-patch-ref/README.md`. Q3 / F3 override and F11 deferral are recorded in the harness state (`.coding-harness/implementation-state.json`, gitignored) — see `.coding-harness/feedback-response-1.json` for the full disposition table. Validated locally: cargo fmt --check, cargo build, cargo clippy `-D warnings` (lib + tests), cargo test `-p azure_data_cosmos_driver --lib` — 800 passed, `cargo test -p azure_data_cosmos --lib` — 133 passed. Refs PR Azure#4386 review feedback (iteration 1). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
F14: `parse_array_index(..., allow_append=false)` previously let `idx=0`
slip through on an empty array (because `len.saturating_sub(1) = 0`),
which made `Set('/xs/0', v)` and `Replace('/xs/0', v)` against `{xs: []}`
panic at the subsequent `arr[idx] = v`. Both call sites are reachable
from `ContainerClient::patch_item`, so the panic surfaces to user code
as a tokio JoinError. Guard the empty-array case explicitly and return
the typed `ArrayIndexOutOfRange` instead.
F15: `read_status_error` returned `ErrorKind::Other` for any non-success
status surfacing from the internal Read sub-op (and from the non-412
Replace path), while iter-1's F10 fix typed the 412 path as
`ErrorKind::HttpResponse`. Callers could downcast a 412 via
`error.http_status()` but had to string-match on a 404 / 429 / 5xx.
Mirror F10 — use `ErrorKind::HttpResponse { status, .. }` here too so
the error contract is symmetric across all PATCH failure modes.
F16: the iter-1 helper-level test `replace_sub_op_uses_read_response_session_token`
asserted operation type, body, and session_token but not the If-Match
precondition. A future refactor that silently dropped
`.with_precondition(Precondition::if_match(etag))` would have downgraded
the RMW to a non-conditional Replace, defeating R3-DRIVER's ETag guard.
Add the assertion via the existing `op.precondition()` accessor.
Tests:
* +4 patch_eval tests (Set/Replace on `/xs/0` against `{xs: []}` return
ArrayIndexOutOfRange; Add on `/xs/0` and `/xs/-` against `{xs: []}`
still succeed).
* +1 patch_handler test (`read_status_error` returns
`ErrorKind::HttpResponse { status: NotFound, .. }`).
* +1 assertion on the existing F16 test.
Driver lib tests: 805 passed (was 800). SDK lib tests: 133 passed.
All canonical commands (fmt, build, clippy, test for both crates) green.
Addresses review-feedback-2.json findings: F14, F15, F16
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
ADO Build Analyze stage flagged 3 unknown words and 4 relative links to source files. None of these are real bugs but the gates are blocking. CSpell: - .gitignore: drop "Spren" from the harness comment (irrelevant detail). - driver_patch.rs: change "nonexistentparent" needle to "nonExistentParent" so cspell tokenizes it as non+Existent+Parent. The needle match is case-insensitive (cosmos_patch_compare lowercases both sides at line 825) so this preserves the assertion semantics. - tests/tools/cosmos-patch-ref/README.md: canonicalising -> canonicalizing (cspell uses en-US). Link verify: - PATCH_HANDLER_SPEC.md: two markdown links pointed at .rs files. The Azure SDK link guidance disallows relative links outside .github/** and eng/** and the link verifier flags them. Replace with inline code paths that still let readers locate the file. - cosmos_patch_compare_coverage.md: same treatment for the link to driver_patch.rs and to tools/cosmos-patch-ref/README.md. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Contributor
There was a problem hiding this comment.
Pull request overview
This PR adds Cosmos DB item PATCH support to the Rust SDK by introducing a new ContainerClient::patch_item API in azure_data_cosmos, backed by a driver-side OperationType::Patch implemented as a bounded Read‑Modify‑Write (RMW) loop in azure_data_cosmos_driver.
Changes:
- Added new public PATCH surface in
azure_data_cosmos:ContainerClient::patch_item<T>andPatchItemOptions, plus re-exports ofPatchSpec/PatchOp/IncrValue. - Added driver-side PATCH implementation:
OperationType::Patch, PATCH operation factory/options plumbing, local RFC 6901 evaluator, and an RMW handler that synthesizes a post-image response. - Added emulator-backed E2E tests and documentation/spec artifacts for the PATCH behavior and future .NET oracle tooling.
Reviewed changes
Copilot reviewed 23 out of 24 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
| sdk/cosmos/azure_data_cosmos/src/options/mod.rs | Adds PatchItemOptions for configuring SDK PATCH requests. |
| sdk/cosmos/azure_data_cosmos/src/models/mod.rs | Re-exports driver PATCH model types from the SDK models module. |
| sdk/cosmos/azure_data_cosmos/src/lib.rs | Re-exports PATCH model types at the crate root. |
| sdk/cosmos/azure_data_cosmos/src/clients/container_client.rs | Adds ContainerClient::patch_item<T> and a compile-time Send future assertion. |
| sdk/cosmos/azure_data_cosmos/README.md | Adds a README example showing item PATCH usage. |
| sdk/cosmos/azure_data_cosmos/CHANGELOG.md | Documents the new SDK PATCH API addition. |
| sdk/cosmos/azure_data_cosmos_driver/tests/tools/cosmos-patch-ref/README.md | Adds a deferred contract/spec for a future .NET PATCH oracle tool. |
| sdk/cosmos/azure_data_cosmos_driver/tests/framework/test_client.rs | Adds test helpers for hierarchical PK containers and driver PATCH dispatch. |
| sdk/cosmos/azure_data_cosmos_driver/tests/emulator_tests/mod.rs | Registers the new emulator test module for PATCH. |
| sdk/cosmos/azure_data_cosmos_driver/tests/emulator_tests/driver_patch.rs | Adds emulator-backed PATCH E2E tests and fixture-based comparison harness. |
| sdk/cosmos/azure_data_cosmos_driver/tests/emulator_tests/cosmos_patch_compare_coverage.md | Documents coverage of the PATCH comparison fixture catalog. |
| sdk/cosmos/azure_data_cosmos_driver/src/models/patch.rs | Introduces PATCH model types: PatchOp, PatchSpec, IncrValue. |
| sdk/cosmos/azure_data_cosmos_driver/src/models/mod.rs | Wires PATCH models and adds OperationType::Patch. |
| sdk/cosmos/azure_data_cosmos_driver/src/models/cosmos_resource_reference.rs | Adds helper to reconstruct ItemReference for PATCH handler sub-ops. |
| sdk/cosmos/azure_data_cosmos_driver/src/models/cosmos_operation.rs | Adds PATCH operation factory and patch_max_attempts plumbing. |
| sdk/cosmos/azure_data_cosmos_driver/src/driver/pipeline/patch_handler.rs | Adds the driver-side PATCH RMW handler. |
| sdk/cosmos/azure_data_cosmos_driver/src/driver/pipeline/patch_eval.rs | Adds the local PATCH evaluator (RFC 6901 + Cosmos semantics). |
| sdk/cosmos/azure_data_cosmos_driver/src/driver/pipeline/mod.rs | Exposes new PATCH pipeline modules. |
| sdk/cosmos/azure_data_cosmos_driver/src/driver/pipeline/from_local_body.rs | Adds helper to synthesize a response from local body + Replace headers. |
| sdk/cosmos/azure_data_cosmos_driver/src/driver/cosmos_driver.rs | Dispatches OperationType::Patch to the PATCH handler. |
| sdk/cosmos/azure_data_cosmos_driver/docs/PATCH_HANDLER_SPEC.md | Adds a spec doc for the PATCH handler behavior. |
| sdk/cosmos/azure_data_cosmos_driver/CHANGELOG.md | Documents the new driver PATCH operation and models. |
| sdk/cosmos/azure_data_cosmos_driver/ARCHITECTURE.md | Documents the “body-opacity exception” for PATCH in the driver architecture. |
| .gitignore | Ignores .coding-harness/ local artifacts directory. |
F-C1 is a real correctness bug: the RMW loop's 412 detection in
`patch_handler::execute` was unreachable. `driver.execute_operation`
returns `Err(ErrorKind::HttpResponse { status: 412, .. })` for a 412 —
the pipeline maps 412 to `OperationAction::Abort` and returns `Err`
verbatim (see `operation_pipeline.rs:324..358`). `await?` therefore
propagated the error out of the loop and the `status_code() ==
PreconditionFailed` arm was dead code, so a single concurrent writer
would surface as a 412 to the caller instead of triggering retry.
The fix uses `match driver.execute_operation(...).await` on the
Replace result. A new `is_precondition_failed(&Error)` predicate
pattern-matches `ErrorKind::HttpResponse { status: 412, .. }`; on a
match the loop stashes the real service error in `last_412` and
continues. Any other Err propagates verbatim, preserving the original
`raw_response` and diagnostics. The Read side already wanted
propagation, so its dead `if !status.is_success()` post-check is
removed in favour of plain `await?` (with a comment explaining why).
F-C2 extends the PK-overlap preflight to also reject `MoveOp { from }`
when `from` overlaps a partition-key path; moving *out of* a PK path
deletes a PK component just as surely as moving into one.
F-C7 adds emulator test rows for "Move from /pk" (flat) and "Move from
/tenantId" (hierarchical MultiHash) covering the F-C2 fix.
F-C3 trims the `PatchItemOptions` rustdoc so the session-token field
is described as the canonical location rather than via the operation
sub-field.
F-C4 fixes the doctest in `models/patch.rs`: the previous `windows(7)
.any(... == b"\"set\"")` could never match a 5-byte substring.
F-C5 rewrites the Algorithm block in `PATCH_HANDLER_SPEC.md` to
reflect that the Read uses caller `OperationOptions`+session_token and
the Replace overrides with the Read response's session_token (the
SE-004 mitigation landed in iter-1 F2). Adds a session-token-threading
section explaining the rationale.
F-C6 rewrites the `create_container_with_pk_paths` doc to drop the
incorrect "service infers kind" claim — the helper explicitly sets
Hash vs MultiHash from `paths.len()`.
F-C8 trims the patch_item CHANGELOG entries in both crates to a
single user-facing line and points readers at PATCH_HANDLER_SPEC.md
for the RMW / ETag / session-token / PK guard details.
Driver lib: 805 → 809 tests (+4 net: +3 `is_precondition_failed`
predicate tests, +2 `validate_partition_key_paths` move-from tests,
-1 obsolete `read_status_error_uses_http_response_kind` test whose
helper is no longer called from production code). SDK lib: 133
(unchanged).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Build Analyze on the previous push surfaced two CI lint issues introduced by the Copilot-review fix commit: - cspell flagged `hpkc` (3 occurrences in `pk_guard_rejects_move_from_pk_path_hierarchical`) as an unknown word. Renamed the placeholder container id/name/rid to `multi_hash_container` variants so cspell tokenizes the snake_case parts as real English words. - Link verification flagged the markdown link to `docs/PATCH_HANDLER_SPEC.md` in `azure_data_cosmos_driver/CHANGELOG.md` as a relative link in a directory outside the allow-list (only `.github/**` and `eng/**` may use relative links per Azure SDK link guidance). Converted to inline code so the path is still discoverable without violating the policy. Local cspell pass against all changed files reports 0 issues; `pk_guard` tests still pass after the rename. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The `tests/tools/cosmos-patch-ref/` directory was a placeholder README for a future .NET reference oracle that was never built. The Rust PATCH tests already verify the same observable behavior as the equivalent .NET and Java tests via a curated fixture catalog (each row carries a `source_test_id` pointing at the upstream test method), so this stub no longer adds value. Changes: - Delete `tests/tools/cosmos-patch-ref/` (README-only directory). - Rename the catalog test `cosmos_patch_compare` to `cosmos_patch_cross_sdk_parity` so the name reflects what the test actually does: run a cross-SDK behavioral parity check against the emulator. Also rename the companion coverage doc to match. - Rewrite the coverage doc to drop references to the deferred .NET oracle and to add a short note on the canonical way to re-verify a row (run the source .NET / Java test and compare). - Strip internal review-finding prefixes (`F4:`, `F5:`, `F7:`, `F12:`, `F14:`, `F16:`, `F-C1:`, `F-C2:`) from inline comments while preserving the engineering rationale that followed. - Drop `; see implementation-state.json#known_issues` trailers from deferred-test ignore messages and `unimplemented!` strings. No behavioral changes; tests pass (912 driver lib tests, no new ignores, no new failures). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Address the deep-reviewer test-coverage findings on the Cosmos PATCH
RMW handler:
* **Refactor**: extract `pub(crate) trait SubOperationDispatcher` and a
generic `execute_with_dispatcher` core in `patch_handler.rs`. The
original `execute()` is now a thin wrapper. Zero public-API change,
but the loop is now unit-testable with a scripted stub without a
live emulator.
* **Drop `PatchSpec::condition`**: the field and `with_condition`
builder were a silent no-op (handler never read them) and structurally
misplaced vs peer SDKs, which expose the SQL filter on request
options. Removed both; a new test pins that the JSON wire form has no
`condition` field.
* **New unit tests** (driver `patch_handler.rs`):
* `rmw_recovers_from_412_on_first_replace` — pins the 412 continue
branch (2 Reads, 2 Replaces, If-Match wired).
* `rmw_propagates_412_after_exhausting_max_attempts` — exhaustion.
* `rmw_propagates_non_412_replace_error_immediately` — no retry on
non-412 Replace failure.
* `rmw_propagates_read_error_immediately` — non-412 Read short-
circuits before any Replace.
* `pk_guard_rejection_issues_no_sub_operations` — zero sub-ops issued
when the preflight rejects.
* `empty_patch_spec_issues_no_sub_operations` — no-ops short-circuit.
* `exhaustion_error_with_source_chains_underlying_412` /
`_without_source_is_still_412_shaped` — pin both arms.
* `rmw_fails_without_etag_before_replacing` — ETag-less Read aborts.
* `rmw_caller_session_token_reaches_read_then_replace_uses_read_response_token`
— session-token plumbing caller -> Read -> Replace.
* **New emulator test**: `cosmos_patch_read_missing_item_returns_not_found`
in `driver_patch.rs` patches a never-created id and asserts a 404 from
the read leg propagates as a typed error.
* **New SDK-level integration tests**: `cosmos_patch.rs`
(`patch_item_round_trip`, `patch_item_missing_returns_not_found`,
`patch_item_honors_max_attempts_option`) round-trip the public
`ContainerClient::patch_item` surface end-to-end.
Validated: `cargo fmt --check`, `cargo clippy -D warnings` on the
driver (SDK clippy is clean modulo 3 pre-existing warnings in untouched
files), and `cargo test -p azure_data_cosmos_driver --lib --all-features`
-> 923 passed, 0 failed (12 new patch_handler tests + 1 new patch model
test).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Round 2 deep review surfaced two issues; close both. * **B-1 (blocking)**: `patch_item_round_trip` in the new SDK integration test (`cosmos_patch.rs`) used JSON Pointer `/displayName` against a document whose serialized field is `display_name` (snake_case — no `serde(rename)` on the test struct). `PatchOp::Replace` is strict (see `patch_eval.rs:347-349` and `replace_missing_leaf_fails`), so the test would have failed with `MissingTarget` the first time it was actually run against the emulator. The `test_category = "emulator"` gate hid the failure from `cargo test`. Fix: change the pointer to `/display_name` so it lines up with the on-the-wire shape. * **N-1 (nit)**: `rmw_caller_session_token_reaches_read_then_replace_uses_read_response_token` promised more than it actually checked — `DispatchedCall` records only `(op_type, if_match_etag)`, not session tokens. The session-token wire-up is already covered by the per-builder unit tests `read_sub_op_propagates_caller_session_token` and `replace_sub_op_uses_read_response_session_token`. Fix: rename to `rmw_loop_dispatches_read_then_etag_guarded_replace` and expand the doc comment to explain the test pins the structural Read -> Replace sequence + ETag inheritance only, with a pointer to the helpers that pin the session-token contract. Verified: `cargo test -p azure_data_cosmos_driver --lib --all-features` -> 923 passed, 0 failed. `cargo clippy -p azure_data_cosmos_driver --lib --tests --all-features -- -D warnings` clean. `cargo fmt --check` clean. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fixes #4048.
Summary
Adds
ContainerClient::patch_itemtoazure_data_cosmos, restoring the only major write-side gap between the Rust Cosmos SDK and its peer SDKs (.NET / Java / Python / JS). PATCH is implemented as a driver-side Read-Modify-Write (RMW) loop behind a newOperationType::Patch, so any future SDK that consumesazure_data_cosmos_driverinherits PATCH for free and so the driver can later swap in native HTTP PATCH transparently without an SDK API change.PATCH was previously removed in #3764 pending an idempotency model for retry. This PR ships RMW that is byte-for-byte equivalent to the backend evaluator for the supported op set; idempotency caveats for
Incrementunder automatic retry are documented in both the rustdoc andPATCH_HANDLER_SPEC.md.Approach
The RMW loop lives in
azure_data_cosmos_driverbehind a newOperationType::Patch. The SDK is a ~90 LOC wrapper aroundCosmosOperation::patch_item+ContainerClient::patch_item. Four new driver modules:models/patch.rs—PatchOp(Add / Remove / Replace / Set / Increment / MoveOp),IncrValue,PatchSpec. Both enums are#[non_exhaustive]for forward-compat. Provides both idiomatic Rust enum variants AND factory methods that mirror .NET (PatchOp::add(...),PatchOp::set(...),PatchOp::increment(...),PatchOp::move_op(...)).driver/pipeline/patch_eval.rs— RFC 6901 JSON Pointer evaluator + per-op semantics matching the Cosmos backend evaluator. Hand-rolled (see Alternatives). 27 unit tests covering:~0→~,~1→/)move_op(not byte-prefix)Seton numeric array index replaces in place (Cosmos backend semantics);Setwith/-appendsparse_array_index(no panic onSet('/xs/0')against{xs: []})driver/pipeline/patch_handler.rs— the bounded RMW loop:OperationType::Read— uses the caller'sOperationOptions(preserves caller's session_token + consistency level + latency budget). Extracts ETag + session_token from the response headers.NoneETag →Errimmediately, before any Replace is issued.apply_patch_opson the deserialized body.OperationType::ReplacewithPrecondition::IfMatch(etag)and the Read response's session_token (overriding caller's) — closes the TOCTOU window so the Replace commits against the same replica view that produced the Read.Err(412)andattempt < max_attempts - 1: continue. On any other error: propagate (typed viaErrorKind::HttpResponse { status, .. }so callers can downcast a 404 / 429 / 503 from inside the RMW). On exhaustion:ErrorKind::HttpResponse { status: PreconditionFailed, .. }with message"patch_item: ETag conflict after {N} attempts", chained to the last per-attempt 412 viaError::with_error.driver/pipeline/from_local_body.rs— synthesizes the response the SDK bridge consumes (body from the locally-patched JSON; headers from the successful Replace).The RMW loop is the first and only code path in the driver pipeline permitted to deserialize a response body to
serde_json::Value. The existing Upsert/Batch dispatch branches inject request headers only; they do not touch the response body. The body-opacity exception is documented inARCHITECTURE.mdanddocs/PATCH_HANDLER_SPEC.md.Call flow
PatchSpec(vec ofPatchOps, optionally via the .NET-style factories) and callscontainer.patch_item::<T>(pk, item_id, spec, options).CosmosOperation::patch_item(item_ref).with_body(spec)(+ optionalwith_patch_max_attemptsfromPatchItemOptions::max_attempts) and forwards to the driver via the existingexecute_operationpipeline. No loop, no body deserialization in the SDK.execute_operation_pipelinedetectsOperationType::Patchand dispatches topatch_handler::execute(...)before the existing Upsert/Batch header-injection branches.patch_handler::executeruns the PK pre-flight, then the bounded RMW loop. InternalRead/Replaceoperations are re-entrant through the same pipeline (the dispatch branch fires only for top-levelPatch).CosmosResponsesynthesized viafrom_local_body_and_driver_headers— body from the locally-patched JSON, headers from the successful Replace.ItemResponse<T>via the existingdriver_response_to_cosmos_responsebridge; callers invoke.into_model()to deserialize the synthesized post-image into their type.Testing
apply_patch_ops+ JSON Pointer (RFC 6901 escapes, fail-fast on first invalid op, atomic move, set-vs-add semantics, empty-array guard, token-level descendant detection).patch_handlerunit tests —build_read_sub_op+build_replace_sub_ophelper-level coverage of session-token propagation, ETag threading,If-Matchprecondition presence, typed error kinds, plus the 412-detection predicate.cosmos_patch_basic_set— Set roundtrip with post-image verified via the synthesized PATCH response and a follow-up Read.cosmos_patch_pk_guard+cosmos_patch_pk_guard_hierarchical— preflight rejection on flat + MultiHash containers, no network call issued, including move-FROM-PK-path coverage.cosmos_patch_cross_sdk_parity— 34-row fixture catalog whose expected behavior is derived from the equivalent .NET (ItemPatchSuccessTest,ItemPatchFailureTest,PatchOperationTests,PatchOperationTTests) and Java (PatchAsyncTest) tests, plus Rust-derived RFC 6901/6902 + i64-fidelity edge rows. Coverage rollup intests/emulator_tests/cosmos_patch_cross_sdk_parity_coverage.md.DriverTestClient::patch_item+ hierarchical-PK container helpers test infrastructure._assert_futures_are_sendextended forContainerClient::patch_item(compile-time Send-bound regression guard).cargo build/cargo clippy -D warnings/cargo fmt --checkonazure_data_cosmos+azure_data_cosmos_driver.cargo test -p azure_data_cosmos_driver --lib— 912 passed, 0 failed.#[ignore]'d withunimplemented!()bodies citing the blocking fault-injection / latency-policy primitives. See "Deferred to follow-ups" below.Risk Assessment
PatchItemOptionsrustdoc# Latencysection +PATCH_HANDLER_SPEC.md)unimplemented!()-guard, +1 Send-future compile-time testContainerClient::patch_item<T>,PatchItemOptions,PatchSpec/PatchOp/IncrValue(re-exported from driver; both#[non_exhaustive])Notable side effects, each with documented mitigation:
max_attempts, documented as not-for-hot-items in the rustdoc.PatchItemOptionsrustdoc andPATCH_HANDLER_SPEC.md.Alternatives Considered
jsonptr)max_attempts: u8with0 → default 5Option<NonZeroU8>makes "fail-fast on first 412" (Some(NonZeroU8::new(1))) explicit.Cross-SDK Impact
PATCH on Rust lands in three carefully-scoped relationships to the other Cosmos SDKs:
PatchOp/PatchSpec/PatchItemOptionsare named, shaped, and casing-aligned withMicrosoft.Azure.Cosmos.PatchOperation(.NET) andCosmosPatchOperations(Java). The factory methods (PatchOp::add,PatchOp::set,PatchOp::increment,PatchOp::move_op) match the .NET static factories one-to-one. Callers porting code between SDKs see a familiar surface.Readthen a conditionalReplace. The observable evaluation semantics (operator behavior, RFC 6901 pointer handling, partition-key guard) are designed to match what the .NET/Java tests observe today — that match is whatcosmos_patch_cross_sdk_parityverifies.This PR is intentionally Rust-only. No code change required in .NET / Java / Python / JS (they already ship PATCH); Go is a separate work item.
Review Loop Resolution
Two rounds of automated and manual review ran on this PR. The fixes that landed:
First-round changes:
patch_itemnow returnsItemResponse<T>instead ofItemResponse<()>, matching the read-item / create-item shape.PatchOpandIncrValueare both#[non_exhaustive]./ais not a descendant of/ab.Seton a numeric array index replaces in place (matching the backend);Setwith/-appends.PatchItemOptions::max_attemptsrustdoc gained a# Latencysection.PATCH_HANDLER_SPEC.mddocuments the 2+RTT lower bound.Error::with_error.#[ignore]'d AND callunimplemented!()so removing the ignore without wiring the test body fails loud.Second-round changes:
parse_array_indexempty-array guard —Set('/xs/0')against{xs: []}now surfacesArrayIndexOutOfRangeinstead of panicking.read_status_errornow usesErrorKind::HttpResponsefor symmetry with the exhaustion error contract.Precondition::if_match(etag)is present on the Replace, not just the session-token threading.Copilot-review changes (commits in this PR):
Err(_)instead ofOk(_)— the driver pipeline returns 412 asErr(ErrorKind::HttpResponse{status: PreconditionFailed, ..}), not as a successful response with a status code. Ais_precondition_failedpredicate was added and the Replace path now usesmatchinstead of?so the loop can detect-and-continue on a 412 error.PatchItemOptions::session_token,models/patch.rsexamples,PATCH_HANDLER_SPEC.mdalgorithm description,create_container_with_pk_paths, and CHANGELOG entries.Deferred to follow-ups
These are documented deferrals, not silent gaps:
FaultInjectionErrorType::PreconditionFailed(412); A12 requires a header-strip primitive; A13 requiresEndToEndOperationLatencyPolicyplumbing throughpatch_handler. The four deferred emulator tests are present,#[ignore]'d, and their bodies callunimplemented!("<id> deferred: needs <primitive>")so they fail loud if someone removes the ignore without doing the work.PatchHandlerOptionsside-channel — todaymax_attemptsrides onCosmosOperation::with_patch_max_attempts(). A typed per-op handler-options struct accessible fromCosmosOperationwould scale better as more PATCH-only knobs land. The next PATCH-specific option is the forcing function.Lineage
Cosmos: Restore PATCH API).Microsoft.Azure.Cosmos.PatchOperation(.NET v3 SDK),CosmosPatchOperations(Java v4 SDK). Public API naming and factory shape were aligned to these.~0/~1.Add/Remove/Replace/Moveop semantics, including the-(append) array index.OperationType::Patchslots into the existing dispatch model alongside Upsert/Batch; the body-opacity exception for PATCH is documented inARCHITECTURE.mdanddocs/PATCH_HANDLER_SPEC.md.Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com