Cosmos: Add azure_data_cosmos_driver_native C-ABI Wrapper Spec#4461
Cosmos: Add azure_data_cosmos_driver_native C-ABI Wrapper Spec#4461kundadebdatta wants to merge 5 commits into
azure_data_cosmos_driver_native C-ABI Wrapper Spec#4461Conversation
Adds docs/NATIVE_WRAPPER_SPEC.md to the azure_data_cosmos_driver crate. This is the design spec for a new azure_data_cosmos_driver_native crate that will expose the driver's transport, routing, retry, partition-key, operation, response, and diagnostics primitives through a stable C ABI so non-Rust SDKs (Java, .NET, Python, ...) and C/C++ apps can reuse the driver's machinery. Shipping the spec first so the design can be reviewed independently before implementation lands in a follow-up PR. The spec covers: motivation and scope, crate layout, naming conventions, FFI plumbing (CallContext, RuntimeContext, byte-buffer marshalling, error model), the full handle surface (runtime, account, driver, references, partition keys, operations, response, diagnostics, options builders), build and distribution (cdylib + staticlib, cbindgen, CMake + corrosion), error semantics that intentionally diverge from the prior typed-SDK wrapper (non-success HTTP statuses are surfaced via cosmos_response_status_code, not mapped to error codes), versioning and ABI rules, a 10-phase implementation plan, open design questions, and a migration table for consumers of the deleted azure_data_cosmos_native crate.
…PER_SPEC Blocking fixes: - §6 / §4.7: bind cosmos_driver_execute to execute_singleton_operation, document Result<Option<CosmosResponse>> handling and FEED_EXHAUSTED. - §4.6: remove cosmos_operation_with_partition_key; PK now lives on item / feed factory args (ItemReference::from_name requires PK at construction). - §4.4: add normative driver-cache documentation (endpoint-only key, options dropped on cache hit, credential collision caveat) plus a new COSMOS_ERROR_CODE_OPTIONS_IGNORED_ON_CACHE_HIT advisory. - §3.5 / §6: rewrite error model for azure_data_cosmos::Error from #4442 (Kind enum, typed accessors, predicates, synthetic sub-status codes, non_exhaustive future-proofing via COSMOS_ERROR_KIND_UNKNOWN). - §4.2: fix DriverOptions surface to mirror the actual 3-field type; builder takes account, with_operation_options for per-call defaults; removed allow_emulator_invalid_certs (lives on runtime instead). - §5.2: fix Cargo features (default = tokio + rustls; drop fake reqwest_native_tls / tracing feature; native_tls is the correct name). - §4.6.3: normative execute-consumption contract (sentinel, free always safe, failed execute does not consume, double-execute returns 4005). Recommended fixes: - §4.6.2: split with_precondition into IF_MATCH / IF_NONE_MATCH with a PRECONDITION_ALREADY_SET error to enforce single-precondition rule. - §4.3: document resource-token routing via master-key Secret path; add with_credential mirror. - §4.6.1: add missing factories — read_all_items_cross_partition, query_items, batch, query/read/replace_offer. - §4.3 / §4.5 / §3.4: define cosmos_*_clone functions promised in §3.4. - §5.3: Phase 0 ancillary-tooling re-introduction checklist for entries PR #4103 removed (cbindgen as [build-dependencies] only per heaths, dict files, .cspell.json, verify-dependencies.rs, AGENTS.md, skills). - §3 / §5.1: tighten inheritance attribution between #2906 and #3347. - §7: loosen strict ABI version equality to major-equal / minor->= so the spec's additive-growth promise actually holds. - §2.2: add cbindgen export.rename / item_types policy so the generated header matches the naming table (avoids cosmos_cosmos_* double-prefix and prevents driver-internal types from leaking). Other: - §9: replaced single open question on RuntimeAlreadyInitialized with 9 questions covering implementation parking-lot items (header-visitor borrow vs copy, continuation-token format, multi-part body, C++ companion header, credential identity in cache, ConnectionString parser ownership, symbol stripping, pager continuation-token resume). - §8: updated Phase 0 / 1 / 2 / 3 / 5 / 6 acceptance criteria to match the new contracts (header check-in, error accessor coverage, cache-hit advisory test, error-mapped 404 surfaces as is_not_found). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Second-round PR Deep Reviewer findings on PR #4461 (post-1672bd8). Addresses 5 blocking and 5 recommended items; nits / one open question deferred. F1 Predicate forwarding + cosmos_error_is_service_error (#4442 alignment). §3.5.2: documented that cosmos_error_is_* predicates forward to CosmosStatus::is_*; added is_service_error. §6.4: split backtrace rate knob into separate captures/resolutions per #4442 surface. F2 query_items + query_plan factory shape. §4.6.1: query_items now takes cosmos_feed_range_t* (matches driver's Option<FeedRange>); added a separate cosmos_operation_query_plan for the SQL-string path. F3 Factory signature template. §4.6.1: every CosmosOperation factory converted to "cosmos_error_code_t fn(... cosmos_operation_t **out_op)" matching the §3.2 ABI shape; removes the bare-pointer divergence. F4 Removed cosmos_response_iter_headers visitor. §4.7: typed accessors only; documented unknown-header drop; §9 Q2 reframed as a forward-compat passthrough question. F5 Opacity sweep. §3.1 cosmos_call_context_t opaque + accessors (_create/_free/_runtime/_set_include_error_details/_include_error_ details). §3.3 cosmos_bytes_t opaque (_data/_len/_free); kept cosmos_bytes_view_t for by-value input. §3.4 ownership table updated. §4.7 cosmos_response_into_body now writes cosmos_bytes_t **out_body. §4.8 cosmos_diagnostics_to_json signature corrected to **out_json. F6 Landing-prereq callouts. §3.5 / §4.2: explicitly cite #4442 (errors) and #4452 (Tokio thread-name prefix) as prerequisites; #4452-only surface reworded as "landed in". F7 Phase 5 operation-options enumeration. Removed max_item_count from the OperationOptions list (it's on CosmosOperation::with_max_item_count per §4.6.2); grouped all 17 OperationOptions fields by category and documented optional v1 subset path. §4.2 prose updated accordingly. F8 Partition-key value variants. §4.5: renamed append_none -> append_undefined to match Cosmos JSON semantics; added append_infinity; corrected source line ref to :303. F9 Cache-advisory warning class. §3.5.1: created a 5001..=5999 warning band (non-SUCCESS, populates out_*); moved OPTIONS_IGNORED_ON_CACHE_HIT from 4001 -> 5001; reserved 4001. §4.4.1: cache-hit advisory now returns 5001 (not SUCCESS) and is no longer predicated on single-runtime mode; documented the runtime.rs:380-390 lost-race redundant-init path. §9 Q1 reworded. F10 §5.3 ancillary-tooling checklist expanded. Added P0 workspace `members` entry, deny.toml MPL-2.0, sibling azure_data_cosmos README/lib.rs/ARCHITECTURE.md cross-link restoration, cspell regression-diff note, and deleted-file disposition for azurecosmos.pc.in / cmake/DiscoverTests.cmake / next_generation_ sdks_design_principles.md; added "Lessons from #4090 / #4103" preamble. Also: §4.6.4 NEW (minimal cosmos_feed_range_t surface); §10 migration table updated to show the new out_op factory pattern. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Three driver-surface alignment fixes surfaced by the PR Deep Reviewer (review #3) — each one was promising a C-ABI shape the underlying azure_data_cosmos_driver cannot deliver as currently exposed. B1 (s4.5 partition keys). The driver's InnerPartitionKeyValue::Infinity variant is pub(crate) and explicitly documented as 'used only internally for EPK boundary calculations'. There is no public constructor for it. Dropped cosmos_partition_key_builder_append_infinity from the v1 surface and rewrote the s4.5 preamble to clarify that the wrapper exposes the five public variants only. Added Q10 to s9 to track the decision between (a) promoting Infinity to pub in the driver and (b) routing all EPK-boundary use cases through cosmos_feed_range_* permanently. B2 (s4.6.1 query_plan). The driver's CosmosOperation::query_plan(container, supported_query_features) is a metadata fetch keyed by the supported-query-features mask string — NOT a SQL-execution entry point. The earlier spec text described it as 'build a SQL query as a standalone query-plan operation' and paired it with a fictional cosmos_operation_with_query_parameter mutator that has no driver counterpart. Renamed the wrapper symbol to cosmos_operation_query_plan_for_features, rewrote its doc to explain it's a feature-mask fetch rather than a SQL submitter, and removed _with_query_parameter entirely. The actual SQL path is documented as cosmos_operation_query_items(c, feed_range, andop) + cosmos_operation_with_body(op, json_body), matching the schema-agnostic G2 contract. Updated the Phase 5 factory list in s8 to reflect the rename. B3 (s4.6.4 FeedRange). The earlier surface promised four constructors, two of which had no driver-side counterpart and two of which had the wrong shape. Replaced with the two driver-public constructors that exist today (FeedRange::full() and FeedRange::for_partition(pk, andPartitionKeyDefinition)), with the for_partition wrapper taking a cosmos_container_ref_t so it can pull the partition-key definition off the container internally. Documented the dropped constructors as deferred (Q11 in s9): cosmos_feed_range_for_epk_range is blocked on a driver-side string-parseable EPK type, and cosmos_feed_range_for_partition_key_range is blocked on a non-existent FeedRangeRepr::PartitionKeyRangeId variant. Also fixed the s3.4 ownership-table reference (_for_* placeholder updated to the two actual constructor names). No other sections changed. The spec is now line-grounded against the current azure_data_cosmos_driver/src/models/{partition_key, feed_range, cosmos_operation}.rs surface; implementers can match every promised cosmos_* function to a real driver entry point.
| |---|------| | ||
| | G1 | Expose every primitive a language SDK needs to build a fully-featured Cosmos DB client: account, runtime, driver, operations, partition keys, options, responses, diagnostics. | | ||
| | G2 | Stay **schema-agnostic**: bodies are `const uint8_t*`/`size_t` in, `const uint8_t*`/`size_t` out. No JSON parsing inside the wrapper. | | ||
| | G3 | Map cleanly to the driver's Rust API. A C function should correspond to a single driver method or a small, mechanical builder step. | |
There was a problem hiding this comment.
I'm not sure this is the right goal. For example, forcing SDKs to go through the builder APIs to build up options types would be very chatty. It would probably be better to expose options types as C-style structs and let languages manage them in whatever way makes sense.
| cosmos_call_context_t *cosmos_call_context_create(const cosmos_runtime_t *runtime); | ||
| void cosmos_call_context_free(cosmos_call_context_t *ctx); | ||
|
|
||
| /* Borrowed accessor for the bound runtime. Lifetime = the call context. */ | ||
| const cosmos_runtime_t *cosmos_call_context_runtime(const cosmos_call_context_t *ctx); | ||
|
|
||
| /* When set, fallible APIs populate the rich cosmos_error_t payload (see §3.5.2) | ||
| * in addition to returning the coarse cosmos_error_code_t. Defaults to true. */ | ||
| void cosmos_call_context_set_include_error_details(cosmos_call_context_t *ctx, | ||
| bool include); | ||
| bool cosmos_call_context_include_error_details(const cosmos_call_context_t *ctx); | ||
|
|
||
| /* Error access — see §3.5.2 for details. */ | ||
| const cosmos_error_t *cosmos_call_context_last_error(const cosmos_call_context_t *ctx); | ||
| cosmos_error_t *cosmos_call_context_take_error(cosmos_call_context_t *ctx); |
There was a problem hiding this comment.
The idea of the CallContext is that it's a structure that both sides can share access to. The calling SDK thread can write directly to it, then hand it off to the driver which can read/write it, and then the calling SDK can read from it when it gets it back. Given that, I don't think we should use this pattern of heap allocating and using accessors. Rather, I think it should be a C-style struct that the two sides can both access directly.
| // Caller-owned input: caller keeps memory live for the duration of the call. | ||
| // Layout is published because this is pass-by-value across the ABI. | ||
| typedef struct cosmos_bytes_view { | ||
| const uint8_t *data; | ||
| size_t len; | ||
| } cosmos_bytes_view_t; |
There was a problem hiding this comment.
We should be very clear what the lifetime of this data needs to be. Especially if we're doing async callbacks. When is the caller able to release this memory?
| // SDK-owned output: opaque handle. Layout is intentionally NOT published so the | ||
| // internal representation (currently a Box<Vec<u8>>) can evolve without an ABI | ||
| // break. | ||
| typedef struct cosmos_bytes cosmos_bytes_t; | ||
|
|
||
| const uint8_t *cosmos_bytes_data(const cosmos_bytes_t *b); /* borrowed; valid until _free */ | ||
| size_t cosmos_bytes_len(const cosmos_bytes_t *b); | ||
| void cosmos_bytes_free(cosmos_bytes_t *b); /* NULL is a no-op */ |
There was a problem hiding this comment.
This creates unnecessary indirection IMO. In most cases, the driver can hand off the memory directly to the SDK. This should be able to use cosmos_bytes_view_t, or perhaps an alternate similar type to indicate driver-owned memory that must be freed with cosmos_bytes_free
| | `cosmos_runtime_t*` | `cosmos_runtime_create` | `cosmos_runtime_free` | No (use one per process) | | ||
| | `cosmos_driver_t*` | `cosmos_driver_get_or_create` | `cosmos_driver_free` | Internally `Arc`; FFI handle is a single owner | | ||
| | `cosmos_call_context_t*` | `cosmos_call_context_create(runtime, ctx_options)` | `cosmos_call_context_free` | No (single-thread-affine; create one per logical caller) | | ||
| | `cosmos_account_ref_t*` | `cosmos_account_ref_with_*` | `cosmos_account_ref_free` | Yes, via `cosmos_account_ref_clone` (cheap; new strong handle to the same `Arc`) | |
There was a problem hiding this comment.
The problem here is that we can't prevent a caller from copying the reference around. We may need to consider using a handle table, where we store the actual Arcs in a table inside the _native crate and hand out indexes into that handle table.
We're going to need a more general way of understanding how these "rich" value types like Account Reference (which is really something that we could pass-by-value except that it can be quite large and contains heap-pointers) are tracked on both sides.
There was a problem hiding this comment.
A handle table would also solve the problem with moves. We need to be careful that we only hand out pointers to heap data, because stack data moves around in memory and pointers become invalid.
|
|
||
| ### 3.5 Error model | ||
|
|
||
| The wrapper's error surface is built on two complementary types — a coarse `cosmos_error_code_t` numeric return value for the C function contract, and a rich `cosmos_error_t` payload that mirrors the driver's `azure_data_cosmos::Error` (introduced in [#4442](https://github.com/Azure/azure-sdk-for-rust/pull/4442)). Both **must** be exposed so language SDKs can implement retry policies, throttling backoff, and conditional-write recovery without re-parsing HTTP headers. |
There was a problem hiding this comment.
Both must be exposed so language SDKs can implement retry policies, throttling backoff, and conditional-write recovery without re-parsing HTTP headers
The point of the driver is that the SDK does not implement these, right?
There was a problem hiding this comment.
native only needs to expsoe what is relevant for debugging IMO - raw response and headers would only be relevant when trying to build error form it - 400 Bad Request is notorious for being impossible to debug without the resonse payload - so agree that we will need the full error - but motivation was off IMO.
|
|
||
| > **Landing prerequisites — read this before implementing.** The §3.5.2 rich-error surface and the §6 error-semantics chapter both depend on PR [#4442](https://github.com/Azure/azure-sdk-for-rust/pull/4442) ("Refactoring to use `Error` instead of `azure_core::Error`") landing first. On `main` today `execute_singleton_operation` still returns `azure_core::Result<…>`, so none of the `cosmos_error_*` accessors below can be wired until #4442 merges. Predicate placement, backtrace-knob naming, and the `Service` / `Transport` / `Client` / `Authentication` / `Serialization` / `Configuration` `Kind` taxonomy are all owned by #4442 — verify the final shape against the merged commit before lifting this section out of draft. Sub-status synthetic codes (`20008`, `20912`, `20010..=20015`, `20020..=20021`, `20030`, `20402`) are also defined in #4442. | ||
|
|
||
| #### 3.5.1 `cosmos_error_code_t` |
There was a problem hiding this comment.
Suggest syncing this up with Fabian's work on error refactoring, it was done with FFI in mind.
|
|
||
| ```c | ||
| typedef struct cosmos_driver_options_builder cosmos_driver_options_builder_t; | ||
| typedef struct cosmos_driver_options cosmos_driver_options_t; |
There was a problem hiding this comment.
For something like driver options, that is set once at startup, I think this makes sense. It allows the driver to copy over strings and other bytes from the SDK memory and take ownership over it. For high-traffic options types like OperationOptions, we should consider how much we could expose directly to the SDK as a C struct, so that it doesn't have to make a bunch of back-and-forth FFI calls just to initialize a single call.
There was a problem hiding this comment.
And IMO API contract should be that teh SDK is resposnible for merging config of runtime/driver/operation levels - for execute_operation there shoudl be one struct that can flow through - but populating it based on inherited/overlapping configs should be done in teh sdk layer.
| cosmos_operation_t *op, /* consumed on success */ | ||
| const cosmos_operation_options_t *options, /* nullable */ |
There was a problem hiding this comment.
The main enemy in an FFI system like this is "chattiness". In most language environments, each call from SDK to Driver will incur a relatively high overhead. So we need to try and find ways to keep that overhead as low as possible. That means that ideally, each call to cosmos_driver_execute accepts parameters that either:
- Are reused from previous calls (
cosmos_call_context_t*,cosmos_driver_t*) - Are pure value types passed through the calling convention (ints, bools, structs passed by-value)
- Are pointers to heap memory managed by the CALLER, for which the lifetime is clear (pointers to pinned .NET buffers, etc.)
- Are pointers to heap memory owned by the driver that the caller was able to write directly to (we could expose a kind of "buffer pool" option to allow SDKs to write request bodies directly to Driver-owned memory for example)
That way, only 1 FFI call is actually needed to perform the request. Once the call reaches the driver, it's free to do things like copy data out of foreign-owned heap buffers (thus allowing that pinned .NET buffer to be unpinned/released while an async operation takes place, for example), but the point is that there was only 1 costly FFI transition needed.
There was a problem hiding this comment.
+100 - especially teh dirver-owned buffer pool is important IMO - in teh upstream layer we will have to deal with various GCs - all of them will make it harder/less efificent to ensure driver can safely use SDK-owned memory - so, wherever possible instead using native pool is useful
|
|
||
| --- | ||
|
|
||
| ## 9. Open Questions |
There was a problem hiding this comment.
We also need to think about async. The original wrapper was 100% synchronous, which would have been a terrible fit for .NET (for example). We need to consider some options for how we handle asynchrony. There are a lot of options there (callbacks, our own internal event loop that SDKs can poll, etc.) so we probably need some exploratory work.
There was a problem hiding this comment.
Agreed - my 2 cents - this speci goed way too much into details in many parts - first few decisions based on proto-typing will need to be how to deal with memory ownership and passing thorugh options etc. to reduce chatiness and how to deal with async code. I would just explore that with a simplistic upsert implementation to get request- and response payload. Then afterwrads teh spec should have one highlevel explanation on how these channelges are addressed - and then maybe in a section all the details above.
Additional questions i would have
- logging - should we account for a mechsniam where the driver can be instructed to log to an SDK provided logger (to make sure drievr-emitte dlogs show up where cx expect them)
- Diagnostics I beleiev will just come fro the CosmosDiagnosticsContext - and then otel etc. will be emitted on sdk layer.
FabianMeiswinkel
left a comment
There was a problem hiding this comment.
Like discussed - I woudl start with more prototyping and then finsih the spec
|
|
||
| --- | ||
|
|
||
| ## 9. Open Questions |
There was a problem hiding this comment.
Agreed - my 2 cents - this speci goed way too much into details in many parts - first few decisions based on proto-typing will need to be how to deal with memory ownership and passing thorugh options etc. to reduce chatiness and how to deal with async code. I would just explore that with a simplistic upsert implementation to get request- and response payload. Then afterwrads teh spec should have one highlevel explanation on how these channelges are addressed - and then maybe in a section all the details above.
Additional questions i would have
- logging - should we account for a mechsniam where the driver can be instructed to log to an SDK provided logger (to make sure drievr-emitte dlogs show up where cx expect them)
- Diagnostics I beleiev will just come fro the CosmosDiagnosticsContext - and then otel etc. will be emitted on sdk layer.
Issue: #4372
Branch:
users/kundadebdatta/4372_cosmos_driver_native_crate_spec→mainScope: Documentation only — adds one spec file to
sdk/cosmos/azure_data_cosmos_driver/docs/. No code changes, no public API changes.Summary
Adds NATIVE_WRAPPER_SPEC.md — the design specification for a new
azure_data_cosmos_driver_nativecrate that will expose the Cosmos DB Rust driver through a stable C ABI. The wrapper is intended as the reuse point for non-Rust language SDKs (Java, .NET, Python, …) and C/C++ applications, replacing the deletedazure_data_cosmos_nativecrate (removed in PR #4103, commitccf43caae).This PR ships the spec only. The implementation will land in follow-up PRs that the spec's §8 phased plan sequences.
Why
The original
azure_data_cosmos_nativewrapped the typed SDK (azure_data_cosmos), which forcedserdeserialization through the FFI boundary — a Java/.NET/Python SDK would receive responses already parsed by Rust'sserde, re-serialized to a string, only to be re-parsed by the host language. That is the wrong layering.azure_data_cosmos_driverwas explicitly designed as the schema-agnostic data plane (seeARCHITECTURE.md) — bodies in/out are raw bytes, item shape is the consuming SDK's concern. Wrapping the driver, not the typed SDK, is the correct boundary for cross-language reuse.This spec captures that design decision and pins the contract before code is written, so review feedback shapes the ABI once rather than chasing it across the implementation PRs.
What the spec covers
cosmos_*_t/cosmos_*_*), cbindgenexport.rename+item_typespolicyCallContext+RuntimeContext(opaque), function signature template, byte-buffer marshalling (cosmos_bytes_t/cosmos_bytes_view_t), ownership table, error model (cosmos_error_t+cosmos_error_code_tbands)cdylib+staticlib, cbindgen, CMake + corrosion, Cargo features, Phase 0 ancillary-tooling re-introduction checklistazure_data_cosmos::Errorfrom #4442; non-success HTTP status codes are not mapped to error codes (surfaced viacosmos_response_status_code)Architecture pinned by the spec
Key design invariants the spec locks in:
const uint8_t*/size_tin, opaquecosmos_bytes_tout. No JSON parsing inside the wrapper.typedef struct cosmos_x cosmos_x_t;with accessor functions — no struct layouts in the header (preserves ABI freedom).&[u8]/Vec<u8>contract; serialization belongs in the consuming SDK.cosmos_runtime_create; a future runtime swap doesn't break C consumers.cosmos_response_status_codeand not mapped to error codes — only transport / plumbing failures populatecosmos_error_t.Review history baked into the spec
The spec was shaped by two rounds of PR Deep Reviewer findings; the current revision incorporates all of them:
Round 1 —
1672bd823"Apply PR Deep Reviewer blocking + recommended findings" (7 blocking + 7 recommended)cosmos_driver_executetoexecute_singleton_operationwithResult<Option<CosmosResponse>>+FEED_EXHAUSTEDhandling.ItemReference::from_namerequiring PK at construction).COSMOS_ERROR_CODE_OPTIONS_IGNORED_ON_CACHE_HITadvisory.azure_data_cosmos::Errorfrom Reafctoring to use CosmosError instead of azure_core:Error #4442 (Kind enum, typed accessors, predicates, synthetic sub-status codes,COSMOS_ERROR_KIND_UNKNOWNfornon_exhaustivefuture-proofing).DriverOptionssurface to mirror the actual 3-field type.default = tokio + rustls; correctednative_tlsname; dropped non-existent features).with_preconditionintoIF_MATCH/IF_NONE_MATCHwith single-precondition enforcement.read_all_items_cross_partition,query_items,batch,query/read/replace_offer.cosmos_*_clonefunctions promised by the ownership table.azure_data_cosmos_native#4103 removed (cbindgen as[build-dependencies]per heaths, dict files,.cspell.json,verify-dependencies.rs, AGENTS.md, skills).export.rename/item_typespolicy so generated headers match §2.2 naming (avoidscosmos_cosmos_*double-prefix and prevents driver-internal types leaking).Round 2 —
7ea4f8204"Apply re-review findings (F1-F10)" (5 blocking + 5 recommended)cosmos_error_is_*predicates forward toCosmosStatus::is_*; addedis_service_error; split backtrace knob per #4442 surfacequery_itemstakescosmos_feed_range_t*(matches driver'sOption<FeedRange>); separatecosmos_operation_query_planfor the SQL-string pathCosmosOperationfactory converted tocosmos_error_code_t fn(..., cosmos_operation_t **out_op)matching the §3.2 ABI shapecosmos_response_iter_headersvisitor — typed accessors only; documented unknown-header dropcosmos_call_context_tandcosmos_bytes_tmade fully opaque with accessor functions;cosmos_bytes_view_tretained for by-value inputmax_item_countcorrectly placed onCosmosOperationnotOperationOptions; documented optional v1 subset pathappend_none→append_undefinedto match Cosmos JSON semantics; addedappend_infinityOPTIONS_IGNORED_ON_CACHE_HITmoved 4001 → 5001membersentry,deny.tomlMPL-2.0, siblingazure_data_cosmoscross-link restoration, cspell regression-diff note, deleted-file disposition; added "Lessons from #4090 / #4103" preambleAlso added §4.6.4 (minimal
cosmos_feed_range_tsurface) and updated the §10 migration table to show the newout_opfactory pattern.Backward compatibility
azure_data_cosmosandazure_data_cosmos_driverare untouched.Commits
What lands next (out of scope for this PR)
The spec's §8 sequences implementation across 10 phases. The notable landing prerequisites that are tracked in flight:
azure_data_cosmos::Errorredesign (Kind enum, typed accessors, predicates). The native error surface in §3.5.2 / §6 forwards directly to this.azure_data_cosmos_driver_nativeCrate #4452 — Tokio thread-name prefix (already landed in main; referenced by §4.2).Once those are in, Phase 0 (crate scaffold + cbindgen + CMake + Corrosion) is unblocked.
Reviewer notes