Python SDK v1.3.0
New Features and Improvements
- Built on Rust SDK 2.0.1: The Python wrapper now depends on
databricks-zerobus-ingest-sdk = "2.0.1" from crates.io (was 1.2.0).
The internal PyO3 binding was rewritten to use the new StreamBuilder
typestate API exclusively. The Python-facing API surface
(TableProperties, create_stream(client_id, client_secret, table_properties, options, headers_provider), ingest_record,
RecordAcknowledgment, etc.) is unchanged.
- Arrow Flight promoted to Beta: ZerobusArrowStream /
ArrowStreamConfigurationOptions and the surrounding documentation are
no longer labelled experimental/unsupported. The API is stabilising but
may still change before reaching GA. (Mirrors the Rust SDK 2.0.1
promotion.)
- Arrow Flight graceful stream close: On a server-signaled close, the client pauses sending, drains in-flight acks within a bounded wait, then recovers.
  - stream_paused_max_wait_time_ms on ArrowStreamConfigurationOptions: Optional cap, in milliseconds, on the paused wait (None = full server duration, 0 = immediate recovery).
- Python SDK identifier on the wire: The SDK now reports itself as
zerobus-sdk-py/<version> in the HTTP user-agent header
(previously it inherited the Rust SDK identifier
zerobus-sdk-rs/<rust-version>). Server-side telemetry can now tell
Python clients apart from Rust clients.
- AckCallback.on_error is now delivered to Python: The PyO3 binding
previously only logged ack errors via eprintln!; subclasses overriding
on_error would never see the call. They will now.
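
The stream_paused_max_wait_time_ms semantics above can be sketched as a small pure function. This is only an illustration, not SDK code: the helper name paused_wait_ms and its parameters are hypothetical, and it assumes that a non-zero cap simply bounds the server-provided duration.

```python
from typing import Optional


def paused_wait_ms(server_duration_ms: int, cap_ms: Optional[int]) -> int:
    """Model how long a paused stream waits before recovering.

    Hypothetical helper, not part of the SDK. It only mirrors the documented
    meaning of stream_paused_max_wait_time_ms.
    """
    if cap_ms is None:
        # No cap configured: wait for the full server-provided duration.
        return server_duration_ms
    # Assumption: the cap bounds the wait, so 0 means immediate recovery.
    return min(server_duration_ms, cap_ms)


print(paused_wait_ms(30_000, None))   # 30000
print(paused_wait_ms(30_000, 0))      # 0
print(paused_wait_ms(30_000, 5_000))  # 5000
```
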
Behavior Changes
- StreamConfigurationOptions.record_type is no longer applied: Format
is now set on the stream builder from TableProperties (proto descriptor
present → Proto, absent → JSON). The field is kept for API compatibility.
- StreamConfigurationOptions.max_inflight_records default raised from
50_000 to 1_000_000 to match the Rust SDK 2.0.1 default. Previously the
Python wrapper hard-coded 50k while the Rust SDK quietly raised its default
to 1M, so Python clients ran with a 20× lower in-flight ceiling than Rust
clients for no good reason. Callers who relied on the old cap can pin it
back explicitly: StreamConfigurationOptions(max_inflight_records=50_000)
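
As a rough illustration of the two changes above, the sketch below models the format-selection rule and checks the 20× figure. RecordFormat and select_format are hypothetical stand-ins, not SDK names, and treating "present" as "not None" is an assumption.

```python
from enum import Enum
from typing import Optional


class RecordFormat(Enum):
    """Hypothetical stand-in for the stream's record format."""
    PROTO = "proto"
    JSON = "json"


def select_format(descriptor_proto: Optional[bytes]) -> RecordFormat:
    # Rule from the notes: descriptor present -> Proto, absent -> JSON.
    # Assumption: "present" means the value is not None.
    if descriptor_proto is not None:
        return RecordFormat.PROTO
    return RecordFormat.JSON


OLD_DEFAULT = 50_000      # previous Python-side max_inflight_records
NEW_DEFAULT = 1_000_000   # default matching Rust SDK 2.0.1

print(select_format(b"\x0a\x03Foo"))  # RecordFormat.PROTO
print(select_format(None))            # RecordFormat.JSON
print(NEW_DEFAULT // OLD_DEFAULT)     # 20
```
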
Bug Fixes
- TableProperties(name, MyMessage.DESCRIPTOR) proto-descriptor
selection now picks the message by name when a
google.protobuf.descriptor.Descriptor object is passed. Previously
the first message in the FileDescriptorProto was always chosen, which
silently mis-routed schemas for .proto files containing multiple
messages. Raw-bytes input still falls back to the first message (no
name hint is available).
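
The selection rule can be sketched with plain stand-in types. MessageStub, FileStub, and pick_message are hypothetical and only mimic the shape of a FileDescriptorProto; raising when the hinted name is missing is an assumption, since the notes do not say what the binding does in that case.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MessageStub:
    """Stand-in for one message entry in a file descriptor."""
    name: str


@dataclass
class FileStub:
    """Stand-in for a FileDescriptorProto holding several messages."""
    message_type: List[MessageStub] = field(default_factory=list)


def pick_message(file: FileStub, name_hint: Optional[str]) -> MessageStub:
    if not file.message_type:
        raise ValueError("file descriptor contains no messages")
    if name_hint is None:
        # Raw-bytes input: no name is available, so use the first message.
        return file.message_type[0]
    for message in file.message_type:
        if message.name == name_hint:
            return message
    # Assumption: an unknown name is treated as an error in this sketch.
    raise LookupError(f"message {name_hint!r} not found")


multi = FileStub([MessageStub("Header"), MessageStub("MyMessage")])
print(pick_message(multi, "MyMessage").name)  # MyMessage
print(pick_message(multi, None).name)         # Header (old behaviour for all inputs)
```
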
Documentation
- Updated docstrings, the Arrow example files, and
zerobus.sdk.shared.arrow
to reflect the Beta promotion.
Internal Changes
- Bumped Rust dependencies to match the Rust SDK 2.0.1 workspace:
prost / prost-types 0.13 → 0.14, tonic 0.13 → 0.14, arrow-ipc /
arrow-schema / arrow-array 56.2 → 58.2.
- Removed the in-tree [patch.crates-io] redirect for
databricks-zerobus-ingest-sdk; the 2.0.1 release is now resolved
from crates.io.
- StreamConfigurationOptions and ArrowStreamConfigurationOptions
fields are now applied via StreamBuilder setters because the
underlying Rust structs are #[non_exhaustive] in 2.0.1.
- Bounded the HeadersProviderWrapper static-key leak: header names are
now interned in a process-wide table so each distinct name is leaked at
most once, instead of once per get_headers() call.
- Shared payload-extraction, options-application, and the AckCallback
bridge moved into python/rust/src/common.rs, removing duplicated code
between sync_wrapper.rs and async_wrapper.rs.
- ZerobusSdk.set_use_tls(...) is retained as a no-op for backwards
compatibility. Rust SDK 2.0.1 removed the underlying TLS toggle; TLS is
always controlled via the SDK builder.