Releases: databricks/zerobus-sdk
TypeScript SDK v1.1.0
New Features and Improvements
- Arrow Flight ingestion promoted to Beta. Mirrors the Rust SDK 2.0 promotion. The API is stabilising but may still change before reaching GA. The `arrow-flight` feature is no longer labelled experimental/unsupported in docs and examples.
- macOS pre-built binaries. Added `@databricks/zerobus-ingest-sdk-darwin-x64` and `@databricks/zerobus-ingest-sdk-darwin-arm64` to `optionalDependencies`, so `npm install` on Intel and Apple Silicon Macs now fetches a pre-built `.node` binary instead of falling back to a source build.
- `waitForOffset` precision. Replaced the `Number(bigint)` round-trip with napi-rs's lossless `BigInt::get_i64()`. Both gRPC and Arrow streams now error cleanly on offsets that exceed `i64` range instead of silently truncating past `2^53 - 1`.
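The precision problem can be reproduced outside the SDK. JavaScript's `Number` is an IEEE-754 double, the same representation as Python's `float`, so the sketch below uses Python to show why a round-trip through a double corrupts offsets above `2^53 - 1`, and what an explicit `i64` range check could look like. `lossy_round_trip` and `checked_i64` are hypothetical helpers written for illustration; they are not part of the SDK or of napi-rs.

```python
I64_MIN = -(2**63)
I64_MAX = 2**63 - 1
# Largest n such that every integer up to n is exactly representable as a double.
MAX_SAFE_INTEGER = 2**53 - 1


def lossy_round_trip(offset: int) -> int:
    """Mimic a Number(bigint)-style conversion by passing through a double."""
    return int(float(offset))


def checked_i64(offset: int) -> int:
    """Hypothetical lossless path: return the value unchanged or fail loudly."""
    if not I64_MIN <= offset <= I64_MAX:
        raise OverflowError(f"offset {offset} does not fit in i64")
    return offset


if __name__ == "__main__":
    big = 2**53 + 1
    print(lossy_round_trip(big) == big)  # False: the double cannot hold this value
    print(checked_i64(big) == big)       # True: the integer is passed through as-is
```

The design point is that an out-of-range offset becomes an explicit error instead of a nearby but wrong number.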
Bug Fixes
- Fixed bogus `apache-arrow` peer dependency. v1.0.x declared `apache-arrow: "^56.0.0"`, which doesn't exist on npm (56 was a Rust crate version copied by mistake). Corrected to `^18.0.0` to match the current dev dep range.
Internal Changes
- Depends on Rust SDK 2.0.1. The wrapper now goes through `sdk.stream_builder()` (Rust 2.0 removed the legacy `create_stream_with_headers_provider` / `create_arrow_stream` / `ingest_record` / `ingest_records` methods). The TS-facing API is unchanged: the v1 deprecated `ingestRecord` / `ingestRecords` methods still resolve after server ack (now via `ingest_record_offset` + `wait_for_offset` under the hood).
- Arrow crates bumped 56.2.0 → 58.2 to match the Rust SDK 2.0 workspace.
- `bytes` added so the wrapper can hand IPC payloads to `ingest_ipc_batch` as `Bytes`.
- `napi6` feature enabled on `napi-rs` so `BigInt::get_i64()` is available.
- CI install step switched from `npm ci` to `npm install --no-audit --no-fund`. `npm ci`'s strict lockfile validation rejects `optionalDependencies` referencing a not-yet-published version (every napi-rs major-version bump hits this); `npm install` tolerates it.
Rust SDK v2.1.1
Major Changes
New Features and Improvements
Bug Fixes
- Fix the Arrow Flight example so it works against the prerequisite `orders` table: corrected the schema to `LargeUtf8` for `STRING`, `Timestamp(Microsecond, Some("UTC"))` for `TIMESTAMP`, and `nullable: true`. All four `examples/{json,proto}/{batch,single}.rs` now use `timestamp_micros()` so `created_at` / `updated_at` land at the current time instead of January 1970 (the server stores any int64 in a TIMESTAMP column without unit validation).
Documentation
- Enable `all-features` on docs.rs so `arrow-flight` and `zeroparser` are visible. Re-export `TimeUnit` from the SDK root.
- Refresh `rust/README.md`: correct `prost` / `tokio` versions in the install snippet, fix the schema-tool build command, advertise the `arrow-flight` Beta feature, and update Repository Structure.
Internal Changes
Breaking Changes
Deprecations
API Changes
Rust SDK v2.1.0
Major Changes
New Features and Improvements
- `zeroparser` (opt-in Cargo feature): zero-copy, descriptor-driven protobuf parser for ingestion paths where the schema is only known at runtime. Exposes `databricks_zerobus_ingest_sdk::zeroparser`. Off by default; see `sdk/src/zeroparser/README.md`.
Bug Fixes
Documentation
Internal Changes
Breaking Changes
Deprecations
API Changes
Java SDK v1.2.0
New Features and Improvements
- Built on Rust SDK 2.0.1: The JNI layer (`rust/jni/`) now depends on `databricks-zerobus-ingest-sdk = "2.0.1"` (was 2.0.0). The Java-facing API surface (`ZerobusSdk`, `ZerobusProtoStream`, `ZerobusJsonStream`, `ZerobusArrowStream`, `StreamConfigurationOptions`, `ArrowStreamConfigurationOptions`) is unchanged.
- Arrow Flight promoted to Beta: `ZerobusArrowStream` / `ArrowStreamConfigurationOptions` and the surrounding documentation (README, examples) are no longer labelled experimental. The API is stabilising but may still change before reaching GA. (Mirrors the Rust SDK 2.0.1 promotion.)
- `ArrowStreamConfigurationOptions`: Added `streamPausedMaxWaitTimeMs` for the maximum time (milliseconds) to wait in the paused state during graceful close (`-1` = full server duration, `0` = immediate recovery).
- Java SDK identifier on the wire: The SDK now reports itself as `zerobus-sdk-java/<version>` on the HTTP `user-agent` header (previously it inherited the Rust SDK identifier `zerobus-sdk-rs/<rust-version>`). Server-side telemetry can now distinguish Java clients from Rust clients.
Behavior Changes
- `StreamConfigurationOptions.maxInflightRecords` default raised from `50_000` to `1_000_000` to match the Rust SDK 2.0.1 default. The Java wrapper previously hard-coded 50k while the Rust SDK quietly raised its default to 1M, so Java clients ran with a 20× lower in-flight ceiling than Rust clients. Callers who relied on the old cap can pin it back explicitly: `StreamConfigurationOptions.builder().setMaxInflightRecords(50_000).build();`
Documentation
- Updated `ArrowIngestionExample.java` to demonstrate all three IPC compression codecs end-to-end. Opens three streams in sequence (`NONE`, `LZ4_FRAME`, `ZSTD`), ingests 10 batches per stream, then `waitForOffset` + `flush` + `close`.
- Updated `README.md`, `examples/README.md`, `examples/arrow/README.md`, and Javadoc on `ZerobusArrowStream` / `ArrowStreamConfigurationOptions` / `ZerobusSdk.createArrowStream` to reflect the Beta promotion.
- Documented the JDK 9+ `--add-opens=java.base/java.nio=...` JVM flags required by `arrow-memory-netty` 17.x when using `ZerobusArrowStream`, in both the main README and the Arrow examples README.
Internal Changes
- `maven-surefire-plugin` now passes the required `--add-opens` flags so Arrow integration tests run cleanly on JDK 9+ without per-developer setup.
- Bumped `zerobus-jni` crate version from `1.1.1` to `1.2.0` (used by the Java SDK identifier embedded at compile time via `CARGO_PKG_VERSION`).
Go SDK v1.2.0
Release v1.2.0
New Features and Improvements
- `IngestRecordNowait` / `IngestRecordsNowait`: New fire-and-forget ingestion methods on `ZerobusStream`. Both return immediately after spawning a background task to queue the record; ingestion errors from the background task are silently ignored. `IngestRecordNowait` accepts a single `[]byte` (protobuf) or `string` (JSON) payload; `IngestRecordsNowait` accepts a batch as `[]interface{}`.
- Arrow Flight promoted to Beta: The Arrow Flight ingestion API (`ZerobusArrowStream`, `CreateArrowStream`, `CreateArrowStreamWithHeadersProvider`, `ArrowStreamConfigurationOptions`) is no longer labelled experimental/unsupported. The API is stabilising but may still change before reaching GA.
- Arrow Flight graceful stream close: When the server signals an impending close, the client pauses sends, drains in-flight acks within a bounded wait, then recovers.
- `ArrowStreamConfigurationOptions.StreamPausedMaxWaitTimeMs`: Optional `*uint64` limiting how long to wait (ms) while paused (`nil` = full server duration, `0` = immediate recovery).
Bug Fixes
- Reduced GC pressure in batch ingest FFI paths (#271): `streamIngestJSONRecords` was allocating one heap-allocated closure per record per call (defer-in-loop). These closures are not pooled by the Go runtime, causing measurable allocation growth at high ingestion rates. Fixed by replacing N defers with a single closure. `streamIngestProtoRecords` was also allocating the pointer/length arrays on the Go heap and unnecessarily pinning them; both are now allocated in C memory via `C.malloc`.
- Vendoring support: `go mod vendor` now preserves the prebuilt FFI archives under `lib/<GOOS>_<GOARCH>/` when downstream consumers vendor this module. Previously, cgo `#cgo LDFLAGS` paths were invisible to the vendor tool's dependency analysis, so vendored builds failed to link.
Python SDK v1.3.0
New Features and Improvements
- Built on Rust SDK 2.0.1: The Python wrapper now depends on `databricks-zerobus-ingest-sdk = "2.0.1"` from crates.io (was 1.2.0). The internal PyO3 binding was rewritten to use the new `StreamBuilder` typestate API exclusively. The Python-facing API surface (`TableProperties`, `create_stream(client_id, client_secret, table_properties, options, headers_provider)`, `ingest_record`, `RecordAcknowledgment`, etc.) is unchanged.
- Arrow Flight promoted to Beta: `ZerobusArrowStream` / `ArrowStreamConfigurationOptions` and the surrounding documentation are no longer labelled experimental/unsupported. The API is stabilising but may still change before reaching GA. (Mirrors the Rust SDK 2.0.1 promotion.)
- Arrow Flight graceful stream close: On a server-signaled close, the client pauses sending, drains in-flight acks within a bounded wait, then recovers.
- `stream_paused_max_wait_time_ms` on `ArrowStreamConfigurationOptions`: Optional cap in milliseconds for the paused wait (`None` = full server duration, `0` = immediate recovery).
- Python SDK identifier on the wire: The SDK now reports itself as `zerobus-sdk-py/<version>` on the HTTP `user-agent` header (previously it inherited the Rust SDK identifier `zerobus-sdk-rs/<rust-version>`). Server-side telemetry can now tell Python clients apart from Rust clients.
- `AckCallback.on_error` is now delivered to Python: The PyO3 binding previously only logged ack errors via `eprintln!`, so subclasses overriding `on_error` never saw the call. They now do.
Behavior Changes
- `StreamConfigurationOptions.record_type` is no longer applied: Format is now set on the stream builder from `TableProperties` (proto descriptor present → Proto, absent → JSON). The field is kept for API compatibility.
- `StreamConfigurationOptions.max_inflight_records` default raised from `50_000` to `1_000_000` to match the Rust SDK 2.0.1 default. Previously the Python wrapper hard-coded 50k while the Rust SDK quietly raised its default to 1M, so Python clients ran with a 20× lower in-flight ceiling than Rust clients. Callers who relied on the old cap can pin it back explicitly: `StreamConfigurationOptions(max_inflight_records=50_000)`
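The `record_type` change amounts to deriving the wire format from whether a proto descriptor was supplied. The following is a minimal sketch of that rule only; `RecordFormat`, `TablePropsSketch`, and `infer_format` are hypothetical names invented for illustration and do not mirror the binding's actual types.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class RecordFormat(Enum):
    PROTO = "proto"
    JSON = "json"


@dataclass
class TablePropsSketch:
    """Hypothetical stand-in for table properties: a name plus an optional descriptor."""
    table_name: str
    descriptor_proto: Optional[bytes] = None


def infer_format(props: TablePropsSketch) -> RecordFormat:
    # Descriptor present -> Proto; absent -> JSON. No separate record_type is consulted.
    if props.descriptor_proto is not None:
        return RecordFormat.PROTO
    return RecordFormat.JSON
```

Under this rule a stale `record_type` value cannot disagree with the descriptor, which is presumably why the field is now ignored.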
Bug Fixes
- `TableProperties(name, MyMessage.DESCRIPTOR)` proto-descriptor selection now picks the message by name when a `google.protobuf.descriptor.Descriptor` object is passed. Previously the first message in the `FileDescriptorProto` was always chosen, which silently mis-routed schemas for `.proto` files containing multiple messages. Raw-bytes input still falls back to the first message (no name hint is available).
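The selection rule described in this fix can be sketched with a simplified model in which a file descriptor is reduced to an ordered list of message names. `select_message` is a hypothetical helper, and raising on an unknown name is an assumption of this sketch, not documented SDK behaviour.

```python
from typing import Optional, Sequence


def select_message(message_names: Sequence[str], wanted: Optional[str]) -> str:
    """Pick a message from a file's message list.

    wanted is the name hint taken from a Descriptor object, or None when
    only raw descriptor bytes were supplied.
    """
    if not message_names:
        raise ValueError("file descriptor contains no messages")
    if wanted is None:
        # Raw-bytes input: no name hint, so fall back to the first message.
        return message_names[0]
    if wanted not in message_names:
        # Assumption for this sketch: an unknown name is an error.
        raise LookupError(f"message {wanted!r} not found in file descriptor")
    return wanted


if __name__ == "__main__":
    names = ["Customer", "Order"]
    print(select_message(names, "Order"))  # Order (old behaviour would pick Customer)
    print(select_message(names, None))     # Customer
```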
Documentation
- Updated docstrings, the Arrow example files, and `zerobus.sdk.shared.arrow` to reflect the Beta promotion.
Internal Changes
- Bumped Rust dependencies to match the Rust SDK 2.0.1 workspace: `prost` / `prost-types` 0.13 → 0.14, `tonic` 0.13 → 0.14, `arrow-ipc` / `arrow-schema` / `arrow-array` 56.2 → 58.2.
- Removed the in-tree `[patch.crates-io]` redirect for `databricks-zerobus-ingest-sdk`; the 2.0.1 release is now resolved from crates.io.
- `StreamConfigurationOptions` and `ArrowStreamConfigurationOptions` fields are now applied via `StreamBuilder` setters because the underlying Rust structs are `#[non_exhaustive]` in 2.0.1.
- Bounded the `HeadersProviderWrapper` static-key leak: header names are now interned in a process-wide table so each distinct name is leaked at most once, instead of once per `get_headers()` call.
- Shared payload-extraction, options-application, and the `AckCallback` bridge moved into `python/rust/src/common.rs`, removing duplicated code between `sync_wrapper.rs` and `async_wrapper.rs`.
- `ZerobusSdk.set_use_tls(...)` is retained as a no-op for backwards compatibility. Rust SDK 2.0.1 removed the underlying TLS toggle; TLS is always controlled via the SDK builder.
Rust SDK v2.0.1
Major Changes
New Features and Improvements
Bug Fixes
- Arrow Flight: fix race condition causing stale wire offsets after non-close-signal recovery. When a stream broke via a server error or ack timeout (rather than a graceful close signal), the supervisor did not set the ingest-pause gate before starting reconnect. A concurrent `ingest_batch` call could send a batch with a pre-recovery wire offset, which the server rejects with error code 4002 (NonIncrementalOffset), exhausting recovery retries and failing the entire stream. Fix: set `is_paused = true` immediately when entering the retriable-error retry branch, symmetric with the existing close-signal path.
- Arrow Flight: restore automatic batch chunking at 2 MiB. Reverted the manual zero-copy IPC encoding introduced in v2.0.0 back to `FlightDataEncoderBuilder`, which automatically chunks large `RecordBatch` values at 2 MiB. The zero-copy refactor had removed this chunking, causing large batches to exceed the server's 10 MB message size limit and be rejected. `ingest_ipc_batch` now deserialises IPC bytes into a `RecordBatch` before encoding, so it correctly benefits from the same chunking and supports streams with `ipc_compression` enabled.
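To illustrate why chunking keeps large batches under a message size limit, here is a simplified sketch that splits a batch into row ranges whose estimated size stays at or below a 2 MiB target. It is not the `FlightDataEncoderBuilder` algorithm, which works on Arrow buffers and its own size estimates; `chunk_rows` and the per-row size list are assumptions made for illustration.

```python
from typing import Iterator, Sequence, Tuple

TARGET_CHUNK_BYTES = 2 * 1024 * 1024  # 2 MiB target per encoded chunk


def chunk_rows(
    row_sizes: Sequence[int], limit: int = TARGET_CHUNK_BYTES
) -> Iterator[Tuple[int, int]]:
    """Yield half-open (start, end) row ranges whose summed size is <= limit.

    A single row larger than limit still gets a chunk of its own.
    """
    start = 0
    used = 0
    for i, size in enumerate(row_sizes):
        if used and used + size > limit:
            yield (start, i)
            start, used = i, 0
        used += size
    if start < len(row_sizes):
        yield (start, len(row_sizes))


if __name__ == "__main__":
    one_mib = 1024 * 1024
    print(list(chunk_rows([one_mib] * 5)))  # [(0, 2), (2, 4), (4, 5)]
```

With a 2 MiB target, each chunk sits well below the 10 MB server limit mentioned above, whereas sending the whole batch as one message can exceed it.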
Documentation
Internal Changes
Breaking Changes
Deprecations
API Changes
FFI Libraries v1.2.1
Major Changes
New Features and Improvements
Bug Fixes
- `zerobus_arrow_stream_ingest_batch_via_record_batch` now works correctly on compression-enabled streams. Previously the function performed its own IPC deserialization and called `ingest_batch` directly, bypassing the compression re-encoding step. It now delegates to `ingest_ipc_batch`, which handles compression transparently. The function is now fully equivalent to `zerobus_arrow_stream_ingest_batch` regardless of stream configuration.
Documentation
Internal Changes
Breaking Changes
Deprecations
API Changes
FFI Libraries v1.2.0
Major Changes
New Features and Improvements
- Arrow stream options (C API): `CArrowStreamConfigurationOptions.stream_paused_max_wait_time_ms` (`int64_t`) configures graceful-close paused wait: `-1` = None (full server duration), `0` = immediate recovery, `>0` = capped wait (see `zerobus.h` comments).
- Zero-copy Arrow IPC ingestion: `zerobus_arrow_stream_ingest_batch` now forwards IPC bytes directly via `ingest_ipc_batch`, skipping the deserialization round-trip. Use `zerobus_arrow_stream_ingest_batch_via_record_batch` for compression-enabled streams.
- Fire-and-forget ingestion: Added nowait variants that spawn a background task and return immediately: `zerobus_stream_ingest_proto_record_nowait`, `zerobus_stream_ingest_json_record_nowait`, `zerobus_stream_ingest_proto_records_nowait`, `zerobus_stream_ingest_json_records_nowait`.
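The `stream_paused_max_wait_time_ms` field encodes an optional value in a plain `int64_t` using `-1` as a sentinel. A minimal sketch of decoding that convention into an optional value follows; `paused_wait_from_c` is a hypothetical helper, and rejecting values below `-1` is an assumption of this sketch because the release notes do not say how they are treated.

```python
from typing import Optional


def paused_wait_from_c(value: int) -> Optional[int]:
    """Decode the C-side sentinel convention into an optional millisecond cap.

    -1 -> None (wait for the full server-provided duration)
     0 -> 0    (recover immediately)
    >0 -> cap  (wait at most this many milliseconds)
    """
    if value == -1:
        return None
    if value < -1:
        # Assumption: anything below the sentinel is treated as invalid here.
        raise ValueError(f"invalid stream_paused_max_wait_time_ms: {value}")
    return value
```

The Go (`nil`) and Python (`None`) wrappers expose the same three cases with a native optional type instead of the sentinel.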
Bug Fixes
- Arrow IPC compression fix: Added `zerobus_arrow_stream_ingest_batch_via_record_batch` for streams created with `LZ4_FRAME` or `ZSTD` compression. The existing `zerobus_arrow_stream_ingest_batch` uses the zero-copy path and does not apply compression; callers must use the new function when compression is configured. This fixes a regression where compression was silently ignored.
Documentation
Internal Changes
Breaking Changes
Deprecations
API Changes
- Added `zerobus_arrow_stream_ingest_batch_via_record_batch(stream, ipc_bytes, ipc_len, result)` for compression-enabled Arrow streams.
- Added `zerobus_stream_ingest_proto_record_nowait`, `zerobus_stream_ingest_json_record_nowait`, `zerobus_stream_ingest_proto_records_nowait`, `zerobus_stream_ingest_json_records_nowait` for fire-and-forget ingestion.
Rust SDK v2.0.0
New Features and Improvements
- Arrow Flight ingestion promoted to Beta: The `arrow-flight` feature (`ZerobusArrowStream`, `ArrowStreamConfigurationOptions`, and related types) is no longer labelled experimental/unsupported. The API is stabilising but may still change before reaching GA.
- Arrow schema from UC schema (feature `arrow-flight`): `schema::arrow_schema_from_uc_columns` and `schema::arrow_schema_from_uc_schema` build an `arrow_schema::Schema` directly from Unity Catalog metadata, parallel to the existing `descriptor_from_uc_*` functions. Emits native Arrow types (`Date32`, `Timestamp(Microsecond, ..)`, `LargeUtf8`, `LargeBinary`, `Map("entries", Struct{keys,values})`) matching the canonical Arrow schema the Databricks Arrow Flight server builds from Delta.
- `ZerobusSdkBuilder::application_name`: Set a custom application identifier appended to the HTTP `user-agent` header (sent on the underlying tonic `Endpoint`) on every request. The default `zerobus-sdk-rs/<version>` prefix is preserved for server-side telemetry, so the wire value becomes `zerobus-sdk-rs/<version> <application_name>`. The previous `x-zerobus-sdk` gRPC metadata header is no longer emitted; downstream consumers that parsed it should switch to reading `user-agent`.
- `ZerobusSdkBuilder::sdk_identifier`: Override the SDK prefix of the HTTP `user-agent` header, replacing the default `zerobus-sdk-rs/<version>`. Intended for wrapper SDKs that need to replace the SDK identification; most callers should prefer `application_name`, which preserves the SDK version prefix. When both are set, `application_name` is still appended, so the wire value becomes `<sdk_identifier> <application_name>`.
Bug Fixes
- Corrected the values returned by the C FFI `zerobus_get_default_config()` for `callback_max_wait_time_ms` / `has_callback_max_wait_time_ms`. The function previously reported `0 / false` (i.e., "no callback timeout"), while the actual Rust SDK default is `Some(5000ms)`. The C-side defaults now correctly mirror the Rust defaults (`5000 / true`).
Documentation
- Updated `rust/README.md`, `rust/examples/README.md`, `rust/examples/json/README.md`, and `rust/examples/proto/README.md` to remove all references to the deleted future-based APIs. The "Future-based API (Deprecated)" example sections and the deprecated method entries in the API Reference were removed.
- Added an Arrow Flight example under `examples/arrow/` (`example_arrow`) demonstrating both `ingest_batch` (`RecordBatch`) and `ingest_ipc_batch` (Arrow IPC bytes).
Internal Changes
- Consolidated Cargo workspace dependencies under `[workspace.dependencies]` in `rust/Cargo.toml`; member crates now use `dep.workspace = true` so versions are pinned in one place.
- Collapsed the four example packages (`example_json_{single,batch}`, `example_proto_{single,batch}`) into two packages, `rust-examples-json` and `rust-examples-proto`, each exposing two `[[example]]` targets. Examples are invoked as `cargo run -p rust-examples-json --example json_{single,batch}` and `cargo run -p rust-examples-proto --example proto_{single,batch}`.
- Bumped `prost` and `prost-types` from 0.13 to 0.14; `prost-reflect` from 0.14 to 0.16. Public APIs that name `prost::Message` (e.g. `ProtoMessage<T: prost::Message>`) now require callers to use prost 0.14 messages.
- Bumped `tonic` from 0.13 to 0.14. The 0.14 release splits code generation into separate crates: build-time codegen now uses `tonic-prost-build` (replacing `tonic-build`), and the runtime depends on the new `tonic-prost` crate for the prost codec. `sdk/build.rs`, `tests/build.rs`, and `tools/generate_files/src/generate.rs` were updated accordingly.
- Bumped Arrow crates (`arrow-flight`, `arrow-array`, `arrow-schema`, `arrow-ipc`) from 56.2.0 to 58.2. Switched `IpcDataGenerator::encoded_batch` to the non-deprecated `encode` API, which takes an explicit `CompressionContext`.
- Raised minimum-version floors on several non-breaking dependencies to current latest minor: `tokio` 1.42 → 1.52, `tokio-stream` 0.1.16 → 0.1.18, `tokio-util` 0.7.17 → 0.7.18, `once_cell` 1.19 → 1.21, `bytes` 1 → 1.11, `tempfile` 3.21 → 3.27, `clap` 4 → 4.6, `urlencoding` 2 → 2.1.
- Migrated the FFI and JNI crates off the deleted stream-creation methods. Both wrappers now build streams via `StreamBuilder`. Default config in `zerobus_get_default_config()` / `zerobus_arrow_get_default_config()` now reads `stream_options::defaults::*` constants directly instead of constructing `*ConfigurationOptions` (no longer needed at the FFI layer). No C ABI or JNI signature changes.
- FFI and JNI no longer construct `StreamConfigurationOptions` / `ArrowStreamConfigurationOptions`. They read C/Java struct fields directly and apply each via builder setters.
Breaking Changes
- Removed `ZerobusSdk::create_stream()` (in deprecation since v1.3.0). Use `sdk.stream_builder().table(name).oauth(id, secret).json()` / `.compiled_proto(desc).build().await` instead. Removed from all examples, documentation, and tests.
- Removed `ZerobusSdk::create_stream_with_headers_provider()` (in deprecation since v1.3.0). Use `sdk.stream_builder().table(name).headers_provider(p).json()` / `.compiled_proto(desc).build().await` instead. Removed from all examples, documentation, and tests.
- Removed `ZerobusSdk::create_arrow_stream()` (feature `arrow-flight`) (in deprecation since v1.3.0). Use `sdk.stream_builder().table(name).oauth(id, secret).arrow(schema).build_arrow().await` instead. Removed from all examples, documentation, and tests.
- Removed `ZerobusSdk::create_arrow_stream_with_headers_provider()` (feature `arrow-flight`) (in deprecation since v1.3.0). Use `sdk.stream_builder().table(name).headers_provider(p).arrow(schema).build_arrow().await` instead. Removed from all examples, documentation, and tests.
- Removed `ZerobusStream::ingest_record()` (in deprecation since v0.4.0). Use `stream.ingest_record_offset(payload).await?` followed by `stream.wait_for_offset(offset).await?` to wait for acknowledgment. Removed from all examples, documentation, and tests.
- Removed `ZerobusStream::ingest_records()` (in deprecation since v0.4.0). Use `stream.ingest_records_offset(payloads).await?` followed by `stream.wait_for_offset(offset).await?`. Removed from all examples, documentation, and tests.
- Removed `ZerobusSdk::new()` (in deprecation since v0.5.0). Use `ZerobusSdk::builder().endpoint(...).unity_catalog_url(...).build()?` instead.
- Removed the `ZerobusSdk::use_tls` field (in deprecation since v0.5.0). TLS is controlled via `ZerobusSdkBuilder::tls_config(...)`. The C FFI `zerobus_sdk_set_use_tls()` function is retained as a no-op for ABI compatibility.
- Removed the `test_proto_stream_creation_without_descriptor_fails` test because the typestate `StreamBuilder` makes that scenario impossible at compile time.
- Added `#[non_exhaustive]` to `StreamConfigurationOptions`. External crates can no longer construct the struct via struct-literal syntax; all configuration must go through `StreamBuilder` setters. Field reads via `stream.options.*` are unaffected. Adding new config fields in future releases is now non-breaking.
- Added `#[non_exhaustive]` to `ArrowStreamConfigurationOptions`. Same semantics as above; reads via `stream.options().*` are unaffected.
- Added `#[non_exhaustive]` to `ZerobusError`, `StreamType`, and `SchemaError` enums. External `match` expressions on these types now require a `_ =>` wildcard arm. Adding new variants is non-breaking.
- Added `#[non_exhaustive]` to `ZerobusSdk`, `ZerobusStream`, and `ZerobusArrowStream` structs. Adding new fields to these top-level handle types is non-breaking.
- `TableProperties` and `ArrowTableProperties` are now `pub(crate)` and no longer part of the public API. They are only used internally by `StreamBuilder`; after the deletion of the deprecated `create_*_stream()` methods there are no external constructors.
- Removed `ZerobusArrowStream::table_properties()` getter (returned the now-private `ArrowTableProperties`). Use the existing `table_name()` and `schema()` getters instead.
- Major-version bumps of `prost` (0.13 → 0.14), `tonic` (0.13 → 0.14), `prost-reflect` (0.14 → 0.16), and the Arrow crates (56 → 58). Downstream consumers that directly handle SDK-exported `prost::Message` or `arrow_array::RecordBatch` values must move to the matching major versions of those crates.