Releases: databricks/zerobus-sdk

TypeScript SDK v1.1.0

28 May 07:32
typescript/v1.1.0
a579b23

New Features and Improvements

  • Arrow Flight ingestion promoted to Beta. Mirrors the Rust SDK 2.0
    promotion. The API is stabilising but may still change before reaching GA.
    The arrow-flight feature is no longer labelled experimental/unsupported
    in docs and examples.
  • macOS pre-built binaries. Added @databricks/zerobus-ingest-sdk-darwin-x64
    and @databricks/zerobus-ingest-sdk-darwin-arm64 to optionalDependencies,
    so npm install on Intel and Apple Silicon Macs now fetches a pre-built
    .node binary instead of falling back to a source build.
  • waitForOffset precision. Replaced the Number(bigint) round-trip
    with napi-rs's lossless BigInt::get_i64(). Both gRPC and Arrow streams
    now error cleanly on offsets that exceed i64 range instead of silently
    truncating past 2^53 - 1.
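To illustrate the waitForOffset precision item above, here is a small Python sketch of why a double-precision round-trip loses offsets past 2^53 - 1. Python's float is the same IEEE-754 double as JavaScript's Number, so it stands in for Number(bigint) here. The helper names lossy_round_trip and checked_i64 are hypothetical and are not part of the SDK.

```python
# JavaScript's Number and Python's float are both IEEE-754 doubles, so a
# float round-trip reproduces the precision loss described in the note.
MAX_SAFE_INTEGER = 2**53 - 1
I64_MIN, I64_MAX = -(2**63), 2**63 - 1


def lossy_round_trip(offset: int) -> int:
    """Emulate the old path: integer -> double -> integer."""
    return int(float(offset))


def checked_i64(offset: int) -> int:
    """Emulate a lossless path: keep the integer, reject values outside i64."""
    if not I64_MIN <= offset <= I64_MAX:
        raise OverflowError(f"offset {offset} does not fit in i64")
    return offset


offset = MAX_SAFE_INTEGER + 2            # 2**53 + 1, not representable as a double
print(lossy_round_trip(offset) == offset)  # False: the double rounds to 2**53
print(checked_i64(offset) == offset)       # True: the value is preserved
```

Values beyond the i64 range raise an error in this sketch instead of being silently altered, which is the behaviour the release note describes for both stream types.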

Bug Fixes

  • Fixed bogus apache-arrow peer dependency. v1.0.x declared
    apache-arrow: "^56.0.0", which doesn't exist on npm (56 was a Rust
    crate version copied by mistake). Corrected to ^18.0.0 to match the
    current dev dep range.

Internal Changes

  • Depends on Rust SDK 2.0.1. The wrapper now goes through
    sdk.stream_builder() (Rust 2.0 removed the legacy
    create_stream_with_headers_provider / create_arrow_stream /
    ingest_record / ingest_records methods). The TS-facing API is
    unchanged — the v1 deprecated ingestRecord / ingestRecords methods
    still resolve after server ack (now via ingest_record_offset +
    wait_for_offset under the hood).
  • Arrow crates bumped 56.2.0 → 58.2 to match the Rust SDK 2.0
    workspace. bytes added so the wrapper can hand IPC payloads to
    ingest_ipc_batch as Bytes.
  • napi6 feature on napi-rs so BigInt::get_i64() is available.
  • CI install step switched from npm ci to npm install --no-audit --no-fund.
    npm ci's strict lockfile validation rejects optionalDependencies
    referencing a not-yet-published version (every napi-rs major-version bump
    hits this); npm install tolerates it.
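The first item in this list says the deprecated ingestRecord path still resolves after server ack by combining ingest_record_offset with wait_for_offset. The asyncio sketch below shows that composition in Python. FakeStream is a hypothetical in-memory stand-in that simulates a delayed acknowledgment; it is not the SDK's stream type, and only the two method names are taken from the notes.

```python
import asyncio
from typing import Dict


class FakeStream:
    """Hypothetical stand-in for a stream; acks arrive after a short delay."""

    def __init__(self) -> None:
        self._acks: Dict[int, asyncio.Future] = {}

    async def ingest_record_offset(self, payload: bytes) -> int:
        # Queue the record and return its offset without waiting for the ack.
        loop = asyncio.get_running_loop()
        offset = len(self._acks)
        ack = loop.create_future()
        self._acks[offset] = ack
        loop.call_later(0.01, ack.set_result, None)  # simulated server ack
        return offset

    async def wait_for_offset(self, offset: int) -> None:
        # Resolve once the simulated server has acknowledged this offset.
        await self._acks[offset]


async def ingest_record(stream: FakeStream, payload: bytes) -> int:
    """Legacy-style call: returns only after the record is acknowledged."""
    offset = await stream.ingest_record_offset(payload)
    await stream.wait_for_offset(offset)
    return offset


print(asyncio.run(ingest_record(FakeStream(), b"record")))
```

Splitting the two steps lets a caller queue many records first and wait once on the last offset, while the wrapper above keeps the older one-call behaviour.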

Rust SDK v2.1.1

27 May 22:58
rust/v2.1.1
9990a36

Bug Fixes

  • Fix the Arrow Flight example so it works against the prerequisite orders
    table — corrected the schema to LargeUtf8 for STRING,
    Timestamp(Microsecond, Some("UTC")) for TIMESTAMP, and nullable: true.
    All four examples/{json,proto}/{batch,single}.rs now use
    timestamp_micros() so created_at / updated_at land at the current
    time instead of January 1970 (the server stores any int64 in a TIMESTAMP
    column without unit validation).
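The "January 1970" symptom follows directly from unit mismatch: a value counted in seconds or milliseconds, read as microseconds, is only a few minutes or days past the epoch. The Python sketch below demonstrates this with an arbitrary fixed instant. Which unit the examples used before the fix is not stated in the notes, so both seconds and milliseconds are shown as assumptions; as_timestamp_micros is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def as_timestamp_micros(raw: int) -> datetime:
    """Interpret a raw int64 as microseconds since the Unix epoch."""
    return EPOCH + timedelta(microseconds=raw)


# Arbitrary fixed instant, chosen only to keep the example reproducible.
instant = datetime(2024, 1, 1, tzinfo=timezone.utc)
elapsed = instant - EPOCH

seconds = elapsed // timedelta(seconds=1)
millis = elapsed // timedelta(milliseconds=1)
micros = elapsed // timedelta(microseconds=1)

print(as_timestamp_micros(seconds))  # wrong unit: lands in January 1970
print(as_timestamp_micros(millis))   # wrong unit: lands in January 1970
print(as_timestamp_micros(micros))   # correct unit: the original instant
```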

Documentation

  • Enable all-features on docs.rs so arrow-flight and zeroparser are
    visible. Re-export TimeUnit from the SDK root.
  • Refresh rust/README.md: correct prost / tokio versions in the
    install snippet, fix the schema-tool build command, advertise the
    arrow-flight Beta feature, and update Repository Structure.

Rust SDK v2.1.0

27 May 13:05
rust/v2.1.0
0a74bab

New Features and Improvements

  • zeroparser (opt-in Cargo feature): zero-copy, descriptor-driven
    protobuf parser for ingestion paths where the schema is only known at
    runtime. Exposes databricks_zerobus_ingest_sdk::zeroparser. Off by
    default; see sdk/src/zeroparser/README.md.

Java SDK v1.2.0

27 May 09:13
java/v1.2.0
a8319b8

New Features and Improvements

  • Built on Rust SDK 2.0.1: The JNI layer (rust/jni/) now depends on
    databricks-zerobus-ingest-sdk = "2.0.1" (was 2.0.0). The Java-facing API
    surface (ZerobusSdk, ZerobusProtoStream, ZerobusJsonStream,
    ZerobusArrowStream, StreamConfigurationOptions,
    ArrowStreamConfigurationOptions) is unchanged.
  • Arrow Flight promoted to Beta: ZerobusArrowStream /
    ArrowStreamConfigurationOptions and the surrounding documentation
    (README, examples) are no longer labelled experimental. The API is
    stabilising but may still change before reaching GA. (Mirrors the Rust
    SDK 2.0.1 promotion.)
  • ArrowStreamConfigurationOptions: Added streamPausedMaxWaitTimeMs, the
    maximum time (in milliseconds) to wait in the paused state during
    graceful close (-1 = full server duration, 0 = immediate recovery).
  • Java SDK identifier on the wire: The SDK now reports itself as
    zerobus-sdk-java/<version> on the HTTP user-agent header (previously
    it inherited the Rust SDK identifier zerobus-sdk-rs/<rust-version>).
    Server-side telemetry can now distinguish Java clients from Rust clients.

Behavior Changes

  • StreamConfigurationOptions.maxInflightRecords default raised from
    50_000 to 1_000_000 to match the Rust SDK 2.0.1 default. The Java
    wrapper previously hard-coded 50k while the Rust SDK quietly raised its
    default to 1M, so Java clients ran with a 20× lower in-flight ceiling
    than Rust clients. Callers who relied on the old cap can pin it back
    explicitly:

    StreamConfigurationOptions.builder()
        .setMaxInflightRecords(50_000)
        .build();

Documentation

  • Updated ArrowIngestionExample.java to demonstrate all three IPC
    compression codecs end-to-end. Opens three streams in sequence (NONE,
    LZ4_FRAME, ZSTD), ingests 10 batches per stream, then
    waitForOffset + flush + close.
  • Updated README.md, examples/README.md, examples/arrow/README.md,
    and Javadoc on ZerobusArrowStream / ArrowStreamConfigurationOptions
    / ZerobusSdk.createArrowStream to reflect the Beta promotion.
  • Documented the JDK 9+ --add-opens=java.base/java.nio=... JVM flags
    required by arrow-memory-netty 17.x when using ZerobusArrowStream,
    in both the main README and the Arrow examples README.

Internal Changes

  • maven-surefire-plugin now passes the required --add-opens flags so
    Arrow integration tests run cleanly on JDK 9+ without per-developer
    setup.
  • Bumped zerobus-jni crate version from 1.1.1 to 1.2.0 (used by the
    Java SDK identifier embedded at compile time via CARGO_PKG_VERSION).

Go SDK v1.2.0

28 May 07:41
feb1e53

Release v1.2.0

New Features and Improvements

  • IngestRecordNowait / IngestRecordsNowait: New fire-and-forget ingestion methods on ZerobusStream. Both return immediately after spawning a background task to queue the record(s); ingestion errors from the background task are silently ignored. IngestRecordNowait accepts a single []byte (protobuf) or string (JSON) payload; IngestRecordsNowait accepts a batch as []interface{}.
  • Arrow Flight promoted to Beta: The Arrow Flight ingestion API (ZerobusArrowStream, CreateArrowStream, CreateArrowStreamWithHeadersProvider, ArrowStreamConfigurationOptions) is no longer labelled experimental/unsupported. The API is stabilising but may still change before reaching GA.
  • Arrow Flight — graceful stream close: When the server signals an impending close, the client pauses sends, drains in-flight acks within a bounded wait, then recovers.
  • ArrowStreamConfigurationOptions.StreamPausedMaxWaitTimeMs: Optional *uint64 limiting how long to wait (ms) while paused (nil = full server duration, 0 = immediate recovery).

Bug Fixes

  • Reduced GC pressure in batch ingest FFI paths (#271): streamIngestJSONRecords was allocating one closure on the heap per record per call (defer-in-loop). These closures are not pooled by the Go runtime, causing measurable allocation growth at high ingestion rates. Fixed by replacing N defers with a single closure. streamIngestProtoRecords was also allocating the pointer/length arrays on the Go heap and unnecessarily pinning them; both are now allocated in C memory via C.malloc.
  • Vendoring support: go mod vendor now preserves the prebuilt FFI archives under lib/<GOOS>_<GOARCH>/ when downstream consumers vendor this module. Previously, cgo #cgo LDFLAGS paths were invisible to the vendor tool's dependency analysis, so vendored builds failed to link.

Python SDK v1.3.0

24 May 20:22
ac9fa6d

New Features and Improvements

  • Built on Rust SDK 2.0.1: The Python wrapper now depends on
    databricks-zerobus-ingest-sdk = "2.0.1" from crates.io (was 1.2.0).
    Internal PyO3 binding was rewritten to use the new StreamBuilder
    typestate API exclusively. The Python-facing API surface
    (TableProperties, create_stream(client_id, client_secret, table_properties, options, headers_provider), ingest_record,
    RecordAcknowledgment, etc.) is unchanged.
  • Arrow Flight promoted to Beta: ZerobusArrowStream /
    ArrowStreamConfigurationOptions and the surrounding documentation are
    no longer labelled experimental/unsupported. The API is stabilising but
    may still change before reaching GA. (Mirrors the Rust SDK 2.0.1
    promotion.)
  • Arrow Flight graceful stream close: When the server signals an impending close, the client pauses sending, drains in-flight acks within a bounded wait, then recovers.
  • stream_paused_max_wait_time_ms on ArrowStreamConfigurationOptions: Optional milliseconds cap for the paused wait (None = full server duration, 0 = immediate recovery).
  • Python SDK identifier on the wire: The SDK now reports itself as
    zerobus-sdk-py/<version> on the HTTP user-agent header
    (previously it inherited the Rust SDK identifier
    zerobus-sdk-rs/<rust-version>). Server-side telemetry can now tell
    Python clients apart from Rust clients.
  • AckCallback.on_error is now delivered to Python: The PyO3 binding
    previously only logged ack errors via eprintln!; subclasses overriding
    on_error would never see the call. They will now.

Behavior Changes

  • StreamConfigurationOptions.record_type is no longer applied: Format
    is now set on the stream builder from TableProperties (proto descriptor
    present → Proto, absent → JSON). The field is kept for API compatibility.

  • StreamConfigurationOptions.max_inflight_records default raised from
    50_000 to 1_000_000 to match the Rust SDK 2.0.1 default. Previously the
    Python wrapper hard-coded 50k while the Rust SDK quietly raised its default
    to 1M, so Python clients ran with a 20× lower in-flight ceiling than Rust
    clients for no good reason. Callers who relied on the old cap can pin it
    back explicitly:

    StreamConfigurationOptions(max_inflight_records=50_000)

Bug Fixes

  • TableProperties(name, MyMessage.DESCRIPTOR) proto-descriptor
    selection now picks the message by name when a
    google.protobuf.descriptor.Descriptor object is passed. Previously
    the first message in the FileDescriptorProto was always chosen, which
    silently mis-routed schemas for .proto files containing multiple
    messages. Raw-bytes input still falls back to the first message (no
    name hint is available).
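The selection rule in this fix can be sketched in plain Python. The function below is a hypothetical illustration that works on a list of message names rather than a real FileDescriptorProto; it is not the binding's code, and raising an error for an unknown name is an assumption.

```python
from typing import Optional, Sequence


def select_message_index(message_names: Sequence[str],
                         wanted: Optional[str] = None) -> int:
    """Pick which message in a file descriptor describes the table schema."""
    if not message_names:
        raise ValueError("file descriptor contains no messages")
    if wanted is None:
        # Raw-bytes input carries no name hint, so fall back to the first message.
        return 0
    if wanted not in message_names:
        # Assumed behaviour for illustration only.
        raise LookupError(f"message {wanted!r} not found in file descriptor")
    return list(message_names).index(wanted)


names = ["Customer", "Order"]                   # a .proto file with two messages
print(select_message_index(names, "Order"))     # selected by name
print(select_message_index(names))              # no hint: first message
```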

Documentation

  • Updated docstrings, the Arrow example files, and zerobus.sdk.shared.arrow
    to reflect the Beta promotion.

Internal Changes

  • Bumped Rust dependencies to match the Rust SDK 2.0.1 workspace:
    prost / prost-types 0.13 → 0.14, tonic 0.13 → 0.14, arrow-ipc /
    arrow-schema / arrow-array 56.2 → 58.2.
  • Removed the in-tree [patch.crates-io] redirect for
    databricks-zerobus-ingest-sdk — the 2.0.1 release is now resolved
    from crates.io.
  • StreamConfigurationOptions and ArrowStreamConfigurationOptions
    fields are now applied via StreamBuilder setters because the
    underlying Rust structs are #[non_exhaustive] in 2.0.1.
  • Bounded the HeadersProviderWrapper static-key leak: header names are
    now interned in a process-wide table so each distinct name is leaked at
    most once, instead of once per get_headers() call.
  • Shared payload-extraction, options-application, and the AckCallback
    bridge moved into python/rust/src/common.rs, removing duplicated code
    between sync_wrapper.rs and async_wrapper.rs.
  • ZerobusSdk.set_use_tls(...) is retained as a no-op for backwards
    compatibility. Rust SDK 2.0.1 removed the underlying TLS toggle; TLS is
    always controlled via the SDK builder.

Rust SDK v2.0.1

21 May 16:36
13bf689

Bug Fixes

  • Arrow Flight: fix race condition causing stale wire offsets after
    non-close-signal recovery. When a stream broke via a server error or ack
    timeout (rather than a graceful close signal), the supervisor did not set
    the ingest-pause gate before starting reconnect. A concurrent ingest_batch
    call could send a batch with a pre-recovery wire offset, which the server
    rejects with error code 4002 (NonIncrementalOffset), exhausting recovery
    retries and failing the entire stream. Fix: set is_paused = true
    immediately when entering the retriable-error retry branch, symmetric with
    the existing close-signal path.

  • Arrow Flight: restore automatic batch chunking at 2 MiB. Reverted the manual
    zero-copy IPC encoding introduced in v2.0.0 back to FlightDataEncoderBuilder, which
    automatically chunks large RecordBatch values at 2 MiB. The zero-copy refactor had
    removed this chunking, causing large batches to exceed the server's 10 MB message
    size limit and be rejected. ingest_ipc_batch now deserialises IPC bytes into a
    RecordBatch before encoding, so it correctly benefits from the same chunking and
    supports streams with ipc_compression enabled.
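As a rough picture of what chunking a large batch at 2 MiB means, the Python sketch below splits a batch into row slices whose estimated size stays at or under the target. It assumes rows of roughly uniform size and is a simplified illustration only, not FlightDataEncoderBuilder's actual algorithm; split_rows is a hypothetical helper.

```python
import math
from typing import List, Tuple

MAX_CHUNK_BYTES = 2 * 1024 * 1024  # 2 MiB target from the release note


def split_rows(num_rows: int, total_bytes: int,
               max_bytes: int = MAX_CHUNK_BYTES) -> List[Tuple[int, int]]:
    """Return (row_offset, row_count) slices covering the whole batch.

    Assumes uniform row size, so a slice is within max_bytes unless a
    single row is itself larger than the target.
    """
    if num_rows <= 0:
        return []
    n_chunks = max(1, math.ceil(total_bytes / max_bytes))
    rows_per_chunk = max(1, num_rows // n_chunks)
    return [(offset, min(rows_per_chunk, num_rows - offset))
            for offset in range(0, num_rows, rows_per_chunk)]


# A 10 MiB batch of 1000 rows becomes five slices of 200 rows each.
print(split_rows(1000, 10 * 1024 * 1024))
```

Without a step like this, a single 10 MiB batch would be sent as one message, which is the failure mode the fix addresses.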

FFI Libraries v1.2.1

21 May 17:21
13bf689

Bug Fixes

  • zerobus_arrow_stream_ingest_batch_via_record_batch now works correctly on compression-enabled streams. Previously the function performed its own IPC deserialization and called ingest_batch directly, bypassing the compression re-encoding step. It now delegates to ingest_ipc_batch, which handles compression transparently. The function is now fully equivalent to zerobus_arrow_stream_ingest_batch regardless of stream configuration.

FFI Libraries v1.2.0

19 May 13:19
8906ffb

New Features and Improvements

  • Arrow stream options (C API): CArrowStreamConfigurationOptions.stream_paused_max_wait_time_ms (int64_t) configures graceful-close paused wait: -1 = None (full server duration), 0 = immediate recovery, >0 = capped wait (see zerobus.h comments).
  • Zero-copy Arrow IPC ingestion: zerobus_arrow_stream_ingest_batch now forwards IPC bytes directly via ingest_ipc_batch, skipping the deserialization round-trip. Use zerobus_arrow_stream_ingest_batch_via_record_batch for compression-enabled streams.
  • Fire-and-forget ingestion: Added nowait variants that spawn a background task and return immediately — zerobus_stream_ingest_proto_record_nowait, zerobus_stream_ingest_json_record_nowait, zerobus_stream_ingest_proto_records_nowait, zerobus_stream_ingest_json_records_nowait.
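The sentinel encoding of stream_paused_max_wait_time_ms in the first item above can be read as an optional cap. The Python sketch below decodes it and computes an effective wait. Treating a positive value as the smaller of the cap and the server-provided duration is an assumption based on the phrase "capped wait", as is rejecting values below -1; both function names are hypothetical.

```python
from typing import Optional


def decode_paused_wait(raw: int) -> Optional[int]:
    """Map the C int64 encoding to an optional cap in milliseconds."""
    if raw < -1:
        raise ValueError("values below -1 are not defined by the notes")
    return None if raw == -1 else raw  # -1 means "no cap"


def effective_wait_ms(server_duration_ms: int, cap_ms: Optional[int]) -> int:
    """How long to stay paused before recovering (assumed semantics)."""
    if cap_ms is None:
        return server_duration_ms           # wait the full server duration
    return min(cap_ms, server_duration_ms)  # 0 gives immediate recovery


for raw in (-1, 0, 2_000):
    print(raw, effective_wait_ms(30_000, decode_paused_wait(raw)))
```

The Java, Go, and Python options listed in the other releases expose the same three cases, using -1, nil, and None respectively for "no cap".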

Bug Fixes

  • Arrow IPC compression fix: Added zerobus_arrow_stream_ingest_batch_via_record_batch for streams created with LZ4_FRAME or ZSTD compression. The existing zerobus_arrow_stream_ingest_batch uses the zero-copy path and does not apply compression; callers must use the new function when compression is configured. This fixes a regression where compression was silently ignored.

API Changes

  • Added zerobus_arrow_stream_ingest_batch_via_record_batch(stream, ipc_bytes, ipc_len, result) for compression-enabled Arrow streams.
  • Added zerobus_stream_ingest_proto_record_nowait, zerobus_stream_ingest_json_record_nowait, zerobus_stream_ingest_proto_records_nowait, zerobus_stream_ingest_json_records_nowait for fire-and-forget ingestion.

Rust SDK v2.0.0

14 May 13:44
rust/v2.0.0
132996d

New Features and Improvements

  • Arrow Flight ingestion promoted to Beta: The arrow-flight feature
    (ZerobusArrowStream, ArrowStreamConfigurationOptions, and related types)
    is no longer labelled experimental/unsupported. The API is stabilising but
    may still change before reaching GA.
  • Arrow schema from UC schema (feature arrow-flight):
    schema::arrow_schema_from_uc_columns and schema::arrow_schema_from_uc_schema
    build an arrow_schema::Schema directly from Unity Catalog metadata, parallel
    to the existing descriptor_from_uc_* functions. Emits native Arrow types
    (Date32, Timestamp(Microsecond, ..), LargeUtf8, LargeBinary,
    Map("entries", Struct{keys,values})) matching the canonical Arrow schema
    the Databricks Arrow Flight server builds from Delta.
  • ZerobusSdkBuilder::application_name: Set a custom application identifier
    appended to the HTTP user-agent header (sent on the underlying tonic
    Endpoint) on every request. The default zerobus-sdk-rs/<version> prefix
    is preserved for server-side telemetry, so the wire value becomes
    zerobus-sdk-rs/<version> <application_name>. The previous x-zerobus-sdk
    gRPC metadata header is no longer emitted; downstream consumers that parsed
    it should switch to reading user-agent.
  • ZerobusSdkBuilder::sdk_identifier: Override the SDK prefix of the
    HTTP user-agent header, replacing the default zerobus-sdk-rs/<version>.
    Intended for wrapper SDKs that need to replace the SDK identification; most
    callers should prefer application_name, which preserves the SDK version
    prefix. When both are set, application_name is still appended, so the wire value becomes <sdk_identifier> <application_name>.

Bug Fixes

  • Corrected the values returned by the C FFI zerobus_get_default_config()
    for callback_max_wait_time_ms / has_callback_max_wait_time_ms. The
    function previously reported 0 / false (i.e., "no callback timeout"),
    while the actual Rust SDK default is Some(5000ms). The C-side defaults
    now correctly mirror the Rust defaults (5000 / true).
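The value/flag pair in this fix is how an optional setting is flattened into two C struct fields. The Python sketch below shows the mapping in both directions. The function names are hypothetical, and using 0 as the placeholder value when the flag is false follows the "0 / false" wording above.

```python
from typing import Optional, Tuple


def to_c_fields(value_ms: Optional[int]) -> Tuple[int, bool]:
    """Flatten an optional timeout into (value, has_value)."""
    return (value_ms, True) if value_ms is not None else (0, False)


def from_c_fields(value_ms: int, has_value: bool) -> Optional[int]:
    """Rebuild the optional timeout; the value is ignored when the flag is false."""
    return value_ms if has_value else None


print(to_c_fields(5000))   # the corrected default: a 5000 ms timeout is set
print(to_c_fields(None))   # what the C defaults reported before: no timeout
```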

Documentation

  • Updated rust/README.md, rust/examples/README.md,
    rust/examples/json/README.md, and rust/examples/proto/README.md to
    remove all references to the deleted future-based APIs. The
    "Future-based API (Deprecated)" example sections and the deprecated
    method entries in the API Reference were removed.
  • Added an Arrow Flight example under examples/arrow/ (example_arrow)
    demonstrating both ingest_batch (RecordBatch) and ingest_ipc_batch
    (Arrow IPC bytes).

Internal Changes

  • Consolidated Cargo workspace dependencies under [workspace.dependencies]
    in rust/Cargo.toml; member crates now use dep.workspace = true so
    versions are pinned in one place.
  • Collapsed the four example packages (example_json_{single,batch},
    example_proto_{single,batch}) into two packages,
    rust-examples-json and rust-examples-proto, each exposing two
    [[example]] targets. Examples are invoked as
    cargo run -p rust-examples-json --example json_{single,batch} and
    cargo run -p rust-examples-proto --example proto_{single,batch}.
  • Bumped prost and prost-types from 0.13 to 0.14; prost-reflect from
    0.14 to 0.16. Public APIs that name prost::Message (e.g.
    ProtoMessage<T: prost::Message>) now require callers to use prost 0.14
    messages.
  • Bumped tonic from 0.13 to 0.14. The 0.14 release splits code generation
    into separate crates: build-time codegen now uses tonic-prost-build
    (replacing tonic-build), and the runtime depends on the new
    tonic-prost crate for the prost codec. sdk/build.rs, tests/build.rs,
    and tools/generate_files/src/generate.rs were updated accordingly.
  • Bumped Arrow crates (arrow-flight, arrow-array, arrow-schema,
    arrow-ipc) from 56.2.0 to 58.2. Switched IpcDataGenerator::encoded_batch
    to the non-deprecated encode API which takes an explicit
    CompressionContext.
  • Raised minimum-version floors on several non-breaking dependencies to
    current latest minor: tokio 1.42 → 1.52, tokio-stream 0.1.16 →
    0.1.18, tokio-util 0.7.17 → 0.7.18, once_cell 1.19 → 1.21,
    bytes 1 → 1.11, tempfile 3.21 → 3.27, clap 4 → 4.6,
    urlencoding 2 → 2.1.
  • Migrated the FFI and JNI crates off the deleted stream-creation methods.
    Both wrappers now build streams via StreamBuilder. Default config in
    zerobus_get_default_config() / zerobus_arrow_get_default_config()
    now reads stream_options::defaults::* constants directly instead of
    constructing *ConfigurationOptions (no longer needed at the FFI layer).
    No C ABI or JNI signature changes.
  • FFI and JNI no longer construct StreamConfigurationOptions /
    ArrowStreamConfigurationOptions. They read C/Java struct fields
    directly and apply each via builder setters.

Breaking Changes

  • Removed ZerobusSdk::create_stream() (in deprecation since v1.3.0).
    Use sdk.stream_builder().table(name).oauth(id, secret).json() /
    .compiled_proto(desc).build().await instead. Removed from all
    examples, documentation, and tests.
  • Removed ZerobusSdk::create_stream_with_headers_provider() (in
    deprecation since v1.3.0). Use
    sdk.stream_builder().table(name).headers_provider(p).json() /
    .compiled_proto(desc).build().await instead. Removed from all
    examples, documentation, and tests.
  • Removed ZerobusSdk::create_arrow_stream() (feature arrow-flight)
    (in deprecation since v1.3.0). Use
    sdk.stream_builder().table(name).oauth(id, secret).arrow(schema).build_arrow().await
    instead. Removed from all examples, documentation, and tests.
  • Removed ZerobusSdk::create_arrow_stream_with_headers_provider()
    (feature arrow-flight) (in deprecation since v1.3.0). Use
    sdk.stream_builder().table(name).headers_provider(p).arrow(schema).build_arrow().await
    instead. Removed from all examples, documentation, and tests.
  • Removed ZerobusStream::ingest_record() (in deprecation since v0.4.0).
    Use stream.ingest_record_offset(payload).await? followed by
    stream.wait_for_offset(offset).await? to wait for acknowledgment.
    Removed from all examples, documentation, and tests.
  • Removed ZerobusStream::ingest_records() (in deprecation since v0.4.0).
    Use stream.ingest_records_offset(payloads).await? followed by
    stream.wait_for_offset(offset).await?. Removed from all examples,
    documentation, and tests.
  • Removed ZerobusSdk::new() (in deprecation since v0.5.0). Use
    ZerobusSdk::builder().endpoint(...).unity_catalog_url(...).build()?
    instead.
  • Removed the ZerobusSdk::use_tls field (in deprecation since v0.5.0).
    TLS is controlled via ZerobusSdkBuilder::tls_config(...). The C FFI
    zerobus_sdk_set_use_tls() function is retained as a no-op for ABI
    compatibility.
  • Removed the test_proto_stream_creation_without_descriptor_fails test
    — the typestate StreamBuilder makes that scenario impossible at
    compile time.
  • Added #[non_exhaustive] to StreamConfigurationOptions. External
    crates can no longer construct the struct via struct-literal syntax;
    all configuration must go through StreamBuilder setters. Field reads
    via stream.options.* are unaffected. Adding new config fields in
    future releases is now non-breaking.
  • Added #[non_exhaustive] to ArrowStreamConfigurationOptions. Same
    semantics as above; reads via stream.options().* are unaffected.
  • Added #[non_exhaustive] to ZerobusError, StreamType, and
    SchemaError enums. External match expressions on these types now
    require a _ => wildcard arm. Adding new variants is non-breaking.
  • Added #[non_exhaustive] to ZerobusSdk, ZerobusStream, and
    ZerobusArrowStream structs. Adding new fields to these top-level
    handle types is non-breaking.
  • TableProperties and ArrowTableProperties are now pub(crate) and
    no longer part of the public API. They are only used internally by
    StreamBuilder; after the deletion of the deprecated
    create_*_stream() methods there are no external constructors.
  • Removed ZerobusArrowStream::table_properties() getter (returned the
    now-private ArrowTableProperties). Use the existing table_name()
    and schema() getters instead.
  • Major-version bumps of prost (0.13 → 0.14), tonic (0.13 → 0.14),
    prost-reflect (0.14 → 0.16), and the Arrow crates (56 → 58). Downstream
    consumers that directly handle SDK-exported prost::Message or
    arrow_array::RecordBatch values must move to the matching major
    versions of those crates.