feat: add NamespaceClientTableContext for cached namespace info#96
Closed
jackye1995 wants to merge 186 commits into
…6146) fix CI error: `FAILED python/tests/test_integration.py::test_duckdb_pushdown_extension_types - _duckdb.Error: DeprecationWarning: fetch_arrow_table() is deprecated, use to_arrow_table() instead.`
20%+ faster for 2GB index, could be more for larger index
There was a conflict table in transaction.rs but this was incomplete (some rows/columns missing) and seemed to be imprecise or incorrect in a few spots. I've attempted to more thoroughly document this in transaction.md instead.
…ance-format#6160) Previously, `adjust_child_validity` would call `ArrayData::try_new` with a null bitmap on a `DataType::Null` array, causing an `.unwrap()` panic with `InvalidArgumentError("Arrays of type Null cannot contain a null bitmask")`. The trigger: when a user inserts rows where a struct sub-field has only null values, Arrow infers `DataType::Null` for that column. If a subsequent fragment omits that nullable sub-field, Lance inserts a `NullReader` to fill it in. `MergeStream` then merges the real batch (with null struct rows) and the `NullReader` batch (all-null struct), recursing into the struct where `adjust_child_validity` is called with the `Null`-typed child and a non-empty parent validity — triggering the panic. Fix: skip the bitmask operation when `child.data_type() == DataType::Null`. A `Null` array is always entirely null by definition and needs no validity adjustment. Closes lance-format#6159 --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
…e-format#6163) Previously, when `FragReuseIndexDetails` exceeded 204800 bytes (triggered by large compactions with many fragments), the code wrote the details to an external file (`details.binpb`). On local filesystems, `ObjectStore::create` returns a `LocalWriter` that atomically renames a temp file to the final path in `Writer::shutdown`. However, `frag_reuse.rs` imported `tokio::io::AsyncWriteExt` but not `lance_io::traits::Writer`, so `writer.shutdown()` resolved to `AsyncWriteExt::shutdown` (flush/close only) — the temp file was deleted on drop without being persisted. Any subsequent `load_indices` call would fail with `Not found: .../details.binpb`. Fixed by using UFCS `Writer::shutdown(writer.as_mut()).await?` to explicitly call the lance trait method, matching the existing pattern in `ivf.rs` and `blob.rs`. Fixes lance-format#6161 --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
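The method-resolution pitfall described above can be reproduced with the standard library alone. This is a minimal sketch, not the Lance code: `FlushOnly` is a hypothetical stand-in for `tokio::io::AsyncWriteExt`, `Persist` for `lance_io::traits::Writer`, and `TempFileWriter` for the local writer. With only `FlushOnly` in scope, method syntax silently picks it; naming the trait explicitly (UFCS) selects the intended implementation.

```rust
pub mod demo {
    pub mod traits {
        /// Hypothetical stand-in for a trait whose `shutdown` only flushes and closes.
        pub trait FlushOnly {
            fn shutdown(&mut self) -> &'static str;
        }

        /// Hypothetical stand-in for a trait whose `shutdown` persists the temp file.
        pub trait Persist {
            fn shutdown(&mut self) -> &'static str;
        }

        /// Hypothetical writer that implements both traits.
        pub struct TempFileWriter;

        impl FlushOnly for TempFileWriter {
            fn shutdown(&mut self) -> &'static str {
                "flushed, temp file not persisted"
            }
        }

        impl Persist for TempFileWriter {
            fn shutdown(&mut self) -> &'static str {
                "temp file renamed to final path"
            }
        }
    }

    pub mod caller {
        // Only `FlushOnly` is imported, mirroring the missing import in the bug.
        use super::traits::{FlushOnly, TempFileWriter};

        /// Method syntax resolves to the only trait in scope: `FlushOnly`.
        pub fn finish_buggy(writer: &mut TempFileWriter) -> &'static str {
            writer.shutdown()
        }

        /// Naming the trait explicitly selects `Persist::shutdown`.
        pub fn finish_fixed(writer: &mut TempFileWriter) -> &'static str {
            super::traits::Persist::shutdown(writer)
        }
    }
}
```

Importing both traits would instead make `writer.shutdown()` an ambiguity error at compile time, which is another reason the explicit form is the safer pattern.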
This breaks the "build_partitions" stage into "build_partitions" and "merge_partitions", and also updates the progress reporting on the shuffle phase to be in terms of rows instead of batches.
This PR moves a few unrelated clippy cleanups out of lance-format#6168 so the blob empty-range fix can stay focused on the regression it addresses. The changes here are all mechanical simplifications with no intended behavior change.
…t#6175) This PR moves the Linux and Windows workflows that currently run on Warp onto GitHub-hosted runners. The goal is to reduce reliance on custom runners and take advantage of the sponsored larger GitHub-hosted machines for the slowest CI paths. This is focused on the current CI bottlenecks we observed in recent successful PR runs, especially Rust ARM and Python Windows jobs, while keeping the existing macOS and benchmark-specific runners unchanged until we verify equivalent GitHub-hosted options for them.

Context:
- Recent PR history shows Rust `linux-arm` and Python `windows` as the dominant critical-path jobs.
- This change upgrades those jobs to larger GitHub-hosted runners where available (`ubuntu-24.04-8x`, `ubuntu-24.04-arm64-8x`, `windows-latest-4x`) and aligns the remaining Linux/Windows workflows with the same runner family.
- I validated the workflow YAML locally after the runner migration; no product code or test logic changed.

---

Updates:
- Rust linux-arm: 40.7 -> 19.4, about -52%
- Rust windows-build: 27.7 -> 21.0, about -24%
- Python windows: 36.5 -> 23.1, about -37%
- Python Linux 3.13 ARM: 26.9 -> 20.7, about -23%
- Python Linux 3.13 x86_64: 26.8 -> 19.1, about -29%
- Python Linux 3.9 x86_64: 25.9 -> 19.2, about -26%
Improvements lance-format#4247 alicloud storage config doc. Signed-off-by: FarmerChillax <farmerchillax@outlook.com>
Blob reads should return empty bytes when the logical blob is empty or the cursor is already at EOF. Today `BlobFile::read` / `read_up_to` can still issue a `get_range(start..end)` request with `start == end`, which is tolerated by local readers but rejected by cloud object stores. This showed up while investigating `random_blob` failures on the original-scale `laion10m-full` dataset, where legacy blob reads on S3 failed with errors like `Range started at 1 and ended at 1`. The fix short-circuits empty reads and restores the cursor to blob-relative semantics after `read()`, and adds regression coverage for both the empty-range case and packed-blob cursor behavior.
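The short-circuit described above can be sketched as follows. This is an illustration under stated assumptions rather than the actual `BlobFile` code: `StrictStore` is a hypothetical in-memory store that rejects empty ranges the way the cloud object stores in the report do, and `BlobCursor` is a hypothetical reader that keeps a blob-relative cursor.

```rust
use std::ops::Range;

/// Hypothetical store that rejects `start == end`, like the S3 error quoted above.
pub struct StrictStore {
    data: Vec<u8>,
}

impl StrictStore {
    pub fn new(data: Vec<u8>) -> Self {
        Self { data }
    }

    pub fn get_range(&self, range: Range<usize>) -> Result<Vec<u8>, String> {
        if range.start >= range.end {
            return Err(format!(
                "Range started at {} and ended at {}",
                range.start, range.end
            ));
        }
        self.data
            .get(range)
            .map(|bytes| bytes.to_vec())
            .ok_or_else(|| "range out of bounds".to_string())
    }
}

/// Hypothetical blob reader: `position` is the blob's absolute offset in the
/// store, `cursor` is relative to the start of the blob.
pub struct BlobCursor<'a> {
    store: &'a StrictStore,
    position: usize,
    size: usize,
    cursor: usize,
}

impl<'a> BlobCursor<'a> {
    pub fn new(store: &'a StrictStore, position: usize, size: usize) -> Self {
        Self { store, position, size, cursor: 0 }
    }

    /// Reads at most `len` bytes, returning empty bytes for an empty blob or at EOF
    /// without issuing any request to the store.
    pub fn read_up_to(&mut self, len: usize) -> Result<Vec<u8>, String> {
        let remaining = self.size - self.cursor;
        let to_read = len.min(remaining);
        if to_read == 0 {
            // Short-circuit: never send an empty range to the store.
            return Ok(Vec::new());
        }
        let start = self.position + self.cursor;
        let bytes = self.store.get_range(start..start + to_read)?;
        // Advance in blob-relative terms, not absolute file offsets.
        self.cursor += bytes.len();
        Ok(bytes)
    }
}
```

Reading a five-byte blob to the end and then reading again returns an empty vector on the second call instead of forwarding a `start == end` range.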
<img width="1340" height="800" alt="image" src="https://github.com/user-attachments/assets/355caf26-14cb-4823-9474-6e4c9e780823" /> - FTS indexing is ~2.5x faster, this removes merge phase, and produces large partitions directly. - memory footprint is reduced by ~60%, this compresses posting lists while building them, which can save a lot of memory, and reduces fragmented objects in memory. This also bumps the default worker memory budget from 256MiB to 1GiB because we need to produce larger partition directly, but the memory footprint is still much less. This adds a new param `memory_limit` so that users can control how the indexing should work --------- Signed-off-by: BubbleCal <bubble-cal@outlook.com> Co-authored-by: LuQQiu <luqiujob@gmail.com> Co-authored-by: Weston Pace <weston.pace@gmail.com>
…#6187) This fixes the reader panic in lance-format#6185 when a page keeps nullable rep/def layer metadata but does not materialize any definition levels. The decoder now treats that page-local state as all-valid and includes a regression test that reproduces the mixed-page case before the fix. Closes lance-format#6185.
) This fixes the merge-insert fast path for delete-by-source operations while preserving the existing `UpdateIf` semantics. It also keeps full-schema `FixedSizeList` merges on the optimized path so target-side payload columns are pruned from the join build side. Fix lancedb/lancedb#3094
This updates the benchmark TPC-H datagen path to use DuckDB's `to_arrow_reader()` API instead of the deprecated `fetch_arrow_reader()` call. The benchmark CI treats `DeprecationWarning` as an error, so this removes the warning that was breaking the random access benchmark job. I also dropped a leftover `print(ds.count_rows())` debug statement to keep benchmark logs clean.
…lance-format#6191) Signed-off-by: BubbleCal <bubble-cal@outlook.com>
In retrospect the old name was somewhat presumptuous. It would probably be good to get the Arrow project's permission before taking up cargo real estate. This also adds a README which was preventing the publish.
…mat#6145) ## Summary Closes lance-format#6138 This PR extends `index_matches_criteria()` in `rust/lance/src/index/scalar.rs` to handle vector index types in addition to scalar indices. ## Problem Previously, `index_matches_criteria()` contained an early return at lines 464-467 that rejected all non-scalar (vector) indices. This made it impossible to use `describe_indices` to filter for vector indices on a specific column. ## Solution - Removed the early return that rejected all vector indices - Refactored FTS and exact equality checks to only apply to scalar indices (these checks are not relevant for vector indices) - Vector indices now pass through when matching basic criteria (name and column filters) ## Changes - 1 file modified: `rust/lance/src/index/scalar.rs` - 15 lines added, 16 lines removed - Updated existing test `test_index_matches_criteria_vector_index()` to reflect the new expected behavior ## Testing - Updated the existing unit test for vector index criteria matching - The test now correctly expects vector indices to match basic criteria instead of being rejected ## AI Disclosure This contribution was developed with the assistance of Claude (AI by Anthropic). The implementation approach, code, and PR description were AI-assisted. All changes are focused on resolving the specific issue described above. Co-Authored-By: AI Assistant (Claude) <ai-assistant@contributor-bot.dev> Signed-off-by: ndpvt-web <ndpvt-web@users.noreply.github.com> Co-authored-by: ndpvt-web <ndpvt-web@users.noreply.github.com> Co-authored-by: AI Assistant (Claude) <ai-assistant@contributor-bot.dev>
…er (lance-format#6197) Signed-off-by: BubbleCal <bubble-cal@outlook.com>
…rmat#6194) This PR makes two changes to ensure stale credentials are not used: (1) In the Directory namespace if either vending is not enabled or a credential vendor is not configured we return `None` for storage options. (2) The `DynamicStorageOptionsCredentialProvider` falls back to the default credential provider (lazily loaded) if it is not able to retrieve credentials. Closes lance-format/lance-spark#292 --------- Signed-off-by: Daniel Rammer <hamersaw@protonmail.com>
…ance-format#6119) SimpleIndex (HNSW over centroids) previously only supported fp32 centroids, causing fp16 vector workloads to fall back to brute-force partition assignment — O(K×D) per vector instead of O(log K × D). For 31K centroids × 1024 dims this is a ~600x difference per vector. Cast fp16 centroids to fp32 at HNSW construction time (one-time cost) and cast fp16 query vectors at search time (1024 floats per query, negligible vs the distance computations saved). --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: Xuanwo <github@xuanwo.io>
lance-format#6142) Previously we would use the default file version when creating new index files. This was originally done to get some testing of the 2.0 format before it was made the default. However, this led to a bit of a potential compatibility problem. If we change the default file version then the files created by the new release would become unreadable on very old versions that didn't know how to read that file, even if the dataset itself had an older file version and the old version knew how to handle the index otherwise. To avoid this we change things in this PR so that new index files use the same format version as the dataset. This should mean the indexes are always readable if the dataset is readable, regardless of what version was used to write the index. --- Parts of this PR were written with Claude (Opus 4.6) and I take full responsibility for its contents.
…t#6415) Before this fix, it is possible for table uri to have a trailing ? because we do not clear the query component.
…tion (lance-format#6439)

Source rows with NULL ON key columns were silently dropped because the action assignment logic used `ON_col IS NOT NULL` as a proxy for "source row is present in the join output". This conflates a legitimate NULL key with a NULL introduced by the outer join on the target side.

Fix by injecting a `lit(true)` sentinel column into the source DataFrame before the join. After the join the sentinel is non-null for every source row and null only for target-only rows, making source row detection independent of ON column values. Strip the sentinel in `prepare_stream_schema` before writing and propagate it through projection pushdown in `necessary_children_exprs`.

Before the join, inject a constant `lit(true)` column (`__merge_source_sentinel`) into every source row. After the join:

- Source rows (whether matched or unmatched) → sentinel = true
- Target-only rows (no source match) → sentinel = NULL (outer join NULL-fill)

[assign_action](https://github.com/lance-format/lance/blob/6112a34bfe38618f07c099217dc3d89fd39ca6bb/rust/lance/src/dataset/write/merge_insert/assign_action.rs#L77) now uses sentinel IS NOT NULL to detect source row presence, making it correct regardless of what values the ON columns hold.

The sentinel is a pure logical column: it never touches disk. It's stripped in [prepare_stream_schema](https://github.com/lance-format/lance/blob/6112a34bfe38618f07c099217dc3d89fd39ca6bb/rust/lance/src/dataset/write/merge_insert/exec/write.rs#L383) before any data is written, and [necessary_children_exprs](https://github.com/lance-format/lance/blob/6112a34bfe38618f07c099217dc3d89fd39ca6bb/rust/lance/src/dataset/write/merge_insert/logical_plan.rs#L148) is updated to propagate it through DataFusion's projection pushdown.

Example that was broken before:

```
Target: (id=1, record_type="A") and (id=0, record_type=NULL)
Source: (id=2, record_type=NULL)  (new row, should be inserted)
ON: ["id", "record_type"]

Old behavior: source row silently dropped (Action::Nothing)
New behavior: source row correctly inserted (Action::Insert)
```

Fixes: lance-format#4644

---------

Signed-off-by: Pratik <pratikrocks.dey11@gmail.com>
Co-authored-by: Will Jones <willjones127@gmail.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
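To make the difference between the two presence checks concrete, here is a small model of one row of the join output. It is a sketch, not the DataFusion plan: `JoinedRow`, its fields, and both `assign_action_*` functions are hypothetical, matched rows are simply left alone, and the old check is assumed to have required every ON column to be non-null.

```rust
/// Only the two actions named in the example above are modelled.
#[derive(Debug, PartialEq)]
pub enum Action {
    Insert,
    Nothing,
}

/// Hypothetical row of the outer-join output for ON: ["id", "record_type"].
pub struct JoinedRow {
    pub source_id: Option<i64>,
    pub source_record_type: Option<&'static str>,
    /// True when the target side contributed a row.
    pub target_present: bool,
    /// `Some(true)` for every source row, `None` for target-only rows.
    pub source_sentinel: Option<bool>,
}

/// Old check (assumed form): a source row counts as present only if its ON
/// columns are all non-null, so a legitimate NULL key looks like "no source row".
pub fn assign_action_old(row: &JoinedRow) -> Action {
    let source_present = row.source_id.is_some() && row.source_record_type.is_some();
    if source_present && !row.target_present {
        Action::Insert
    } else {
        Action::Nothing
    }
}

/// New check: presence is read from the sentinel, independent of key values.
pub fn assign_action_new(row: &JoinedRow) -> Action {
    let source_present = row.source_sentinel.is_some();
    if source_present && !row.target_present {
        Action::Insert
    } else {
        Action::Nothing
    }
}
```

For the source row `(id=2, record_type=NULL)` the old check yields `Action::Nothing` and the sentinel check yields `Action::Insert`; a target-only row has a `None` sentinel and stays `Action::Nothing` under both.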
…t#6440) Most workflows lacked a `permissions` block, causing GitHub security warnings. Added `permissions: contents: read` at the top level for all affected workflows.

Special cases:
- `benchmark-comment-trigger`: also needs `pull-requests: read` to call the pulls REST API
- `nightly_run`: `run` job needs `actions: write` to dispatch `file_verification.yml`
- `rust`: `clippy` job-level permissions updated to include `contents: read` alongside `checks: write`
- `cargo-publish`: `build` job updated to include `contents: read` alongside `id-token: write`

Workflows already having correct permissions (`claude.yml`, `claude-code-review.yml`, `pr-title.yml`, `stale.yml`, `rust-benchmark.yml`, `docs-deploy.yml`, `codex-fix-ci.yml`, `codex-backport-pr.yml`, `file_verification.yml`, `cargo-publish.yml`) were left unchanged or minimally updated.
## Feature ### What is the new feature? This PR adds a first-class blob-aware `to_pandas()` API to the Lance Python bindings on `LanceDataset`, `LanceScanner`, and `LanceFragment`. ### Why do we need this feature? Today, pandas export goes through `to_table().to_pandas()`, which is constrained by Arrow blob representations. That means blob columns either surface as descriptor structs or must be eagerly materialized as bytes before pandas conversion. For large blobs, eager materialization is the wrong default because it can pull a large amount of binary data into memory unexpectedly. ### How does it work? The new `to_pandas(*, blob_mode=...)` API keeps Arrow-facing behavior unchanged and adds pandas-specific blob handling: - `blob_mode="lazy"` (default) returns `lance.BlobFile` objects in pandas object columns. - `blob_mode="bytes"` eagerly reads blobs into Python `bytes`. - `blob_mode="descriptions"` preserves the old `to_table().to_pandas()` behavior. Implementation details: - Add Python-side blob helpers to detect top-level blob columns and map direct alias projections back to source blob columns. - Snapshot Python scanner builder options so `LanceScanner.to_pandas()` can reconstruct the same scan with `_rowaddr` and blob descriptions. - Rebuild the scan internally with `with_row_address=True` and `blob_handling="blobs_descriptions"`, convert non-blob columns through Arrow's `to_pandas()`, and backfill blob columns via `take_blobs(..., addresses=...)`. - Preserve Arrow APIs (`to_table()` / `blob_handling`) unchanged. - Raise a clear `NotImplementedError` for transformed blob projections that cannot be mapped back to a single source blob column. ## Testing - `cd python && uv run pytest python/tests/test_blob.py -q` - `cd python && uv run --extra dev ruff check python/lance/dataset.py python/lance/fragment.py python/tests/test_blob.py`
## Summary - add `PrewarmOptions` and `FtsPrewarmOptions` on the Rust side, with dataset plumbing for `prewarm_index_with_options` - add Python `prewarm_index(..., *, with_position=False)` support for FTS indices while keeping the default prewarm path unchanged
Unreleased version after creating v5.0.0-rc.1
…ance-format#6389)

## Summary

- Replace `panic!()` in `initial_upload_size()` with a warn-and-clamp fallback when `LANCE_INITIAL_UPLOAD_SIZE` is set outside the valid `[5MB, 5GB]` range, so misconfiguration can't crash the process
- Extract `MAX_UPLOAD_PART_SIZE` constant for the 5GB upper bound
- Extract `clamp_initial_upload_size` as a pure helper and add boundary unit tests

## Motivation

Setting `LANCE_INITIAL_UPLOAD_SIZE` to a value outside the valid range previously crashed the entire process via `panic!()`, a disproportionate response to a perf-tuning env var misconfiguration. Per review feedback (lance-format#6389 (comment)), a crash (or even a propagated `Result`) forces every caller to handle a purely operator-side mistake. Clamping to the valid range and emitting a single warning lets the workload proceed and surfaces the misconfiguration to operators. This also matches the silent-fallback behavior of the sibling env vars `LANCE_UPLOAD_CONCURRENCY` and `LANCE_CONN_RESET_RETRIES`.

## What Changed

**`initial_upload_size()`**: Return type stays `usize`. Out-of-range values are clamped into `[5MB, 5GB]` and a single `tracing::warn!` is emitted with `requested` and `clamped` fields. The existing `OnceLock` cache guarantees the warning fires at most once per process, so no separate rate-limiting logic is needed. Non-numeric and unset values continue to fall back silently to the 5MB default.

**`clamp_initial_upload_size(raw) -> (usize, bool)`**: Pure helper extracted for testability. Returns the clamped value and whether clamping occurred.

**`MAX_UPLOAD_PART_SIZE`**: New constant for the 5GB upper bound.

## Behavioral Equivalence

| Input | Before | After |
|-------|--------|-------|
| Env not set | 5MB default | 5MB default |
| Non-numeric (e.g. `"abc"`) | 5MB default | 5MB default |
| Valid integer in `[5MB, 5GB]` | Returns the value | Returns the value |
| Integer `< 5MB` | **`panic!()`** | Clamped to 5MB + `warn!` (once) |
| Integer `> 5GB` | **`panic!()`** | Clamped to 5GB + `warn!` (once) |

No API changes: `ObjectWriter::new()` signature is unchanged.

## Test plan

- [x] New boundary unit tests: below min, min boundary, in-range, max boundary, above max, `usize::MAX`
- [x] `cargo test -p lance-io --lib object_writer` — 7 tests pass
- [x] `cargo clippy -p lance-io --tests -- -D warnings` — clean
- [x] `cargo fmt -p lance-io -- --check` — clean
- [x] `cargo check --workspace --tests` — full workspace compiles
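The clamp-and-report helper described above might look roughly like the sketch below. Only `clamp_initial_upload_size` and `MAX_UPLOAD_PART_SIZE` are names from the description; `MIN_UPLOAD_PART_SIZE` and `resolve_initial_upload_size` are hypothetical, the sizes assume binary megabytes and gigabytes on a 64-bit target, and `eprintln!` stands in for `tracing::warn!`. The once-per-process `OnceLock` caching is left out.

```rust
/// Assumed lower bound: 5 MiB (hypothetical constant name).
pub const MIN_UPLOAD_PART_SIZE: usize = 5 * 1024 * 1024;
/// Upper bound: 5 GiB (assumes a 64-bit `usize`).
pub const MAX_UPLOAD_PART_SIZE: usize = 5 * 1024 * 1024 * 1024;

/// Pure helper: returns the clamped value and whether clamping occurred.
pub fn clamp_initial_upload_size(raw: usize) -> (usize, bool) {
    let clamped = raw.clamp(MIN_UPLOAD_PART_SIZE, MAX_UPLOAD_PART_SIZE);
    (clamped, clamped != raw)
}

/// Hypothetical wrapper taking the env var's value as an argument so it can be
/// tested without touching the process environment.
pub fn resolve_initial_upload_size(env_value: Option<&str>) -> usize {
    match env_value.and_then(|value| value.parse::<usize>().ok()) {
        // Unset or non-numeric: fall back silently to the default.
        None => MIN_UPLOAD_PART_SIZE,
        Some(requested) => {
            let (clamped, was_clamped) = clamp_initial_upload_size(requested);
            if was_clamped {
                // Stand-in for `tracing::warn!` with `requested` and `clamped` fields.
                eprintln!("upload size out of range: requested={requested} clamped={clamped}");
            }
            clamped
        }
    }
}
```

Returning the `bool` alongside the value keeps the helper free of logging side effects, which is what makes the boundary cases easy to unit test.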
Upgrades the pinned Rust toolchain from 1.91.0 to 1.94.0. The only code change needed was boxing two futures in `build_partial_fixture` in the `distributed_vector_build` bench, where 1.94's stricter layout computation overflowed the default recursion limit. Boxing makes the awaited futures' sizes constant (a pointer), breaking the recursion. --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
## Feature ### What is the new feature? This PR adds native `float16` and `float64` support for `IVF_FLAT` and `IVF_HNSW_FLAT`. ### Why do we need this feature? Flat IVF indexing previously only worked end to end for `float32`. That meant users could not build, merge, reload, and query flat IVF indexes on `float16` or `float64` vectors without running into `Float32`-specific assumptions in flat storage, writer initialization, and merge/query paths. ### How does it work? The implementation makes flat IVF paths dispatch on the actual Arrow element type from stored flat data instead of assuming `Float32`. - `FlatFloatStorage` now dispatches distance calculators for `float16`, `float32`, and `float64`. - Query/training helpers that previously special-cased `Float32` now accept the native float dtype where needed. - Tests now cover flat storage distance, partition serde roundtrip, IVF create/query/remap, and distributed merge behavior for `float16` / `float64`. ## Validation - `cargo fmt --all` - `cargo check -p lance-index --lib` - `cargo check -p lance --lib` - `cargo test -p lance-index test_flat_float_storage_distance_f16 -- --nocapture` - `cargo test -p lance-index test_merge_ivf_flat_preserves_float64_schema -- --nocapture` - `cargo test -p lance test_build_ivf_flat -- --nocapture` - `cargo test -p lance test_create_ivf_hnsw_flat -- --nocapture` - `cargo test -p lance test_create_ivf_flat_f16 -- --nocapture` ## Benchmark Note I also benchmarked float32 `IVF_FLAT` before vs after, no obvious performance diffs
…at#6428) ## Summary Stacked on lance-format#6388. Please merge that PR first. - Adds `batch_size_bytes: Option<u64>` to `FileReaderOptions` and propagates it through all 6 `SchedulerDecoderConfig` creation sites in the file reader - Adds `batch_size_bytes` field + setter to `Scanner`, wired through both `scan_fragments` (via `LanceScanConfig`) and `pushdown_scan` (via `FileReaderOptions` in `ScanConfig`) - Adds `batch_size_bytes` to `LanceScanConfig`, with `try_new_v2` injecting it into `FragReadConfig` via `FileReaderOptions` - Exposes `batch_size_bytes` in the Python API: `LanceDataset.scanner()`, `to_table()`, `to_batches()`, `ScannerBuilder` ## Test plan - [x] `cargo check -p lance-file -p lance --tests` — clean - [x] `cargo clippy -p lance-file -p lance --tests -- -D warnings` — clean - [x] `cargo fmt --all` — applied - [x] `cargo test -p lance-encoding -- byte_sized` — 3/3 pass - [x] `cargo test -p lance -- test_scan` — 38/38 pass 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…rmat#6344) `arrow-cast` 57.0.0 added native support for `FixedSizeList → FixedSizeList` casting, which was the only reason `lance_arrow::cast::cast_with_options` existed. This removes the wrapper and updates all call sites to use `arrow_cast::cast_with_options` directly. --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Previously, when a manifest commit failed (conflict, error, or retry exhaustion), the `.txn` file written to `_transactions/` was left orphaned. These files would accumulate until the GC cleanup interval (7+ days by default). This PR adds `cleanup_transaction_file()` — a best-effort delete helper — and calls it from all three commit failure paths (`do_commit_new_dataset`, `do_commit_detached_transaction`, `commit_transaction`). Failures to delete are logged as warnings and do not surface to the caller. In the retry loop of `commit_transaction`, the previous iteration's transaction file is cleaned up before each retry attempt, since a new transaction file is written on each iteration. Fixes lance-format#6125 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
## Bug Fix ### What is the bug? FTS v2 indices built with `with_position=true` can reconstruct cached posting lists with the wrong shared-position codec during prewarm. The prewarm path rebuilds `PostingList` values from projected `RecordBatch` objects and re-infers the positions codec from batch metadata, even though the reader has already resolved the correct partition-level `positions_layout`. ### What issues or incorrect behavior does the bug cause? After `prewarm_index(..., with_position=True)`, phrase queries can fail even though the same query succeeds before prewarm. In practice this shows up as shared position stream decode failures in the cached path because `PackedDelta` data can be interpreted as the legacy codec. ### How does this PR fix the problem? This PR threads `positions_layout` from `PostingListReader` through the prewarm reconstruction path into `CompressedPostingList::from_batch`, so cached postings reuse the already-parsed shared-position codec instead of guessing from projected batch metadata. It also adds a regression test that covers the V2 + positions + tail-remainder case and verifies prewarm preserves correct phrase-query behavior. ## Validation - `cargo test -p lance-index test_prewarm_with_ -- --nocapture`
…:DeepSizeOf (lance-format#6480)

## Summary

- Fix `CachedFileMetadata::DeepSizeOf` to include `column_metadatas` and `column_infos`, the two largest fields that were previously omitted (marked TODO since initial implementation)
- This caused the moka cache weigher to underestimate entry sizes by ~100x, preventing eviction and causing unbounded memory growth on random-access workloads

## Problem

`LanceCache` uses moka with a weighted capacity of 1 GB (`DEFAULT_METADATA_CACHE_SIZE`). The weigher calls `DeepSizeOf` on `CachedFileMetadata`, but the implementation only counted `file_schema` and `file_buffers`, ignoring `column_metadatas` (protobuf `ColumnMetadata`) and `column_infos` (`Vec<Arc<ColumnInfo>>` containing page encodings). Each cache entry's true size is hundreds of KB, but was reported as ~1 KB. Moka never reached the 1 GB limit, so entries accumulated indefinitely.

## Profiling Evidence

Tested on a 221M-row dataset with random `ds.take()` (64 rows per call):

| Metric | Before | After |
|--------|--------|-------|
| RSS growth (30 iters) | **+7,503 MB** | **+535 MB** |
| Growth rate | 250 MB/iter (linear, no plateau) | 18 MB/iter (plateaus ~500 MB) |

jemalloc heap profiling (debug build, 243K symbols) showed 99.9% of leaked memory in `LanceCache::get_or_insert_with_key` → `FileReader::meta_to_col_infos` and `prost::encoding::message::merge`.

## Approach

Since protobuf-generated types (`pbfile::ColumnMetadata`, `pb::ColumnEncoding`, etc.) don't implement `DeepSizeOf`, we use `prost::Message::encoded_len() * 4` as an approximation for in-memory size. The 4x multiplier accounts for heap allocations in repeated/string/bytes fields that are larger in memory than on the wire.

## Test plan

- [x] Added `test_deep_size_of_includes_column_metadata` for V2_0 and V2_1 file formats
- [x] Verified fix reduces memory growth from 250 MB/iter to 18 MB/iter on a production dataset
- [x] `cargo test -p lance-file` passes

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…-format#6488) ## Summary - Add missing `(SubIndexType::Flat, QuantizationType::FlatBin)` match arm in `optimize_vector_indices_v2` The v2 function handles all other sub-index/quantization combinations but misses the FlatBin case for binary vector IVF_FLAT indices, hitting the catch-all `unimplemented!` panic during incremental indexing (`optimize_indices`). The v1 function already handles this correctly.
…t#6435) This teaches `merge_insert` to keep the delete-by-source fast path even when a scalar index exists on the join key. The actual indexed join path is still only used when unmatched target rows are kept, so the presence of index metadata should not force these operations back to the legacy full-join path. This also adds regression coverage for full-schema `FixedSizeList` merges with `when_not_matched_by_source(Delete)` both with and without a scalar index. That closes the gap behind lance-format#6195 and preserves the earlier fix for lancedb/lancedb#3094.
…lance-format#6477)

## Summary

- Change `DataFile.fields` and `DataFile.column_indices` from `Vec<i32>` to `Arc<[i32]>` so that fragments with identical field lists share a single heap allocation
- Add `DataFileFieldInterner` that deduplicates these slices during manifest deserialization
- In homogeneous tables (the common case), every fragment carries the same field list, so at 20M fragments this saves **~2.4 GB** of redundant heap allocations

## Motivation

When dataset manifests grow large (>1 GB with millions of fragments), opening the dataset becomes very expensive in terms of memory. Each `DataFile` previously owned its own `Vec<i32>` for `fields` and `column_indices`, even though in most tables every fragment has the exact same field list. This PR deduplicates those allocations at deserialization time.

### Per-fragment memory breakdown (before)

| Field | Size per fragment |
|-------|------------------|
| `fields: Vec<i32>` (10 fields) | ~64 bytes |
| `column_indices: Vec<i32>` (10 cols) | ~64 bytes |
| **Total redundant** | **~128 bytes x 20M = ~2.4 GB** |

### After this change

With interning, all 20M fragments share a single `Arc<[i32]>` allocation (~80 bytes total instead of 2.4 GB).

## Changes

- **`lance-table/src/format/fragment.rs`**: Core struct change (`Vec<i32>` → `Arc<[i32]>`), custom `Serialize`/`Deserialize` impls, and `DataFileFieldInterner`
- **`lance-table/src/format/manifest.rs`**: Use interner during manifest deserialization
- **`lance/src/dataset/fragment.rs`**, **`merge_insert.rs`**, **`io/commit.rs`**: Tombstoning and field-remapping rebuilt as new `Arc<[i32]>` instead of in-place mutation
- **`python/src/fragment.rs`**, **`java/lance-jni/src/fragment.rs`**: FFI boundary conversions
- Various test files: Updated struct literals and assertions

## Compatibility

- No format change: protobuf schema is unchanged
- Serde JSON output is identical (custom impl serializes `Arc<[i32]>` as `[i32]`)
- All public API signatures that take `Vec<i32>` (e.g., `DataFile::new()`, `Fragment::add_file()`) still accept `Vec<i32>` and convert internally

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
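The interning idea above can be illustrated with a few lines of standard-library Rust. This is a minimal sketch, not the actual `DataFileFieldInterner`: the `FieldInterner` type and its methods are hypothetical, and a `HashSet` is assumed as the lookup structure.

```rust
use std::collections::HashSet;
use std::sync::Arc;

/// Hypothetical interner: hands out one shared `Arc<[i32]>` per distinct field list.
#[derive(Default)]
pub struct FieldInterner {
    seen: HashSet<Arc<[i32]>>,
}

impl FieldInterner {
    /// Accepts an owned `Vec<i32>` (as the public APIs still do) and returns a
    /// shared slice, reusing an existing allocation when the contents match.
    pub fn intern(&mut self, fields: Vec<i32>) -> Arc<[i32]> {
        // `Arc<[i32]>` borrows as `[i32]`, so the set can be probed with a slice.
        if let Some(existing) = self.seen.get(fields.as_slice()) {
            return Arc::clone(existing);
        }
        let shared: Arc<[i32]> = Arc::from(fields);
        self.seen.insert(Arc::clone(&shared));
        shared
    }

    /// Number of distinct field lists seen so far.
    pub fn unique_len(&self) -> usize {
        self.seen.len()
    }
}
```

Two fragments that deserialize the same ten-field list end up holding pointers to the same allocation, which `Arc::ptr_eq` can confirm; a fragment with a different list gets its own.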
…mory (lance-format#6499)

## Summary

- Change `RowDatasetVersionMeta::Inline` from `Vec<u8>` to `Arc<[u8]>` so that fragments with identical version metadata share a single heap allocation
- Extend `DataFileFieldInterner` to deduplicate these inline byte payloads during manifest deserialization
- Introduce `InternCache<T>`: a hybrid cache that uses Vec linear scan for ≤16 entries and upgrades to HashMap for larger caches
- Add custom `Serialize`/`Deserialize` impls for `RowDatasetVersionMeta` to handle `Arc<[u8]>` transparently

## Motivation

Follow-up to lance-format#6477 (interning `DataFile.fields`/`column_indices`). After a compaction, all fragments are stamped with the same version metadata (both `last_updated_at_version_meta` and `created_at_version_meta`), but each fragment previously owned its own `Vec<u8>` copy.

### Per-fragment memory breakdown (before)

| Field | Size per fragment |
|-------|------------------|
| `last_updated_at_version_meta: Inline(Vec<u8>)` | ~24 bytes + payload |
| `created_at_version_meta: Inline(Vec<u8>)` | ~24 bytes + payload |
| **Total redundant at 20M fragments** | **~480 MB+** |

### After this change

With interning, all 20M fragments share a single `Arc<[u8]>` allocation per unique payload.

## Benchmark results

Microbenchmark at 100K fragments (10 fields per fragment):

| Scenario | No interning | With interning | Delta |
|----------|-------------|----------------|-------|
| **Uniform (1 unique version)** | 24.5 ms | 17.9 ms | **27% faster** |
| **Diverse (10 unique)** | 25.7 ms | 19.7 ms | **23% faster** |
| **Diverse (100 unique)** | 26.0 ms | 23.4 ms | **10% faster** |
| **Diverse (500 unique)** | 26.0 ms | 22.8 ms | **12% faster** |

| Memory (100K fragments) | No interning | With interning | Savings |
|------------------------|-------------|----------------|---------|
| **10 fields** | 39.47 MB | 29.74 MB | **24.6%** |
| **50 fields** | 69.99 MB | 29.74 MB | **57.5%** |

Both memory and speed improve across all scenarios. The hybrid `InternCache` uses fast Vec scan for the common case (1-3 unique values) and upgrades to HashMap when diversity exceeds 16 entries.

Run with: `cargo bench -p lance-table --bench manifest_intern`

## Changes

- **`rust/lance-table/src/rowids/version.rs`**: `Inline(Vec<u8>)` → `Inline(Arc<[u8]>)`, custom serde impls, updated protobuf conversions
- **`rust/lance-table/src/format/fragment.rs`**: `InternCache<T>` (Vec/HashMap hybrid), extended `DataFileFieldInterner` with version meta interning
- **`rust/lance-table/benches/manifest_intern.rs`**: Microbenchmark covering uniform and diverse scenarios

## Compatibility

- No format change: protobuf schema is unchanged
- Serde JSON output is identical (custom impl serializes `Arc<[u8]>` as `[u8]`)
- `from_sequence()` still works as before (converts internally)

## Test plan

- [x] `cargo check --workspace --tests` passes
- [x] `cargo clippy -p lance-table -p lance -- -D warnings` passes
- [x] All 88 `lance-table` tests pass
- [x] `cargo fmt --all -- --check` passes
- [x] Microbenchmark validates performance across uniform and diverse scenarios
- [ ] CI

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
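A hybrid cache of the kind described above could be sketched as follows. This is an assumption-laden illustration, not the real `InternCache<T>`: it is specialised to byte payloads instead of being generic, it uses a `HashSet` where the description says HashMap, and the method names and `LINEAR_SCAN_LIMIT` constant are hypothetical.

```rust
use std::collections::HashSet;
use std::sync::Arc;

/// Entries up to this count are searched linearly (threshold from the description).
const LINEAR_SCAN_LIMIT: usize = 16;

/// Hypothetical byte-payload interner that starts as a Vec and upgrades to a hash set.
pub enum InternCache {
    Small(Vec<Arc<[u8]>>),
    Large(HashSet<Arc<[u8]>>),
}

impl InternCache {
    pub fn new() -> Self {
        InternCache::Small(Vec::new())
    }

    pub fn intern(&mut self, bytes: &[u8]) -> Arc<[u8]> {
        match self {
            InternCache::Small(entries) => {
                // Linear scan: cheap when only a handful of payloads exist.
                for entry in entries.iter() {
                    if **entry == *bytes {
                        return Arc::clone(entry);
                    }
                }
                let shared: Arc<[u8]> = Arc::from(bytes);
                entries.push(Arc::clone(&shared));
                if entries.len() > LINEAR_SCAN_LIMIT {
                    // Too diverse for scanning: move the same Arcs into a hash set.
                    let set: HashSet<Arc<[u8]>> = entries.drain(..).collect();
                    *self = InternCache::Large(set);
                }
                shared
            }
            InternCache::Large(set) => {
                if let Some(existing) = set.get(bytes) {
                    return Arc::clone(existing);
                }
                let shared: Arc<[u8]> = Arc::from(bytes);
                set.insert(Arc::clone(&shared));
                shared
            }
        }
    }

    /// True once the cache has upgraded to hashing.
    pub fn is_hashed(&self) -> bool {
        matches!(self, InternCache::Large(_))
    }

    pub fn len(&self) -> usize {
        match self {
            InternCache::Small(entries) => entries.len(),
            InternCache::Large(set) => set.len(),
        }
    }
}
```

Because the upgrade moves the existing `Arc`s rather than copying their contents, values interned before the switch remain pointer-equal to values interned after it.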
…rmat#6308)

- `list_all_tables`
- `restore_table`
- `update_table_schema_metadata`
- `get_table_stats`
- `explain_table_query_plan`
- `analyze_table_query_plan`

---------

Co-authored-by: zhangyue19921010 <zhangyue.1010@bytedance.com>
## Summary

- Adds `#[instrument]` attributes from the `tracing` crate to key functions across the `mem_wal` module
- Covers write path (`RegionWriter::open`, `put`, `close`), flush path (`MemTableFlusher::flush`, `flush_with_indexes`), WAL operations, manifest store, memtable inserts, scanner/planner, point lookups, and vector search
- Uses appropriate trace levels (`info` for high-level operations, `debug` for internals) with relevant fields (region_id, epoch, row counts, batch counts)

## Test plan

- [x] `cargo check` passes: no functional changes, only attribute additions
- [x] Existing `mem_wal` tests continue to pass
- [ ] Tracing output verified with `RUST_LOG=debug` showing instrumented spans

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
)

## Summary

Refactor `FullZipScheduler::create_page_load_task` to accept a pre-submitted I/O future instead of deferring I/O submission until the async task executes. This allows the I/O requests to be submitted immediately during scheduling, enabling the object store layer to batch and parallelize them.

close lance-format#6504

## I/O Model Change

### Before: Lazy I/O submission (serialized)

Previously, `create_page_load_task` received a `FullZipReadSource::Remote(io)` along with byte ranges and priority. The actual `io.submit_request()` call happened **inside** the async block, meaning the I/O request was not submitted until the future was first polled. When decoding multiple pages (e.g. across many fragments), this created a sequential I/O pattern:

```
Page 1: [schedule] -> [poll] -> [submit I/O] -> [wait response] -> [decode]
Page 2: [schedule] -> [poll] -> [submit I/O] -> [wait response] -> [decode]
Page 3: [schedule] -> [poll] -> ...
```

Each page's I/O request could only be submitted after the previous task started executing. The I/O scheduler had no visibility into upcoming requests, preventing it from batching or parallelizing them effectively.

### After: Eager I/O submission (pipelined)

Now, `io.submit_request()` is called **before** constructing the `PageLoadTask`, and the resulting future is passed into `create_page_load_task`. All I/O requests for all pages are submitted upfront during the scheduling phase:

```
[schedule all pages]
  --> submit I/O page 1 -+
  --> submit I/O page 2 -+
  --> submit I/O page 3 -+  (all in-flight concurrently)
  --> submit I/O page N -+
                         |
[poll] -> [await page 1 response] -> [decode]
[poll] -> [await page 2 response] -> [decode]
[poll] -> [await page 3 response] -> [decode]
```

The object store layer can now see all pending requests at once and optimize I/O through batching, connection multiplexing, and parallel fetches. The async tasks only await the already-in-flight I/O futures.

## Changes

- `rust/lance-encoding/src/encodings/logical/primitive.rs`:
  - Changed `create_page_load_task` signature to accept `BoxFuture<'static, Result<Vec<Bytes>>>` instead of `FullZipReadSource` + byte ranges + priority
  - Moved `io.submit_request()` calls to happen eagerly at both call sites (`schedule_ranges_with_rep_index` and the non-rep-index path), before constructing the page load task

## Performance

Tested with a multi-fragment dataset containing fixed-width columns (768-dim float32 vectors, 40 fragments, 50 rows/fragment):

| Benchmark | Before (p50) | After (p50) | Speedup |
|---|---|---|---|
| Fixed-width column scan | 3453 ms | 523 ms | **6.6x** |

The improvement comes entirely from I/O pipelining; the decoding logic itself is unchanged. The effect is most pronounced with many fragments or pages, where the serialized I/O submission was the dominant bottleneck.
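The difference between the two submission orders can be modeled with a small std-only sketch. It is an analogy rather than the scheduler code: threads and `JoinHandle` stand in for async futures, and `submit_request`, `decode`, `load_lazy`, and `load_eager` are hypothetical names that do not match the `lance-encoding` API. The returned log records, on the calling thread, when each request is submitted relative to when its result is awaited.

```rust
use std::thread::{self, JoinHandle};

/// Hypothetical stand-in for an I/O submit call: the request is in flight
/// (running on its own thread) as soon as this function returns.
fn submit_request(page: usize, log: &mut Vec<String>) -> JoinHandle<Vec<u8>> {
    log.push(format!("submit {page}"));
    thread::spawn(move || vec![page as u8; 4])
}

/// Placeholder for page decoding; returns the number of bytes received.
fn decode(bytes: Vec<u8>) -> usize {
    bytes.len()
}

/// Lazy model: each page's request is submitted only when its task runs,
/// so submission and waiting alternate page by page.
fn load_lazy(pages: usize) -> Vec<String> {
    let mut log = Vec::new();
    for page in 0..pages {
        let handle = submit_request(page, &mut log);
        log.push(format!("await {page}"));
        let bytes = handle.join().expect("I/O thread panicked");
        let _decoded = decode(bytes);
    }
    log
}

/// Eager model: every request is submitted during scheduling, and the
/// per-page tasks only wait on handles that are already in flight.
fn load_eager(pages: usize) -> Vec<String> {
    let mut log = Vec::new();
    let in_flight: Vec<JoinHandle<Vec<u8>>> = (0..pages)
        .map(|page| submit_request(page, &mut log))
        .collect();
    for (page, handle) in in_flight.into_iter().enumerate() {
        log.push(format!("await {page}"));
        let bytes = handle.join().expect("I/O thread panicked");
        let _decoded = decode(bytes);
    }
    log
}
```

In the eager variant all `submit` entries precede the first `await` entry, which mirrors the "After" diagram: the layer receiving the requests sees every pending request before any task starts waiting.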
## Summary

- Add `blob_max_pack_file_bytes` to `WriteParams`, allowing users to override the default 1 GiB maximum pack (`.blob`) sidecar file size
- Thread the configuration through the full write path: `WriteParams` -> `WriterGenerator` -> `WriterOptions` -> `BlobPreprocessor` -> `PackWriter`
- Expose the option in Python (`write_dataset`) and Java (`WriteParams.Builder`) bindings

## Test plan

- [x] All 37 existing blob tests pass (`cargo test -p lance blob`)
- [x] Clippy clean on `lance` and `lance-jni` crates
- [x] Verify Python binding works end-to-end with `blob_max_pack_file_bytes` kwarg
- [x] Verify Java binding compiles with `./mvnw compile`

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…s Rust, Python, Java

Introduce NamespaceClientTableContext struct/class that holds cached describe_table/declare_table response data (location, storage_options, managed_versioning). This is passed all the way to Rust where all decisions about storage options merging and managed versioning are made.

- Rust: NamespaceClientTableContext in lance-namespace crate, with from_describe_table_response/from_declare_table_response constructors. DatasetBuilder::from_namespace_context and Dataset::write_into_namespace_context accept the context.
- Python: NamespaceClientTableContext class in namespace module. All APIs (dataset, write_dataset, fragments, file reader/writer/session, TF) accept namespace_client_table_context parameter. PyO3 binding extracts context fields in Rust.
- Java: NamespaceClientTableContext class with static factory methods. All builders (OpenDatasetBuilder, WriteDatasetBuilder, CommitBuilder, WriteFragmentBuilder) accept the context. JNI binding extracts context fields in Rust.

Also removes deprecated Dataset.create/open overloads and createWithFfiSchema JNI path in Java.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
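As a rough picture of what such a context might carry, the sketch below caches the three fields named in the commit message. Everything about its shape is an assumption: the `DescribeTableResponse` struct here is a hypothetical stand-in, the field types are guesses, and the defaults chosen for missing values (empty options, `false`) are illustrative rather than taken from the PR. Only the type name, the constructor name, and the field names come from the text above.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a namespace client's describe_table response.
#[derive(Debug, Clone, Default)]
struct DescribeTableResponse {
    location: Option<String>,
    storage_options: Option<HashMap<String, String>>,
    managed_versioning: Option<bool>,
}

/// Illustrative context holding the cached response data (field types assumed).
#[derive(Debug, Clone, PartialEq)]
struct NamespaceClientTableContext {
    location: String,
    storage_options: HashMap<String, String>,
    managed_versioning: bool,
}

impl NamespaceClientTableContext {
    /// Builds a context from a response; returns None when no location is
    /// present (an assumption of this sketch, not documented behavior).
    fn from_describe_table_response(resp: &DescribeTableResponse) -> Option<Self> {
        Some(Self {
            location: resp.location.clone()?,
            storage_options: resp.storage_options.clone().unwrap_or_default(),
            managed_versioning: resp.managed_versioning.unwrap_or(false),
        })
    }
}
```

Caching the response in one value means a caller can describe the table once and hand the same context to several open or write calls, leaving decisions about the cached fields to a single place.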
Summary
- Introduce `NamespaceClientTableContext` (Rust struct, Python class, Java class) that holds cached `describe_table`/`declare_table` response data (location, storage_options, managed_versioning)
- Remove deprecated `Dataset.create`/`open` overloads and the `createWithFfiSchema` JNI path in Java