Support byte-sized batch limits in file reader #6387

@westonpace

Description

Summary

Allow the batch size to be specified in bytes instead of rows when reading Lance files. This gives callers control over memory usage per batch, which is especially important for variable-width data, where row count is a poor proxy for memory consumption.

Approach

Add an estimate_decoded_bytes method to the page and field decoder traits. Before each drain, the batch stream queries this to compute a row count that fits the byte budget. A post-decode feedback loop measures actual batch sizes and corrects the estimate for subsequent batches. No file format changes required — all estimates use data already available at decode time.
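The estimate-then-drain flow above can be sketched roughly as follows. All names, the fallback constant, and the halving strategy are illustrative assumptions for this issue, not the actual Lance implementation:

```rust
// Hypothetical sketch of the proposed trait method and row-selection loop.
pub trait StructuralPageDecoder {
    /// Estimate the decoded size in bytes of the next `num_rows` rows.
    /// The default is a conservative fallback so the system works before
    /// every encoding provides an exact estimate.
    fn estimate_decoded_bytes(&self, num_rows: u64) -> u64 {
        const FALLBACK_BYTES_PER_ROW: u64 = 64; // assumed default, not from Lance
        num_rows * FALLBACK_BYTES_PER_ROW
    }
}

/// Example exact estimator for a fixed-width encoding (e.g. Flat).
pub struct FixedWidth {
    pub width: u64,
}

impl StructuralPageDecoder for FixedWidth {
    fn estimate_decoded_bytes(&self, num_rows: u64) -> u64 {
        num_rows * self.width
    }
}

/// Shrink the row count until its estimated decoded size fits the budget
/// (the estimate -> check -> adjust -> converge loop described above).
pub fn rows_for_budget(dec: &dyn StructuralPageDecoder, budget_bytes: u64, max_rows: u64) -> u64 {
    let mut rows = max_rows;
    while rows > 1 && dec.estimate_decoded_bytes(rows) > budget_bytes {
        rows /= 2;
    }
    rows
}
```

A real implementation would coordinate this across all columns of a struct/schema rather than a single decoder, but the shape of the loop is the same.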

See investigation notes and worktree at feat-byte-sized-batches-file-reader for full design details.

Tasks

  • Add batch_size_bytes config option — wire batch_size_bytes: Option<u64> through SchedulerDecoderConfig into BatchDecodeStream. When set, replaces fixed rows_per_batch. When unset, existing row-based behavior unchanged.
  • Modify BatchDecodeStream::next_batch_task() for byte-based row selection — query estimate_decoded_bytes across children, compute row count that fits the budget, use cross-column coordination loop (estimate → check → adjust → converge).
  • Add post-decode feedback loop — after into_batch() measures actual batch size (decoder.rs:2551), feed measured bytes-per-row back to refine row count for subsequent batches.
  • Add estimate_decoded_bytes to StructuralPageDecoder trait — default implementation returns conservative fallback so the system works before all encodings are covered.
  • Add estimate_decoded_bytes to StructuralFieldDecoder trait — field-level decoder spans pages, knows data type, delegates to page decoders. Struct decoder aggregates across children.
  • Implement estimates for exact fixed-width encodings — Flat, Constant, InlineBitpacking, OutOfLineBitpacking, RLE, ByteStreamSplit, PackedStruct. All compute num_rows * known_width.
  • Implement estimate for General compression (LZ4/Zstd) — read length prefix from compressed frame header (8-byte u64 for Zstd, 4-byte u32 for LZ4) to get exact decompressed size without decompressing.
  • Implement estimate for Dictionary encoding — inspect loaded dictionary DataBlock. Fixed-width: use value width. Variable-width: scan offsets for max value size, bound = num_rows * max_value_size.
  • Implement estimate for Variable encoding (no wrapper) — read offsets from loaded chunk data for exact size of N rows.
  • Implement estimate for FSST — apply 8x algorithmic bound on compressed data size. If wrapped in General, read General length prefix first then apply 8x.
  • Implement estimate for list types with miniblock rep index — binary-search the rep index to map rows to chunks, estimate at chunk granularity, and overestimate boundary chunks.
  • Testing — estimation accuracy per encoding, end-to-end byte-sized batch tests, edge cases (empty/single-row pages, extreme variance), backward compatibility.
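For the General compression task, reading the decompressed size from a length prefix might look like the sketch below. That the prefix is little-endian (a u64 for Zstd frames, a u32 for LZ4 frames) is an assumption about Lance's buffer framing made for illustration:

```rust
// Read the decompressed-size prefix from a compressed frame without
// decompressing. Returns None if the frame is too short to hold the prefix.
fn general_decompressed_len(frame: &[u8], is_zstd: bool) -> Option<u64> {
    if is_zstd {
        // Assumed layout: 8-byte little-endian u64 prefix for Zstd.
        Some(u64::from_le_bytes(frame.get(..8)?.try_into().ok()?))
    } else {
        // Assumed layout: 4-byte little-endian u32 prefix for LZ4.
        Some(u32::from_le_bytes(frame.get(..4)?.try_into().ok()?) as u64)
    }
}
```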
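The variable-width dictionary bound could be computed by scanning the loaded dictionary's offsets for the largest value, as in this sketch. An Arrow-style i32 offsets buffer (length = values + 1) is assumed for illustration; the actual DataBlock layout may differ:

```rust
/// Upper bound on decoded size for dictionary-encoded variable-width data:
/// num_rows * max_value_size, where max_value_size is the widest entry in
/// the dictionary's offsets buffer.
fn dict_variable_width_bound(offsets: &[i32], num_rows: u64) -> u64 {
    let max_value = offsets
        .windows(2)
        .map(|w| (w[1] - w[0]) as u64)
        .max()
        .unwrap_or(0);
    num_rows * max_value
}
```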
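The post-decode feedback loop could refine a running bytes-per-row estimate from measured batches, along these lines. The exponential smoothing factor is an assumption, not taken from the Lance code:

```rust
// Illustrative feedback estimator: blend measured bytes-per-row into the
// running estimate so later batches converge on the real batch size.
struct ByteRateEstimator {
    bytes_per_row: f64,
}

impl ByteRateEstimator {
    /// Fold one measured batch into the estimate (exponential moving
    /// average, weighting recent batches equally with history).
    fn observe(&mut self, batch_bytes: u64, batch_rows: u64) {
        if batch_rows == 0 {
            return;
        }
        let measured = batch_bytes as f64 / batch_rows as f64;
        self.bytes_per_row = 0.5 * self.bytes_per_row + 0.5 * measured;
    }

    /// Row count expected to fit the byte budget, always at least 1.
    fn rows_for(&self, budget_bytes: u64) -> u64 {
        ((budget_bytes as f64 / self.bytes_per_row).floor() as u64).max(1)
    }
}
```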
