perf(parquet): skip per-chunk vals_in_chunk computation when all values are non-null
The chunker's per-chunk `partition_point` (arrow path) or
`LevelDataRef::value_count` (non-arrow path) returns `chunk_size` by
construction whenever the column has no nulls. The GKE bench showed
regressions of roughly 12–27% on `list_primitive_non_null/*` and
`string_non_null/*`, consistent with that per-chunk lookup dominating:
~50 K chunks, each binary-searching a 50 M-entry `non_null_indices`
buffer, means cold-cache reads on every chunk.
Compute a `ValueCountStrategy` once at `write_batch_internal` entry:
- `AllPresent`: chosen when the arrow caller's `non_null_indices.len()`
equals `num_levels`, or when the column has `max_def_level == 0`. The
chunker uses `chunk_size` directly with no per-chunk work.
- `Sorted(&[usize])`: arrow nullable path; binary-search the indices.
- `DefLevelScan(max_def)`: non-arrow nullable path; def-level scan.
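A minimal sketch of how the three variants above could dispatch, for
illustration only. The variant names come from this commit; the
`choose` and `vals_in_chunk` helpers, their signatures, the `i16`
level type, and the two-sided `partition_point` in the `Sorted` arm
are assumptions and may not match the actual writer code.

```rust
/// Illustrative sketch; not the actual parquet writer implementation.
#[derive(Debug, Clone, Copy)]
enum ValueCountStrategy<'a> {
    /// Every level carries a value: a chunk of N levels holds N values.
    AllPresent,
    /// Arrow nullable path: sorted indices of the non-null levels.
    Sorted(&'a [usize]),
    /// Non-arrow nullable path: a level is a value iff def == max_def.
    DefLevelScan(i16),
}

impl<'a> ValueCountStrategy<'a> {
    /// Hypothetical constructor, evaluated once per batch.
    fn choose(
        non_null_indices: Option<&'a [usize]>,
        num_levels: usize,
        max_def_level: i16,
    ) -> Self {
        match non_null_indices {
            // A column with max_def_level == 0 cannot contain nulls.
            _ if max_def_level == 0 => Self::AllPresent,
            // The arrow caller reported every level as non-null.
            Some(idx) if idx.len() == num_levels => Self::AllPresent,
            Some(idx) => Self::Sorted(idx),
            None => Self::DefLevelScan(max_def_level),
        }
    }

    /// Hypothetical helper: number of values among the levels in
    /// `[start, start + chunk_size)`. `def_levels` is only read by the
    /// `DefLevelScan` arm.
    fn vals_in_chunk(&self, start: usize, chunk_size: usize, def_levels: &[i16]) -> usize {
        let end = start + chunk_size;
        match *self {
            // Fast path: no buffer is touched at all.
            Self::AllPresent => chunk_size,
            // Count the sorted indices that fall inside [start, end).
            Self::Sorted(idx) => {
                idx.partition_point(|&i| i < end) - idx.partition_point(|&i| i < start)
            }
            // Linear scan over this chunk's definition levels.
            Self::DefLevelScan(max_def) => def_levels[start..end]
                .iter()
                .filter(|&&d| d == max_def)
                .count(),
        }
    }
}
```

In this sketch the `AllPresent` arm never reads `non_null_indices` or
the def levels, which is the property the commit relies on to avoid
cold-cache reads for all-non-null columns.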
For the bench's `list_primitive_non_null` (all-non-null lists with a
50 M-entry leaf), this drops the per-chunk binary search entirely;
expected to bring those rows back near noise.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>