Match OnPair C++ `decoder.h::decompress` exactly: copy a fixed
`MAX_TOKEN_SIZE = 16` bytes per token regardless of true token length,
then advance the output cursor by the *true* length so the next memcpy
overwrites the trailing slop. LLVM lowers the fixed-size copy to a
single 16-byte unaligned vector store on x86_64 / aarch64, making each
token a constant-time SIMD operation instead of a branchy
variable-length memcpy.
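A minimal safe-Rust sketch of the over-copy idea described above, not the actual `decode.rs` code. The helper name `decode_overcopy` and the `(start, len)` token representation are hypothetical; only `MAX_TOKEN_SIZE = 16` comes from this commit. It assumes the dictionary has at least 16 readable bytes after every token start.

```rust
/// Mirrors the constant named in this commit.
const MAX_TOKEN_SIZE: usize = 16;

/// Hypothetical helper: `tokens` holds (start, true_len) pairs into `dict`.
/// Every token copies exactly MAX_TOKEN_SIZE bytes, but the cursor only
/// advances by the true length, so the next copy overwrites the slop.
fn decode_overcopy(dict: &[u8], tokens: &[(usize, usize)]) -> Vec<u8> {
    let total: usize = tokens.iter().map(|&(_, len)| len).sum();
    // Extra MAX_TOKEN_SIZE bytes of slack keep the final over-copy in bounds.
    let mut out = vec![0u8; total + MAX_TOKEN_SIZE];
    let mut cursor = 0;
    for &(start, len) in tokens {
        // The copy length is a compile-time constant, independent of `len`.
        out[cursor..cursor + MAX_TOKEN_SIZE]
            .copy_from_slice(&dict[start..start + MAX_TOKEN_SIZE]);
        cursor += len;
    }
    // Drop the slack left behind by the last token.
    out.truncate(total);
    out
}
```

The safe version still pays for bounds checks and zero-initialisation; it only illustrates why the trailing bytes of each copy are harmless.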
Changes:
* `MAX_TOKEN_SIZE` is now a public crate-level constant.
* `compress.rs` pads the dictionary blob with 16 trailing zero bytes so
the over-copy never reads past `dict_bytes`. The codes / offsets /
validity invariants are unchanged.
* `decode.rs::DecodeView::decode_row_into` becomes the fast path: a
two-pass loop that first sums true lengths to size the output buffer
once, then over-copies into a pre-reserved region using
`copy_nonoverlapping` and finishes with a single `set_len`.
* New `decode_rows_into(start, count, &mut Vec<u8>)` does the same
thing across a row window with no per-row reserve overhead. The
canonicalize path now bulk-decodes the entire array in one shot.
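The changes above could be sketched roughly as follows, under stated assumptions: the `PaddedDict` type, its `offsets` layout, and the `u16` code width are hypothetical stand-ins, not the crate's real types. The sketch pads the dictionary with 16 zero bytes, sums true lengths in a first pass, reserves once, over-copies with `copy_nonoverlapping`, and finishes with a single `set_len`.

```rust
use std::ptr;

const MAX_TOKEN_SIZE: usize = 16;

/// Hypothetical layout: token `c` is `bytes[offsets[c]..offsets[c + 1]]`,
/// and `bytes` ends with MAX_TOKEN_SIZE zero bytes of padding.
struct PaddedDict {
    bytes: Vec<u8>,
    offsets: Vec<usize>,
}

impl PaddedDict {
    fn new(tokens: &[&[u8]]) -> Self {
        let mut bytes = Vec::new();
        let mut offsets = vec![0];
        for t in tokens {
            assert!(t.len() <= MAX_TOKEN_SIZE);
            bytes.extend_from_slice(t);
            offsets.push(bytes.len());
        }
        // Padding: a 16-byte read from any token start stays in bounds.
        bytes.extend_from_slice(&[0u8; MAX_TOKEN_SIZE]);
        PaddedDict { bytes, offsets }
    }

    fn decode_into(&self, codes: &[u16], out: &mut Vec<u8>) {
        let n_tokens = self.offsets.len() - 1;
        // Pass 1: sum the true lengths and validate every code.
        let total: usize = codes
            .iter()
            .map(|&c| {
                let c = c as usize;
                assert!(c < n_tokens, "code out of range");
                self.offsets[c + 1] - self.offsets[c]
            })
            .sum();
        // One reserve, with slack for the last token's over-copy.
        out.reserve(total + MAX_TOKEN_SIZE);
        let base = out.len();
        // SAFETY: codes were validated in pass 1; each read of
        // MAX_TOKEN_SIZE bytes ends inside the padded `bytes`; each write
        // ends within the reserved capacity; the buffers are distinct.
        unsafe {
            let mut dst = out.as_mut_ptr().add(base);
            for &c in codes {
                let start = self.offsets[c as usize];
                let len = self.offsets[c as usize + 1] - start;
                // Pass 2: fixed-size copy, advance by the true length.
                ptr::copy_nonoverlapping(
                    self.bytes.as_ptr().add(start),
                    dst,
                    MAX_TOKEN_SIZE,
                );
                dst = dst.add(len);
            }
            out.set_len(base + total);
        }
    }
}
```

Validating codes in the first pass keeps the unchecked second pass sound in this sketch; the real implementation may rely on different invariants.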
Benchmark (release, no FFI, real OnPair-compressed URL/log corpus):
rows      | median canonicalize | ns / row
----------|---------------------|--------------
10 000    | 280 µs              | 28
100 000   | 3.12 ms             | 31
1 000 000 | 57.5 ms             | 57 (L2-bound)
For comparison, the earlier `extend_from_slice` decode took ~7.5 ms per
100 000 rows; the new path is **~2.4× faster**.
Verified:
* `cargo test -p vortex-onpair` all green
* `cargo test -p vortex-btrblocks ...` all green (3× roundtrip)
* `cargo test -p vortex-file ... onpair` all green (4× roundtrip
incl. TPC-H shape)
* `datafusion-bench tpch --opt scale-factor=0.01 --formats vortex
--queries 1` end-to-end: Parquet → Vortex (with OnPair) → DataFusion
query 1 in 12 ms
Signed-off-by: Claude <noreply@anthropic.com>