[claude] fastlanes: allow signed integers in Delta encoding by joseph-isaacs · Pull Request #7918 · vortex-data/vortex

joseph-isaacs · 2026-05-13T21:43:11Z

Summary

Lifts the is_unsigned_int gate on DeltaArray so i8 / i16 / i32 / i64 columns can be delta-encoded.

The upstream FastLanes Delta::delta / Transpose::transpose kernels are bounded on T: FastLanes: Unsigned, so signed inputs are processed by reinterpret-casting the underlying buffer to the same-width unsigned counterpart, running the existing kernel, then reinterpret-casting back. wrapping_sub / wrapping_add are bit-identical for signed and unsigned operands under two's-complement, so the round-trip is exact.
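To illustrate why the reinterpret-cast round-trip is exact, here is a minimal scalar sketch. It is not the FastLanes kernel (which works on transposed 1024-element chunks); the helper names `delta_via_unsigned` and `undelta_via_unsigned` are hypothetical and exist only for this example.

```rust
/// Delta-encode `i32` values by viewing each one as a `u32`.
/// Scalar illustration only; names are hypothetical, not Vortex APIs.
fn delta_via_unsigned(values: &[i32]) -> Vec<i32> {
    let mut prev: u32 = 0;
    values
        .iter()
        .map(|&v| {
            let u = v as u32; // same-width cast: the bits are unchanged
            let d = u.wrapping_sub(prev); // arithmetic done on the unsigned view
            prev = u;
            d as i32 // reinterpret the delta back as signed
        })
        .collect()
}

/// Inverse of `delta_via_unsigned`: a wrapping prefix sum on the unsigned view.
fn undelta_via_unsigned(deltas: &[i32]) -> Vec<i32> {
    let mut acc: u32 = 0;
    deltas
        .iter()
        .map(|&d| {
            acc = acc.wrapping_add(d as u32);
            acc as i32
        })
        .collect()
}

fn main() {
    let values = [i32::MIN, -5, 0, 7, i32::MAX, -1];
    let deltas = delta_via_unsigned(&values);

    // The unsigned path yields the same bits as signed wrapping_sub.
    let mut prev = 0i32;
    for (&v, &d) in values.iter().zip(&deltas) {
        assert_eq!(v.wrapping_sub(prev), d);
        prev = v;
    }

    // Decoding restores the original values exactly.
    assert_eq!(undelta_via_unsigned(&deltas), values);
    println!("round-trip ok");
}
```

The inputs deliberately include `i32::MIN` and `i32::MAX` so that the deltas overflow the signed range and exercise the wrapping behaviour.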

Also:

* Delta::cast now bails on signed sources. Value-preserving casts of signed deltas (e.g. -1i8 → 4294967295u32) break the wrapping-add invariant during decompression; the slow decompress-and-re-encode path handles those.
* Added encodings/fastlanes/src/delta/FUSED_DECODE.md describing a triple-fused unpack + add-reference + undelta kernel for future work. Today decoding a Delta(FoR(BitPacked)) makes three passes over the buffer; a fused kernel cuts that to one.
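One way to see how widening the stored deltas can break decoding is the sketch below. It is an illustrative failure mode under the assumption that both the base and the delta are widened value-for-value; it is not necessarily the exact case the PR guards against, and `decode_narrow` / `decode_widened` are hypothetical helpers.

```rust
/// Decode the second element in the original `i8` domain.
fn decode_narrow(base: i8, delta: i8) -> i8 {
    base.wrapping_add(delta)
}

/// Decode after widening base and delta to `i32` value-for-value.
/// (Assumed model of a cast applied directly to the encoded children.)
fn decode_widened(base: i8, delta: i8) -> i32 {
    (base as i32).wrapping_add(delta as i32)
}

fn main() {
    // Two i8 values whose true difference (-255) does not fit in i8.
    let values: [i8; 2] = [127, -128];
    let base = values[0];
    let delta = values[1].wrapping_sub(values[0]);
    assert_eq!(delta, 1); // -255 wraps to 1 in 8 bits

    // In i8 the wrapping add lands back on the original value.
    assert_eq!(decode_narrow(base, delta), -128);

    // In i32 there is no wrap at 8 bits, so the result is wrong.
    assert_eq!(decode_widened(base, delta), 128);
    println!("narrow = {}, widened = {}", decode_narrow(base, delta), decode_widened(base, delta));
}
```

Decompressing first and then casting the decoded values avoids the problem, because the wrap happens at the original width before any widening.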

Measured impact

Synthetic workloads (32 KiB raw, i32, 8 × 1024 elements each):

Workload	Δ range	Wnaive	Wffor	ratio
monotone i32 (0..N)	[0, 1]	1	1	15.97x
sensor i32 in [-100, 100]	[-196, 199]	32	9	3.20x
offset i32 base=-1e9	[0, 1]	1	1	15.97x
near-monotone i32 (5% backtrack)	[-2, 1]	32	2	10.65x

Wnaive = 32 on the two workloads with negative deltas confirms the bit-packing collapse on raw signed deltas (any negative two's-complement delta sets the high bits and the OR mask forces W = T). FFoR brings them to 9 and 2 bits.
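The two width columns can be reproduced from the delta ranges alone. The sketch below assumes Wnaive is the bit length of the OR of the raw delta bit patterns and Wffor is the bit length after subtracting the minimum delta; `bits`, `naive_width`, and `for_width` are hypothetical names for this example.

```rust
/// Number of bits needed to represent `x` (0 needs 0 bits).
fn bits(x: u32) -> u32 {
    32 - x.leading_zeros()
}

/// Width implied by OR-ing the raw two's-complement delta patterns.
fn naive_width(deltas: &[i32]) -> u32 {
    bits(deltas.iter().fold(0u32, |mask, &d| mask | d as u32))
}

/// Width after subtracting the minimum delta (frame of reference).
fn for_width(deltas: &[i32]) -> u32 {
    let min = *deltas.iter().min().expect("non-empty input");
    bits(
        deltas
            .iter()
            .fold(0u32, |mask, &d| mask | d.wrapping_sub(min) as u32),
    )
}

fn main() {
    // Only the extremes of each delta range matter for the widths.
    let workloads: [(&str, [i32; 2]); 3] = [
        ("monotone", [0, 1]),
        ("sensor", [-196, 199]),
        ("near-monotone", [-2, 1]),
    ];
    for (name, range) in workloads {
        println!(
            "{name}: Wnaive = {}, Wffor = {}",
            naive_width(&range),
            for_width(&range)
        );
    }
}
```

A single negative delta has its top bit set, so the OR mask reaches 32 bits regardless of how small the delta's magnitude is; subtracting the minimum removes the sign and leaves only the spread of the range.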

Test plan

* cargo build -p vortex-fastlanes (clean)
* cargo nextest run -p vortex-fastlanes: 256/256 pass, including 7 new signed rstest cases (i8_full_range, i32_crossing_zero, i32_all_negative, i16_crossing_zero, i64_large_negative, nullable_i32_crossing, i32_non_negative)
* cargo clippy -p vortex-fastlanes --all-targets --all-features: no warnings
* cargo fmt --all -- --check: clean
* ./scripts/public-api.sh: not run locally (no public API surface added; unsigned_counterpart is pub(crate)). Worth verifying in CI.

Generated by Claude Code

Lifts the `is_unsigned_int` gate on `DeltaArray` so `i8` / `i16` / `i32` / `i64` columns can be delta-encoded.

The upstream FastLanes kernels (`Delta::delta`, `Transpose::transpose`) are bounded on `T: FastLanes: Unsigned`, so signed inputs are processed by reinterpret-casting the underlying buffer to the same-width unsigned counterpart, running the existing kernel, and reinterpret-casting back. `wrapping_sub`/`wrapping_add` are bit-identical for signed and unsigned operands under two's-complement, so the round-trip is exact.

Note that the encoded delta bytes for inputs that cross zero have the high bits set (e.g. delta `-1i8` = `0xFF`); naively bit-packing those would force the bit width to `T`. A follow-up should compose `Delta` with `FoR` so the deltas are stored as `value - min(delta)` before bit-packing. See encodings/fastlanes/src/delta/FUSED_DECODE.md for a design note on a fused triple-kernel (unpack + add-reference + undelta) that addresses the decode bandwidth.

Also guards `Delta::cast` against signed sources: value-preserving casts of signed deltas (e.g. `-1i8` -> `4294967295u32`) break the wrapping-add invariant during decompression, so signed sources fall back to the decompress-and-reencode path.

Signed-off-by: Claude <noreply@anthropic.com>
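As a rough picture of what fusing unpack + add-reference + undelta means, the scalar sketch below compares a three-pass decode with a single-pass one. It assumes a plain little-endian bit stream and ignores the FastLanes transposed layout and SIMD entirely; every function name here is hypothetical and not taken from FUSED_DECODE.md.

```rust
/// Pack `width`-bit values into a little-endian bit stream (test helper).
fn pack(values: &[u32], width: u32) -> Vec<u8> {
    let w = width as usize;
    let mut out = vec![0u8; (values.len() * w + 7) / 8];
    for (i, &v) in values.iter().enumerate() {
        for b in 0..w {
            if (v >> b) & 1 == 1 {
                let bit = i * w + b;
                out[bit / 8] |= 1 << (bit % 8);
            }
        }
    }
    out
}

/// Read the `i`-th `width`-bit value from the bit stream.
fn unpack_at(packed: &[u8], width: u32, i: usize) -> u32 {
    let w = width as usize;
    let mut out = 0u32;
    for b in 0..w {
        let bit = i * w + b;
        let set = (packed[bit / 8] >> (bit % 8)) & 1;
        out |= (set as u32) << b;
    }
    out
}

/// Three passes: unpack, add the FoR reference, then prefix-sum the deltas.
fn decode_three_pass(packed: &[u8], width: u32, n: usize, reference: u32, base: u32) -> Vec<u32> {
    let unpacked: Vec<u32> = (0..n).map(|i| unpack_at(packed, width, i)).collect();
    let deltas: Vec<u32> = unpacked.iter().map(|&r| r.wrapping_add(reference)).collect();
    let mut acc = base;
    deltas
        .iter()
        .map(|&d| {
            acc = acc.wrapping_add(d);
            acc
        })
        .collect()
}

/// One pass: each value is unpacked, offset, and accumulated immediately.
fn decode_fused(packed: &[u8], width: u32, n: usize, reference: u32, base: u32) -> Vec<u32> {
    let mut acc = base;
    (0..n)
        .map(|i| {
            acc = acc.wrapping_add(unpack_at(packed, width, i).wrapping_add(reference));
            acc
        })
        .collect()
}

fn main() {
    // Signed deltas [-2, 1, 0, 1, -1] stored as residuals over min = -2.
    let reference = (-2i32) as u32;
    let packed = pack(&[0, 3, 2, 3, 1], 2);
    let a = decode_three_pass(&packed, 2, 5, reference, 10);
    let b = decode_fused(&packed, 2, 5, reference, 10);
    assert_eq!(a, b);
    println!("{:?}", b.iter().map(|&v| v as i32).collect::<Vec<_>>());
}
```

The three-pass version materialises two intermediate buffers, while the fused version touches each packed value once; that difference in memory traffic is the motivation the design note describes.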

Measures the encoded byte budget under three bit-packing strategies for four representative signed `i32` shapes (monotone, sensor-like wobble around zero, large-negative offset, near-monotone with backtracks):

| Workload | range | Wnaive | Wffor | Wzz | ratio |
|----------------------------------|-------------|-------:|------:|----:|-------:|
| monotone i32 (0..N) | [0, 1] | 1 | 1 | 2 | 15.97x |
| sensor i32 in [-100, 100] | [-196, 199] | 32 | 9 | 9 | 3.20x |
| offset i32 base=-1e9 | [0, 1] | 1 | 1 | 2 | 15.97x |
| near-monotone i32 (5% backtrack) | [-2, 1] | 32 | 2 | 3 | 10.65x |

The "naive" column is the OR-mask of the raw delta bit-patterns: a single negative delta sets every high bit and forces `W = T`, which is why the two workloads with negative deltas (`sensor`, `near-monotone`) blow up to 32 bits. FFoR brings them to 9 and 2 bits. ZigZag matches FFoR only on the symmetric `sensor` workload and loses on every asymmetric column.

Asserts that FFoR never exceeds naive, drops below `T` whenever a negative delta is present, and beats ZigZag on the asymmetric workloads. Run with `--nocapture` to see the table.

Signed-off-by: Claude <noreply@anthropic.com>
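The gap between ZigZag and FFoR on asymmetric ranges follows from how ZigZag spends one bit on the sign. The sketch below uses the standard ZigZag mapping and only the two ranges whose widths can be derived directly from the extremes listed above; `bits`, `zigzag`, `zigzag_width`, and `for_width` are hypothetical helper names.

```rust
/// Number of bits needed to represent `x` (0 needs 0 bits).
fn bits(x: u32) -> u32 {
    32 - x.leading_zeros()
}

/// Standard ZigZag mapping: 0, -1, 1, -2, 2, ... -> 0, 1, 2, 3, 4, ...
fn zigzag(d: i32) -> u32 {
    ((d << 1) ^ (d >> 31)) as u32
}

/// Width needed if every delta is ZigZag-encoded before packing.
fn zigzag_width(deltas: &[i32]) -> u32 {
    bits(deltas.iter().fold(0u32, |mask, &d| mask | zigzag(d)))
}

/// Width needed after subtracting the minimum delta (frame of reference).
fn for_width(deltas: &[i32]) -> u32 {
    let min = *deltas.iter().min().expect("non-empty input");
    bits(
        deltas
            .iter()
            .fold(0u32, |mask, &d| mask | d.wrapping_sub(min) as u32),
    )
}

fn main() {
    // Asymmetric range: all deltas are non-negative, yet ZigZag still
    // reserves the low bit for the sign, so it needs one more bit than FoR.
    let monotone = [0, 1];
    println!("monotone: Wffor = {}, Wzz = {}", for_width(&monotone), zigzag_width(&monotone));

    // Roughly symmetric range: both schemes land on the same width.
    let sensor = [-196, 199];
    println!("sensor:   Wffor = {}, Wzz = {}", for_width(&sensor), zigzag_width(&sensor));
}
```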

Extends the synthetic workload report with two extra columns: bases byte size and the FFoR bit-width those bases would pack to. For 8K-element i32 inputs the bases buffer is ~50% of the FFoR total on monotone-like columns, and the bases sequence inherits the smoothness of the input, so recursively packing the bases with FoR gives a further ~1.4x on top of FFoR(deltas):

| workload | FFoR (B) | ratio | bases (B) | Wb | +bcomp | ratio |
|----------------------------------|---------:|-------:|----------:|---:|-------:|-------:|
| monotone i32 (0..N) | 2052 | 15.97x | 1024 | 13 | 1448 | 22.63x |
| sensor i32 in [-100, 100] | 10244 | 3.20x | 1024 | 8 | 9480 | 3.46x |
| offset i32 base=-1e9 | 2052 | 15.97x | 1024 | 13 | 1448 | 22.63x |
| near-monotone i32 (5% backtrack) | 3076 | 10.65x | 1024 | 13 | 2472 | 13.26x |

This is already structurally enabled: the bases child is an `ArrayRef`, and the btrblocks compressor at vortex-btrblocks/src/schemes/integer.rs:917 already routes bases through `compress_child` so the cascading compressor picks whatever encoding fits (typically FoR + BitPacked).

Signed-off-by: Claude <noreply@anthropic.com>
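The byte counts in the table are consistent with a simple size model, sketched below. The model is inferred from the numbers rather than taken from the PR's code: it assumes 256 `i32` bases (1024 B / 4 B) and a 4-byte FoR reference per packed buffer, and all constant and function names are hypothetical.

```rust
const RAW_BYTES: usize = 32 * 1024; // 32 KiB of raw i32 input
const N: usize = 8 * 1024; // number of deltas
const BASES: usize = 256; // assumed: 1024 B of bases / 4 B each
const REF_BYTES: usize = 4; // assumed: one i32 FoR reference per buffer

/// Bytes needed to bit-pack `count` values at `width` bits each.
fn packed_bytes(count: usize, width: usize) -> usize {
    count * width / 8
}

/// FFoR(deltas) plus uncompressed bases.
fn ffor_total(w: usize) -> usize {
    packed_bytes(N, w) + REF_BYTES + BASES * 4
}

/// FFoR(deltas) plus bases that are themselves FoR + bit-packed at `wb` bits.
fn with_packed_bases(w: usize, wb: usize) -> usize {
    packed_bytes(N, w) + REF_BYTES + packed_bytes(BASES, wb) + REF_BYTES
}

fn main() {
    // (workload, delta width W, bases width Wb) as reported in the table.
    for (name, w, wb) in [("monotone", 1, 13), ("sensor", 9, 8), ("near-monotone", 2, 13)] {
        let plain = ffor_total(w);
        let packed = with_packed_bases(w, wb);
        println!(
            "{name}: FFoR = {plain} B ({:.2}x), +bcomp = {packed} B ({:.2}x)",
            RAW_BYTES as f64 / plain as f64,
            RAW_BYTES as f64 / packed as f64
        );
    }
}
```

Under this model the 1024-byte bases buffer is about half of the 2052-byte monotone total, which is why packing it moves the ratio from roughly 16x to roughly 22.6x.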

REUSE compliance — markdown files outside the patterns in REUSE.toml need inline SPDX comments. Signed-off-by: Claude <noreply@anthropic.com>

codspeed-hq · 2026-05-13T21:53:59Z

Merging this PR will degrade performance by 32.11%

⚠️

Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

❌ 4 regressed benchmarks
✅ 1206 untouched benchmarks

Warning

Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

	Mode	Benchmark	`BASE`	`HEAD`	Efficiency
❌	Simulation	`decompress_rd[f32, (100000, 0.01)]`	413.1 µs	585.8 µs	-29.48%
❌	Simulation	`decompress_rd[f64, (100000, 0.01)]`	668.8 µs	1,023.4 µs	-34.65%
❌	Simulation	`decompress_rd[f32, (100000, 0.1)]`	413.2 µs	585.8 µs	-29.47%
❌	Simulation	`decompress_rd[f64, (100000, 0.1)]`	668.8 µs	1,023.4 µs	-34.65%

Tip

Investigate this regression by commenting @codspeedbot fix this regression on this PR, or directly use the CodSpeed MCP with your agent.

Comparing claude/vortex-delta-negative-values-yQh1m (48d5602) with develop (da19bca)

Pre-merge polish across the three things a reviewer would notice:

* DeltaArray docstring: add a signed `i32` example next to the unsigned one so users see signed support is first-class. Verified by doctest.
* Conformance: extend `test_delta_consistency` and `test_delta_binary_numeric` with i32 / i64 / i8 cases (crossing zero, all-negative, single-negative). These run the array-trait conformance harness, so any operation that's silently broken for signed inputs surfaces here.
* cast.rs: expand the comment justifying why signed sources fall back to decompress-and-re-encode (the wrapping-add invariant breaks under value-preserving widening; the same hazard applies to cross-signedness).
* synthetic_workload_compression table: rename duplicate "ratio" columns to `FFoR x` / `+bcomp x` so the report is unambiguous.

256 -> 263 tests, all pass. Clippy clean. Fmt clean.

Signed-off-by: Claude <noreply@anthropic.com>

claude added 3 commits May 13, 2026 21:12

joseph-isaacs added the changelog/feature A new feature label May 13, 2026 — with Claude

fastlanes: add SPDX headers to FUSED_DECODE.md

1a5c639


joseph-isaacs added the do not merge Pull requests that are not intended to merge label May 13, 2026

joseph-isaacs changed the title ~~fastlanes: allow signed integers in Delta encoding~~ [claude] fastlanes: allow signed integers in Delta encoding May 13, 2026

joseph-isaacs closed this May 15, 2026

joseph-isaacs wants to merge 5 commits into develop from claude/vortex-delta-negative-values-yQh1m
