Commit 04b8d40
RFC 33: update for merged PR 7269, split Stage 1 into 1a/1b
Stage 1a (merged in vortex-data/vortex#7269):
- MSE-only TurboQuant with 8-bit default (near-lossless, ~4e-5 MSE)
- Dimension >= 128 scheme selection, 3-round SORF
- Original QJL PR (#7167) closed
Stage 1b (next — array representation cleanup):
- Power-of-2 dimension requirement (remove internal padding)
- FixedSizeListArray rotation signs for variable SRHT rounds
- Dtype-matching norms, structured metadata (format TBD pending
  vtable refactor)
- Goal: wire format ready for backward-compat guarantees
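
A minimal Python sketch of the Stage 1b dimension rule, for illustration only. It assumes the internal padding being removed rounded the dimension up to the next power of two; the commit message does not state this. All function names are hypothetical and are not part of the Vortex API.

```python
def is_power_of_two(dim: int) -> bool:
    """True if dim is a positive power of two (1, 2, 4, ...)."""
    return dim > 0 and (dim & (dim - 1)) == 0


def padded_dim(dim: int) -> int:
    """Smallest power of two >= dim.

    Assumption: this is what internal padding would have produced.
    """
    if dim < 1:
        raise ValueError("dimension must be positive")
    return 1 << (dim - 1).bit_length()


def check_dimension(dim: int) -> int:
    """Reject non-power-of-2 dimensions instead of padding them silently."""
    if not is_power_of_two(dim):
        raise ValueError(
            f"dimension {dim} is not a power of two "
            f"(padding would have stored {padded_dim(dim)})"
        )
    return dim
```

Under this rule a 768-dimensional input is rejected up front rather than being stored as 1024 padded dimensions, which keeps the stored layout identical to the logical one.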
Stage 2 reframed as general-purpose structural encoding:
- Block decomposition is a vertical split of FSL by dimension,
  analogous to ChunkedArray's horizontal split by rows
- Encoding-agnostic: each block is independently encoded (all TQ
  initially, but supports heterogeneous child encodings)
- Straggler blocks noted as future work for no-qualifying-B dims
- PDX (Stage 3) similarly structural, not TQ-specific
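
The vertical/horizontal analogy above can be sketched in plain Python. Lists of lists stand in for a FixedSizeListArray, and the function names are hypothetical, not Vortex APIs. Raising an error on a non-divisible dimension is a simplification of the straggler-block case, which the RFC leaves as future work.

```python
def split_rows(vectors, chunk_rows):
    """Horizontal split: consecutive row ranges, as in a chunked array."""
    return [vectors[i:i + chunk_rows] for i in range(0, len(vectors), chunk_rows)]


def split_dims(vectors, block_dim):
    """Vertical split: each block keeps every row but only a slice of its dimensions."""
    dim = len(vectors[0])
    if dim % block_dim != 0:
        # A real design might emit a smaller trailing (straggler) block instead.
        raise ValueError("dimension is not divisible by the block size")
    return [
        [row[j:j + block_dim] for row in vectors]
        for j in range(0, dim, block_dim)
    ]


def join_dims(blocks):
    """Inverse of split_dims: concatenate the blocks' slices row by row."""
    n_rows = len(blocks[0])
    return [
        [value for block in blocks for value in block[r]]
        for r in range(n_rows)
    ]
```

Because each block is a self-contained list of shorter fixed-size rows, every block could be handed to a different encoder and decoded independently before `join_dims` reassembles the original vectors.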
Other changes:
- Codes/centroids remain separate slots; DictArray for canonicalize
- Updated compression ratio examples for 8-bit default
- Updated array layouts, migration table, references throughout
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Will Manning <will@willmanning.io>
1 parent 396f279 commit 04b8d40
1 file changed
Lines changed: 351 additions & 183 deletions