-Start Date: 2026-03-04
-Tracking Issue: [vortex-data/vortex#0000](https://github.com/vortex-data/vortex/issues/0000)

+# Fixed-shape Tensor Extension
+
## Summary

-We would like to add a `FixedShapeTensor` type to Vortex as an extension over `FixedSizeList`. This
-RFC proposes the design of a fixed-shape tensor with contiguous backing memory.
+We would like to add a `FixedShapeTensor` type to Vortex as an extension type backed by
+`FixedSizeList`. This RFC proposes the design of a fixed-shape tensor with contiguous backing
+memory.

## Motivation

-#### Tensors in the wild
+### Tensors in the wild

Tensors are multi-dimensional (n-dimensional) arrays that generalize vectors (1D) and matrices (2D)
to arbitrary dimensions. They are quite common in ML/AI and scientific computing applications. To
@@ -18,7 +21,7 @@ name just a few examples:
- Multi-dimensional sensor or time-series data
- Embedding vectors from language models and recommendation systems

-#### Fixed-shape tensors in Vortex
+### Fixed-shape tensors in Vortex

In the current version of Vortex, there are two ways to represent fixed-shape tensors using the
`FixedSizeList` `DType`, and neither seems satisfactory.
@@ -63,7 +66,7 @@ for this tensor would be `FixedSizeList<i32, 24>` since `2 x 3 x 4 = 24`.

This is equivalent to the design of Arrow's canonical Fixed Shape Tensor extension type. For
discussion on why we choose not to represent tensors as nested FSLs (for example
-`FixedSizeList<FixedSizeList<FixedSizeList<i32, 2>, 3>, 4>`), see the [alternatives](#alternatives)
+`FixedSizeList<FixedSizeList<FixedSizeList<i32, 4>, 3>, 2>`), see the [alternatives](#alternatives)
section.

### Element Type
@@ -97,36 +100,43 @@ This is a restriction we can relax in the future if a compelling use case arises

Theoretically, we only need the dimensions of the tensor to have a useful Tensor type. However, we
likely also want two other pieces of information, the dimension names and the permutation order,
-which mimics the [Arrow Fixed Shape Tensor](https://arrow.apache.org/docs/format/CanonicalExtensions.html#fixed-shape-tensor)
-type (which is a Canonical Extension type).
+which aligns with Arrow's [Fixed Shape Tensor](https://arrow.apache.org/docs/format/CanonicalExtensions.html#fixed-shape-tensor)
+canonical extension type.

-Here is what the metadata of the `FixedShapeTensor` extension type in Vortex will look like (in
+Here is what the metadata of the `FixedShapeTensor` extension type in Vortex might look like (in
Rust):

```rust
-/// Metadata for a [`FixedShapeTensor`] extension type.
+/// Metadata for a `FixedShapeTensor` extension type.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct FixedShapeTensorMetadata {
-    /// The shape of the tensor.
+    /// The logical shape of the tensor.
+    ///
+    /// `logical_shape[i]` is the size of the `i`-th logical dimension. When a `permutation` is
+    /// present, the physical shape (i.e., the row-major memory layout) is derived as
+    /// `physical_shape[permutation[i]] = logical_shape[i]`.
    ///
-    /// The shape is always defined over row-major storage. May be empty (0D scalar tensor) or
-    /// contain dimensions of size 0 (degenerate tensor).
-    shape: Vec<usize>,
+    /// May be empty (0D scalar tensor) or contain dimensions of size 0 (degenerate tensor).
+    logical_shape: Vec<usize>,

-    /// Optional names for each dimension. Each name corresponds to a dimension in the `shape`.
+    /// Optional names for each logical dimension. Each name corresponds to an entry in
+    /// `logical_shape`.
    ///
-    /// If names exist, there must be an equal number of names to dimensions.
+    /// If names exist, there must be an equal number of names to logical dimensions.
    dim_names: Option<Vec<String>>,

-    /// The permutation of the tensor's dimensions, mapping each logical dimension to its
-    /// corresponding physical dimension: `permutation[logical] = physical`.
+    /// The permutation of the tensor's dimensions. `permutation[i]` is the physical dimension
+    /// index that logical dimension `i` maps to.
    ///
-    /// If this is `None`, then the logical and physical layout are equal, and the permutation is
-    /// in-order `[0, 1, ..., N-1]`.
+    /// If this is `None`, then the logical and physical layouts are identical, equivalent to
+    /// the identity permutation `[0, 1, ..., N-1]`.
    permutation: Option<Vec<usize>>,
}
```

+Note that this metadata would store the _logical_ shape of the tensor, not the physical shape. For
+more info on this, see the [physical vs. logical shape](#physical-vs-logical-shape) discussion.
+
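
The metadata above states its invariants only in doc comments. As an illustration of how they might
be enforced, here is a minimal standalone sketch. It is not proposed API: the struct is re-declared
here with public fields, and the helper names (`validate`, `element_count`) are hypothetical.

```rust
/// Standalone stand-in for the proposed `FixedShapeTensorMetadata` (fields
/// made public here purely for illustration).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct FixedShapeTensorMetadata {
    pub logical_shape: Vec<usize>,
    pub dim_names: Option<Vec<String>>,
    pub permutation: Option<Vec<usize>>,
}

impl FixedShapeTensorMetadata {
    /// Checks the invariants from the doc comments: the name count matches the
    /// dimension count, and `permutation` (if present) is a permutation of `0..ndim`.
    pub fn validate(&self) -> Result<(), String> {
        let ndim = self.logical_shape.len();
        if let Some(names) = &self.dim_names {
            if names.len() != ndim {
                return Err(format!("expected {} dim names, got {}", ndim, names.len()));
            }
        }
        if let Some(perm) = &self.permutation {
            let mut seen = vec![false; ndim];
            let valid = perm.len() == ndim
                && perm.iter().all(|&p| p < ndim && !std::mem::replace(&mut seen[p], true));
            if !valid {
                return Err(format!("invalid permutation {:?} for {} dims", perm, ndim));
            }
        }
        Ok(())
    }

    /// The number of elements per tensor, i.e. the width of the backing
    /// `FixedSizeList`. The empty product is 1, so a 0D tensor holds one element.
    pub fn element_count(&self) -> usize {
        self.logical_shape.iter().product()
    }
}
```

Note how `element_count` also covers the 0D case discussed later: an empty shape yields an element
count of one.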
### Stride

The stride of a tensor defines the number of elements to skip in memory to move one step along each
@@ -148,37 +158,30 @@ The element at index `[i, j, k]` is located at memory offset `12*i + 4*j + k`.

### Physical vs. logical shape

-When a permutation is present, stride derivation depends on whether `shape` is stored as physical
-or logical (see [unresolved questions](#unresolved-questions)). If `shape` is **physical**
-(matching Arrow's convention), the process is straightforward: compute row-major strides over the
-stored shape, then permute them to get logical strides
-(`logical_stride[i] = physical_stride[perm[i]]`).
+When a permutation is present, stride derivation depends on whether `logical_shape` stores logical
+or physical dimensions. We lean towards storing **logical** dimensions (matching NumPy/PyTorch and
+Vortex's logical type system), though this is not yet finalized (see
+[unresolved questions](#unresolved-questions)).

-Continuing the example with physical shape `[2, 3, 4]` and permutation `[2, 0, 1]`, the physical
-strides are `[12, 4, 1]` and the logical strides are
-`[physical_stride[2], physical_stride[0], physical_stride[1]]` = `[1, 12, 4]`.
+With logical shape, we first invert the permutation to recover the physical shape
+(`physical_shape[perm[i]] = logical_shape[i]`), compute row-major strides over that, then map them
+back to logical order.

-If `shape` is **logical**, we must first invert the permutation to recover the physical shape
-(`physical_shape[perm[l]] = shape[l]`), compute row-major strides over that, then map them back to
-logical order.
+For example, with logical shape `[4, 2, 3]` and permutation `[2, 0, 1]`: the physical shape is
+`[2, 3, 4]`, physical strides are `[12, 4, 1]`, and logical strides are `[1, 12, 4]`.

-For the same example with logical shape `[4, 2, 3]` and permutation `[2, 0, 1]`:
-the physical shape is `[2, 3, 4]`, physical strides are `[12, 4, 1]`, and logical strides are
-`[1, 12, 4]`.
+Alternatively, if we stored **physical** dimensions instead (matching Arrow's convention), stride
+derivation would be simpler: compute row-major strides directly over the stored shape, then permute
+them (`logical_stride[i] = physical_stride[perm[i]]`). For the same tensor with physical shape
+`[2, 3, 4]` and permutation `[2, 0, 1]`, the result is the same: `[1, 12, 4]`.

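To make the logical-shape derivation concrete, here is a small standalone Rust sketch. The function
name and signature are illustrative only (not proposed API); it follows exactly the invert-then-map
steps described above, with strides measured in elements.

```rust
/// Derives logical strides (in elements) from a logical shape and an optional
/// permutation: invert the permutation to recover the physical (row-major)
/// shape, compute row-major strides over it, then map them back to logical order.
pub fn logical_strides(logical_shape: &[usize], permutation: Option<&[usize]>) -> Vec<usize> {
    let ndim = logical_shape.len();
    let identity: Vec<usize> = (0..ndim).collect();
    let perm = permutation.unwrap_or(&identity);

    // physical_shape[perm[i]] = logical_shape[i]
    let mut physical_shape = vec![0usize; ndim];
    for (i, &p) in perm.iter().enumerate() {
        physical_shape[p] = logical_shape[i];
    }

    // Row-major strides over the physical shape: the innermost dimension has stride 1.
    let mut physical_strides = vec![1usize; ndim];
    for d in (0..ndim.saturating_sub(1)).rev() {
        physical_strides[d] = physical_strides[d + 1] * physical_shape[d + 1];
    }

    // logical_stride[i] = physical_stride[perm[i]]
    perm.iter().map(|&p| physical_strides[p]).collect()
}
```

This reproduces the worked example: logical shape `[4, 2, 3]` with permutation `[2, 0, 1]` yields
logical strides `[1, 12, 4]`.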
-We want to emphasize that this is the same result, but with an extra inversion step. In either case,
-logical strides are always a permutation of the physical strides.
-
-The choice of whether `shape` stores physical or logical dimensions also affects interoperability
-with [Arrow](#arrow) and [NumPy/PyTorch](#numpy-and-pytorch) (see those sections for details), as
-well as stride derivation complexity.
+In either case, logical strides are always a permutation of the physical strides. The cost of
+conversion between conventions is a cheap O(ndim) permutation at the boundary, so the difference is
+more about convention than performance.

Physical shape favors Arrow compatibility and simpler stride math. Logical shape favors
-NumPy/PyTorch compatibility and is arguably more intuitive for our users since Vortex has a logical
-type system.
-
-The cost of conversion in either direction is a cheap O(ndim) permutation at the boundary, so the
-difference is more about convention than performance.
+NumPy/PyTorch compatibility and is arguably more intuitive for users since Vortex has a logical type
+system.

### Conversions

@@ -188,11 +191,10 @@ Our storage type and metadata are designed to closely match Arrow's Fixed Shape
extension type. The `FixedSizeList` backing buffer, dimension names, and permutation pass through
unchanged, making the data conversion itself zero-copy (for tensors with at least one dimension).

-Arrow stores `shape` as **physical** (the dimensions of the row-major layout). Whether the `shape`
-field passes through directly depends on the outcome of the
-[physical vs. logical shape](#physical-vs-logical-shape) open question. If Vortex adopts the same
-convention, shape maps directly. If Vortex stores logical shape instead, conversion requires a
-cheap O(ndim) scatter: `arrow_shape[perm[i]] = vortex_shape[i]`.
+Arrow stores `shape` as **physical** (the dimensions of the row-major layout). Since we lean towards
+storing logical shape in Vortex, Arrow conversion will require a cheap O(ndim) scatter:
+`arrow_shape[perm[i]] = vortex_shape[i]`. If we instead adopt physical shape, the field would pass
+through directly.

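The scatter above is small enough to sketch in full. This is illustrative only (the function name
`to_arrow_shape` is hypothetical, not part of the RFC or the Arrow API): given a logical Vortex
shape and an optional permutation, it produces the physical shape Arrow expects.

```rust
/// Illustrative sketch of the shape conversion described above: Arrow stores
/// the physical shape, so given a logical `vortex_shape` we scatter
/// `arrow_shape[perm[i]] = vortex_shape[i]`.
pub fn to_arrow_shape(vortex_shape: &[usize], permutation: Option<&[usize]>) -> Vec<usize> {
    match permutation {
        // Identity permutation: logical and physical shapes coincide.
        None => vortex_shape.to_vec(),
        Some(perm) => {
            let mut arrow_shape = vec![0usize; vortex_shape.len()];
            for (i, &p) in perm.iter().enumerate() {
                arrow_shape[p] = vortex_shape[i];
            }
            arrow_shape
        }
    }
}
```

For the running example, logical shape `[4, 2, 3]` with permutation `[2, 0, 1]` scatters to the
Arrow (physical) shape `[2, 3, 4]`.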
#### NumPy and PyTorch
198200
@@ -204,10 +206,9 @@ memory with the original without copying. However, this means that non-contiguou
204206anywhere, and kernels must handle arbitrary stride patterns. PyTorch supposedly requires many
205207operations to call ` .contiguous() ` before proceeding.
206208
207- NumPy and PyTorch store ` shape ` as ** logical** (the dimensions the user indexes with). If Vortex
208- also stores logical shape, the shape field passes through unchanged. If Vortex stores physical
209- shape, a cheap O(ndim) permutation is needed at the boundary (see
210- [ physical vs. logical shape] ( #physical-vs-logical-shape ) ).
209+ NumPy and PyTorch store ` shape ` as ** logical** (the dimensions the user indexes with). Since we lean
210+ towards storing logical shape in Vortex, the shape field would pass through unchanged. If we instead
211+ adopt physical shape, a cheap O(ndim) permutation would be needed at the boundary.
211212
212213Since Vortex fixed-shape tensors always have dense backing memory, we can always zero-copy _ to_
213214NumPy and PyTorch by passing the buffer pointer, logical shape, and logical strides. A permuted
@@ -242,7 +243,7 @@ elements in a tensor is the product of its shape dimensions, and that the

0D tensors have an empty shape `[]` and contain exactly one element (since the product of no
dimensions is 1). These represent scalar values wrapped in the tensor type. The storage type is
-`FixedSizeList<p, 1>` (which is identical to a flat `PrimitiveArray`).
+`FixedSizeList<p, 1>` (semantically equivalent to a flat `PrimitiveArray`).

#### Size-0 dimensions
248249
@@ -257,7 +258,8 @@ dimensions of size 0 are valid (e.g., `np.zeros((3, 0, 4))`). PyTorch supports 0
v0.4.0 and also allows size-0 dimensions.

Arrow's Fixed Shape Tensor spec, however, requires at least one dimension (`ndim >= 1`), so 0D
-tensors would need special handling during Arrow conversion (we would likely just panic).
+tensors would need special handling during Arrow conversion (e.g., returning an error or unwrapping
+to a scalar).

### Compression

@@ -368,35 +370,35 @@ _Note: This section was Claude-researched._

## Unresolved Questions

-- Should `shape` store physical dimensions (matching Arrow) or logical dimensions (matching
-  NumPy/PyTorch)? See the [physical vs. logical shape](#physical-vs-logical-shape) discussion in
-  the stride section. The current RFC assumes physical shape, but this is not finalized.
+- Should `logical_shape` store logical dimensions (matching NumPy/PyTorch) or physical dimensions
+  (matching Arrow)? The RFC currently leans towards logical shape, but this is not finalized. See
+  the [physical vs. logical shape](#physical-vs-logical-shape) discussion in the stride section.
- Are two tensors with different permutations but the same logical values considered equal? This
  affects deduplication and comparisons. The type metadata might be different but the entire tensor
  value might be equal, so it seems strange to say that they are not actually equal?
- Are there potential tensor-specific compression schemes we can take advantage of?

## Future Possibilities

-#### Variable-shape tensors
+### Variable-shape tensors

Arrow defines a
[Variable Shape Tensor](https://arrow.apache.org/docs/format/CanonicalExtensions.html#variable-shape-tensor)
extension type for arrays where each tensor can have a different shape. This would enable workloads
like batched sequences of different lengths.

-#### Sparse tensors
+### Sparse tensors

A sparse tensor type could use `List` or `ListView` as its storage type to efficiently represent
tensors with many zero or absent elements.

-#### A unified `Tensor` type
+### A unified `Tensor` type

This RFC proposes `FixedShapeTensor` as a single, concrete extension type. However, tensors
naturally vary along two axes: shape (fixed vs. variable) and density (dense vs. sparse). Both a
variable-shape tensor (fixed dimensionality, variable shape per element) and a sparse tensor would
-need a different storage type, since it needs to efficiently skip over zero or null regions (and
-for both this would likely be `List` or `ListView`).
+need a different storage type, since it needs to efficiently skip over zero or null regions (and for
+both, this would likely be `List` or `ListView`).

Each combination would be its own extension type (`FixedShapeTensor`, `VariableShapeTensor`,
`SparseFixedShapeTensor`, etc.), but this proliferates types and fragments any shared tensor logic.
@@ -408,12 +410,12 @@ with and a single place to define tensor operations.
For now, `FixedShapeTensor` is the only variant we need. The others can be added incrementally
as use cases arise.

-#### Tensor-specific encodings
+### Tensor-specific encodings

Beyond general-purpose compression, encodings tailored to tensor data (e.g., exploiting spatial
locality across dimensions) could improve compression ratios for specific workloads.

-#### ndindex-style compute expressions
+### ndindex-style compute expressions

As the extension type expression system matures, we can implement a rich set of tensor indexing and
slicing operations inspired by [ndindex](https://quansight-labs.github.io/ndindex/index.html),