
Commit e28883d

Tensor RFC touchups (#26)
I forgot to fix some inconsistencies in this RFC related to logical vs. physical shape. You can look at the individual commits to filter out noise.

---------

Signed-off-by: Connor Tsui <connor.tsui20@gmail.com>
1 parent 0ffa944 commit e28883d

accepted/0024-tensor.md

Lines changed: 67 additions & 65 deletions
@@ -1,14 +1,17 @@
 - Start Date: 2026-03-04
 - Tracking Issue: [vortex-data/vortex#0000](https://github.com/vortex-data/vortex/issues/0000)

+# Fixed-shape Tensor Extension
+
 ## Summary

-We would like to add a `FixedShapeTensor` type to Vortex as an extension over `FixedSizeList`. This
-RFC proposes the design of a fixed-shape tensor with contiguous backing memory.
+We would like to add a `FixedShapeTensor` type to Vortex as an extension type backed by
+`FixedSizeList`. This RFC proposes the design of a fixed-shape tensor with contiguous backing
+memory.

 ## Motivation

-#### Tensors in the wild
+### Tensors in the wild

 Tensors are multi-dimensional (n-dimensional) arrays that generalize vectors (1D) and matrices (2D)
 to arbitrary dimensions. They are quite common in ML/AI and scientific computing applications. To
@@ -18,7 +21,7 @@ name just a few examples:
 - Multi-dimensional sensor or time-series data
 - Embedding vectors from language models and recommendation systems

-#### Fixed-shape tensors in Vortex
+### Fixed-shape tensors in Vortex

 In the current version of Vortex, there are two ways to represent fixed-shape tensors using the
 `FixedSizeList` `DType`, and neither seems satisfactory.
@@ -63,7 +66,7 @@ for this tensor would be `FixedSizeList<i32, 24>` since `2 x 3 x 4 = 24`.

 This is equivalent to the design of Arrow's canonical Fixed Shape Tensor extension type. For
 discussion on why we choose not to represent tensors as nested FSLs (for example
-`FixedSizeList<FixedSizeList<FixedSizeList<i32, 2>, 3>, 4>`), see the [alternatives](#alternatives)
+`FixedSizeList<FixedSizeList<FixedSizeList<i32, 4>, 3>, 2>`), see the [alternatives](#alternatives)
 section.

 ### Element Type
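To make the flat storage layout in the hunk above concrete, here is a short editorial sketch (not part of the commit; `flat_index` is a hypothetical helper, not proposed Vortex API) of the row-major mapping that puts the RFC's `2 x 3 x 4` example tensor into a `FixedSizeList<i32, 24>`:

```rust
// Editorial sketch: row-major flattening for a fixed-shape tensor stored as a flat
// FixedSizeList. `flat_index` is a hypothetical helper, not Vortex API.
fn flat_index(shape: &[usize], index: &[usize]) -> usize {
    assert_eq!(shape.len(), index.len());
    index.iter().zip(shape).fold(0, |acc, (&i, &dim)| {
        assert!(i < dim, "index out of bounds");
        acc * dim + i
    })
}

fn main() {
    // The RFC's example: a 2 x 3 x 4 tensor is stored as FixedSizeList<i32, 24>.
    let shape = [2usize, 3, 4];
    assert_eq!(shape.iter().product::<usize>(), 24);
    // Element [i, j, k] lives at memory offset 12*i + 4*j + k (see the stride section).
    assert_eq!(flat_index(&shape, &[1, 2, 3]), 12 * 1 + 4 * 2 + 3);
}
```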
@@ -97,36 +100,43 @@ This is a restriction we can relax in the future if a compelling use case arises

 Theoretically, we only need the dimensions of the tensor to have a useful Tensor type. However, we
 likely also want two other pieces of information, the dimension names and the permutation order,
-which mimics the [Arrow Fixed Shape Tensor](https://arrow.apache.org/docs/format/CanonicalExtensions.html#fixed-shape-tensor)
-type (which is a Canonical Extension type).
+which aligns with Arrow's [Fixed Shape Tensor](https://arrow.apache.org/docs/format/CanonicalExtensions.html#fixed-shape-tensor)
+canonical extension type.

-Here is what the metadata of the `FixedShapeTensor` extension type in Vortex will look like (in
+Here is what the metadata of the `FixedShapeTensor` extension type in Vortex might look like (in
 Rust):

 ```rust
-/// Metadata for a [`FixedShapeTensor`] extension type.
+/// Metadata for a `FixedShapeTensor` extension type.
 #[derive(Debug, Clone, PartialEq, Eq, Hash)]
 pub struct FixedShapeTensorMetadata {
-    /// The shape of the tensor.
+    /// The logical shape of the tensor.
+    ///
+    /// `logical_shape[i]` is the size of the `i`-th logical dimension. When a `permutation` is
+    /// present, the physical shape (i.e., the row-major memory layout) is derived as
+    /// `physical_shape[permutation[i]] = logical_shape[i]`.
     ///
-    /// The shape is always defined over row-major storage. May be empty (0D scalar tensor) or
-    /// contain dimensions of size 0 (degenerate tensor).
-    shape: Vec<usize>,
+    /// May be empty (0D scalar tensor) or contain dimensions of size 0 (degenerate tensor).
+    logical_shape: Vec<usize>,

-    /// Optional names for each dimension. Each name corresponds to a dimension in the `shape`.
+    /// Optional names for each logical dimension. Each name corresponds to an entry in
+    /// `logical_shape`.
     ///
-    /// If names exist, there must be an equal number of names to dimensions.
+    /// If names exist, there must be an equal number of names to logical dimensions.
     dim_names: Option<Vec<String>>,

-    /// The permutation of the tensor's dimensions, mapping each logical dimension to its
-    /// corresponding physical dimension: `permutation[logical] = physical`.
+    /// The permutation of the tensor's dimensions. `permutation[i]` is the physical dimension
+    /// index that logical dimension `i` maps to.
     ///
-    /// If this is `None`, then the logical and physical layout are equal, and the permutation is
-    /// in-order `[0, 1, ..., N-1]`.
+    /// If this is `None`, then the logical and physical layouts are identical, equivalent to
+    /// the identity permutation `[0, 1, ..., N-1]`.
     permutation: Option<Vec<usize>>,
 }
 ```

+Note that this metadata would store the _logical_ shape of the tensor, not the physical shape. For
+more info on this, see the [physical vs. logical shape](#physical-vs-logical-shape) discussion.
+
 ### Stride

 The stride of a tensor defines the number of elements to skip in memory to move one step along each
@@ -148,37 +158,30 @@ The element at index `[i, j, k]` is located at memory offset `12*i + 4*j + k`.

 ### Physical vs. logical shape

-When a permutation is present, stride derivation depends on whether `shape` is stored as physical
-or logical (see [unresolved questions](#unresolved-questions)). If `shape` is **physical**
-(matching Arrow's convention), the process is straightforward: compute row-major strides over the
-stored shape, then permute them to get logical strides
-(`logical_stride[i] = physical_stride[perm[i]]`).
+When a permutation is present, stride derivation depends on whether `logical_shape` stores logical
+or physical dimensions. We lean towards storing **logical** dimensions (matching NumPy/PyTorch and
+Vortex's logical type system), though this is not yet finalized (see
+[unresolved questions](#unresolved-questions)).

-Continuing the example with physical shape `[2, 3, 4]` and permutation `[2, 0, 1]`, the physical
-strides are `[12, 4, 1]` and the logical strides are
-`[physical_stride[2], physical_stride[0], physical_stride[1]]` = `[1, 12, 4]`.
+With logical shape, we first invert the permutation to recover the physical shape
+(`physical_shape[perm[i]] = logical_shape[i]`), compute row-major strides over that, then map them
+back to logical order.

-If `shape` is **logical**, we must first invert the permutation to recover the physical shape
-(`physical_shape[perm[l]] = shape[l]`), compute row-major strides over that, then map them back to
-logical order.
+For example, with logical shape `[4, 2, 3]` and permutation `[2, 0, 1]`: the physical shape is
+`[2, 3, 4]`, physical strides are `[12, 4, 1]`, and logical strides are `[1, 12, 4]`.

-For the same example with logical shape `[4, 2, 3]` and permutation `[2, 0, 1]`:
-the physical shape is `[2, 3, 4]`, physical strides are `[12, 4, 1]`, and logical strides are
-`[1, 12, 4]`.
+Alternatively, if we stored **physical** dimensions instead (matching Arrow's convention), stride
+derivation would be simpler: compute row-major strides directly over the stored shape, then permute
+them (`logical_stride[i] = physical_stride[perm[i]]`). For the same tensor with physical shape
+`[2, 3, 4]` and permutation `[2, 0, 1]`, the result is the same: `[1, 12, 4]`.

-We want to emphasize that this is the same result, but with an extra inversion step. In either case,
-logical strides are always a permutation of the physical strides.
-
-The choice of whether `shape` stores physical or logical dimensions also affects interoperability
-with [Arrow](#arrow) and [NumPy/PyTorch](#numpy-and-pytorch) (see those sections for details), as
-well as stride derivation complexity.
+In either case, logical strides are always a permutation of the physical strides. The cost of
+conversion between conventions is a cheap O(ndim) permutation at the boundary, so the difference is
+more about convention than performance.

 Physical shape favors Arrow compatibility and simpler stride math. Logical shape favors
-NumPy/PyTorch compatibility and is arguably more intuitive for our users since Vortex has a logical
-type system.
-
-The cost of conversion in either direction is a cheap O(ndim) permutation at the boundary, so the
-difference is more about convention than performance.
+NumPy/PyTorch compatibility and is arguably more intuitive for users since Vortex has a logical type
+system.

 ### Conversions

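The stride derivation described in the hunk above can be illustrated with a small editorial sketch (not part of the commit; it assumes the metadata stores the logical shape plus a logical-to-physical permutation, and `strides_from_logical` is a hypothetical helper, not proposed Vortex API):

```rust
// Editorial sketch: derive row-major strides when the metadata stores the logical shape
// and a logical-to-physical permutation.
fn strides_from_logical(logical_shape: &[usize], permutation: &[usize]) -> Vec<usize> {
    let ndim = logical_shape.len();
    assert_eq!(permutation.len(), ndim);

    // Invert the permutation to recover the physical (row-major) shape:
    // physical_shape[permutation[i]] = logical_shape[i].
    let mut physical_shape = vec![0usize; ndim];
    for (i, &p) in permutation.iter().enumerate() {
        physical_shape[p] = logical_shape[i];
    }

    // Row-major strides over the physical shape: the last dimension is contiguous.
    let mut physical_strides = vec![1usize; ndim];
    for d in (0..ndim.saturating_sub(1)).rev() {
        physical_strides[d] = physical_strides[d + 1] * physical_shape[d + 1];
    }

    // Map back to logical order: logical_stride[i] = physical_stride[permutation[i]].
    permutation.iter().map(|&p| physical_strides[p]).collect()
}

fn main() {
    // The RFC's worked example: logical shape [4, 2, 3] with permutation [2, 0, 1] has
    // physical shape [2, 3, 4], physical strides [12, 4, 1], and logical strides [1, 12, 4].
    assert_eq!(strides_from_logical(&[4, 2, 3], &[2, 0, 1]), vec![1, 12, 4]);
}
```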
@@ -188,11 +191,10 @@ Our storage type and metadata are designed to closely match Arrow's Fixed Shape
 extension type. The `FixedSizeList` backing buffer, dimension names, and permutation pass through
 unchanged, making the data conversion itself zero-copy (for tensors with at least one dimension).

-Arrow stores `shape` as **physical** (the dimensions of the row-major layout). Whether the `shape`
-field passes through directly depends on the outcome of the
-[physical vs. logical shape](#physical-vs-logical-shape) open question. If Vortex adopts the same
-convention, shape maps directly. If Vortex stores logical shape instead, conversion requires a
-cheap O(ndim) scatter: `arrow_shape[perm[i]] = vortex_shape[i]`.
+Arrow stores `shape` as **physical** (the dimensions of the row-major layout). Since we lean towards
+storing logical shape in Vortex, Arrow conversion will require a cheap O(ndim) scatter:
+`arrow_shape[perm[i]] = vortex_shape[i]`. If we instead adopt physical shape, the field would pass
+through directly.

 #### NumPy and PyTorch

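As an editorial illustration of the boundary conversion described in the hunk above (hypothetical helpers, not proposed Vortex API), the scatter and its inverse gather are each a single O(ndim) pass:

```rust
// Editorial sketch of the shape conversion at the Arrow boundary, where Arrow stores
// the physical (row-major) shape. Hypothetical helpers, not Vortex API.
fn vortex_to_arrow_shape(vortex_shape: &[usize], permutation: &[usize]) -> Vec<usize> {
    // O(ndim) scatter: arrow_shape[perm[i]] = vortex_shape[i].
    let mut arrow_shape = vec![0usize; vortex_shape.len()];
    for (i, &p) in permutation.iter().enumerate() {
        arrow_shape[p] = vortex_shape[i];
    }
    arrow_shape
}

fn arrow_to_vortex_shape(arrow_shape: &[usize], permutation: &[usize]) -> Vec<usize> {
    // Inverse gather: vortex_shape[i] = arrow_shape[perm[i]].
    permutation.iter().map(|&p| arrow_shape[p]).collect()
}

fn main() {
    // The same example tensor: logical shape [4, 2, 3], permutation [2, 0, 1].
    assert_eq!(vortex_to_arrow_shape(&[4, 2, 3], &[2, 0, 1]), vec![2, 3, 4]);
    assert_eq!(arrow_to_vortex_shape(&[2, 3, 4], &[2, 0, 1]), vec![4, 2, 3]);
}
```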
@@ -204,10 +206,9 @@ memory with the original without copying. However, this means that non-contiguou
 anywhere, and kernels must handle arbitrary stride patterns. PyTorch supposedly requires many
 operations to call `.contiguous()` before proceeding.

-NumPy and PyTorch store `shape` as **logical** (the dimensions the user indexes with). If Vortex
-also stores logical shape, the shape field passes through unchanged. If Vortex stores physical
-shape, a cheap O(ndim) permutation is needed at the boundary (see
-[physical vs. logical shape](#physical-vs-logical-shape)).
+NumPy and PyTorch store `shape` as **logical** (the dimensions the user indexes with). Since we lean
+towards storing logical shape in Vortex, the shape field would pass through unchanged. If we instead
+adopt physical shape, a cheap O(ndim) permutation would be needed at the boundary.

 Since Vortex fixed-shape tensors always have dense backing memory, we can always zero-copy _to_
 NumPy and PyTorch by passing the buffer pointer, logical shape, and logical strides. A permuted
@@ -242,7 +243,7 @@ elements in a tensor is the product of its shape dimensions, and that the

 0D tensors have an empty shape `[]` and contain exactly one element (since the product of no
 dimensions is 1). These represent scalar values wrapped in the tensor type. The storage type is
-`FixedSizeList<p, 1>` (which is identical to a flat `PrimitiveArray`).
+`FixedSizeList<p, 1>` (semantically equivalent to a flat `PrimitiveArray`).

 #### Size-0 dimensions

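A tiny editorial sketch (not part of the commit) of the element-count rule referenced above: the empty product is 1, so a 0D tensor holds exactly one element, while any size-0 dimension yields a degenerate tensor with no elements.

```rust
// Editorial sketch: the element count is the product of the shape dimensions.
fn element_count(shape: &[usize]) -> usize {
    shape.iter().product()
}

fn main() {
    assert_eq!(element_count(&[]), 1); // 0D scalar tensor, stored as FixedSizeList<p, 1>
    assert_eq!(element_count(&[3, 0, 4]), 0); // degenerate tensor with a size-0 dimension
    assert_eq!(element_count(&[2, 3, 4]), 24);
}
```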
@@ -257,7 +258,8 @@ dimensions of size 0 are valid (e.g., `np.zeros((3, 0, 4))`). PyTorch supports 0
 v0.4.0 and also allows size-0 dimensions.

 Arrow's Fixed Shape Tensor spec, however, requires at least one dimension (`ndim >= 1`), so 0D
-tensors would need special handling during Arrow conversion (we would likely just panic).
+tensors would need special handling during Arrow conversion (e.g., returning an error or unwrapping
+to a scalar).

 ### Compression

@@ -368,35 +370,35 @@ _Note: This section was Claude-researched._

 ## Unresolved Questions

-- Should `shape` store physical dimensions (matching Arrow) or logical dimensions (matching
-NumPy/PyTorch)? See the [physical vs. logical shape](#physical-vs-logical-shape) discussion in
-the stride section. The current RFC assumes physical shape, but this is not finalized.
+- Should `logical_shape` store logical dimensions (matching NumPy/PyTorch) or physical dimensions
+(matching Arrow)? The RFC currently leans towards logical shape, but this is not finalized. See
+the [physical vs. logical shape](#physical-vs-logical-shape) discussion in the stride section.
 - Are two tensors with different permutations but the same logical values considered equal? This
 affects deduplication and comparisons. The type metadata might be different but the entire tensor
 value might be equal, so it seems strange to say that they are not actually equal?
 - Are there potential tensor-specific compression schemes we can take advantage of?

 ## Future Possibilities

-#### Variable-shape tensors
+### Variable-shape tensors

 Arrow defines a
 [Variable Shape Tensor](https://arrow.apache.org/docs/format/CanonicalExtensions.html#variable-shape-tensor)
 extension type for arrays where each tensor can have a different shape. This would enable workloads
 like batched sequences of different lengths.

-#### Sparse tensors
+### Sparse tensors

 A sparse tensor type could use `List` or `ListView` as its storage type to efficiently represent
 tensors with many zero or absent elements.

-#### A unified `Tensor` type
+### A unified `Tensor` type

 This RFC proposes `FixedShapeTensor` as a single, concrete extension type. However, tensors
 naturally vary along two axes: shape (fixed vs. variable) and density (dense vs. sparse). Both a
 variable-shape tensor (fixed dimensionality, variable shape per element) and a sparse tensor would
-need a different storage type, since it needs to efficiently skip over zero or null regions (and
-for both this would likely be `List` or `ListView`).
+need a different storage type, since it needs to efficiently skip over zero or null regions (and for
+both, this would likely be `List` or `ListView`).

 Each combination would be its own extension type (`FixedShapeTensor`, `VariableShapeTensor`,
 `SparseFixedShapeTensor`, etc.), but this proliferates types and fragments any shared tensor logic.
@@ -408,12 +410,12 @@ with and a single place to define tensor operations.
 For now, `FixedShapeTensor` is the only variant we need. The others can be added incrementally
 as use cases arise.

-#### Tensor-specific encodings
+### Tensor-specific encodings

 Beyond general-purpose compression, encodings tailored to tensor data (e.g., exploiting spatial
 locality across dimensions) could improve compression ratios for specific workloads.

-#### ndindex-style compute expressions
+### ndindex-style compute expressions

 As the extension type expression system matures, we can implement a rich set of tensor indexing and
 slicing operations inspired by [ndindex](https://quansight-labs.github.io/ndindex/index.html),
