 - Start Date: 2026-02-26
-- RFC PR: [vortex-data/rfcs#0021](https://github.com/vortex-data/rfcs/pull/0021)
 - Tracking Issue: [vortex-data/vortex#6719](https://github.com/vortex-data/vortex/issues/6719)
 
 ## Summary
@@ -20,13 +19,13 @@ The key observation is that a list column stored as `(offsets, elements)` is a p
 grouping. Computing `list_sum(list_col)` is a grouped `sum` over the flat elements partitioned
 by offsets. Every aggregate function has a corresponding list scalar function for free:
 
-| Aggregate | List scalar            | Operation                 |
-| --------- | ---------------------- | ------------------------- |
-| `sum`     | `list_sum(list_col)`   | Sum elements per list     |
-| `min`     | `list_min(list_col)`   | Min element per list      |
-| `max`     | `list_max(list_col)`   | Max element per list      |
-| `count`   | `list_count(list_col)` | Count non-null per list   |
-| `mean`    | `list_mean(list_col)`  | Mean of elements per list |
+| Aggregate   | List scalar                | Operation                   |
+| ----------- | -------------------------- | --------------------------- |
+| `sum`       | `list_sum(list_col)`       | Sum elements per list       |
+| `min`       | `list_min(list_col)`       | Min element per list        |
+| `max`       | `list_max(list_col)`       | Max element per list        |
+| `count`     | `list_count(list_col)`     | Count non-null per list     |
+| `mean`      | `list_mean(list_col)`      | Mean of elements per list   |
 | `nan_count` | `list_nan_count(list_col)` | Count NaN elements per list |
 
 Since Vortex does not support shuffling, grouped aggregates only apply to pre-existing groups.
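
As an illustration of the hunk above, the idea that `list_sum(list_col)` is a grouped `sum` over flat elements partitioned by offsets can be sketched in a few lines of Rust. This is a minimal sketch, not the Vortex API: the free function `list_sum`, the `usize` offsets, and the `i64` element type are assumptions made for the example, and null handling is ignored entirely (an empty list simply sums to 0 here).

```rust
/// Hypothetical helper, not part of Vortex: sums each list of a list column
/// stored as `(offsets, elements)`.
///
/// `offsets` holds one more entry than there are lists; list `i` spans
/// `elements[offsets[i]..offsets[i + 1]]`.
fn list_sum(offsets: &[usize], elements: &[i64]) -> Vec<i64> {
    offsets
        .windows(2) // each adjacent pair of offsets delimits one group
        .map(|w| elements[w[0]..w[1]].iter().sum::<i64>())
        .collect()
}
```

With offsets `[0, 3, 3, 5]` and elements `[1, 2, 3, 10, 20]`, the three lists are `[1, 2, 3]`, `[]`, and `[10, 20]`, so the per-list sums are `[6, 0, 30]`. Swapping the `sum` in the closure for another reduction gives the other rows of the table in the same way.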
@@ -124,15 +123,15 @@ Each aggregate declares a `state_dtype` — the type of its intermediate accumul
 State is a single `Scalar` whose dtype matches this declaration. For aggregates with multiple
 fields, use a struct dtype:
 
-| Aggregate    | `state_dtype`                            | Example state value                     |
-| ------------ | ---------------------------------------- | --------------------------------------- |
+| Aggregate    | `state_dtype`                            | Example state value                       |
+| ------------ | ---------------------------------------- | ----------------------------------------- |
 | `Sum`        | `i64` (or widened input type)            | `Scalar(42)` — overflow saturates to null |
-| `Count`      | `u64`                                    | `Scalar(7)`                             |
-| `NanCount`   | `u64`                                    | `Scalar(2)`                             |
-| `Min`        | input element type                       | `Scalar(3)`                             |
-| `Mean`       | `Struct { sum: f64, count: u64 }`        | `Scalar({sum: 10.0, count: 5})`         |
-| `IsConstant` | `Struct { value: T, is_constant: bool }` | `Scalar({value: 5, is_constant: true})` |
-| `IsSorted`   | `Struct { last: T, is_sorted: bool }`    | `Scalar({last: 9, is_sorted: true})`    |
+| `Count`      | `u64`                                    | `Scalar(7)`                               |
+| `NanCount`   | `u64`                                    | `Scalar(2)`                               |
+| `Min`        | input element type                       | `Scalar(3)`                               |
+| `Mean`       | `Struct { sum: f64, count: u64 }`        | `Scalar({sum: 10.0, count: 5})`           |
+| `IsConstant` | `Struct { value: T, is_constant: bool }` | `Scalar({value: 5, is_constant: true})`   |
+| `IsSorted`   | `Struct { last: T, is_sorted: bool }`    | `Scalar({last: 9, is_sorted: true})`      |
 
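
To make the state layouts in the table more concrete, two of them can be modelled with plain Rust types. This is only a sketch under stated assumptions: `MeanState` and `SumState` are hypothetical stand-ins for the `Scalar` states, not types from Vortex, and the field-wise merge rule for `Mean` is assumed rather than taken from the RFC. The `Sum` sketch models "overflow saturates to null" as `None`.

```rust
/// Hypothetical stand-in for the `Mean` state `Struct { sum: f64, count: u64 }`.
#[derive(Debug, Clone, Copy, PartialEq)]
struct MeanState {
    sum: f64,
    count: u64,
}

impl MeanState {
    fn empty() -> Self {
        MeanState { sum: 0.0, count: 0 }
    }

    /// Fold a batch of input values into the state.
    fn accumulate(&mut self, values: &[f64]) {
        self.sum += values.iter().sum::<f64>();
        self.count += values.len() as u64;
    }

    /// Assumed merge rule: combine a partial state field by field.
    fn merge(&mut self, partial: MeanState) {
        self.sum += partial.sum;
        self.count += partial.count;
    }

    /// Final result; `None` when no values were seen.
    fn finish(&self) -> Option<f64> {
        if self.count == 0 {
            None
        } else {
            Some(self.sum / self.count as f64)
        }
    }
}

/// Hypothetical stand-in for the `Sum` state; `None` models the null
/// that an overflowing sum collapses to.
#[derive(Debug, Clone, Copy, PartialEq)]
struct SumState(Option<i64>);

impl SumState {
    /// Merging is addition; overflow (or an already-null side) yields null.
    fn merge(&mut self, partial: SumState) {
        self.0 = match (self.0, partial.0) {
            (Some(a), Some(b)) => a.checked_add(b),
            _ => None,
        };
    }
}
```

Keeping `sum` and `count` separate until `finish` is what lets partial `Mean` states be combined at all: two finished means cannot be merged without knowing how many values each one covers.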
 The `merge` method on `Accumulator` combines a partial state scalar into the currently open
 group. For Sum, this is addition. For IsConstant, this checks whether the incoming value