Goal
Make Vortex statistics pluggable by modeling stats as aggregate-function partials and exposing them through expressions. The concrete success case is demonstrating a Bloom-filter zone-map stat for UTF-8 equality pruning added purely through plugins: a custom aggregate function, scalar function, and rewrite rule, without changing built-in pruning logic.
Direction
Add a nullable stat(expr, AggregateFnRef) expression. It returns the stat value for the current stats scope, or null when unavailable. Falsification should produce normal expressions containing stat(...); simplification/execution decides whether anything is proven.
Keep the first steps small: add the new expression and rewrite APIs beside the existing pruning path. Migrate file stats and zoned stats after the model is tested.
All new stats-facing APIs should live under vortex-array/src/stats/. The scalar function implementation may live with scalar functions, but should be re-exported through vortex_array::stats.
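As a rough illustration of the nullable semantics described above, the following Rust sketch models stat(...) as a lookup that yields None when the stat is unavailable, and prunes only when the falsifying expression evaluates to a definite true. The names Agg, Expr, Scope, Value, eval, and can_prune are hypothetical stand-ins for this sketch and are not the Vortex API.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for an aggregate-function reference (not the real AggregateFnRef).
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
enum Agg {
    Min,
    Max,
}

/// Toy expression tree over i64 columns.
#[derive(Clone, Debug)]
enum Expr {
    Lit(i64),
    /// stat(column, aggregate): a nullable lookup in the current stats scope.
    Stat(String, Agg),
    /// lhs < rhs; null if either side is null.
    Lt(Box<Expr>, Box<Expr>),
}

/// Stats available in the current scope, keyed by (column, aggregate).
type Scope = HashMap<(String, Agg), i64>;

#[derive(Debug, PartialEq)]
enum Value {
    Int(i64),
    Bool(bool),
}

/// Evaluate an expression; None models a null result.
fn eval(expr: &Expr, scope: &Scope) -> Option<Value> {
    match expr {
        Expr::Lit(v) => Some(Value::Int(*v)),
        Expr::Stat(col, agg) => scope
            .get(&(col.clone(), agg.clone()))
            .map(|v| Value::Int(*v)),
        Expr::Lt(l, r) => match (eval(l, scope)?, eval(r, scope)?) {
            (Value::Int(a), Value::Int(b)) => Some(Value::Bool(a < b)),
            _ => None,
        },
    }
}

/// A scope may be pruned only when the falsifier is provably true.
/// A null result (missing stat) proves nothing, so the scope is kept.
fn can_prune(falsifier: &Expr, scope: &Scope) -> bool {
    eval(falsifier, scope) == Some(Value::Bool(true))
}
```

For a predicate x >= 10, the falsifier in this toy model would be stat(x, Max) < 10: a scope whose max is 7 can be pruned, while a scope with no Max stat evaluates to null and is kept.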
Phase 1: Stat Expressions
- StatFn/stat(expr, AggregateFnRef) under vortex-array/src/stats/.
- vortex_array::stats.
- StatFn::new_expr(...) directly in tests and rewrites.
- Min, Max, AllNull, and AllNonNull.
- NullCount as a legacy bridge for existing stats, not as a pruning proof aggregate.
- Stat slots. (Centralize aggregate stat bridge #7931; Add pruning aggregate functions #7934)
- Min and Max map separately to Stat::Min and Stat::Max.
- StatFn read existing legacy stats through the aggregate-to-Stat mapping. (Centralize aggregate stat bridge #7931; Add pruning aggregate functions #7934)
- StatFn works for flat arrays and chunked arrays. (Remove chunked special case from stat execution #7928)
- StatFn::new_expr(...) reading legacy pruning stats. (Add NullCount aggregate function #7933; Add pruning aggregate functions #7934)
Phase 2: Rewrite Registry
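A session-scoped rewrite registry of the kind this phase describes could be shaped roughly like the sketch below. Everything here is an assumption made for illustration: Pred, FalsifyRule, RewriteRegistry, and MinMaxEq are hypothetical names, and stats expressions are kept as plain text rather than real expression nodes. How several falsifiers for one predicate get combined is deliberately left open.

```rust
/// Toy predicate type; a real expression tree would be richer.
#[derive(Clone, Debug, PartialEq)]
enum Pred {
    /// column == literal
    Eq(String, i64),
    /// Opaque stats expression produced by a rule (kept as text in this sketch).
    StatExpr(String),
}

/// Hypothetical rule trait: return a falsifier if the rule understands the predicate.
trait FalsifyRule {
    fn falsify(&self, pred: &Pred) -> Option<Pred>;
}

/// Hypothetical registry held in session state; plugins register rules into it.
#[derive(Default)]
struct RewriteRegistry {
    rules: Vec<Box<dyn FalsifyRule>>,
}

impl RewriteRegistry {
    fn register(&mut self, rule: Box<dyn FalsifyRule>) {
        self.rules.push(rule);
    }

    /// Collect every falsifier that any registered rule produces for `pred`.
    fn falsify(&self, pred: &Pred) -> Vec<Pred> {
        self.rules.iter().filter_map(|r| r.falsify(pred)).collect()
    }
}

/// Example rule: equality cannot hold when the literal lies outside [min, max].
struct MinMaxEq;

impl FalsifyRule for MinMaxEq {
    fn falsify(&self, pred: &Pred) -> Option<Pred> {
        match pred {
            Pred::Eq(col, v) => Some(Pred::StatExpr(format!(
                "stat({col}, Min) > {v} OR stat({col}, Max) < {v}"
            ))),
            _ => None,
        }
    }
}
```

Because rules are looked up through the registry, a plugin can add a new rule without touching the predicates or the other rules.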
- vortex-array/src/stats/session.rs for stats rewrite session state. (Add stats rewrite session API #7930)
- vortex-array/src/stats/rewrite.rs for rewrite traits and helpers. (Add stats rewrite session API #7930)
- VortexSession. (Add stats rewrite session API #7930)
- OR.
- Expression::falsify(session). (Add stats rewrite session API #7930)
Phase 3: Built-In Rewrite Rules
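The min/max reasoning behind comparison rewrites can be sketched as a small truth function. This is a simplified model over i64 values, not the built-in rule set: Op and proves_false are hypothetical names, and a missing stat (None) is treated as proving nothing.

```rust
/// Comparison operators of the form `column <op> literal`.
#[derive(Clone, Copy, Debug)]
enum Op {
    Eq,
    Lt,
    LtEq,
    Gt,
    GtEq,
}

/// Returns true only when min/max stats prove that `column <op> lit`
/// is false for every row in scope. A missing stat never proves anything.
fn proves_false(op: Op, lit: i64, min: Option<i64>, max: Option<i64>) -> bool {
    match op {
        // column > lit is impossible when max <= lit.
        Op::Gt => max.map_or(false, |m| m <= lit),
        // column >= lit is impossible when max < lit.
        Op::GtEq => max.map_or(false, |m| m < lit),
        // column < lit is impossible when min >= lit.
        Op::Lt => min.map_or(false, |m| m >= lit),
        // column <= lit is impossible when min > lit.
        Op::LtEq => min.map_or(false, |m| m > lit),
        // column == lit is impossible when lit lies outside [min, max].
        Op::Eq => min.map_or(false, |m| m > lit) || max.map_or(false, |m| m < lit),
    }
}
```

With min = 10 and max = 20, the predicate column > 20 is proven false, while column > 19 is not, since a row equal to 20 could still match.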
- StatFn::new_expr(...). (Add built-in stats rewrite rules #7935)
- stat_falsification implementations.
- StatsCatalog.
- and/or rewrites. (Add built-in stats rewrite rules #7935)
- between. (Add built-in stats rewrite rules #7935)
- is_null and is_not_null. (Add built-in stats rewrite rules #7935)
- like. (Add built-in stats rewrite rules #7935)
- list_contains. (Add built-in stats rewrite rules #7935)
Phase 4: Zoned Layout Migration
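One way to picture lowering in this phase: a stat(x, Max) reference is replaced by the matching per-zone column of the zone stats table, and the falsifier is then evaluated once per zone. The sketch below hard-codes a single falsifier, stat(x, Max) < lit, and uses hypothetical names (ZoneStats, max_x, prune_mask); it is not how ZoneMapLayout is implemented.

```rust
/// Per-zone stats table: one entry per zone.
/// None means the stat is missing for that zone.
struct ZoneStats {
    max_x: Vec<Option<i64>>,
}

/// Evaluate the falsifier `stat(x, Max) < lit` against every zone.
/// The result is a mask where true means the zone can be skipped.
fn prune_mask(zones: &ZoneStats, lit: i64) -> Vec<bool> {
    zones
        .max_x
        .iter()
        // A missing stat yields false: the zone must still be read.
        .map(|m| matches!(m, Some(v) if *v < lit))
        .collect()
}
```

Zones with a missing stat are never pruned, which matches the nullable reading of stat(...) in the Direction section.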
- ZoneMapLayout to lower StatFn expressions against existing zone stats. (Teach zoned pruning to lower StatFn #7937)
- Expression::falsify(session). (Teach zoned pruning to lower StatFn #7937)
- StatsCatalog zoned pruning path once StatFn lowering covers existing behavior. (Teach zoned pruning to lower StatFn #7937)
Phase 5: Aggregate-Function Zoned Stats
WARNING: this is the phase that changes the ZonedLayout serialized form
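The compatibility step in this phase, turning a legacy Stat bitset into aggregate descriptor strings, might look like the following sketch. The bit positions and the descriptor strings ("min", "max", "null_count") are assumptions invented for the example, as are the ZonedMetadata and upgrade_legacy names; only the zone_len and present_aggregates field names come from the plan.

```rust
/// Hypothetical legacy stat slots: (bit position, descriptor string).
/// Neither the bit order nor the strings reflect the real encoding.
const LEGACY_SLOTS: [(u32, &str); 3] = [(0, "min"), (1, "max"), (2, "null_count")];

/// New-style metadata modeled on the `zone_len` + `present_aggregates` shape.
#[derive(Debug, PartialEq)]
struct ZonedMetadata {
    zone_len: u32,
    present_aggregates: Vec<String>,
}

/// Convert legacy metadata (zone_len followed by a stat bitset)
/// into descriptor-based metadata.
fn upgrade_legacy(zone_len: u32, stat_bits: u8) -> ZonedMetadata {
    let mut present_aggregates = Vec::new();
    for (bit, name) in LEGACY_SLOTS {
        // Keep the descriptor for every bit that is set in the legacy bitset.
        if (stat_bits & (1u8 << bit)) != 0 {
            present_aggregates.push(name.to_string());
        }
    }
    ZonedMetadata {
        zone_len,
        present_aggregates,
    }
}
```

Under these assumed slots, a bitset of 0b101 would upgrade to the descriptors min and null_count.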
- AggregateFnRef, not Stat enum values.
- Display for AggregateFnRef as the descriptor string.
- Stat only as a compatibility bridge for existing array stats and legacy zoned metadata.
- zone_len followed by a legacy Stat bitset.
- zone_len and present_aggregates: repeated string.
- Stat bitsets into built-in aggregate descriptor strings.
- stat(expr, aggregate_fn) at read time by matching aggregate-function descriptors in the zone stats table. (Use aggregate descriptors for zoned stats #7938)
- Stat enum values. (Use aggregate descriptors for zoned stats #7938)
Phase 6: Plugin Bloom Proof
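To make the Bloom proof concrete, here is a minimal, standard-library-only sketch of a Bloom filter treated as an aggregate partial (insert to accumulate, merge to combine partials) plus a bloom_might_contain check. The Bloom type, its fields, and the hashing scheme are assumptions for illustration; a real plugin would choose its own sizing, hashing, and serialized form. The sketch assumes num_bits > 0 and hashes > 0.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical Bloom-filter partial for one zone.
struct Bloom {
    bits: Vec<bool>,
    hashes: u64,
}

impl Bloom {
    fn new(num_bits: usize, hashes: u64) -> Self {
        Bloom {
            bits: vec![false; num_bits],
            hashes,
        }
    }

    /// Derive the bit index for `value` under the given hash seed.
    fn index(&self, value: &str, seed: u64) -> usize {
        let mut h = DefaultHasher::new();
        seed.hash(&mut h);
        value.hash(&mut h);
        (h.finish() % self.bits.len() as u64) as usize
    }

    /// Accumulate step: record one value.
    fn insert(&mut self, value: &str) {
        for seed in 0..self.hashes {
            let i = self.index(value, seed);
            self.bits[i] = true;
        }
    }

    /// Merge step: combine two partials with a bitwise OR.
    fn merge(&mut self, other: &Bloom) {
        assert_eq!(self.bits.len(), other.bits.len());
        assert_eq!(self.hashes, other.hashes);
        for (a, b) in self.bits.iter_mut().zip(&other.bits) {
            *a |= *b;
        }
    }
}

/// False means the value is definitely absent; true means it may be present.
fn bloom_might_contain(filter: &Bloom, value: &str) -> bool {
    (0..filter.hashes).all(|seed| filter.bits[filter.index(value, seed)])
}
```

For equality pruning, a zone could be skipped for column == lit whenever bloom_might_contain returns false for lit. A true result proves nothing, because Bloom filters allow false positives but never false negatives.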
- bloom_might_contain(filter, value) scalar function.
Phase 7: Satisfaction Follow-Up
- OR.
Phase 8: Cleanup
- StatsCatalog pruning path.
Status
In progress.