
Tracking Issue: Duckdb projection expression pushdown #8310

@myrrc

We want to push down scalar (and, in the future, aggregate) functions that appear in the SELECT list into Vortex.

Example: SELECT strlen(col) doesn't need to decompress the strings.
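
To illustrate why a length projection can skip materializing strings, here is a minimal sketch under an assumed toy layout: one payload buffer plus n+1 offsets. This is not Vortex's FSST or Dict encoding, and `byte_lengths_from_offsets` is a hypothetical helper; the point is only that lengths can come from metadata while the payload stays untouched.

```python
from typing import List


def byte_lengths_from_offsets(offsets: List[int]) -> List[int]:
    """Per-row byte lengths for a column stored as a payload buffer plus
    n+1 non-decreasing offsets. The payload bytes are never read."""
    return [end - start for start, end in zip(offsets, offsets[1:])]


# Hypothetical column holding "hello", "", "vortex".
payload = b"hellovortex"   # present only for context; not used below
offsets = [0, 5, 5, 11]

lengths = byte_lengths_from_offsets(offsets)
print(lengths)
```

A compressed encoding that records the uncompressed length of each value could answer the same query in a similar way, which is the kind of shortcut expression pushdown is meant to enable.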

DuckDB PR for type pushdown, which uses the same mechanism: duckdb/duckdb#22788
Vortex draft PR with byte_length pushdown: #8303

Issues to solve:

  • byte_length returns u64, but DuckDB expects i64. The resulting cast() decompressed the strings, although it shouldn't.
    Cause: the cast reduce rule for dict evaluated validity(), which triggered the decompression. Solved by adding validity() to byte_length().
  • Dict reader values use SharedArray, which decompresses strings into VarBinView, so we have no way to access FSST directly.
    This boils down to whether we want to push an expression (or a subexpression tree) down to the Dict values cache to avoid the cost of recanonicalization. See: Push down some expressions to Dict layout reader's cached values #8341
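
The second point could be sketched roughly as follows. This is a toy model, not the Vortex API: `DictValuesCache` and `project_chunk` are hypothetical names, and a plain Python list stands in for the cached dictionary values. The idea is that an expression is evaluated once over the dictionary values, and every later chunk only gathers results through its codes.

```python
from typing import Callable, Dict, List, Optional


class DictValuesCache:
    """Hypothetical stand-in for a dict layout reader's cached values."""

    def __init__(self, values: List[bytes]) -> None:
        self.values = values
        self._results: Dict[str, List[int]] = {}
        self.evaluations = 0  # number of full passes over the values

    def evaluate(self, name: str, fn: Callable[[bytes], int]) -> List[int]:
        # Run the expression over the values only the first time it is seen.
        if name not in self._results:
            self.evaluations += 1
            self._results[name] = [fn(v) for v in self.values]
        return self._results[name]


def project_chunk(
    cache: DictValuesCache,
    codes: List[Optional[int]],
    name: str,
    fn: Callable[[bytes], int],
) -> List[Optional[int]]:
    """Gather per-value results through the codes; None marks a null row."""
    per_value = cache.evaluate(name, fn)
    return [None if code is None else per_value[code] for code in codes]


cache = DictValuesCache([b"a", b"bcd", b""])
chunk1 = project_chunk(cache, [0, 1, None, 1], "byte_length", len)
chunk2 = project_chunk(cache, [2, 2, 0], "byte_length", len)
print(chunk1, chunk2, cache.evaluations)
```

In this sketch nullness is carried by the codes, so the per-value results never need a validity check. Whether the real cache should be keyed by whole expressions or by subexpression trees is the open question in #8341.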

Labels

  • ext/duckdb: Relates to the DuckDB integration
  • tracking-issue: Shared implementation context for work likely to span multiple PRs.
