I’d like to restart the partitioning side of this discussion separately from the dynamic-filter PRs and threads (see #21207 for more background).
The main issue seems to be that DataFusion cannot always represent the physical partitioning that some data sources actually have. In our case, range-partitioned data has had to claim Partitioning::Hash, which lets us avoid repartitioning but makes later optimizer and dynamic-filter decisions brittle due to this false advertisement.
Today we are in this scenario:
Data is actually range-partitioned:
- partition 0: key < 100
- partition 1: 100 <= key < 200
- partition 2: 200 <= key < 300
To mimic this we tell DataFusion our partitioning properties are:
- Partitioning::Hash([key], 3)
Those are not the same partitioning scheme.
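To make the mismatch concrete, here is a minimal sketch. `range_partition` follows the layout above. `toy_hash_partition` is a hypothetical stand-in (key modulo partition count) and is not DataFusion's actual hash function; any real hash would scatter keys in a similar way.

```rust
/// Range layout from the example above; upper bounds are exclusive.
/// Returns None for keys >= 300, which no partition in the example covers.
fn range_partition(key: i64) -> Option<usize> {
    const UPPER_BOUNDS: [i64; 3] = [100, 200, 300];
    UPPER_BOUNDS.iter().position(|&upper| key < upper)
}

/// Illustrative placeholder for hash partitioning (assumption: NOT the
/// hash DataFusion uses). Maps a key to `key mod partitions`.
fn toy_hash_partition(key: i64, partitions: usize) -> usize {
    key.rem_euclid(partitions as i64) as usize
}
```

With this placeholder, keys 10 and 11 share range partition 0 but land in different hash partitions, so a consumer that trusts the Hash claim would reason about the wrong layout.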
The general goal is for DataFusion to describe physical partitioning truthfully, to compare two partitionings for compatibility, and to use that information only when compatibility is proven.
Dynamic filtering illustrates this well:
Build side partitioning:
p0: key < 100
p1: 100 <= key < 200
p2: 200 <= key < 300
Probe side partitioning:
p0: key < 100
p1: 100 <= key < 200
p2: 200 <= key < 300
These are compatible because every partition X on the build side covers the same key range as partition X on the probe side. With this information, optimizations can safely use partition-local behavior:
if partitioning is compatible:
    use partition-local filter (more selective)
else:
    use global/safe fallback
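As a rough Rust sketch of that decision, assuming the dynamic filter is a simple min/max bound on the join key: `KeyBounds`, `ProbeFilter`, and `choose_probe_filter` are hypothetical names for illustration and do not correspond to existing DataFusion types.

```rust
/// Min/max of the join key observed in one build-side partition (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
struct KeyBounds {
    min: i64,
    max: i64,
}

/// Which filter gets pushed to a probe-side partition (hypothetical type).
#[derive(Debug, PartialEq)]
enum ProbeFilter {
    /// Bounds from the matching build partition only (more selective).
    PartitionLocal(KeyBounds),
    /// Bounds covering all build partitions (safe fallback).
    Global(KeyBounds),
}

/// Picks the filter for probe partition `partition`.
/// Returns None if there are no build-side bounds to use.
fn choose_probe_filter(
    compatible: bool,
    partition: usize,
    build_bounds: &[KeyBounds],
) -> Option<ProbeFilter> {
    if compatible {
        // Only valid when partition X on both sides is proven to line up.
        build_bounds
            .get(partition)
            .cloned()
            .map(ProbeFilter::PartitionLocal)
    } else {
        // Union of all build partitions: never drops a matching probe row.
        let min = build_bounds.iter().map(|b| b.min).min()?;
        let max = build_bounds.iter().map(|b| b.max).max()?;
        Some(ProbeFilter::Global(KeyBounds { min, max }))
    }
}
```

The important property is that the partition-local branch is reachable only when compatibility has been established; a false Hash claim would take that branch without that guarantee.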
A possible direction is to evolve Partitioning toward an extensible abstraction, such as a PhysicalPartitioning trait, and add range partitioning as the first use case. Longer term, built-ins like hash partitioning could move behind the same abstraction.
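To give a feel for the shape this could take, here is one possible sketch. Everything in it is an assumption for discussion: the method names (`partition_count`, `as_any`, `is_compatible_with`), the `RangePartitioning` and `HashPartitioning` structs, and the choice to compare columns by name are illustrative and not an existing DataFusion API.

```rust
use std::any::Any;
use std::fmt::Debug;

/// Hypothetical extensible description of a physical partitioning.
trait PhysicalPartitioning: Debug {
    fn partition_count(&self) -> usize;
    /// Allows implementations to downcast the other side when comparing.
    fn as_any(&self) -> &dyn Any;
    /// Conservative check: true only when partition X of `self` is proven
    /// to hold the same keys as partition X of `other`.
    fn is_compatible_with(&self, other: &dyn PhysicalPartitioning) -> bool;
}

/// Range partitioning on one column with exclusive upper bounds,
/// e.g. [100, 200, 300] for the layout in the example above.
#[derive(Debug, PartialEq)]
struct RangePartitioning {
    column: String, // compared by name here purely for simplicity
    upper_bounds: Vec<i64>,
}

impl PhysicalPartitioning for RangePartitioning {
    fn partition_count(&self) -> usize {
        self.upper_bounds.len()
    }
    fn as_any(&self) -> &dyn Any {
        self
    }
    fn is_compatible_with(&self, other: &dyn PhysicalPartitioning) -> bool {
        // Unknown or different schemes are never assumed compatible.
        other
            .as_any()
            .downcast_ref::<RangePartitioning>()
            .map_or(false, |o| self == o)
    }
}

/// Hash partitioning expressed behind the same trait (the longer-term idea).
#[derive(Debug, PartialEq)]
struct HashPartitioning {
    columns: Vec<String>,
    partitions: usize,
}

impl PhysicalPartitioning for HashPartitioning {
    fn partition_count(&self) -> usize {
        self.partitions
    }
    fn as_any(&self) -> &dyn Any {
        self
    }
    fn is_compatible_with(&self, other: &dyn PhysicalPartitioning) -> bool {
        other
            .as_any()
            .downcast_ref::<HashPartitioning>()
            .map_or(false, |o| self == o)
    }
}
```

In this sketch a range partitioning never reports compatibility with a hash partitioning, even when the partition counts match, which is exactly the distinction the current Hash workaround erases.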
This would give us a cleaner foundation for:
Does this direction seem interesting to others in the community, and are there any thoughts on this proposal?
cc: @NGA-TRAN @alamb @jayshrivastava @gabotechs @adriangb