perf: Improve ConstantVector hashing in OptimizedVectorHasher#2014

Open
yingsu00 wants to merge 1 commit into
IBM:optimized_partitionedoutputfrom
yingsu00:ConstantVectorHashing

@yingsu00 yingsu00 commented May 9, 2026

After this change, OptimizedVectorHasherBenchmark shows a 150x speedup over the legacy VectorHasher when hashing ConstantVectors.

@yingsu00 yingsu00 requested a review from xin-zhang2 May 9, 2026 21:59
@yingsu00 yingsu00 self-assigned this May 9, 2026
if (!decoded_.isConstantMapping()) {
return std::nullopt;
}

I think we also need an early return when decoded_.size() == 0
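
For illustration, the suggested guard could look roughly like the sketch below. This is a minimal, self-contained stand-in and not the code from this PR: `FakeDecoded`, `valueAt`, and `tryHashConstant` are hypothetical names, and `std::hash` is only a placeholder for whatever hash the real hasher uses. Only `isConstantMapping()` and `size()` come from the thread above.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <optional>

// Hypothetical stand-in for the decoded vector in the diff. Only the two
// accessors mentioned in this thread are modeled; valueAt is an assumption.
struct FakeDecoded {
  bool constantMapping;
  std::size_t numRows;
  std::int64_t constantValue;

  bool isConstantMapping() const { return constantMapping; }
  std::size_t size() const { return numRows; }
  std::int64_t valueAt(std::size_t /*row*/) const { return constantValue; }
};

// Returns the single hash shared by every row, or std::nullopt when the
// constant fast path does not apply.
std::optional<std::uint64_t> tryHashConstant(const FakeDecoded& decoded) {
  if (!decoded.isConstantMapping()) {
    return std::nullopt;
  }
  // Suggested extra guard: with zero rows there is no value at index 0.
  if (decoded.size() == 0) {
    return std::nullopt;
  }
  // Placeholder hash; the real hasher may use a different function.
  return static_cast<std::uint64_t>(
      std::hash<std::int64_t>{}(decoded.valueAt(0)));
}
```

With this shape, the caller falls back to the general path whenever `std::nullopt` is returned, so an empty input never reads row 0.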


@xin-zhang2 xin-zhang2 left a comment


Looks good. The unit tests for ConstantVector are still using the hash function. Can you also update them?
