Skip to content

Add benchmarks for dictionary path of new_group_values#22004

Open
Rich-T-kid wants to merge 2 commits intoapache:mainfrom
Rich-T-kid:rich-T-kid/dictionary-criterion-benchmarks
Open

Add benchmarks for dictionary path of new_group_values#22004
Rich-T-kid wants to merge 2 commits intoapache:mainfrom
Rich-T-kid:rich-T-kid/dictionary-criterion-benchmarks

Conversation

@Rich-T-kid
Copy link
Copy Markdown
Contributor

Which issue does this PR close?

benchmarks for #21765. Also related to #21860
The goal is to merge this PR and then rebase the branch on #21765 to contain these benchmarks, so that they can be run and compared to the original.

Rationale for this change

Originally this was included in #21765 but that PR is already very large. I decided to move it to its own separate PR

What changes are included in this PR?

Adds benchmarks for the dictionary encoding array path of new_group_values().

Are these changes tested?

n/a

Are there any user-facing changes?

no

@github-actions github-actions Bot added the physical-plan Changes to the physical-plan crate label May 3, 2026

for &size in &SIZES {
let mut cards = CARDS_RELATIVE.to_vec();
cards.push(size); // all-unique stress case
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For size == 8192, this adds 8192 again because CARDS_RELATIVE already has 8 * 1024. Criterion needs unique benchmark IDs, so this benchmark panics before it can run. Please dedupe cards or remove 8 * 1024 from CARDS_RELATIVE.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

replaced 8*1024 with 1000

//! column. Each iteration is timed end-to-end: it constructs the
//! `Box<dyn GroupValues>` returned by `new_group_values`, runs `intern`
//! once (or N times), and then `emit(EmitTo::All)`.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This says GroupValues construction is timed, but new_group_values is inside the iter_batched_ref setup closure, so Criterion does not measure it, please update the comment to say the measured part is intern + emit.

@Rich-T-kid
Copy link
Copy Markdown
Contributor Author

@kumarUjjawal Just updated it, let me know if theres anything else I should change

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

physical-plan Changes to the physical-plan crate

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants