Skip to content

Introduce CachingCollectorManager to parallelize search when using CachingCollector#16247

Open
gaobinlong wants to merge 15 commits into
apache:mainfrom
gaobinlong:cachingCollectorManager
Open

Introduce CachingCollectorManager to parallelize search when using CachingCollector#16247
gaobinlong wants to merge 15 commits into
apache:mainfrom
gaobinlong:cachingCollectorManager

Conversation

@gaobinlong

@gaobinlong gaobinlong commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Description

This PR introduces CachingCollectorManager, switches GroupingSearch to use search concurrency and move away from the deprecated search(Query, Collector) method.

In addition, remove experimental GroupingSearch constructor that takes a GroupSelector as argument in favor of providing a GroupSelector supplier.

Relates to #12892.

…chingCollector

Signed-off-by: Binlong Gao <gbinlong@amazon.com>
Signed-off-by: Binlong Gao <gbinlong@amazon.com>

@javanna javanna left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great to see movement in this area, thanks for working on this @gaobinlong ! I left some comments

Comment thread lucene/core/src/java/org/apache/lucene/search/CachingCollectorManager.java Outdated
Comment thread lucene/core/src/java/org/apache/lucene/search/CachingCollectorManager.java Outdated
Comment thread lucene/core/src/java/org/apache/lucene/search/CachingCollectorManager.java Outdated
Comment thread lucene/grouping/src/java/org/apache/lucene/search/grouping/GroupingSearch.java Outdated
Comment thread lucene/core/src/java/org/apache/lucene/search/CachingCollectorManager.java Outdated
Signed-off-by: Binlong Gao <gbinlong@amazon.com>
@github-actions github-actions Bot modified the milestones: 10.5.0, 11.0.0 Jun 17, 2026
@gaobinlong

Copy link
Copy Markdown
Contributor Author

@javanna all comments are addressed yet, please help to review again, thanks!

@javanna javanna left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a couple more comments, this is close! Thanks again!

@javanna javanna modified the milestones: 10.5.0, 10.6.0 Jun 25, 2026

@javanna javanna left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all the work, I left a new batch of comments, I 'd expect these to be the last ones.

Comment thread lucene/CHANGES.txt Outdated
Signed-off-by: Binlong Gao <gbinlong@amazon.com>
Signed-off-by: Binlong Gao <gbinlong@amazon.com>

@javanna javanna left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

last remarks. LGTM otherwise

Comment thread lucene/CHANGES.txt Outdated
Collection<SearchGroup<T>> topSearchGroups =
(Collection<SearchGroup<T>>) firstRoundResults[resultIdx++];
if (topSearchGroups.isEmpty()) {
return new TopGroups<>(new SortField[0], new SortField[0], 0, 0, new GroupDocs[0], Float.NaN);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when we exit early in this scenario, matchingGroups and matchingGroupHeads are not set. I suspect this will cause NPEs down the line for users relying on those two being always non-null.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Set them to non-null values and added some tests.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you could have set them to actual values being computed, as opposed to always empty?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That calls for either adjusting the test you added or introducing a new that hits that scenario where with the current changes we lose info when we hit the isEmpty conditional.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed that, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants