Skip to content

Introduce LargeNumHitsTopDocsCollectorManager#15660

Open
gaobinlong wants to merge 6 commits into
apache:mainfrom
gaobinlong:large
Open

Introduce LargeNumHitsTopDocsCollectorManager#15660
gaobinlong wants to merge 6 commits into
apache:mainfrom
gaobinlong:large

Conversation

@gaobinlong
Copy link
Copy Markdown
Contributor

Description

This pr introduces LargeNumHitsTopDocsCollectorManager and switches TestLargeNumHitsTopDocsCollector.java to use search concurrency and move away from the deprecated search(Query, Collector) method.

Relates to #12892.

Signed-off-by: Binlong Gao <gbinlong@amazon.com>
Signed-off-by: Binlong Gao <gbinlong@amazon.com>
@github-actions github-actions Bot added this to the 11.0.0 milestone Feb 3, 2026
@github-actions
Copy link
Copy Markdown
Contributor

This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the dev@lucene.apache.org list. Thank you for your contribution!

@github-actions github-actions Bot added the Stale label Feb 18, 2026
@javanna javanna modified the milestones: 11.0.0, 10.5.0 May 1, 2026
Copy link
Copy Markdown
Contributor

@javanna javanna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a couple of comments, thanks for working on this!

final TopDocs[] topDocs = new TopDocs[collectors.size()];
int i = 0;
for (LargeNumHitsTopDocsCollector collector : collectors) {
topDocs[i++] = collector.topDocs();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will throw if numHits is 0. It's an edge case. Also an existing small issue when interacting directly with the collector. Should we reject 0 and negative values in the constructor? Would make sense to do the same in the collector as well, perhaps as a follow-up.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe we also need to handle the case where a collector returns 0 hits. This is really an existing problem with the collector impl. topDocs should perhaps return empty top docs in that case instead of throwing. We need a test for that scenario as well.

Copy link
Copy Markdown
Contributor Author

@gaobinlong gaobinlong May 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we reject 0 and negative values in the constructor

Yes, we already have this logic.

I believe we also need to handle the case where a collector returns 0 hits.

Handled that case and add a new test for it.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That looks good. I wonder if we should have modified the collector though to not throw. What do you think?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense, we should make the behavior consistent between the collector and the collector manager, I'll open another PR for this change.

Comment thread lucene/CHANGES.txt Outdated
@github-actions github-actions Bot removed the Stale label May 2, 2026
Signed-off-by: Binlong Gao <gbinlong@amazon.com>
Signed-off-by: Binlong Gao <gbinlong@amazon.com>
Signed-off-by: Binlong Gao <gbinlong@amazon.com>
Copy link
Copy Markdown
Contributor

@javanna javanna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One minor comment on testing. Thanks!

writer.commit();
writer.addDocument(new Document());
writer.commit();
writer.close();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you clarify why we need a different directory, writer, reader and also a custom executor in this test, compared to the other tests?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The index writer used here is to make sure we generate 2 segments, one segment would have non-zero search hits, another segment would have zero search hits because no document containing the search field; the custom executor is to force one collector per segment, if not the 2 segments may share one collector, then the zero search hits in one collector test case wouldn't be covered.

final TopDocs[] topDocs = new TopDocs[collectors.size()];
int i = 0;
for (LargeNumHitsTopDocsCollector collector : collectors) {
topDocs[i++] = collector.topDocs();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That looks good. I wonder if we should have modified the collector though to not throw. What do you think?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants