Add maxValueCount to DocValuesSkipper metadata #15993

Open
iprithv wants to merge 1 commit into apache:main from iprithv:dv-skipper-max-count

iprithv commented Apr 28, 2026

Description

resolves #15794

  • Add DocValuesSkipper#maxValueCount() so consumers can determine whether a doc values field is definitely single-valued without unwrapping iterators.
  • Persist exact global maxValueCount metadata in Lucene90 doc values skipper metadata for new segments.
  • Add validation and test coverage for empty, single-valued, and multi-valued skipper fields.
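
As a rough illustration of the first bullet, a consumer-side check might look like the sketch below. `SkipperView` is a hypothetical stand-in interface, not Lucene's `DocValuesSkipper` class; only the `maxValueCount()` method name comes from this PR, and treating a `null` skipper as "no skip index available" is an assumption of the sketch.

```java
/** Sketch only: SkipperView is a hypothetical stand-in, not Lucene's DocValuesSkipper. */
final class SingleValuedCheck {

  /** Hypothetical minimal view of the skipper metadata discussed in this PR. */
  interface SkipperView {
    /** Upper bound on the number of values any single document has for the field. */
    int maxValueCount();
  }

  /**
   * Returns true only when the metadata proves that no document has more than one value.
   * An empty field (bound of 0) qualifies trivially. A null skipper (assumed here to mean
   * "no skip index") proves nothing, so the answer is false.
   */
  static boolean isDefinitelySingleValued(SkipperView skipper) {
    return skipper != null && skipper.maxValueCount() <= 1;
  }

  public static void main(String[] args) {
    System.out.println(isDefinitelySingleValued(() -> 1));                 // true
    System.out.println(isDefinitelySingleValued(() -> 3));                 // false
    System.out.println(isDefinitelySingleValued(() -> Integer.MAX_VALUE)); // false
    System.out.println(isDefinitelySingleValued(null));                    // false
  }
}
```

A `false` result here means "not proven single-valued", not "multi-valued", so a caller would still need a general code path for that case.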

Compatibility

Older Lucene90 segments do not have this metadata, so maxValueCount() returns a safe upper bound: 0 for empty fields and Integer.MAX_VALUE otherwise. Exact values become available once segments are rewritten or merged with the new format.
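
The fallback rule above can be sketched as follows. This is not the PR's reader code: the `NOT_STORED` sentinel, the `resolve` helper, and the use of `docCount == 0` to detect an empty field are all assumptions made for illustration.

```java
/** Sketch only: the names below are hypothetical, not Lucene's reader code. */
final class LegacyMaxValueCount {

  /** Sentinel used by this sketch for "segment written before the metadata existed". */
  static final int NOT_STORED = -1;

  /**
   * Resolves the value a reader could report. storedMaxValueCount is NOT_STORED for older
   * segments; docCount is the number of documents that have a value for the field.
   */
  static int resolve(int storedMaxValueCount, int docCount) {
    if (storedMaxValueCount != NOT_STORED) {
      return storedMaxValueCount; // exact value written by the new format
    }
    // Older segment: an empty field has no values; otherwise the true maximum is unknown,
    // so report the largest possible bound.
    return docCount == 0 ? 0 : Integer.MAX_VALUE;
  }

  public static void main(String[] args) {
    System.out.println(resolve(NOT_STORED, 0));  // 0
    System.out.println(resolve(NOT_STORED, 42)); // 2147483647
    System.out.println(resolve(2, 42));          // 2
  }
}
```

Returning `Integer.MAX_VALUE` keeps the method a valid upper bound in every case, so callers that only test `maxValueCount() <= 1` stay correct on old segments and simply miss the optimization until the segment is rewritten.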

jainankitk left a comment

I am wondering whether this information needs to be stored as part of DocValuesSkipper, since it should be a single value per field per segment, right? Also, PR #15737 might be related.

iprithv commented May 1, 2026

> I am wondering whether this information needs to be stored as part of DocValuesSkipper, since it should be a single value per field per segment, right? Also, PR #15737 might be related.

Thanks for taking a look.

Yes, maxValueCount is a single global value per field per segment. This PR stores it that way: it is part of the Lucene90 DocValuesSkipperEntry field metadata, alongside the existing global minValue, maxValue, docCount, and maxDocId. It is not stored per skip interval/block.
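
To make the "one value per field per segment" point concrete, here is a minimal sketch of accumulating those field-level statistics while a segment's documents are visited. `FieldStatsSketch` and `addDocument` are hypothetical names, the initial values for an empty field are assumptions, and none of this reflects Lucene's actual writer code or on-disk layout.

```java
/** Sketch only: a hypothetical accumulator, not Lucene's writer or on-disk format. */
final class FieldStatsSketch {
  long minValue = Long.MAX_VALUE;  // assumed placeholder while the field is empty
  long maxValue = Long.MIN_VALUE;  // assumed placeholder while the field is empty
  int docCount = 0;                // documents that have at least one value
  int maxDocId = -1;               // highest doc ID that has a value
  int maxValueCount = 0;           // largest number of values on any single document

  /** Records all values of one document; passing no values means the document has none. */
  void addDocument(int docId, long... values) {
    if (values.length == 0) {
      return; // documents without a value do not affect the statistics
    }
    docCount++;
    maxDocId = Math.max(maxDocId, docId);
    maxValueCount = Math.max(maxValueCount, values.length);
    for (long v : values) {
      minValue = Math.min(minValue, v);
      maxValue = Math.max(maxValue, v);
    }
  }

  public static void main(String[] args) {
    FieldStatsSketch stats = new FieldStatsSketch();
    stats.addDocument(0, 5L);
    stats.addDocument(1);             // no value for this document
    stats.addDocument(3, 1L, 9L, 4L);
    System.out.println(stats.maxValueCount); // 3
    System.out.println(stats.docCount);      // 2
  }
}
```

Each statistic is a single number for the whole field, which is why it fits next to the other global values rather than inside each skip interval.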

I put it on DocValuesSkipper because this is where Lucene already exposes doc-values-derived segment-level metadata for fields that have a skip index. Putting it in FieldInfo would make it look like schema/static field metadata, but it is actually a computed per-segment doc values statistic and may be unavailable when no skipper is present.

iprithv requested a review from jainankitk May 1, 2026 17:42