The combination of bitmap terms lookup + additional filter + text match produces wrong results, and force merge fixes it. #21781

@mw-vvasudev

Describe the bug

Bitmap terms lookup (value_type: "bitmap") produces incorrect results (0 hits) when the target index has multiple segments and the query combines the bitmap filter with additional filter clauses and a text match query. After force-merging the target index to a single segment, the same query returns correct results.

Related component

Search:Performance

To Reproduce

  1. Create two indices: index_main (with multiple segments) and index_user (containing bitmap data).
  2. index_user has a binary field with "store": true containing a roaring bitmap.
  3. index_main has an integer field that the bitmap is matched against, a boolean field, and a text field.
  4. Run the following query against index_main:
{
  "query": {
    "bool": {
      "minimum_should_match": 1,
      "should": [
        { "match": { "text_field": "some_term" } }
      ],
      "filter": [
        { "term": { "boolean_field": true } },
        {
          "terms": {
            "integer_field": {
              "index": "index_user",
              "id": "some_doc_id",
              "path": "bitmap_field",
              "store": true
            },
            "value_type": "bitmap"
          }
        }
      ]
    }
  }
}
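The setup in steps 1 to 3 can be sketched as index-creation bodies. This is a minimal sketch under assumptions: the field names are taken from the query above, the field types from the step descriptions, and the helper create_index_request is hypothetical. Index settings and the actual HTTP call are left out.

```python
import json

# Mapping for the target index: the three field types described in step 3.
INDEX_MAIN_MAPPING = {
    "mappings": {
        "properties": {
            "integer_field": {"type": "integer"},
            "boolean_field": {"type": "boolean"},
            "text_field": {"type": "text"},
        }
    }
}

# Mapping for the lookup index: a stored binary field that holds the bitmap.
INDEX_USER_MAPPING = {
    "mappings": {
        "properties": {
            "bitmap_field": {"type": "binary", "store": True},
        }
    }
}


def create_index_request(index, mapping):
    """Return (method, path, body) for creating an index; nothing is sent."""
    return ("PUT", f"/{index}", json.dumps(mapping))
```

To end up with multiple segments in index_main, one option is to index the documents in several batches with a refresh between batches, since each refresh can produce a new segment.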

Expected behavior

The query should return documents that match all three conditions (text match + boolean filter + bitmap filter). Every smaller combination of these clauses works correctly on its own; only the full three-clause query fails:

| Query combination | Result |
| --- | --- |
| Bitmap filter only | Correct hits |
| Bitmap filter + text match (no boolean filter) | Correct hits |
| Bitmap filter + boolean filter (no text match) | Correct hits |
| Bitmap filter + boolean filter + text match | 0 hits (incorrect) |
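For anyone trying to reproduce this, the four combinations listed above can be generated from one builder so that only the presence of each clause varies. This is a sketch, not the script used for the report; build_query and COMBINATIONS are hypothetical names, and the field values are the placeholders from the query above.

```python
def bitmap_filter():
    """Bitmap terms lookup clause, as in the reported query."""
    return {
        "terms": {
            "integer_field": {
                "index": "index_user",
                "id": "some_doc_id",
                "path": "bitmap_field",
                "store": True,
            },
            "value_type": "bitmap",
        }
    }


def build_query(use_boolean, use_text):
    """Build a search body with the bitmap filter plus optional clauses."""
    filters = []
    if use_boolean:
        filters.append({"term": {"boolean_field": True}})
    filters.append(bitmap_filter())
    bool_query = {"filter": filters}
    if use_text:
        bool_query["should"] = [{"match": {"text_field": "some_term"}}]
        bool_query["minimum_should_match"] = 1
    return {"query": {"bool": bool_query}}


# One search body per combination listed above.
COMBINATIONS = {
    "bitmap only": build_query(False, False),
    "bitmap + text": build_query(False, True),
    "bitmap + boolean": build_query(True, False),
    "bitmap + boolean + text": build_query(True, True),
}
```

Sending each body to index_main/_search and comparing the hit totals should show the pattern above when the index has multiple segments.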

Actual Behavior

When all three clauses are present and index_main has multiple segments, the query returns 0 hits. An aggregation on the same data confirms the intersection is non-empty.

Workaround

Force-merging index_main to a single segment (_forcemerge?max_num_segments=1) resolves the issue. After the merge, the identical query returns the expected results.
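The workaround can be scripted roughly as follows. This is a sketch that only builds the requests; the base URL is an assumption about a local single-node cluster, and the helper names are hypothetical. The _cat/segments call is included so the segment count can be checked before and after the merge.

```python
import urllib.request


def forcemerge_request(base_url, index):
    """Build (but do not send) the force-merge request from the workaround."""
    url = f"{base_url}/{index}/_forcemerge?max_num_segments=1"
    return urllib.request.Request(url, method="POST")


def segments_request(base_url, index):
    """Build (but do not send) a request listing the index's segments as JSON."""
    url = f"{base_url}/_cat/segments/{index}?format=json"
    return urllib.request.Request(url, method="GET")
```

Passing either request to urllib.request.urlopen sends it, for example forcemerge_request("http://localhost:9200", "index_main") for a cluster listening on the default port.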

Additional Details

Additional Observations

  • The issue does not depend on query clause ordering (must vs should vs filter placement).
  • post_filter with the text match works correctly even with multiple segments.
  • wildcard and prefix queries work correctly in place of match, suggesting the issue is specific to analyzed text queries interacting with bitmap evaluation across segments.
  • The problem does not occur when using explicit terms arrays (e.g., "terms": {"integer_field": [123, 456]}) instead of bitmap lookup.
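Two of the variants above that return correct results can be written out for comparison with the failing query. This is one reading of those observations, not the exact requests that were run: it assumes the match clause was moved unchanged into post_filter, and that the explicit terms array replaced only the bitmap lookup. The function names are hypothetical.

```python
BOOLEAN_FILTER = {"term": {"boolean_field": True}}
TEXT_MATCH = {"match": {"text_field": "some_term"}}
BITMAP_FILTER = {
    "terms": {
        "integer_field": {
            "index": "index_user",
            "id": "some_doc_id",
            "path": "bitmap_field",
            "store": True,
        },
        "value_type": "bitmap",
    }
}


def post_filter_variant():
    """Text match applied as post_filter instead of inside the bool query."""
    return {
        "query": {"bool": {"filter": [BOOLEAN_FILTER, BITMAP_FILTER]}},
        "post_filter": TEXT_MATCH,
    }


def explicit_terms_variant(values):
    """Same three clauses, with a literal terms array replacing the bitmap lookup."""
    return {
        "query": {
            "bool": {
                "minimum_should_match": 1,
                "should": [TEXT_MATCH],
                "filter": [BOOLEAN_FILTER, {"terms": {"integer_field": list(values)}}],
            }
        }
    }
```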

Environment

  • Single-node cluster
  • Index with 4 segments (approximately 20K documents, ~15 MB)
  • Bitmap stored as base64-encoded roaring bitmap in a binary field with store: true
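For reference, a bitmap document of this shape can be produced along the following lines. This is a sketch under assumptions: it relies on the third-party pyroaring package for serialization (imported lazily so the encoding helper works without it), and the function names are hypothetical. The IDs actually stored in the affected cluster are not shown here.

```python
import base64


def serialize_ids(ids):
    """Serialize integer IDs as a roaring bitmap (assumes pyroaring is installed)."""
    from pyroaring import BitMap  # third-party dependency, imported on demand

    return BitMap(ids).serialize()


def bitmap_document(serialized, path="bitmap_field"):
    """Wrap serialized bitmap bytes as a base64 string for a binary field."""
    return {path: base64.b64encode(serialized).decode("ascii")}
```

The resulting document would be indexed into index_user under the ID that the terms lookup references (some_doc_id in the query above), for example bitmap_document(serialize_ids([123, 456])).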
