[BUG] distinct_count_approx does not work with array types #4533

Description

@ahkcs

Query Information

PPL Query:

source = opensearch_dashboards_sample_data_ecommerce
| stats distinct_count_approx(`manufacturer`) as dc

Expected Result:
distinct_count_approx should return an approximate distinct count of the values in the manufacturer field, regardless of whether the field is a single value or an array.
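
To make the expected semantics concrete, here is a minimal sketch. It is not the plugin's code. It assumes that each element of an array value counts as its own value, and it uses an exact `HashSet` as a stand-in for whatever approximate sketch the real function uses. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DistinctCountSketch {
    // Counts distinct values, treating each element of a list-valued field
    // as its own value (assumption). An exact HashSet stands in for the
    // approximate structure a real implementation would use.
    static long distinctCount(List<Object> fieldValues) {
        Set<String> seen = new HashSet<>();
        for (Object value : fieldValues) {
            if (value == null) {
                continue; // missing field: nothing to count
            }
            if (value instanceof List<?>) {
                // Array-typed document value: add every non-null element.
                for (Object element : (List<?>) value) {
                    if (element != null) {
                        seen.add(element.toString());
                    }
                }
            } else {
                // Single-valued document: add the value itself.
                seen.add(value.toString());
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        // One scalar value, two array values, and one missing value.
        List<Object> manufacturer = Arrays.asList(
            "A", Arrays.asList("A", "B"), Arrays.asList("B", "C"), null);
        System.out.println(distinctCount(manufacturer)); // prints 3
    }
}
```

Under that assumption, a document with `["A", "B"]` and a document with `"A"` together contribute two distinct values, not an error.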

Actual Result:
The query fails when the target field is of array type:

{
  "error": {
    "reason": "There was internal problem at backend",
    "details": "java.sql.SQLException: exception while executing query: class java.util.ArrayList cannot be cast to class java.lang.String (java.util.ArrayList and java.lang.String are in module java.base of loader 'bootstrap')",
    "type": "RuntimeException"
  },
  "status": 500
}
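
The message suggests that some step casts the field value to `String` without checking for a list first. The following sketch reproduces that failure mode in isolation. Where the cast actually happens in the plugin is not known from this report, so `asString` is a hypothetical stand-in.

```java
import java.util.ArrayList;
import java.util.Arrays;

public class ArrayCastSketch {
    // Hypothetical step that assumes every field value is a String.
    static String asString(Object fieldValue) {
        return (String) fieldValue; // throws ClassCastException for a list value
    }

    // Returns true when the unchecked cast fails for the given value.
    static boolean failsOnValue(Object fieldValue) {
        try {
            asString(fieldValue);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Object arrayValue = new ArrayList<>(Arrays.asList("A", "B"));
        System.out.println(failsOnValue("A"));        // prints false
        System.out.println(failsOnValue(arrayValue)); // prints true
    }
}
```

A fix would presumably need an `instanceof` check (or equivalent type handling) before this point, so that array values are expanded rather than cast.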

Steps to Reproduce:

  1. Create an index with an array-type field (e.g., manufacturer as ["A", "B"]).

  2. Run:

    source = your_index | stats distinct_count_approx(`manufacturer`)
    
  3. Observe that the query fails with a 500 Internal Server Error.

Metadata

Labels

PPL (Piped processing language), bug (Something isn't working)

Projects

Status
Done
