Stabilize PPL ITs on the analytics-engine route (array/map-path/datatype/basic)#5562

Merged: ahkcs merged 1 commit into opensearch-project:main from ahkcs:fix-arrayfunction-analytics-parity on Jun 18, 2026

ahkcs (Collaborator) commented Jun 17, 2026

Description

Parity pass for the analytics-engine (AE) route (-Dtests.analytics.parquet_indices=true) across four PPL IT classes. All changes are test-only. The v2/Calcite route is unchanged: every gated test still runs there, because the assumeNotAnalytics(...) guards are no-ops off-route and the gradle excludes apply only to integTestRemote.
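
To illustrate why a route guard is a no-op off-route, here is a minimal sketch, assuming the route is detected from the tests.analytics.parquet_indices system property named above. The class RouteGuardSketch, the skipReason method, and the nested Capability enum are hypothetical names for illustration only; they are not the project's actual assumeNotAnalytics(...) helper, whose signature is not shown in this PR. The enum lists only a subset of the capability names introduced here.

```java
import java.util.Optional;

/** Hypothetical sketch of a route-aware test guard; not the project's real helper. */
public class RouteGuardSketch {

    /** System property name taken from the PR description. */
    static final String ROUTE_PROPERTY = "tests.analytics.parquet_indices";

    /** Illustrative subset of the capability names added in this PR. */
    enum Capability { ARRAY_HIGHER_ORDER_FUNC, MVCOMBINE_ARRAY_AGG, ADDTOTALS_JOIN_PANIC }

    /** True only when the analytics-engine route was requested via the property. */
    static boolean onAnalyticsRoute() {
        return Boolean.parseBoolean(System.getProperty(ROUTE_PROPERTY, "false"));
    }

    /**
     * Returns a skip reason on the analytics route and an empty Optional otherwise.
     * A real guard would raise a test-framework assumption failure instead of
     * returning a value (assumption: that is how the actual guards skip).
     */
    static Optional<String> skipReason(Capability missing) {
        if (!onAnalyticsRoute()) {
            return Optional.empty(); // off-route: the guard does nothing, the test runs
        }
        return Optional.of("analytics engine lacks " + missing);
    }

    public static void main(String[] args) {
        // Prints Optional.empty unless the route property is set to true.
        System.out.println(skipReason(Capability.ARRAY_HIGHER_ORDER_FUNC));
    }
}
```

Because the check depends only on the property, the same test source can run unguarded on v2/Calcite and be skipped on the AE route without any per-route code paths in the test body.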

CalciteArrayFunctionIT (43/60 → 44/60 pass, 16 excluded)

  • array index load aborts init(): the array dataset's multi-value numbers field (scalar long mapping) can't be bulk-loaded into the parquet store. No test queries the array index (all build arrays inline via array(...)), so the load is skipped on the AE route.
  • Higher-order lambda functions (transform/mvmap, reduce, filter, exists, forall) → No backend supports scalar function [...]. New ARRAY_HIGHER_ORDER_FUNC (16 tests).

CalcitePPLMapPathIT (25/27 → 25 pass, 2 excluded)

  • mvcombine lowers to ARRAY_AGG, unregistered on the analytics backend → new MVCOMBINE_ARRAY_AGG.
  • addtotals crashes the DataFusion backend with a join panic → new ADDTOTALS_JOIN_PANIC.

CalciteDataTypeIT (6 fail → excluded; guards added to base DataTypeIT)

  • test_nonnumeric_data_types / test_alias_data_type: nested/object/geo/alias types are stripped on the AE route (reuses NESTED_FIELDS). Also broadened the pre-existing org.opensearch.sql.ppl.* build.gradle globs to * so they cover the Calcite subclass. This closes a latent gap: the old globs never matched CalciteDataTypeIT/CalciteSystemFunctionIT.
  • test_numeric_data_types: scaled_float reported as bigint not double → new SCALED_FLOAT_TYPE.
  • testNumericFieldFromString: empty-string → numeric coerces to null not 0 → new STRING_TO_NUMERIC_COERCION.
  • testBooleanFieldFromNumberAcrossWildcardIndices: cross-index incompatible field types rejected → new CROSS_INDEX_INCOMPATIBLE_TYPES.
  • testBooleanFieldFromString: seeds + deletes a doc; DELETE unsupported (DOC_MUTATION).
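
The glob gap noted in the first bullet of this list can be shown with a simplified matcher. This is only a sketch: it models `*` as "any run of characters" and is not Gradle's actual test-filter implementation. The fully qualified test names and the exact pattern strings are assumptions for illustration (in particular, the package of CalciteDataTypeIT is guessed); only the org.opensearch.sql.ppl.* prefix and the class and test names come from this PR.

```java
import java.util.regex.Pattern;

/** Simplified model of a '*' glob exclude; not Gradle's real matching logic. */
public class ExcludeGlobSketch {

    /** '*' matches any run of characters; every other character is literal. */
    static boolean matches(String glob, String testName) {
        String[] parts = glob.split("\\*", -1); // -1 keeps trailing empty parts
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                regex.append(".*");
            }
            regex.append(Pattern.quote(parts[i]));
        }
        return Pattern.matches(regex.toString(), testName);
    }

    public static void main(String[] args) {
        // Hypothetical fully qualified names (the Calcite package is an assumption).
        String base = "org.opensearch.sql.ppl.DataTypeIT.test_alias_data_type";
        String calcite = "org.opensearch.sql.calcite.remote.CalciteDataTypeIT.test_alias_data_type";

        // Hypothetical old and broadened patterns.
        String oldGlob = "org.opensearch.sql.ppl.*.test_alias_data_type";
        String newGlob = "*.test_alias_data_type";

        System.out.println(matches(oldGlob, base));    // old glob covers the base class
        System.out.println(matches(oldGlob, calcite)); // old glob misses the subclass
        System.out.println(matches(newGlob, calcite)); // broadened glob covers it
    }
}
```

Under this model, a pattern anchored to the base class's package cannot exclude a subclass that lives in another package, even though the subclass inherits the same test methods.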

CalcitePPLBasicIT (1 fail → excluded)

  • testRegexpFilter: REGEXP filter throws a backend NullPointerException on the AE route → new REGEXP_FILTER.

Results (analytics route)

| Test class | Before | After | v2/Calcite |
|---|---|---|---|
| CalciteArrayFunctionIT | 43/60 | 44/60 pass, 16 excluded, 0 fail | 60/60 |
| CalcitePPLMapPathIT | 25/27 | 25 pass, 2 excluded, 0 fail | 27/27 |
| CalciteDataTypeIT | 1/7 | 1 run + 6 excluded, 0 fail | 7/7 |
| CalcitePPLBasicIT | 45/46 | 45 run, 0 fail, 7 skip (pre-existing) | 46/46 |

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • Commits are signed per the DCO using --signoff.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

ahkcs added the infrastructure label (Changes to infrastructure, testing, CI/CD, pipelines, etc.) Jun 17, 2026
ahkcs changed the title from "Stabilize CalciteArrayFunctionIT on the analytics-engine route" to "Stabilize PPL ITs on the analytics-engine route (array/map-path/datatype/basic)" Jun 17, 2026
ahkcs force-pushed the fix-arrayfunction-analytics-parity branch 2 times, most recently from 6a5c598 to 5646561, Jun 17, 2026 22:27
…ype/basic)

Analytics-engine route parity for four PPL IT classes; test-only. Uses the
@RequiresCapability annotation + Capability registry (opensearch-project#5560) plus matching
excludeTestsMatching entries.

CalciteArrayFunctionIT:
  - Skip the array-index load on the AE route (multi-value 'numbers' field the
    parquet store rejects); no test queries it (all build arrays inline).
  - 16 higher-order lambda functions (transform/mvmap, reduce, filter, exists,
    forall) -> new ARRAY_HIGHER_ORDER_FUNC (no DataFusion lambda execution).

CalcitePPLMapPathIT:
  - mvcombine lowers to ARRAY_AGG, unregistered on the analytics backend ->
    new MVCOMBINE_ARRAY_AGG.
  - addtotals crashes the DataFusion backend with a join panic ->
    new ADDTOTALS_JOIN_PANIC.

CalciteDataTypeIT (guards on base DataTypeIT; build.gradle globs broadened to
'*' so they cover the Calcite subclass):
  - test_nonnumeric_data_types / test_alias_data_type: nested/object/geo/alias
    types stripped (NESTED_FIELDS).
  - test_numeric_data_types: scaled_float reported as bigint not double ->
    new SCALED_FLOAT_TYPE.
  - testNumericFieldFromString: empty-string -> numeric coerces to null not 0 ->
    new STRING_TO_NUMERIC_COERCION.
  - testBooleanFieldFromNumberAcrossWildcardIndices: cross-index incompatible
    field types rejected -> new CROSS_INDEX_INCOMPATIBLE_TYPES.
  - testBooleanFieldFromString: seeds+deletes a doc; DELETE unsupported (DOC_MUTATION).

CalcitePPLBasicIT.testRegexpFilter: REGEXP filter throws a backend
NullPointerException on the AE route -> new REGEXP_FILTER.

v2/Calcite route unchanged (all run, 0 skips).

Signed-off-by: Kai Huang <ahkcs@amazon.com>
ahkcs force-pushed the fix-arrayfunction-analytics-parity branch from 5646561 to 921aa8c, Jun 18, 2026 17:52
ahkcs merged commit 7f2b60f into opensearch-project:main Jun 18, 2026
28 of 32 checks passed
ahkcs deleted the fix-arrayfunction-analytics-parity branch Jun 18, 2026 18:28