Stabilize more PPL ITs on the analytics-engine route (sort/streamstats/IP-UDT/metadata/strip-verifier) #5566

Merged
ahkcs merged 1 commit into opensearch-project:main from ahkcs:fix-batch31-analytics-parity
Jun 19, 2026

@ahkcs ahkcs commented Jun 18, 2026

Collaborator

Description

Brings 16 more PPL integration test classes to parity on the analytics-engine route (-Dtests.analytics.parquet_indices=true, which routes PPL through the parquet/composite store to DataFusion). Route-only divergences are skipped only when the analytics flag is on, via @RequiresCapability + a matching build.gradle excludeTestsMatching; the v2/Calcite path runs every test unchanged. Continues #5560 / #5561 / #5562 / #5564.
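
The gating rule described above can be sketched in plain Java. This is a minimal illustration, not the PR's implementation: `RequiresCapabilitySketch`, `CapabilityGateSketch`, `ExampleIT`, and `shouldSkip` are hypothetical names, and the real `@RequiresCapability` may take a different argument type and be enforced through the test framework rather than reflection.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

// Hypothetical stand-in for the PR's @RequiresCapability; the real annotation's shape is an assumption.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface RequiresCapabilitySketch {
  String value(); // name of the route divergence, e.g. "RAND_SEED_UNSUPPORTED"
}

class CapabilityGateSketch {

  // Example test class: one gated test, one ungated test.
  static class ExampleIT {
    @RequiresCapabilitySketch("RAND_SEED_UNSUPPORTED")
    public void testSeededRand() {}

    public void testPlainQuery() {}
  }

  /** A test is skipped only when the analytics flag is on AND the method carries the gate. */
  static boolean shouldSkip(Class<?> testClass, String methodName, boolean analyticsRouteEnabled) {
    try {
      Method test = testClass.getDeclaredMethod(methodName);
      return analyticsRouteEnabled && test.isAnnotationPresent(RequiresCapabilitySketch.class);
    } catch (NoSuchMethodException e) {
      throw new IllegalArgumentException("No such test method: " + methodName, e);
    }
  }
}
```

With the flag off, `shouldSkip` returns false for every method, which mirrors the statement that the v2/Calcite path runs every test unchanged.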

A key finding: several of the originally-reported "failures" were init-load contamination — a class loading a multi-value dataset (game_of_thrones, deep_nested, …) in init() throws at bulk load on the AE route, aborting init() and mislabeling whichever test ran first. Those are fixed (guarded loads), not gated.

Pass rate (this batch, on the analytics-engine route)

| Class | Before | After |
| --- | --- | --- |
| AnalyticsUnsupportedFieldStripVerifyIT | 0/1 (guardrail) | 1 pass / 0 fail |
| CalciteWhereCommandIT | 30/32 | 32 pass / 0 fail |
| CalcitePPLSortIT | 15/18 | 15 pass / 3 skip / 0 fail |
| CalciteReverseCommandIT | 23/27 | 23 pass / 4 skip / 0 fail |
| CalcitePPLBuiltinDatetimeFunctionInvalidIT | 43/45 | 43 pass / 2 skip / 0 fail |
| CalciteMathematicalFunctionIT | 61/62 | 61 pass / 1 skip / 0 fail |
| CalcitePPLCastFunctionIT | 20/21 | 20 pass / 1 skip / 0 fail |
| CalcitePPLAppendCommandIT | 6/8 | 6 pass / 2 skip / 0 fail |
| CalcitePPLBuiltinFunctionsNullIT | 70/71 | 70 pass / 1 skip / 0 fail |
| CalciteDedupCommandIT | 3/4 | 3 pass / 1 skip / 0 fail |
| CalciteMultiValueStatsIT | 29/31 | 29 pass / 2 skip / 0 fail |
| CalciteSortCommandIT | 28/29 | 28 pass / 1 skip / 0 fail |
| SortCommandIT | 23/24 | 23 pass / 1 skip / 0 fail |
| CalciteLikeQueryIT | 9/10 | 9 pass / 1 skip / 0 fail |
| FieldsCommandIT | 2/5 | 2 pass / 0 fail (3 excluded) |

AE route: 16 classes, 373 run, 0 failures (was 24 failures).
V2 baseline: 408 run, 0 failures, 2 pre-existing/by-design skips (none from these gates).

Strip-verifier (the #5541 guardrail)

AnalyticsUnsupportedFieldStripVerifyIT was failing because 8 datasets carry a multi-value JSON array for a scalar-mapped field, which the parquet store rejects at bulk load. That's a cardinality limitation, not an unsupported type, so it's out of scope for the type strip — the same situation as the existing join out-of-scope skip. Added a curated MULTI_VALUE_DATASETS allowlist + safeToSkipForMultiValueLoad that skips only the exact multi-value signature on a known dataset; any other failure still surfaces loudly, and Legs 2-3 still type-check every index that loads.
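
A skip predicate of this kind might look roughly like the sketch below. Only the names `MULTI_VALUE_DATASETS` and `safeToSkipForMultiValueLoad` come from the PR; the allowlist contents, the method signature, and the error-message fragment are assumptions made for illustration, since the actual bulk-load message is not quoted here.

```java
import java.util.Set;

class MultiValueSkipSketch {

  // Assumed allowlist entries; the PR reports 8 datasets but does not list them in this description.
  static final Set<String> MULTI_VALUE_DATASETS = Set.of("game_of_thrones", "deep_nested");

  // Hypothetical failure fragment standing in for the real bulk-load error signature.
  static final String MULTI_VALUE_SIGNATURE = "multi-value array for scalar field";

  /**
   * Skip only when BOTH conditions hold: the dataset is on the curated allowlist
   * and the failure carries the exact multi-value signature. Anything else still fails.
   */
  static boolean safeToSkipForMultiValueLoad(String dataset, String failureMessage) {
    return MULTI_VALUE_DATASETS.contains(dataset)
        && failureMessage != null
        && failureMessage.contains(MULTI_VALUE_SIGNATURE);
  }
}
```

Requiring both the dataset match and the signature match is what keeps the skip narrow: an unknown dataset with the same error, or a known dataset with a different error, still surfaces as a failure.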

Init-load contamination (fixed, not gated)

CalciteWhereCommandIT failed on testDoubleEqual* because init() loaded game_of_thrones (base) and deep_nested (subclass), both multi-value datasets whose bulk-load failure aborted init(). Guarded both with isAnalyticsParquetIndicesEnabled(); no test in the hierarchy queries them. 32/32 now pass.
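
The guard can be pictured as follows. This is a sketch under assumptions: reading the flag from the `tests.analytics.parquet_indices` system property is a guess at how `isAnalyticsParquetIndicesEnabled()` works, and `datasetsToLoad` and `scalar_only_dataset` are hypothetical placeholders rather than names from the test classes.

```java
import java.util.ArrayList;
import java.util.List;

class GuardedInitSketch {

  // Assumed implementation: the flag is passed as -Dtests.analytics.parquet_indices=true.
  static boolean isAnalyticsParquetIndicesEnabled() {
    return Boolean.parseBoolean(System.getProperty("tests.analytics.parquet_indices", "false"));
  }

  /** Returns the datasets init() would load; multi-value datasets are left out on the AE route. */
  static List<String> datasetsToLoad(boolean analyticsEnabled) {
    List<String> datasets = new ArrayList<>();
    datasets.add("scalar_only_dataset"); // hypothetical dataset that loads on both routes
    if (!analyticsEnabled) {
      // These bulk loads fail on the parquet store, so they are attempted only off the AE route.
      datasets.add("game_of_thrones");
      datasets.add("deep_nested");
    }
    return datasets;
  }
}
```

Because the failing loads are never attempted on the AE route, init() completes and the first test in the class is no longer blamed for a setup error.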

Not divergences — handled without skipping result assertions

Some tests branch on isCalciteEnabled() for a type expectation. On the AE route the Calcite path runs but the cluster setting reads false, so the check does not reflect the path that actually runs; those cases were already addressed in #5564. This batch contains no test-expectation edits on dual-route classes.

New capabilities

SORT_TIE_ORDER_UNSTABLE, INVALID_DATETIME_ERROR_SHAPE, RAND_SEED_UNSUPPORTED, IP_UDT_BINARY_REPRESENTATION, TIME_TYPE_WIDENED_TO_TIMESTAMP, BINARY_FIELD_STRIPPED, VALUES_LIMIT_NOT_HONORED, INDEX_METADATA, CROSS_INDEX_OBJECT_LEAF_MERGE, TEXT_KEYWORD_PUSHDOWN_REWRITE, LUCENE_PUSHDOWN_EXPLAIN.

Reused capabilities

  • WILDCARD_COLUMN_ORDER: streamstats carries all source columns through; AE returns them in a different order (4 tests)
  • HEAD_WITHOUT_STABLE_SORT: head N without a stable sort (testHeadThenSort, testAppendWithMergedColumn)
  • DEDUP_NONDETERMINISTIC: consecutive dedup has no working V2 fallback on the AE route

Out of scope

  • FieldsCommandIT.testEnhancedFieldsWhenCalciteDisabled asserts the Calcite-disabled error, but the AE route is always Calcite-enabled — build.gradle exclude only.

Worth a real fix later (gated for now, flagged in capability docs)

  • IP_UDT_BINARY_REPRESENTATION: the IP UDT is materialized as BINARY on the route (next UDT-on-parquet gap after the DATE/TIME UDT fix).
  • VALUES_LIMIT_NOT_HONORED: PplAggregateCallRewriter emits no LIMIT for values()/list().

Check List

  • New functionality includes testing.
  • New functionality has been documented (in-source capability reasons).
  • Commits are signed per the DCO using --signoff.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@ahkcs ahkcs added the infrastructure Changes to infrastructure, testing, CI/CD, pipelines, etc. label Jun 18, 2026
@ahkcs ahkcs force-pushed the fix-batch31-analytics-parity branch from a225c00 to c4583ab Compare June 19, 2026 04:21
…s/IP-UDT/metadata/strip-verifier)

Brings 16 more PPL IT classes to parity on the analytics-engine route
(-Dtests.analytics.parquet_indices=true). Route-only divergences are
gated AE-only via @RequiresCapability + a matching build.gradle
excludeTestsMatching; the v2/Calcite path runs every test unchanged.

Strip-verifier (the opensearch-project#5541 guardrail):
- AnalyticsUnsupportedFieldStripVerifyIT was failing because 8 datasets
  carry a multi-value JSON array for a scalar-mapped field, which the
  parquet store rejects at bulk load. That's a cardinality limitation,
  not an unsupported field *type*, so it's out of scope for the type
  strip — the same situation as the existing `join` out-of-scope skip.
  Added a curated MULTI_VALUE_DATASETS allowlist + safeToSkipForMultiValueLoad
  that skips only the exact multi-value signature on a known dataset;
  any other failure still surfaces loudly, and Legs 2-3 still type-check
  every index that loads.

Init-load contamination (not divergences — fixed, not gated):
- CalciteWhereCommandIT failed on testDoubleEqual* because init() loaded
  game_of_thrones (base) and deep_nested (subclass), both multi-value
  datasets whose bulk-load failure aborted init() and mislabeled the
  first test. Guarded both loads with isAnalyticsParquetIndicesEnabled();
  no test in the hierarchy queries them on the AE route. 32/32 now pass.

Non-deterministic sort ties stabilized (not gated — full coverage kept):
- The three CalcitePPLSortIT tie tests (testSortWithNullValue,
  testSortAgeAndFieldsNameAge, testSortWithAutoCast) sorted on a
  non-unique key, so the tied rows had no defined order and the AE route
  ordered them differently than the captured Lucene doc order. Added a
  unique secondary sort key (firstname) to each, making the order
  deterministic and engine-independent. Verified identical on BOTH
  routes: 18/18 on AE and 18/18 on v2/Calcite. No gate needed.
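
The effect of the secondary key can be illustrated outside PPL with a plain Java comparator. This is only an analogy for the idea, not code from the PR: the `Row` class and the sample rows are made up, and the only assumption carried over is that firstname is unique.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class SortTieSketch {

  // Illustrative row type; not a class from the test suite.
  static final class Row {
    final String firstname;
    final int age;

    Row(String firstname, int age) {
      this.firstname = firstname;
      this.age = age;
    }
  }

  /** Sorts by age, then by the unique firstname, so tied ages always come out in the same order. */
  static List<String> sortByAgeThenFirstname(List<Row> input) {
    List<Row> rows = new ArrayList<>(input);
    rows.sort(Comparator.comparingInt((Row r) -> r.age).thenComparing((Row r) -> r.firstname));
    List<String> names = new ArrayList<>();
    for (Row r : rows) {
      names.add(r.firstname);
    }
    return names;
  }
}
```

Sorting on age alone would leave rows with equal ages in whatever order the engine happened to produce them; with the unique tie-breaker, any input order yields the same result.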

Engine divergences gated (new capabilities):
- INVALID_DATETIME_ERROR_SHAPE: dayname/monthname over an invalid literal
  throw a different message shape (2 tests)
- RAND_SEED_UNSUPPORTED: seeded RAND(seed) is rejected on AE
- IP_UDT_BINARY_REPRESENTATION: the IP UDT is materialized as BINARY, so
  cast(... as IP) and cidrmatch over an IP column fail (2 tests)
- TIME_TYPE_WIDENED_TO_TIMESTAMP: a TIME field reads back as TIMESTAMP,
  defeating TIMEDIFF's [TIME,TIME] signature
- BINARY_FIELD_STRIPPED: binary fields are stripped at load
- VALUES_LIMIT_NOT_HONORED: values()/list() ignore the configured limit
- INDEX_METADATA: _index metadata not exposed (sibling of ID_METADATA)
- CROSS_INDEX_OBJECT_LEAF_MERGE: an object leaf in only some wildcard
  member indices resolves to FIELD_NOT_FOUND
- TEXT_KEYWORD_PUSHDOWN_REWRITE: like() doesn't rewrite to .keyword in
  the explain plan (no Lucene term-pushdown)
- LUCENE_PUSHDOWN_EXPLAIN: a test asserting a Lucene SORT-> pushdown
  fragment can't match the DataFusion plan

Reused existing capabilities:
- WILDCARD_COLUMN_ORDER: streamstats carries all source columns through;
  AE returns them in a different column order (values and row order are
  correct, so a sort can't fix it) (4 CalciteReverseCommandIT tests)
- HEAD_WITHOUT_STABLE_SORT: the non-determinism is which rows head N
  keeps, before the trailing sort, so a sort can't recover it
  (testHeadThenSort, testAppendWithMergedColumn)
- DEDUP_NONDETERMINISTIC: consecutive dedup has no working V2 fallback on
  the AE route

Out of scope:
- FieldsCommandIT.testEnhancedFieldsWhenCalciteDisabled asserts the
  Calcite-DISABLED error; the AE route is always Calcite-enabled.
  build.gradle exclude only.

Results (this batch, on the AE route): 16 classes, 0 failures (was 24
failures), with the 3 sort tests now passing rather than skipped. V2
baseline: 0 failures, only pre-existing/by-design skips (none from these
gates).

Signed-off-by: Kai Huang <ahkcs@amazon.com>
@ahkcs ahkcs force-pushed the fix-batch31-analytics-parity branch from c4583ab to 539cef4 Compare June 19, 2026 17:42
@ahkcs ahkcs merged commit 5aa6ea0 into opensearch-project:main Jun 19, 2026
28 of 32 checks passed