Decouple IT from execution backend with capability-based gating#5560
Merged
dai-chen merged 1 commit intoJun 17, 2026
Merged
Conversation
c3bb8dd to
3179c96
Compare
Rename AnalyticsRouteLimitation to Capability and add a declarative @RequiresCapability annotation (enforced by CapabilityRule) wired into SQLIntegTestCase, so SQL and PPL ITs gate by capability instead of naming a backend. Gate logic lives in BackendCapabilities. All assumeNotAnalytics(...) calls migrated to @RequiresCapability, with per-test divergence comments folded into the annotation note. Verified: :integ-test:compileTestJava and :integ-test:spotlessCheck pass. Signed-off-by: Chen Dai <daichen@amazon.com>
3179c96 to
dc298d8
Compare
ahkcs
approved these changes
Jun 17, 2026
ahkcs
added a commit
to ahkcs/sql
that referenced
this pull request
Jun 17, 2026
…t/like/appendpipe) Analytics-engine route parity for several PPL IT classes; test-only. Uses the @RequiresCapability annotation + Capability registry (opensearch-project#5560) plus matching excludeTestsMatching entries. CalcitePPLCaseFunctionIT: - Guard the weblogs raw-PUT seeding (appendDataForBadResponse) on a pre-load isIndexExist check — the append-only AE store inflated counts per method. - Skip the otel_logs load on the AE route (multi-value keyword the parquet store rejects); only testNestedCaseAggWithAutoDateHistogram uses it, and that test requires BIN_TIME_FIELD_BUCKETING (bucket column typed string). CalcitePPLStringBuiltinFunctionIT: 7 tests re-PUT a shared _id with different data; the append-only AE store can't replace docs (DELETE unsupported) -> DOC_MUTATION. MultiMatchIT / QueryStringIT / SimpleQueryStringIT wildcard tests: full-text relevance functions with no DataFusion equivalent -> new FULLTEXT_RELEVANCE_FUNC. CalciteLikeQueryIT.test_the_default_3rd_option: AE LIKE is case-insensitive but v2/Calcite is case-sensitive -> new LIKE_CASE_SENSITIVITY. CalcitePPLAppendPipeCommandIT.testDoubleAppendPipeWithFilter: appendpipe drops the main pipeline's rows on the AE route -> new APPENDPIPE_MAIN_RESULT_DROPPED. v2/Calcite route unchanged (all run, 0 skips). Signed-off-by: Kai Huang <ahkcs@amazon.com>
ahkcs
added a commit
to ahkcs/sql
that referenced
this pull request
Jun 17, 2026
…ype/basic) Analytics-engine route parity for four PPL IT classes; test-only. Uses the @RequiresCapability annotation + Capability registry (opensearch-project#5560) plus matching excludeTestsMatching entries. CalciteArrayFunctionIT: - Skip the array-index load on the AE route (multi-value 'numbers' field the parquet store rejects); no test queries it (all build arrays inline). - 16 higher-order lambda functions (transform/mvmap, reduce, filter, exists, forall) -> new ARRAY_HIGHER_ORDER_FUNC (no DataFusion lambda execution). CalcitePPLMapPathIT: - mvcombine lowers to ARRAY_AGG, unregistered on the analytics backend -> new MVCOMBINE_ARRAY_AGG. - addtotals crashes the DataFusion backend with a join panic -> new ADDTOTALS_JOIN_PANIC. CalciteDataTypeIT (guards on base DataTypeIT; build.gradle globs broadened to '*' so they cover the Calcite subclass): - test_nonnumeric_data_types / test_alias_data_type: nested/object/geo/alias types stripped (NESTED_FIELDS). - test_numeric_data_types: scaled_float reported as bigint not double -> new SCALED_FLOAT_TYPE. - testNumericFieldFromString: empty-string -> numeric coerces to null not 0 -> new STRING_TO_NUMERIC_COERCION. - testBooleanFieldFromNumberAcrossWildcardIndices: cross-index incompatible field types rejected -> new CROSS_INDEX_INCOMPATIBLE_TYPES. - testBooleanFieldFromString: seeds+deletes a doc; DELETE unsupported (DOC_MUTATION). CalcitePPLBasicIT.testRegexpFilter: REGEXP filter throws a backend NullPointerException on the AE route -> new REGEXP_FILTER. v2/Calcite route unchanged (all run, 0 skips). Signed-off-by: Kai Huang <ahkcs@amazon.com>
ahkcs
added a commit
to ahkcs/sql
that referenced
this pull request
Jun 17, 2026
…t/like/appendpipe) Analytics-engine route parity for several PPL IT classes; test-only. Uses the @RequiresCapability annotation + Capability registry (opensearch-project#5560) plus matching excludeTestsMatching entries. CalcitePPLCaseFunctionIT: - Guard the weblogs raw-PUT seeding (appendDataForBadResponse) on a pre-load isIndexExist check — the append-only AE store inflated counts per method. - Skip the otel_logs load on the AE route (multi-value keyword the parquet store rejects); only testNestedCaseAggWithAutoDateHistogram uses it, and that test requires BIN_TIME_FIELD_BUCKETING (bucket column typed string). CalcitePPLStringBuiltinFunctionIT: 7 tests re-PUT a shared _id with different data; the append-only AE store can't replace docs (DELETE unsupported) -> DOC_MUTATION. MultiMatchIT / QueryStringIT / SimpleQueryStringIT wildcard tests: full-text relevance functions with no DataFusion equivalent -> new FULLTEXT_RELEVANCE_FUNC. CalciteLikeQueryIT.test_the_default_3rd_option: AE LIKE is case-insensitive but v2/Calcite is case-sensitive -> new LIKE_CASE_SENSITIVITY. CalcitePPLAppendPipeCommandIT.testDoubleAppendPipeWithFilter: appendpipe drops the main pipeline's rows on the AE route -> new APPENDPIPE_MAIN_RESULT_DROPPED. v2/Calcite route unchanged (all run, 0 skips). Signed-off-by: Kai Huang <ahkcs@amazon.com>
ahkcs
added a commit
to ahkcs/sql
that referenced
this pull request
Jun 17, 2026
…ype/basic) Analytics-engine route parity for four PPL IT classes; test-only. Uses the @RequiresCapability annotation + Capability registry (opensearch-project#5560) plus matching excludeTestsMatching entries. CalciteArrayFunctionIT: - Skip the array-index load on the AE route (multi-value 'numbers' field the parquet store rejects); no test queries it (all build arrays inline). - 16 higher-order lambda functions (transform/mvmap, reduce, filter, exists, forall) -> new ARRAY_HIGHER_ORDER_FUNC (no DataFusion lambda execution). CalcitePPLMapPathIT: - mvcombine lowers to ARRAY_AGG, unregistered on the analytics backend -> new MVCOMBINE_ARRAY_AGG. - addtotals crashes the DataFusion backend with a join panic -> new ADDTOTALS_JOIN_PANIC. CalciteDataTypeIT (guards on base DataTypeIT; build.gradle globs broadened to '*' so they cover the Calcite subclass): - test_nonnumeric_data_types / test_alias_data_type: nested/object/geo/alias types stripped (NESTED_FIELDS). - test_numeric_data_types: scaled_float reported as bigint not double -> new SCALED_FLOAT_TYPE. - testNumericFieldFromString: empty-string -> numeric coerces to null not 0 -> new STRING_TO_NUMERIC_COERCION. - testBooleanFieldFromNumberAcrossWildcardIndices: cross-index incompatible field types rejected -> new CROSS_INDEX_INCOMPATIBLE_TYPES. - testBooleanFieldFromString: seeds+deletes a doc; DELETE unsupported (DOC_MUTATION). CalcitePPLBasicIT.testRegexpFilter: REGEXP filter throws a backend NullPointerException on the AE route -> new REGEXP_FILTER. v2/Calcite route unchanged (all run, 0 skips). Signed-off-by: Kai Huang <ahkcs@amazon.com>
ahkcs
added a commit
that referenced
this pull request
Jun 18, 2026
…t/like/appendpipe) (#5561) Analytics-engine route parity for several PPL IT classes; test-only. Uses the @RequiresCapability annotation + Capability registry (#5560) plus matching excludeTestsMatching entries. CalcitePPLCaseFunctionIT: - Guard the weblogs raw-PUT seeding (appendDataForBadResponse) on a pre-load isIndexExist check — the append-only AE store inflated counts per method. - Skip the otel_logs load on the AE route (multi-value keyword the parquet store rejects); only testNestedCaseAggWithAutoDateHistogram uses it, and that test requires BIN_TIME_FIELD_BUCKETING (bucket column typed string). CalcitePPLStringBuiltinFunctionIT: 7 tests re-PUT a shared _id with different data; the append-only AE store can't replace docs (DELETE unsupported) -> DOC_MUTATION. MultiMatchIT / QueryStringIT / SimpleQueryStringIT wildcard tests: full-text relevance functions with no DataFusion equivalent -> new FULLTEXT_RELEVANCE_FUNC. CalciteLikeQueryIT.test_the_default_3rd_option: AE LIKE is case-insensitive but v2/Calcite is case-sensitive -> new LIKE_CASE_SENSITIVITY. CalcitePPLAppendPipeCommandIT.testDoubleAppendPipeWithFilter: appendpipe drops the main pipeline's rows on the AE route -> new APPENDPIPE_MAIN_RESULT_DROPPED. v2/Calcite route unchanged (all run, 0 skips). Signed-off-by: Kai Huang <ahkcs@amazon.com>
ahkcs
added a commit
to ahkcs/sql
that referenced
this pull request
Jun 18, 2026
…ype/basic) Analytics-engine route parity for four PPL IT classes; test-only. Uses the @RequiresCapability annotation + Capability registry (opensearch-project#5560) plus matching excludeTestsMatching entries. CalciteArrayFunctionIT: - Skip the array-index load on the AE route (multi-value 'numbers' field the parquet store rejects); no test queries it (all build arrays inline). - 16 higher-order lambda functions (transform/mvmap, reduce, filter, exists, forall) -> new ARRAY_HIGHER_ORDER_FUNC (no DataFusion lambda execution). CalcitePPLMapPathIT: - mvcombine lowers to ARRAY_AGG, unregistered on the analytics backend -> new MVCOMBINE_ARRAY_AGG. - addtotals crashes the DataFusion backend with a join panic -> new ADDTOTALS_JOIN_PANIC. CalciteDataTypeIT (guards on base DataTypeIT; build.gradle globs broadened to '*' so they cover the Calcite subclass): - test_nonnumeric_data_types / test_alias_data_type: nested/object/geo/alias types stripped (NESTED_FIELDS). - test_numeric_data_types: scaled_float reported as bigint not double -> new SCALED_FLOAT_TYPE. - testNumericFieldFromString: empty-string -> numeric coerces to null not 0 -> new STRING_TO_NUMERIC_COERCION. - testBooleanFieldFromNumberAcrossWildcardIndices: cross-index incompatible field types rejected -> new CROSS_INDEX_INCOMPATIBLE_TYPES. - testBooleanFieldFromString: seeds+deletes a doc; DELETE unsupported (DOC_MUTATION). CalcitePPLBasicIT.testRegexpFilter: REGEXP filter throws a backend NullPointerException on the AE route -> new REGEXP_FILTER. v2/Calcite route unchanged (all run, 0 skips). Signed-off-by: Kai Huang <ahkcs@amazon.com>
ahkcs
added a commit
that referenced
this pull request
Jun 18, 2026
…ype/basic) (#5562) Analytics-engine route parity for four PPL IT classes; test-only. Uses the @RequiresCapability annotation + Capability registry (#5560) plus matching excludeTestsMatching entries. CalciteArrayFunctionIT: - Skip the array-index load on the AE route (multi-value 'numbers' field the parquet store rejects); no test queries it (all build arrays inline). - 16 higher-order lambda functions (transform/mvmap, reduce, filter, exists, forall) -> new ARRAY_HIGHER_ORDER_FUNC (no DataFusion lambda execution). CalcitePPLMapPathIT: - mvcombine lowers to ARRAY_AGG, unregistered on the analytics backend -> new MVCOMBINE_ARRAY_AGG. - addtotals crashes the DataFusion backend with a join panic -> new ADDTOTALS_JOIN_PANIC. CalciteDataTypeIT (guards on base DataTypeIT; build.gradle globs broadened to '*' so they cover the Calcite subclass): - test_nonnumeric_data_types / test_alias_data_type: nested/object/geo/alias types stripped (NESTED_FIELDS). - test_numeric_data_types: scaled_float reported as bigint not double -> new SCALED_FLOAT_TYPE. - testNumericFieldFromString: empty-string -> numeric coerces to null not 0 -> new STRING_TO_NUMERIC_COERCION. - testBooleanFieldFromNumberAcrossWildcardIndices: cross-index incompatible field types rejected -> new CROSS_INDEX_INCOMPATIBLE_TYPES. - testBooleanFieldFromString: seeds+deletes a doc; DELETE unsupported (DOC_MUTATION). CalcitePPLBasicIT.testRegexpFilter: REGEXP filter throws a backend NullPointerException on the AE route -> new REGEXP_FILTER. v2/Calcite route unchanged (all run, 0 skips). Signed-off-by: Kai Huang <ahkcs@amazon.com>
This was referenced Jun 18, 2026
ahkcs
added a commit
that referenced
this pull request
Jun 24, 2026
…5582) Gate the streamstats ITs that diverge on the analytics-engine route (parquet-backed composite store, DataFusion backend) so the route runs green, while keeping every test active on the v2/Calcite path. Mechanism follows the established capability-gating pattern (#5560): an in-test @RequiresCapability(...) annotation plus a matching integTestRemote excludeTestsMatching entry; both are no-ops on the v2 route. Triaged single-shard and multi-shard (num_shards=3) analytics runs against the v2 baseline. Three groups: - DOC_MUTATION (4 tests): testStreamstatsGlobalWithNull, testStreamstatsGlobalWithNullBucket, testStreamstatsResetWithNull, testStreamstatsResetWithNullBucket seed state via PUT+DELETE. Doc-level DELETE is unsupported on the parquet store, and same-_id PUT is append-only, so the leaked doc inflated the row counts of every sibling test that reads the shared index. Gated with the same DOC_MUTATION capability the three existing mutation tests already carry. - CHAINED_STREAMSTATS_BY (4 tests): chaining two streamstats where an upstream stage partitions `by` a group emits a ROW_NUMBER() sequence column from each stage; the Substrait converter names both physical columns identically, so the stacked schema has a duplicate/ambiguous field name (500) or, for chained window streamstats, non-deterministic values. The Calcite logical plan is correct; the alias is lost in Substrait conversion. Fails single- and multi-shard. - STREAMSTATS_SORT_NOT_HONORED (1 test): testStreamstatsAndSort. The window is computed over the backend scan order, ignoring a preceding `| sort` (the OVER clause carries no explicit ORDER BY), so the per-row aggregates diverge from the v2/Calcite path. Pass rate on the single-shard analytics route, CalciteStreamstatsCommandIT: | metric | before | after | |-----------|--------|-------| | tests run | 47 | 42 | | failures | 12 | 0 | | skipped | 3 | 7 | (The before-failures count is inflated by the DOC_MUTATION leak described above; the four leaking tests plus their downstream row-count victims all clear once gated.) v2 route (:integTest): 47 run, 0 failed, 0 skipped — gates are no-ops off the analytics route. Twelve further tests fail only on the multi-shard route (they pass single-shard) due to cross-shard fragment-order non-determinism in the streamstats window gather; that is an engine-side gap and is left unchanged here rather than gated, to avoid skipping passing tests on the single-shard route. Signed-off-by: Kai Huang <ahkcs@amazon.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
Make analytics-engine route test gating backend-agnostic and declarative, replacing the imperative, backend-named
assumeNotAnalytics(...)calls. Test-only change.AnalyticsRouteLimitationto aCapabilityregistry and route skip decisions throughBackendCapabilities, so tests declare the capability they need instead of naming a backend.@RequiresCapability(method/class), enforced byCapabilityRule, and convert all existing call sites to it — keeping test bodies clean.Related Issues
Part of #5246
Check List
--signoffor-s.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.