Stabilize PPL ITs on the analytics-engine route (case/string/full-text/like/appendpipe/datatype) #5561
Open
ahkcs wants to merge 1 commit into
…t/like/appendpipe)

Analytics-engine route parity for several PPL IT classes; test-only. Uses the @RequiresCapability annotation + Capability registry (opensearch-project#5560) plus matching excludeTestsMatching entries.

CalcitePPLCaseFunctionIT:
- Guard the weblogs raw-PUT seeding (appendDataForBadResponse) on a pre-load isIndexExist check; the append-only AE store inflated counts per method.
- Skip the otel_logs load on the AE route (multi-value keyword the parquet store rejects); only testNestedCaseAggWithAutoDateHistogram uses it, and that test requires BIN_TIME_FIELD_BUCKETING (bucket column typed string).

CalcitePPLStringBuiltinFunctionIT: 7 tests re-PUT a shared _id with different data; the append-only AE store can't replace docs (DELETE unsupported) -> DOC_MUTATION.

MultiMatchIT / QueryStringIT / SimpleQueryStringIT wildcard tests: full-text relevance functions with no DataFusion equivalent -> new FULLTEXT_RELEVANCE_FUNC.

CalciteLikeQueryIT.test_the_default_3rd_option: AE LIKE is case-insensitive but v2/Calcite is case-sensitive -> new LIKE_CASE_SENSITIVITY.

CalcitePPLAppendPipeCommandIT.testDoubleAppendPipeWithFilter: appendpipe drops the main pipeline's rows on the AE route -> new APPENDPIPE_MAIN_RESULT_DROPPED.

v2/Calcite route unchanged (all run, 0 skips).

Signed-off-by: Kai Huang <ahkcs@amazon.com>
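To make the LIKE_CASE_SENSITIVITY mismatch concrete, here is a minimal Java sketch, not the engine's actual implementation. It translates a LIKE pattern into a regex and counts matches in both modes. The class name, helper names, and sample rows are all hypothetical, and LIKE escape characters are deliberately not handled.

```java
import java.util.List;
import java.util.regex.Pattern;

public class LikeCaseSketch {
    /** Translate a SQL LIKE pattern to a regex: % matches any run, _ matches one char. */
    static Pattern likeToRegex(String like, boolean caseSensitive) {
        StringBuilder regex = new StringBuilder();
        for (char c : like.toCharArray()) {
            if (c == '%') {
                regex.append(".*");
            } else if (c == '_') {
                regex.append('.');
            } else {
                // Quote every literal so regex metacharacters in the data pattern stay literal.
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        int flags = Pattern.DOTALL | (caseSensitive ? 0 : Pattern.CASE_INSENSITIVE);
        return Pattern.compile(regex.toString(), flags);
    }

    /** Count rows that match the LIKE pattern as a whole string. */
    static long countMatches(List<String> rows, String like, boolean caseSensitive) {
        Pattern p = likeToRegex(like, caseSensitive);
        return rows.stream().filter(r -> p.matcher(r).matches()).count();
    }

    public static void main(String[] args) {
        // Hypothetical lowercase rows standing in for the test index.
        List<String> rows = List.of("test wildcard", "test wildcard in the end", "other text");
        System.out.println(countMatches(rows, "test Wildcard%", true));  // case-sensitive route
        System.out.println(countMatches(rows, "test Wildcard%", false)); // case-insensitive route
    }
}
```

With these sample rows the case-sensitive call matches nothing, while the case-insensitive call matches the two rows that start with "test wildcard", which is the shape of the divergence the skipped test would hit.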
Description
Analytics-engine route (`-Dtests.analytics.parquet_indices=true`) parity pass across several PPL IT classes. All changes are test-only; the v2/Calcite route is unchanged: every gated test still runs there, because the `assumeNotAnalytics(...)` guards are no-ops off-route and the gradle excludes apply only to `integTestRemote`.

CalcitePPLCaseFunctionIT (3/9 → 8/9 pass, 1 excluded)

- `appendDataForBadResponse()` raw-PUTs 4 weblogs docs unconditionally in `init()`; the append-only AE store can't replace on same-`_id` PUT, so they accumulated per method. Guarded on a pre-`loadIndex` `isIndexExist` check.
- The `otel_logs` load includes an `attributes.email.invalid_recipients` array the parquet store rejects (`Cannot accept multiple values for field ... of type keyword`), aborting `init()`. Only `testNestedCaseAggWithAutoDateHistogram` uses it, so the load is skipped on the AE route.
- `BIN_TIME_FIELD_BUCKETING` (existing): `bin @timestamp | stats by @timestamp` returns the bucket column typed `string`, not `timestamp`. Skips `testNestedCaseAggWithAutoDateHistogram`.

CalcitePPLStringBuiltinFunctionIT (7 tests skipped, `DOC_MUTATION`)

`testConcatWithField`/`ConcatWs`/`Reverse`/`Right`/`Trim`/`RTrim`/`LTrim` re-PUT a shared `_id` (5/6/7) with different data, relying on PUT-replace. On the AE store a same-`_id` PUT appends and DELETE is unsupported, so these accumulate and cross-contaminate (counts are order-dependent). Skipped via the existing `DOC_MUTATION` limitation.

Full-text relevance functions (3 tests skipped, new `FULLTEXT_RELEVANCE_FUNC`)

`MultiMatchIT.test_wildcard_multi_match`, `QueryStringIT.wildcard_test`, and `SimpleQueryStringIT.test_wildcard_simple_query_string` use `multi_match`/`query_string`/`simple_query_string`, which are Lucene relevance functions with no DataFusion equivalent (they return no rows).

CalciteLikeQueryIT.test_the_default_3rd_option (skipped, new `LIKE_CASE_SENSITIVITY`)

The v3 branch expects case-sensitive `LIKE` (0 rows for `'test Wildcard%'` vs lowercase data); the AE route's `LIKE` is case-insensitive (DataFusion) and returns 7.

CalcitePPLAppendPipeCommandIT.testDoubleAppendPipeWithFilter (skipped, new `APPENDPIPE_MAIN_RESULT_DROPPED`)

`appendpipe [subpipe]` drops the main pipeline's rows on the AE route: the subpipe's filter is applied to the main result instead of being appended, so the originals are lost (verified: `stats ... | appendpipe [where gender='F']` returns only the F rows, not the originals plus the filtered copy).

DataTypeIT exclude globs (coverage fix)

`test_nonnumeric_data_types`, `test_alias_data_type`, and `SystemFunctionIT.typeof_opensearch_types` were excluded with `org.opensearch.sql.ppl.*` globs that did not match the Calcite subclasses. Broadened to `*` so `CalciteDataTypeIT`/`CalciteSystemFunctionIT` are covered too.

Results (analytics route, per-listed-test)
v2/Calcite route: all edited classes 100% pass, 0 skips.
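
The count inflation fixed in CalcitePPLCaseFunctionIT can be illustrated with a small self-contained Java sketch. It is a model of the idea only, under stated assumptions: `AppendOnlyStore`, `seedUnguarded`, `seedGuarded`, and the document bodies are hypothetical stand-ins, not the test framework's API, and here the seeding itself creates the index, whereas the real guard checks existence before the index load.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SeedGuardSketch {
    /** Hypothetical append-only store: a same-_id PUT adds a row instead of replacing it. */
    static final class AppendOnlyStore {
        private final Map<String, List<String[]>> indices = new HashMap<>();

        boolean indexExists(String index) {
            return indices.containsKey(index);
        }

        void put(String index, String id, String body) {
            indices.computeIfAbsent(index, k -> new ArrayList<>()).add(new String[] {id, body});
        }

        int count(String index) {
            return indices.getOrDefault(index, List.of()).size();
        }
    }

    /** Unguarded seeding: PUTs 4 docs every time it is called. */
    static void seedUnguarded(AppendOnlyStore store) {
        for (int id = 1; id <= 4; id++) {
            store.put("weblogs", String.valueOf(id), "{\"placeholder\":true}"); // hypothetical body
        }
    }

    /** Guarded seeding: record existence first, then seed only on the first init. */
    static void seedGuarded(AppendOnlyStore store) {
        boolean existedBeforeLoad = store.indexExists("weblogs");
        if (!existedBeforeLoad) {
            seedUnguarded(store);
        }
    }

    /** Simulate init() running once per test method and return the final doc count. */
    static int countAfterInits(boolean guarded, int inits) {
        AppendOnlyStore store = new AppendOnlyStore();
        for (int i = 0; i < inits; i++) {
            if (guarded) {
                seedGuarded(store);
            } else {
                seedUnguarded(store);
            }
        }
        return store.count("weblogs");
    }

    public static void main(String[] args) {
        System.out.println(countAfterInits(false, 3)); // 12: rows accumulate per init
        System.out.println(countAfterInits(true, 3));  // 4: seeded once
    }
}
```

On a store with PUT-replace semantics both variants would settle at 4 docs, which is why the unguarded seeding was harmless on the v2/Calcite route.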
Check List
--signoff.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.