Stabilize subquery PPL ITs on the analytics-engine route by ahkcs · Pull Request #5555 · opensearch-project/sql

ahkcs · 2026-06-16T18:35:38Z

Description

Brings CalcitePPLScalarSubqueryIT and CalcitePPLInSubqueryIT to parity on the analytics-engine route (-Dtests.analytics.parquet_indices=true), where every test-created index is composite/parquet-backed and PPL queries route through the analytics engine (DataFusion). Both ITs already pass on the v2/Calcite route; this change is test-only.

Root cause 1 — append-only accumulation (fixes 19 of 23 failures)

Both ITs seed an extra worker doc (_id 7, "Tommy") via an unconditional raw PUT in init(). init() runs as @Before, before every test method, and preserveClusterUponCompletion() keeps indices across methods. On the analytics-engine parquet-backed store a PUT with an existing _id appends a new row instead of replacing, so Tommy accumulated one duplicate per test method and inflated row counts across the suite (e.g. a 7-row expectation returned 20 = 6 bulk docs + 14 duplicate Tommys).

The bulk loadIndex(...) calls don't have this problem because they're already isIndexExist-guarded. Fix: capture isIndexExist(...) before loadIndex creates the index, then seed Tommy only on first creation. End state is identical on the v2/Calcite route.

Root cause 2 — two unsupported behaviors on the route (the remaining 4 failures)

These are genuine analytics-engine behaviors, not test bugs (each verified directly against the route). They're skipped via the assumeNotAnalytics(...) registry (AnalyticsRouteLimitation) introduced in #5551, plus matching excludeTestsMatching entries in integTestRemote so the skip set stays countable in one place. Two new AnalyticsRouteLimitation constants:

TEXT_FIELD_EXACT_MATCH — exact equality (= / ==) on an explicitly text-mapped field (no .keyword sub-field) returns no rows on the DataFusion scan. The sibling of the existing DYNAMIC_STRING_NO_KEYWORD for fields mapped text on purpose rather than by dynamic mapping. Verified directly: where department = 'DATA' returns no rows, while like(department, 'DATA') and keyword-field equality both work. Covers testTwoUncorrelatedScalarSubqueriesInOr, testInSubqueryWithTableAlias, testInCorrelatedSubquery (each filters department/occupation with = in the subsearch).
SUBSEARCH_MAXOUT_IN_SUBQUERY — the subsearch.maxout cap is lowered as a LIMIT on the right-hand side of the IN-subquery semi-join (LogicalSystemLimit(fetch=N, type=SUBSEARCH_MAXOUT)), which the route does not honor, so the subsearch returns all rows regardless of the cap. Verified: with subsearch.maxout=1 an id in [...] subquery still returns every matching row. Covers testSubsearchMaxOut.

Note on the text-equality root cause: an earlier revision of this PR described it as "text fields can't do exact match", which was too coarse — testSelfInSubquery filters a text field (country = 'USA') inside a self-index subquery and passes. Re-verified directly against the cluster: standalone text = returns no rows for every text field, keyword = works, and like() works; the three skipped tests all hinge on a text = filter in a cross-index subsearch. The gap is the analyzed-text exact-match scan limitation (same family as DYNAMIC_STRING_NO_KEYWORD), now captured precisely in TEXT_FIELD_EXACT_MATCH.

Results

Run via :integ-test:integTestRemote against an analytics :run cluster.

Test class	Analytics route — before	Analytics route — after	v2/Calcite route
`CalcitePPLScalarSubqueryIT`	2/14 pass, 12 fail	13/13 run, 0 fail (1 excluded)	14/14 pass
`CalcitePPLInSubqueryIT`	7/18 pass, 11 fail	14/15 run, 0 fail (3 excluded; 1 pre-existing `@Ignore`)	17/18 pass (1 `@Ignore`)

On the analytics route the skipped tests are removed by excludeTestsMatching so they don't run; each also carries an in-test assumeNotAnalytics(...) as a safety net. The v2/Calcite route runs all of them (the gradle excludes apply only to integTestRemote, and assumeNotAnalytics is a no-op when the analytics flag is off).

Check List

New functionality includes testing.
New functionality has been documented.
Commits are signed per the DCO using --signoff.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@ignore

CalcitePPLScalarSubqueryIT and CalcitePPLInSubqueryIT both seed an extra worker doc via an unconditional raw PUT in init(). init() runs as @before before every test method, and the analytics-engine parquet-backed store is append-only on same-_id PUT, so the doc accumulated a duplicate per method and inflated row counts across the suite. Guard the seed on a pre-loadIndex isIndexExist check so it runs exactly once; behavior is unchanged on the v2/Calcite route (same end state). Skip the four tests that exercise behaviors the analytics-engine route does not support, using the assumeNotAnalytics(...) registry (AnalyticsRouteLimitation) plus matching excludeTestsMatching entries in integTestRemote so the skip set stays countable in one place. Two new AnalyticsRouteLimitation constants: - TEXT_FIELD_EXACT_MATCH: exact = / == on an explicitly text-mapped field (no .keyword sub-field) returns no rows on the DataFusion scan. Sibling of DYNAMIC_STRING_NO_KEYWORD for explicitly-mapped (not dynamic) text fields. Covers testTwoUncorrelatedScalarSubqueriesInOr, testInSubqueryWithTableAlias, testInCorrelatedSubquery (each filters department/occupation = '...'). - SUBSEARCH_MAXOUT_IN_SUBQUERY: the subsearch.maxout cap is lowered as a LIMIT on the in-subquery semi-join's right side, which the route does not honor, so the subsearch returns all rows. Covers testSubsearchMaxOut. Results (-Dtests.analytics.parquet_indices=true against the analytics route): CalcitePPLScalarSubqueryIT: 2/14 -> 13/13 run, 0 fail (1 excluded) CalcitePPLInSubqueryIT: 7/18 -> 14/15 run, 0 fail (3 excluded; 1 @ignore) v2/Calcite route unchanged: 14/14 and 17/17 (1 pre-existing @ignore). Signed-off-by: Kai Huang <ahkcs@amazon.com>

Rename AnalyticsRouteLimitation to Capability and add a declarative @RequiresCapability annotation enforced by CapabilityRule (JUnit4) on SQLIntegTestCase, so both SQL and PPL ITs can gate by capability. Replace all assumeNotAnalytics(...) calls with method-level @RequiresCapability (including the cases added by opensearch-project#5555 and opensearch-project#5556). Skip stays AE-route-only. Verified: ./gradlew :integ-test:compileTestJava and :integ-test:spotlessCheck pass. Signed-off-by: Chen Dai <daichen@amazon.com>

Rename AnalyticsRouteLimitation to Capability and add a declarative @RequiresCapability annotation enforced by CapabilityRule (JUnit4), wired into SQLIntegTestCase so SQL and PPL ITs gate by capability. Gate logic lives in BackendCapabilities (skips on the analytics-engine route, which supports none of the defined capabilities today). Replace all assumeNotAnalytics(...) with @RequiresCapability (incl. opensearch-project#5555/opensearch-project#5556). Verified: ./gradlew :integ-test:compileTestJava and :integ-test:spotlessCheck pass. Signed-off-by: Chen Dai <daichen@amazon.com>

ahkcs requested review from LantaoJin, RyanL1997, Swiddis, acarbonetto, anirudha, dai-chen, joshuali925, mengweieric, noCharger, penghuo, ps48, qianheng-aws, songkant-aws, vamsimanohar, ykmr1224 and yuancu as code owners June 16, 2026 18:35

ahkcs added the infrastructure Changes to infrastructure, testing, CI/CD, pipelines, etc. label Jun 16, 2026

mengweieric approved these changes Jun 16, 2026

View reviewed changes

ahkcs force-pushed the fix-subquery-analytics-accumulation branch from 8ee2fc4 to c34ffd4 Compare June 16, 2026 19:26

ahkcs merged commit 3829dfa into opensearch-project:main Jun 16, 2026
29 of 33 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Stabilize subquery PPL ITs on the analytics-engine route#5555

Stabilize subquery PPL ITs on the analytics-engine route#5555
ahkcs merged 1 commit into
opensearch-project:mainfrom
ahkcs:fix-subquery-analytics-accumulation

ahkcs commented Jun 16, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

ahkcs commented Jun 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Root cause 1 — append-only accumulation (fixes 19 of 23 failures)

Root cause 2 — two unsupported behaviors on the route (the remaining 4 failures)

Results

Check List

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

ahkcs commented Jun 16, 2026 •

edited

Loading