[BugFix] Schedule PPLSimplifyDedupRule before FilterMergeRule in HEP (#5)#6
Closed
ryan-gh-bot wants to merge 1 commit into
Closed
[BugFix] Schedule PPLSimplifyDedupRule before FilterMergeRule in HEP (#5)#6ryan-gh-bot wants to merge 1 commit into
ryan-gh-bot wants to merge 1 commit into
Conversation
…pensearch-project#5) When a user where clause sits between the dedup analyzer's bucket-non-null filter and the table scan, FilterMergeRule could fire first under MatchOrder.ARBITRARY in the same rule collection and merge the filters; the merged AND condition no longer satisfied PlanUtils.mayBeFilterFromBucketNonNull, PPLSimplifyDedupRule never matched, no LogicalDedup was produced, and DedupPushdownRule could not push the dedup down as composite + top_hits. Split the HEP program into two ordered instructions so PPLSimplifyDedupRule runs to fixpoint against the original adjacent-filter shape before FilterMergeRule runs. Signed-off-by: Jialiang Liang <jiallian@amazon.com>
1 task
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
PPL
dedupis initially compiled into aROW_NUMBER() OVER (PARTITION BY field)window pattern with anIS_NOT_NULL(field)bucket-non-null filter directly above the scan.PPLSimplifyDedupRuleis supposed to fold that pattern into aLogicalDedup, whichDedupPushdownRulethen pushes to the scan as acomposite+top_hitsaggregation.In the previous HEP program (
CalciteToolsHelper.java),FilterMergeRuleandPPLSimplifyDedupRulewere both registered through a singleaddRuleCollection(...)call, which usesMatchOrder.ARBITRARY. With this scheduling the order in which the two rules fire is non-deterministic. When a userwhereclause sits between the bucket-non-null filter and the scan, those two filters are adjacent above the scan; ifFilterMergeRulehappens to fire first, it folds them intoAND(IS_NOT_NULL(field), <user predicate>). That merged condition no longer satisfiesPlanUtils.mayBeFilterFromBucketNonNull(which only accepts a pureIS_NOT_NULLor anAND-of-IS_NOT_NULLs),PPLSimplifyDedupRule's operand never matches, noLogicalDedupis produced, andDedupPushdownRulehas nothing to match - so the dedup falls through to a coordinator-sideROW_NUMBERwindow. On large indices that path can exceed timeouts or trigger shard failures.The fix splits the HEP program into two ordered instructions so that
PPLSimplifyDedupRuleruns to fixpoint against the original adjacent-filter shape before any merging happens, and thenFilterMergeRuleruns. The remaining userwherefilter ends up below the newLogicalDedup, where the existing VolcanoFilterIndexScanRulepushes it into the scan, andDedupPushdownRulethen pushes the dedup down ascomposite+top_hits.The fix touches one file in production (
core/src/main/java/org/opensearch/sql/calcite/utils/CalciteToolsHelper.java); a regression test is added inppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLDedupTest.javatogether with a small helper inCalcitePPLAbstractTestthat returns the analyzer plan without the test-only auxiliaryFilterMergeRulepass, so the production HEP can be exercised against the un-merged shape.Verification
Before - the failing post-HEP plan shape captured by the bug report (
EXPLAINagainst a real cluster). With any userwhereupstream ofdedup, the scan emits_sourceand dedup runs in memory:The newly added unit test
CalcitePPLDedupTest.testDedupAfterWhereSimplifiesToLogicalDedupasserts the desired shape by running the productionCalciteToolsHelper.optimizeHEP on the analyzer plan forsource=EMP | where DEPTNO > 10 | dedup 1 DEPTNO | fields DEPTNO. It checks that the analyzer plan still has the un-merged adjacent filters (the input the simplify rule expects) and that the post-HEP plan deterministically contains aLogicalDedup, retains the user predicate, and no longer contains aROW_NUMBERwindow.After - run on this branch:
$ JAVA_HOME=/usr/lib/jvm/java-21-amazon-corretto ./gradlew :ppl:test --tests \ "org.opensearch.sql.ppl.calcite.CalcitePPLDedupTest" ... CalcitePPLDedupTest > testDedupAfterWhereSimplifiesToLogicalDedup PASSED CalcitePPLDedupTest > testDedup1 PASSED CalcitePPLDedupTest > testDedup2 PASSED CalcitePPLDedupTest > testDedupExpr PASSED CalcitePPLDedupTest > testDedupKeepEmpty1 PASSED CalcitePPLDedupTest > testDedupKeepEmpty2 PASSED CalcitePPLDedupTest > testRenameDedup PASSED CalcitePPLDedupTest > testSortFieldProjectedAwayBeforeDedup PASSED CalcitePPLDedupTest > testSortThenDedup PASSED CalcitePPLDedupTest > testSortThenDedupWithEval PASSED BUILD SUCCESSFULBroader regression coverage - the full
:ppl:test,:core:test,:opensearch:testand:sql:testsuites pass with no failures on this branch.spotlessCheckis also clean.The bug report's "actual"
EXPLAINoutput (filter merged intoAND(>(@ts,...), IS NOT NULL(...)),EnumerableWindowover the scan, no aggregation in DSL) requires a live cluster with the reporter's index template and dataset to reproduce verbatim and is not exercised by the unit-test harness. The fix targets the documented root cause - non-deterministic HEP rule ordering betweenFilterMergeRuleandPPLSimplifyDedupRule- and the new unit test guards the deterministic post-HEP behavior we want regardless of the underlying scheduler.Note on test scoping: in synthetic plan shapes built by
CalcitePPLAbstractTest, the unfixedaddRuleCollectionordering happens to firePPLSimplifyDedupRulefirst under the current Calcite vertex iteration order, so the new test alone does not flip from FAIL to PASS on those shapes. The fix is still required to make the ordering deterministic and to address the documented production failure mode; the new test serves as a deterministic guard against regressions in the post-HEP behavior.Related Issues
Resolves #5
Check List
--signoffor-s.This pull request was authored by @ryan-gh-bot in response to the maintainer command
/fixon #5 (invoked by @RyanL1997). This is a maintainer-triggered automated contribution. Please review carefully before merging.