[BugFix] Fix multiple appendpipe error while revisiting parent AST by songkant-aws · Pull Request #5322 · opensearch-project/sql

songkant-aws · 2026-04-08T02:12:55Z

Description

Chained appendpipe queries can produce literal-only projections during prepare-time field trimming. Calcite may simplify those projections into Values, which can trigger planner mismatch assertions during execution. Preserve the Project shape for this narrow case in OpenSearchRelFieldTrimmer.

Related Issues

Resolves #5173

Check List

New functionality includes testing.
New functionality has been documented.
New functionality has javadoc added.
New functionality has a user manual doc added.
New PPL command checklist all confirmed.
API changes companion pull request created.
Commits are signed per the DCO using --signoff or -s.
Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

github-actions · 2026-04-08T02:14:05Z

PR Reviewer Guide 🔍

(Review updated until commit `9124702`)

Here are some key observations to aid the review process:

🧪 PR contains tests
🔒 No security concerns identified
✅ No TODO sections
🔀 Multiple PR themes Sub-PR theme: Fix planner mismatch by overriding trimUnusedFields and preserving hint strategies Relevant files: core/src/main/java/org/opensearch/sql/calcite/utils/CalciteToolsHelper.java Sub-PR theme: Add regression tests for multiple appendpipe scenarios (issue Relevant files: integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLAppendPipeCommandIT.java ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLAppendPipeTest.java integ-test/src/yamlRestTest/resources/rest-api-spec/test/issues/5173.yml
⚡ Recommended focus areas for review Cluster Reuse Risk In the overridden `trimUnusedFields`, the converter is constructed using `root.rel.getCluster()` to reuse the incoming RelNode's cluster. While the comment explains the rationale, this could cause subtle issues if the cluster's planner state is mutated during trimming, potentially affecting the original RelNode tree. This should be carefully validated to ensure no side effects on shared cluster state. protected RelRoot trimUnusedFields(RelRoot root) { final SqlToRelConverter.Config config = SqlToRelConverter.config() .withTrimUnusedFields(shouldTrim(root.rel)) .withExpand(THREAD_EXPAND.get()) .withInSubQueryThreshold(requireNonNull(THREAD_INSUBQUERY_THRESHOLD.get())); // PPL analyzes into a pre-built RelNode before prepareStatement(rel). Reuse the incoming // RelNode's cluster here so prepare-time trimming does not create replacement nodes under a // different planner than the rest of the tree. final SqlToRelConverter converter = new OpenSearchSqlToRelConverter( this, getSqlValidator(), catalogReader, root.rel.getCluster(), convertletTable, config); final boolean ordered = !root.collation.getFieldCollations().isEmpty(); final boolean dml = SqlKind.DML.contains(root.kind); return root.withRel(converter.trimUnusedFields(dml \|\| ordered, root.rel)); } Copied Logic The `shouldTrim` method and the `trimUnusedFields` override appear to be copied/adapted from Calcite's internal `CalcitePrepareImpl`. The comment about "3 joins" threshold is a heuristic from Calcite internals. If Calcite upgrades change this logic, the copied version here could diverge silently. Consider referencing or delegating to the parent implementation where possible. private static boolean shouldTrim(RelNode rootRel) { // For now, don't trim if there are more than 3 joins. The projects // near the leaves created by trim migrate past joins and seem to // prevent join-reordering. return THREAD_TRIM.get() \|\| RelOptUtil.countJoins(rootRel) < 2; } Unused Parameter The private constructor `OpenSearchSqlToRelConverter` has a parameter named `ignored` (boolean) which is only used to differentiate the constructor signature. This is a code smell — consider using a different pattern (e.g., a factory method or a named inner builder) to avoid confusion and potential misuse. private OpenSearchSqlToRelConverter( ViewExpander viewExpander, @Nullable SqlValidator validator, CatalogReader catalogReader, RelOptCluster cluster, SqlRexConvertletTable convertletTable, Config effectiveConfig, boolean ignored) { super(viewExpander, validator, catalogReader, cluster, convertletTable, effectiveConfig); Hardcoded Expected Values The integration test `testDoubleAppendPipeWithFilter` expects specific row results including `rows(15224, "M")` appearing twice, but the comment says "1 (M filter from cumulative 3 rows)". This logic depends on the order of appendpipe evaluation and the cumulative nature of the pipe. If the behavior changes or data ordering is non-deterministic, the test may become flaky. public void testDoubleAppendPipeWithFilter() throws IOException { JSONObject actual = executeQuery( String.format( Locale.ROOT, "source=%s \| stats sum(age) as sum_age by gender" + " \| appendpipe [ where gender = 'F' ]" + " \| appendpipe [ where gender = 'M' ]", TEST_INDEX_ACCOUNT)); // 2 original + 1 (F filter from original) + 1 (M filter from cumulative 3 rows) verifyDataRows(actual, rows(14947, "F"), rows(15224, "M"), rows(14947, "F"), rows(15224, "M")); }

github-actions · 2026-04-08T02:14:28Z

PR Code Suggestions ✨

Latest suggestions up to 9124702

Explore these optional code suggestions:

Category	Suggestion	Impact
Possible issue	Fix join-count threshold mismatch with comment The comment says "more than 3 joins" but the condition `< 2` means trimming is skipped when there are 2 or more joins, which contradicts the comment. The threshold should be `< 4` (or the comment should say "more than 1 join") to match the stated intent. core/src/main/java/org/opensearch/sql/calcite/utils/CalciteToolsHelper.java [396-401] private static boolean shouldTrim(RelNode rootRel) { // For now, don't trim if there are more than 3 joins. The projects // near the leaves created by trim migrate past joins and seem to // prevent join-reordering. - return THREAD_TRIM.get() \|\| RelOptUtil.countJoins(rootRel) < 2; + return THREAD_TRIM.get() \|\| RelOptUtil.countJoins(rootRel) < 4; } Suggestion importance[1-10]: 6 __ Why: The comment says "more than 3 joins" but `< 2` skips trimming for 2+ joins, which contradicts the stated intent. Changing to `< 4` would align the code with the comment, though the actual desired threshold may depend on the original Calcite source this was adapted from.	Low
General	Replace meaningless boolean marker with typed enum The `ignored` boolean parameter is a code smell — it exists solely to differentiate the private constructor from the public one, but it carries no semantic meaning and can confuse future maintainers. Consider using a private marker enum or a named factory method instead to make the intent explicit. core/src/main/java/org/opensearch/sql/calcite/utils/CalciteToolsHelper.java [424-431] +private enum ConstructorMarker { INSTANCE } + private OpenSearchSqlToRelConverter( ViewExpander viewExpander, @Nullable SqlValidator validator, CatalogReader catalogReader, RelOptCluster cluster, SqlRexConvertletTable convertletTable, Config effectiveConfig, - boolean ignored) { + ConstructorMarker ignored) { Suggestion importance[1-10]: 3 __ Why: The `ignored` boolean parameter is a minor code smell, but using a private enum marker is a style improvement with low functional impact. The suggestion is valid but offers only marginal maintainability benefit.	Low
General	Verify correctness of expected test data rows The comment says the second appendpipe filters `gender = 'M'` from the cumulative 3 rows (original 2 + 1 F-filtered), so the expected result should be 3 rows (original 2 + 1 F-filtered row) plus 2 M-filtered rows from the 3-row set, totalling 5 rows. The current assertion of 4 rows may be incorrect — verify the expected behavior and row count carefully against the actual query semantics. integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLAppendPipeCommandIT.java [158] +// Verify expected row count and content match actual appendpipe semantics verifyDataRows(actual, rows(14947, "F"), rows(15224, "M"), rows(14947, "F"), rows(15224, "M")); Suggestion importance[1-10]: 2 __ Why: The suggestion asks to verify the row count but the `improved_code` is identical to the `existing_code` (just adds a comment), providing no actual fix. The concern about row count may be valid but the suggestion doesn't provide a concrete correction.	Low

Previous suggestions

Suggestions up to commit 0c3a695

Category	Suggestion	Impact
Possible issue	Fix join count threshold mismatch with comment The comment says "more than 3 joins" but the condition `< 2` means trimming is skipped when there are 2 or more joins, which is inconsistent with the stated threshold of 3. The condition should be `< 3` (or `<= 3`) to match the comment's intent of skipping trim only when there are more than 3 joins. core/src/main/java/org/opensearch/sql/calcite/utils/CalciteToolsHelper.java [395-400] private static boolean shouldTrim(RelNode rootRel) { // For now, don't trim if there are more than 3 joins. The projects // near the leaves created by trim migrate past joins and seem to // prevent join-reordering. - return THREAD_TRIM.get() \|\| RelOptUtil.countJoins(rootRel) < 2; + return THREAD_TRIM.get() \|\| RelOptUtil.countJoins(rootRel) < 3; } Suggestion importance[1-10]: 6 __ Why: The comment says "more than 3 joins" but the condition `< 2` would skip trimming for 2+ joins, not 3+. The condition should be `< 3` to match the stated intent. This is a real inconsistency between the comment and the code logic.	Low
General	Guard against null thread-local value `THREAD_EXPAND.get()` and `THREAD_INSUBQUERY_THRESHOLD.get()` may return `null` if the thread-local values have not been initialized, which would cause a `NullPointerException`. `THREAD_INSUBQUERY_THRESHOLD.get()` is already guarded with `requireNonNull`, but `THREAD_EXPAND.get()` is not. Consider adding a null check or default value for `THREAD_EXPAND.get()` as well. core/src/main/java/org/opensearch/sql/calcite/utils/CalciteToolsHelper.java [372-379] @Override protected RelRoot trimUnusedFields(RelRoot root) { final SqlToRelConverter.Config config = SqlToRelConverter.config() .withTrimUnusedFields(shouldTrim(root.rel)) - .withExpand(THREAD_EXPAND.get()) + .withExpand(requireNonNull(THREAD_EXPAND.get())) .withInSubQueryThreshold(requireNonNull(THREAD_INSUBQUERY_THRESHOLD.get())); Suggestion importance[1-10]: 4 __ Why: Adding `requireNonNull` to `THREAD_EXPAND.get()` is a defensive improvement, but since the parent class's `trimUnusedFields` also uses `THREAD_EXPAND` without null-checking, this is likely already handled by the initialization pattern of these thread-locals elsewhere in the codebase.	Low

Suggestions up to commit e07ed44

Category	Suggestion	Impact
General	Handle wrapped inputs in fixed row count check The `isFixedRowCount` method only checks direct `Values` or `Aggregate` inputs, but does not handle cases where the input is a `Project` wrapping a `Values` or `Aggregate`. This could cause `shouldAvoidSimplifyValues` to return `false` when it should return `true`, potentially leading to incorrect simplification behavior. Consider recursively checking through `Project` nodes to reach the underlying `Values` or `Aggregate`. core/src/main/java/org/opensearch/sql/calcite/utils/OpenSearchRelFieldTrimmer.java [170-179] private boolean isFixedRowCount(RelNode input) { if (input instanceof Values) { return true; } if (input instanceof Aggregate aggregate) { return aggregate.getGroupSet().isEmpty() && aggregate.getGroupType() == Aggregate.Group.SIMPLE; } + if (input instanceof Project project) { + return isFixedRowCount(project.getInput()); + } return false; } Suggestion importance[1-10]: 5 __ Why: The suggestion is logically sound — `isFixedRowCount` could miss cases where a `Project` wraps a `Values` or `Aggregate`. However, the method is specifically designed to guard against Calcite's `simplifyValues` optimization in the context of field trimming, and the current cases (`Values` and groupless `Aggregate`) cover the known problematic patterns. The improvement is reasonable but addresses an edge case not demonstrated to cause issues.	Low
Possible issue	Replace assert with explicit runtime validation The `assert` statements used to validate that `correlationIds.size() == 1` are only active when assertions are enabled (i.e., with the `-ea` JVM flag). In production environments where assertions are disabled, this check is silently skipped, and the code may behave incorrectly if there are multiple correlation IDs. Consider replacing the `assert` with an explicit runtime check (e.g., throwing an `IllegalStateException`) to ensure correctness in all environments. core/src/main/java/org/opensearch/sql/calcite/utils/OpenSearchRelFieldTrimmer.java [65-71] List<RexSubQuery> subQueries = RexUtil.SubQueryCollector.collect(project); Set<CorrelationId> correlationIds = RelOptUtil.getVariablesUsed(subQueries); ImmutableBitSet requiredColumns = ImmutableBitSet.of(); if (!correlationIds.isEmpty()) { - assert correlationIds.size() == 1; + if (correlationIds.size() != 1) { + throw new IllegalStateException("Expected at most one correlation ID, but found: " + correlationIds.size()); + } requiredColumns = RelOptUtil.correlationColumns(correlationIds.iterator().next(), project); } Suggestion importance[1-10]: 4 __ Why: The suggestion is valid — `assert` statements are disabled in production JVMs by default. However, this pattern (using `assert` for internal invariants) is common in Calcite's own codebase, and the scenario of multiple correlation IDs in a single project is extremely unlikely. The improvement is minor and defensive rather than fixing a real bug.	Low

Suggestions up to commit 48bbc4b

Category	Suggestion	Impact
Possible issue	Avoid pushing shared RelNode reference onto stack Calling `peek()` and then `push()` with the same node pushes a duplicate reference to the same `RelNode` onto the stack. This means the subquery will operate on a shared reference rather than an independent copy, which could cause issues when the subquery modifies or consumes the node. Consider using a proper copy or snapshot of the current plan before pushing it for the subquery to consume. core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java [305-307] RelNode mainNode = context.relBuilder.peek(); -context.relBuilder.push(mainNode); +context.relBuilder.push(mainNode.copy(mainNode.getTraitSet(), mainNode.getInputs())); node.getSubQuery().accept(this, context); Suggestion importance[1-10]: 4 __ Why: The suggestion raises a valid concern about pushing a shared `RelNode` reference. However, in Calcite's `RelBuilder`, `peek()` followed by `push()` is a common pattern to duplicate a reference on the stack for branching purposes, and `RelNode` trees are typically immutable — the subquery visitor builds new nodes on top. The fix may not be necessary and could introduce issues if `copy()` doesn't behave as expected in all cases.	Low
General	Clarify misleading test comment on row count The comment says "1 (M filter from cumulative 3 rows)" but the second `appendpipe` operates on the cumulative result of 3 rows (2 original + 1 F), so filtering by `gender = 'M'` from those 3 rows would yield 1 row (the original M row). However, the first `appendpipe` adds the F row again, so the cumulative set before the second `appendpipe` has 2 original + 1 F = 3 rows, of which 1 is M. The expected count of 4 may be correct, but the comment is misleading and should be clarified to avoid confusion. integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLAppendPipeCommandIT.java [148-149] -// 2 original + 1 (F filter) + 1 (M filter from cumulative 3 rows) = 4 rows +// 2 original (F, M) + 1 (F filter appended) + 1 (M filter from 3 cumulative rows) = 4 rows verifyNumOfRows(actual, 4); Suggestion importance[1-10]: 2 __ Why: The suggestion only improves a comment's clarity without changing any logic. The `improved_code` is nearly identical to the `existing_code` with a minor rewording, offering minimal value.	Low

visitAppendPipe re-visits parent AST in new planner context, causing assertion failure. Use relBuilder stack directly instead of AST re-visitation to stay within the same planner context. Signed-off-by: Songkan Tang <songkant@amazon.com>

Signed-off-by: Songkan Tang <songkant@amazon.com>

Chained appendpipe queries can produce literal-only projections during prepare-time field trimming. Calcite may simplify those projections into Values, which can trigger planner mismatch assertions during execution. Preserve the Project shape for this narrow case in OpenSearchRelFieldTrimmer and add YAML REST regression coverage for double and triple appendpipe. Signed-off-by: Songkan Tang <songkant@amazon.com>

github-actions · 2026-04-08T09:00:01Z

Persistent review updated to latest commit e07ed44

songkant-aws · 2026-04-08T10:07:19Z

+    openSearchRelBuilder.push(newInput);
+    openSearchRelBuilder.project(newProjects, newRowType.getFieldNames(), false, correlationIds);
+    return result(openSearchRelBuilder.build(), mapping, project);


In trimFields(Project, ...), the old openSearchRelBuilder.project(...) call could trigger Calcite's internal simplifyValues() when trimming produced a literal-only projection over a fixed-row-count input. That rewrite could replace the Project with Values, which is the shape that led to the planner mismatch in chained appendpipe.

qianheng-aws · 2026-04-09T09:01:53Z

As discussed offline, it's root cause is inconsistent planner/cluster are being used in trim phase and optimize phase.

In original calcite, one prepare statement will be used through the whole process. While in PPL, we create 2 different prepares for analyzing and executing(including trimming, optimizing and planning, only trimming uses the planner from this new created prepare while others reuse the planner in RelNode).

The ideal solution should be reusing the same planner for trimming as well.

Signed-off-by: Songkan Tang <songkant@amazon.com>

github-actions · 2026-04-09T09:45:26Z

Persistent review updated to latest commit 0c3a695

Signed-off-by: Songkan Tang <songkant@amazon.com>

github-actions · 2026-04-09T10:06:53Z

Persistent review updated to latest commit 9124702

songkant-aws · 2026-04-09T10:14:54Z

@qianheng-aws As suggested, prepare-time trimming now reuses the incoming RelNode's cluster so it stays aligned with the analyzed plan. The change also preserves pre-registered hint strategies required by PPL custom aggregate hints. If we find other prepare-time state that also needs to stay aligned with the incoming RelNode, we can address those cases incrementally.

qianheng-aws · 2026-04-10T02:56:47Z

LGTM

LantaoJin · 2026-04-10T03:24:26Z

@songkant-aws the multiple LogicalUnions could be optimized as discussed offline?

songkant-aws · 2026-04-10T06:51:59Z

@LantaoJin I checked Calcite already has some optimization rules for multiple LogicalUnions. Sometimes it can merge multiple unions into one.

songkant-aws marked this pull request as ready for review April 8, 2026 02:16

songkant-aws requested review from LantaoJin, RyanL1997, Swiddis, acarbonetto, ahkcs, anirudha, dai-chen, joshuali925, mengweieric, noCharger, penghuo, ps48, qianheng-aws, vamsimanohar, ykmr1224 and yuancu as code owners April 8, 2026 02:16

songkant-aws added the bugFix label Apr 8, 2026

songkant-aws added 5 commits April 8, 2026 14:08

Add regression tests for chained appendpipe (double and triple)

9b37167

Signed-off-by: Songkan Tang <songkant@amazon.com>

Minor change of assertion on rows

7cf6f5d

Signed-off-by: Songkan Tang <songkant@amazon.com>

Merge branch 'main' into bug-06b-appendpipe-planner-5173

e1af042

songkant-aws force-pushed the bug-06b-appendpipe-planner-5173 branch from 48bbc4b to e07ed44 Compare April 8, 2026 08:58

songkant-aws commented Apr 8, 2026

View reviewed changes

songkant-aws mentioned this pull request Apr 8, 2026

[BUG] Double appendpipe causes 'different planner' assertion due to AST re-visitation #5173

Closed

4 tasks

LantaoJin previously approved these changes Apr 9, 2026

View reviewed changes

Use consistent prepareStatement to resolve the issue

0c3a695

Signed-off-by: Songkan Tang <songkant@amazon.com>

songkant-aws dismissed LantaoJin’s stale review via 0c3a695 April 9, 2026 09:44

Preserve agg hint after reusing cluster during prepare and optimize

9124702

Signed-off-by: Songkan Tang <songkant@amazon.com>

qianheng-aws approved these changes Apr 10, 2026

View reviewed changes

LantaoJin approved these changes Apr 10, 2026

View reviewed changes

LantaoJin merged commit 582b4d2 into opensearch-project:main Apr 10, 2026
38 of 39 checks passed

songkant-aws deleted the bug-06b-appendpipe-planner-5173 branch April 10, 2026 09:40

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[BugFix] Fix multiple appendpipe error while revisiting parent AST#5322

[BugFix] Fix multiple appendpipe error while revisiting parent AST#5322
LantaoJin merged 7 commits into
opensearch-project:mainfrom
songkant-aws:bug-06b-appendpipe-planner-5173

songkant-aws commented Apr 8, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented Apr 8, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented Apr 8, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented Apr 8, 2026

Uh oh!

songkant-aws Apr 8, 2026

Uh oh!

qianheng-aws commented Apr 9, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented Apr 9, 2026

Uh oh!

github-actions Bot commented Apr 9, 2026

Uh oh!

songkant-aws commented Apr 9, 2026 •

edited

Loading

Uh oh!

qianheng-aws commented Apr 10, 2026

Uh oh!

Uh oh!

LantaoJin commented Apr 10, 2026

Uh oh!

songkant-aws commented Apr 10, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

songkant-aws commented Apr 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Related Issues

Check List

Uh oh!

github-actions Bot commented Apr 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Reviewer Guide 🔍

(Review updated until commit 9124702)

Uh oh!

github-actions Bot commented Apr 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Code Suggestions ✨

Previous suggestions

Uh oh!

github-actions Bot commented Apr 8, 2026

Uh oh!

songkant-aws Apr 8, 2026

Choose a reason for hiding this comment

Uh oh!

qianheng-aws commented Apr 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions Bot commented Apr 9, 2026

Uh oh!

github-actions Bot commented Apr 9, 2026

Uh oh!

songkant-aws commented Apr 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

qianheng-aws commented Apr 10, 2026

Uh oh!

Uh oh!

LantaoJin commented Apr 10, 2026

Uh oh!

songkant-aws commented Apr 10, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

songkant-aws commented Apr 8, 2026 •

edited

Loading

github-actions Bot commented Apr 8, 2026 •

edited

Loading

(Review updated until commit `9124702`)

github-actions Bot commented Apr 8, 2026 •

edited

Loading

qianheng-aws commented Apr 9, 2026 •

edited

Loading

songkant-aws commented Apr 9, 2026 •

edited

Loading