Make addcoltotals/addtotals summary row backend-deterministic by ahkcs · Pull Request #5477 · opensearch-project/sql

ahkcs · 2026-05-27T23:26:00Z

What's broken

Two bugs in the lowering of addcoltotals (and addtotals col=true). Both manifest on the analytics-engine route and were hidden on the v2/Calcite path by single-node execution semantics.

The summary row lands in a random position. addcoltotals lowers to UNION ALL(data, single-row SUM). The user docs say the summary row is "the last row." SQL UNION ALL makes no ordering guarantee — DataFusion does not preserve input order, so the summary row comes out anywhere. Tests asserting "last row is the summary" fail non-deterministically.
HTTP 500 when labelfield names a new column. The lowering types that new column as VARCHAR(n) where n = max(len(label), len(labelfield)). substrait-java emits a VarChar(n) literal; upstream DataFusion's datafusion-substrait consumer does not implement length-bounded VarChar literals → Unsupported literal_type.

Both fixes are in CalciteRelNodeVisitor#buildAddRowTotalAggregate (the method shared by visitAddColTotals and visitAddTotals for col=true).

Bug 1 — summary row in random position

Step 1 — what `addcoltotals` does mechanically

testAddColTotalsTotalWithTotalField:

source=accounts | where age > 25 | fields age, balance | addcoltotals

Lowering produces:

LogicalUnion(all=[true])
  LogicalProject(age, balance)                       ← N data rows
    <upstream filter + scan>
  LogicalProject(age, balance)                       ← exactly 1 summary row
    LogicalAggregate(group=[{}], age=[SUM(age)], balance=[SUM(balance)])
      <upstream filter + scan>

Step 2 — on v2 (serial), UNION ALL preserves input order

The Calcite enumerable adapter executes UNION ALL by emitting the first input's tuples, then the second's:

rows[0]      data
rows[1]      data
…
rows[n-2]    data
rows[n-1]    summary   ← always last

verifyColTotals (the test helper) inspects rows[length-1] and asserts rows[length-1].age == sum(rows[0..length-2].age). Passes.

Step 3 — on DataFusion (parallel), UNION ALL doesn't preserve input order

SQL UNION ALL gives no ordering guarantee, and DataFusion's UnionExec interleaves its inputs' partitions as they're ready. One observed run:

rows[0]    data
rows[1]    data
…
rows[k]    summary   ← arbitrary position
…
rows[n-1]  data      ← just another account

Step 4 — exactly what the IT fails on

expected:<35.0> but was:<48323.0>

Working backwards: let T be the true age-total (the summary row's age). The misplaced summary row sits inside rows[0..length-2], so the client's cColTotals = sum(rows[0..length-2].age) = T + (T − 35) = 2T − 35. Solving 48323 = 2T − 35 gives T ≈ 24179. The numbers prove the bug is "summary row not last," not a wrong total. Across 3 runs we saw 2/6, 4/6, 4/6 — different tests failing each time.

Step 5 — a downstream PPL `sort` can't fix this

Adding | sort age after addcoltotals reorders by age, but the summary's age (24179) is a real numeric value indistinguishable from a data row, so the summary still doesn't land in any predictable position. There's no field that says "I'm the summary." The lowering itself has to encode it.

Where the offending `union(true)` is emitted

CalciteRelNodeVisitor#buildAddRowTotalAggregate:

RelNode totalsRow = context.relBuilder.build();
// 4. Union original data with totals row
context.relBuilder.push(originalData);
context.relBuilder.push(totalsRow);
context.relBuilder.union(true); // Use UNION ALL to preserve order

End-to-end path:

PPL "| addcoltotals"
   ↓  parser → AST AddColTotals
visitAddColTotals → buildAddRowTotalAggregate
   ↓  push(data) + push(totalsRow) + union(true)   ← no marker, no sort
LogicalUnion(all=[true])
   ↓  Calcite → Substrait
DataFusion's UnionExec emits tuples in arrival order across partitions

The comment // Use UNION ALL to preserve order was wishful — SQL UNION ALL has no ordering guarantee.

Fix

Tag each branch with a constant ordering key (0 on data rows, 1 on the summary row), UNION ALL, sort by the key, then projectExcept to drop it. The summary row is now > every data row on the sort key, so it sorts last on every backend:

RelNode totalsRow = context.relBuilder.build();
// 4. Union original data with totals row.
// UNION ALL does not guarantee row order, and some backends (e.g. DataFusion on the
// analytics-engine route) do not preserve input order, so the summary row can land
// anywhere. Tag each branch with a constant ordering key (0 = data rows, 1 = summary row),
// union, sort on it so the summary row is deterministically last, then drop the helper.
context.relBuilder.push(originalData);
context.relBuilder.projectPlus(
    context.relBuilder.alias(context.relBuilder.literal(0), ORDER_COLUMN_FOR_ADDCOLTOTALS));
RelNode dataWithOrder = context.relBuilder.build();
context.relBuilder.push(totalsRow);
context.relBuilder.projectPlus(
    context.relBuilder.alias(context.relBuilder.literal(1), ORDER_COLUMN_FOR_ADDCOLTOTALS));
RelNode totalsWithOrder = context.relBuilder.build();
context.relBuilder.push(dataWithOrder);
context.relBuilder.push(totalsWithOrder);
context.relBuilder.union(true); // UNION ALL
context.relBuilder.sort(context.relBuilder.field(ORDER_COLUMN_FOR_ADDCOLTOTALS));
context.relBuilder.projectExcept(context.relBuilder.field(ORDER_COLUMN_FOR_ADDCOLTOTALS));

Reuses the sort + projectExcept idiom that streamstats already uses in the same visitor. New PlanUtils.ORDER_COLUMN_FOR_ADDCOLTOTALS = "_addcoltotals_order_". No behavior change on v2 (which already happened to produce this order).

Bug 2 — `Unsupported literal_type: VarChar` on a new `labelfield`

Step 1 — what `labelfield='Grand Total'` triggers in the lowering

testAddColTotalsWithCustomLabel:

source=accounts | where age > 25 | head 2 | fields age, balance
  | addcoltotals label='Sum' labelfield='Grand Total'

labelfield='Grand Total' names a new column (doesn't exist in accounts). buildAddRowTotalAggregate creates that column with type VARCHAR(labelLength), where labelLength = max(len(label), len(labelfield)) — here max(3, 11) = 11 — and writes 'Sum':VARCHAR(11) in the summary row:

LogicalProject(age, balance, Grand_Total=[null:VARCHAR(11)])      ← data rows
LogicalProject(age, balance, Grand_Total=['Sum':VARCHAR(11)])     ← summary row

Step 2 — substrait-java renders bounded VARCHAR as Substrait `VarChar(N)`

substrait-java's type/literal converter maps:

Calcite VARCHAR with explicit length N → Substrait VarChar(N)
Calcite VARCHAR with no length → Substrait String (unbounded Utf8)

So 'Sum':VARCHAR(11) becomes a VarChar { value: "Sum", length: 11 } literal in the Substrait plan.

Step 3 — DataFusion's Substrait consumer doesn't implement `VarChar(N)` literals

Upstream datafusion-substrait's from_substrait_literal handles only String for string literals; VarChar and FixedChar fall through to not_impl_err!. The cluster returns:

HTTP 500
"details": "This feature is not implemented: Unsupported literal_type:
            Some(VarChar(VarChar { value: \"Sum\", length: 11 }))"

No rows are ever returned. Fails on every run (deterministic, unlike Bug 1).

Step 4 — why existing-field labels work and new-field labels don't

OpenSearchTypeFactory#createRelDataType maps every OpenSearch STRING field (text/keyword) to createSqlType(VARCHAR) with no explicit length:

// OpenSearchTypeFactory.java:170
case STRING:
  return TYPE_FACTORY.createSqlType(SqlTypeName.VARCHAR, nullable);   // no length

A label placed into an existing string column inherits that unbounded type → Substrait String → DataFusion accepts. testAddColTotalsWithLabelAndLabelField, which uses labelfield='firstname' (existing keyword field), works for exactly that reason. Only the new-labelfield path was creating bounded VARCHAR — inconsistent with how every other string column in the system is typed.

Where the bounded `VARCHAR(labelLength)` is emitted

CalciteRelNodeVisitor#buildAddRowTotalAggregate:

int labelLength =
    (labelField != null) && (labelField.length() > label.length())
        ? labelField.length()
        : label.length();

RelDataType labelVarcharType =
    context.relBuilder.getTypeFactory().createSqlType(SqlTypeName.VARCHAR, labelLength);

labelVarcharType then feeds two sites: makeNullLiteral(labelVarcharType) for the new label column on the data branch, and (via fieldDataType.getType()) makeLiteral(label, …) for the label value on the summary branch.

Fix

Drop the explicit length. Type the synthesized label column the same way OpenSearchTypeFactory types every other string column — unbounded VARCHAR:

// Type the synthesized label column as an unbounded VARCHAR, matching how OpenSearchTypeFactory
// maps STRING fields. A length-bounded VARCHAR(n) would render as a Substrait VarChar(n)
// literal, which the DataFusion backend (analytics-engine route) does not support.
RelDataType labelVarcharType =
    context.relBuilder.getTypeFactory().createSqlType(SqlTypeName.VARCHAR);

labelLength is removed — it served no purpose other than producing the bounded type. The literal now renders as Substrait String and DataFusion accepts it. No behavior change on v2.

Results

IT on analytics-engine route, 3 runs each	Before	After
`CalciteAddColTotalsCommandIT` (6 tests, all exercise the fix path)	flaky 2/6, 4/6, 4/6	stable 6/6, 6/6, 6/6
`CalciteAddTotalsCommandIT` — `testAddTotalsWithLabelAndLabelField` (the only `col=true` test; new `labelfield` so exercises both fixes)	—	stable pass, pass, pass (other 10 `col=false` tests are unaffected and also pass)
v2/Calcite IT + unit tests + explain tests	green	green

Testing

CalcitePPLAddColTotalsTest 8/8, CalcitePPLAddTotalsTest 12/12 (regenerated logical-plan + Spark-SQL expectations; result rows unchanged).
CalciteExplainIT#testaddColTotalsExplain / testaddTotalsExplain ✅ (regenerated explain_add_col_totals.yaml / explain_add_totals.yaml).
CalciteAddColTotalsCommandIT 6/6 stable across 3 runs on analytics route; 6/6 on v2 ✅
CalciteAddTotalsCommandIT 11/11 stable across 3 runs on analytics route; 11/11 on v2 ✅
:core:spotlessCheck + :ppl:spotlessCheck clean ✅

Check List

New functionality includes testing.
Behavior matches existing documentation (no doc change needed).
Commits are signed per the DCO using --signoff.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

github-actions · 2026-05-27T23:26:59Z

PR Reviewer Guide 🔍

(Review updated until commit `df0456f`)

Here are some key observations to aid the review process:

🧪 PR contains tests
🔒 No security concerns identified
✅ No TODO sections
🔀 No multiple PR themes
⚡ No major issues detected

github-actions · 2026-05-27T23:27:14Z

PR Code Suggestions ✨

Latest suggestions up to df0456f

Explore these optional code suggestions:

Category Suggestion Impact

General

Validate field reference after sort operation

The projectExcept operation may fail if ORDER_COLUMN_FOR_ADDCOLTOTALS doesn't exist
in the projection list after the sort. Verify that the field reference is valid
after the sort operation, or use an alternative approach like explicitly listing all
fields except the ordering column to ensure robustness.

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java [3484-3496]

 context.relBuilder.push(originalData);
 context.relBuilder.projectPlus(
     context.relBuilder.alias(context.relBuilder.literal(0), ORDER_COLUMN_FOR_ADDCOLTOTALS));
 RelNode dataWithOrder = context.relBuilder.build();
 context.relBuilder.push(totalsRow);
 context.relBuilder.projectPlus(
     context.relBuilder.alias(context.relBuilder.literal(1), ORDER_COLUMN_FOR_ADDCOLTOTALS));
 RelNode totalsWithOrder = context.relBuilder.build();
 context.relBuilder.push(dataWithOrder);
 context.relBuilder.push(totalsWithOrder);
 context.relBuilder.union(true); // UNION ALL
 context.relBuilder.sort(context.relBuilder.field(ORDER_COLUMN_FOR_ADDCOLTOTALS));
-context.relBuilder.projectExcept(context.relBuilder.field(ORDER_COLUMN_FOR_ADDCOLTOTALS));
+// Get all fields except the ordering column
+List<RexNode> projectFields = new ArrayList<>();
+RelDataType rowType = context.relBuilder.peek().getRowType();
+for (int i = 0; i < rowType.getFieldCount() - 1; i++) {
+  projectFields.add(context.relBuilder.field(i));
+}
+context.relBuilder.project(projectFields);

Suggestion importance[1-10]: 5

__

Why: The suggestion raises a valid concern about projectExcept potentially failing if the field doesn't exist, but the improved code is more verbose and less maintainable than the existing approach. The projectExcept method is a standard Calcite operation that should handle the field reference correctly. The suggestion would be more valuable if there was evidence of actual failures.

Low

Previous suggestions

Suggestions up to commit df0456f

Category Suggestion Impact

General

Ensure ordering column removal is robust

The projectExcept operation may fail if ORDER_COLUMN_FOR_ADDCOLTOTALS doesn't exist
in the projection. Verify that the field reference is valid after the sort
operation, or use an index-based approach to remove the ordering column more
reliably.

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java [3484-3496]

 context.relBuilder.push(originalData);
 context.relBuilder.projectPlus(
     context.relBuilder.alias(context.relBuilder.literal(0), ORDER_COLUMN_FOR_ADDCOLTOTALS));
 RelNode dataWithOrder = context.relBuilder.build();
 context.relBuilder.push(totalsRow);
 context.relBuilder.projectPlus(
     context.relBuilder.alias(context.relBuilder.literal(1), ORDER_COLUMN_FOR_ADDCOLTOTALS));
 RelNode totalsWithOrder = context.relBuilder.build();
 context.relBuilder.push(dataWithOrder);
 context.relBuilder.push(totalsWithOrder);
 context.relBuilder.union(true); // UNION ALL
 context.relBuilder.sort(context.relBuilder.field(ORDER_COLUMN_FOR_ADDCOLTOTALS));
-context.relBuilder.projectExcept(context.relBuilder.field(ORDER_COLUMN_FOR_ADDCOLTOTALS));
+// Project all fields except the last one (ORDER_COLUMN_FOR_ADDCOLTOTALS)
+int fieldCount = context.relBuilder.peek().getRowType().getFieldCount();
+List<RexNode> projectFields = new ArrayList<>();
+for (int i = 0; i < fieldCount - 1; i++) {
+  projectFields.add(context.relBuilder.field(i));
+}
+context.relBuilder.project(projectFields);

Suggestion importance[1-10]: 7

__

Why: The suggestion addresses a potential robustness issue with projectExcept. Using an index-based approach to remove the ordering column is more reliable than field name reference, especially after sort operations. However, this is a defensive improvement rather than fixing a critical bug, as projectExcept should work correctly if the field exists.

Medium

Suggestions up to commit 170d132

Category	Suggestion	Impact
General	Verify sort direction is deterministic The sort operation should specify a sort direction explicitly. While ASC is likely the default, explicitly specifying it ensures consistent behavior across different Calcite configurations and makes the intent clear that the summary row (tagged with 1) should appear last. core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java [3502-3503] -context.relBuilder.sort(context.relBuilder.field(ORDER_COLUMN_FOR_ADDCOLTOTALS)); +context.relBuilder.sort( + context.relBuilder.field(ORDER_COLUMN_FOR_ADDCOLTOTALS)); context.relBuilder.projectExcept(context.relBuilder.field(ORDER_COLUMN_FOR_ADDCOLTOTALS)); Suggestion importance[1-10]: 5 __ Why: The suggestion correctly identifies that the sort direction should be explicit for clarity and consistency. However, the `improved_code` is identical to the `existing_code`, and the PR already includes a comment explaining the sorting behavior. The suggestion asks for verification rather than proposing a concrete fix, so it receives a moderate score.	Low

github-actions · 2026-05-28T16:49:22Z

Persistent review updated to latest commit e8678df

penghuo · 2026-05-29T18:06:30Z

+    // Type the synthesized label column as an unbounded VARCHAR, matching how OpenSearchTypeFactory
+    // maps STRING fields. A length-bounded VARCHAR(n) would render as a Substrait VarChar(n)
+    // literal, which the DataFusion backend (analytics-engine route) does not support.


delete comments. mention DF/Substrait is misleading.

Updated to remove/trim comments

addcoltotals (and addtotals with col=true) lowers to UNION ALL(data, single SUM row). SQL UNION ALL has no ordering guarantee: the v2 engine preserves input order incidentally, but the DataFusion backend on the analytics-engine route does not, so the summary row landed in a non-deterministic position. The command contract and user docs require the summary row to be last. Separately, the synthesized column for a new `labelfield` was typed as a bounded VARCHAR(n), which renders as a Substrait VarChar(n) literal that the DataFusion substrait consumer rejects ("Unsupported literal_type: VarChar"). Both fixes are in CalciteRelNodeVisitor.buildAddRowTotalAggregate: - Tag each union branch with a constant ordering key (0 = data rows, 1 = summary row), sort on it, then drop the helper column, so the summary row is deterministically last regardless of backend. Reuses the existing streamstats sort-then-projectExcept idiom; adds PlanUtils constant ORDER_COLUMN_FOR_ADDCOLTOTALS. - Type the synthesized label column as an unbounded VARCHAR, matching how OpenSearchTypeFactory maps STRING fields. Regenerated the affected CalcitePPLAddColTotalsTest / CalcitePPLAddTotalsTest plan and Spark-SQL expectations and the explain_add_col_totals / explain_add_totals YAML plans; result rows are unchanged. Signed-off-by: Kai Huang <ahkcs@amazon.com>

github-actions · 2026-05-29T18:17:50Z

Persistent review updated to latest commit df0456f

github-actions · 2026-05-29T18:17:54Z

Persistent review updated to latest commit df0456f

ahkcs requested review from LantaoJin, RyanL1997, Swiddis, acarbonetto, anirudha, dai-chen, joshuali925, mengweieric, noCharger, penghuo, ps48, qianheng-aws, songkant-aws, vamsimanohar, ykmr1224 and yuancu as code owners May 27, 2026 23:26

ahkcs added PPL Piped processing language enhancement New feature or request labels May 28, 2026

ahkcs force-pushed the fix/ppl-addcoltotals-analytics-route branch from 170d132 to e8678df Compare May 28, 2026 16:48

penghuo previously approved these changes May 29, 2026

View reviewed changes

ahkcs dismissed penghuo’s stale review via df0456f May 29, 2026 18:16

ahkcs force-pushed the fix/ppl-addcoltotals-analytics-route branch from e8678df to df0456f Compare May 29, 2026 18:16

ahkcs requested a review from penghuo May 29, 2026 18:18

mengweieric approved these changes May 29, 2026

View reviewed changes

RyanL1997 approved these changes May 29, 2026

View reviewed changes

ahkcs merged commit 585fa9e into opensearch-project:main May 29, 2026
77 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Make addcoltotals/addtotals summary row backend-deterministic#5477

Make addcoltotals/addtotals summary row backend-deterministic#5477
ahkcs merged 1 commit into
opensearch-project:mainfrom
ahkcs:fix/ppl-addcoltotals-analytics-route

ahkcs commented May 27, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented May 27, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented May 27, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented May 28, 2026

Uh oh!

penghuo May 29, 2026

Uh oh!

ahkcs May 29, 2026

Uh oh!

github-actions Bot commented May 29, 2026

Uh oh!

github-actions Bot commented May 29, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

ahkcs commented May 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What's broken

Bug 1 — summary row in random position

Step 1 — what addcoltotals does mechanically

Step 2 — on v2 (serial), UNION ALL preserves input order

Step 3 — on DataFusion (parallel), UNION ALL doesn't preserve input order

Step 4 — exactly what the IT fails on

Step 5 — a downstream PPL sort can't fix this

Where the offending union(true) is emitted

Fix

Bug 2 — Unsupported literal_type: VarChar on a new labelfield

Step 1 — what labelfield='Grand Total' triggers in the lowering

Step 2 — substrait-java renders bounded VARCHAR as Substrait VarChar(N)

Step 3 — DataFusion's Substrait consumer doesn't implement VarChar(N) literals

Step 4 — why existing-field labels work and new-field labels don't

Where the bounded VARCHAR(labelLength) is emitted

Fix

Results

Testing

Check List

Uh oh!

github-actions Bot commented May 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Reviewer Guide 🔍

(Review updated until commit df0456f)

Uh oh!

github-actions Bot commented May 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Code Suggestions ✨

Previous suggestions

Uh oh!

github-actions Bot commented May 28, 2026

Uh oh!

penghuo May 29, 2026

Choose a reason for hiding this comment

Uh oh!

ahkcs May 29, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot commented May 29, 2026

Uh oh!

github-actions Bot commented May 29, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

ahkcs commented May 27, 2026 •

edited

Loading

Step 1 — what `addcoltotals` does mechanically

Step 5 — a downstream PPL `sort` can't fix this

Where the offending `union(true)` is emitted

Bug 2 — `Unsupported literal_type: VarChar` on a new `labelfield`

Step 1 — what `labelfield='Grand Total'` triggers in the lowering

Step 2 — substrait-java renders bounded VARCHAR as Substrait `VarChar(N)`

Step 3 — DataFusion's Substrait consumer doesn't implement `VarChar(N)` literals

Where the bounded `VARCHAR(labelLength)` is emitted

github-actions Bot commented May 27, 2026 •

edited

Loading

(Review updated until commit `df0456f`)

github-actions Bot commented May 27, 2026 •

edited

Loading