Make addcoltotals/addtotals summary row backend-deterministic#5477
Conversation
PR Reviewer Guide 🔍(Review updated until commit df0456f)Here are some key observations to aid the review process:
|
PR Code Suggestions ✨Latest suggestions up to df0456f Explore these optional code suggestions:
Previous suggestionsSuggestions up to commit df0456f
Suggestions up to commit 170d132
|
170d132 to
e8678df
Compare
|
Persistent review updated to latest commit e8678df |
| // Type the synthesized label column as an unbounded VARCHAR, matching how OpenSearchTypeFactory | ||
| // maps STRING fields. A length-bounded VARCHAR(n) would render as a Substrait VarChar(n) | ||
| // literal, which the DataFusion backend (analytics-engine route) does not support. |
There was a problem hiding this comment.
delete comments. mention DF/Substrait is misleading.
There was a problem hiding this comment.
Updated to remove/trim comments
addcoltotals (and addtotals with col=true) lowers to UNION ALL(data,
single SUM row). SQL UNION ALL has no ordering guarantee: the v2 engine
preserves input order incidentally, but the DataFusion backend on the
analytics-engine route does not, so the summary row landed in a
non-deterministic position. The command contract and user docs require the
summary row to be last.
Separately, the synthesized column for a new `labelfield` was typed as a
bounded VARCHAR(n), which renders as a Substrait VarChar(n) literal that the
DataFusion substrait consumer rejects ("Unsupported literal_type: VarChar").
Both fixes are in CalciteRelNodeVisitor.buildAddRowTotalAggregate:
- Tag each union branch with a constant ordering key (0 = data rows,
1 = summary row), sort on it, then drop the helper column, so the summary
row is deterministically last regardless of backend. Reuses the existing
streamstats sort-then-projectExcept idiom; adds PlanUtils constant
ORDER_COLUMN_FOR_ADDCOLTOTALS.
- Type the synthesized label column as an unbounded VARCHAR, matching how
OpenSearchTypeFactory maps STRING fields.
Regenerated the affected CalcitePPLAddColTotalsTest / CalcitePPLAddTotalsTest
plan and Spark-SQL expectations and the explain_add_col_totals /
explain_add_totals YAML plans; result rows are unchanged.
Signed-off-by: Kai Huang <ahkcs@amazon.com>
e8678df to
df0456f
Compare
|
Persistent review updated to latest commit df0456f |
1 similar comment
|
Persistent review updated to latest commit df0456f |
What's broken
Two bugs in the lowering of
addcoltotals(andaddtotals col=true). Both manifest on the analytics-engine route and were hidden on the v2/Calcite path by single-node execution semantics.addcoltotalslowers toUNION ALL(data, single-row SUM). The user docs say the summary row is "the last row." SQLUNION ALLmakes no ordering guarantee — DataFusion does not preserve input order, so the summary row comes out anywhere. Tests asserting "last row is the summary" fail non-deterministically.labelfieldnames a new column. The lowering types that new column asVARCHAR(n)wheren = max(len(label), len(labelfield)). substrait-java emits aVarChar(n)literal; upstream DataFusion'sdatafusion-substraitconsumer does not implement length-boundedVarCharliterals →Unsupported literal_type.Both fixes are in
CalciteRelNodeVisitor#buildAddRowTotalAggregate(the method shared byvisitAddColTotalsandvisitAddTotalsforcol=true).Bug 1 — summary row in random position
Step 1 — what
addcoltotalsdoes mechanicallytestAddColTotalsTotalWithTotalField:Lowering produces:
Step 2 — on v2 (serial), UNION ALL preserves input order
The Calcite enumerable adapter executes UNION ALL by emitting the first input's tuples, then the second's:
verifyColTotals(the test helper) inspectsrows[length-1]and assertsrows[length-1].age == sum(rows[0..length-2].age). Passes.Step 3 — on DataFusion (parallel), UNION ALL doesn't preserve input order
SQL
UNION ALLgives no ordering guarantee, and DataFusion'sUnionExecinterleaves its inputs' partitions as they're ready. One observed run:Step 4 — exactly what the IT fails on
Working backwards: let
Tbe the true age-total (the summary row's age). The misplaced summary row sits insiderows[0..length-2], so the client'scColTotals = sum(rows[0..length-2].age) = T + (T − 35) = 2T − 35. Solving48323 = 2T − 35givesT ≈ 24179. The numbers prove the bug is "summary row not last," not a wrong total. Across 3 runs we saw 2/6, 4/6, 4/6 — different tests failing each time.Step 5 — a downstream PPL
sortcan't fix thisAdding
| sort ageafteraddcoltotalsreorders byage, but the summary'sage(24179) is a real numeric value indistinguishable from a data row, so the summary still doesn't land in any predictable position. There's no field that says "I'm the summary." The lowering itself has to encode it.Where the offending
union(true)is emittedCalciteRelNodeVisitor#buildAddRowTotalAggregate:End-to-end path:
The comment
// Use UNION ALL to preserve orderwas wishful — SQLUNION ALLhas no ordering guarantee.Fix
Tag each branch with a constant ordering key (
0on data rows,1on the summary row),UNION ALL, sort by the key, thenprojectExceptto drop it. The summary row is now>every data row on the sort key, so it sorts last on every backend:Reuses the
sort + projectExceptidiom thatstreamstatsalready uses in the same visitor. NewPlanUtils.ORDER_COLUMN_FOR_ADDCOLTOTALS = "_addcoltotals_order_". No behavior change on v2 (which already happened to produce this order).Bug 2 —
Unsupported literal_type: VarCharon a newlabelfieldStep 1 — what
labelfield='Grand Total'triggers in the loweringtestAddColTotalsWithCustomLabel:labelfield='Grand Total'names a new column (doesn't exist inaccounts).buildAddRowTotalAggregatecreates that column with typeVARCHAR(labelLength), wherelabelLength = max(len(label), len(labelfield))— heremax(3, 11) = 11— and writes'Sum':VARCHAR(11)in the summary row:Step 2 — substrait-java renders bounded VARCHAR as Substrait
VarChar(N)substrait-java's type/literal converter maps:
VARCHARwith explicit lengthN→ SubstraitVarChar(N)VARCHARwith no length → SubstraitString(unbounded Utf8)So
'Sum':VARCHAR(11)becomes aVarChar { value: "Sum", length: 11 }literal in the Substrait plan.Step 3 — DataFusion's Substrait consumer doesn't implement
VarChar(N)literalsUpstream
datafusion-substrait'sfrom_substrait_literalhandles onlyStringfor string literals;VarCharandFixedCharfall through tonot_impl_err!. The cluster returns:No rows are ever returned. Fails on every run (deterministic, unlike Bug 1).
Step 4 — why existing-field labels work and new-field labels don't
OpenSearchTypeFactory#createRelDataTypemaps every OpenSearch STRING field (text/keyword) tocreateSqlType(VARCHAR)with no explicit length:A label placed into an existing string column inherits that unbounded type → Substrait
String→ DataFusion accepts.testAddColTotalsWithLabelAndLabelField, which useslabelfield='firstname'(existing keyword field), works for exactly that reason. Only the new-labelfieldpath was creating bounded VARCHAR — inconsistent with how every other string column in the system is typed.Where the bounded
VARCHAR(labelLength)is emittedCalciteRelNodeVisitor#buildAddRowTotalAggregate:labelVarcharTypethen feeds two sites:makeNullLiteral(labelVarcharType)for the new label column on the data branch, and (viafieldDataType.getType())makeLiteral(label, …)for the label value on the summary branch.Fix
Drop the explicit length. Type the synthesized label column the same way
OpenSearchTypeFactorytypes every other string column — unboundedVARCHAR:labelLengthis removed — it served no purpose other than producing the bounded type. The literal now renders as SubstraitStringand DataFusion accepts it. No behavior change on v2.Results
CalciteAddColTotalsCommandIT(6 tests, all exercise the fix path)CalciteAddTotalsCommandIT—testAddTotalsWithLabelAndLabelField(the onlycol=truetest; newlabelfieldso exercises both fixes)col=falsetests are unaffected and also pass)Testing
CalcitePPLAddColTotalsTest8/8,CalcitePPLAddTotalsTest12/12 (regenerated logical-plan + Spark-SQL expectations; result rows unchanged).CalciteExplainIT#testaddColTotalsExplain/testaddTotalsExplain✅ (regeneratedexplain_add_col_totals.yaml/explain_add_totals.yaml).CalciteAddColTotalsCommandIT6/6 stable across 3 runs on analytics route; 6/6 on v2 ✅CalciteAddTotalsCommandIT11/11 stable across 3 runs on analytics route; 11/11 on v2 ✅:core:spotlessCheck+:ppl:spotlessCheckclean ✅Check List
--signoff.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.