[SPARK-57004][SQL] Simplify CheckOverflowInSum codegen under ANSI mode #56063

Draft
gengliangwang wants to merge 1 commit into apache:master from gengliangwang:SPARK-57004-check-overflow-in-sum

Conversation

@gengliangwang
Member

### What changes were proposed in this pull request?

Introduce `DecimalExpressionUtils.java` with a `checkOverflowInSum(Decimal, int, int, boolean, QueryContext)` static helper and call it from `CheckOverflowInSum.doGenCode` and the eval path.

The codegen body shrinks from a 10-line if/else block (initialize `ev.value` to null, branch on `childGen.isNull`, conditionally throw, conditionally call `toPrecision` and re-set `ev.isNull`) to 4 lines (a single helper call plus a post-check).

Eval is now a single delegating call.
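
For readers unfamiliar with the expression, the logic being centralized can be sketched roughly as below. This is an illustration only, not the code in this PR: `CheckOverflowInSumSketch` is a hypothetical class, `java.math.BigDecimal` stands in for Spark's `Decimal`, the `QueryContext` argument is omitted, and `ArithmeticException` replaces Spark's own error. It also assumes that a null input means the sum already overflowed and that `toPrecision` rounds half-up.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical stand-in for the helper described above. BigDecimal replaces
// Spark's Decimal and the QueryContext parameter is dropped for brevity.
final class CheckOverflowInSumSketch {
  private CheckOverflowInSumSketch() {}

  static BigDecimal checkOverflowInSum(
      BigDecimal value, int precision, int scale, boolean nullOnOverflow) {
    if (value == null) {
      // Assumption: a null input signals that the sum buffer already overflowed.
      if (nullOnOverflow) {
        return null;
      }
      throw new ArithmeticException("Overflow in sum of decimals");
    }
    // Assumption: mirrors toPrecision by rounding half-up to the target scale
    // and then checking the number of digits against the target precision.
    BigDecimal rounded = value.setScale(scale, RoundingMode.HALF_UP);
    if (rounded.precision() > precision) {
      if (nullOnOverflow) {
        return null;
      }
      throw new ArithmeticException("Decimal does not fit precision " + precision);
    }
    return rounded;
  }

  public static void main(String[] args) {
    // Fits DECIMAL(5, 2) after rounding.
    System.out.println(checkOverflowInSum(new BigDecimal("123.456"), 5, 2, true));
    // Does not fit DECIMAL(5, 2); null is returned because nullOnOverflow is true.
    System.out.println(checkOverflowInSum(new BigDecimal("12345.6"), 5, 2, true));
  }
}
```

With one static entry point of this shape, a call site only has to make the call and check the result for null, which matches the "single helper call + post-check" structure described above.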

### Why are the changes needed?

Part of SPARK-56908 (umbrella). `CheckOverflowInSum` is emitted around every decimal `Sum` and is one of the longer remaining inline ANSI bodies. Collapsing the per-call-site body reduces both the size of the generated Java source and Janino compile time on aggregation-heavy plans.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

```
build/sbt "catalyst/testOnly *DecimalExpressionSuite *AggregateExpressionSuite"
build/sbt "sql/testOnly *DataFrameAggregateSuite -- -z sum"
```

20/20 pass (incl. the SPARK-39208 `CheckOverflowInSum` runtime-context test).

### Was this patch authored or co-authored using generative AI tooling?

Generated-by: Cursor 1.x
@gengliangwang gengliangwang marked this pull request as draft May 22, 2026 12:56