Skip to content

Commit 0cc376a

Browse files
authored
[GLUTEN-11917][VL] Respect allowPrecisionLoss from expression context in Spark 4.1 (#12110)
* [GLUTEN-11917][VL] Respect allowPrecisionLoss from expression context in Spark 4.1 In Spark 4.1, SPARK-53968 introduced NumericEvalContext which captures allowPrecisionLoss at analysis time and embeds it in arithmetic expressions (Add, Subtract, Multiply, Divide) via an evalContext field. Previously, Gluten read SQLConf.get.decimalOperationsAllowPrecisionLoss at plan-conversion time, which can diverge from the expression's captured value — for example, when querying a persistent view whose expression was analyzed under a different session config. This commit adds a shim method decimalAllowPrecisionLoss(expr: BinaryArithmetic) to SparkShims. The Spark41Shims override reads evalContext.allowDecimalPrecisionLoss directly from the expression. All other Spark versions fall back to SQLConf.get, preserving existing behavior. DecimalArithmeticUtil.getResultType and VeloxSparkPlanExecApi.getDecimalArithmeticExprName are updated to use the shim, ensuring the correct result type and Velox function name are selected. Fixes: #11917 * [GLUTEN-11917][VL] Add test and clarifying comments for allowPrecisionLoss shim - Add regression test covering the view-analyzed-under-different-session-config scenario that motivated the fix: arithmetic is analyzed with allowPrecisionLoss=false, cached in a temp view, then queried with allowPrecisionLoss=true. Gluten must read the captured evalContext, not SQLConf.get. - Add comment on SparkShims default explaining why SQLConf.get is correct for pre-4.1 Spark versions (no evalContext field exists before SPARK-53968). - Add comment on Spark41Shims wildcard arm explaining that Remainder/Pmod lack evalContext and are gated out of Velox execution by GlutenNotSupportException in DecimalArithmeticUtil.getResultType, making the SQLConf fallback safe. - Add comment on SparkPlanExecApi default explaining why non-Velox backends (e.g. ClickHouse) correctly ignore allowPrecisionLoss — they do not use the _deny_precision_loss naming convention. * [GLUTEN-11917][VL] Update GlutenSimpleSQLViewSuite comment for Spark 4.1 The original comment noted 2 failures. One was the allowPrecisionLoss evalContext issue fixed in this PR (GLUTEN-11917). The suite still has 1 remaining failure unrelated to GLUTEN-11917, so it stays disabled.
1 parent 55eb0d3 commit 0cc376a

8 files changed

Lines changed: 56 additions & 9 deletions

File tree

backends-velox/src/main/scala/org/apache/gluten/backendsapi/velox/VeloxSparkPlanExecApi.scala

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -162,8 +162,8 @@ class VeloxSparkPlanExecApi extends SparkPlanExecApi with Logging {
162162
}
163163
}
164164

165-
override def getDecimalArithmeticExprName(exprName: String): String =
166-
if (!SQLConf.get.decimalOperationsAllowPrecisionLoss) { exprName + "_deny_precision_loss" }
165+
override def getDecimalArithmeticExprName(exprName: String, allowPrecisionLoss: Boolean): String =
166+
if (!allowPrecisionLoss) { exprName + "_deny_precision_loss" }
167167
else { exprName }
168168

169169
/** Transform map_entries to Substrait. */

backends-velox/src/test/scala/org/apache/gluten/functions/MathFunctionsValidateSuite.scala

Lines changed: 26 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -413,4 +413,30 @@ abstract class MathFunctionsValidateSuite extends FunctionsValidateSuite {
413413
}
414414
}
415415
}
416+
417+
test("decimal arithmetic respects allowPrecisionLoss captured at view analysis time") {
418+
// Regression test for GLUTEN-11917: in Spark 4.1, arithmetic expressions embed
419+
// allowPrecisionLoss in their evalContext at analysis time. Gluten must read from
420+
// the expression rather than SQLConf.get, which can differ when querying a view
421+
// analyzed under a different session config.
422+
withTempView("t", "v") {
423+
sql("""
424+
|SELECT
425+
|CAST('1234567890123456789012345.12345678901' AS DECIMAL(38,11)) AS a,
426+
|CAST('1234567890123456789012345.02345678901' AS DECIMAL(38,11)) AS b""".stripMargin)
427+
.createOrReplaceTempView("t")
428+
429+
// Analyze arithmetic with allowPrecisionLoss=false and cache it in the view's plan.
430+
withSQLConf("spark.sql.decimalOperations.allowPrecisionLoss" -> "false") {
431+
sql("CREATE OR REPLACE TEMP VIEW v AS SELECT a - b, a + b, a * b, a / b FROM t")
432+
}
433+
434+
// Query under the opposite setting -- Gluten must use the captured context, not SQLConf.
435+
withSQLConf("spark.sql.decimalOperations.allowPrecisionLoss" -> "true") {
436+
runQueryAndCompare("SELECT * FROM v") {
437+
checkGlutenPlan[ProjectExecTransformer]
438+
}
439+
}
440+
}
441+
}
416442
}

gluten-substrait/src/main/scala/org/apache/gluten/backendsapi/SparkPlanExecApi.scala

Lines changed: 4 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -264,7 +264,10 @@ trait SparkPlanExecApi {
264264
GenericExpressionTransformer(substraitExprName, Seq(left, right), original)
265265
}
266266

267-
def getDecimalArithmeticExprName(exprName: String): String = exprName
267+
// Default: ignore allowPrecisionLoss and return exprName unchanged. Non-Velox backends
268+
// (e.g. ClickHouse) do not use the _deny_precision_loss naming convention; they handle
269+
// decimal precision through their own mechanisms. VeloxSparkPlanExecApi overrides this.
270+
def getDecimalArithmeticExprName(exprName: String, allowPrecisionLoss: Boolean): String = exprName
268271

269272
/** Transform map_entries to Substrait. */
270273
def genMapEntriesTransformer(

gluten-substrait/src/main/scala/org/apache/gluten/expression/ExpressionConverter.scala

Lines changed: 4 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -647,7 +647,8 @@ object ExpressionConverter extends SQLConfHelper with Logging {
647647
DecimalArithmeticUtil.isDecimalArithmetic(b) =>
648648
val arithmeticExprName =
649649
BackendsApiManager.getSparkPlanExecApiInstance.getDecimalArithmeticExprName(
650-
getAndCheckSubstraitName(b, expressionsMap))
650+
getAndCheckSubstraitName(b, expressionsMap),
651+
SparkShimLoader.getSparkShims.decimalAllowPrecisionLoss(b))
651652
val left =
652653
replaceWithExpressionTransformer0(b.left, attributeSeq, expressionsMap)
653654
val right =
@@ -664,7 +665,8 @@ object ExpressionConverter extends SQLConfHelper with Logging {
664665
)
665666
case b: BinaryArithmetic if DecimalArithmeticUtil.isDecimalArithmetic(b) =>
666667
val exprName = BackendsApiManager.getSparkPlanExecApiInstance.getDecimalArithmeticExprName(
667-
substraitExprName)
668+
substraitExprName,
669+
SparkShimLoader.getSparkShims.decimalAllowPrecisionLoss(b))
668670
if (!BackendsApiManager.getSettings.transformCheckOverflow) {
669671
GenericExpressionTransformer(
670672
exprName,

gluten-substrait/src/main/scala/org/apache/gluten/utils/DecimalArithmeticUtil.scala

Lines changed: 1 addition & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -20,7 +20,6 @@ import org.apache.gluten.exception.GlutenNotSupportException
2020
import org.apache.gluten.sql.shims.SparkShimLoader
2121

2222
import org.apache.spark.sql.catalyst.expressions.{Add, BinaryArithmetic, Cast, Divide, Expression, Literal, Multiply, Pmod, PromotePrecision, Remainder, Subtract}
23-
import org.apache.spark.sql.internal.SQLConf
2423
import org.apache.spark.sql.types.{ByteType, Decimal, DecimalType, IntegerType, LongType, ShortType}
2524
import org.apache.spark.sql.utils.DecimalTypeUtil
2625

@@ -33,7 +32,7 @@ object DecimalArithmeticUtil {
3332
// Returns the result decimal type of a decimal arithmetic computing.
3433
def getResultType(expr: BinaryArithmetic, type1: DecimalType, type2: DecimalType): DecimalType = {
3534

36-
val allowPrecisionLoss = SQLConf.get.decimalOperationsAllowPrecisionLoss
35+
val allowPrecisionLoss = SparkShimLoader.getSparkShims.decimalAllowPrecisionLoss(expr)
3736
var resultScale = 0
3837
var resultPrecision = 0
3938
expr match {

gluten-ut/spark41/src/test/scala/org/apache/gluten/utils/velox/VeloxTestSettings.scala

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -702,7 +702,7 @@ class VeloxTestSettings extends BackendTestSettings {
702702
enableSuite[GlutenSQLFunctionSuite]
703703
enableSuite[GlutenSQLJsonProtocolSuite]
704704
enableSuite[GlutenShufflePartitionsUtilSuite]
705-
// TODO: 4.x enableSuite[GlutenSimpleSQLViewSuite] // 2 failures
705+
// TODO: 4.x enableSuite[GlutenSimpleSQLViewSuite] // 1 failure remains after GLUTEN-11917 fix
706706
enableSuite[GlutenSparkPlanSuite]
707707
.exclude("SPARK-37779: ColumnarToRowExec should be canonicalizable after being (de)serialized")
708708
enableSuite[GlutenSparkPlannerSuite]

shims/common/src/main/scala/org/apache/gluten/sql/shims/SparkShims.scala

Lines changed: 7 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -24,7 +24,7 @@ import org.apache.spark.broadcast.Broadcast
2424
import org.apache.spark.internal.io.FileCommitProtocol
2525
import org.apache.spark.sql.{AnalysisException, SparkSession}
2626
import org.apache.spark.sql.catalyst.InternalRow
27-
import org.apache.spark.sql.catalyst.expressions.{Attribute, Expression, InputFileBlockLength, InputFileBlockStart, InputFileName, RaiseError, UnBase64}
27+
import org.apache.spark.sql.catalyst.expressions.{Attribute, BinaryArithmetic, Expression, InputFileBlockLength, InputFileBlockStart, InputFileName, RaiseError, UnBase64}
2828
import org.apache.spark.sql.catalyst.plans.JoinType
2929
import org.apache.spark.sql.catalyst.plans.QueryPlan
3030
import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
@@ -250,6 +250,12 @@ trait SparkShims {
250250

251251
def widerDecimalType(d1: DecimalType, d2: DecimalType): DecimalType
252252

253+
// Spark 4.1+ (SPARK-53968) embeds allowDecimalPrecisionLoss in each arithmetic expression's
254+
// evalContext at analysis time. Spark41Shims overrides this to read from the expression.
255+
// All earlier versions have no evalContext field, so reading SQLConf.get here is correct.
256+
def decimalAllowPrecisionLoss(expr: BinaryArithmetic): Boolean =
257+
SQLConf.get.decimalOperationsAllowPrecisionLoss
258+
253259
def getRewriteCreateTableAsSelect(session: SparkSession): SparkStrategy = _ => Seq.empty
254260

255261
/** Shim method for get the "errorMessage" value for Spark 4.0 and above */

shims/spark41/src/main/scala/org/apache/gluten/sql/shims/spark41/Spark41Shims.scala

Lines changed: 11 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -610,6 +610,17 @@ class Spark41Shims extends SparkShims {
610610
DecimalPrecisionTypeCoercion.widerDecimalType(d1, d2)
611611
}
612612

613+
override def decimalAllowPrecisionLoss(expr: BinaryArithmetic): Boolean = expr match {
614+
case a: Add => a.evalContext.allowDecimalPrecisionLoss
615+
case s: Subtract => s.evalContext.allowDecimalPrecisionLoss
616+
case m: Multiply => m.evalContext.allowDecimalPrecisionLoss
617+
case d: Divide => d.evalContext.allowDecimalPrecisionLoss
618+
// Remainder and Pmod do not carry evalContext in Spark 4.1. They also throw
619+
// GlutenNotSupportException in DecimalArithmeticUtil.getResultType, so they never
620+
// reach Velox execution; SQLConf.get is a safe fallback for the name-lookup path.
621+
case _ => SQLConf.get.decimalOperationsAllowPrecisionLoss
622+
}
623+
613624
override def getErrorMessage(raiseError: RaiseError): Option[Expression] = {
614625
raiseError.errorParms match {
615626
case CreateMap(children, _)

0 commit comments

Comments
 (0)