From 0119573408f91775df7a969df0df0d3577ccc4e8 Mon Sep 17 00:00:00 2001
From: Norio
Date: Sat, 25 Apr 2026 00:00:00 +0000
Subject: [PATCH] [GLUTEN-11916][VL][TEST] Enable subquery/exists-subquery/exists-orderby-limit.sql with SPARK-57125 workaround

GlutenSQLQueryTestSuite excludes ConvertToLocalRelation, ConstantFolding
and NullPropagation by default to force queries through Gluten's offload
paths. However, the EXISTS+OFFSET queries in exists-orderby-limit.sql hit
Spark's LimitPushDown rule, which rewrites
LocalLimit(le, Offset(oe, child)) into
Offset(oe, LocalLimit(Add(le, oe), child)) and relies on ConstantFolding
to subsequently fold `Add(Literal(N), Literal(M))` to `Literal(N + M)`.
Without ConstantFolding, the unfolded Add reaches physical planning, where
BasicOperators only matches LocalLimit(IntegerLiteral, _). Planning then
fails with

  AssertionError: No plan for LocalLimit (1 + 2)

wrapped as [INTERNAL_ERROR].

This patch enables the test and re-enables ConstantFolding for just this
SQL file via a per-file `--SET spark.sql.optimizer.excludedRules=...`
directive that keeps only ConvertToLocalRelation excluded.

The upstream Spark fix is tracked as SPARK-57125 (Apache Spark PR #56180),
which makes LimitPushDown produce a literal sum directly so the rule no
longer depends on ConstantFolding. Once that lands and Gluten picks up the
fixed Spark version, the `--SET` directive in this file can be removed.

Note: the test framework's `--SET` parser splits values by comma, so
multiple excluded rules cannot be specified in a single directive
(recorded separately for a future Spark/Gluten follow-up). As a side
effect, the directive also re-enables NullPropagation, which is acceptable
for this test.
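
For reviewers, the failure mode can be sketched with a toy model. This is
only an illustration under stated assumptions: the case classes and
functions below are hypothetical stand-ins written for this message, not
Spark's Catalyst classes or rules, and they model nothing beyond the
LocalLimit-over-Offset rewrite and the literal-only planner match
described above.

```scala
// Toy model only: hypothetical stand-ins, NOT Spark's Catalyst classes.
sealed trait Expr
final case class Literal(value: Int) extends Expr
final case class Add(left: Expr, right: Expr) extends Expr

sealed trait Plan
final case class Relation(name: String) extends Plan
final case class Offset(offset: Expr, child: Plan) extends Plan
final case class LocalLimit(limit: Expr, child: Plan) extends Plan

object LimitOffsetSketch {
  // Mirrors the rewrite described above: LocalLimit(le, Offset(oe, child))
  // becomes Offset(oe, LocalLimit(Add(le, oe), child)).
  def limitPushDown(plan: Plan): Plan = plan match {
    case LocalLimit(le, Offset(oe, child)) =>
      Offset(oe, LocalLimit(Add(le, oe), child))
    case other => other
  }

  // Stand-in for constant folding: collapses Add of two literals.
  def foldExpr(e: Expr): Expr = e match {
    case Add(l, r) =>
      (foldExpr(l), foldExpr(r)) match {
        case (Literal(a), Literal(b)) => Literal(a + b)
        case (fl, fr)                 => Add(fl, fr)
      }
    case lit: Literal => lit
  }

  def constantFolding(plan: Plan): Plan = plan match {
    case LocalLimit(e, c) => LocalLimit(foldExpr(e), constantFolding(c))
    case Offset(e, c)     => Offset(foldExpr(e), constantFolding(c))
    case r: Relation      => r
  }

  // Stand-in for the planner: only a literal limit can be planned.
  def planLocalLimit(plan: Plan): Either[String, Int] = plan match {
    case Offset(_, inner)          => planLocalLimit(inner)
    case LocalLimit(Literal(n), _) => Right(n)
    case other                     => Left(s"No plan for $other")
  }

  def main(args: Array[String]): Unit = {
    // Shape of a LIMIT 1 OFFSET 2 query over a relation.
    val query = LocalLimit(Literal(1), Offset(Literal(2), Relation("EMP")))
    val pushed = limitPushDown(query)
    // Without folding, the limit is still an Add, so planning yields a Left.
    println(planLocalLimit(pushed))
    // With folding, the limit is Literal(3), so planning yields Right(3).
    println(planLocalLimit(constantFolding(pushed)))
  }
}
```

In this sketch, skipping constantFolding corresponds to the default
excluded-rules list, and applying it corresponds to the per-file `--SET`
override added by this patch.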
---
 .../exists-subquery/exists-orderby-limit.sql    | 12 ++++++++++++
 .../utils/velox/VeloxSQLQueryTestSettings.scala |  2 +-
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/gluten-ut/spark41/src/test/resources/backends-velox/sql-tests/inputs/subquery/exists-subquery/exists-orderby-limit.sql b/gluten-ut/spark41/src/test/resources/backends-velox/sql-tests/inputs/subquery/exists-subquery/exists-orderby-limit.sql
index 9ff3409b21a..1b2a884d158 100644
--- a/gluten-ut/spark41/src/test/resources/backends-velox/sql-tests/inputs/subquery/exists-subquery/exists-orderby-limit.sql
+++ b/gluten-ut/spark41/src/test/resources/backends-velox/sql-tests/inputs/subquery/exists-subquery/exists-orderby-limit.sql
@@ -5,6 +5,18 @@
 --CONFIG_DIM1 spark.sql.codegen.wholeStage=false,spark.sql.codegen.factoryMode=CODEGEN_ONLY
 --CONFIG_DIM1 spark.sql.codegen.wholeStage=false,spark.sql.codegen.factoryMode=NO_CODEGEN
 
+-- GlutenSQLQueryTestSuite excludes ConvertToLocalRelation, ConstantFolding and
+-- NullPropagation by default to force queries through Gluten's offload paths.
+-- However, EXISTS+OFFSET queries hit Spark's LimitPushDown rule, which produces
+-- `LocalLimit(Add(limit, offset), ...)` and assumes ConstantFolding will fold
+-- the Add. Without ConstantFolding the planner fails with
+-- `AssertionError: No plan for LocalLimit (1 + 2)` because BasicOperators only
+-- matches LocalLimit(IntegerLiteral, _). Keep only ConvertToLocalRelation
+-- excluded for this file, which re-enables ConstantFolding. NullPropagation is
+-- re-enabled as well: the --SET parser splits values on commas, so a
+-- multi-rule list cannot be given. Tracked upstream as SPARK-57125.
+--SET spark.sql.optimizer.excludedRules=org.apache.spark.sql.catalyst.optimizer.ConvertToLocalRelation
+
 --ONLY_IF spark
 CREATE TEMPORARY VIEW EMP AS SELECT * FROM VALUES
   (100, "emp 1", date "2005-01-01", 100.00D, 10),
diff --git a/gluten-ut/spark41/src/test/scala/org/apache/gluten/utils/velox/VeloxSQLQueryTestSettings.scala b/gluten-ut/spark41/src/test/scala/org/apache/gluten/utils/velox/VeloxSQLQueryTestSettings.scala
index ea58cc7a5cd..0202ecd24c0 100644
--- a/gluten-ut/spark41/src/test/scala/org/apache/gluten/utils/velox/VeloxSQLQueryTestSettings.scala
+++ b/gluten-ut/spark41/src/test/scala/org/apache/gluten/utils/velox/VeloxSQLQueryTestSettings.scala
@@ -127,7 +127,7 @@ object VeloxSQLQueryTestSettings extends SQLQueryTestSettings {
     "subquery/exists-subquery/exists-cte.sql",
     "subquery/exists-subquery/exists-having.sql",
     "subquery/exists-subquery/exists-joins-and-set-ops.sql",
-    // TODO: fix on Spark-4.1 "subquery/exists-subquery/exists-orderby-limit.sql",
+    "subquery/exists-subquery/exists-orderby-limit.sql",
     "subquery/exists-subquery/exists-outside-filter.sql",
     "subquery/exists-subquery/exists-within-and-or.sql",
     "subquery/in-subquery/in-basic.sql",