Commit a885e27

[SPARK-57750][SQL] Assign a name to the error condition _LEGACY_ERROR_TEMP_3084 and set its cause
### What changes were proposed in this pull request?

Replace the legacy error condition `_LEGACY_ERROR_TEMP_3084`, raised when a Hive UDF/UDAF/UDTF wrapper class fails to instantiate during function resolution, with the descriptive condition `CANNOT_INSTANTIATE_HIVE_FUNCTION`, and attach the original failure as the exception cause.

- Add `CANNOT_INSTANTIATE_HIVE_FUNCTION` (SQLSTATE `38000`) to `error-conditions.json` and remove `_LEGACY_ERROR_TEMP_3084`.
- Add `QueryCompilationErrors.cannotInstantiateHiveFunctionError(clazz, e)` that passes `cause = Some(e)` so the inner failure is preserved on the exception chain.
- Update `HiveSessionStateBuilder.makeHiveFunctionExpression` to throw the new error and drop the manual `setStackTrace` (the attached cause now carries the inner stack trace).
- Update `HiveUDFSuite` to assert via `checkError` on the new condition, and to read the inner failure via `getCause` where the wrapped message was previously asserted.

### Why are the changes needed?

Part of the error-class migration (umbrella [SPARK-37935](https://issues.apache.org/jira/browse/SPARK-37935)). The legacy condition used a free-form `e` message parameter and did not attach the cause: the 2-arg `AnalysisException(errorClass, messageParameters)` constructor sets `cause = None`, so `getCause` returned `null` and callers/tests could not programmatically unwrap the inner failure (for example, asserting the inner condition via `checkError`).

### Does this PR introduce _any_ user-facing change?

Yes. The error condition name and message change, and the original exception is now attached as the cause. This is a change within the unreleased `master` branch only.

Before:
```
[_LEGACY_ERROR_TEMP_3084] No handler for UDF/UDAF/UDTF '<clazz>': <e>
```

After:
```
[CANNOT_INSTANTIATE_HIVE_FUNCTION] Cannot instantiate the Hive UDF/UDAF/UDTF wrapper class <clazz>. Check that the function arguments and their types are supported. SQLSTATE: 38000
```

### How was this patch tested?

By running:
- `build/sbt "core/testOnly org.apache.spark.SparkThrowableSuite"`
- `build/sbt "hive/testOnly org.apache.spark.sql.hive.execution.HiveUDFSuite"`

### Was this patch authored or co-authored using generative AI tooling?

Generated-by: Cursor

Closes #56867 from MaxGekk/error-cond_LEGACY_ERROR_TEMP_3084.

Authored-by: Maxim Gekk <max.gekk@gmail.com>
Signed-off-by: Max Gekk <max.gekk@gmail.com>
(cherry picked from commit 4810493)
Signed-off-by: Max Gekk <max.gekk@gmail.com>
1 parent a19544c commit a885e27

5 files changed

Lines changed: 53 additions & 34 deletions

common/utils/src/main/resources/error/error-conditions.json

Lines changed: 6 additions & 5 deletions
@@ -498,6 +498,12 @@
     ],
     "sqlState" : "22546"
   },
+  "CANNOT_INSTANTIATE_HIVE_FUNCTION" : {
+    "message" : [
+      "Cannot instantiate the Hive UDF/UDAF/UDTF wrapper class <clazz>. Check that the function arguments and their types are supported."
+    ],
+    "sqlState" : "38000"
+  },
   "CANNOT_INVOKE_IN_TRANSFORMATIONS" : {
     "message" : [
       "Dataset transformations and actions can only be invoked by the driver, not inside of other Dataset transformations; for example, dataset1.map(x => dataset2.values.count() * x) is invalid because the values transformation and count action cannot be performed inside of the dataset1.map transformation. For more information, see SPARK-28702."
@@ -11325,11 +11331,6 @@
       "Unable to infer the schema. The schema specification is required to create the table <tableName>."
     ]
   },
-  "_LEGACY_ERROR_TEMP_3084" : {
-    "message" : [
-      "No handler for UDF/UDAF/UDTF '<clazz>': <e>"
-    ]
-  },
   "_LEGACY_ERROR_TEMP_3086" : {
     "message" : [
       "Cannot persist <tableName> into Hive metastore as table property keys may not start with 'spark.sql.': <invalidKeys>"

sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala

Lines changed: 7 additions & 0 deletions
@@ -4596,6 +4596,13 @@ private[sql] object QueryCompilationErrors extends QueryErrorsBase with Compilat
       messageParameters = Map("invalidClass" -> invalidClass))
   }
 
+  def cannotInstantiateHiveFunctionError(clazz: String, e: Throwable): Throwable = {
+    new AnalysisException(
+      errorClass = "CANNOT_INSTANTIATE_HIVE_FUNCTION",
+      messageParameters = Map("clazz" -> clazz),
+      cause = Some(e))
+  }
+
   def unsupportedParameterExpression(expr: Expression): Throwable = {
     new AnalysisException(
       errorClass = "UNSUPPORTED_EXPR_FOR_PARAMETER",
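The difference between stringifying the inner failure into a message parameter and attaching it as the cause can be sketched without Spark on the classpath. `SketchAnalysisException` and `CauseSketch` below are hypothetical stand-ins written for this illustration, not Spark classes; only the two condition names and the `clazz`/`e` parameter names come from the diff.

```scala
// Hypothetical stand-in for Spark's AnalysisException, kept minimal so the
// sketch compiles on its own. The optional cause is forwarded to Exception.
class SketchAnalysisException(
    val condition: String,
    val messageParameters: Map[String, String],
    val cause: Option[Throwable] = None)
  extends Exception(s"[$condition] $messageParameters", cause.orNull)

object CauseSketch {
  // Legacy style: the inner failure survives only as text in parameter "e".
  def legacyStyle(clazz: String, e: Throwable): SketchAnalysisException =
    new SketchAnalysisException(
      "_LEGACY_ERROR_TEMP_3084",
      Map("clazz" -> clazz, "e" -> e.toString))

  // New style: the inner failure is attached, so getCause can unwrap it.
  def namedStyle(clazz: String, e: Throwable): SketchAnalysisException =
    new SketchAnalysisException(
      "CANNOT_INSTANTIATE_HIVE_FUNCTION",
      Map("clazz" -> clazz),
      cause = Some(e))
}
```

With the legacy style a caller can only match on message text; with the new style a caller can follow `getCause` to the original exception object and inspect its type or condition.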

sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionStateBuilder.scala

Lines changed: 2 additions & 8 deletions
@@ -24,7 +24,6 @@ import scala.util.control.NonFatal
 import org.apache.hadoop.hive.ql.exec.{UDAF, UDF}
 import org.apache.hadoop.hive.ql.udf.generic.{AbstractGenericUDAFResolver, GenericUDF, GenericUDTF}
 
-import org.apache.spark.sql.AnalysisException
 import org.apache.spark.sql.catalyst.analysis.{Analyzer, EvalSubqueriesForTimeTravel, InvokeProcedures, ReplaceCharWithVarchar, ResolveDataSource, ResolveEventTimeWatermark, ResolveExecuteImmediate, ResolveMetricView, ResolveSessionCatalog, ResolveTranspose}
 import org.apache.spark.sql.catalyst.analysis.resolver.ResolverExtension
 import org.apache.spark.sql.catalyst.catalog.{ExternalCatalogWithListener, InvalidUDFClassException}
@@ -246,13 +245,8 @@ object HiveUDFExpressionBuilder extends SparkUDFExpressionBuilder {
           case i: InvocationTargetException => i.getCause
           case o => o
         }
-        val analysisException = new AnalysisException(
-          errorClass = "_LEGACY_ERROR_TEMP_3084",
-          messageParameters = Map(
-            "clazz" -> clazz.getCanonicalName,
-            "e" -> e.toString))
-        analysisException.setStackTrace(e.getStackTrace)
-        throw analysisException
+        throw QueryCompilationErrors.cannotInstantiateHiveFunctionError(
+          clazz.getCanonicalName, e)
     }
     udfExpr.getOrElse {
       throw QueryCompilationErrors.invalidUDFClassError(clazz.getCanonicalName)
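The context lines of this hunk unwrap `InvocationTargetException` before the error is built, which is why the attached cause is the failure raised inside the wrapper class and not the reflection wrapper. A minimal sketch of that unwrapping, using only the JDK reflection API, follows; `FailingWrapper` and `UnwrapSketch` are hypothetical names invented for the example.

```scala
import java.lang.reflect.InvocationTargetException
import scala.util.control.NonFatal

// Hypothetical class whose constructor always fails, standing in for a
// wrapper class that rejects its arguments.
class FailingWrapper {
  throw new IllegalArgumentException("unsupported argument types")
}

object UnwrapSketch {
  // Same shape as the match in the diff: reflection wraps constructor
  // failures in InvocationTargetException, so take its cause when present.
  def rootFailure(t: Throwable): Throwable = t match {
    case i: InvocationTargetException => i.getCause
    case o => o
  }

  // Instantiate reflectively and report the unwrapped failure, if any.
  def instantiate(clazz: Class[_]): Either[Throwable, Any] =
    try Right(clazz.getDeclaredConstructor().newInstance())
    catch { case NonFatal(e) => Left(rootFailure(e)) }
}
```

Calling `UnwrapSketch.instantiate(classOf[FailingWrapper])` yields a `Left` holding the `IllegalArgumentException` itself, which is the object a caller would then pass as the cause.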

sql/hive/src/test/scala/org/apache/spark/sql/hive/UDFSuite.scala

Lines changed: 1 addition & 1 deletion
@@ -204,7 +204,7 @@ class UDFSuite
         sql(s"SELECT $functionName(value) from $testTableName")
       }
 
-      assert(e.getMessage.contains("Can not get an evaluator of the empty UDAF"))
+      assert(e.getCause.getMessage.contains("Can not get an evaluator of the empty UDAF"))
     }
   }
 

sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveUDFSuite.scala

Lines changed: 37 additions & 20 deletions
@@ -290,8 +290,10 @@ class HiveUDFSuite extends QueryTest with TestHiveSingleton {
     sql(s"CREATE TEMPORARY FUNCTION testUDFRawList " +
       s"AS '${classOf[UDFRawList].getName}'")
     val err = intercept[AnalysisException](sql("SELECT testUDFRawList(s) FROM inputTable"))
-    assert(err.getMessage.contains(
-      "Raw list type in java is unsupported because Spark cannot infer the element type."))
+    checkError(
+      exception = err.getCause.asInstanceOf[AnalysisException],
+      condition = "_LEGACY_ERROR_TEMP_3090",
+      parameters = Map.empty)
 
     sql("DROP TEMPORARY FUNCTION IF EXISTS testUDFRawList")
     hiveContext.reset()
@@ -304,8 +306,10 @@ class HiveUDFSuite extends QueryTest with TestHiveSingleton {
     sql(s"CREATE TEMPORARY FUNCTION testUDFRawMap " +
       s"AS '${classOf[UDFRawMap].getName}'")
     val err = intercept[AnalysisException](sql("SELECT testUDFRawMap(s) FROM inputTable"))
-    assert(err.getMessage.contains(
-      "Raw map type in java is unsupported because Spark cannot infer key and value types."))
+    checkError(
+      exception = err.getCause.asInstanceOf[AnalysisException],
+      condition = "_LEGACY_ERROR_TEMP_3091",
+      parameters = Map.empty)
 
     sql("DROP TEMPORARY FUNCTION IF EXISTS testUDFRawMap")
     hiveContext.reset()
@@ -318,9 +322,10 @@ class HiveUDFSuite extends QueryTest with TestHiveSingleton {
     sql(s"CREATE TEMPORARY FUNCTION testUDFWildcardList " +
       s"AS '${classOf[UDFWildcardList].getName}'")
     val err = intercept[AnalysisException](sql("SELECT testUDFWildcardList(s) FROM inputTable"))
-    assert(err.getMessage.contains(
-      "Collection types with wildcards (e.g. List<?> or Map<?, ?>) are unsupported " +
-      "because Spark cannot infer the data type for these type parameters."))
+    checkError(
+      exception = err.getCause.asInstanceOf[AnalysisException],
+      condition = "_LEGACY_ERROR_TEMP_3092",
+      parameters = Map.empty)
 
     sql("DROP TEMPORARY FUNCTION IF EXISTS testUDFWildcardList")
     hiveContext.reset()
@@ -414,10 +419,16 @@ class HiveUDFSuite extends QueryTest with TestHiveSingleton {
     def testErrorMsgForFunc(funcName: String, className: String): Unit = {
       withUserDefinedFunction(funcName -> true) {
         sql(s"CREATE TEMPORARY FUNCTION $funcName AS '$className'")
-        val message = intercept[AnalysisException] {
-          sql(s"SELECT $funcName() FROM testUDF")
-        }.getMessage
-        assert(message.contains(s"No handler for UDF/UDAF/UDTF '$className'"))
+        checkError(
+          exception = intercept[AnalysisException] {
+            sql(s"SELECT $funcName() FROM testUDF")
+          },
+          condition = "CANNOT_INSTANTIATE_HIVE_FUNCTION",
+          parameters = Map("clazz" -> className),
+          context = ExpectedContext(
+            fragment = s"$funcName()",
+            start = 7,
+            stop = 6 + s"$funcName()".length))
       }
     }
 
@@ -678,15 +689,21 @@ class HiveUDFSuite extends QueryTest with TestHiveSingleton {
         sql("SELECT testArraySum(array(1, 1.1, 1.2))"),
         Seq(Row(3.3)))
 
-      val msg = intercept[AnalysisException] {
-        sql("SELECT testArraySum(1)")
-      }.getMessage
-      assert(msg.contains(s"No handler for UDF/UDAF/UDTF '${classOf[ArraySumUDF].getName}'"))
-
-      val msg2 = intercept[AnalysisException] {
-        sql("SELECT testArraySum(1, 2)")
-      }.getMessage
-      assert(msg2.contains(s"No handler for UDF/UDAF/UDTF '${classOf[ArraySumUDF].getName}'"))
+      checkError(
+        exception = intercept[AnalysisException] {
+          sql("SELECT testArraySum(1)")
+        },
+        condition = "CANNOT_INSTANTIATE_HIVE_FUNCTION",
+        parameters = Map("clazz" -> classOf[ArraySumUDF].getCanonicalName),
+        context = ExpectedContext(fragment = "testArraySum(1)", start = 7, stop = 21))
+
+      checkError(
+        exception = intercept[AnalysisException] {
+          sql("SELECT testArraySum(1, 2)")
+        },
+        condition = "CANNOT_INSTANTIATE_HIVE_FUNCTION",
+        parameters = Map("clazz" -> classOf[ArraySumUDF].getCanonicalName),
+        context = ExpectedContext(fragment = "testArraySum(1, 2)", start = 7, stop = 24))
     }
   }
 
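The `start` and `stop` values passed to `ExpectedContext` in these tests are consistent with a zero-based start offset and an inclusive stop offset of the fragment within the query text: `SELECT ` occupies offsets 0 to 6, so the call begins at 7 and ends at `6 + fragment.length`. The helper below is a hypothetical sketch of that arithmetic, not part of the Spark test utilities.

```scala
object ContextOffsets {
  // Zero-based start and inclusive stop of the first occurrence of
  // `fragment` in `query`. Assumes the fragment occurs in the query.
  def offsets(query: String, fragment: String): (Int, Int) = {
    val start = query.indexOf(fragment)
    require(start >= 0, s"fragment '$fragment' not found in query")
    (start, start + fragment.length - 1)
  }
}
```

For the two literal queries in the last hunk, `offsets("SELECT testArraySum(1)", "testArraySum(1)")` gives `(7, 21)` and `offsets("SELECT testArraySum(1, 2)", "testArraySum(1, 2)")` gives `(7, 24)`, matching the hard-coded values.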
