[SPARK-57750][SQL] Assign a name to the error condition _LEGACY_ERROR_TEMP_3084 and set its cause #56867

Open
MaxGekk wants to merge 2 commits into apache:master from MaxGekk:error-cond_LEGACY_ERROR_TEMP_3084

MaxGekk commented Jun 29, 2026

What changes were proposed in this pull request?

Replace the legacy error condition _LEGACY_ERROR_TEMP_3084, raised when a Hive UDF/UDAF/UDTF wrapper class fails to instantiate during function resolution, with the descriptive condition CANNOT_INSTANTIATE_HIVE_FUNCTION, and attach the original failure as the exception cause.

  • Add CANNOT_INSTANTIATE_HIVE_FUNCTION (SQLSTATE 38000) to error-conditions.json and remove _LEGACY_ERROR_TEMP_3084.
  • Add QueryCompilationErrors.cannotInstantiateHiveFunctionError(clazz, e) that passes cause = Some(e) so the inner failure is preserved on the exception chain.
  • Update HiveSessionStateBuilder.makeHiveFunctionExpression to throw the new error and drop the manual setStackTrace (the attached cause now carries the inner stack trace).
  • Update HiveUDFSuite to assert via checkError on the new condition, and to read the inner failure via getCause where the wrapped message was previously asserted.
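
The helper and call-site change listed above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from the patch: `SketchAnalysisException`, `SketchErrors`, and `SketchResolver` are hypothetical stand-ins for Spark's `AnalysisException`, `QueryCompilationErrors`, and `HiveSessionStateBuilder`, and the single `clazz` message parameter is an assumption inferred from the "After" message in this description.

```scala
import scala.util.control.NonFatal

// Hypothetical, simplified stand-in for Spark's AnalysisException (not the real class).
// The optional cause is forwarded to the JVM exception chain via cause.orNull.
class SketchAnalysisException(
    val condition: String,
    val messageParameters: Map[String, String],
    val cause: Option[Throwable] = None)
  extends Exception(s"[$condition] $messageParameters", cause.orNull)

// Hypothetical stand-in for the error helper object.
object SketchErrors {
  // Same shape as cannotInstantiateHiveFunctionError(clazz, e) described above:
  // the inner failure travels as the cause instead of being flattened into the message.
  def cannotInstantiateHiveFunctionError(clazz: String, e: Throwable): SketchAnalysisException =
    new SketchAnalysisException(
      condition = "CANNOT_INSTANTIATE_HIVE_FUNCTION",
      messageParameters = Map("clazz" -> clazz), // assumed parameter name
      cause = Some(e))
}

// Hypothetical stand-in for the call site: wrap any non-fatal instantiation failure.
object SketchResolver {
  def wrapInstantiation(clazz: String)(instantiate: => AnyRef): AnyRef =
    try instantiate
    catch {
      // No manual setStackTrace: the attached cause already carries the inner stack trace.
      case NonFatal(e) => throw SketchErrors.cannotInstantiateHiveFunctionError(clazz, e)
    }
}
```

With this shape, a caller that catches the wrapper can reach the original failure through `getCause` rather than parsing the message text.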

Why are the changes needed?

Part of the error-class migration (umbrella SPARK-37935). The legacy condition used a free-form e message parameter and did not attach the cause: the 2-arg AnalysisException(errorClass, messageParameters) constructor sets cause = None, so getCause returned null and callers/tests could not programmatically unwrap the inner failure (for example, asserting the inner condition via checkError).
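
The difference between the two styles can be illustrated with plain JDK exceptions. This is a sketch under stated assumptions, not Spark code: `CauseChainSketch`, `SomeUdf`, and the inner message are made up for illustration, and `RuntimeException` stands in for `AnalysisException`.

```scala
object CauseChainSketch {
  // Illustrative inner failure (hypothetical message).
  val inner: Throwable = new IllegalArgumentException("unsupported argument type")

  // Legacy shape: the inner failure is flattened into the message text, no cause attached.
  val legacyStyle: Throwable =
    new RuntimeException(s"No handler for UDF/UDAF/UDTF 'SomeUdf': $inner")

  // New shape: the inner failure is attached as the cause.
  val newStyle: Throwable =
    new RuntimeException("Cannot instantiate the Hive UDF/UDAF/UDTF wrapper class SomeUdf.", inner)

  // Programmatic unwrapping only works when a cause is attached;
  // Option(...) turns a null getCause into None.
  def innerMessage(t: Throwable): Option[String] =
    Option(t.getCause).map(_.getMessage)
}
```

Here `CauseChainSketch.innerMessage(CauseChainSketch.legacyStyle)` yields `None`, while the same call on `newStyle` yields `Some("unsupported argument type")`, which is what allows a test to assert on the inner failure directly.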

Does this PR introduce any user-facing change?

Yes. The error condition name and message change, and the original exception is now attached as the cause. This is a change within the unreleased master branch only.

Before:

[_LEGACY_ERROR_TEMP_3084] No handler for UDF/UDAF/UDTF '<clazz>': <e>

After:

[CANNOT_INSTANTIATE_HIVE_FUNCTION] Cannot instantiate the Hive UDF/UDAF/UDTF wrapper class <clazz>. Check that the function arguments and their types are supported. SQLSTATE: 38000

How was this patch tested?

By running:

  • build/sbt "core/testOnly org.apache.spark.SparkThrowableSuite"
  • build/sbt "hive/testOnly org.apache.spark.sql.hive.execution.HiveUDFSuite"

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Cursor

…ched cause

Follow-up: use checkError on the inner AnalysisException (the now-attached cause)
for the raw/wildcard collection cases in HiveUDFSuite, and read the inner Hive
SemanticException message via getCause in UDFSuite (SPARK-21318), which previously
relied on the inner failure being embedded in the wrapper exception's message.