Implement getErrorClass explicitly (CI validation on fork) #1

Open
dev-ankit wants to merge 2 commits into master from fix/geterrorclass-binary-compat

@dev-ankit

Owner

Motivation

CI validation on the fork for the same fix proposed upstream in streamnative#198.

PulsarIllegalStateException / PulsarIllegalArgumentException implement SparkThrowable but override only getCondition, relying on the deprecated SparkThrowable.getErrorClass default. Spark distributions that leave getErrorClass abstract (e.g. Databricks Runtime 18 / Spark 4.1) then raise AbstractMethodError when one of these exceptions escapes a task, which hangs the job. This implements getErrorClass explicitly on both classes.

Modifications

  • Explicit getErrorClass override on both connector SparkThrowable subclasses.
  • PulsarExceptionsSuite regression test.
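
The shape of the explicit override can be sketched as follows. This is a minimal illustration, not the connector's code: `ThrowableWithCondition` is a hypothetical stand-in for `SparkThrowable` (reduced to the two methods discussed here so the sketch compiles without Spark), and `SketchIllegalStateException` and the error class string are likewise invented for the example.

```java
// Hypothetical stand-in for org.apache.spark.SparkThrowable, reduced to the two
// methods discussed here so the sketch compiles without Spark on the classpath.
interface ThrowableWithCondition {
    String getCondition();

    // Mirrors the deprecated default described above: delegate to getCondition().
    // A distribution that leaves this method abstract would not ship this body.
    @Deprecated
    default String getErrorClass() {
        return getCondition();
    }
}

// Hypothetical exception modelled on the connector's SparkThrowable subclasses.
class SketchIllegalStateException extends IllegalStateException
        implements ThrowableWithCondition {
    private final String errorClass;

    SketchIllegalStateException(String errorClass, String message) {
        super(message);
        this.errorClass = errorClass;
    }

    @Override
    public String getCondition() {
        return errorClass;
    }

    // Explicit override: the method body lives on the class itself, so the call
    // no longer depends on the interface providing a default implementation.
    @Override
    @SuppressWarnings("deprecation")
    public String getErrorClass() {
        return errorClass;
    }
}

class ExplicitErrorClassSketch {
    public static void main(String[] args) {
        ThrowableWithCondition e =
            new SketchIllegalStateException("SKETCH_ERROR_CLASS", "illustrative failure");
        System.out.println(e.getErrorClass()); // prints SKETCH_ERROR_CLASS
        System.out.println(e.getCondition());  // prints SKETCH_ERROR_CLASS
    }
}
```

Because a single source file always compiles against one version of the interface, this sketch cannot reproduce the `AbstractMethodError` itself; it only shows where the method body ends up after the change.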

Verifying this change

  • Make sure that the change passes the CI checks.
  • This change added tests and can be verified as follows: PulsarExceptionsSuite; full ./mvnw clean verify passes locally (123 tests + 2 ignored).
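
One way a regression check for this kind of fix could be expressed is to assert, via reflection, that the class itself declares a concrete `getErrorClass` rather than inheriting an interface default. The sketch below is an assumption about how such a check might look, not the contents of PulsarExceptionsSuite; `ErrorClassDefault`, `InheritsDefault`, `OverridesExplicitly`, and `DeclaredMethodCheck` are hypothetical names.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Hypothetical interface with a default method, standing in for SparkThrowable.
interface ErrorClassDefault {
    default String getErrorClass() {
        return "FROM_DEFAULT";
    }
}

// Relies on the interface default: declares no getErrorClass of its own.
class InheritsDefault implements ErrorClassDefault {
}

// Declares the method on the class itself, as the fix does.
class OverridesExplicitly implements ErrorClassDefault {
    @Override
    public String getErrorClass() {
        return "FROM_CLASS";
    }
}

final class DeclaredMethodCheck {
    // True if `type` itself declares a concrete, no-argument method named `name`.
    // getDeclaredMethod ignores inherited methods, including interface defaults.
    static boolean declaresConcrete(Class<?> type, String name) {
        try {
            Method m = type.getDeclaredMethod(name);
            return !Modifier.isAbstract(m.getModifiers());
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(declaresConcrete(InheritsDefault.class, "getErrorClass"));     // prints false
        System.out.println(declaresConcrete(OverridesExplicitly.class, "getErrorClass")); // prints true
    }
}
```

A check of this form fails if the explicit override is later removed, even when the tests run against a Spark build that still supplies the default.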

Documentation

  • doc-required
  • no-need-doc
  • doc

dev-ankit and others added 2 commits May 29, 2026 19:04
…ions

PulsarIllegalStateException and PulsarIllegalArgumentException implement
SparkThrowable but override only getCondition, relying on the deprecated
SparkThrowable.getErrorClass default (which delegates to getCondition).

Some Spark distributions, notably Databricks Runtime 18 (Spark 4.1), leave
getErrorClass abstract rather than providing that default. When one of these
exceptions escapes a task, Spark's TaskResultGetter makes a virtual call to
getErrorClass(), finds no implementation, and raises:

  java.lang.AbstractMethodError:
    'java.lang.String org.apache.spark.SparkThrowable.getErrorClass()'
    at org.apache.spark.sql.pulsar.PulsarIllegalStateException.getErrorClass

This kills the result-getter thread, so the original failure (e.g. an
incompatible-schema produce) is never reported and the Spark job hangs
indefinitely instead of failing.

Implement getErrorClass explicitly (returning the error class, as
getCondition does) so the body lives on the class itself. This is harmless on OSS
Spark, where it simply overrides the deprecated default, and restores correct
error propagation on distributions that drop it.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

PulsarExceptions.pulsarSinkInvalidSchema references the error class
"PULSAR_SINK_INVALID_SCHEMA", which is not present in
error/pulsar-error-classes.json (only "PULSAR_SINK_INVALID_SCHEMA_TYPE" is),
so constructing it raises INTERNAL_ERROR. Use pulsarProviderInvalidSaveMode,
whose error class is defined, to exercise PulsarIllegalArgumentException.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>