Fix chunk download failures mis-bucketed as CONNECTION_ERROR in telemetry #1469

Merged
samikshya-db merged 2 commits into databricks:main from samikshya-db:fix-chunk-download-telemetry-bucket
May 27, 2026

@samikshya-db
Collaborator

Summary

The 5-arg DatabricksSQLException(reason, cause, statementId, chunkIndex, sqlState) constructor, used exclusively by ChunkDownloadTask, hardcoded CONNECTION_ERROR.name() in its exportFailureLog call, ignoring the sqlState argument passed in by the caller.

ChunkDownloadTask.java:98 passes CHUNK_DOWNLOAD_ERROR.name() as sqlState, so chunk download failures were:

  • stored on the SQLException with sqlState CHUNK_DOWNLOAD_ERROR (visible to JDBC consumers), but
  • recorded in telemetry under the CONNECTION_ERROR bucket.

This divergence was flagged by a pre-existing // TODO : Check chunk retry telemetry logic comment.

The fix passes the caller's sqlState through to exportFailureLog, so the telemetry errorName matches the SQLException's sqlState and the caller's intent.
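The before/after behavior can be sketched roughly as follows. This is a simplified, self-contained model and not the driver's actual code: `ChunkTelemetrySketch`, `BeforeFixException`, `AfterFixException`, and the in-memory `exportedErrorNames` list are hypothetical stand-ins, the two-argument `exportFailureLog` signature is assumed, and the `long` type for `chunkIndex` is a guess.

```java
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

public class ChunkTelemetrySketch {
    // Hypothetical in-memory stand-in for the telemetry exporter:
    // it only records the error name each failure was bucketed under.
    static final List<String> exportedErrorNames = new ArrayList<>();

    // Assumed signature; the real exportFailureLog takes driver-specific arguments.
    static void exportFailureLog(String errorName, String reason) {
        exportedErrorNames.add(errorName);
    }

    // Models the constructor before the fix: sqlState is stored on the
    // exception, but telemetry always receives CONNECTION_ERROR.
    static class BeforeFixException extends SQLException {
        BeforeFixException(String reason, Throwable cause, String statementId,
                           long chunkIndex, String sqlState) {
            super(reason, sqlState, cause);
            exportFailureLog("CONNECTION_ERROR", reason); // hardcoded bucket
        }
    }

    // Models the constructor after the fix: the caller's sqlState is
    // passed through, so telemetry and getSQLState() agree.
    static class AfterFixException extends SQLException {
        AfterFixException(String reason, Throwable cause, String statementId,
                          long chunkIndex, String sqlState) {
            super(reason, sqlState, cause);
            exportFailureLog(sqlState, reason); // caller's bucket
        }
    }

    public static void main(String[] args) {
        Throwable cause = new RuntimeException("download failed");
        new BeforeFixException("chunk failed", cause, "stmt-1", 3L, "CHUNK_DOWNLOAD_ERROR");
        new AfterFixException("chunk failed", cause, "stmt-1", 3L, "CHUNK_DOWNLOAD_ERROR");
        // First entry shows the old bucket, second entry the corrected one.
        System.out.println(exportedErrorNames);
    }
}
```

In both variants `getSQLState()` returns `CHUNK_DOWNLOAD_ERROR`; only the name handed to telemetry differs, which is the divergence this PR removes.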

Test plan

  • mvn test -pl . -Dtest=ChunkDownloadTaskTest
  • mvn test -pl . -Dtest=DatabricksSQLExceptionTest
  • Confirm telemetry dashboards bucket chunk download failures under CHUNK_DOWNLOAD_ERROR going forward (was CONNECTION_ERROR).

NO_CHANGELOG=true

This pull request and its description were written by Isaac.

… telemetry

The 5-arg DatabricksSQLException constructor used by ChunkDownloadTask
passes CHUNK_DOWNLOAD_ERROR.name() as sqlState, but the constructor was
hardcoding CONNECTION_ERROR.name() in the telemetry export. As a result,
chunk download failures were recorded under the CONNECTION_ERROR bucket
instead of CHUNK_DOWNLOAD_ERROR, masking the true failure category and
making telemetry-driven retry/alerting analysis unreliable.

Pass the caller's sqlState through to exportFailureLog so the telemetry
errorName matches the SQLException's sqlState and the caller's intent.

Co-authored-by: Isaac
Signed-off-by: samikshya-chand_data <samikshya.chand@databricks.com>
@samikshya-db merged commit 3cae948 into databricks:main on May 27, 2026
15 of 17 checks passed
@samikshya-db deleted the fix-chunk-download-telemetry-bucket branch on May 27, 2026 at 16:34