Skip to content

[SPARK-57555][FOLLOWUP][SQL] Add escape-hatch conf to keep legacy JDBC TIME-to-timestamp mapping#56884

Open
cloud-fan wants to merge 2 commits into
apache:masterfrom
cloud-fan:dw-jdbc-time-type-legacy-conf
Open

[SPARK-57555][FOLLOWUP][SQL] Add escape-hatch conf to keep legacy JDBC TIME-to-timestamp mapping#56884
cloud-fan wants to merge 2 commits into
apache:masterfrom
cloud-fan:dw-jdbc-time-type-legacy-conf

Conversation

@cloud-fan

@cloud-fan cloud-fan commented Jun 30, 2026

Copy link
Copy Markdown
Contributor

What changes were proposed in this pull request?

SPARK-57555 made the built-in JDBC data source read SQL TIME columns as TimeType (instead of the legacy TimestampType) when spark.sql.timeType.enabled is true. That is a schema-inference change for existing JDBC reads: a column that previously came back as TimestampType now comes back as TimeType, with no DDL change by the user.

This PR adds an internal escape-hatch conf, spark.sql.legacy.jdbc.timeMapping.enabled (default false), following the existing spark.sql.legacy.<db>.*Mapping.enabled family. When set to true, JDBC TIME columns keep the legacy TimestampType mapping even when spark.sql.timeType.enabled is true. It has no effect when spark.sql.timeType.enabled is false.

Why are the changes needed?

spark.sql.timeType.enabled is a single global switch that governs TIME support across all of Spark. Enabling it flips JDBC TIME reads from TimestampType to TimeType in one step, which can silently break workloads that depend on the old mapping (downstream casts, comparisons, stored schemas, serialization). The escape hatch lets such workloads opt back into the legacy JDBC behavior without having to keep the entire TIME type disabled.

Does this PR introduce any user-facing change?

No behavioral change by default. The new conf defaults to false, preserving SPARK-57555 behavior. When explicitly set to true (and spark.sql.timeType.enabled is true), JDBC TIME columns read as TimestampType as they did before SPARK-57555.

How was this patch tested?

Added a unit test in JDBCSuite asserting that, with spark.sql.timeType.enabled=true, the column reads as TimestampType when the escape hatch is on and as TimeType when it is off.

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Code (Opus 4.8)

…C TIME-to-timestamp mapping

### What changes were proposed in this pull request?
SPARK-57555 made the built-in JDBC data source read SQL `TIME` columns as `TimeType` (instead of the legacy `TimestampType`) when `spark.sql.timeType.enabled` is true. That is a schema-inference change for existing JDBC reads: a column that previously came back as `TimestampType` now comes back as `TimeType`, with no DDL change by the user.

This PR adds an internal escape-hatch conf, `spark.sql.legacy.jdbc.timeMapping.enabled` (default `false`), following the existing `spark.sql.legacy.<db>.*Mapping.enabled` family. When set to true, JDBC `TIME` columns keep the legacy `TimestampType` mapping even when `spark.sql.timeType.enabled` is true. It has no effect when `spark.sql.timeType.enabled` is false.

### Why are the changes needed?
`spark.sql.timeType.enabled` is a single global switch that governs `TIME` support across all of Spark. Enabling it flips JDBC `TIME` reads from `TimestampType` to `TimeType` in one step, which can silently break workloads that depend on the old mapping (downstream casts, comparisons, stored schemas, serialization). The escape hatch lets such workloads opt back into the legacy JDBC behavior without having to keep the entire `TIME` type disabled.

### Does this PR introduce any user-facing change?
No behavioral change by default. The new conf defaults to `false`, preserving SPARK-57555 behavior. When explicitly set to `true` (and `spark.sql.timeType.enabled` is `true`), JDBC `TIME` columns read as `TimestampType` as they did before SPARK-57555.

### How was this patch tested?
Added a unit test in `JDBCSuite` asserting that, with `spark.sql.timeType.enabled=true`, the column reads as `TimestampType` when the escape hatch is on and as `TimeType` when it is off.

### Was this patch authored or co-authored using generative AI tooling?
No
@cloud-fan

Copy link
Copy Markdown
Contributor Author

cc @MaxGekk

@MaxGekk

MaxGekk commented Jun 30, 2026

Copy link
Copy Markdown
Member

I reviewed the change and didn't find correctness issues.

The config semantics look right to me: JDBC TIME maps to TimeType only when spark.sql.timeType.enabled=true and the new legacy escape hatch is false; when the escape hatch is true it falls back to the existing timestamp mapping path, preserving preferTimestampNTZ behavior through getTimestampType(isTimestampNTZ).

I also checked that dialect-specific mappings such as PostgreSQL timetz and H2 TIME_WITH_TIMEZONE are not affected by this default mapper change, which seems intentional. The .version("4.3.0") also looks consistent with the expected branch policy for this change.

Only minor note: the new test does not explicitly cover spark.sql.timeType.enabled=false together with the escape hatch enabled, but the existing legacy TIME test already covers the disabled-TIME path and the code path is straightforward. So this is not a blocker from my side.

…mapping conf and cover disabled-TIME case

Add `.withBindingPolicy(ConfigBindingPolicy.SESSION)` to the new
`spark.sql.legacy.jdbc.timeMapping.enabled` conf so it passes
`SparkConfigBindingPolicySuite`, and add a test case asserting the escape
hatch has no effect when `spark.sql.timeType.enabled` is false.

Co-authored-by: Isaac
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants