Resolve type mismatch between PPL's VARCHAR-backed datetime UDTs and
standard Calcite DATE/TIME/TIMESTAMP columns in the unified query API's
PPL path.
DatetimeUdtExtension contributes a CoercionRule that wraps every standard
datetime expression with CAST(x AS VARCHAR), aligning unified PPL with
PPL V3's string-based datetime semantics. Hooks into the new
postAnalysisRules API on LanguageSpec, applied at the top of
UnifiedQueryPlanner.plan(). PPL path only; no impact on SQL.
Signed-off-by: Chen Dai <daichen@amazon.com>
Description
This PR introduces `DatetimeUdtExtension`, a PPL-spec extension that adds a `CoercionRule`. The rule wraps Calcite standard datetime `RexNode`s in `CAST(... AS <UDT>)` so that unified PPL follows PPL V3’s string-based datetime contract. For more background on the problem and its root cause, see #5250 (comment).
Examples
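As a rough sketch of the rewrite this rule performs, the toy model below wraps every datetime-typed node in a cast to the matching UDT. The `Expr`, `InputRef`, `Call`, and `Cast` types and the `coerce` method are hypothetical stand-ins for illustration only; they are not Calcite's `RexNode` API or the actual `CoercionRule` in this PR.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: hypothetical types, not Calcite's RexNode API.
final class DatetimeCastSketch {
    enum SqlType { DATE, TIME, TIMESTAMP, INTEGER, VARCHAR }

    interface Expr {
        SqlType type();
        String render();
    }

    // A reference to an input column, e.g. $0.
    static final class InputRef implements Expr {
        final int index;
        final SqlType type;
        InputRef(int index, SqlType type) { this.index = index; this.type = type; }
        public SqlType type() { return type; }
        public String render() { return "$" + index; }
    }

    // A function call such as YEAR(...).
    static final class Call implements Expr {
        final String op;
        final SqlType type;
        final List<Expr> operands;
        Call(String op, SqlType type, List<Expr> operands) {
            this.op = op;
            this.type = type;
            this.operands = operands;
        }
        public SqlType type() { return type; }
        public String render() {
            List<String> parts = new ArrayList<>();
            for (Expr operand : operands) parts.add(operand.render());
            return op + "(" + String.join(", ", parts) + ")";
        }
    }

    // A cast to a VARCHAR-backed datetime UDT.
    static final class Cast implements Expr {
        final Expr operand;
        final String udtName;
        Cast(Expr operand, String udtName) { this.operand = operand; this.udtName = udtName; }
        public SqlType type() { return SqlType.VARCHAR; }
        public String render() { return "CAST(" + operand.render() + " AS " + udtName + ")"; }
    }

    // Maps a standard datetime type to its UDT name; null for any other type.
    static String udtFor(SqlType type) {
        switch (type) {
            case DATE: return "EXPR_DATE";
            case TIME: return "EXPR_TIME";
            case TIMESTAMP: return "EXPR_TIMESTAMP";
            default: return null;
        }
    }

    // Rewrites operands bottom-up, then wraps the node itself if it is datetime-typed.
    static Expr coerce(Expr expr) {
        Expr rewritten = expr;
        if (expr instanceof Call) {
            Call call = (Call) expr;
            List<Expr> operands = new ArrayList<>();
            for (Expr operand : call.operands) operands.add(coerce(operand));
            rewritten = new Call(call.op, call.type, operands);
        }
        String udt = udtFor(rewritten.type());
        return udt == null ? rewritten : new Cast(rewritten, udt);
    }

    public static void main(String[] args) {
        Expr year = new Call("YEAR", SqlType.INTEGER,
            Arrays.<Expr>asList(new InputRef(0, SqlType.DATE)));
        System.out.println(coerce(year).render()); // prints YEAR(CAST($0 AS EXPR_DATE))
    }
}
```

Because a `Cast` node reports a `VARCHAR` type in this model, already-wrapped expressions are not wrapped a second time.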
In the examples below, `hire_date` is a field with a Calcite standard `DATE` type, not a PPL datetime UDT value.

| PPL query | Before | After |
| --- | --- | --- |
| `source=t \| eval y = YEAR(hire_date)` | `YEAR($0:DATE)` fails because the UDF expects a string | `YEAR(CAST($0 AS EXPR_DATE))` passes an ISO string to the UDF |
| `source=t \| where hire_date > DATE('2020-06-01')` | `int > String` | |
| `source=t \| fields hire_date` | `LogicalProject(hire_date=[$0:DATE])` returns `java.sql.Date`, which violates the PPL string contract | `LogicalProject(hire_date=[CAST($0 AS EXPR_DATE)])` returns an ISO string |

Current Limitation: Subsecond Precision
The current implementation relies on Calcite’s base `RexBuilder` stringification, which formats `TIME` and `TIMESTAMP` values with precision `0`. As a result, fractional seconds are dropped.

| UDT | PPL V3 | This PR |
| --- | --- | --- |
| `EXPR_DATE` | `yyyy-MM-dd` | `yyyy-MM-dd` |
| `EXPR_TIME` | `HH:mm:ss[.n…]` | `HH:mm:ss` |
| `EXPR_TIMESTAMP` | `yyyy-MM-dd HH:mm:ss[.n…]` | `yyyy-MM-dd HH:mm:ss` |

This gap is narrow in practice:
- `EXPR_DATE` is fully aligned with PPL V3.
- `EXPR_TIME` and `EXPR_TIMESTAMP` differ only when fractional seconds are present.
- With the `epoch_millis || strict_date_optional_time` date format, PPL V3 is already effectively capped at millisecond precision on its numeric-epoch path, so this PR loses at most three fractional digits relative to V3.
Optional Follow-Ups
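The precision gap described above, and what a precision-preserving conversion would need to emit, can be illustrated with plain `java.time` formatting. This is only a sketch: it does not use Calcite's `RexBuilder` or `ExprValueUtils`, and the class and method names are hypothetical.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;

// Illustration only: hypothetical helper, not code from this PR.
final class SubsecondPrecisionSketch {
    // Precision-0 rendering: yyyy-MM-dd HH:mm:ss (fractional seconds dropped).
    static final DateTimeFormatter SECONDS_ONLY =
        DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // yyyy-MM-dd HH:mm:ss[.n...]: the fraction is emitted only when it is non-zero.
    static final DateTimeFormatter WITH_FRACTION = new DateTimeFormatterBuilder()
        .appendPattern("yyyy-MM-dd HH:mm:ss")
        .appendFraction(ChronoField.NANO_OF_SECOND, 0, 9, true)
        .toFormatter();

    static String truncated(LocalDateTime ts) {
        return SECONDS_ONLY.format(ts);
    }

    static String preserved(LocalDateTime ts) {
        return WITH_FRACTION.format(ts);
    }

    public static void main(String[] args) {
        LocalDateTime ts = LocalDateTime.of(2020, 6, 1, 12, 34, 56, 789_000_000);
        System.out.println(truncated(ts)); // prints 2020-06-01 12:34:56
        System.out.println(preserved(ts)); // prints 2020-06-01 12:34:56.789
    }
}
```

For a value with no fractional part the two formatters produce the same string, which matches the observation that the outputs differ only when fractional seconds are present.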
- Replace the `CAST(... AS <UDT>)` path with a precision-preserving conversion, either via a suitable Calcite built-in or by extending `ExprValueUtils.fromObjectValue` to accept numeric datetime inputs.
- If the `CAST` nodes interfere with pushdown or transpilation, switch to the implementor-wrapping approach prototyped in #5355 to keep the logical plan free of explicit casts.
Related Issues
Resolves #5250
Check List
- Commits are signed off using `--signoff` or `-s`.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.