[Enhancement](udf) Support volatility property for scalar UDF#62698
[Enhancement](udf) Support volatility property for scalar UDF#62698linrrzqqq wants to merge 3 commits into
Conversation
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
|
run buildall |
|
/review |
|
OpenCode automated review failed and did not complete. Error: Review step was skipped (possibly timeout or cancelled) Please inspect the workflow logs and rerun the review after the underlying issue is resolved. |
FE UT Coverage ReportIncrement line coverage |
FE Regression Coverage ReportIncrement line coverage |
f2edf43 to
3678922
Compare
|
run buildall |
|
/review |
There was a problem hiding this comment.
I found blocking correctness issues.
-
Blanket
isDeterministic() == falsefor all Java/Python UDF/UDAF/UDTF classes is too broad. The modified regression coverage uses deterministic helpers (IntTest,MySumInt,FloatTest,float_test.py), so this change now rejects pure external UDFs from MV/MTMV unless users opt intoenable_nondeterministic_function=trueand stops existing MV rewrite coverage from applying. -
The optimizer fix is incomplete. Several rewrite paths that can duplicate or relocate expressions still key off
containsUniqueFunction()rather than!isDeterministic(), so external UDFs can still be evaluated multiple times or moved incorrectly. A simplefilter(project(udf(...) as a))case is still vulnerable. -
The bundled BE fix correctly handles unaligned
Decimal256Arrow reads, but the adjacentTYPE_DECIMALV2Arrow branch still performs the same unsafereinterpret_castload and keeps the misalignment UB for DECIMALV2.
Critical checkpoints:
- Goal / correctness: The PR fixes the specific CTE nondeterminism example, but not the end-to-end optimizer correctness problem and it broadens semantics for all external UDFs.
- Scope / minimality: Not minimal; it changes default behavior for every Java/Python UDF/UDAF/UDTF and rewrites existing deterministic-UDF MV tests around that change.
- Concurrency / lifecycle: No new concurrency or lifecycle issue found in the touched code.
- Config / compatibility: No new config or compatibility issue found.
- Parallel paths: Not all relevant rewrite paths were updated.
- Tests: Added CTE/MTMV coverage is useful, but the modified float/java tests now encode the broader regression for deterministic UDFs and there is no regression case for the remaining filter/project duplication path.
- Observability / transaction / FE-BE variable passing: Not applicable in this diff.
- Performance: Blanket nondeterminism disables valid optimizations for pure UDFs.
No additional user-provided focus was supplied.
3678922 to
f66187b
Compare
|
run buildall |
|
/review |
FE Regression Coverage ReportIncrement line coverage |
There was a problem hiding this comment.
Additional blockers beyond the existing inline threads:
- Pre-upgrade UDF metadata is not backward-compatible with the new persisted
deterministicfield. Old Java/Python UDFs/UDAFs/UDTFs replay as nondeterministic after upgrade, which silently changesSHOW CREATE FUNCTIONoutput and can strand existing MTMVs inSCHEMA_CHANGEwhenensureMTMVQueryUsable()re-analyzes the stored query. - The new per-UDF
deterministicproperty is still not honored in cache paths. Deterministic external UDFs remain blanket-ineligible for SQL cache and fragment query cache because those paths still special-caseUdf/AliasFunctioninstead of consultingisDeterministic().
Critical checkpoints:
- Goal: Partially met. The PR plumbs a user-facing determinism flag through create/show/Nereids translation and adds regression coverage for CTE/MTMV scenarios, but it does not preserve compatibility for existing persisted UDFs and it does not propagate the flag to cache consumers.
- Scope/focus: The code changes are localized, but the end-to-end contract of the new property is incomplete.
- Concurrency: No new lock-order or thread-safety issue found in the touched code; the blocking problems are replay/analysis-path correctness issues.
- Lifecycle/static init: No special lifecycle or static initialization issue found.
- Config/properties: A new user-visible UDF property (
deterministic) is added and parsed, but not all relevant consumers honor it. - Compatibility/persistence: Blocking issue.
FunctionJSON persistence changed without a backward-compatible upgrade path for existing function metadata. - Parallel code paths: Blocking issue. Cache/query-normalization paths still bypass the new flag.
- Special conditional checks: No additional issue beyond the above.
- Test coverage: Added DDL/CTE/MV coverage is useful, but there is no coverage for upgrade/replay compatibility or SQL/query cache behavior.
- Test result files: The new
.outfiles look consistent with the added regression cases. - Observability: No new observability gap found for this change.
- Transaction/persistence/data writes: Metadata persistence is affected; the replay compatibility issue above is the main blocker.
- FE-BE variable passing: Not applicable here.
- Performance: No material performance regression found in the touched code.
- Other issues: No additional distinct blocker beyond the two comments here and the already-open review threads.
User focus: none provided.
FE Regression Coverage ReportIncrement line coverage |
f66187b to
9d73507
Compare
|
run buildall |
FE UT Coverage ReportIncrement line coverage |
FE Regression Coverage ReportIncrement line coverage |
FE Regression Coverage ReportIncrement line coverage |
|
/review |
There was a problem hiding this comment.
I am requesting changes based on the already-open inline review threads; I did not add duplicate inline comments for the same issues.
Critical checkpoint conclusions:
- Goal/test coverage: The PR aims to add an explicit deterministic property for Java/Python UDF, UDAF, and UDTF handling. The property is plumbed through catalog/Nereids classes and new tests exist, but existing review threads identify unresolved end-to-end gaps in cache/planner behavior and optimizer rewrites.
- Scope/focus: The code changes are mostly localized, but the behavior change is broad because UDF determinism affects MV/MTMV eligibility, SQL cache, fragment query cache, CTE inlining, and optimizer rewrites.
- Concurrency/lifecycle: I did not find new lock ordering, shared mutable state, or static initialization risks in this diff.
- Configuration: No new config item is added; this is a persisted function property.
- Compatibility/persistence: Existing review context already flags that old persisted functions lack the new field and may be replayed with a changed default, which is a blocking compatibility concern.
- Parallel paths: Java/Python scalar, aggregate, and table-function wrappers are touched, but existing review context already flags remaining parallel paths that still do not respect the new determinism bit.
- Conditional checks/error handling: Boolean property parsing follows existing property parsing style; I did not find an additional distinct error-handling issue.
- Tests/results: Regression and FE unit coverage were added, with generated
.outfiles present. Coverage still does not close the existing end-to-end gaps noted in the inline threads. - Observability/transaction/data-write/FE-BE protocol: No new observability need, data write path, or FE-BE protocol field was identified beyond persisted function metadata.
- Performance: I did not find an additional distinct performance regression in the changed code.
User focus: No additional user-provided review focus was specified. I reviewed the full PR with extra attention to determinism propagation and did not find a new non-duplicate issue beyond the existing review threads.
|
I think a boolean
So for Python UDF we probably need to identify which of these cases we want to support. The current deterministic true/false split can distinguish immutable from non-immutable, but it cannot distinguish stable functions from volatile functions, while optimizer rules need different behavior for those two cases. |
9d73507 to
2bbd937
Compare
|
/review |
There was a problem hiding this comment.
Review completed. I am not adding duplicate inline comments because the remaining blocking concern is already covered by the existing thread on CreateFunctionCommand: the new UDF volatility flag is still not effective end-to-end for SQL cache / fragment query cache eligibility. In the current head, ExpressionAnalyzer still marks any Udf/AliasUdfBuilder as cannotProcessExpression, and PhysicalPlanTranslator.hasUndeterministicExpression() still returns true for any Udf.class/AliasFunction.class before relying on isDeterministic(), so a scalar UDF declared immutable/volatility=immutable is still treated as uncacheable in those paths.
Critical checkpoint conclusions:
- Goal/test: The PR aims to let users classify scalar Java/Python UDF volatility and protect optimizer rewrites for volatile calls. The current tests cover construction/equality and some rewrite behavior, but they do not prove the cache-related advertised deterministic behavior end-to-end.
- Scope/focus: The implementation is mostly focused on Nereids UDF volatility plumbing, though several end-to-end paths still need alignment.
- Concurrency/lifecycle: No new concurrent mutable state or non-obvious lifecycle issue found in the reviewed FE changes.
- Configuration/compatibility: No new config item. Persisted Function compatibility now has a null/default guard for old metadata.
- Parallel paths: One important parallel path remains incomplete: SQL cache and fragment query cache still use UDF-class checks that bypass the new volatility semantics.
- Special conditions: The VOLATILE identity logic is localized and guarded; no additional distinct issue found there.
- Test coverage: Missing cache eligibility tests for immutable UDFs and negative tests for volatile UDFs through the cache decision paths.
- Observability/transactions/data writes: Not applicable for this FE optimizer/catalog metadata change beyond normal function persistence.
- FE/BE variable passing: Runtime execution does not appear to require BE volatility propagation; this is FE planning metadata.
User focus: No additional user-provided review focus was specified.
|
run buildall |
TPC-H: Total hot run time: 29761 ms |
There was a problem hiding this comment.
I found one blocking issue: the PR commits local regression-test configuration changes that alter the default FE ports, failure limit, and Python UDF runtime. This will break standard regression-test execution for other developers and CI environments that use the repository defaults.
Critical checkpoint conclusions:
- Goal and tests: the UDF volatility goal is partially covered by FE unit tests and a Python UDF regression test, but the committed default regression configuration prevents normal verification from being reliable.
- Scope/focus: the volatility code is focused, but
regression-test/conf/regression-conf.groovycontains unrelated environment-specific changes and should not be part of the PR. - Concurrency/lifecycle: no new FE locking or lifecycle hazard found in the reviewed volatility paths.
- Configuration/persistence/compatibility: the volatility field is persisted and existing compatibility concerns are already covered in prior threads; the new issue here is the committed test-runner configuration change.
- Parallel paths: Java/Python scalar UDF paths were reviewed; existing cache and rewrite-path concerns are already raised in prior review threads and not duplicated here.
- Tests/results: regression coverage was added, but the default regression config change must be reverted so tests run against standard ports and do not stop after one failure.
- Observability/performance/transactions: no additional issue found in these areas for this PR.
User focus: no additional user-provided review focus was specified.
TPC-DS: Total hot run time: 170243 ms |
FE Regression Coverage ReportIncrement line coverage |
39d521e to
9b804ab
Compare
|
/review |
There was a problem hiding this comment.
I found one additional blocking issue beyond the existing review threads. The PR goal is to add user-controlled UDF volatility/determinism and prevent unsafe rewrites for volatile UDF calls; the implementation adds catalog persistence, Nereids UDF expression identity, optimizer guards, and regression/unit coverage.
Critical checkpoint conclusions:
- Goal and tests: partially covered by new FE unit tests and a Python UDF regression test, but
SHOW CREATE FUNCTIONfor Python UDFs is not replayable after the new serialization branch. Existing threads also still cover broader end-to-end determinism/cache/rewrite concerns. - Scope/focus: the main code changes are focused on FE/Nereids UDF volatility; no BE/Cloud data path changes in the actual GitHub PR diff.
- Concurrency/lifecycle: no new concurrent state or lock lifecycle is introduced. Volatile identity is per-expression and statement-scoped.
- Configuration: no new runtime config items.
- Compatibility/persistence: the new persisted
volfield now has anIMMUTABLEfallback for missing old values, but replayability ofSHOW CREATEfor Python UDFs is incomplete. - Parallel paths: Java/Python scalar paths are handled; aggregate/table function volatility is explicitly rejected. Existing comments already cover remaining cache/rewrite paths that still treat UDFs specially.
- Tests: new coverage exercises volatility behavior, but it does not validate
SHOW CREATE FUNCTIONoutput can be executed for Python UDFs. - Observability/transactions/data writes: not applicable beyond metadata persistence; no additional logging/metrics needed for this FE-only feature.
- User focus: no additional user-provided review focus was supplied.
Please fix the inline issue and address the already-open review threads before merge.
9b804ab to
2987c8e
Compare
|
/review |
There was a problem hiding this comment.
I found additional issues beyond the existing review threads. The PR goal is clear (persist and honor scalar UDF volatility), and the new scalar create/show/replay coverage addresses part of that goal, but the implementation still has correctness gaps in volatile identity equality and in paths that also handle UDTFs or slot-dependent volatile UDFs.
Critical checkpoint conclusions: goal is only partially met; scope is mostly focused but touches shared rewrite utilities where old assumptions no longer hold; no new FE locking/concurrency issue found; no special lifecycle/static initialization issue found; persistence compatibility is improved by defaulting missing volatility to IMMUTABLE, but UDTF replay/properties are inconsistent; parallel rewrite paths were broadly updated from unique to volatile checks, but the alias-hoisting rule still assumes no input slots; tests cover scalar Python volatility and replay, but miss UDTF show/replay and join/slot-dependent volatile alias cases; no additional observability requirement identified; no transaction/data-write path impact identified; no FE-BE protocol change identified. User focus: no additional user-provided focus points were present.
|
/review |
There was a problem hiding this comment.
I found current-head issues that need to be fixed before merge.
Checkpoint conclusions:
- Goal/test proof: The volatility plumbing is partially covered, but one newly added unit test is inconsistent with the current implementation and will fail.
- Scope/focus: The main changes are focused on UDF volatility, but there is stale code left from the previous aliasing attempt.
- Concurrency/lifecycle/config: No new concurrency, lifecycle, or config issues found in the reviewed volatility path.
- Compatibility/persistence: Existing review threads already cover serialization/replay compatibility concerns; I did not add duplicate comments.
- Parallel code paths: Existing review threads already cover UDTF SHOW/SHOW CREATE handling; no additional distinct instance found.
- Test coverage/results: The new UdfVolatilityTest contains a failing assertion against current AddProjectForUniqueFunction behavior.
- Observability/transactions/data writes: Not applicable for this PR area.
- User focus: No additional user-provided focus was specified.
f0a416b to
38714f5
Compare
|
/review |
There was a problem hiding this comment.
Automated review completed. I am requesting changes because blocking issues from the existing inline review context remain on the current head, and I did not add duplicate inline comments. In particular, the new UdfVolatilityTest still cannot pass: testAddProjectForRepeatedVolatileUdf asserts aliases.size() is 0 and then immediately dereferences aliases.get(0), which is the same test-failure area already raised in the existing thread.
Critical checkpoint conclusions:
- Goal: the PR aims to add scalar UDF volatility semantics and protect optimizer rewrites. The implementation is not yet proven correct because existing blocking optimizer/test issues remain.
- Scope: the change is mostly focused on FE UDF volatility and Nereids rewrite guards, but some previously raised paths still need cleanup.
- Concurrency: no new direct concurrent data structure or locking path was identified in the reviewed FE changes.
- Lifecycle/persistence: function metadata persistence is involved; existing review context already covers compatibility/SHOW CREATE replay concerns, and I avoided duplicating those comments.
- Configuration: no new dynamic configuration item is added.
- Compatibility: user-visible CREATE/SHOW FUNCTION syntax and persisted function metadata are changed; compatibility/replay must be resolved before merge.
- Parallel code paths: the PR updates many UniqueFunction guard paths to volatile-expression guards, but existing review context already identifies remaining optimizer-path risks.
- Tests: regression and unit tests were added, but the current unit test source still contains a failing assertion/dereference sequence. I attempted the targeted FE UT, but the runner lacks thirdparty/installed/bin/protoc, so the test command could not get past generated-source setup.
- Observability: no additional observability requirement was identified for this FE optimizer/metadata change.
- Transactions/data writes: no direct transaction write-path change was identified.
- FE/BE variable passing: no new FE-BE execution variable was introduced.
- Performance: no additional distinct performance issue was found beyond the optimizer rewrite correctness concerns already raised.
User focus: no additional user-provided review focus was specified.
38714f5 to
4771ff8
Compare
|
/review |
There was a problem hiding this comment.
I found one remaining correctness issue in the current PR. The volatility guards now avoid emitting an unreplayable VOLATILITY property for UDTFs, but SHOW CREATE still serializes UDTFs as scalar CREATE FUNCTION statements, so existing Java/Python table functions are still not faithfully replayable.
Critical checkpoint conclusions:
- Goal/test: the PR adds user-declared volatility for scalar Java/Python UDFs and broader Nereids rewrite guards; scalar volatility has coverage, but UDTF SHOW CREATE replay remains incomplete.
- Scope: the implementation is mostly focused, but the UDTF serialization fix is partial.
- Concurrency/lifecycle: no new concurrent state or special lifecycle risk identified in the reviewed paths.
- Configuration/compatibility: no new config; persisted Function volatility defaults to IMMUTABLE for old metadata, but new scalar defaults are VOLATILE through create analysis.
- Parallel paths: scalar SHOW CREATE, SHOW FUNCTIONS, and tests were updated; UDTF statement-kind serialization was missed.
- Tests: added tests cover scalar volatility and absence of VOLATILITY for UDTFs, but not UDTF SHOW CREATE replay.
- Observability/performance/transactions: no additional concerns found.
User focus: no additional user-provided review focus was supplied.
4771ff8 to
acf1ca7
Compare
|
/review |
There was a problem hiding this comment.
I found one additional replay-fidelity issue in the latest patch. Several earlier blocking concerns are already tracked in existing inline threads and are not duplicated here, including UDF/UDTF SHOW CREATE fidelity, volatility rewrite safety, and regression configuration/test failures.
Critical checkpoint conclusions: goal is to add UDF volatility metadata and protect optimizer rewrites; current code is closer but not fully complete because SHOW CREATE replay can still lose persisted lifecycle settings. Scope is mostly focused, but replay serialization needs to cover all persisted create-time properties it claims to preserve. Concurrency/locking changes are not involved. Lifecycle/persistence is involved for Function metadata; volatility is persisted, but replay SQL is still not equivalent for Python UDF/UDAF expiration settings. No new config items are added. Compatibility is partly addressed by defaulting missing volatility to IMMUTABLE, but existing threads still cover remaining behavior concerns. Parallel optimizer paths were updated broadly from unique-function checks to volatile-expression checks; no new distinct issue beyond already-threaded concerns found. Test coverage adds unit and regression coverage, but it misses replay of non-default expiration_time. Observability is not materially affected. Data-write/transaction paths are not involved. User focus: no additional user-provided review focus was specified.
|
run buildall |
TPC-H: Total hot run time: 29197 ms |
TPC-DS: Total hot run time: 170621 ms |
FE Regression Coverage ReportIncrement line coverage |
FE Regression Coverage ReportIncrement line coverage |
What problem does this PR solve?
Issue Number: close #xxx
Related PR: #xxx
Problem Summary:
Previously, UDFs could be treated as deterministic in optimizer-related paths, which is unsafe for UDFs whose results are not stable across evaluations. That may cause invalid rewrite/planning decisions and lead to incorrect query semantics in some cases.
Introduce
immutable,stable, andvolatilesemantics through"volatility" = "immutable|stable|volatile", persist the property in function metadata, and use it to drive deterministic and volatile-expression behavior in Nereids.Immutable UDFs are treated as deterministic, stable UDFs avoid volatile identity handling while remaining non-deterministic, and volatile UDFs receive per-call volatile identities to protect unsafe rewrites.
before:
now
Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)