feat: eliminate legacy REPLACE path, unify to operator-based execution#24044
Conversation
matrixorigin#23946) Remove the legacy REPLACE INTO fallback that used string-concatenated DELETE+INSERT SQL and unify all REPLACE execution through the modern operator-based path (DEDUP JOIN + MULTI_UPDATE). Key changes: - Extend modern path to support fake PK tables (unique key only) - Extend modern path to support FK tables via IgnoreForeignKey context - Self-referencing FK checks enforced via DetectSqls post-execution - Delete legacy build_replace.go, scope.replace(), Query_REPLACE, ReplaceCtx - Clean up compile, frontend, deepcopy, and proto artifacts - Add BVT tests for NULL UK, fake PK, ODKU, PK+UK conflict scenarios - Add 15 unit tests covering all REPLACE plan paths Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…fety Add pre-check mechanism to detect when REPLACE would delete a parent row that is still referenced by other rows via self-referencing FK RESTRICT. - build_util.go: add genPreCheckSqlsForReplaceFKSelfRefer() to generate pre-check SQL from AST VALUES - build.go: call new function, prefix pre-check SQLs with REPLACE_PARENT_CHK: - compile.go: run pre-check before execution (ErrFKRowIsReferenced), filter them from post-check - fk_self_refer5.result: restore 4 missing parent-row FK error lines
There was a problem hiding this comment.
Pull request overview
This PR removes the legacy REPLACE INTO execution path and routes all REPLACE/ODKU-rewritten scenarios through the modern operator-based pipeline (DEDUP JOIN + MULTI_UPDATE), adding additional handling for fake-PK tables, vector-index tables, and FK/self-referencing FK cases.
Changes:
- Eliminates legacy REPLACE planning/execution/proto fields and unifies REPLACE to
Query_INSERT+ operator-based execution. - Extends modern REPLACE to support fake-PK (no real PK) tables, FK tables (via context bypass), and vector-index tables (by filtering irregular indexes).
- Adds FK self-refer checks and new distributed/BVT tests for NULL-on-UK, fake-PK+UK, ODKU, and FK RESTRICT scenarios.
Reviewed changes
Copilot reviewed 18 out of 19 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| test/distributed/cases/replace_statement/replace_statement.result | Updates expected output formatting for REPLACE statement cases. |
| test/distributed/cases/foreign_key/fk_self_refer5.result | Updates expected FK error outputs (drops “internal error:” prefix). |
| test/distributed/cases/dml/replace/replace.test | Adds new distributed tests for NULL unique keys, fake-PK+UK, ODKU rewrite, and PK+UK conflict. |
| test/distributed/cases/dml/replace/replace.result | Adds expected results for the new distributed REPLACE test cases. |
| proto/plan.proto | Removes legacy ReplaceCtx/REPLACE enums/fields and reserves their tags. |
| pkg/sql/plan/mock.go | Adds mock schemas for fake-PK and self-referencing FK tables; treats fake PK as hidden/autoincr in mock defs. |
| pkg/sql/plan/deepcopy.go | Removes DeepCopyReplaceCtx and Node.ReplaceCtx copying. |
| pkg/sql/plan/build.go | Removes legacy fallback and attaches DetectSqls/pre-check SQLs for REPLACE FK self-refer handling. |
| pkg/sql/plan/build_util.go | Adds REPLACE parent→child FK RESTRICT pre-check SQL generation helper. |
| pkg/sql/plan/build_test.go | Adds unit tests validating REPLACE planning structure and new schema coverage. |
| pkg/sql/plan/build_replace.go | Deletes the legacy REPLACE plan builder implementation. |
| pkg/sql/plan/build_insert.go | Routes ODKU→REPLACE rewrite through bindAndOptimizeReplaceQuery (modern path). |
| pkg/sql/plan/bind_replace.go | Enables FK tables in modern REPLACE, supports fake-PK/no-UK fallback behavior, and skips irregular indexes instead of rejecting. |
| pkg/sql/compile/scope.go | Removes legacy Scope.replace() execution (DELETE + INSERT string manipulation). |
| pkg/sql/compile/scope_test.go | Removes tests for legacy SQL-string manipulation helper. |
| pkg/sql/compile/compile2.go | Ensures query-plan analyze runs for former REPLACE statements (now INSERT stmt type). |
| pkg/sql/compile/compile.go | Adds REPLACE FK pre-check execution and filters pre-check SQLs out of post-check FK detection. |
| pkg/frontend/mysql_cmd_executor.go | Removes ReplaceCtx handling in modify-check logic. |
| pkg/frontend/mysql_cmd_executor_test.go | Removes the ReplaceCtx test case from checkModify tests. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
…check
- compile.go: only translate ErrFKNoReferencedRow2 from runDetectSql
into ErrFKRowIsReferenced; pass through real execution errors
(syntax, permissions, network, txn conflicts) instead of masking
them as FK violations
- build_util.go: fix genPreCheckSqlsForReplaceFKSelfRefer
- skip non-RESTRICT/NO_ACTION OnDelete actions so CASCADE / SET NULL
self-ref FK REPLACE is not incorrectly blocked
- lowercase user-supplied explicit column identifiers to match
ColDef.Name normalization (case-insensitive lookup)
- skip all hidden columns in the implicit column-list mapping
(composite PK helper, fake PK, cluster-by composite, Row_ID)
instead of only Row_ID
- skip pre-check generation when any row supplies ParamExpr at the
referred position so prepared statements do not emit '?' into the
background SQL (TODO: move generation to compile time once EXECUTE
args are bound)
- build_test.go: replace no-op TestReplaceDetectSqls with real
assertions on REPLACE_PARENT_CHK content; assert that CASCADE
self-ref FK REPLACE generates no parent-child pre-check; add
TestReplaceDetectSqlsExplicitColumnsCaseInsensitive
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 18 out of 19 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- proto/plan.proto: turn the ReplaceCtx removal comment into an actual
`message ReplaceCtx { reserved 1 to 9; }` declaration so the legacy
field tags stay reserved across future re-introductions (wire-compat).
- pkg/sql/plan/explain: drop dead Go references to plan.Node_REPLACE /
Label_Replace from marshal_query.go, marshal_model.go, explain_query.go
so a future regen of pb/plan that drops the reserved enum value will
not break the build. Update TestGetLabelOrTitle to skip the reserved
REPLACE NodeType.
- pkg/sql/plan/build_util.go: tighten genPreCheckSqlsForReplaceFKSelfRefer
so the embedded PK values are only generated for static literals
(NumVal/StrVal, including NULL). Non-literal expressions like
rand()/uuid()/now()/subqueries/ParamExpr are skipped instead of being
re-evaluated at pre-check time and producing values that may not
match what REPLACE actually wrote.
- pkg/sql/plan/build_test.go: add UT coverage for the non-literal skip
path and the multi-row pre-check IN list — bumps coverage of
genPreCheckSqlsForReplaceFKSelfRefer from 79.2% to 86.5%.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
XuPeng-SH
left a comment
There was a problem hiding this comment.
Review Summary
Well-designed refactoring that eliminates the fragile legacy REPLACE path (string-based DELETE + INSERT) and unifies to the modern operator-based pipeline (DEDUP JOIN + MULTI_UPDATE). Comprehensive test coverage with 13 new tests + BVT tests. Proto changes are done correctly with reserved fields.
✅ Strengths
- Proto backwards compatibility:
reserved 1 to 9in ReplaceCtx,reserved 32in NodeType,reserved 3in StmtType,reserved 31in Node — prevents silent wire-format corruption during rolling upgrades - FK handling is thorough: Pre-checks (REPLACE_PARENT_CHK) run before REPLACE for RESTRICT/NO_ACTION, post-checks validate no orphaned children. CASCADE/SET_NULL/SET_DEFAULT correctly skip pre-checks
- IgnoreForeignKey context scoping: Context is properly restored after ResolveTables —
builder.compCtx.SetContext(origCtx)prevents the bypass from leaking - Non-literal expression safety:
isSimpleLiteralExpr()correctly skips pre-check generation forrand(),?, subqueries etc. that would produce different values when re-evaluated in background SQL - Fake PK handling: No PK + no UK → INSERT fallback avoids cross-join. No PK + UK → uses first unique key for LEFT JOIN. Both are correct
- Error message improvement:
fk_self_refer5.resultshows cleaner error messages —"Cannot delete or update..."instead of"internal error: Cannot delete or update..."
🟡 Minor Issues
-
"REPLACE_PARENT_CHK:"prefix is a magic string — used inbuild.go(generation) andcompile.go(consumption). Consider extracting to a constant to prevent typos:const replaceParentCheckPrefix = "REPLACE_PARENT_CHK:"
-
Pre-check SQL only handles single-column FKs (
len(referCols) != 1 || len(fkCols) != 1 → continue). Composite FK self-references silently skip the parent safety check. This is documented as a limitation but worth a TODO comment. -
marshal_query_test.goskip is fragile — The test loops overNode_NodeType_valueand explicitly skipsNode_REPLACEby matching the constant. If proto is regenerated andNode_REPLACEis removed from the generated code, this skip becomes dead code. Consider regeneratingpipeline.pb.goas part of this PR to keep proto and generated code in sync. -
Variable redeclaration in
compile.go:query := c.pn.GetQuery() // line ~503 (pre-check) // ... later ... query = c.pn.GetQuery() // line ~596 (post-check, changed from := to =)
The pre-check block introduces
queryin the outerrunOncescope, which shadows the existingquery :=at line ~596. The diff shows this was handled (:=changed to=), which is correct. But the double-fetch ofGetQuery()is redundant — the value cannot change between the two call sites.
⚠️ Things to Verify (not blocking)
-
REPLACE on tables with vector indexes: The PR removes the
hasIrregularIndex → ErrUnsupportedDMLguard. The comment says "async cron maintained" indexes are safe to skip. Verify this with BVT tests on tables with vector indexes. -
Concurrent REPLACE on self-ref FK tables: The pre-check SQL runs before the REPLACE executes. Between the pre-check and execution, another transaction could insert a child row referencing the PK being replaced, causing the check to pass but the REPLACE to create an orphan. This is a TOCTOU gap inherent to the approach — the only real fix would be row-level locks, which is a larger change.
-
ODKU rewrite path:
build_insert.gonow callsbindAndOptimizeReplaceQueryinstead ofbuildReplace. Verify that the ODKUVALUES()function semantics are preserved in the new path (the BVT test covers a basic case but not complex expressions).
Overall: Approve — clean, well-tested refactoring with proper proto compatibility handling. The minor issues are non-blocking.
Merge Queue Status
This pull request spent 57 minutes 40 seconds in the queue, including 56 minutes 36 seconds running CI. Required conditions to merge
|
What type of PR is this?
Which issue(s) this PR fixes:
issue #23946
What this PR does / why we need it:
Eliminates the legacy REPLACE INTO execution path and unifies all REPLACE INTO scenarios to the modern operator-based path (DEDUP JOIN + MULTI_UPDATE).
Changes:
IgnoreForeignKeycontext bypass inResolveTables, with self-referencing FK constraint checks enforced viaDetectSqlsbuild_replace.go,scope.replace(),DeepCopyReplaceCtx, and all related branchesReplaceCtx,Node.REPLACE=32,Query.REPLACE=3,Node.replace_ctx=31as reservedNode_REPLACEreferences from explain codeLegacy path problems fixed:
col = NULLsemantic error inbuildConjunction()(SQL standard:NULL = NULLis UNKNOWN)scope.replace()