
fix: false duplicate entry error on concurrent INSERT with NULL uniqu…#24046

Closed
jiangxinmeng1 wants to merge 8 commits into matrixorigin:main from jiangxinmeng1:fix/concurrent-insert-null-unique-key

Conversation

@jiangxinmeng1
Contributor

What type of PR is this?

  • API-change
  • BUG
  • Improvement
  • Documentation
  • Feature
  • Test and CI
  • Code Refactoring

Which issue(s) this PR fixes:

issue #24030

What this PR does / why we need it:

Concurrent multi-row INSERTs into tables with a UNIQUE KEY and AUTO_INCREMENT column falsely report "Duplicate entry '' for key" when roughly 50% of the unique key values are NULL; about 30-40% of concurrent INSERTs fail.

Root causes and fixes:

  1. fuzzyCheck treats NULL as empty string (fuzzyCheck.go)

    • firstlyCheck() and genCollsionKeys() did not skip NULL rows
    • Multiple NULLs in unique key column were detected as duplicate empty strings
    • Fix: skip NULL rows in both functions
  2. PKPersistedBetween.InplaceSort corrupts bitmap alignment (txn_table.go)

    • InplaceSort reorders Varlena data entries but does NOT reorder the null bitmap
    • After sorting, NULL entries (zero Varlena) move to array start while bitmap still marks original positions, causing bitmap/data misalignment
    • Fix: Clone vector before InplaceSort in PKPersistedBetween
  3. NULL Varlena slots contain garbage from pSpool buffer reuse (txn_table.go)

    • GetUnionAllFunction skips writing to NULL Varlena slots during vector copy
    • Reused cache buffers may contain stale data from different column types
    • EncodePrimaryKeyVector iterates all rows without checking nulls; reading a garbage Varlena with svlen > 23 but a nil area causes a slice-bounds panic
    • Fix: cloneNonNullKeys() filters out NULL rows before passing to primaryKeysMayBeChanged

Additional defensive fixes:

  • bitmap.go: add RecalculateCount() to fix count/data inconsistency
  • nulls.go: Contains() removes EmptyByFlag shortcut; Range() snapshots via Clone()
  • vector.go: TryExpand in Copy/UnionBatch/AppendFixedList ensures bitmap length
  • value_scan.go: prepared flag prevents duplicate evalRowsetData; reset in Reset()
  • scope.go: TP queries use sequential MergeRun execution
  • stats.go: disable right join optimization for DEDUP joins
  • txn.go: checkPKDup skips NULL for varchar/array types
  • preinsert.go: type handling adjustment in constructColBuf

Fixes #24030

@qodo-code-review

Review Summary by Qodo

Fix false duplicate entry error on concurrent INSERT with NULL unique keys

🐞 Bug fix


Walkthroughs

Description
• Fix false duplicate entry errors on concurrent INSERT with NULL unique keys
• Prevent NULL bitmap corruption from concurrent vector modifications
• Skip NULL values in duplicate detection and primary key checks
• Ensure bitmap length consistency across vector operations
• Disable right join optimization for DEDUP joins to preserve NULL handling
Diagram
flowchart LR
  A["Concurrent INSERT<br/>with NULL unique keys"] -->|fuzzyCheck| B["Skip NULL rows<br/>in duplicate detection"]
  A -->|Vector operations| C["Ensure bitmap<br/>length consistency"]
  A -->|PKPersistedBetween| D["Clone vector<br/>before sorting"]
  A -->|Value scan| E["Prevent duplicate<br/>evalRowsetData calls"]
  A -->|DEDUP joins| F["Disable right join<br/>optimization"]
  B --> G["Fix: No false<br/>duplicates"]
  C --> G
  D --> G
  E --> G
  F --> G

File Changes

1. pkg/common/bitmap/bitmap.go 🐞 Bug fix +9/-0
   Add RecalculateCount method to fix bitmap corruption

2. pkg/container/nulls/nulls.go 🐞 Bug fix +7/-2
   Fix NULL bitmap handling in concurrent access scenarios

3. pkg/container/vector/vector.go 🐞 Bug fix +30/-22
   Ensure bitmap length consistency in vector operations

4. pkg/sql/colexec/lockop/lock_op.go Formatting +2/-1
   Minor formatting and whitespace adjustments

5. pkg/sql/colexec/preinsert/preinsert.go 🐞 Bug fix +6/-3
   Fix vector duplication and type handling in buffer construction

6. pkg/sql/colexec/projection/projection.go 📝 Documentation +0/-3
   Remove outdated comment about vector modification

7. pkg/sql/colexec/value_scan/types.go 🐞 Bug fix +5/-3
   Add prepared flag to prevent duplicate evalRowsetData

8. pkg/sql/colexec/value_scan/value_scan.go 🐞 Bug fix +15/-0
   Skip duplicate preparation and recalculate bitmap count

9. pkg/sql/compile/fuzzyCheck.go 🐞 Bug fix +18/-4
   Skip NULL values in duplicate key detection

10. pkg/sql/compile/scope.go 🐞 Bug fix +23/-6
    Run non-connector PreScopes sequentially for TP queries

11. pkg/sql/plan/stats.go 🐞 Bug fix +5/-7
    Disable right join optimization for DEDUP joins

12. pkg/vm/engine/disttae/txn.go 🐞 Bug fix +9/-0
    Skip NULL values in primary key duplicate checks

13. pkg/vm/engine/disttae/txn_table.go 🐞 Bug fix +51/-9
    Clone vector before sorting and filter NULL keys


qodo-code-review Bot commented Apr 3, 2026

Code Review by Qodo

🐞 Bugs (4) 📘 Rule violations (0) 📎 Requirement gaps (0) 🎨 UX Issues (0)



Action required

1. Bitmap Contains race panic 🐞 Bug ☼ Reliability
Description
Nulls.Contains now always calls bitmap.Bitmap.Contains, and some union paths rely on IsEmpty()
instead of EmptyByFlag(), increasing read access to bitmaps during concurrent expansion. Since
Bitmap.TryExpandWithSize sets len before resizing data, a concurrent Contains() can index
past data and panic.
Code

pkg/container/nulls/nulls.go[138]

+	return nsp != nil && nsp.np.Contains(row)
Evidence
Nulls.Contains unconditionally delegates to Bitmap.Contains (no longer short-circuiting on
EmptyByFlag). Bitmap.TryExpandWithSize updates n.len before growing n.data, and
Bitmap.Contains indexes n.data[idx] without verifying idx < len(n.data), so concurrent reads
during expansion can panic with out-of-range access.

pkg/container/nulls/nulls.go[136-139]
pkg/container/vector/vector.go[1340-1364]
pkg/common/bitmap/bitmap.go[231-238]
pkg/common/bitmap/bitmap.go[348-363]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`Bitmap.TryExpandWithSize` can temporarily expose `len` that is larger than `data` length. With the PR’s increased reliance on `Contains()` in concurrent contexts, this can crash with an index-out-of-range panic.
### Issue Context
This PR is explicitly addressing concurrent bitmap corruption; hardening the core bitmap read path avoids turning “wrong null detection” into process crashes.
### Fix Focus Areas
- Add a defensive bound check in `Bitmap.Contains` (and any other direct `n.data[idx]` reads) to return false if `idx >= uint64(len(n.data))`.
- Optionally reorder `TryExpandWithSize` to grow `data` before publishing `len`.
- pkg/common/bitmap/bitmap.go[231-238]
- pkg/common/bitmap/bitmap.go[348-363]
- pkg/container/nulls/nulls.go[136-139]
- pkg/container/vector/vector.go[1340-1364]



2. Range skips corrupted nulls 🐞 Bug ≡ Correctness
Description
nulls.Range still returns early based on EmptyByFlag() (which is count==0), so if count is
corrupted to 0 while data has set bits, Range will drop NULLs entirely. This can make
Vector.Window/CloneWindowTo silently lose NULL markers and produce incorrect results.
Code

pkg/container/nulls/nulls.go[R226-238]

}
b.np.InitWithSize(int64(end + 1 - bias))
+
+	// Take a snapshot of the source bitmap to prevent reading inconsistent
+	// state when the source is being concurrently modified by a parallel
+	// Prepare call on the same operator chain.
+	snap := nsp.Clone()
for ; start < end; start++ {
-		if nsp.np.Contains(start) {
+		if snap.np.Contains(start) {
b.np.Add(start - bias)
}
}
Evidence
bitmap.EmptyByFlag() returns true when count==0 even if underlying bits are set, and
nulls.Range exits before taking its snapshot when EmptyByFlag() is true. Vector.Window depends
on nulls.Range to copy the null bitmap into the windowed vector, so early-return yields an
incorrect window null set.

pkg/container/nulls/nulls.go[223-239]
pkg/common/bitmap/bitmap.go[193-197]
pkg/container/vector/vector.go[3754-3784]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`nulls.Range` still gates on `EmptyByFlag()` which depends on `count`. In the exact corruption scenario this PR is targeting (stale `count`), `Range` can incorrectly return early and drop NULLs in windowed/sliced vectors.
### Issue Context
The PR adds a snapshot clone inside `Range`, but the early-return happens before the snapshot and can mask real nulls.
### Fix Focus Areas
- Avoid using `EmptyByFlag()` as the early-exit condition.
- Consider: clone first, then call `snap.np.RecalculateCount()` and check `snap.np.IsEmpty()` before iterating.
- pkg/container/nulls/nulls.go[223-239]
- pkg/common/bitmap/bitmap.go[171-178]
- pkg/common/bitmap/bitmap.go[193-197]




Remediation recommended

3. Prepared flag set early 🐞 Bug ≡ Correctness
Description
ValueScan.makeValueScanBatch sets runningCtx.prepared=true before evalRowsetData runs, so if
evaluation returns an error the flag remains true and later Prepare() calls will incorrectly skip
evaluation. This can lead to returning partially-filled or stale vectors on retries.
Code

pkg/sql/colexec/value_scan/value_scan.go[R70-79]

+	// Skip evalRowsetData if already done (prevents concurrent bitmap corruption
+	// when the same scope is started multiple times by nested MergeRun)
+	if valueScan.runningCtx.prepared {
+		return nil
+	}
+	valueScan.runningCtx.prepared = true
+
for i := 0; i < valueScan.ColCount; i++ {
exprList = valueScan.ExprExecLists[i]
if len(exprList) == 0 {
Evidence
The code sets prepared=true before any work, and returns errors directly from evalRowsetData
without clearing the flag; the only reset is in ValueScan.Reset, which may not run before a
retry/second prepare attempt in some failure paths.

pkg/sql/colexec/value_scan/value_scan.go[52-87]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`prepared` becomes sticky even when `makeValueScanBatch` fails, causing subsequent Prepare calls to skip evaluation.
### Issue Context
The intent is to avoid duplicate evaluation when the same operator chain is started multiple times, but correctness on error/retry paths needs to be preserved.
### Fix Focus Areas
- Move `prepared=true` to after the eval loop completes successfully.
- Or ensure `prepared=false` is set on all error return paths.
- pkg/sql/colexec/value_scan/value_scan.go[70-86]



4. PreScopes overwritten in MergeRun🐞 Bug ☼ Reliability
Description
For TP queries, Scope.MergeRun overwrites s.PreScopes with only connector scopes and never
restores it. If MergeRun is invoked again on the same Scope instance (retry/reuse),
non-connector pre-scopes will no longer run, and connector order is reversed due to backward
iteration plus append.
Code

pkg/sql/compile/scope.go[R284-301]

+	if c.IsTpQuery() {
+		var connectorScopes []*Scope
for i := len(s.PreScopes) - 1; i >= 0; i-- {
-			err := s.PreScopes[i].MergeRun(c)
-			if err != nil {
-				return err
+			ps := s.PreScopes[i]
+			if ps.RootOp != nil && ps.RootOp.OpType() == vm.Connector {
+				connectorScopes = append(connectorScopes, ps)
+			} else {
+				if err := ps.MergeRun(c); err != nil {
+					return err
+				}
}
}
-		return s.ParallelRun(c)
+		if len(connectorScopes) == 0 {
+			return s.ParallelRun(c)
+		}
+		// Replace PreScopes with only connector scopes for concurrent execution
+		s.PreScopes = connectorScopes
}
Evidence
The TP path assigns s.PreScopes = connectorScopes as a permanent mutation of the scope object. The
connectorScopes slice is built by iterating from the end and appending, which reverses the relative
order of connector scopes compared to the original PreScopes slice.

pkg/sql/compile/scope.go[268-301]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`MergeRun` mutates `s.PreScopes` in-place for TP queries, making the scope stateful and potentially incorrect if the same scope is run again.
### Issue Context
The goal is to run non-connector PreScopes sequentially, then run connector PreScopes concurrently. This can be done without overwriting the original slice.
### Fix Focus Areas
- Save `orig := s.PreScopes` and `defer func(){ s.PreScopes = orig }()`.
- Build `connectorScopes` in the same order as `orig` (iterate forward or reverse+prepend).
- Prefer using a local variable for “scopes to launch concurrently” instead of mutating `s.PreScopes`.
- pkg/sql/compile/scope.go[284-301]



5. Range clone hot-path overhead 🐞 Bug ➹ Performance
Description
nulls.Range now clones the full bitmap on every call, and it is used by hot-path APIs like
Vector.Window and Vector.CloneWindowTo. This adds an O(words) allocation/copy to every
window/slice operation and can significantly regress throughput and memory churn.
Code

pkg/container/nulls/nulls.go[R229-238]

+
+	// Take a snapshot of the source bitmap to prevent reading inconsistent
+	// state when the source is being concurrently modified by a parallel
+	// Prepare call on the same operator chain.
+	snap := nsp.Clone()
for ; start < end; start++ {
-		if nsp.np.Contains(start) {
+		if snap.np.Contains(start) {
b.np.Add(start - bias)
}
}
Evidence
nulls.Range always executes snap := nsp.Clone() regardless of whether the caller is in a
concurrent context. Vector.Window and Vector.CloneWindowTo call nulls.Range for every
slice/window, so the clone occurs frequently under normal execution.

pkg/container/nulls/nulls.go[223-239]
pkg/container/vector/vector.go[3754-3784]
pkg/container/vector/vector.go[3819-3846]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`nulls.Range` unconditionally clones the bitmap, adding allocation/copy overhead to every vector window/slice.
### Issue Context
The snapshot is meant to address a specific concurrency hazard; applying it globally penalizes common, non-concurrent paths.
### Fix Focus Areas
- Introduce a second API (e.g. `RangeSnapshot` or `RangeSafe`) for the snapshotting behavior.
- Keep `Range` as the fast, non-cloning implementation, and switch only the known-concurrent call sites to the safe variant.
- pkg/container/nulls/nulls.go[223-239]
- pkg/container/vector/vector.go[3754-3784]
- pkg/container/vector/vector.go[3819-3846]




@matrix-meow added the size/M Denotes a PR that changes [100,499] lines label Apr 3, 2026
@mergify Bot added the kind/bug Something isn't working label Apr 3, 2026
// Contains returns true if the integer is contained in the Nulls
func (nsp *Nulls) Contains(row uint64) bool {
-	return nsp != nil && !nsp.np.EmptyByFlag() && nsp.np.Contains(row)
+	return nsp != nil && nsp.np.Contains(row)
Action required

1. Bitmap contains race panic 🐞 Bug ☼ Reliability


Comment on lines 226 to 238
	}

	b.np.InitWithSize(int64(end + 1 - bias))

+	// Take a snapshot of the source bitmap to prevent reading inconsistent
+	// state when the source is being concurrently modified by a parallel
+	// Prepare call on the same operator chain.
+	snap := nsp.Clone()
	for ; start < end; start++ {
-		if nsp.np.Contains(start) {
+		if snap.np.Contains(start) {
			b.np.Add(start - bias)
		}
	}

Action required

2. Range skips corrupted nulls 🐞 Bug ≡ Correctness


scope.go explicitly sets PreScopes elements to nil to prevent
double-free during MergeRun errors. However, many functions that
recursively traverse PreScopes did not check for nil, causing
panics during bootstrap (e.g. genReceiverMap in debugTools.go).

Add nil guards at function entry for all affected functions:
- debugTools.go: genReceiverMap, showSingleScope
- analyze_module.go: ConvertScopeToPhyScope, UpdatePreparePhyScope,
  explainSingleScope
- compile.go: UpdateScopeTxnOffset
- remoterun.go: generatePipeline, fillInstructionsForPipeline
- remoterunClient.go: checkPipelineStandaloneExecutableAtRemote
  (both BFS loops, plus Proc nil check)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
jiangxinmeng1 and others added 2 commits April 3, 2026 17:04
In MergeRun's TP path, non-connector scopes are run sequentially.
After execution, perform the full cleanup sequence matching the
normal Compile.FreeOperator → Compile.clear order:

1. FreeOperator(c) - calls op.Free() to release expression executors
   (e.g. FixedVectorExpressionExecutor)
2. release() - calls op.Release() and reuse.Free[Scope] to return
   operator and scope objects to the reuse pool
3. Set PreScopes[i] = nil - prevents double-free when later cleanup
   walks the tree (nil checks exist throughout)

The previous approach only called release() without FreeOperator(),
causing 'missing free' panics for expression executors. Removing
release() entirely caused 'missing free' panics for Scope objects.

Also use concurrentPreScopes counter instead of len(s.PreScopes) to
avoid deadlock when receiving from the preScopeResultReceiveChan.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

@XuPeng-SH left a comment


Review Summary

The NULL-skipping fixes (fuzzyCheck, checkPKDup, cloneNonNullKeys, PKPersistedBetween clone-before-sort) are correct and address real bugs. However, several other changes are treating symptoms of a deeper concurrency issue rather than fixing the root cause, and introduce performance regressions.

See inline comments for details on each concern.

@@ -135,7 +135,7 @@ func TryExpand(nsp *Nulls, size int) {

// Contains returns true if the integer is contained in the Nulls
func (nsp *Nulls) Contains(row uint64) bool {

🔴 Removing EmptyByFlag() short-circuit masks the root cause

EmptyByFlag() checks n.count == 0 — a fast O(1) short-circuit on one of the hottest paths in the engine. Removing it means you've discovered that count and data are inconsistent (count=0 but bits are set). The correct fix is to find what's corrupting the count/data invariant, not to stop trusting count.

Also note the contradiction with other changes in this PR: the EmptyByFlag() → IsEmpty() replacements in vector.go still rely on count being accurate (IsEmpty() returns n.count == 0). If count is untrustworthy here, it's untrustworthy there too.

This will cause a measurable performance regression on every Contains() call across the entire engine.

// Take a snapshot of the source bitmap to prevent reading inconsistent
// state when the source is being concurrently modified by a parallel
// Prepare call on the same operator chain.
snap := nsp.Clone()

🔴 Clone() on every Range() call — severe hot-path performance regression

Range() is called frequently during null propagation. Cloning the entire bitmap on every call is very expensive. If the source bitmap is being concurrently modified by a parallel Prepare call, the correct fix is to add synchronization at the caller level (or ensure the operator chain is not shared), not to defensively clone on every Range() invocation.

This also only provides a snapshot — if the underlying concurrency bug remains, the snapshot may still be taken from an inconsistent state (torn read).

func (n *Bitmap) RecalculateCount() {
	n.count = 0
	for _, w := range n.data {
		n.count += int64(bits.OnesCount64(w))
	}
}

🟡 RecalculateCount() treats the symptom, not the cause

This function exists to fix count/data inconsistency after "concurrent bitmap corruption" (per the comment). But if concurrent access can corrupt the count, it can also corrupt the data bits themselves. Recalculating count from (possibly corrupted) data only makes the count consistent with whatever garbage is in data.

The real fix should prevent the concurrent corruption in the first place. Consider this a temporary band-aid and track the root cause separately.

@@ -349,7 +358,13 @@ func (n *Bitmap) TryExpandWithSize(size int) {
return

✅ TryExpandWithSize zeroing — correct fix

This is a legitimate bug fix. When len(n.data) < newCap <= cap(n.data), the n.data = n.data[:newCap] re-slice exposes backing-array slots that may contain stale data from a previous use. Zeroing them is correct.

Note: the newCap > cap(n.data) branch (line 348) allocates via make() which zero-initializes in Go, so it doesn't need this treatment.

Comment thread pkg/sql/compile/scope.go
// release (op.Release + reuse.Free[Scope]). This mirrors the normal cleanup
// order in Compile.FreeOperator → Compile.clear. The entry is then nil'd so
// that later walks of PreScopes skip it (nil checks exist throughout).
if c.IsTpQuery() {

🟡 Removing !c.hasMergeOp condition — overly broad, fragile pattern

The old condition c.IsTpQuery() && !c.hasMergeOp limited sequential execution to simple TP queries. Removing !c.hasMergeOp forces ALL TP queries (including INSERT with UNIQUE KEY, which sets hasMergeOp=true) into sequential pre-scope execution. This is a performance regression for TP queries with multiple independent PreScopes.

The FreeOperator → release → nil pattern is also fragile — it requires nil guards everywhere PreScopes are traversed (20+ new nil checks across compile.go, compile2.go, debugTools.go, remoterun.go, remoterunClient.go, types.go, analyze_module.go). Missing a single nil check will cause a panic.

A more targeted fix would be to identify which specific scope is being shared unsafely and fix the sharing, rather than serializing everything.

// correctly even when the last appended rows are non-null.
if vec.nsp.Count() > 0 {
nulls.TryExpand(&vec.nsp, vec.length)
}

🟡 TryExpand on every non-null appendOneFixed — hot path cost

if vec.nsp.Count() > 0 {
    nulls.TryExpand(&vec.nsp, vec.length)
}

This runs on every non-null appendOneFixed call when the vector has any previous nulls. For a batch of 8192 rows where only row 0 is null, this calls TryExpand 8191 times. The expand is a no-op when already big enough, but the function call overhead on this hot path is unnecessary.

Consider doing this once after the batch is fully built, or tracking whether expansion is needed with a flag.

-	for _, k := range pkey {
+	for i, k := range pkey {
+		// NULL != NULL in SQL standard, skip NULL values from duplicate check
+		if toCheck.GetNulls().Contains(uint64(i)) {

✅ Correct — NULL rows must be skipped in duplicate detection

SQL standard: NULL != NULL. Multiple NULL values in a UNIQUE KEY column are not duplicates. This fix is correct and addresses one of the core root causes.

sortedKeys, err := keys.Dup(tbl.proc.Load().Mp())
if err != nil {
return false, err
}

✅ Correct — Clone before InplaceSort

InplaceSort() reorders the data slice via sort.Slice() but does NOT reorder the null bitmap (verified in source). Cloning before sort prevents bitmap/data misalignment. This is a legitimate bug fix.

Note: a more fundamental fix would be to make InplaceSort() itself handle nulls (either sort the bitmap alongside data, or filter nulls first), since this footgun will bite other callers too.

}
return cloned, nil
}


✅ Correct — cloneNonNullKeys prevents garbage Varlena panic

Filtering out NULL rows before primaryKeysMayBeChanged prevents EncodePrimaryKeyVector from reading garbage Varlena data in NULL slots. This is a correct defensive fix.

Minor: when !keysVector.HasNull(), this does a full Dup(). If memory is a concern, the no-null path could return the original vector with a flag indicating the caller should NOT free it (to avoid the unconditional defer clonedKeys.Free(mp)).

	start, count int) (bool, string) {
-	for _, v := range vals[start : start+count] {
+	nsp := pk.GetNulls()
+	for i := start; i < start+count; i++ {

✅ Correct — skip NULL in checkPKDupGeneric

NULL values should not participate in primary key duplicate detection. This fix is correct.



Development

Successfully merging this pull request may close these issues.

[Bug]: flink-cdc: 10g data sync, mo oom.

3 participants