Skip to content

feat: route find-or-create merge_insert through the v2 write path#6473

Open
wombatu-kun wants to merge 1 commit intolance-format:mainfrom
wombatu-kun:feat/find-or-create-v2-merge-insert-6441
Open

feat: route find-or-create merge_insert through the v2 write path#6473
wombatu-kun wants to merge 1 commit intolance-format:mainfrom
wombatu-kun:feat/find-or-create-v2-merge-insert-6441

Conversation

@wombatu-kun
Copy link
Copy Markdown
Contributor

Summary

  • Add WhenMatched::DoNothing to the outer matches! gate in can_use_create_plan so find-or-create merge_insert jobs (the default MergeInsertBuilder config: DoNothing + InsertAll) take the v2 FullSchemaMergeInsertExec path instead of falling back to the legacy v1 Merger. The downstream pipeline (merge_insert_action, FullSchemaMergeInsertExec::process_row_action, MergeInsertPlanner::is_delete_only) was already prepared to handle DoNothing; the outer gate was the only gap.
  • explain_plan() now works for find-or-create configurations (previously returned Error::NotSupported), and bloom-filter conflict detection (inserted_rows_filter) is now populated for find-or-create on unenforced primary keys — two concurrent jobs inserting the same fresh key now fail fast with TooMuchWriteContention instead of silently racing.
  • Refresh the (already-stale) doc comment on can_use_create_plan to list all supported WhenMatched modes.

Closes #6441. This is the DoNothing-shaped counterpart of #6442 (commit 46639be).

Test plan

  • cargo fmt --all
  • cargo clippy -p lance --tests --benches -- -D warnings
  • cargo test -p lance --lib dataset::write::merge_insert — all 127 tests pass, including the three new tests below and unchanged coverage (test_basic_merge, test_fast_path_*, test_explain_plan*, test_concurrent_*, test_when_matched_delete_*).
  • cargo test -p lance --lib dataset::write — all 190 tests pass across merge_insert, update, and write paths.
  • test_fast_path_find_or_create (new) — plan-shape assertion proving a default find-or-create job produces a v2 plan with a Right join. Would fail on main.
  • test_explain_plan_find_or_create (new) — asserts explain_plan() now succeeds for the default find-or-create config, covering the issue's explain_plan acceptance criterion.
  • test_concurrent_find_or_create_same_new_key (new) — bloom-filter conflict-detection regression mirroring test_concurrent_insert_same_new_key with WhenMatched::DoNothing, covering the issue's Bloom-filter acceptance criterion.

Acceptance criteria (from the issue)

  • WhenMatched::DoNothing operations use the v2 path (verified via explain_plan in test_explain_plan_find_or_create and via plan assertion in test_fast_path_find_or_create).
  • Existing DoNothing tests pass unchanged (test_basic_merge find-or-create case still green on both Legacy and V2_0).
  • New test calls explain_plan with DoNothing config to confirm v2 is used (test_explain_plan_find_or_create).
  • Conflict detection (bloom filter) works for DoNothing operations (test_concurrent_find_or_create_same_new_key).

🤖 Generated with Claude Code

@github-actions github-actions Bot added the enhancement New feature or request label Apr 10, 2026
@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 10, 2026

Codecov Report

❌ Patch coverage is 99.13043% with 1 line in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
rust/lance/src/dataset/write/merge_insert.rs 99.13% 0 Missing and 1 partial ⚠️

📢 Thoughts on this report? Let us know!

@wjones127 wjones127 self-assigned this Apr 10, 2026
@wombatu-kun wombatu-kun force-pushed the feat/find-or-create-v2-merge-insert-6441 branch from f98ecea to 7a823c9 Compare April 14, 2026 02:19
Find-or-create merge_insert jobs (`WhenMatched::DoNothing` +
`WhenNotMatched::InsertAll`, the default `MergeInsertBuilder` config)
used to fall back to the legacy v1 `Merger` path even though every
other piece of the v2 pipeline was already prepared to handle them:

- `merge_insert_action` already emits `Action::Nothing` for `DoNothing`
  matches.
- `FullSchemaMergeInsertExec::process_row_action` already handles
  `Action::Nothing` by dropping the row without counting it as a
  delete or update.
- `MergeInsertPlanner::is_delete_only` already routes `DoNothing` to
  `FullSchemaMergeInsertExec` (only `WhenMatched::Delete` takes the
  delete-only exec).
- The `no_upsert` eligibility check inside `can_use_create_plan` itself
  already accepted `DoNothing` for the partial-schema (key-only) branch.

The only gap was the outer `matches!(self.params.when_matched, ...)`
gate in `can_use_create_plan`, which listed
`UpdateAll | UpdateIf | Fail | Delete` but was missing `DoNothing`, so
the fast path was never attempted. Add `DoNothing` to that match.

Consequences:

- `explain_plan()` now works for find-or-create configurations
  (previously returned `Error::NotSupported`).
- Bloom-filter conflict detection (`inserted_rows_filter`) is now
  populated for find-or-create operations on unenforced primary keys,
  because non-matching rows take the existing `Action::Insert` branch
  that populates the filter. v1 always returned `None` here, so two
  concurrent find-or-create jobs inserting the same fresh key could
  previously silently race; they now fail fast with
  `TooMuchWriteContention`.
- The doc comment on `can_use_create_plan` is refreshed to list all
  supported `WhenMatched` modes (it had already drifted out of sync
  with the `Delete` addition).

Tests added in `rust/lance/src/dataset/write/merge_insert.rs`:

- `test_fast_path_find_or_create` — plan-shape assertion mirroring
  `test_fast_path_update_only`; proves a default find-or-create job
  now produces a v2 plan with a Right join and would fail on main.
- `test_explain_plan_find_or_create` — asserts `explain_plan()` now
  succeeds for the default find-or-create config.
- `test_concurrent_find_or_create_same_new_key` — bloom-filter
  conflict-detection regression mirroring
  `test_concurrent_insert_same_new_key` with
  `WhenMatched::DoNothing`.

The existing `test_basic_merge` find-or-create coverage continues to
pass unchanged on both `LanceFileVersion::Legacy` and
`LanceFileVersion::V2_0`, satisfying the issue's "Existing DoNothing
tests pass unchanged" acceptance criterion.

Closes lance-format#6441

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@wombatu-kun wombatu-kun force-pushed the feat/find-or-create-v2-merge-insert-6441 branch from af49d1d to cf1f95b Compare May 4, 2026 08:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support WhenMatched::DoNothing on v2 merge_insert path

2 participants