fix(remove): atomic CAS branch deletion unifies safe-delete semantics#2903
Conversation
The `check_lock` in `step_prune` was carried over from #2808's Windows `.git/config` race fix. After #2870 restructured the flow so hook approval is exact, the phase ordering already serializes integration-check readers against the rename-failure fallback's `.git/config` rewrite: the background par_iter is the only reader, and the channel-drain loop (`for (idx, result) in rx`) only exits once that thread has finished. Removals run strictly after. That made the lock belt-and-suspenders — one mechanism per guarantee, per CLAUDE.md. Drop the `RwLock<()>`, the `Arc` wrapping it, the read-guard in the par_iter worker, the write-guard at the top of `try_remove`, and the `check_lock` field on `RemovalContext`. Comments on the par_iter spawn and the `SynchronousForNonCurrent` variant now name the real invariants (phase ordering, summary-accurate iteration) rather than the dropped lock. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…e branch Both branch-deletion paths in `execute_instant_removal_or_fallback` swallowed errors at `log::debug!` — invisible in default verbosity. When the worktree is already gone but the branch survives (a hook advanced it, a `git branch -D` exec failed, refs scan errored), the user got no signal that the second half of the operation didn't happen. Promote both to a `warning_message` with a `wt remove --force-delete <branch>` recovery hint. To avoid duplicating the planner's existing "Branch unmerged" hint, the warning suppresses the `Ok(NotDeleted)` case when the planner already predicted retention (unmerged, or `--no-delete-branch`); race-driven `NotDeleted` outcomes still warn. `Err` always warns, since the prior `log::debug!` was invisible. `BackgroundRemoval` carries a new `planner_expected_retention` bit so the helper can make this distinction without re-deriving planner state. This adds noise to a previously silent path — the tradeoff is that silent half-completion of a destructive operation is worse than an extra warning line on the rare race. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
worktrunk-bot
left a comment
There was a problem hiding this comment.
The CAS approach unifies the four SafeDelete paths cleanly: one library helper (delete_branch_if_safe) for fast/synchronous/branch-only, plus a CAS-tail builder for the Detached fallback. The update-ref -d <ref> <oid> primitive picks up IntegrationReason semantics for free across all paths — the description of the prior && git branch -d <branch> shell tail silently losing squash-merged/patch-id matches is accurate. The race-rejection discrimination (try update-ref -d, then rev-parse --verify --quiet to distinguish "ref still present" from "real error") is the right shape per the structured-output rule.
The only gap is the one you raised in test plan item 3, and it shows up the most on handle_named_removed_worktree_foreground. In that path output.branch_result carries Ok(RetainedRaced) straight through RemovalDisplayInfo::from_branch_result → handle_branch_deletion_result, which sets show_unmerged_hint = true. RemovalDisplayInfo::print_message then prints "Removed feature worktree" with no mention that the branch was kept (since branch_deleted() is false), and print_hints falls into the self.show_unmerged_hint || … arm and emits "Branch unmerged; to delete, run wt remove -D". The branch wasn't unmerged from the planner's view — it was integrated and then moved — so that hint mis-attributes the cause.
The other three paths handle this correctly: the rename-into-trash fast path and SynchronousForNonCurrent go through warn_if_branch_retained (handlers.rs:311), and handle_branch_only_output has its own dedicated branch (handlers.rs:1123). The detached fallback can't surface it (CAS happens in the detached child), so foreground is the one missing case.
Cleanest fix is probably either: (1) split show_unmerged_hint in BranchDeletionDisplay from a show_race_warning flag for RetainedRaced, then have print_hints emit the dedicated "moved during deletion" message; or (2) keep the field but add a RetainedRaced early branch in print_hints that prints the warning before falling through.
Nit: assert_ne!(original_sha, advanced_sha) works because TestRepo::git_output trims (src/testing/mod.rs:1120), but it's worth knowing — if that trim ever goes away, the assertion becomes vacuous since the trailing \n would always make the strings differ.
The `Err` arm of `warn_if_branch_retained` covered failure modes the user cannot act on differently than re-running the command (`git branch -D` exec errors, refs DB I/O failures). Promote it from `warning_message` to `log::warn!` to match the writing-user-outputs guideline for non-user-fixable issues, and let the `Ok(NotDeleted)` race path stay user-visible since the recovery is `wt remove --force-delete <branch>`. Net effect on the prior commit: removes the duplicate "to retry, run ..." recovery hint for I/O errors (the prior wording suggested the same recovery either way, conflating two different failure shapes) and cuts an integration-test-unreachable code path that was the source of the codecov/patch miss on #2900. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
c127ed5 to
96f7f14
Compare
Integration tests cover only one of the four arms — the NotDeleted+planner-expected-deletion warning surfaced by test_merge_pre_remove_new_commit_keeps_branch. Add a binary-crate unit test that calls the helper with each outcome (Ok(NotDeleted) with both planner predictions, Ok(ForceDeleted), Ok(Integrated), Err) so the suppression logic and the log::warn Err arm are exercised in coverage. Output goes to stderr and isn't captured — the assertion is the call doesn't panic on any arm. User-visible message shapes stay pinned by the integration tests. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Three follow-ups to #2870 collapse into one rewrite of `delete_branch_if_safe`: it now performs an atomic compare-and-swap delete via `git update-ref -d refs/heads/<branch> <expected-sha>`, where the expected SHA comes from the snapshot the integration check already consulted. If the ref moves between the integration check and the delete (a hook, a concurrent push), git rejects the CAS — the branch is retained (fail-closed) and we surface the new `BranchDeletionOutcome::RetainedRaced` outcome rather than dropping the unmerged commits silently. Force-delete still uses `git branch -D` (the user's explicit override). This unifies the divergent safe-delete semantics in `execute_instant_removal_or_fallback`: - Fast path and `SynchronousForNonCurrent` fallback already routed through `delete_branch_if_safe`, so they pick up CAS for free. - Detached fallback used to bake `&& git branch -d <branch>` into the shell, which only honored git's reachability-from-HEAD check (so squash-merged / patch-id / ancestor branches the planner accepted as integrated were rejected at delete time). Now `build_cas_branch_delete_tail` runs worktrunk's full integration check in the foreground and, if integrated, appends an atomic `git update-ref -d <ref> <expected-sha>` to the detached shell — same semantics as the other paths, plus CAS safety against tip movement between the foreground check and the detached delete. - `handle_branch_only_output` also switches its safe-delete path from `git branch -D` (an unsafe force-delete after a stale integration check) to `delete_branch_if_safe`, so the CAS protects branch-only deletions the same way. The display layer absorbs the new variant: `flag_note` treats `RetainedRaced` like `NotDeleted` for the success badge, `handle_branch_deletion_result` shows the unmerged hint, and the branch-only output path emits a dedicated "moved during deletion" warning with a `wt remove --force-delete <branch>` recovery hint. CAS protects the TOCTOU window inside `delete_branch_if_safe` (between its own snapshot capture and `update-ref -d`). Hook-driven races are still caught by the existing fresh integration check that runs after pre-remove hooks; CAS narrows the residual unprotected window from \"between check and delete\" to \"never\". Tests: `git update-ref -d <ref> <sha>` is structured — exit 0 on deletion, non-zero on stale-SHA rejection (\"cannot lock ref ... is at X but expected Y\"). The CAS helper re-checks ref presence via `rev-parse --verify --quiet` rather than parsing the error message, keeping with the structured-output policy. Two new unit tests in `git::remove::tests`: - `cas_rejects_delete_when_branch_advances` — captures a snapshot, moves the branch externally, then calls `delete_branch_if_safe` and asserts `RetainedRaced` plus surviving tip at the post-race SHA. Pins the CAS rejection mechanism. - `cas_deletes_when_branch_unchanged` — sanity check the common integrated case still deletes. `test_prune_fallback_config_race_canary`'s wrapper-git script now matches `update-ref -d refs/heads/<branch>` alongside the `branch -d|-D <branch>` shapes. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The doc rewrite that came with the CAS commit only mentioned the NotDeleted-race case. Spell out RetainedRaced, NotDeleted (with the planner-predicted-retention suppression), the silent success arms, and the log::warn! Err arm so the file reads consistently after the rebase onto the PR-2900 simplification. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
96f7f14 to
992790f
Compare
worktrunk-bot
left a comment
There was a problem hiding this comment.
The CAS rewrite addresses the previous review's structural concerns. One small follow-up on the new test (inline).
The arm this PR introduces — the only one that warns regardless of planner_expected_retention — was the one missing from the per-arm test. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…8e-cas # Conflicts: # src/commands/step/prune.rs
worktrunk-bot
left a comment
There was a problem hiding this comment.
codecov/patch is failing on the CAS changes. Triaging the report:
False positives (test exists, llvm-cov mis-attributes the middle lines of eprintln!(warning_message(cformat!(...)))): src/output/handlers.rs L324 and L335 in warn_if_branch_retained. The eprintln! openers at L322 and L333 are hit (L322 hits=2, L333 hits=1 in cobertura.xml), so the new warn_if_branch_retained_arms test added in 80e4b5e is doing its job — the gap is reporting noise inside the macro expansion.
Genuine gaps (hits=0 in cobertura):
src/output/handlers.rsL1127/L1129 — theRetainedRacedarm ofhandle_branch_only_output. Distinct fromwarn_if_branch_retained; the new unit test doesn't reach this function. Easiest fix: a sibling unit test that callshandle_branch_only_outputwith aBranchDeletionOutcome::RetainedRacedand asserts it doesn't panic, mirroring the pattern at handlers.rs:2016.src/output/handlers.rsL393 — theNone => remove_commandbranch ofbuild_remove_command_with_tail. TheSome(tail)arm is exercised (L391 hits=1); the no-tail arm is trivial to cover with a unit test on the helper.src/git/remove.rsL437-438 — thesnapshot_shareturnsNonefallback. By the inline comment this is intentionally a corner-case preservation of pre-CAS behavior, reachable only via a synthetic snapshot/branch mismatch. Either construct aRefSnapshotthat hits it, or — if the path is genuinely unreachable — remove it and let the caller fall through to a single deletion code path per the CLAUDE.md coverage guidance ("remove unused code… where falling through to a general handler suffices").src/git/remove.rsL495 — the real-error propagation incas_delete_branch_outcomewhen bothupdate-ref -dand the post-failurerev-parse --verifyfail. Hardest to exercise without a fault-injection hook.
I'm dismissing my prior approval per CLAUDE.md's "never merge a PR with failing codecov/patch without explicit user approval" — the two helper-function gaps look mechanical to close.
codecov/patch failing — coverage gaps detailed in follow-up comment
Close codecov/patch gaps on the new CAS branch-deletion paths: - cas_delete_branch_outcome: the true-error arm (update-ref fails and the ref is already gone) propagates the error rather than reporting RetainedRaced. - build_remove_command_with_tail: the no-tail branch returns the bare worktree-removal command.
delete_branch_if_safe falls back to a plain `branch -D` when the branch is integrated — resolved via the live rev-parse fallback in compute_integration_lazy — but absent from the captured snapshot, so snapshot_sha has no SHA to compare-and-swap against. The arm is reachable when the branch is created after the snapshot; exercise it directly rather than leaving it as the lone uncovered patch path.
# Conflicts: # src/output/handlers.rs
…t paths The "branch kept because its tip moved during the delete" message had drifted into three shapes: two near-identical inline strings (the worktree-removal warning and the branch-only warning, differing only in lead-in) plus a gap — the foreground worktree path routed `RetainedRaced` through `show_unmerged_hint` and printed the generic "Branch unmerged; run -D" hint, whose bare `-D` would force-delete the just-arrived racing commits, the opposite of the cautious "inspect commits, then force-delete if safe" the other paths gave. Collapse all three onto one `retained_raced_branch_message` helper (a sibling to `retained_unmerged_branch_messages`, same can't-drift rationale), and give `print_hints` an explicit `RetainedRaced` arm so the foreground path surfaces the correct message. `RetainedRaced` is no longer folded into `show_unmerged_hint`, since every caller now handles it explicitly. While here: - Converge the force-delete recovery command to the canonical `wt remove -D <branch>` via `suggest_command`, replacing the hardcoded `--force-delete` spelling these messages had drifted to. - Inline the single-caller `snapshot_sha` wrapper in `delete_branch_if_safe`; it now reads `snapshot.local_branch(branch)` directly — the same accessor the detached-fallback CAS tail already used. Behavior is unchanged except the now-correct foreground raced message and the `-D` spelling; no snapshot pinned the raced/not-integrated strings. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
worktrunk-bot
left a comment
There was a problem hiding this comment.
Both concerns from the prior reviews are now resolved by the three new commits:
- Foreground
RetainedRacedmis-attribution (971e8b4) — the worktree path no longer foldsRetainedRacedintoshow_unmerged_hint;print_hintsgets a dedicated early-return arm, and all three emit paths (warn_if_branch_retained,handle_branch_only_output,print_hints) now route through oneretained_raced_branch_messagehelper. I traced the readers:show_unmerged_hintis nowNotDeleted-only, and every remainingNotDeleted | RetainedRacedgrouping (flag_note) is correct since both yield an empty note. The raced framing ("inspect commits, then runwt remove -Dif safe") is the right fix over the generic "Branch unmerged; run-D" hint, which mis-described an integrated-but-moved branch. - codecov/patch — now passing on
dca30c2. The new tests (c58a41,813f63) target the exact gaps flagged: the CAS true-error arm, the no-tailbuild_remove_command_with_tailbranch, and the snapshot-skewed non-CAS fallback.
This change touches the branch-deletion / force-delete surface (src/git/remove.rs, src/output/handlers.rs, the wt remove -D recovery command), so per the repo's data-loss-surface review policy I'm not issuing an approval — that merge call is yours. No new findings on the incremental diff.
Three follow-ups to #2870 plus a CAS-based unification of safe-delete semantics. (#2900 was the small/incremental path for the first two items; it has been closed as superseded — its commits are all contained here.)
Item 1 — drop the defensive
check_lock(refactor)The
check_lockRwLock<()>instep_prunewas carried over from #2808's Windows.git/configrace fix. After #2870 restructured the flow, the phase ordering already serializes integration-check readers against removals: the backgroundpar_iteris the only reader, thefor ... in rxloop exits only once that thread has finished, and removals run strictly after. Belt-and-suspenders per CLAUDE.md — removed.Item 2 — warn when background branch deletion silently retains (fix)
Both branch-deletion paths in
execute_instant_removal_or_fallbackswallowed errors atlog::debug!. Promoted the surpriseOk(NotDeleted)(planner predicted deletion, branch survived — a hook moved the tip) to awarning_messagewith awt remove -D <branch>recovery hint. Suppressed when the planner already predicted retention so the existingprint_hintsunmerged-branch message doesn't duplicate.Errfailures now log at warn level (developer-facing) rather than user-actionable warnings — the failure modes (git update-refexec error, refs DB I/O) aren't actionable beyond re-running.Item 3 / CAS rewrite — atomic safe-delete (the big one)
delete_branch_if_safenow usesgit update-ref -d refs/heads/<branch> <expected-sha>instead ofgit branch -D. The expected SHA comes from the snapshot the integration check already consulted, so the check→delete sequence becomes one atomic step against a known SHA. If anything moved the ref in between, git rejects the CAS — the newBranchDeletionOutcome::RetainedRacedoutcome is returned and surfaced to the user. Force-delete keepsgit branch -D(explicit override).This unifies the divergent safe-delete semantics in
execute_instant_removal_or_fallback:delete_branch_if_safe, so they pick up CAS for free.&& git branch -d <branch>is replaced by a foreground integration check + atomic&& git update-ref -d <ref> <expected-sha>tail (build_cas_branch_delete_tail). Squash-merged / patch-id / ancestor branches the planner accepts are now accepted at delete time too, matching the fast path.handle_branch_only_output) switches from baregit branch -D(effectively a force-delete after a stale integration check) todelete_branch_if_safe. CAS protects it too.Follow-up consolidation (commit
971e8b4)The "branch kept because its tip moved during the delete" message had drifted into three shapes — two near-identical inline strings plus a gap: the foreground worktree path routed
RetainedRacedthroughshow_unmerged_hintand printed the generic "Branch unmerged; run-D" hint, whose bare-Dwould force-delete the just-arrived racing commits. All three emit paths (warn_if_branch_retained,handle_branch_only_output,print_hints) now route through oneretained_raced_branch_messagehelper, with a dedicatedRetainedRacedearly-return arm inprint_hints;show_unmerged_hintis nowNotDeleted-only. Also converged the recovery command to the canonicalwt remove -D <branch>(viasuggest_command) and inlined the single-callersnapshot_shawrapper.Race coverage
CAS closes the TOCTOU window inside
delete_branch_if_safe(between its owncapture_refsandupdate-ref -d). Hook-driven races continue to be caught by the existing fresh integration check that runs after pre-remove hooks; CAS narrows the residual window from "between check and delete" to "never". The race rejection is unit-simulated (the ref is moved externally), not yet driven end-to-end through a realwt remove+ hook.Tests
Unit tests in
git::remove::tests:cas_rejects_delete_when_branch_advances(pins the CAS rejection mechanism),cas_deletes_when_branch_unchanged,cas_propagates_error_when_ref_vanished, anddeletes_via_fallback_when_branch_absent_from_snapshot. Inoutput::handlers::tests: everywarn_if_branch_retainedarm,build_remove_command_with_tail_appends_only_when_present, andretained_raced_branch_message_lead_in_varies.test_prune_fallback_config_race_canary's wrapper-git script matches the newupdate-ref -d refs/heads/<branch>shape.cargo run -- hook pre-merge --yespasses locally: 4263 nextest tests + doctests + pre-commit + clippy + docs, no snapshot drift.Known gaps (deferred)
pre-removehook advancing the branch); the mechanism is unit-covered.