Skip to content

fix: lifecycle-safety follow-ups (#2870)#2900

Closed
max-sixty wants to merge 4 commits into
mainfrom
agent-a3bad31af2bb71c8e
Closed

fix: lifecycle-safety follow-ups (#2870)#2900
max-sixty wants to merge 4 commits into
mainfrom
agent-a3bad31af2bb71c8e

Conversation

@max-sixty

Copy link
Copy Markdown
Owner

Two follow-ups to #2870 surfaced by a retrospective deep-dive review. The
third item discussed in that review (the two divergent safe-delete semantics
in execute_instant_removal_or_fallback) is being explored as a separate
CAS-based rewrite via git update-ref -d <ref> <sha> and will come in a
follow-up PR — it touches delete_branch_if_safe core logic and warrants
its own review surface.

1. refactor(prune): drop defensive check_lock now that phases are sequenced

The check_lock RwLock<()> in step_prune was carried over from #2808's
Windows .git/config race fix. After #2870 restructured the flow so hook
approval is exact, the phase ordering already serializes integration-check
readers against the rename-failure fallback: the background par_iter is
the only reader, the channel-drain for ... in rx loop exits only once
that thread (and all its readers) has finished, and removals run strictly
after.

The remaining comment admitted the guard was defensive ("preserves the
serialization if this flow is refactored back to overlap checks and
removals") — one mechanism per guarantee per CLAUDE.md, so the lock is
removed. Comments on the par_iter spawn and the SynchronousForNonCurrent
variant now name the real invariants.

Verdict: Removed. Phase ordering is the actual mechanism; the lock was
belt-and-suspenders.

2. fix(remove): warn when background branch deletion silently retains the branch

Both branch-deletion paths in execute_instant_removal_or_fallback
(the fast-path post-prune delete and the SynchronousForNonCurrent
fallback delete) swallowed errors at log::debug! — invisible at default
verbosity. When the worktree was removed but the branch survived (a hook
advanced it, git branch -D errored, refs scan failed), the user got no
signal that the second half of the operation didn't happen.

Promoted to a warning_message with a wt remove --force-delete <branch>
recovery hint. To avoid duplicating the planner's existing "Branch
unmerged" hint, the warning suppresses the Ok(NotDeleted) case when
the planner already predicted retention; race-driven NotDeleted outcomes
still warn. Err always warns. BackgroundRemoval carries a new
planner_expected_retention bit so the helper can make that distinction
without re-deriving planner state.

Tradeoff: a previously silent path gains noise on the rare race. Silent
half-completion of a destructive operation is worse than an occasional
extra warning line.

Tests

cargo run -- hook pre-merge --yes passes: 3834 nextest tests + doctests

  • pre-commit lints + clippy + docs.

This was written by Claude Code on behalf of Maximilian Roos

max-sixty and others added 2 commits May 25, 2026 14:28
The `check_lock` in `step_prune` was carried over from #2808's
Windows `.git/config` race fix. After #2870 restructured the flow so
hook approval is exact, the phase ordering already serializes
integration-check readers against the rename-failure fallback's
`.git/config` rewrite: the background par_iter is the only reader, and
the channel-drain loop (`for (idx, result) in rx`) only exits once that
thread has finished. Removals run strictly after.

That made the lock belt-and-suspenders — one mechanism per guarantee,
per CLAUDE.md. Drop the `RwLock<()>`, the `Arc` wrapping it, the
read-guard in the par_iter worker, the write-guard at the top of
`try_remove`, and the `check_lock` field on `RemovalContext`. Comments
on the par_iter spawn and the `SynchronousForNonCurrent` variant now
name the real invariants (phase ordering, summary-accurate iteration)
rather than the dropped lock.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…e branch

Both branch-deletion paths in `execute_instant_removal_or_fallback`
swallowed errors at `log::debug!` — invisible in default verbosity.
When the worktree is already gone but the branch survives (a hook
advanced it, a `git branch -D` exec failed, refs scan errored), the
user got no signal that the second half of the operation didn't happen.

Promote both to a `warning_message` with a `wt remove --force-delete <branch>`
recovery hint. To avoid duplicating the planner's existing "Branch
unmerged" hint, the warning suppresses the `Ok(NotDeleted)` case when
the planner already predicted retention (unmerged, or
`--no-delete-branch`); race-driven `NotDeleted` outcomes still warn.
`Err` always warns, since the prior `log::debug!` was invisible.

`BackgroundRemoval` carries a new `planner_expected_retention` bit so
the helper can make this distinction without re-deriving planner state.

This adds noise to a previously silent path — the tradeoff is that
silent half-completion of a destructive operation is worse than an
extra warning line on the rare race.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

@worktrunk-bot worktrunk-bot left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two doc comments still cite the old "serialize .git/config rewrite with integration-check readers" rationale that the rest of this PR replaced — the SynchronousForNonCurrent variant doc and the try_remove comment were updated, but two siblings in src/output/handlers.rs were missed:

  • execute_instant_removal_or_fallback's doc, ending ...runs that fallback synchronously for non-current worktrees when the caller needs to keep the fallback's '.git/config' rewrite serialized with other git readers. — the prune caller no longer needs that serialization; the variant doc you updated now grounds the choice in iteration ordering / final summary.
  • handle_remove_output's doc, ending prune passes [...SynchronousForNonCurrent] to keep the fallback's '.git/config' rewrite serialized with its parallel integration-check readers. — same stale claim.

Happy to push a follow-up commit aligning both with the new variant doc if you'd like.

codecov/patch is failing with 6 uncovered lines, all in the new warn_if_branch_retained — the Ok(NotDeleted) race-warning arm (lines 303–308) and the Err arm (lines 312–317). Per CLAUDE.md codecov/patch is mandatory despite being non-required, so flagging for your call: the uncovered branches are exactly the rare paths the PR description calls out as acknowledged tradeoffs, and they're not trivially testable without an injected hook / git failure. Up to you whether to add tests or merge with the gap.

Other observations, all non-blocking:

  • The lock removal in step_prune looks correct — the for (idx, result) in rx loop only exits once tx is dropped (when the spawned thread returns), which happens after every par_iter worker has finished. Removals run strictly after, so checks and the fallback's .git/config rewrite can't overlap.
  • planner_expected_retention = ctx.deletion_mode.should_keep() || safety.integration_reason.is_none() evaluates to true for ForceDelete + integration_reason.is_none(), even though the planner predicted force-deletion in that case. It's a latent inaccuracy that never fires in practice because delete_branch_if_safe(force_delete=true) never returns Ok(NotDeleted) — but the relationship is implicit. Tightening to ... || (!is_force() && integration_reason.is_none()) would make the formula correct independent of delete_branch_if_safe's shape; arguably not worth the extra clause.

max-sixty and others added 2 commits May 25, 2026 15:12
The `Err` arm of `warn_if_branch_retained` covered failure modes the user
cannot act on differently than re-running the command (`git branch -D` exec
errors, refs DB I/O failures). Promote it from `warning_message` to
`log::warn!` to match the writing-user-outputs guideline for non-user-fixable
issues, and let the `Ok(NotDeleted)` race path stay user-visible since the
recovery is `wt remove --force-delete <branch>`.

Net effect on the prior commit: removes the duplicate "to retry, run ..."
recovery hint for I/O errors (the prior wording suggested the same recovery
either way, conflating two different failure shapes) and cuts an
integration-test-unreachable code path that was the source of the
codecov/patch miss on #2900.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Integration tests cover only one of the four arms — the
NotDeleted+planner-expected-deletion warning surfaced by
test_merge_pre_remove_new_commit_keeps_branch. Add a binary-crate unit
test that calls the helper with each outcome (Ok(NotDeleted) with both
planner predictions, Ok(ForceDeleted), Ok(Integrated), Err) so the
suppression logic and the log::warn Err arm are exercised in coverage.

Output goes to stderr and isn't captured — the assertion is the call
doesn't panic on any arm. User-visible message shapes stay pinned by
the integration tests.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@worktrunk-bot

Copy link
Copy Markdown
Collaborator

code-coverage failed on an unrelated test (working_tree_diff_stats_with_untracked_counts_untracked_and_preserves_real_index) — same flake addressed in #2874. The 3 required test (linux|macos|windows) legs all pass. Leaving the approval in place; rerun once #2874 lands, or merge regardless since the failure isn't from this PR's changes.

(Worth noting: this means codecov/patch wasn't computed for this commit. The Err-arm removal + new unit test should resolve the gap from the prior commit's review, but that can't be verified until code-coverage succeeds.)

@max-sixty

Copy link
Copy Markdown
Owner Author

Superseded by #2903, which already contains every commit from this PR plus the full atomic-CAS branch-deletion rework. Closing in favor of that; no work is lost.

This was written by Claude Code on behalf of max-sixty

@max-sixty max-sixty closed this Jun 27, 2026
@max-sixty max-sixty deleted the agent-a3bad31af2bb71c8e branch June 27, 2026 23:36
max-sixty added a commit that referenced this pull request Jun 28, 2026
…#2903)

Three follow-ups to #2870 plus a CAS-based unification of safe-delete
semantics. (#2900 was the small/incremental path for the first two
items; it has been closed as superseded — its commits are all contained
here.)

## Item 1 — drop the defensive `check_lock` (refactor)

The `check_lock` `RwLock<()>` in `step_prune` was carried over from
#2808's Windows `.git/config` race fix. After #2870 restructured the
flow, the phase ordering already serializes integration-check readers
against removals: the background `par_iter` is the only reader, the `for
... in rx` loop exits only once that thread has finished, and removals
run strictly after. Belt-and-suspenders per CLAUDE.md — removed.

## Item 2 — warn when background branch deletion silently retains (fix)

Both branch-deletion paths in `execute_instant_removal_or_fallback`
swallowed errors at `log::debug!`. Promoted the surprise
`Ok(NotDeleted)` (planner predicted deletion, branch survived — a hook
moved the tip) to a `warning_message` with a `wt remove -D <branch>`
recovery hint. Suppressed when the planner already predicted retention
so the existing `print_hints` unmerged-branch message doesn't duplicate.
`Err` failures now log at warn level (developer-facing) rather than
user-actionable warnings — the failure modes (`git update-ref` exec
error, refs DB I/O) aren't actionable beyond re-running.

## Item 3 / CAS rewrite — atomic safe-delete (the big one)

`delete_branch_if_safe` now uses `git update-ref -d refs/heads/<branch>
<expected-sha>` instead of `git branch -D`. The expected SHA comes from
the snapshot the integration check already consulted, so the
check→delete sequence becomes one atomic step against a known SHA. If
anything moved the ref in between, git rejects the CAS — the new
`BranchDeletionOutcome::RetainedRaced` outcome is returned and surfaced
to the user. Force-delete keeps `git branch -D` (explicit override).

This unifies the divergent safe-delete semantics in
`execute_instant_removal_or_fallback`:

- **Fast path** + **SynchronousForNonCurrent fallback** routed through
`delete_branch_if_safe`, so they pick up CAS for free.
- **Detached fallback** (rename failed AND current worktree): the shell
`&& git branch -d <branch>` is replaced by a foreground integration
check + atomic `&& git update-ref -d <ref> <expected-sha>` tail
(`build_cas_branch_delete_tail`). Squash-merged / patch-id / ancestor
branches the planner accepts are now accepted at delete time too,
matching the fast path.
- **Branch-only deletion** (`handle_branch_only_output`) switches from
bare `git branch -D` (effectively a force-delete after a stale
integration check) to `delete_branch_if_safe`. CAS protects it too.

## Follow-up consolidation (commit `971e8b4`)

The "branch kept because its tip moved during the delete" message had
drifted into three shapes — two near-identical inline strings plus a
gap: the foreground worktree path routed `RetainedRaced` through
`show_unmerged_hint` and printed the generic "Branch unmerged; run `-D`"
hint, whose bare `-D` would force-delete the just-arrived racing
commits. All three emit paths (`warn_if_branch_retained`,
`handle_branch_only_output`, `print_hints`) now route through one
`retained_raced_branch_message` helper, with a dedicated `RetainedRaced`
early-return arm in `print_hints`; `show_unmerged_hint` is now
`NotDeleted`-only. Also converged the recovery command to the canonical
`wt remove -D <branch>` (via `suggest_command`) and inlined the
single-caller `snapshot_sha` wrapper.

### Race coverage

CAS closes the TOCTOU window inside `delete_branch_if_safe` (between its
own `capture_refs` and `update-ref -d`). Hook-driven races continue to
be caught by the existing fresh integration check that runs after
pre-remove hooks; CAS narrows the residual window from "between check
and delete" to "never". The race rejection is unit-simulated (the ref is
moved externally), not yet driven end-to-end through a real `wt remove`
+ hook.

### Tests

Unit tests in `git::remove::tests`:
`cas_rejects_delete_when_branch_advances` (pins the CAS rejection
mechanism), `cas_deletes_when_branch_unchanged`,
`cas_propagates_error_when_ref_vanished`, and
`deletes_via_fallback_when_branch_absent_from_snapshot`. In
`output::handlers::tests`: every `warn_if_branch_retained` arm,
`build_remove_command_with_tail_appends_only_when_present`, and
`retained_raced_branch_message_lead_in_varies`.
`test_prune_fallback_config_race_canary`'s wrapper-git script matches
the new `update-ref -d refs/heads/<branch>` shape.

`cargo run -- hook pre-merge --yes` passes locally: 4263 nextest tests +
doctests + pre-commit + clippy + docs, no snapshot drift.

### Known gaps (deferred)

- No end-to-end integration test of a real mid-delete race (a
`pre-remove` hook advancing the branch); the mechanism is unit-covered.
- The FAQ [What can Worktrunk
delete?](docs/content/faq.md#what-can-worktrunk-delete) doesn't yet
mention the fail-closed-on-race behavior.

> _This was written by Claude Code on behalf of Maximilian Roos_

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: worktrunk-bot <254187624+worktrunk-bot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants