Skip to content

fix(worker): clean up orphaned zoekt .tmp shard files after every index attempt#1214

Open
brendan-kellam wants to merge 3 commits into
mainfrom
brendan-kellam/fix-orphaned-temp-shards
Open

fix(worker): clean up orphaned zoekt .tmp shard files after every index attempt#1214
brendan-kellam wants to merge 3 commits into
mainfrom
brendan-kellam/fix-orphaned-temp-shards

Conversation

@brendan-kellam
Copy link
Copy Markdown
Contributor

@brendan-kellam brendan-kellam commented May 20, 2026

Summary

  • Moved cleanupTempShards from the catch block into a finally block in repoIndexManager.ts so leftover .tmp shard files are cleaned up on every indexing attempt, not just on failure.
  • Previously, if a previous run was killed mid-rename (zoekt writes shards to .tmp first and atomically renames them), the orphaned .tmp files would persist indefinitely — zoekt itself only tracks temp files from the current build, so a subsequent successful run would not remove them.

Investigation

Test plan

  • Verify that successful indexing runs no longer leave behind .tmp shard files when orphans exist from a prior interrupted run.
  • Verify that failed indexing runs still trigger cleanup and propagate the original error (the cleanupTempShards call is best-effort and does not throw, so it cannot mask the error from catch).

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Bug Fixes

    • Temporary shard files are now consistently cleaned up after every indexing attempt, eliminating leftover files from interrupted runs.
  • Documentation

    • Changelog updated with an "Unreleased" entry noting the cleanup fix.
  • Tests

    • Test suite updated to cover and validate the new cleanup behavior during indexing.

Review Change Stack

…dex attempt

Move the cleanupTempShards call from the catch block into a finally block
so orphans left behind by a previously interrupted run are removed on the
next successful indexing pass, not only after a failure.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions

This comment has been minimized.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 20, 2026

Walkthrough

The PR refactors indexing error handling to guarantee cleanup of temporary Zoekt shard files on every indexing attempt, not only on failure. The cleanupTempShards(repo) call was moved from the catch block into a finally block, tests were updated to mock the function, and a changelog entry documents the fix.

Changes

Cleanup guarantee in repo indexing

Layer / File(s) Summary
Try-catch-finally refactor, test mock, and changelog update
packages/backend/src/repoIndexManager.ts, packages/backend/src/repoIndexManager.test.ts, CHANGELOG.md
Moved cleanupTempShards(repo) from the catch block to a finally block in indexRepository, simplified the catch logging and rethrow behavior; added cleanupTempShards to the Zoekt mock used in tests; added an Unreleased "Fixed" changelog entry noting orphaned .tmp shard cleanup.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Possibly related PRs

  • sourcebot-dev/sourcebot#805: Both PRs modify packages/backend/src/repoIndexManager.ts's indexRepository flow to use cleanupTempShards(repo) for Zoekt .tmp shard cleanup (with this PR ensuring cleanup runs in a finally after every indexing attempt).
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately and clearly describes the main change: moving temp shard cleanup to execute on every index attempt rather than only on failure.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch brendan-kellam/fix-orphaned-temp-shards

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@brendan-kellam brendan-kellam self-assigned this May 20, 2026
@brendan-kellam brendan-kellam requested a review from jsourcebot May 20, 2026 17:52
@brendan-kellam brendan-kellam changed the title fix(backend): clean up orphaned zoekt .tmp shard files after every index attempt fix(worker): clean up orphaned zoekt .tmp shard files after every index attempt May 20, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/backend/src/repoIndexManager.ts`:
- Around line 521-527: Tests are failing because the new call to
cleanupTempShards(repo) in repoIndexManager.ts expects an export from the
./zoekt.js mock that doesn't exist; update the vi.mock blocks that mock
"./zoekt.js" to provide a cleanupTempShards export (e.g., a no-op async function
or a jest/vi.fn spy) or convert those mocks to partial mocks using
importOriginal and then override only the symbols you need while preserving
cleanupTempShards (use importOriginal() and return { ...original,
cleanupTempShards: vi.fn(async () => {}) }). Locate vi.mock calls referencing
"./zoekt.js" in the test files and add the cleanupTempShards export consistently
so the finally block can call it during tests.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 990d8d9a-374f-4dbf-902f-68f681dbacad

📥 Commits

Reviewing files that changed from the base of the PR and between 79b3767 and 571224a.

📒 Files selected for processing (2)
  • CHANGELOG.md
  • packages/backend/src/repoIndexManager.ts

Comment thread packages/backend/src/repoIndexManager.ts
The success path of indexRepository now calls cleanupTempShards in a finally
block, so the mock needs to export it or the previously-passing tests fail.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
packages/backend/src/repoIndexManager.test.ts (1)

57-60: ⚡ Quick win

Add explicit assertions that cleanupTempShards runs on both success and failure paths.

Great mock update, but this PR’s core contract is cleanup-on-every-attempt. Please add assertions in representative success and failure job-processing tests to verify cleanupTempShards(repo) is called in both paths.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/backend/src/repoIndexManager.test.ts` around lines 57 - 60, Add
explicit assertions that the mocked cleanupTempShards is invoked regardless of
indexGitRepository success or failure: in the representative success test (where
indexGitRepository resolves) and in the failure test (where indexGitRepository
rejects), assert vi.mocked(cleanupTempShards).toHaveBeenCalledWith(repo) (or
equivalent) after the job processing call; ensure you set indexGitRepository to
resolve/reject in those tests, reset/clear mocks between tests, and
import/reference the same cleanupTempShards mock used in the vi.mock block so
both test paths verify cleanupTempShards(repo) is called.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@packages/backend/src/repoIndexManager.test.ts`:
- Around line 57-60: Add explicit assertions that the mocked cleanupTempShards
is invoked regardless of indexGitRepository success or failure: in the
representative success test (where indexGitRepository resolves) and in the
failure test (where indexGitRepository rejects), assert
vi.mocked(cleanupTempShards).toHaveBeenCalledWith(repo) (or equivalent) after
the job processing call; ensure you set indexGitRepository to resolve/reject in
those tests, reset/clear mocks between tests, and import/reference the same
cleanupTempShards mock used in the vi.mock block so both test paths verify
cleanupTempShards(repo) is called.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 578d3cd9-4a77-401a-be78-b9259a5fcc92

📥 Commits

Reviewing files that changed from the base of the PR and between 571224a and 1622d18.

📒 Files selected for processing (1)
  • packages/backend/src/repoIndexManager.test.ts

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants