fix(code-review): retry broken sandboxes #3110

Open
alex-alecu wants to merge 10 commits into main from heartbreaking-ragamuffin

@alex-alecu
Contributor

Why

A code review can fail because the sandbox it runs in breaks. When that happens, the review should get one clean retry instead of staying failed.

What changed

Code reviews now remember which sandbox and attempt they are using. When cloud-agent-next confirms a sandbox failed and destroys it, the web app claims affected reviews only once and starts them again in a fresh attempt. Old updates from the first attempt are ignored, so they cannot overwrite the retry. The same pull request check is reused and shows that the review is trying again.
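
To illustrate the claim-once behavior described above, here is a minimal in-memory sketch in TypeScript. It is not the code from this PR: the `Review` shape, the `claimReviewsForSandboxRetry` function, the `"queued"` status, and the `MAX_ATTEMPTS` constant are all hypothetical names chosen for the example. Only the `sandbox_500_destroyed` retry reason comes from the manual test notes in this conversation.

```typescript
// Hypothetical shapes: field and function names are illustrative, not taken from the PR.
type ReviewStatus = "queued" | "running" | "completed" | "failed";

interface Review {
  id: string;
  status: ReviewStatus;
  attempt: number;
  sandboxId: string | null;
  sessionId: string | null;
  retryReason: string | null;
}

// Assumption: a review gets exactly one retry, so attempt 2 is the last one.
const MAX_ATTEMPTS = 2;

// Claims every running review bound to the destroyed sandbox for one retry.
// Returns the ids claimed by this call; a repeated notification claims nothing,
// because the first claim already cleared the sandbox binding.
function claimReviewsForSandboxRetry(
  reviews: Review[],
  destroyedSandboxId: string,
): string[] {
  const claimed: string[] = [];
  for (const review of reviews) {
    const eligible =
      review.status === "running" &&
      review.sandboxId === destroyedSandboxId &&
      review.attempt < MAX_ATTEMPTS;
    if (!eligible) continue;

    review.attempt += 1; // e.g. attempt 1 -> attempt 2
    review.status = "queued"; // assumed status for a review awaiting re-dispatch
    review.sandboxId = null; // clear the old sandbox binding
    review.sessionId = null; // clear the old session binding
    review.retryReason = "sandbox_500_destroyed";
    claimed.push(review.id);
  }
  return claimed;
}
```

In a real database the same guard would presumably be a single conditional update keyed on the sandbox ID and current attempt, so that two concurrent notifications cannot both claim the same review.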

How to test

  1. Run pnpm drizzle migrate.
  2. Run pnpm test -- "apps/web/src/app/api/internal/code-review-status/[reviewId]/route.test.ts" apps/web/src/app/api/internal/code-review-sandbox-destroyed/route.test.ts apps/web/src/lib/code-reviews/sandbox-retry.test.ts apps/web/src/lib/code-reviews/db/code-reviews.test.ts apps/web/src/lib/code-reviews/dispatch/dispatch-pending-reviews.test.ts apps/web/src/routers/code-reviews-router.test.ts.
  3. Run pnpm --filter kilo-code-review-worker test.
  4. Run pnpm --filter cloud-agent-next test -- sandbox-recovery.test.ts session-prepare.test.ts.
  5. Run pnpm --filter @kilocode/worker-utils test.

Comment thread packages/db/src/migrations/0117_new_jasper_sitwell.sql
@kilo-code-bot
Contributor

kilo-code-bot Bot commented May 7, 2026

Code Review Summary

Status: No Issues Found | Recommendation: Merge

Files Reviewed (3 files)
  • apps/web/src/app/api/internal/code-review-status/[reviewId]/route.test.ts
  • apps/web/src/app/api/internal/code-review-status/[reviewId]/route.ts
  • packages/db/src/migrations/0117_new_jasper_sitwell.sql
Resolved Findings
  • apps/web/src/app/api/internal/code-review-status/[reviewId]/route.ts - sandbox retry now only skips terminal cleanup when a retry was actually claimed
  • packages/db/src/migrations/0117_new_jasper_sitwell.sql - sandbox ID index now uses CREATE INDEX CONCURRENTLY

Reviewed by gpt-5.5-2026-04-23 · 810,280 tokens

@alex-alecu
Contributor Author

Manual test passed.

Tested:

  • Started an isolated local Postgres database, applied migrations, and verified the new code review sandbox retry columns and sandbox_id index.
  • Exercised the code review sandbox destruction retry flow through the web internal endpoint into the code-review Worker using a stubbed cloud-agent-next response.
  • Exercised stale callback handling for old attempt callbacks, superseded session callbacks, terminal-state callbacks, and duplicate sandbox destruction notifications.

Verified:

  • A matching running review was claimed once from sandbox destruction, advanced from attempt 1 to attempt 2, cleared the old session/sandbox fields, and recorded sandbox_500_destroyed retry metadata.
  • The retried attempt dispatched to an attempt-specific Worker Durable Object and stored fresh attempt 2 session, CLI session, and sandbox IDs before completing successfully.
  • Old attempt 1 callbacks and mismatched-session attempt 2 callbacks were ignored without overwriting the retried attempt, and repeated sandbox destruction notifications did not create a third attempt.
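
The stale-callback cases listed above (old attempt, superseded session, terminal state) can be sketched as a single acceptance check. This is a hedged illustration under assumed names: `TrackedReview`, `AttemptCallback`, and `shouldApplyCallback` are hypothetical and do not appear in this PR, and the real route may order or express these checks differently.

```typescript
// Hypothetical types for illustration; not the PR's actual schema.
type CallbackStatus = "running" | "completed" | "failed";

interface TrackedReview {
  status: CallbackStatus;
  attempt: number;
  sessionId: string | null; // null until the current attempt stores its session
}

interface AttemptCallback {
  attempt: number;
  sessionId: string;
  status: CallbackStatus;
}

// Returns true only when the callback belongs to the review's current attempt
// and session, and the review has not already reached a terminal state.
function shouldApplyCallback(
  review: TrackedReview,
  callback: AttemptCallback,
): boolean {
  // Terminal-state callbacks: a finished review is never overwritten.
  if (review.status === "completed" || review.status === "failed") return false;
  // Old-attempt callbacks: attempt 1 must not overwrite attempt 2.
  if (callback.attempt !== review.attempt) return false;
  // Superseded-session callbacks: same attempt, but a different session.
  if (review.sessionId !== null && callback.sessionId !== review.sessionId) {
    return false;
  }
  return true;
}
```

For example, with a review on attempt 2 and session `s-2`, a late `failed` callback from attempt 1 is rejected, as is an attempt 2 callback carrying session `s-1`.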
