Skip to content

feat(api): soft-delete build layers in DB on user delete#3121

Merged
ValentaTomas merged 32 commits into
mainfrom
pr/db-soft-delete-builds
Jun 30, 2026
Merged

feat(api): soft-delete build layers in DB on user delete#3121
ValentaTomas merged 32 commits into
mainfrom
pr/db-soft-delete-builds

Conversation

@ValentaTomas

@ValentaTomas ValentaTomas commented Jun 28, 2026

Copy link
Copy Markdown
Member

This switches user delete to soft-delete the build layers in DB:

  • New deleted build status, mapped to a dedicated deleted status_group (migration updates compute_status_group).
  • New MarkExclusiveTemplateBuildsDeleted query that marks builds assigned only to this template as deleted, reusing the exclusivity predicate so builds shared with other templates are untouched.
  • Template delete and snapshot/paused delete now run the mark + delete in one transaction. The env_builds rows (and their team/env attribution) are preserved so a future GC can reclaim storage. Mirrors how failed keeps attribution.

Add a 'deleted' build status (mapped to a 'deleted' status_group) and, on
template/snapshot delete, mark the template's exclusive builds as deleted in a
transaction before the cascade. The env_builds rows and their team/env
attribution are preserved so a future GC can reclaim the storage; builds shared
with other templates are left untouched. No storage deletion here.
@cla-bot cla-bot Bot added the cla-signed label Jun 28, 2026
@cursor

cursor Bot commented Jun 28, 2026

Copy link
Copy Markdown

PR Summary

Medium Risk
Broad changes to template delete, listing, and build registration with new concurrency rules between soft-delete and in-flight builds; a missed active_envs join could still expose deleted templates.

Overview
Template and snapshot delete now soft-delete the underlying env (deleted_at) instead of removing the row, so build assignments and build rows stay in the database for future GC while the template disappears from normal API and dashboard flows.

Delete runs through a shared softDeleteTemplate transaction: stamp deleted_at, release aliases (names become reusable), and clear active_template_builds so concurrency limits drop deleted templates. Template delete and sandbox snapshot cleanup both use this path.

Reads and mutations that should only see live templates now go through the active_envs view (sqlc embeds ActiveEnv instead of Env). CreateOrUpdateTemplate skips updates when deleted_at is set, so rebuilds return 404 instead of resurrecting a deleted ID. CreateTemplateBuildAssignment only inserts when the env is active (with locking) and callers treat 0 rows as template gone.

Listing, tag assignment, build pagination, and snapshot queries were updated to join active_envs so soft-deleted envs are excluded; alias-vs-ID conflict checks still scan envs so IDs stay reserved while the row exists.

Reviewed by Cursor Bugbot for commit c4aa658. Bugbot is set up for automated code reviews on this repo. Configure here.

@codecov

codecov Bot commented Jun 28, 2026

Copy link
Copy Markdown

❌ 5 Tests Failed:

Tests completed Failed Passed Skipped
3098 5 3093 8
View the top 3 failed test(s) by shortest run time
github.com/e2b-dev/infra/tests/integration/internal/tests/api/templates::TestTemplateBuildRUN
Stack Traces | 0s run time
=== RUN   TestTemplateBuildRUN
=== PAUSE TestTemplateBuildRUN
=== CONT  TestTemplateBuildRUN
--- FAIL: TestTemplateBuildRUN (0.00s)
github.com/e2b-dev/infra/tests/integration/internal/tests/api/templates::TestDeleteTemplate
Stack Traces | 158s run time
=== RUN   TestDeleteTemplate
=== PAUSE TestDeleteTemplate
=== CONT  TestDeleteTemplate
    build_template_test.go:133: test-to-delete: [info] Building template n2in8c883ls0r209g9x1/a3d692df-da7c-4b86-af18-cf5456548e52
    build_template_test.go:133: test-to-delete: [info] [base] FROM ubuntu:22.04 [f9f564014e009a9561a82bf8c84f9314242971e833fb019936654ecba452f184]
    build_template_test.go:133: test-to-delete: [info] Base Docker image size: 30 MB
    build_template_test.go:133: test-to-delete: [info] Creating file system and pulling Docker image
    build_template_test.go:133: test-to-delete: [info] Uncompressing layer sha256:40d16f30db405106ef8074779bdf41f012465c2a785bbeaa2eab9f2081099b47 30 MB
    build_template_test.go:133: test-to-delete: [info] Uncompressing layer sha256:9da83fbd88767908f9025f4df7f37ed037884005b7a50e709c45fc78f392ff63 13 MB
    build_template_test.go:133: test-to-delete: [info] Uncompressing layer sha256:8c4b1b28875140ed3abacaf16ad0d696f6bef912f52d2148f261a23e3349465b 168 B
    build_template_test.go:133: test-to-delete: [info] Layers extracted
    build_template_test.go:133: test-to-delete: [info] Root filesystem structure: bin, boot, dev, etc, home, lib, lib32, lib64, libx32, media, mnt, opt, proc, root, run, sbin, srv, sys, tmp, usr, var
    build_template_test.go:133: test-to-delete: [info] Provisioning sandbox template
    build_template_test.go:133: test-to-delete: [info] Provisioning was successful, cleaning up
    build_template_test.go:133: test-to-delete: [info] Sandbox template provisioned
    build_template_test.go:133: test-to-delete: [info] [base] DEFAULT USER user [49e586c2171254c6bc4a09e84eedac32dbcf113a158c24248129af2f49cbed74]
    build_template_test.go:133: test-to-delete: [info] [builder 1/1] RUN echo 'Hello, World!' [c72b4f813c2a16b0fc1a1c5da7b1365a304cbac516b22dc304a71f70aae48ac0]
    build_template_test.go:133: test-to-delete: [info] [builder 1/1] [stdout]: Hello, World!
    build_template_test.go:133: test-to-delete: [info] [finalize] Finalizing template build [92c524e30533398ebb41ce04c2596130f0cdecc9aa328e28fdb16a1b11f61d62]
    build_template_test.go:133: test-to-delete: [error] Build failed: build was cancelled
    delete_template_test.go:18: Build failed: {<nil> build was cancelled <nil>}
--- FAIL: TestDeleteTemplate (158.02s)
github.com/e2b-dev/infra/tests/integration/internal/tests/api/templates::TestDeleteTemplateFromAnotherTeamAPIKey
Stack Traces | 158s run time
=== RUN   TestDeleteTemplateFromAnotherTeamAPIKey
=== PAUSE TestDeleteTemplateFromAnotherTeamAPIKey
=== CONT  TestDeleteTemplateFromAnotherTeamAPIKey
    build_template_test.go:133: test-to-delete-another-team-api-key: [info] Building template tqm541wrze673y0mwa6t/89172868-7146-42c7-a704-6732e1be3995
    build_template_test.go:133: test-to-delete-another-team-api-key: [info] [base] FROM ubuntu:22.04 [f9f564014e009a9561a82bf8c84f9314242971e833fb019936654ecba452f184]
    build_template_test.go:133: test-to-delete-another-team-api-key: [info] Base Docker image size: 30 MB
    build_template_test.go:133: test-to-delete-another-team-api-key: [info] Creating file system and pulling Docker image
    build_template_test.go:133: test-to-delete-another-team-api-key: [info] Uncompressing layer sha256:40d16f30db405106ef8074779bdf41f012465c2a785bbeaa2eab9f2081099b47 30 MB
    build_template_test.go:133: test-to-delete-another-team-api-key: [info] Uncompressing layer sha256:9da83fbd88767908f9025f4df7f37ed037884005b7a50e709c45fc78f392ff63 13 MB
    build_template_test.go:133: test-to-delete-another-team-api-key: [info] Uncompressing layer sha256:8c4b1b28875140ed3abacaf16ad0d696f6bef912f52d2148f261a23e3349465b 168 B
    build_template_test.go:133: test-to-delete-another-team-api-key: [info] Layers extracted
    build_template_test.go:133: test-to-delete-another-team-api-key: [info] Root filesystem structure: bin, boot, dev, etc, home, lib, lib32, lib64, libx32, media, mnt, opt, proc, root, run, sbin, srv, sys, tmp, usr, var
    build_template_test.go:133: test-to-delete-another-team-api-key: [info] Provisioning sandbox template
    build_template_test.go:133: test-to-delete-another-team-api-key: [info] Provisioning was successful, cleaning up
    build_template_test.go:133: test-to-delete-another-team-api-key: [info] Sandbox template provisioned
    build_template_test.go:133: test-to-delete-another-team-api-key: [info] [base] DEFAULT USER user [49e586c2171254c6bc4a09e84eedac32dbcf113a158c24248129af2f49cbed74]
    build_template_test.go:133: test-to-delete-another-team-api-key: [info] [builder 1/1] RUN echo 'Hello, World!' [c72b4f813c2a16b0fc1a1c5da7b1365a304cbac516b22dc304a71f70aae48ac0]
    build_template_test.go:133: test-to-delete-another-team-api-key: [info] [builder 1/1] [stdout]: Hello, World!
    build_template_test.go:133: test-to-delete-another-team-api-key: [info] [finalize] Finalizing template build [92c524e30533398ebb41ce04c2596130f0cdecc9aa328e28fdb16a1b11f61d62]
    build_template_test.go:133: test-to-delete-another-team-api-key: [error] Build failed: build was cancelled
    delete_template_test.go:51: Build failed: {<nil> build was cancelled <nil>}
--- FAIL: TestDeleteTemplateFromAnotherTeamAPIKey (158.02s)
github.com/e2b-dev/infra/tests/integration/internal/tests/api/templates::TestTemplateBuildRUN/Single_RUN_command
Stack Traces | 169s run time
=== RUN   TestTemplateBuildRUN/Single_RUN_command
=== PAUSE TestTemplateBuildRUN/Single_RUN_command
=== CONT  TestTemplateBuildRUN/Single_RUN_command
    build_template_test.go:133: test-ubuntu-run: [info] Building template eagkwhq889qgj6cxmbtp/00821732-4246-4193-b763-4af71cc84216
    build_template_test.go:133: test-ubuntu-run: [info] [base] FROM ubuntu:22.04 [f9f564014e009a9561a82bf8c84f9314242971e833fb019936654ecba452f184]
    build_template_test.go:133: test-ubuntu-run: [info] Base Docker image size: 30 MB
    build_template_test.go:133: test-ubuntu-run: [info] Creating file system and pulling Docker image
    build_template_test.go:133: test-ubuntu-run: [info] Uncompressing layer sha256:40d16f30db405106ef8074779bdf41f012465c2a785bbeaa2eab9f2081099b47 30 MB
    build_template_test.go:133: test-ubuntu-run: [info] Uncompressing layer sha256:9da83fbd88767908f9025f4df7f37ed037884005b7a50e709c45fc78f392ff63 13 MB
    build_template_test.go:133: test-ubuntu-run: [info] Uncompressing layer sha256:8c4b1b28875140ed3abacaf16ad0d696f6bef912f52d2148f261a23e3349465b 168 B
    build_template_test.go:133: test-ubuntu-run: [info] Layers extracted
    build_template_test.go:133: test-ubuntu-run: [info] Root filesystem structure: bin, boot, dev, etc, home, lib, lib32, lib64, libx32, media, mnt, opt, proc, root, run, sbin, srv, sys, tmp, usr, var
    build_template_test.go:133: test-ubuntu-run: [info] Provisioning sandbox template
    build_template_test.go:133: test-ubuntu-run: [info] Provisioning was successful, cleaning up
    build_template_test.go:133: test-ubuntu-run: [info] Sandbox template provisioned
    build_template_test.go:133: test-ubuntu-run: [info] [base] DEFAULT USER user [49e586c2171254c6bc4a09e84eedac32dbcf113a158c24248129af2f49cbed74]
    build_template_test.go:133: test-ubuntu-run: [info] [builder 1/1] RUN echo 'Hello, World!' [c72b4f813c2a16b0fc1a1c5da7b1365a304cbac516b22dc304a71f70aae48ac0]
    build_template_test.go:133: test-ubuntu-run: [info] [builder 1/1] [stdout]: Hello, World!
    build_template_test.go:133: test-ubuntu-run: [info] [finalize] Finalizing template build [92c524e30533398ebb41ce04c2596130f0cdecc9aa328e28fdb16a1b11f61d62]
    build_template_test.go:133: test-ubuntu-run: [error] Build failed: build was cancelled
    build_template_test.go:166: Build failed: {<nil> build was cancelled <nil>}
--- FAIL: TestTemplateBuildRUN/Single_RUN_command (169.06s)
github.com/e2b-dev/infra/tests/integration/internal/tests/envd::TestCommandKillNextApp
Stack Traces | 265s run time
=== RUN   TestCommandKillNextApp
=== PAUSE TestCommandKillNextApp
=== CONT  TestCommandKillNextApp
Executing command /bin/bash in sandbox i5cv5b96zgk2rrtjvn1v4
    process_test.go:30: Build failed: {<nil> build was cancelled <nil>}
--- FAIL: TestCommandKillNextApp (264.56s)

To view more test analytics, go to the Test Analytics Dashboard
📋 Got 3 mins? Take this short survey to help us improve Test Analytics.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

Overwriting finished_at and reason for already completed or failed builds during soft-deletion destroys historical build metrics and diagnostic failure reasons. Using COALESCE in the SQL update query ensures that the original completion timestamp and failure details are preserved for completed builds, while still setting them for active or pending builds.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread packages/db/queries/builds/mark_exclusive_template_builds_deleted.sql Outdated

@cursor cursor Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Autofix Details

Bugbot Autofix prepared a fix for the issue found in the latest run.

  • ✅ Fixed: In-flight sync clears deleted status
    • Added status_group <> 'deleted' guard to all three build update queries to prevent concurrent template-manager operations from overwriting soft-deleted builds.

Create PR

Or push these changes by commenting:

@cursor push c7c7b1b319
Preview (c7c7b1b319)
diff --git a/packages/db/queries/builds/finish_template_build.sql b/packages/db/queries/builds/finish_template_build.sql
--- a/packages/db/queries/builds/finish_template_build.sql
+++ b/packages/db/queries/builds/finish_template_build.sql
@@ -16,4 +16,5 @@
     kernel_version = COALESCE(NULLIF(@kernel_version::text, ''), kernel_version),
     firecracker_version = COALESCE(NULLIF(@firecracker_version::text, ''), firecracker_version)
 WHERE
-    id = @build_id;
+    id = @build_id
+    AND status_group <> 'deleted';

diff --git a/packages/db/queries/finish_template_build.sql.go b/packages/db/queries/finish_template_build.sql.go
--- a/packages/db/queries/finish_template_build.sql.go
+++ b/packages/db/queries/finish_template_build.sql.go
@@ -26,6 +26,7 @@
     firecracker_version = COALESCE(NULLIF($5::text, ''), firecracker_version)
 WHERE
     id = $6
+    AND status_group <> 'deleted'
 `
 
 type FinishTemplateBuildParams struct {

diff --git a/packages/db/queries/templates/update_template_build_status.sql b/packages/db/queries/templates/update_template_build_status.sql
--- a/packages/db/queries/templates/update_template_build_status.sql
+++ b/packages/db/queries/templates/update_template_build_status.sql
@@ -4,7 +4,8 @@
     finished_at = @finished_at,
     reason = sqlc.narg(reason),
     version = @version
-WHERE id = @build_id;
+WHERE id = @build_id
+    AND status_group <> 'deleted';
 
 -- name: FailTemplateBuildAndDeactivate :exec
 WITH deactivated AS (
@@ -15,4 +16,5 @@
     finished_at = @finished_at,
     reason = sqlc.narg(reason),
     version = @version
-WHERE id = @build_id;
+WHERE id = @build_id
+    AND status_group <> 'deleted';

diff --git a/packages/db/queries/update_template_build_status.sql.go b/packages/db/queries/update_template_build_status.sql.go
--- a/packages/db/queries/update_template_build_status.sql.go
+++ b/packages/db/queries/update_template_build_status.sql.go
@@ -23,6 +23,7 @@
     reason = $3,
     version = $4
 WHERE id = $5
+    AND status_group <> 'deleted'
 `
 
 type FailTemplateBuildAndDeactivateParams struct {
@@ -51,6 +52,7 @@
     reason = $3,
     version = $4
 WHERE id = $5
+    AND status_group <> 'deleted'
 `
 
 type UpdateEnvBuildStatusParams struct {

You can send follow-ups to the cloud agent here.

Comment thread packages/db/queries/builds/mark_exclusive_template_builds_deleted.sql Outdated
@ValentaTomas

Copy link
Copy Markdown
Member Author

Used for demonstration. Referenced (related) from Linear EN-1185.

@linear-code

linear-code Bot commented Jun 29, 2026

Copy link
Copy Markdown

EN-1185

Add status_group <> 'deleted' guard to the build status update queries so a
concurrent template-manager sync cannot resurrect a soft-deleted build, and
preserve original finished_at/reason when marking builds deleted.
@ValentaTomas ValentaTomas marked this pull request as ready for review June 29, 2026 06:26
@ValentaTomas ValentaTomas marked this pull request as draft June 29, 2026 06:32
Replace the build-level soft-delete with an env-level one: deleting a
template, paused sandbox, or snapshot marks envs.status='deleted' instead
of hard-deleting the env, preserving env_builds, env_build_assignments,
and snapshots rows (sandbox_id/base_env_id) so build lineage stays
traceable for a later storage GC. Aliases are released so names can be
reused. Template/snapshot/build reads in the API and dashboard filter out
deleted envs.

@cursor cursor Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Autofix Details

Bugbot Autofix prepared a fix for the issue found in the latest run.

  • ✅ Fixed: In-flight builds survive template delete
    • Added explicit DELETE CTE for active_template_builds in DeleteTemplate since ON DELETE CASCADE no longer fires with soft-delete.

Create PR

Or push these changes by commenting:

@cursor push f0db1d64ea
Preview (f0db1d64ea)
diff --git a/packages/db/queries/delete_template.sql.go b/packages/db/queries/delete_template.sql.go
--- a/packages/db/queries/delete_template.sql.go
+++ b/packages/db/queries/delete_template.sql.go
@@ -22,6 +22,9 @@
 ), released AS (
   DELETE FROM public.env_aliases ea
   WHERE ea.env_id = $1
+), active_builds_cleanup AS (
+  DELETE FROM public.active_template_builds atb
+  WHERE atb.template_id = $1
 ), updated AS (
   UPDATE public.envs e
   SET status = 'deleted', updated_at = NOW()

diff --git a/packages/db/queries/templates/delete_template.sql b/packages/db/queries/templates/delete_template.sql
--- a/packages/db/queries/templates/delete_template.sql
+++ b/packages/db/queries/templates/delete_template.sql
@@ -13,6 +13,9 @@
 ), released AS (
   DELETE FROM public.env_aliases ea
   WHERE ea.env_id = @template_id
+), active_builds_cleanup AS (
+  DELETE FROM public.active_template_builds atb
+  WHERE atb.template_id = @template_id
 ), updated AS (
   UPDATE public.envs e
   SET status = 'deleted', updated_at = NOW()

You can send follow-ups to the cloud agent here.

Comment thread packages/db/queries/templates/delete_template.sql Outdated

@claude claude Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additional findings (outside current diff — PR may have been updated during review):

  • 🔴 packages/db/pkg/types/types.go:153-155 — After this PR, POST /builds/statuses reports soft-deleted builds as status="building" indefinitely. The new BuildStatusGroupDeleted is not handled by MapBuildStatusFromDBStatusGroup in packages/dashboard-api/internal/utils/builds.go, so deleted rows fall to the default api.Building branch. Add explicit case dbtypes.BuildStatusGroupDeleted (and BuildStatusDeleted in MapBuildStatusFromDBStatus for symmetry) mapping to a terminal status, and surface the deletion reason in MapBuildStatusMessageFromDBStatusGroup.

    Extended reasoning...

    Bug

    This PR introduces a new BuildStatusGroup value (deleted) in packages/db/pkg/types/types.go:167 and writes it to env_builds.status_group whenever a template is deleted (via MarkExclusiveTemplateBuildsDeleted). However, the dashboard-api status mapper in packages/dashboard-api/internal/utils/builds.go:68-77 was not updated:

    func MapBuildStatusFromDBStatusGroup(status dbtypes.BuildStatusGroup) api.BuildStatus {
        switch status {
        case dbtypes.BuildStatusGroupFailed:
            return api.Failed
        case dbtypes.BuildStatusGroupReady:
            return api.Success
        default:
            return api.Building   // <-- 'deleted' falls here
        }
    }

    How the deleted row reaches this mapper

    Of the three dashboard endpoints that map build statuses, two filter the soft-deleted row out via JOIN LATERAL env_build_assignments (which DeleteTemplate cascade-removes). But the POST /builds/statuses path does not:

    packages/db/queries/builds/get_builds_statuses.sql is:

    SELECT b.id, b.status_group, b.reason, b.finished_at
    FROM public.env_builds b
    WHERE b.team_id = sqlc.arg(team_id)::uuid
      AND b.id = ANY(...);

    No join, no status_group filter. The env_builds.team_id column (migration 20260218120000) is populated by a trigger when the assignment was created and is never cleared on cascade delete. So the soft-deleted build row stays visible to GetBuildsStatusesByTeam keyed by (team_id, build_id).

    packages/dashboard-api/internal/handlers/builds_statuses.go:53 then maps record.StatusGroup (= 'deleted') through MapBuildStatusFromDBStatusGroup, hits the default branch, and returns api.Building. MapBuildStatusMessageFromDBStatusGroup (builds.go:91-101) gates on BuildStatusGroupFailed only, so the diagnostic reason ("Template deleted by user" / "Snapshot deleted by user") is also suppressed.

    Why this is a regression introduced by this PR

    Before this PR, no code wrote status='deleted', so the default branch was never reached for a terminal build. After this PR, the same builds that previously stayed at success/failed (and reported as api.Success/api.Failed) now flip to deleted and report as api.Building — and they will never transition back, because this PR also adds AND status_group <> 'deleted' guards on FinishTemplateBuild/UpdateEnvBuildStatus/FailTemplateBuildAndDeactivate to prevent any future status writes from clearing the soft-delete.

    Step-by-step proof

    1. Client creates a template. Build B completes; env_builds(B).status='success', status_group='ready', team_id=T (set by trigger when assignment was inserted).
    2. User calls DELETE /templates/{id}. deleteTemplateAndSoftDeleteBuilds runs:
      • MarkExclusiveTemplateBuildsDeleted sets env_builds(B).status='deleted' → trigger compute_status_group (migration 20260627120000) sets status_group='deleted'.
      • DeleteTemplate cascade-deletes env_build_assignments and envs rows for this template. env_builds(B) is preserved with team_id=T intact.
    3. Client calls POST /builds/statuses { build_ids: [B] }. GetBuildsStatusesByTeam runs the SELECT above with team_id=T and finds row B with status_group='deleted'.
    4. Handler at builds_statuses.go:53 calls MapBuildStatusFromDBStatusGroup(BuildStatusGroupDeleted) → default → returns api.Building. StatusMessage is nil (the "Template deleted by user" reason is filtered out by the != Failed check at builds.go:92).
    5. Client receives { id: B, status: "building", status_message: null } — a build that previously reported success now reports as in-flight, with no diagnostic, and the new status_group <> 'deleted' guards ensure no template-manager update will ever change that.

    Fix

    In packages/dashboard-api/internal/utils/builds.go, add an explicit case dbtypes.BuildStatusGroupDeleted (and BuildStatusDeleted in MapBuildStatusFromDBStatus for symmetry, even though that path is currently masked by the JOIN LATERAL in GetBuildInfoByTeamAndBuildID — keeping them in lockstep prevents future regressions). Map it to a new api.Deleted enum value (preferred — most informative) or to api.Failed (no API schema change). Extend MapBuildStatusMessageFromDBStatusGroup to surface the deletion reason for the deleted group too.

  • 🔴 packages/db/queries/builds/mark_exclusive_template_builds_deleted.sql:6-10 — The new COALESCE(NULLIF(eb.reason, '{}'::jsonb), @reason) only treats a literal {} as missing, but types.BuildReason{} serializes to {"message":""} (the Message field has no omitempty), and that's what successful template/snapshot builds write via UpdateEnvBuildStatus (snapshot_template.go:115, pause_instance.go:75). So for the common case — soft-deleting a successful build — NULLIF returns the existing {"message":""}, COALESCE keeps it, and the deletion reason ("Template deleted by user" / "Snapshot deleted by user") is silently dropped, defeating the stated intent of this fix. Consider widening the empty check, e.g. reason = CASE WHEN COALESCE(eb.reason->>'message','') = '' THEN @reason ELSE eb.reason END.

    Extended reasoning...

    What's broken. The mark query added in f76f755 uses reason = COALESCE(NULLIF(eb.reason, '{}'::jsonb), @reason) to preserve any existing diagnostic on env_builds.reason and only fall back to the deletion reason when reason is empty. But the only jsonb value NULLIF will treat as empty here is the literal {} — and that is only the column DEFAULT set in migration 20250905134524_fix_env_builds_non_null_fields.sql. Any row whose status was ever touched by the Go layer goes through types.BuildReason, whose Message field is tagged json:"message" with no omitempty (packages/db/pkg/types/types.go:52-58). So json.Marshal(types.BuildReason{}) produces {"message":""}, not {}.

    Where the empty-message reason is written on success. UpdateEnvBuildStatus is called with Reason: types.BuildReason{} from at least two success paths:

    • packages/api/internal/orchestrator/snapshot_template.go:115 — on snapshot upload success (status=uploaded).
    • packages/api/internal/orchestrator/pause_instance.go:75 — on pause success (status=success).

    Both rows end up in the DB with reason = {"message":""}. This is the common state for any successful template or snapshot build, not an edge case.

    Why the guard fails. Postgres jsonb equality is structural — '{"message":""}'::jsonb and '{}'::jsonb are not equal, so NULLIF('{"message":""}', '{}') returns {"message":""} (not NULL). COALESCE then returns that non-null value and the supplied @reason parameter ("Template deleted by user" or "Snapshot deleted by user") is discarded.

    Step-by-step proof for a user deleting a successful template.

    1. Template build runs, finishes successfully → UpdateEnvBuildStatus(Status: Uploaded, Reason: types.BuildReason{}) fires from snapshot_template.go:115.
    2. pgx encodes types.BuildReason{} via Value()json.Marshal{"message":""}. Row now has reason = '{"message":""}'::jsonb.
    3. User deletes the template. deleteTemplateAndSoftDeleteBuilds calls MarkExclusiveTemplateBuildsDeleted with Reason: dbtypes.BuildReason{Message: "Template deleted by user"}.
    4. The SQL evaluates NULLIF('{"message":""}'::jsonb, '{}'::jsonb) → returns {"message":""} (operands are not equal).
    5. COALESCE('{"message":""}', '{"message":"Template deleted by user"}') → returns {"message":""}.
    6. Row is committed with status='deleted' but reason='{"message":""}'. The deletion reason is silently dropped.

    Why the test suite didn't catch it. packages/db/pkg/tests/builds/mark_exclusive_template_builds_deleted_test.go uses testutils.CreateTestBuild which inserts a row without setting reason — so the row keeps the column default {}::jsonb, the only value where NULLIF actually flips to NULL. The tests therefore only exercise the path where the fallback fires.

    Impact. The status field still flips to 'deleted', so soft-delete itself works and storage GC can still find orphaned rows by status. But the reason field that this PR added to give operators / GC a human-readable signal (user-deleted vs failed vs successful-then-stranded) is wrong for the most common case — every successful template/snapshot that a user deletes will look like {"message":""} in the audit trail, indistinguishable from a row that was never assigned a reason. This directly contradicts the maintainer comment on the resolved review thread ("completed/failed builds retain their timestamp and diagnostics"): for the success case the preserved "diagnostic" is just an empty string, and the actual deletion reason is lost.

    Fix. Treat any reason whose message is empty as missing:

    reason = CASE WHEN COALESCE(eb.reason->>'message', '') = '' THEN @reason ELSE eb.reason END,

    Alternatively, add omitempty to BuildReason.Message so BuildReason{} serializes to {} — but that has broader implications and the SQL-side fix is more targeted to the soft-delete intent.

…elete

Soft-delete no longer removes the env row, so active_template_builds (was
ON DELETE CASCADE) lingered, inflating team concurrency counts and keeping
the template-manager sync busy. DeleteTemplate now clears those rows, and
GetInProgressTemplateBuilds ignores deleted envs.

@cursor cursor Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Autofix Details

Bugbot Autofix prepared fixes for both issues found in the latest run.

  • ✅ Fixed: Upsert leaves env status deleted
    • Added status='active' to the ON CONFLICT UPDATE clause in CreateOrUpdateTemplate to reset deleted envs when they are rebuilt.
  • ✅ Fixed: Alias blocked by deleted env id
    • Added status <> 'deleted' filter to CheckAliasConflictsWithTemplateID to exclude soft-deleted envs from blocking alias reuse.

Create PR

Or push these changes by commenting:

@cursor push 88413ac5aa
Preview (88413ac5aa)
diff --git a/packages/db/queries/template_aliases/check_alias_exists.sql b/packages/db/queries/template_aliases/check_alias_exists.sql
--- a/packages/db/queries/template_aliases/check_alias_exists.sql
+++ b/packages/db/queries/template_aliases/check_alias_exists.sql
@@ -3,6 +3,7 @@
     SELECT 1
     FROM "public"."envs"
     WHERE id = @alias
+      AND status <> 'deleted'
 );
 
 -- name: CheckAliasExistsInNamespace :one

diff --git a/packages/db/queries/templates/create_template.sql b/packages/db/queries/templates/create_template.sql
--- a/packages/db/queries/templates/create_template.sql
+++ b/packages/db/queries/templates/create_template.sql
@@ -3,7 +3,8 @@
 VALUES (@template_id, @team_id, @created_by, NOW(), FALSE, @cluster_id, 'template')
 ON CONFLICT (id) DO UPDATE
 SET updated_at  = NOW(),
-    build_count = envs.build_count + 1;
+    build_count = envs.build_count + 1,
+    status      = 'active';
 
 -- name: InvalidateUnstartedTemplateBuilds :exec
 WITH invalidated AS (

You can send follow-ups to the cloud agent here.

Comment thread packages/db/queries/templates/delete_template.sql Outdated
Comment thread packages/db/queries/templates/delete_template.sql Outdated
…onflict

Soft-delete keeps the env row, so rebuilding into the same template id must
reset status to 'active' (CreateOrUpdateTemplate ON CONFLICT), and the
alias/template-id conflict check must ignore deleted envs so a released
name stays reusable.
Replace the envs.status text column with a boolean envs.deleted, since the
state is strictly binary. Reads filter deleted = false; delete sets it true.
@ValentaTomas ValentaTomas marked this pull request as ready for review June 29, 2026 06:57

@cursor cursor Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Autofix Details

Bugbot Autofix prepared a fix for the issue found in the latest run.

  • ✅ Fixed: Kill leaves orphaned snapshot rows
    • Added explicit DeleteSnapshot call in deleteSnapshot() to remove snapshot row after soft-delete, preventing orphaned rows that would cause UpsertSnapshot to skip creating a new env.

Create PR

Or push these changes by commenting:

@cursor push 4e4425c3b7
Preview (4e4425c3b7)
diff --git a/packages/api/internal/db/snapshots.go b/packages/api/internal/db/snapshots.go
--- a/packages/api/internal/db/snapshots.go
+++ b/packages/api/internal/db/snapshots.go
@@ -47,3 +47,10 @@
 
 	return snapshot, nil
 }
+
+func DeleteSnapshot(ctx context.Context, db *sqlcdb.Client, teamID uuid.UUID, sandboxID string) error {
+	return db.DeleteSnapshot(ctx, queries.DeleteSnapshotParams{
+		SandboxID: sandboxID,
+		TeamID:    teamID,
+	})
+}

diff --git a/packages/api/internal/handlers/sandbox_kill.go b/packages/api/internal/handlers/sandbox_kill.go
--- a/packages/api/internal/handlers/sandbox_kill.go
+++ b/packages/api/internal/handlers/sandbox_kill.go
@@ -29,6 +29,12 @@
 		return fmt.Errorf("error deleting template from db: %w", dbErr)
 	}
 
+	// Delete the snapshot row to prevent orphaned rows after soft-delete.
+	// Soft-delete on envs no longer triggers ON DELETE CASCADE, so explicit cleanup is needed.
+	if err := db.DeleteSnapshot(ctx, a.sqlcDB, teamID, sandboxID); err != nil {
+		return fmt.Errorf("error deleting snapshot row: %w", err)
+	}
+
 	a.templateCache.InvalidateAllTags(context.WithoutCancel(ctx), snapshot.TemplateID)
 	a.templateCache.InvalidateAliasesByTemplateID(context.WithoutCancel(ctx), snapshot.TemplateID, aliasKeys)
 	a.snapshotCache.Invalidate(context.WithoutCancel(ctx), sandboxID)

diff --git a/packages/db/queries/delete_snapshot.sql.go b/packages/db/queries/delete_snapshot.sql.go
new file mode 100644
--- /dev/null
+++ b/packages/db/queries/delete_snapshot.sql.go
@@ -1,0 +1,28 @@
+// Code generated by sqlc. DO NOT EDIT.
+// versions:
+//   sqlc v1.29.0
+// source: delete_snapshot.sql
+
+package queries
+
+import (
+	"context"
+
+	"github.com/google/uuid"
+)
+
+const deleteSnapshot = `-- name: DeleteSnapshot :exec
+DELETE FROM "public"."snapshots"
+WHERE sandbox_id = $1
+AND team_id = $2
+`
+
+type DeleteSnapshotParams struct {
+	SandboxID string
+	TeamID    uuid.UUID
+}
+
+func (q *Queries) DeleteSnapshot(ctx context.Context, arg DeleteSnapshotParams) error {
+	_, err := q.db.Exec(ctx, deleteSnapshot, arg.SandboxID, arg.TeamID)
+	return err
+}

diff --git a/packages/db/queries/snapshots/delete_snapshot.sql b/packages/db/queries/snapshots/delete_snapshot.sql
new file mode 100644
--- /dev/null
+++ b/packages/db/queries/snapshots/delete_snapshot.sql
@@ -1,0 +1,4 @@
+-- name: DeleteSnapshot :exec
+DELETE FROM "public"."snapshots"
+WHERE sandbox_id = @sandbox_id
+AND team_id = @team_id;

You can send follow-ups to the cloud agent here.

Comment thread packages/api/internal/handlers/sandbox_kill.go

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e37b771b39

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread packages/db/queries/templates/delete_template.sql Outdated
…t env

ExistsWaitingTemplateBuild now requires e.deleted = false so the docker
reverse proxy can't authorize layer uploads for a deleted template's
pending build. UpsertSnapshot reactivates the env when re-pausing reuses a
previously soft-deleted snapshot env.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0aab1a503a

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread packages/db/queries/templates/create_template.sql Outdated
Comment thread packages/api/internal/handlers/sandbox_kill.go
Comment thread packages/api/internal/handlers/template_delete.go Outdated
A retry of DELETE /sandboxes/{id} on an already soft-deleted snapshot
returned 204 instead of 404 because GetSnapshotBuilds still matched the
soft-deleted env. Filter e.deleted = false to restore not-found behavior,
and correct the softDeleteTemplate doc comment.
Soft-delete logic lives in the DeleteTemplate query; the handlers just call
it directly like before, keeping the diff minimal.
Move the remaining sqlc.embed(e) reads onto the active_envs view too, so the
soft-delete filter lives only in the view and the write paths. The embedded
field is now ActiveEnv (consumers updated). GetTemplateByID stays on envs
unfiltered so rebuild can reactivate a soft-deleted template.

@cursor cursor Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Autofix Details

Bugbot Autofix prepared a fix for the issue found in the latest run.

  • ✅ Fixed: Reactivates soft-deleted snapshot envs
    • Added WHERE envs.source = 'template' clause to CreateOrUpdateTemplate's conflict update to prevent reactivation of snapshot_template envs while preserving rebuild capability for regular templates.

Create PR

Or push these changes by commenting:

@cursor push 6c4eea8aa5
Preview (6c4eea8aa5)
diff --git a/packages/db/queries/create_template.sql.go b/packages/db/queries/create_template.sql.go
--- a/packages/db/queries/create_template.sql.go
+++ b/packages/db/queries/create_template.sql.go
@@ -19,6 +19,7 @@
 SET updated_at  = NOW(),
     deleted_at  = NULL,
     build_count = envs.build_count + 1
+WHERE envs.source = 'template'
 `
 
 type CreateOrUpdateTemplateParams struct {

diff --git a/packages/db/queries/templates/create_template.sql b/packages/db/queries/templates/create_template.sql
--- a/packages/db/queries/templates/create_template.sql
+++ b/packages/db/queries/templates/create_template.sql
@@ -4,7 +4,8 @@
 ON CONFLICT (id) DO UPDATE
 SET updated_at  = NOW(),
     deleted_at  = NULL,
-    build_count = envs.build_count + 1;
+    build_count = envs.build_count + 1
+WHERE envs.source = 'template';
 
 -- name: InvalidateUnstartedTemplateBuilds :exec
 WITH invalidated AS (

You can send follow-ups to the cloud agent here.

Comment thread packages/db/queries/templates/create_template.sql Outdated

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c6b9858523

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread packages/db/queries/templates/delete_template.sql Outdated
Comment thread packages/db/queries/template_aliases/check_alias_exists.sql Outdated
…served

CreateOrUpdateTemplate only clears deleted_at for source='template', so a
template build registration can't resurrect a soft-deleted snapshot env.
CheckAliasConflictsWithTemplateID checks envs (not active_envs) so a
soft-deleted (reactivatable) env id still blocks an aliasing collision.

@cursor cursor Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Autofix Details

Bugbot Autofix prepared a fix for the issue found in the latest run.

  • ✅ Fixed: Snapshot template rebuild stays soft-deleted
    • Changed the WHERE clause in CreateOrUpdateTemplate to include both 'template' and 'snapshot_template' sources so that soft-deleted snapshot templates can be properly reactivated on rebuild.

Create PR

Or push these changes by commenting:

@cursor push 44b4192d95
Preview (44b4192d95)
diff --git a/packages/db/queries/create_template.sql.go b/packages/db/queries/create_template.sql.go
--- a/packages/db/queries/create_template.sql.go
+++ b/packages/db/queries/create_template.sql.go
@@ -19,7 +19,7 @@
 SET updated_at  = NOW(),
     deleted_at  = NULL,
     build_count = envs.build_count + 1
-WHERE envs.source = 'template'
+WHERE envs.source IN ('template', 'snapshot_template')
 `
 
 type CreateOrUpdateTemplateParams struct {

diff --git a/packages/db/queries/templates/create_template.sql b/packages/db/queries/templates/create_template.sql
--- a/packages/db/queries/templates/create_template.sql
+++ b/packages/db/queries/templates/create_template.sql
@@ -6,7 +6,7 @@
     deleted_at  = NULL,
     build_count = envs.build_count + 1
 -- Only reactivate template envs; never resurrect a soft-deleted snapshot env.
-WHERE envs.source = 'template';
+WHERE envs.source IN ('template', 'snapshot_template');
 
 -- name: InvalidateUnstartedTemplateBuilds :exec
 WITH invalidated AS (

You can send follow-ups to the cloud agent here.

Comment thread packages/db/queries/templates/create_template.sql Outdated
…eted

Disambiguates from the active-envs GetTemplateById (case-only difference was
a footgun now that one includes soft-deleted envs for rebuild and the other
doesn't).
Drop the rebuild reactivation: CreateOrUpdateTemplate no longer clears
deleted_at and skips the update for a soft-deleted env (returns no row), so
RegisterBuild fails with 404 instead of resurrecting it. Removed the
unfiltered GetTemplateByIDIncludingDeleted lookup; all resolves go through the
active_envs view. Alias/name reuse is unaffected.
With soft-delete now permanent (no reactivation), a deleted env id is dead and
unresolvable, so the alias/template-id conflict check goes through the view
like every other read.

@cursor cursor Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

Bugbot Autofix prepared a fix for the issue found in the latest run.

  • ✅ Fixed: Alias check ignores deleted envs
    • Changed CheckAliasConflictsWithTemplateID to query envs table instead of active_envs view so soft-deleted template IDs remain reserved and cannot be used as aliases.

Create PR

Or push these changes by commenting:

@cursor push 0afc5075f4
Preview (0afc5075f4)
diff --git a/packages/db/queries/check_alias_exists.sql.go b/packages/db/queries/check_alias_exists.sql.go
--- a/packages/db/queries/check_alias_exists.sql.go
+++ b/packages/db/queries/check_alias_exists.sql.go
@@ -12,7 +12,7 @@
 const checkAliasConflictsWithTemplateID = `-- name: CheckAliasConflictsWithTemplateID :one
 SELECT EXISTS(
     SELECT 1
-    FROM "public"."active_envs"
+    FROM "public"."envs"
     WHERE id = $1
 )
 `

diff --git a/packages/db/queries/template_aliases/check_alias_exists.sql b/packages/db/queries/template_aliases/check_alias_exists.sql
--- a/packages/db/queries/template_aliases/check_alias_exists.sql
+++ b/packages/db/queries/template_aliases/check_alias_exists.sql
@@ -1,7 +1,7 @@
 -- name: CheckAliasConflictsWithTemplateID :one
 SELECT EXISTS(
     SELECT 1
-    FROM "public"."active_envs"
+    FROM "public"."envs"
     WHERE id = @alias
 );

You can send follow-ups to the cloud agent here.

Reviewed by Cursor Bugbot for commit e9440cc. Configure here.

Comment thread packages/db/queries/template_aliases/check_alias_exists.sql Outdated
@ValentaTomas ValentaTomas requested review from dobrac and wj-e2b June 29, 2026 23:36

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ad23311802

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread packages/db/queries/templates/delete_template.sql Outdated
Soft-delete keeps the envs row, so the FK no longer rejects assigning a build
to a concurrently-deleted env. Guard CreateTemplateBuildAssignment on
active_envs (now :execrows); snapshot-template and tag-assignment paths treat
0 rows as template-not-found.
… return

The query is now :execrows; the test setup helper must read the rows value.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 11cfc5472f

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread packages/db/queries/templates/create_template_build_assignment.sql Outdated
Comment thread packages/db/pkg/dashboard/sql_queries/templates/list_team_templates.sql Outdated

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 57e699184c

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread packages/db/queries/templates/delete_template.sql Outdated
ValentaTomas and others added 2 commits June 30, 2026 12:39
…uild

DeleteTemplate was a single CTE: when its env-lock UPDATE waited on an
in-flight build registration, the active_template_builds DELETE used the stale
snapshot and missed the just-committed active row, leaving a deleted template
counted toward concurrency. Split into SoftDeleteTemplate + ReleaseTemplateAliases
+ DeleteActiveTemplateBuilds run in a transaction, so the cleanup statements get
fresh snapshots after the env lock is taken.
@ValentaTomas ValentaTomas enabled auto-merge (squash) June 30, 2026 19:53
@ValentaTomas ValentaTomas merged commit ee88776 into main Jun 30, 2026
53 checks passed
@ValentaTomas ValentaTomas deleted the pr/db-soft-delete-builds branch June 30, 2026 20:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants