Skip to content

Delete sitreps non-transactionally with improved pagination#10210

Merged
smklein merged 6 commits intomainfrom
sitrep-delete-better
Apr 14, 2026
Merged

Delete sitreps non-transactionally with improved pagination#10210
smklein merged 6 commits intomainfrom
sitrep-delete-better

Conversation

@smklein
Copy link
Copy Markdown
Collaborator

@smklein smklein commented Apr 2, 2026

Follow-up to #10143

Adds pagination during sitrep garbage collection.

While I was there, I realized that we actually don't need transactions on the delete pathway anymore.

  • The delete operation (within the garbage collection pass) deletes sitreps by deleting out-of-history fm_sitrep rows, which immediately orphans all other sub-tables within the sitrep.
  • All the "concurrent reads" of sitreps go through fm_sitrep_read_on_conn, which reads metadata from fm_sitrep last. If this sitrep can be read: it has not been deleted yet. If this sitrep cannot be read: it has been deleted, and prior reads can be discarded.
  • All the "concurrent inserts" of sitreps are contingent on the parent sitrep not being stale. So: because sitrep insert writes to the fm_sitrep table first, the child rows are "not orphaned" (won't be GC-ed). This protection lasts for the duration of fm_sitrep_insert, OR until the parent sitrep is marked stale - at which point insert should fail anyway.

All this is to say: the "read" and "insert" pathways function fine if a fm_sitrep row is deleted non-atomically before subsequent child rows. Therefore: no transaction is necessary here.

@smklein smklein requested review from hawkw and mergeconflict April 2, 2026 20:49
@hawkw
Copy link
Copy Markdown
Member

hawkw commented Apr 2, 2026

All this is to say: the "read" and "insert" pathways function fine if a fm_sitrep row is deleted non-atomically before subsequent child rows. Therefore: no transaction is necessary here.

zoom zoom!

Comment thread nexus/db-queries/src/db/datastore/fm.rs
Comment thread nexus/db-queries/src/db/datastore/fm.rs
Comment thread nexus/db-queries/src/db/datastore/fm.rs
Comment thread nexus/db-queries/src/db/datastore/fm.rs Outdated
Comment thread nexus/db-queries/src/db/datastore/fm.rs
Comment thread nexus/db-queries/src/db/datastore/fm.rs
Comment thread nexus/db-queries/src/db/datastore/fm.rs
Comment thread nexus/db-queries/src/db/datastore/fm.rs Outdated
Comment on lines +3116 to +3120
// Join all tasks — a panic in any task (from assert_sitreps_eq)
// means we detected a torn read.
for handle in handles {
handle.await.expect("task panicked");
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i believe this may not actually be necessary if the tests are being compiled with panic = "abort"? but probably good to do anyway

Comment thread nexus/db-queries/src/db/datastore/fm.rs Outdated
Comment thread nexus/db-queries/src/db/datastore/fm.rs
/// *complete* sitrep (no torn reads). Errors (e.g. `NotFound`) are
/// expected and fine — partial data is not.
///
/// Writers race with each other, causing `ParentNotCurrent` failures.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we expect that one of the racing writers will always win and the rest will fail with ParentNotCurrent or whatever, is that right? Is there any possibility that all writers will fail, such that the test wouldn't make progress?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if exactly one racing writer doesn't always win, then we have much worse problems :)

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^ yeah, we are spinning waiting for a "minimum number of successful inserts", but there really is no reason why "all of them" would fail - one should reliably be getting through.

@smklein smklein enabled auto-merge (squash) April 14, 2026 22:59
@smklein smklein merged commit 3b4c2bc into main Apr 14, 2026
16 checks passed
@smklein smklein deleted the sitrep-delete-better branch April 14, 2026 23:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants