fix: detect abnormal shutdown and trigger repair on next start #351

Merged
hsanjuan merged 2 commits into master from feat/dirty-after-bad-shutdown
May 14, 2026

@guillaumemichel
Contributor

Fix suggested in #279 (comment)

Problem

A CRDT replica can be left in a silently inconsistent state after an abnormal termination (SIGKILL, OOM, panic without Close(), power loss).

Fix

Persist a single key, /<namespace>/bs, whose presence at startup means "the previous run did not close cleanly":

  • Startup: if the key is present, call MarkDirty on the store (so repair runs) and log a warning. Then write the key unconditionally, so that a crash during this run is detected on the next start.
  • Clean Close(): after wg.Wait() returns (so any worker still marking the store dirty during shutdown has finished), delete the key as the last step.

No changes to the walk/merge hot path, MarkDirty/IsDirty signatures, or repairDAG scope.
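
The startup and shutdown steps above can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: `memStore`, `replica`, `openReplica`, `markDirty`, `isDirty`, and the `/<namespace>/dirty` key are hypothetical stand-ins, and an in-memory map plays the role of the persistent datastore. Only the `/<namespace>/bs` key and the ordering of the steps are taken from the description.

```go
package main

import (
	"log"
	"sync"
)

// memStore is a hypothetical in-memory stand-in for the persistent datastore.
type memStore struct {
	mu   sync.Mutex
	keys map[string]struct{}
}

func newMemStore() *memStore {
	return &memStore{keys: make(map[string]struct{})}
}

func (s *memStore) Has(key string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	_, ok := s.keys[key]
	return ok
}

func (s *memStore) Put(key string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.keys[key] = struct{}{}
}

func (s *memStore) Delete(key string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.keys, key)
}

// replica is a hypothetical, stripped-down model of the CRDT store.
type replica struct {
	store       *memStore
	sentinelKey string         // "/<namespace>/bs": present while a run is in progress
	dirtyKey    string         // assumed location of the dirty bit, for illustration only
	wg          sync.WaitGroup // background workers would register here
}

// openReplica performs the startup steps: detect a leftover sentinel,
// mark the store dirty if one is found, then write the sentinel again.
func openReplica(store *memStore, namespace string) *replica {
	r := &replica{
		store:       store,
		sentinelKey: "/" + namespace + "/bs",
		dirtyKey:    "/" + namespace + "/dirty",
	}
	if store.Has(r.sentinelKey) {
		log.Printf("warning: previous run did not close cleanly; marking store dirty")
		r.markDirty()
	}
	// Written unconditionally so that a crash in this run is caught next time.
	store.Put(r.sentinelKey)
	return r
}

func (r *replica) markDirty()    { r.store.Put(r.dirtyKey) }
func (r *replica) isDirty() bool { return r.store.Has(r.dirtyKey) }

// Close waits for workers first, then deletes the sentinel as the last step.
func (r *replica) Close() {
	r.wg.Wait()
	r.store.Delete(r.sentinelKey)
}
```

In this sketch, calling `openReplica` twice on the same store without an intervening `Close()` models a crash: the second call finds the sentinel and marks the store dirty, while a clean `Close()` leaves nothing behind for the next start to detect.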

Contributor

@gammazero gammazero left a comment

One question. Otherwise ok to merge.

Comment thread crdt.go
@hsanjuan hsanjuan merged commit 7468ff8 into master May 14, 2026
6 checks passed
@hsanjuan hsanjuan deleted the feat/dirty-after-bad-shutdown branch May 14, 2026 15:02
3 participants