[new-protocol] garbage collect consensus data for views less than decided view#4415
[new-protocol] garbage collect consensus data for views less than decided view#4415imabdulbasit wants to merge 9 commits into
Conversation
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. |
|
Claude finished @imabdulbasit's task in 2m 20s —— View job Code Review: PR #4415
SummaryClean simplification that removes the configurable ObservationsCorrectness: The change looks correct. Previously, GC was triggered only when Behavioral change — checkpoint votes on every decide: Before this PR, Minor cleanup opportunity: Posted an inline comment — Testing: The test files are updated correctly (all Overall this is a clean, well-scoped removal. LGTM pending the minor nit. |
2c61646 to
d4c17ed
Compare
bfish713
left a comment
There was a problem hiding this comment.
We should completely remove the garbage collection certificate if we are doing this. I think this will trigger voting for a checkpoint every decide, which is an extra cert per view we don't actually need.
# Conflicts: # crates/espresso/node/src/context.rs # crates/hotshot/new-protocol/src/coordinator.rs
# Conflicts: # crates/hotshot/new-protocol/src/coordinator.rs
|
Claude finished @imabdulbasit's task in 3m 43s —— View job Code Review: PR #4415
SummaryThis PR addresses @bfish713's feedback from the first round: the checkpoint certificate mechanism ( What Changed
Review FindingsCorrectness — GC on every decide is sound: The Behavioral change — more aggressive memory cleanup: GC now fires every view instead of every ~100 blocks. This is good for memory usage but means state is retained for a shorter window. If any component outside the coordinator's Posted 2 inline comments:
Previous review threads: My prior inline comment about removing the Overall this is a well-executed follow-up that addresses the reviewer's request cleanly. The removal is thorough with no orphaned references. |
| if let Some(newest) = leaves.first() { | ||
| let gc_view = newest.view_number(); | ||
| let gc_epoch = newest.justify_qc().epoch().unwrap_or_default(); | ||
| self.gc(gc_view, gc_epoch); |
There was a problem hiding this comment.
Nit: unwrap_or_default() silently substitutes EpochNumber(0) when the epoch is missing. Since this controls the GC sweep across vote collectors and the epoch manager (self.epoch_manager.gc(epoch)), an unexpected EpochNumber(0) could either be a no-op or prematurely purge epoch-0 data.
Consider whether this branch can actually fire in practice (e.g. a leaf whose justify_qc has no epoch). If it can't, a warn! + early-return would make the impossibility explicit rather than silently using a sentinel value.
| // TODO: change this once cliquenet has a Slot::MAX for external messages that is not garbage collected | ||
| MessageKind::External(_) => ViewNumber::new(u64::MAX), |
There was a problem hiding this comment.
This sentinel works for the split_off-based GC (which retains keys >= the GC view), but u64::MAX is brittle — any arithmetic on this view number (view + 1, comparisons that assume a valid range) elsewhere in the stack would silently overflow or behave unexpectedly. As the TODO says, a proper Slot::MAX or an Option<ViewNumber> return would be safer. Worth tracking this.
Also: other MessageKind::Data variants return ViewNumber::new(1) for NotFound/Denied — those will be GC'd immediately once the decided view passes 1. Is that intentional, or should they get a similar "never GC" treatment?
No description provided.