[new-protocol] seed parent proposal from anchor leaf so leader can propose after restart#4440
Conversation
|
Claude finished @imabdulbasit's task in 7m 6s —— View job Code Review: PR #4440
SummaryThis PR generalizes Findings1. View mismatch between
If there's an invariant in the new protocol that guarantees these are always equal, a 2. The 3. No test for the non-genesis restart path The test changes are mechanical renames ( What looks good
|
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. |
| let cert1 = initializer.high_qc.clone(); | ||
| let parent_proposal = message::Proposal { | ||
| block_header: anchor_leaf.block_header().clone(), | ||
| view_number: anchor_view, | ||
| epoch: anchor_epoch, | ||
| justify_qc: anchor_leaf.justify_qc(), | ||
| next_epoch_justify_qc: None, | ||
| upgrade_certificate: None, | ||
| view_change_evidence: None, | ||
| next_drb_result: None, | ||
| upgrade_certificate: anchor_leaf.upgrade_certificate(), | ||
| view_change_evidence: anchor_leaf | ||
| .view_change_evidence | ||
| .clone() | ||
| .and_then(|e| match e { | ||
| hotshot_types::data::ViewChangeEvidence2::Timeout(tc) => Some(tc), | ||
| hotshot_types::data::ViewChangeEvidence2::ViewSync(_) => None, | ||
| }), | ||
| next_drb_result: anchor_leaf.next_drb_result, | ||
| state_cert: None, | ||
| }; | ||
|
|
||
| let mut state_manager = StateManager::new( | ||
| Arc::new(initializer.instance_state.clone()), | ||
| upgrade_lock.clone(), | ||
| ); | ||
| state_manager.seed_state( | ||
| initializer.anchor_leaf.view_number(), | ||
| initializer.anchor_state.clone(), | ||
| initializer.anchor_leaf.clone(), | ||
| ); | ||
| // The synthetic genesis proposal has a non-null justify_qc (the genesis | ||
| // cert1) so the leaf derived from it has a different commitment than | ||
| // the anchor leaf produced by `Leaf2::genesis`. `request_header` for | ||
| // view 1 looks up the parent state by the *proposal's* leaf | ||
| // commitment, so seed the same state under that commitment too. | ||
| state_manager.seed_state( | ||
| ViewNumber::genesis(), | ||
| anchor_view, | ||
| initializer.anchor_state.clone(), | ||
| Leaf2::from(genesis_proposal.clone()), | ||
| anchor_leaf.clone(), | ||
| ); | ||
| consensus.seed_genesis(genesis_cert1, genesis_proposal); | ||
| consensus.seed_parent(cert1, parent_proposal); |
There was a problem hiding this comment.
Question (correctness): seed_parent stores both the cert and proposal at cert1.view_number(), but the proposal's own view_number field is anchor_view. If cert1.view_number() != anchor_view — which can happen when high_qc is ahead of the anchor leaf (the HotShotInitializer comment says "It should be for a view greater than inner's view number for the non genesis case") — then the commitment check in maybe_propose (consensus.rs:1201) will fail:
} else if proposal_commitment(proposal) != parent_cert.data.leaf_commit {
warn!("stored proposal at parent_view does not match parent cert's leaf_commit; refusing to propose with mismatched parent");
return;proposal_commitment derives a leaf from the proposal (which carries view_number = anchor_view), while cert1.data.leaf_commit is the commitment of whatever leaf cert1 actually certified (at cert1.view_number()). If these views differ, the commitments will differ and the leader silently refuses to propose.
Is there an invariant guaranteeing cert1.view_number() == anchor_view in the new protocol context? If so, a debug_assert or comment documenting this would help future readers. If not, this may need to be handled.
| view_number: anchor_view, | ||
| epoch: anchor_epoch, | ||
| justify_qc: anchor_leaf.justify_qc(), | ||
| next_epoch_justify_qc: None, |
There was a problem hiding this comment.
Nit (correctness at epoch boundaries): next_epoch_justify_qc is hardcoded to None here, and the Proposal -> QuorumProposalWrapper conversion also hardcodes it to None. Since next_epoch_justify_qc contributes to the Leaf2 commitment (when with_epoch = true, see data.rs:1755), this means proposal_commitment(parent_proposal) won't match anchor_leaf.commit() if the anchor_leaf has a non-None next_epoch_justify_qc.
This likely doesn't matter if the anchor_leaf is never at an epoch boundary when this code path runs, but worth keeping in mind.
| ); | ||
| consensus.seed_genesis(genesis_cert1, genesis_proposal); | ||
| consensus.seed_parent(cert1, parent_proposal); | ||
| consensus.set_view(anchor_view, anchor_epoch); |
There was a problem hiding this comment.
Good addition — previously current_view was stuck at genesis after seed_genesis, which would prevent the node from proposing at the correct view after a non-genesis restart.
| /// Seed a parent certificate and proposal so the leader of the *next* view | ||
| /// can propose without any external bootstrap injection. | ||
| /// Sets the locked certificate and current epoch. After calling this, a | ||
| /// subsequent `apply` that triggers `maybe_propose` will find the | ||
| /// parent cert and proposal it needs. | ||
| pub fn seed_genesis(&mut self, genesis_cert1: Certificate1<T>, genesis_proposal: Proposal<T>) { | ||
| self.current_epoch = Some(genesis_proposal.epoch); | ||
| self.certs | ||
| .insert(ViewNumber::genesis(), genesis_cert1.clone()); | ||
| self.locked_cert = Some(genesis_cert1); | ||
| self.proposals | ||
| .insert(ViewNumber::genesis(), genesis_proposal); | ||
| pub fn seed_parent(&mut self, cert1: Certificate1<T>, proposal: Proposal<T>) { | ||
| let view = cert1.view_number(); | ||
| self.current_epoch = Some(proposal.epoch); | ||
| self.certs.insert(view, cert1.clone()); | ||
| self.locked_cert = Some(cert1); | ||
| self.proposals.insert(view, proposal); | ||
| } |
There was a problem hiding this comment.
The generalization from seed_genesis to seed_parent looks clean. One thing the doc could mention: this method assumes cert1.view_number() matches proposal.view_number (i.e., the cert certifies the leaf derived from this proposal). If they can legitimately differ, the method will store the proposal under a view key that doesn't match its own view_number field, which will trip the commitment check in maybe_propose.
|
Successfully created backport PR for |
…l from anchor leaf so leader can propose after restart (#4442) [new-protocol] seed parent proposal from anchor leaf so leader can propose after restart (#4440) * seed parent proposal from anchor leaf so leader can propose after restart * use proposal view number (cherry picked from commit 403f2e6) Co-authored-by: Abdul Basit <45506001+imabdulbasit@users.noreply.github.com>
No description provided.