feat: add qbft module#18
Conversation
- Cannot set up nightly without breaking stuff
- Bigger image (80 mb vs 14 mb)
- Missing public `Run` method
- Wrap in Option - Make classify return None instead of empty vec
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 13 out of 14 changed files in this pull request and generated 20 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
At the moment, the tests for the QBFT implementation are flaky, thus I've added cargo nextest to have a bit more control over timeouts and retries. The CI workflows have been updated to also include the installation of this tool ( |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 13 out of 14 changed files in this pull request and generated 7 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
possible nit: I noticed you don't recover from panics, is this because rust doesn't have a |
varex83
left a comment
There was a problem hiding this comment.
Just leaving some comments that are possible areas of future discussion, please treat them as non-blocking and create an issue for it. We 100% need to revisit it in the future. Overall code matches the original implementation, the only thing is language difference. Also, it would be nice to make it async
| // TODO: Remove these checks | ||
| #![allow(missing_docs)] | ||
| #![allow(clippy::type_complexity)] | ||
| #![allow(clippy::collapsible_if)] | ||
| #![allow(clippy::cast_sign_loss)] | ||
| #![allow(clippy::cast_precision_loss)] | ||
| #![allow(clippy::cast_possible_wrap)] | ||
| #![allow(clippy::cast_possible_truncation)] | ||
| #![allow(clippy::arithmetic_side_effects)] |
There was a problem hiding this comment.
I would fix remove those comments in any case
|
|
||
| /// Defines the event based rules that are triggered when messages are received. | ||
| #[derive(PartialEq, Eq, Hash, Clone, Copy)] | ||
| pub struct UponRule(i64); |
There was a problem hiding this comment.
leaving for the future: refactor it to be enums
| let round: Cell<i64> = Cell::new(1); | ||
| let input_value: RefCell<V> = RefCell::new(Default::default()); | ||
| let mut input_value_source: C = Default::default(); | ||
| let ppj_cache: RefCell<Option<Vec<Msg<I, V, C>>>> = RefCell::new(None); // Cached pre-prepare justification for the current round (`None` value is unset). | ||
| let prepared_round: Cell<i64> = Cell::new(0); | ||
| let prepared_value: RefCell<V> = RefCell::new(Default::default()); | ||
| let mut compare_failure_round: i64 = 0; | ||
| let prepared_justification: RefCell<Option<Vec<Msg<I, V, C>>>> = RefCell::new(None); | ||
| let mut q_commit: Option<Vec<Msg<I, V, C>>> = None; | ||
| let buffer: RefCell<HashMap<i64, Vec<Msg<I, V, C>>>> = RefCell::new(HashMap::new()); | ||
| let dedup_rules: RefCell<HashMap<DedupKey, bool>> = RefCell::new(HashMap::new()); | ||
| let mut timer_chan: mpmc::Receiver<time::Instant>; | ||
| let mut stop_timer: Box<dyn Fn()>; |
There was a problem hiding this comment.
Can we put it to some State struct?
| // to be used when the input value becomes available. | ||
| let broadcast_own_pre_prepare = |justification: Vec<Msg<I, V, C>>| { | ||
| if ppj_cache.borrow().is_some() { | ||
| panic!("bug: justification cache must be none") |
There was a problem hiding this comment.
I would prefer to return errors, not panics, since we cannot recover from them normally
| t: &Transport<I, V, C>, | ||
| instance: &I, | ||
| process: i64, | ||
| mut input_value_ch: mpmc::Receiver<V>, |
There was a problem hiding this comment.
Using of channels here is a point of the future discussion, since we can get rid of them
| V: PartialEq, | ||
| C: Clone + Send + Sync, | ||
| { | ||
| let (compare_err_tx, compare_err_rx) = mpmc::bounded::<Result<()>>(1); |
There was a problem hiding this comment.
Can we use oneshot instead?
| /// Abstracts the transport layer between processes in the consensus system. | ||
| pub struct Transport<I, V, C> | ||
| where | ||
| V: PartialEq, | ||
| { | ||
| /// Broadcast sends a message with the provided fields to all other | ||
| /// processes in the system (including this process). | ||
| /// | ||
| /// Note that an error exits the algorithm. | ||
| pub broadcast: Box< | ||
| dyn Fn( | ||
| /* ct */ &CancellationToken, | ||
| /* type_ */ MessageType, | ||
| /* instance */ &I, | ||
| /* source */ i64, | ||
| /* round */ i64, | ||
| /* value */ &V, | ||
| /* pr */ i64, | ||
| /* pv */ &V, | ||
| /* justification */ Option<&Vec<Msg<I, V, C>>>, | ||
| ) -> Result<()> | ||
| + Send | ||
| + Sync, | ||
| >, | ||
|
|
||
| /// Receive returns a stream of messages received | ||
| /// from other processes in the system (including this process). | ||
| pub receive: mpmc::Receiver<Msg<I, V, C>>, | ||
| } | ||
|
|
||
| /// Defines the consensus system parameters that are external to the qbft | ||
| /// algorithm. This remains constant across multiple instances of consensus | ||
| /// (calls to `run`). | ||
| pub struct Definition<I, V, C> | ||
| where | ||
| V: PartialEq, | ||
| { | ||
| /// A deterministic leader election function. | ||
| pub is_leader: | ||
| Box<dyn Fn(/* instance */ &I, /* round */ i64, /* process */ i64) -> bool + Send + Sync>, | ||
|
|
||
| /// Returns a new timer channel and stop function for the round | ||
| pub new_timer: Box< | ||
| dyn Fn(/* round */ i64) -> (mpmc::Receiver<time::Instant>, Box<dyn Fn() + Send + Sync>) | ||
| + Send | ||
| + Sync, | ||
| >, | ||
|
|
||
| /// Called when leader proposes value and we compare it with our local | ||
| /// value. It's an opt-in feature that should instantly return `None` on | ||
| /// `return_err` channel if it is not turned on. | ||
| pub compare: Box< | ||
| dyn Fn( | ||
| /* ct */ &CancellationToken, | ||
| /* qcommit */ &Msg<I, V, C>, | ||
| /* input_value_source_ch */ &mpmc::Receiver<C>, | ||
| /* input_value_source */ &C, | ||
| /* return_err */ &mpmc::Sender<Result<()>>, | ||
| /* return_value */ &mpmc::Sender<C>, | ||
| ) + Send | ||
| + Sync, | ||
| >, | ||
|
|
||
| /// Called when consensus has been reached on a value. | ||
| pub decide: Box< | ||
| dyn Fn( | ||
| /* ct */ &CancellationToken, | ||
| /* instance */ &I, | ||
| /* value */ &V, | ||
| /* qcommit */ &Vec<Msg<I, V, C>>, | ||
| ) + Send | ||
| + Sync, | ||
| >, | ||
|
|
||
| /// Allows debug logging of triggered upon rules on message receipt. | ||
| /// It includes the rule that triggered it and all received round messages. | ||
| pub log_upon_rule: Box< | ||
| dyn Fn( | ||
| /* instance */ &I, | ||
| /* process */ i64, | ||
| /* round */ i64, | ||
| /* msg */ &Msg<I, V, C>, | ||
| /* upon_rule */ UponRule, | ||
| ) + Send | ||
| + Sync, | ||
| >, | ||
| /// Allows debug logging of round changes. | ||
| pub log_round_change: Box< | ||
| dyn Fn( | ||
| /* instance */ &I, | ||
| /* process */ i64, | ||
| /* round */ i64, | ||
| /* new_round */ i64, | ||
| /* upon_rule */ UponRule, | ||
| /* msgs */ &Vec<Msg<I, V, C>>, | ||
| ) + Send | ||
| + Sync, | ||
| >, | ||
|
|
||
| /// Allows debug logging of unjust messages. | ||
| pub log_unjust: | ||
| Box<dyn Fn(/* instance */ &I, /* process */ i64, /* msg */ Msg<I, V, C>) + Send + Sync>, | ||
|
|
||
| /// Total number of nodes/processes participating in consensus. | ||
| pub nodes: i64, | ||
|
|
||
| /// Limits the amount of message buffered for each peer. | ||
| pub fifo_limit: i64, | ||
| } | ||
|
|
||
| impl<I, V, C> Definition<I, V, C> | ||
| where | ||
| V: PartialEq, | ||
| { | ||
| /// Quorum count for the system. | ||
| /// See IBFT 2.0 paper for correct formula: <https://arxiv.org/pdf/1909.10194.pdf> | ||
| pub fn quorum(&self) -> i64 { | ||
| (self.nodes as u64 * 2).div_ceil(3) as i64 | ||
| } | ||
|
|
||
| /// Maximum number of faulty/byzantine nodes supported in the system. | ||
| /// See IBFT 2.0 paper for correct formula: <https://arxiv.org/pdf/1909.10194.pdf> | ||
| pub fn faulty(&self) -> i64 { | ||
| (self.nodes - 1) / 3 | ||
| } | ||
| } | ||
|
|
There was a problem hiding this comment.
I think we can put some functions in the trait and not store them here
|
@DiogoSantoss Rust does provide a mechanism to recover from panics in the form of catch_unwind, but we intentionally do not recover from panics in this iteration: panics are developer errors (ex. In a future iteration we'll refactor the code to replace all panics with compile-time guarantees (ex. use an enum instead of constants), and in the case that's not possible we will address each case individually. |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 8 out of 9 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Closes #13
Implements the module qbft.
The original implementation makes heavy usage of channels for communicating results and errors, while inputs are abstracted using generic interfaces. We match the dynamic dispatch using
type Msg = Arc<dyn SomeMsg<I, V, C>>whereSomeMsg<I, V, C>is a trait, and matches the original implementation interface. For channels, we use the crossbeam library in a sync fashion.