Conversation
|
Claude finished @iamquang95's task in 6m 16s —— View job Code Review
SummaryThis PR implements the DKG step-synchronization protocol ( Findings[High] Step is updated for peers that fail validation In Evidence: if let Err(error) = protocol::validate_request_with_public_key(...) {
// sets server error, response.error
} else {
server.mark_connected(peer_id).await;
}
// ← this runs even after validation failure
if server.update_step(peer_id, message.step).await? {
send_inbound_event(...);
}
// loop continues - next read will also come from the invalid peerA peer that sends an invalid hash signature (wrong cluster) can still advance the step tracker and the loop keeps reading from that peer. The Go reference returns from the handler after an invalid message. The function should [Medium] Protocol ID has trailing slash — verify against Go
pub const PROTOCOL_NAME: libp2p::StreamProtocol = StreamProtocol::new("/charon/dkg/sync/1.0.0/");The trailing [Medium]
[Low] Redundant explicit arms before wildcard in
ConnectionEvent::AddressChange(_)
| ConnectionEvent::LocalProtocolsChange(_)
| ConnectionEvent::RemoteProtocolsChange(_) => {}
ConnectionEvent::ListenUpgradeError(_) => {}
_ => {} // ← makes the lines above unreachableThe named arms are unreachable because [Low]
[Low]
Parity Matrix
TestsTests were not run (no
Missing coverage:
Open Questions
|
Same as charon implementation
Matched with golang:
Matched with charon, also it's not a bug. |
|
@claude now you will have the charon source code, review it again |
|
Claude finished @iamquang95's task in 3m 23s —— View job Code Review (with Charon source)
SummaryWith access to Corrections to Previous Review[Retracted — High] Step updated after validation failure Confirmed correct. Go's if err := s.validReq(pubkey, msg); err != nil {
s.setErr(...)
resp.Error = err.Error()
} else if !s.isConnected(pID) {
s.setConnected(pID)
}
if err := s.updateStep(pID, int(msg.GetStep())); err != nil { // ← unconditional
return err
}Rust [Retracted — Medium] Protocol ID trailing slash Confirmed correct. Go uses [Retracted — Medium] Confirmed correct. Go's Remaining Valid Findings[Low] Redundant match arms before wildcard in
ConnectionEvent::AddressChange(_)
| ConnectionEvent::LocalProtocolsChange(_)
| ConnectionEvent::RemoteProtocolsChange(_) => {}
ConnectionEvent::ListenUpgradeError(_) => {}
_ => {} // makes the four named arms above dead codeThe named arms are unreachable. Remove them. Fix this → [Low]
New Finding[Low] Relay error detection may miss Go's func isRelayError(err error) bool {
return errors.Is(err, network.ErrReset) ||
errors.Is(err, network.ErrResourceScopeClosed) // ← second case
}Rust ( relay_reset: error.kind() == std::io::ErrorKind::ConnectionReset,
Recommend mapping Informational NoteRust Go's Parity Matrix (updated)
|
Don't have the equivalent error in rust. |
varex83agent
left a comment
There was a problem hiding this comment.
Review: feat(dkg): implement sync
This PR ports the Charon DKG sync protocol to Rust. The architecture is solid — the Arc<Inner> / RwLock<ServerState> / Notify design is correct, the wire format matches Go's i64 LE framing, and the step validation rules are faithful. Test coverage is good with both unit tests mirroring Go's test shapes and an integration round-trip test.
However there are four bugs that need to be fixed before merge:
stop_rx.changed()spurious cancel —activate()sendsfalseonstop_tx, which fireschanged()in the running outbound stream and kills it prematurely (handler.rs:299,319+client.rs:178)- Inbound future replaced without cleanup — dropping the old
BoxFutureskipsserver.clear_connected, leaving the peer permanently in the connected set (handler.rs:257) - No backoff on Reconnectable exit — transitioning to
Idledirectly can spin-loop under relay churn; theon_dial_upgrade_errorpath correctly callsschedule_retry()but the stream-exit path does not (handler.rs:226) NegotiationFailedretries forever — when a peer doesn't support the protocol, retrying with backoff is pointless; the handler should callclient.finish(Err(Unsupported))(handler.rs:125)
Additionally, await_all_shutdown is missing the error-state check that await_all_connected and await_all_at_step both have, and the MAX_MESSAGE_SIZE constant (128 MiB) diverges from Go's 32 MiB cap.
Inline comments are filed for each issue individually.
| ) { | ||
| match event { | ||
| Either::Left(event) => self.pending_events.push_back(ToSwarm::GenerateEvent(event)), | ||
| Either::Right(unreachable) => match unreachable {}, |
There was a problem hiding this comment.
| Either::Right(unreachable) => match unreachable {}, | |
| Either::Right(_) => {}, |
There was a problem hiding this comment.
Prefer to keep the code with unreachable. It's cleaner to read and expect this will never happen
|
/claude-review |
|
Claude finished @iamquang95's task in 20m 15s —— View job Code Review — PR #311 feat(dkg): implement sync
SummaryAll five bugs identified in prior review rounds are confirmed fixed:
Functional parity with the Go reference is excellent. The findings below are all minor or nit — no blockers. Findings[Minor] When This matches Go behaviour ( [Minor] If a shutdown message arrives with a step that fails [Minor] No per-message read timeout — slow-loris risk —
[Minor]
fn send_inbound_event(tx: &mpsc::UnboundedSender<OutEvent>, event: OutEvent) {
let result = tx.send(event);
debug_assert!(result.is_ok(), "inbound event receiver dropped while handler alive");
}[Minor]
[Minor] If connection closure follows a [Minor] TOCTOU race in The [Minor] Unbounded Events are produced by [nit] [nit] [nit] [nit] Type annotation uses Verdict: COMMENT — 0 bugs, 7 minor, 4 nits. No blockers; all prior bugs fixed. |
| pub async fn read_fixed_size_delimited<S: AsyncRead + Unpin>( | ||
| stream: &mut S, | ||
| ) -> io::Result<Vec<u8>> { | ||
| read_fixed_size_delimited_with_max(stream, MAX_MESSAGE_SIZE).await |
There was a problem hiding this comment.
Can we align the MAX_MESSAGE_SIZE to charon's 32Mb?
https://github.com/ObolNetwork/charon/blob/97959c4e8ec0da2246dcf5166830159504245baa/dkg/sync/server.go#L35
There was a problem hiding this comment.
I know that the earlier version did not have any checks at all, but since we have added this already in newer releases (post-audit), then it makes sense to align the values from the beginning. It will probably be hard to find such the differences in the future.
Should address #133