Commit 750587a

AIQnetLab and claude committed
consensus: same-round 2f+1 leader rotation; remove cross-round pacemaker; scale + harden
- Leader-rotation round advances ONLY on a same-round 2f+1 TimeoutCertificate. The cross-round pacemaker made HIGHEST_CERTIFIED_ROUND path-dependent, so nodes computed different leaders for the same height -> dual production -> fork. Physically remove the agg-TC subsystem (statics, structs, wire variant, serving loop, handler) and all stale references.
- Deterministic checkpoint content (mb_hashes from canonical body, beacon fail-stop) so finality cannot stall on node-local metadata divergence.
- Finality-subordinate fork-choice + bounded reorg for minority-fork recovery.
- Eligible-producer snapshot membership O(R*E) -> O(R) via HashSet (runs every macroblock; matters at 100k registered super nodes).
- Activation TX also signed with the node consensus Dilithium3 key (post-quantum identity binding, not only the ephemeral Ed25519 P2P-gate signature).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
1 parent 291cab1 commit 750587a
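
The eligible-producer membership change in the commit message (O(R*E) -> O(R) via HashSet) can be illustrated with a minimal sketch. The function names and the use of `String` node ids are assumptions made for illustration; the actual snapshot code is not part of the diff shown here.

```rust
use std::collections::HashSet;

/// Hypothetical "before" shape: every registered id is searched for in the
/// eligible list with a linear scan, so the pass costs O(R * E).
pub fn eligible_snapshot_linear(registered: &[String], eligible: &[String]) -> Vec<String> {
    registered
        .iter()
        .filter(|id| eligible.contains(*id))
        .cloned()
        .collect()
}

/// Hypothetical "after" shape: the eligible ids are placed in a HashSet once
/// (O(E)), then each of the R lookups is O(1) on average.
pub fn eligible_snapshot_hashed(registered: &[String], eligible: &[String]) -> Vec<String> {
    let eligible_set: HashSet<&str> = eligible.iter().map(String::as_str).collect();
    registered
        .iter()
        .filter(|id| eligible_set.contains(id.as_str()))
        .cloned()
        .collect()
}
```

Both functions return the same ids in registration order; only the lookup cost differs, which is what matters when the pass runs on every macroblock.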

5 files changed

Lines changed: 261 additions & 1286 deletions

development/qnet-integration/src/activation_validation.rs

Lines changed: 22 additions & 0 deletions
@@ -2021,6 +2021,28 @@ impl BlockchainActivationRegistry {
             let sig = signing_key.sign(canonical_msg.as_bytes());
             transaction.signature = Some(hex::encode(sig.to_bytes()));
             transaction.public_key = Some(hex::encode(verifying_key.as_bytes()));
+
+            // Post-quantum: ALSO sign with this node's consensus Dilithium3 key (registered
+            // on-chain at NodeRegistration). The ephemeral Ed25519 above only satisfies the
+            // P2P "must carry a signature" gate — it proves no identity and is quantum-breakable.
+            // This Dilithium sig binds the activation to the node's PQ identity. Same crypto
+            // pair as the heartbeat (proven on-chain); verified on admission by
+            // verify_dilithium_tx_signature_async (signer_id = dilithium_public_key) over the
+            // SAME canonical message (build_canonical_verify_message's NodeActivation arm).
+            let local_node_id = crate::unified_p2p::GLOBAL_NODE_ID.read().clone();
+            if !local_node_id.is_empty() {
+                if let Some(crypto) = crate::node::try_get_quantum_crypto() {
+                    match crypto.create_consensus_signature(&local_node_id, &canonical_msg).await {
+                        Ok(dil) => {
+                            transaction.dilithium_signature = Some(dil.signature);
+                            transaction.dilithium_public_key = Some(local_node_id);
+                        }
+                        Err(e) => {
+                            println!("[WARN][ACTIVATION] dilithium_sign_failed err={}", e);
+                        }
+                    }
+                }
+            }
         }
 
         // Calculate hash using canonical serialization (SHA3-256 NIST compliant)

development/qnet-integration/src/block_pipeline.rs

Lines changed: 26 additions & 25 deletions
@@ -113,10 +113,10 @@ static FORK_RECOVERY_TRIGGER_TIMES: once_cell::sync::Lazy<
 > = once_cell::sync::Lazy::new(dashmap::DashMap::new);
 const FORK_RECOVERY_COOLDOWN_SECS: u64 = 60;
 
-// v34: cooldown for the failover-cert pull-on-reject. mb_idx → wall-clock secs of last request.
+// Cooldown for the failover-cert pull-on-reject. mb_idx → wall-clock secs of last request.
 // Bounds how often a node stuck on an uncertified failover block asks peers for that window's
-// timeout certificates (the request/serve already exists for sync and returns the cross-round
-// AggregatedTimeoutCertificate). 2s is fast enough to recover within a window, slow enough that
+// timeout certificates (the request/serve already exists for sync and returns the same-round
+// 2f+1 TimeoutCertificate). 2s is fast enough to recover within a window, slow enough that
 // the repeated per-block reject loop can't flood peers.
 static FAILOVER_CERT_PULL_TIMES: once_cell::sync::Lazy<
     dashmap::DashMap<u64, u64>
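
The per-`mb_idx` cooldown described in the comment above can be sketched as follows. This is a simplified, single-threaded illustration that assumes a plain `std::collections::HashMap` in place of the concurrent `DashMap` used in the source; `PullCooldown`, `try_acquire`, and `PULL_COOLDOWN_SECS` are hypothetical names, and only the 2-second interval comes from the comment.

```rust
use std::collections::HashMap;

/// Cooldown between certificate pulls for the same macroblock index.
/// The 2s value is taken from the source comment; the constant name is assumed.
pub const PULL_COOLDOWN_SECS: u64 = 2;

/// Maps mb_idx -> wall-clock seconds of the last pull request (hypothetical type).
pub struct PullCooldown {
    last_request: HashMap<u64, u64>,
}

impl PullCooldown {
    pub fn new() -> Self {
        Self { last_request: HashMap::new() }
    }

    /// Returns true and records `now_secs` if a pull for `mb_idx` is allowed;
    /// returns false while the previous request is still inside the cooldown.
    pub fn try_acquire(&mut self, mb_idx: u64, now_secs: u64) -> bool {
        match self.last_request.get(&mb_idx) {
            Some(&last) if now_secs.saturating_sub(last) < PULL_COOLDOWN_SECS => false,
            _ => {
                self.last_request.insert(mb_idx, now_secs);
                true
            }
        }
    }
}
```

Because the map is keyed by `mb_idx`, a reject loop on one window cannot starve requests for another window, while repeated rejects for the same window collapse to at most one request per cooldown interval.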
@@ -1767,7 +1767,12 @@ impl BlockPipeline {
                 if cooldown_ok {
                     let prev = FORK_RECOVERY_HEIGHT
                         .load(std::sync::atomic::Ordering::SeqCst);
-                    let target = finalized_h.saturating_add(1);
+                    // Roll back to the last good height = disputed-2 (the forked block is
+                    // local[disputed-1]), clamped to ≥ finalized. finalized_h+1 was wrong when
+                    // the fork IS at finalized+1 (our own tip): the handler's `rollback_to <
+                    // local_h` guard then never fires → forked tip kept → permanent
+                    // hash_chain_break (the N004 single-source self-fork wedge).
+                    let target = disputed_h.saturating_sub(2).max(finalized_h);
                     if target > prev {
                         FORK_RECOVERY_HEIGHT.store(
                             target,
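
The rollback-target change above can be checked with a small worked example. The expressions for the old and new targets and the `rollback_to < local_h` guard are taken from the diff; the wrapper function names are hypothetical, and the handler itself is not shown in this commit view.

```rust
/// New target from the diff: last good height (disputed - 2), clamped to >= finalized.
pub fn rollback_target(disputed_h: u64, finalized_h: u64) -> u64 {
    disputed_h.saturating_sub(2).max(finalized_h)
}

/// Previous target from the diff: always finalized + 1.
pub fn old_rollback_target(finalized_h: u64) -> u64 {
    finalized_h.saturating_add(1)
}

/// The handler guard quoted in the comment: a rollback only happens when the
/// requested height is strictly below the local tip.
pub fn rollback_fires(rollback_to: u64, local_h: u64) -> bool {
    rollback_to < local_h
}
```

Take finalized height 100 and a forked local tip at 101, so the disputed height is 102. The old target is 101, which is not below the tip, so the guard never fires and the forked block stays. The new target is max(102 - 2, 100) = 100, which is below 101, so the fork is rolled back.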
@@ -1947,7 +1952,7 @@ impl BlockPipeline {
 
         // Producer authority check (same-round mismatch ≡ HARD reject).
         // A. timeout_divergence (block round != cached round): views of
-        //    HIGHEST_CERTIFIED/ADOPTED_ROUND diverged in transit. Soft —
+        //    HIGHEST_CERTIFIED_ROUND diverged in transit. Soft —
         //    log only; hash-chain + sig + 2f+1 commit resolve it. Expected
         //    producer is NOT re-derived on ingest (needs remote VRF preimage).
         // B. same_round_mismatch (cached round == block round, wrong signer):
@@ -1971,28 +1976,24 @@ impl BlockPipeline {
         // producer at certification) arrives, or via sync (which skips this gate,
         // trusting macroblock finality). Round 0 (happy path) needs no cert. O(1).
         if mb.timeout_round > 0 {
-            // v34: authorise the failover round with the SAME predicate the producer used to
+            // Authorise the failover round with the SAME predicate the producer used to
             // pick it — `highest_certified_round_for(mb_idx) >= round + baseline`, keyed by
-            // mb_idx + ABSOLUTE round. HIGHEST_CERTIFIED_ROUND advances on BOTH a same-round
-            // TimeoutProof AND a cross-round AggregatedTimeoutCertificate, so a round reached
-            // via the storm pacemaker (no single round at 2f+1) is accepted. The prior gate
-            // keyed get_timeout_certificate by microblock HEIGHT + RELATIVE round — a key
-            // never populated — so it rejected every failover block: the producer advanced
-            // cross-round while receivers demanded a same-round proof the storm structurally
-            // prevents → split-brain, multi-hour stall. Still 2f+1-gated (a forged round
-            // isn't certified ⇒ rejected); round 0 (happy path) needs no certificate.
+            // mb_idx + ABSOLUTE round. HIGHEST_CERTIFIED_ROUND advances ONLY on a same-round
+            // 2f+1 TimeoutCertificate, so the producer can be at round R only if the network
+            // certified R — both sides read the same map and can never disagree. A forged
+            // round isn't certified ⇒ rejected; round 0 (happy path) needs no certificate.
             let round_certified =
                 crate::unified_p2p::failover_round_authorized(mb.height / 90, mb.timeout_round);
             if !round_certified {
-                // v34 PULL-ON-REJECT: the round IS legitimate (a producer reached it via
-                // 2f+1), but the proving certificate never arrived — the agg-TC broadcast is
-                // one-shot (deduped) and vote gossip only re-fans on NEW votes, which stop once
-                // the storm settles, so a node that missed the brief window would stay stuck
-                // forever (the multi-hour split-brain). Actively request this window's timeout
-                // certificates from peers (rate-limited per mb_idx); the existing serve returns
-                // the cross-round AggregatedTimeoutCertificate, which advances our
-                // HIGHEST_CERTIFIED_ROUND so this still-replayable block is accepted next pass.
-                // Reuses the sync catch-up request/serve — no new wire type.
+                // PULL-ON-REJECT: the round IS legitimate (a producer reached it via a
+                // same-round 2f+1), but the proving TimeoutCertificate never arrived — its
+                // broadcast is one-shot and vote gossip only re-fans on NEW votes, which stop
+                // once the storm settles, so a node that missed the brief window would stay
+                // stuck forever. Actively request this window's timeout certificates from
+                // peers (rate-limited per mb_idx); the existing serve returns the same-round
+                // 2f+1 TimeoutCertificate, which advances our HIGHEST_CERTIFIED_ROUND so this
+                // still-replayable block is accepted next pass. Reuses the sync catch-up
+                // request/serve — no new wire type.
                 let mb_idx = mb.height / 90;
                 let now_secs = std::time::SystemTime::now()
                     .duration_since(std::time::UNIX_EPOCH)
@@ -2926,8 +2927,8 @@ impl BlockPipeline {
                             );
                         }
                         // Request certificates for the macroblock window
-                        // covering this block — peers serve both same-round
-                        // and aggregated certificates in one response.
+                        // covering this block — peers serve the same-round
+                        // 2f+1 TimeoutCertificates for it.
                         p2p.request_timeout_proofs(mb_idx, mb_idx);
                     }
                 }
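
The failover-round gate in block_pipeline.rs can be approximated by the sketch below, which follows the predicate quoted in the comments (`highest_certified_round_for(mb_idx) >= round + baseline`) and the `mb.height / 90` window index. The body of `failover_round_authorized` is not shown in this commit view, so the `CertifiedRounds` type, its methods, and passing `baseline` as an explicit parameter are all assumptions made for illustration.

```rust
use std::collections::HashMap;

/// Microblocks per macroblock window, taken from `mb.height / 90` in the diff.
pub const MICROBLOCKS_PER_MACROBLOCK: u64 = 90;

/// Macroblock window index for a microblock height.
pub fn macroblock_index(height: u64) -> u64 {
    height / MICROBLOCKS_PER_MACROBLOCK
}

/// Hypothetical stand-in for the HIGHEST_CERTIFIED_ROUND map: mb_idx -> highest
/// round for which a same-round 2f+1 TimeoutCertificate has been seen.
pub struct CertifiedRounds {
    highest: HashMap<u64, u64>,
}

impl CertifiedRounds {
    pub fn new() -> Self {
        Self { highest: HashMap::new() }
    }

    /// Records a verified certificate; the stored round only ever moves forward.
    pub fn record_certificate(&mut self, mb_idx: u64, round: u64) {
        let entry = self.highest.entry(mb_idx).or_insert(0);
        *entry = (*entry).max(round);
    }

    /// Round 0 (happy path) needs no certificate; any other round must be
    /// covered by the highest certified round for that window.
    /// `baseline` is an assumed parameter; its real source is not shown in the diff.
    pub fn round_authorized(&self, mb_idx: u64, round: u64, baseline: u64) -> bool {
        if round == 0 {
            return true;
        }
        let highest = self.highest.get(&mb_idx).copied().unwrap_or(0);
        highest >= round.saturating_add(baseline)
    }
}
```

Because producer and receiver evaluate the same predicate against the same kind of map, a block at round R is accepted once the certificate for R has been recorded, and a block claiming an uncertified round is rejected until a certificate arrives, for example through the pull-on-reject request.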
