
Commit c5d0523

AIQnetLab and claude committed
fix: v16.0.2 idempotent state apply (sync re-delivery resilience)
Re-delivery of an already-applied block during catch-up sync fails the strict nonce check (`expected=N got=K<N`), causing partial block apply, state divergence, and a permanent hash_chain_break at the next block.

Symptom on the testnet:

  [REJECT][TX] invalid_nonce expected=101 got=1..100
  [REJECT][CREATE-ACCOUNT] account_already_exists
  [WARN][PIPELINE] hash_chain_break h=351 from=genesis_node_001
  [CRIT][PIPELINE] verify_stuck stall_ms=1060000+ verified=271 applied=271

Fix: silent Ok() when tx.nonce <= sender.nonce (already-applied) across all 9 nonce-checked tx_types in apply_to_state, plus matching idempotency for CreateAccount (account_exists) and NodeActivation (already_node). Same policy mirrored into state_db.rs::execute_transaction.

Replay protection preserved — TXs with stale nonce have no incremental effect because sender's balance and state already reflect the original deduction. Strict +1 check still applies for new (>current) nonces.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
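
The nonce policy described in the commit message can be modelled in isolation. The following is a minimal sketch, not the code in `apply_to_state`: `Account`, `Tx`, `apply`, and `apply_block` are hypothetical names, and the transaction is reduced to a debit with no receiver so that only the nonce handling is visible.

```rust
use std::collections::HashMap;

/// Hypothetical account record: only the fields the nonce policy touches.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct Account {
    pub balance: u64,
    pub nonce: u64,
}

/// Hypothetical debit-only transaction (no receiver, to keep the model small).
#[derive(Debug, Clone)]
pub struct Tx {
    pub from: String,
    pub amount: u64,
    pub nonce: u64,
}

/// Apply one TX under the policy from the commit message.
pub fn apply(accounts: &mut HashMap<String, Account>, tx: &Tx) -> Result<(), String> {
    let sender = accounts
        .get_mut(&tx.from)
        .ok_or_else(|| format!("account_not_found from={}", tx.from))?;

    // Stale nonce: treated as already applied, so succeed without touching state.
    if tx.nonce <= sender.nonce {
        return Ok(());
    }
    // Future nonce with a gap: still rejected; only sender.nonce + 1 is accepted.
    if tx.nonce != sender.nonce + 1 {
        return Err(format!(
            "invalid_nonce expected={} got={}",
            sender.nonce + 1,
            tx.nonce
        ));
    }
    // Compute the new balance first so a failure leaves the account unchanged.
    let new_balance = sender
        .balance
        .checked_sub(tx.amount)
        .ok_or_else(|| "insufficient_balance".to_string())?;
    sender.balance = new_balance;
    sender.nonce = tx.nonce;
    Ok(())
}

/// Apply every TX in a block in order, stopping at the first error.
pub fn apply_block(accounts: &mut HashMap<String, Account>, block: &[Tx]) -> Result<(), String> {
    for tx in block {
        apply(accounts, tx)?;
    }
    Ok(())
}
```

In this model, applying a block of nonces 1..=100 twice leaves the account map identical after the second pass, while a TX with nonce 102 against an account at nonce 100 is still rejected. Note that the skip keys only on the nonce, so in this sketch a different TX that reuses a stale nonce also returns `Ok(())` without effect.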
1 parent 66c07fb commit c5d0523

2 files changed

Lines changed: 106 additions & 15 deletions

core/qnet-state/src/state_db.rs

Lines changed: 10 additions & 1 deletion
@@ -86,10 +86,19 @@ impl StateDB {
             }
         });
 
+        // IDEMPOTENT APPLY — silent skip for already-applied transactions.
+        // Mirrors the policy in `transaction.rs::apply_to_state` (Transfer arm)
+        // so that re-delivery of the same TX during sync / replay does not
+        // fail the operation. Replay protection is preserved: a TX with stale
+        // nonce has no incremental effect (sender's balance already reflects
+        // the original deduction).
+        if tx.nonce <= sender.nonce {
+            return Ok(tx_hash);
+        }
         // Check nonce for transaction ordering
         if tx.nonce != sender.nonce + 1 {
             return Err(StateError::InvalidTransaction(format!(
-                "Invalid nonce: expected {}, got {}",
+                "Invalid nonce: expected {}, got {}",
                 sender.nonce + 1, tx.nonce
             )));
         }

core/qnet-state/src/transaction.rs

Lines changed: 96 additions & 14 deletions
@@ -1541,7 +1541,50 @@ impl Transaction {
                 // Get sender account
                 let sender = accounts.get_mut(from)
                     .ok_or_else(|| StateError::AccountNotFound(from.clone()))?;
-
+
+                // ═══════════════════════════════════════════════════════════════
+                // IDEMPOTENT APPLY — silent skip for already-applied transactions
+                // ═══════════════════════════════════════════════════════════════
+                // When a node receives the same block more than once (typical during
+                // batch sync, gossip duplication, or post-restart replay), every TX
+                // inside that block is presented to apply_to_state again. Without
+                // idempotency the strict `nonce == sender.nonce + 1` check fails
+                // for every previously-applied TX, which:
+                //   * pollutes logs with [REJECT][TX] invalid_nonce noise
+                //   * causes block-level apply failure if any TX inside fails
+                //   * cascades to state divergence between nodes that received the
+                //     block once vs nodes that re-applied it multiple times
+                //
+                // ROOT CAUSE OF NETWORK HALT (observed at h=350):
+                // Genesis block re-delivered during catch-up sync. Sender "genesis"
+                // had nonce=100 from initial apply. Re-apply attempted nonces 1..100
+                // sequentially, all rejected. Block partially applied → state_root
+                // diverged from peers → next block (h=351) failed hash_chain_break →
+                // pipeline jammed forever.
+                //
+                // SAFETY: silent skip is NOT a security relaxation:
+                //   * If `tx.nonce <= sender.nonce`, the operation has already taken
+                //     effect on this account. Re-applying would either no-op
+                //     (idempotent) or fail (current behaviour) — both leave state
+                //     identical. Silent skip preserves the same final state without
+                //     polluting the failure path.
+                //   * Replay-attack semantics are preserved: an attacker re-broadcasting
+                //     a signed TX with old nonce cannot double-spend, because the
+                //     sender's balance already reflects the original deduction. The
+                //     skipped TX has no incremental effect.
+                //   * Strict +1 check still applies for FUTURE nonces; only stale
+                //     (≤ current) nonces are silently skipped.
+                //
+                // SCALABILITY: O(1) per TX — single comparison. Identical cost at
+                // 5 or 5000 validators. Scales to thousands of super-nodes without
+                // any cross-node coordination.
+                // ═══════════════════════════════════════════════════════════════
+                if self.nonce <= sender.nonce {
+                    // Already applied — silent no-op. Preserves idempotency under
+                    // replay/re-sync without state divergence.
+                    return Ok(());
+                }
+
                 // CRITICAL SECURITY: Check nonce to prevent replay attacks and double spending
                 // Transaction nonce must be exactly sender.nonce + 1
                 if self.nonce != sender.nonce + 1 {
@@ -1576,8 +1619,15 @@ impl Transaction {
                         .ok_or_else(|| StateError::InvalidTransaction("[REJECT][TRANSFER] receiver_balance_overflow".into()))?;
             }
             TransactionType::CreateAccount { address, initial_balance } => {
+                // IDEMPOTENT APPLY — re-creation is a no-op, not an error.
+                // When genesis-style blocks are re-delivered during sync, every
+                // CreateAccount in that block is re-presented. Returning Err here
+                // would fail the whole block apply and corrupt subsequent state;
+                // returning Ok preserves idempotency without changing semantics
+                // (the account already exists with its initial balance, mint cannot
+                // be repeated because the contains_key short-circuit prevents it).
                 if accounts.contains_key(address) {
-                    return Err(StateError::InvalidTransaction("[REJECT][CREATE-ACCOUNT] account_already_exists".to_string()));
+                    return Ok(());
                 }
 
                 // C1 SECURITY: Only system/genesis accounts can mint initial balance
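
The `contains_key` short-circuit in this hunk can be illustrated with a reduced model. This is a sketch under stated assumptions, not the arm from `transaction.rs`: accounts are represented as a plain map from address to balance, and `create_account` is a hypothetical helper that returns whether a new account was actually inserted.

```rust
use std::collections::HashMap;

/// Hypothetical model of an idempotent CreateAccount.
/// Returns true when the account was newly created, false when it already
/// existed and the call was treated as a no-op.
pub fn create_account(
    accounts: &mut HashMap<String, u64>,
    address: &str,
    initial_balance: u64,
) -> bool {
    // Re-creation is a no-op: the existing balance is left untouched,
    // so the initial mint cannot be repeated by replaying the TX.
    if accounts.contains_key(address) {
        return false;
    }
    accounts.insert(address.to_string(), initial_balance);
    true
}
```

In this model a second `create_account` call for the same address leaves the stored balance unchanged, even when the second call carries a different `initial_balance`.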
@@ -1621,21 +1671,24 @@ impl Transaction {
                 let sender = accounts.get_mut(&self.from)
                     .expect("account just inserted");
 
-                // v14.8.4: SINGLE-USE ACTIVATION GUARD.
+                // v14.8.4: SINGLE-USE ACTIVATION GUARD (with idempotent re-apply).
                 // Each wallet may hold exactly one active node at a time. An
-                // already-activated wallet trying to submit another NodeActivation
-                // is a replay (same wallet, different burn_tx/activation_code) or a
-                // misconfigured relaunch. Reject at state apply so it cannot be
-                // re-applied from the mempool or from gossip.
+                // already-activated wallet re-presented during batch sync (same
+                // block re-delivered) must be a no-op, not an error — re-apply
+                // would otherwise fail the whole block and corrupt subsequent
+                // state. The mempool layer prevents fresh NodeActivation TXs
+                // from already-activated wallets; this code path only fires on
+                // sync replay where idempotency is the correct semantic.
                 if sender.is_node {
-                    return Err(StateError::InvalidTransaction(format!(
-                        "[REJECT][TX] wallet_already_activated from={} existing_type={:?}",
-                        self.from, sender.node_type
-                    )));
+                    return Ok(());
                 }
 
                 // CRITICAL SECURITY: Check nonce to prevent replay attacks.
                 // First-time wallet has sender.nonce == 0 → valid TX nonce is 1.
+                // Idempotent skip for already-applied: tx.nonce ≤ sender.nonce.
+                if self.nonce <= sender.nonce {
+                    return Ok(());
+                }
                 if self.nonce != sender.nonce + 1 {
                     return Err(StateError::InvalidTransaction(format!(
                         "[REJECT][TX] invalid_nonce expected={} got={}",
@@ -1680,7 +1733,12 @@ impl Transaction {
                 // which is part of the Merkle tree -> replicated to ALL nodes via blocks
                 let sender = accounts.get_mut(&self.from)
                     .ok_or_else(|| StateError::AccountNotFound(self.from.clone()))?;
-
+
+                // IDEMPOTENT APPLY — see Transfer arm for full rationale. Re-presented
+                // ContractDeploy with stale nonce is a no-op (already deployed).
+                if self.nonce <= sender.nonce {
+                    return Ok(());
+                }
                 // CRITICAL SECURITY: Check nonce to prevent replay attacks
                 if self.nonce != sender.nonce + 1 {
                     return Err(StateError::InvalidTransaction(format!(
@@ -1790,7 +1848,11 @@ impl Transaction {
                 // transfer, approve, transferFrom all modify contract_storage in blockchain state
                 let sender = accounts.get_mut(&self.from)
                     .ok_or_else(|| StateError::AccountNotFound(self.from.clone()))?;
-
+
+                // IDEMPOTENT APPLY — see Transfer arm for full rationale.
+                if self.nonce <= sender.nonce {
+                    return Ok(());
+                }
                 // CRITICAL SECURITY: Check nonce to prevent replay attacks
                 if self.nonce != sender.nonce + 1 {
                     return Err(StateError::InvalidTransaction(format!(
@@ -2051,7 +2113,11 @@ impl Transaction {
                 // v3.18: Gas fee goes directly to block producer (Pool 2 removed)
                 let sender = accounts.get_mut(from)
                     .ok_or_else(|| StateError::AccountNotFound(from.clone()))?;
-
+
+                // IDEMPOTENT APPLY — see Transfer arm for full rationale.
+                if self.nonce <= sender.nonce {
+                    return Ok(());
+                }
                 // CRITICAL SECURITY: Check nonce to prevent replay attacks
                 if self.nonce != sender.nonce + 1 {
                     return Err(StateError::InvalidTransaction(format!(
@@ -2194,6 +2260,10 @@ impl Transaction {
                 let sender = accounts.get_mut(&self.from)
                     .ok_or_else(|| StateError::AccountNotFound(self.from.clone()))?;
 
+                // IDEMPOTENT APPLY — see Transfer arm for full rationale.
+                if self.nonce <= sender.nonce {
+                    return Ok(());
+                }
                 // CRITICAL SECURITY: Check nonce to prevent replay attacks
                 if self.nonce != sender.nonce + 1 {
                     return Err(StateError::InvalidTransaction(format!(
@@ -2231,6 +2301,10 @@ impl Transaction {
                 let sender = accounts.get_mut(&self.from)
                     .ok_or_else(|| StateError::AccountNotFound(self.from.clone()))?;
 
+                // IDEMPOTENT APPLY — see Transfer arm for full rationale.
+                if self.nonce <= sender.nonce {
+                    return Ok(());
+                }
                 // CRITICAL SECURITY: Check nonce to prevent replay attacks
                 if self.nonce != sender.nonce + 1 {
                     return Err(StateError::InvalidTransaction(format!(
@@ -2280,6 +2354,10 @@ impl Transaction {
                 let sender = accounts.get_mut(&self.from)
                     .ok_or_else(|| StateError::AccountNotFound(self.from.clone()))?;
 
+                // IDEMPOTENT APPLY — see Transfer arm for full rationale.
+                if self.nonce <= sender.nonce {
+                    return Ok(());
+                }
                 // CRITICAL SECURITY: Check nonce to prevent replay attacks
                 if self.nonce != sender.nonce + 1 {
                     return Err(StateError::InvalidTransaction(format!(
@@ -2493,6 +2571,10 @@ impl Transaction {
                 let account = accounts.get_mut(&self.from)
                     .expect("account just inserted");
 
+                // IDEMPOTENT APPLY for PQ-upgrade — see Transfer arm for rationale.
+                if self.nonce <= account.nonce {
+                    return Ok(());
+                }
                 // Nonce monotonicity (same as any user TX).
                 if self.nonce != account.nonce + 1 {
                     return Err(StateError::InvalidTransaction(format!(
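
The stale-nonce guard appears in the same form in each nonce-checked arm of this diff. As an illustration only (the commit does not do this), the three possible outcomes of the combined check can be named by a single helper. `NonceStatus` and `classify_nonce` are hypothetical names, not identifiers from `transaction.rs`.

```rust
/// Hypothetical classification of a TX nonce against the account's current nonce.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NonceStatus {
    /// tx_nonce <= account_nonce: treated as already applied, skipped silently.
    AlreadyApplied,
    /// tx_nonce == account_nonce + 1: the only nonce that may be applied.
    Next,
    /// tx_nonce > account_nonce + 1: rejected, as before the commit.
    Gap { expected: u64 },
}

/// Classify a nonce using the same two comparisons as the diff.
pub fn classify_nonce(tx_nonce: u64, account_nonce: u64) -> NonceStatus {
    if tx_nonce <= account_nonce {
        NonceStatus::AlreadyApplied
    } else if tx_nonce == account_nonce + 1 {
        // No overflow here: tx_nonce > account_nonce implies account_nonce < u64::MAX.
        NonceStatus::Next
    } else {
        NonceStatus::Gap { expected: account_nonce + 1 }
    }
}
```

With an account at nonce 100, nonces 1 through 100 classify as `AlreadyApplied`, 101 as `Next`, and 102 as `Gap { expected: 101 }`, which matches the `expected=101` value in the testnet log from the commit message.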
