Commit 346445f

AIQnetLab and claude committed
fix: v17 identity-binding hardening + 22 security fixes
Closes the deadlock-class root cause and audits the rest of the chain for adjacent vulnerabilities. 21 confirmed issues fixed, 1 dead-code documented. 18 regression tests added.

CRITICAL fixes
- quantum_crypto::verify_dilithium_signature: remove STEP-4 fallback that re-verified math against the embedded PK without consulting the consensus PK registry. Single canonical path through verify_consensus_signature now. Comment-fence forbids reintroduction.
- transaction.rs Transfer/Swap: payload `from` MUST equal tx.from. Three layer defence (state validate, integration validate, apply_to_state). Closes wallet-impersonation drain (any peer could debit any wallet while signature verified against attacker's PK).

IDENTITY-SPOOF class (deadlock cause)
- is_legacy_genesis_node: accept both 3-digit production and 1-digit legacy forms. Pre-fix exact-match silently disabled every IP-gate.
- check_genesis_ip_gate helper applied to 12 message types (VrfLeaderClaim, VrfKeyAnnounce, ActiveNodeAnnouncement, ProducerHeartbeat, ProducerReady, ReadyAck, BlockRejection, TimeoutVote, ConsensusCommit, ConsensusReveal, BlockAttestation, EmptySlotAttestation).
- Tier-3 hard-reject for unbound genesis identities; super-node TOFV retained for chain-anchored bootstrap.
- Strict-guard install order fix (was always no-op due to startup race).
- Auto-write genesis_anchors.json from collected VrfKeyAnnounces - eliminates manual ceremony.
- Pre-populate VRF + consensus PK registries from anchor map at startup.
- Aggressive VrfKeyAnnounce schedule (1s/2s burst → 60s maintenance): bootstrap window reduced 15-20s → 3-5s.
- ConsensusReveal "legacy mode" empty-sig accept removed.
- BlockAttestation / EmptySlot "bootstrap grace h<100" without PK removed.
- block_pipeline: empty mb.signature for h>0 hard-reject.
- verify_block_vrf_proof: no "legacy block" pass-through.
- genesis_code_bypass: require sender IP from anchored set.
- snapshot binding: 3 unverified_accept paths converted to hard reject; fall through to byzantine-safe block-by-block sync.

DoS / pollution
- decompress_zstd_bounded / decode_zstd_bounded helpers with output cap on inbound P2P decompression (MacroBlockBroadcast, compact_bin signature). 64 MiB / 256 KiB caps.
- LightNodeRegistryResponse requires sender to be Genesis or active Super; closes registry pollution path.

REGRESSION TESTS (18 added)
- qnet-state: 6 tests covering Transfer/Swap validate + apply mismatch rejection with victim-balance-untouched assertions.
- qnet-consensus: 6 tests covering decode_zstd_bounded (below cap / exact cap / above cap / 1000x bomb / malformed / empty).
- qnet-integration: 6 tests covering is_legacy_genesis_node both forms, genesis_ip_for_node_id normalisation, decompress_zstd_bounded.

Scalability: every gate is O(1) or O(genesis-size=5); independent of super-node count. Identity registry capacity 50K, parking_lot::RwLock wait-free reads.

Build: cargo check --workspace pass; cargo build --release --workspace pass (18m 55s, LTO=fat); 18/18 v17 regression tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
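The Transfer/Swap fix described above comes down to one invariant: the `from` named in the payload must be the same identity the signature was checked against. A minimal sketch of that check, assuming simplified hypothetical types (`Transaction`, `Transfer`, `TxError`, `make_tx` and the `HashMap` balance table are illustrative stand-ins, not the actual definitions in transaction.rs):

```rust
use std::collections::HashMap;

// Hypothetical, simplified stand-ins for the real transaction types.
#[derive(Debug, PartialEq)]
enum TxError {
    SenderMismatch,
    InsufficientBalance,
}

struct Transfer {
    from: String,
    to: String,
    amount: u64,
}

struct Transaction {
    from: String, // identity the signature was verified against
    payload: Transfer,
}

/// Reject any payload that names a sender other than the signer.
fn validate(tx: &Transaction) -> Result<(), TxError> {
    if tx.payload.from != tx.from {
        return Err(TxError::SenderMismatch);
    }
    Ok(())
}

/// Re-check at apply time so a caller that skipped `validate`
/// still cannot debit a third party.
fn apply_to_state(balances: &mut HashMap<String, u64>, tx: &Transaction) -> Result<(), TxError> {
    validate(tx)?;
    let p = &tx.payload;
    let available = balances.get(&p.from).copied().unwrap_or(0);
    if available < p.amount {
        return Err(TxError::InsufficientBalance);
    }
    balances.insert(p.from.clone(), available - p.amount);
    *balances.entry(p.to.clone()).or_insert(0) += p.amount;
    Ok(())
}

fn make_tx(signer: &str, payload_from: &str, to: &str, amount: u64) -> Transaction {
    Transaction {
        from: signer.to_string(),
        payload: Transfer { from: payload_from.to_string(), to: to.to_string(), amount },
    }
}

fn main() {
    let mut balances = HashMap::new();
    balances.insert("victim".to_string(), 100u64);

    // Attacker signs with their own key but names the victim as payer.
    let forged = make_tx("attacker", "victim", "attacker", 100);
    assert_eq!(apply_to_state(&mut balances, &forged), Err(TxError::SenderMismatch));
    assert_eq!(balances["victim"], 100); // victim balance untouched

    let honest = make_tx("victim", "victim", "bob", 40);
    assert_eq!(apply_to_state(&mut balances, &honest), Ok(()));
    assert_eq!(balances["victim"], 60);
}
```

Repeating the check inside `apply_to_state` mirrors the layered defence the commit message describes: the state mutation stays safe even if an earlier validation layer is bypassed.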
1 parent eee6001 commit 346445f

10 files changed

Lines changed: 1659 additions & 330 deletions

core/qnet-consensus/src/consensus_crypto.rs

Lines changed: 206 additions & 22 deletions
@@ -453,11 +453,23 @@ async fn verify_compact_binary_signature(
         }
     };

-    // Decompress zstd
-    let decompressed = match zstd::decode_all(binary_data.as_slice()) {
+    // Decompress zstd with a HARD output ceiling.
+    //
+    // `zstd::decode_all` allocates whatever the stream demands; an adversarial
+    // input ~1000× its on-the-wire size could OOM every receiver. Honest
+    // compact_bin signatures are ~2.6 KB; the largest plausible variant
+    // (`hybrid_bin` with embedded certificate) is ~5 KB. A 256 KB ceiling
+    // is ~50× the largest legitimate payload — generous head-room for future
+    // protocol additions while making decompression-bomb DoS impossible
+    // in this code path.
+    const MAX_COMPACT_BIN_DECOMPRESSED: usize = 256 * 1024;
+    let decompressed = match decode_zstd_bounded(binary_data.as_slice(), MAX_COMPACT_BIN_DECOMPRESSED) {
         Ok(data) => data,
         Err(e) => {
-            println!("[ERR][CONSENSUS_CRYPTO] compact_bin_decompress_failed err={}", e);
+            println!(
+                "[ERR][CONSENSUS_CRYPTO] compact_bin_decompress_failed input_bytes={} err={}",
+                binary_data.len(), e
+            );
             return false;
         }
     };
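The hard output ceiling in this hunk relies on reading at most one byte past the cap, so that an output of exactly the cap size can be told apart from one that overflows it. A stdlib-only sketch of that idea, assuming a hypothetical helper `read_bounded` that works on any `Read` source (the commit's own helper wraps a zstd decoder instead; an endless `std::io::repeat` stream stands in for a decompression bomb here):

```rust
use std::io::{self, Read};

/// Read at most `max_bytes` from `reader`; fail if the stream yields more.
/// `read_bounded` is a hypothetical name used only for this illustration.
fn read_bounded<R: Read>(reader: R, max_bytes: usize) -> io::Result<Vec<u8>> {
    let mut out = Vec::new();
    // Allow one byte past the cap: without it, "exactly at cap" and
    // "over cap" would both stop at `max_bytes` and look identical.
    let limit = (max_bytes as u64).saturating_add(1);
    reader.take(limit).read_to_end(&mut out)?;
    if out.len() > max_bytes {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "output exceeds cap",
        ));
    }
    Ok(out)
}

fn main() {
    // An endless stream of zeros never allocates more than cap + 1 bytes.
    assert!(read_bounded(io::repeat(0u8), 8 * 1024).is_err());

    // A payload exactly at the cap is accepted.
    let exact = vec![0x55u8; 1024];
    assert_eq!(read_bounded(exact.as_slice(), 1024).unwrap().len(), 1024);

    // One byte over the cap is rejected with InvalidData.
    let over = vec![0xAAu8; 1025];
    let err = read_bounded(over.as_slice(), 1024).unwrap_err();
    assert_eq!(err.kind(), io::ErrorKind::InvalidData);
}
```

Because the limit is applied to the reader rather than checked after the fact, memory use is bounded by the cap regardless of how much the source is willing to produce.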
@@ -1192,39 +1204,99 @@ async fn verify_with_real_dilithium(
     let signed_message_bytes = &signature_bytes[4..4 + signed_len];
     let public_key_bytes = &signature_bytes[pk_start..pk_start + pk_len];

-    // v14.8: Two-tier PK binding.
-    // Tier 1 (HARD): if node_id is registered, extracted PK must match. A
-    // mismatch is a hostile self-attested PK — reject hard. This is the
-    // only check that can prevent a compromised peer from pretending to
-    // be a different node_id (whose real PK is in the registry).
-    // Tier 2 (SOFT): if node_id is not yet registered, accept but do NOT
-    // auto-register from here. Auto-registering from the verify path
-    // would re-open the race window that register_consensus_pk_from_chain
-    // exists to close (binding must come from finalised chain state,
-    // not from the first inbound signature we happen to see). The
-    // integration layer installs bindings either via genesis anchors or
-    // during NodeRegistration/NodeReactivation TX application.
+    // ─────────────────────────────────────────────────────────────────────
+    // Identity → public-key binding policy (three tiers)
+    // ─────────────────────────────────────────────────────────────────────
+    //
+    // Tier 1 (HARD MATCH): registry has a binding for `node_id` and the
+    // extracted PK matches it. The signature is identity-bound.
+    //
+    // Tier 2 (HARD REJECT — non-match): registry has a binding for `node_id`
+    // and the extracted PK does NOT match. This is a hostile identity
+    // claim — a peer holding their own valid Dilithium3 keypair attempting
+    // to spoof an already-bound identity. Reject. There is NO legitimate
+    // reason to accept a different PK for an identity once the registry
+    // has locked one in (registry entries are immutable for the process
+    // lifetime; see register_consensus_pk_from_chain immutability check).
+    //
+    // Tier 3 (POLICY-DEPENDENT — no binding):
+    // * If `node_id` matches a Genesis pattern (`"genesis_node_*"`):
+    // HARD REJECT. Genesis identities MUST be in the registry before any
+    // inbound signature is accepted. They are populated either by
+    // (1) self-registration at boot (initialize_wallet_identity calls
+    // register_consensus_pk_from_chain with the local keypair
+    // BEFORE P2P comes up); or
+    // (2) the genesis anchor file shipped by the operator
+    // (install_genesis_anchors_at_startup, then anchored PKs are
+    // embedded into the genesis NodeRegistration TX which feeds
+    // cache_node_registrations_from_transactions_with_dashmap →
+    // register_consensus_pk_from_chain).
+    // Accepting a first-seen Genesis PK here would lock the identity to
+    // whatever PK the network sees first, opening the squat-on-bootstrap
+    // window that the anchor system exists to close.
+    // * Otherwise (Super-node, Light-node, generic identity):
+    // Accept (TOFV) and continue to math verification. Super-node
+    // identities reach steady-state binding via signed
+    // `NodeRegistration` TX (proof-of-ownership in the TX payload),
+    // which is applied to chain state and mirrored into this registry
+    // before any cross-restart binding is needed. The TOFV path lets
+    // a freshly-joined Super-node's first announcement be accepted in
+    // the small window between its TX broadcast and chain finality.
     //
-    // NOTE: the Dilithium3 signature itself is cryptographically verified
-    // further down under `dilithium3::open` regardless of which tier fired —
-    // this block only governs the identity → key binding policy, not the
-    // mathematical validity of the signature.
+    // NOTE on math: regardless of tier, the Dilithium3 signature is
+    // cryptographically verified under `dilithium3::open` further down. This
+    // tier block only governs the identity → key binding decision, not the
+    // mathematical validity of the signature itself.
+    //
+    // SCALABILITY: registry uses parking_lot::RwLock + HashMap with capacity
+    // 50K — supports tens of thousands of Super-nodes. Read path is
+    // wait-free; the write path runs exactly once per identity registration
+    // (one-shot per node lifetime). The genesis prefix check is a fixed-cost
+    // string comparison — O(1) regardless of network size.
     {
         let registry = CONSENSUS_PK_REGISTRY.read();
         match registry.get(node_id) {
             Some(registered_pk) if registered_pk == public_key_bytes => {
-                // ok, bound and matches
+                // Tier 1: bound and matches — proceed to math verification.
             }
             Some(registered_pk) => {
+                // Tier 2: bound, mismatch — hostile identity claim. Hard reject.
                 eprintln!("[ERR][CONSENSUS] pk_mismatch node={} registered={}.. extracted={}..",
                     node_id,
                     hex::encode(&registered_pk[..8]),
                     hex::encode(&public_key_bytes[..8]));
                 return false;
             }
             None => {
-                // First-seen — allowed, but logged so integration can audit
-                // which identities slipped past chain-state binding.
+                // Tier 3: policy depends on identity class.
+                if node_id.starts_with("genesis_node_") {
+                    // Genesis identity with no registry binding. The boot
+                    // sequence of every honest node guarantees a binding is
+                    // installed BEFORE P2P traffic is processed, so an
+                    // unbound genesis claim arriving here is either:
+                    // (a) a race against a not-yet-completed self-register
+                    // (transient, will resolve on retry/regossip), or
+                    // (b) a squat attempt from a non-genesis peer.
+                    // Both cases are handled identically by hard-rejecting:
+                    // case (a) self-heals because the legitimate sender's
+                    // gossip continues; case (b) is the attack we exist to
+                    // block.
+                    let extracted_prefix = if public_key_bytes.len() >= 8 {
+                        hex::encode(&public_key_bytes[..8])
+                    } else {
+                        String::new()
+                    };
+                    eprintln!(
+                        "[CRIT][CONSENSUS] genesis_pk_first_seen_rejected node={} extracted={}.. \
+                         action=hard_reject hint=anchor_or_self_register_must_run_before_p2p",
+                        node_id, extracted_prefix
+                    );
+                    return false;
+                }
+                // Non-genesis identity (Super-node, Light-node, etc.). TOFV
+                // is acceptable; chain-state will lock the canonical binding
+                // shortly via NodeRegistration TX application, after which
+                // any future mismatch is caught by Tier 2 above.
                 if public_key_bytes.len() >= 8 {
                     println!("[WARN][CONSENSUS] pk_first_seen node={} extracted={}..",
                         node_id, hex::encode(&public_key_bytes[..8]));
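The three-tier binding decision in this hunk can be condensed into a small classification function. The sketch below is an illustration only: `Binding`, `classify` and `may_proceed` are hypothetical names, the node ids are made up, and `std::sync::RwLock` is used in place of the `parking_lot::RwLock` named in the diff so the example has no external dependencies.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

// Hypothetical outcome type for the identity-to-key binding decision.
#[derive(Debug, PartialEq)]
enum Binding {
    BoundMatch,     // Tier 1: registry entry exists and matches
    BoundMismatch,  // Tier 2: registry entry exists, key differs (reject)
    UnboundGenesis, // Tier 3: genesis id with no entry (reject)
    FirstSeen,      // Tier 3: non-genesis id with no entry (accept for now)
}

fn classify(
    registry: &RwLock<HashMap<String, Vec<u8>>>,
    node_id: &str,
    extracted_pk: &[u8],
) -> Binding {
    let reg = registry.read().expect("registry lock poisoned");
    match reg.get(node_id) {
        Some(registered) if registered.as_slice() == extracted_pk => Binding::BoundMatch,
        Some(_) => Binding::BoundMismatch,
        None if node_id.starts_with("genesis_node_") => Binding::UnboundGenesis,
        None => Binding::FirstSeen,
    }
}

/// Only these outcomes continue on to signature verification.
fn may_proceed(binding: &Binding) -> bool {
    matches!(binding, Binding::BoundMatch | Binding::FirstSeen)
}

fn main() {
    let mut map = HashMap::new();
    map.insert("super_node_7".to_string(), vec![1u8, 2, 3]);
    let registry = RwLock::new(map);

    assert_eq!(classify(&registry, "super_node_7", &[1, 2, 3]), Binding::BoundMatch);
    assert_eq!(classify(&registry, "super_node_7", &[9, 9, 9]), Binding::BoundMismatch);
    assert_eq!(classify(&registry, "genesis_node_001", &[1, 2, 3]), Binding::UnboundGenesis);
    assert_eq!(classify(&registry, "super_node_8", &[4, 5, 6]), Binding::FirstSeen);
    assert!(!may_proceed(&Binding::UnboundGenesis));
}
```

Note that a passing classification only settles which key the identity is allowed to use; as the diff's own comment states, the signature itself is still verified afterwards regardless of tier.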
@@ -1297,3 +1369,115 @@ fn ct_eq(a: &[u8], b: &[u8]) -> bool {
     // Use black_box to prevent compiler from optimising the loop away
     std::hint::black_box(diff) == 0
 }
+
+/// Decompress zstd bytes with a hard output ceiling.
+///
+/// Used by every signature-format verifier on the inbound P2P path so a
+/// hostile peer cannot weaponise zstd's typical-thousand-fold expansion
+/// ratio into an OOM. The streaming `Read::take` adapter caps the total
+/// bytes read from the decoder; a payload that decodes to more than
+/// `max_output_bytes` short-circuits with `Err(InvalidData)` before the
+/// inner buffer is allowed to grow further.
+///
+/// Scalability: O(N) in `output_size`. The pre-sized `Vec` capacity is
+/// 1 MiB or `max_output_bytes` (whichever is smaller), so small-but-
+/// frequent verifications do not pay a full max-size allocation each call.
+pub(crate) fn decode_zstd_bounded(input: &[u8], max_output_bytes: usize) -> std::io::Result<Vec<u8>> {
+    use std::io::Read;
+    let mut decoder = zstd::Decoder::new(input)?;
+    let initial_cap = max_output_bytes.min(1 * 1024 * 1024);
+    let mut output: Vec<u8> = Vec::with_capacity(initial_cap);
+    let cap_plus_one = max_output_bytes.saturating_add(1) as u64;
+    let mut bounded = decoder.by_ref().take(cap_plus_one);
+    let _ = bounded.read_to_end(&mut output)?;
+    if output.len() > max_output_bytes {
+        return Err(std::io::Error::new(
+            std::io::ErrorKind::InvalidData,
+            format!(
+                "decompressed_size_exceeds_cap output_bytes={} cap_bytes={}",
+                output.len(), max_output_bytes
+            ),
+        ));
+    }
+    Ok(output)
+}
+
+// ════════════════════════════════════════════════════════════════════════════
+// REGRESSION TESTS — Fix #20 (bounded zstd) + Tier-3 binding policy
+// ════════════════════════════════════════════════════════════════════════════
+#[cfg(test)]
+mod tests_v17_security {
+    use super::*;
+
+    fn zstd_compress_for_test(input: &[u8]) -> Vec<u8> {
+        zstd::encode_all(input, 1).expect("zstd encode for test must succeed")
+    }
+
+    /// Fix #20: decoded bytes equal input on a payload below the cap.
+    #[test]
+    fn decode_zstd_bounded_accepts_payload_below_cap() {
+        let original = b"compact_bin signature test payload".to_vec();
+        let compressed = zstd_compress_for_test(&original);
+        let decoded = decode_zstd_bounded(&compressed, 1024).expect("below cap must decode");
+        assert_eq!(decoded, original);
+    }
+
+    /// Fix #20: an exact-cap payload is accepted; the implementation's
+    /// `cap_plus_one` reader plus `<= cap` post-check allow equality.
+    #[test]
+    fn decode_zstd_bounded_accepts_payload_at_exact_cap() {
+        let original = vec![0x55u8; 5 * 1024];
+        let compressed = zstd_compress_for_test(&original);
+        let decoded = decode_zstd_bounded(&compressed, original.len())
+            .expect("exact-size must decode");
+        assert_eq!(decoded.len(), original.len());
+    }
+
+    /// Fix #20: decoded bytes one over the cap MUST yield InvalidData.
+    /// Regression here re-opens the bomb class on the consensus layer.
+    #[test]
+    fn decode_zstd_bounded_rejects_payload_above_cap() {
+        let original = vec![0xAAu8; 2048];
+        let compressed = zstd_compress_for_test(&original);
+        let result = decode_zstd_bounded(&compressed, original.len() - 1);
+        assert!(result.is_err(), "must reject above-cap output");
+        let err = result.err().unwrap();
+        assert_eq!(err.kind(), std::io::ErrorKind::InvalidData);
+        assert!(err.to_string().contains("decompressed_size_exceeds_cap"));
+    }
+
+    /// Fix #20: classic decompression bomb — small input, huge output.
+    /// The cap is on OUTPUT bytes, not input bytes; a small input that
+    /// expands far past the cap MUST be rejected even though the input
+    /// alone is well within any reasonable network packet size.
+    #[test]
+    fn decode_zstd_bounded_rejects_high_ratio_bomb() {
+        // 512 KB of zeros compresses to a few KB — but exceeds an 8 KB
+        // output cap by ~64×. Real-world bombs hit 1000× ratios.
+        let original = vec![0u8; 512 * 1024];
+        let compressed = zstd_compress_for_test(&original);
+        assert!(compressed.len() < 8 * 1024,
+            "fixture sanity: compressed payload must be small relative to original");
+        let result = decode_zstd_bounded(&compressed, 8 * 1024);
+        assert!(result.is_err(), "decompression bomb must be rejected on output cap");
+    }
+
+    /// Fix #20: malformed zstd input MUST return Err (and not panic) so a
+    /// hostile peer cannot crash the verifier with a bogus stream.
+    #[test]
+    fn decode_zstd_bounded_rejects_malformed_input() {
+        let garbage: Vec<u8> = (0..256).map(|i| (i * 31 + 17) as u8).collect();
+        let result = decode_zstd_bounded(&garbage, 4096);
+        assert!(result.is_err(), "malformed zstd must error gracefully");
+    }
+
+    /// Fix #20: empty payload decodes to empty output without error.
+    /// Edge case ensures the bounded reader does not regress to a
+    /// "minimum 1 byte" requirement.
+    #[test]
+    fn decode_zstd_bounded_empty_payload_round_trip() {
+        let compressed = zstd_compress_for_test(&[]);
+        let decoded = decode_zstd_bounded(&compressed, 4096).expect("empty must decode");
+        assert!(decoded.is_empty());
+    }
+}