feat: capability-aware peer selection with exponential-backoff cooldown #108
xdustinface wants to merge 10 commits into v0.42-dev from
Conversation
* chore: add Manki AI code review configuration
* fix: pass `claude_code_oauth_token` to Manki action
* feat: add session outcome fields to `PeerReputation`. Adds `last_success`, `last_tried`, and `consecutive_failures` to `PeerReputation`, plus a `record_connection_failure` method on `PeerReputationManager`. All new fields use `#[serde(default)]` so existing `reputations.json` files load without migration. `record_connection_attempt` now sets `last_tried` and `record_successful_connection` sets `last_success` and resets `consecutive_failures`.
* test: cover `PeerReputation` session outcome transitions. Adds unit tests for default values, `last_tried` on attempt, `last_success` plus `consecutive_failures` reset on success, failure streak increment preserving `last_success`, and legacy `reputations.json` decoding with missing fields.
* feat: bump `AddrV2.time` on successful handshake. Adds `AddrV2Handler::mark_seen` to refresh the stored timestamp for a directly observed peer, preserving existing services for known entries and inserting a fresh entry otherwise. `connect_to_peer` now calls `mark_seen` after a successful handshake so the `peers.dat` time reflects first-hand observation instead of gossip.
* feat: track peer connection outcomes in network manager. Calls `record_connection_failure` on both the TCP connect failure and the handshake failure paths in `PeerNetworkManager::connect_to_peer`, so the `consecutive_failures` streak reflects every unsuccessful attempt.
* refactor: drop backward-compat shims from `PeerReputation`. Removes `#[serde(default)]` on the new session outcome fields and the legacy-JSON load test. Backward compatibility of `reputations.json` across versions is no longer a requirement, so the shims and test are dead weight.
* refactor: thread peer-advertised services into `mark_seen`
* docs: clarify `last_connection` vs `last_tried` on `PeerReputation`
* refactor: extract `make_addr_message` helper in `addrv2`
* refactor: defensively set `last_tried` in `record_connection_failure`
* refactor: add atomic `record_failure_with_penalty` and use at failure sites
* chore: apply `cargo fmt`
* refactor: clamp `consecutive_failures` on deserialization
* refactor: extract `apply_score_change` and always update `last_tried` on failure
* test: cover `record_failure_with_penalty` directly
* refactor: clamp `consecutive_failures` at runtime and consolidate failure-field updates. Extract private `record_failure_fields` that applies `last_tried = now` and `consecutive_failures.saturating_add(1).min(MAX_CONSECUTIVE_FAILURES)`. Both `record_connection_failure` and `record_failure_with_penalty` now delegate to it, eliminating duplicated mutations and capping the in-memory streak at the same 1000 limit enforced by the deserializer.
* test: cover `consecutive_failures` clamp, `last_success` preservation, `mark_seen` eviction. Assert `last_success` is unchanged after `record_failure_with_penalty`; deserialize a `PeerReputation` with `consecutive_failures: 99999` and assert it clamps to `MAX_CONSECUTIVE_FAILURES`; fill `AddrV2Handler` to capacity and assert `mark_seen` stays bounded and includes the new entry.
* refactor: remove unused `record_connection_failure`
* test: stabilize eviction test and cover runtime `consecutive_failures` saturation
* refactor: document and assert non-negative contract on `record_failure_with_penalty`
* test: cover `last_tried` preservation on success and update on failure
* fix: clamp negative `score_change` in `record_failure_with_penalty`
* test: cover happy-path attempt to success lifecycle
* refactor: tighten `clamp_future_system_time` bounds and enforce load invariants. Add a 30-day lower bound to `clamp_future_system_time` so stale or corrupted timestamps (including epoch 0) are discarded on load, in addition to future ones. Add `PeerReputation::normalize_after_load` and call it from the storage load path. It resets `consecutive_failures` to 0 whenever `last_tried` is `None`, preventing the inconsistent state where a non-zero failure streak has no temporal anchor.
* test: cover `clamp_future_system_time` edge cases. Add three tests: future timestamp rejected, epoch-zero rejected (exercising the new lower bound), and recent-past timestamp preserved.
* docs: clarify zero-`score_change` contract on `record_failure_with_penalty`. A value of 0 is a deliberate no-op for the reputation score but still records the failure counter and timestamp, which is useful for failures that should be tracked without contributing toward a ban.
* fix: use `checked_sub` in `clamp_future_system_time` to avoid panic on broken clocks
* test: cover `normalize_after_load` via storage round-trip
* fix: use `checked_add` in `clamp_future_system_time` to avoid panic on far-future clocks
* test: also cover stale-timestamp path in `normalize_after_load` round-trip
* refactor: drop 30-day stale-timestamp floor from `clamp_future_system_time`. Remove `TIMESTAMP_MAX_AGE` and the `floor` computation that rejected timestamps older than 30 days. The future-timestamp guard (10-second tolerance) is the only meaningful constraint (see the sketch after this list). Update the `normalize_after_load` doc comment to drop the "stale" reference.
* test: remove obsolete stale-timestamp tests and consolidate `clamp_future_system_time` coverage. Delete `test_normalize_after_load_via_storage_round_trip_stale` (tested the removed stale-floor path). Merge the three `test_clamp_future_system_time_*` tests into a single `test_clamp_future_system_time` covering future rejection and recent-past acceptance.
* fix: `mark_seen` now overwrites `services` on existing entries. Since round 1 the caller passes the actual handshake-negotiated services, so preserving the gossip-sourced value was inverted. The handshake-observed value is authoritative and is now written on both new and existing entries. Rename `test_mark_seen_bumps_time_and_preserves_services` to `test_mark_seen_bumps_time_and_updates_services` and update its assertion to expect the handshake services, not the original gossip services.
* test: cover positive `normalize_after_load` branch. Add `test_normalize_after_load_preserves_failures_when_last_tried_valid` to assert that a valid (non-future) `last_tried` and a non-zero `consecutive_failures` are both preserved through the load round-trip, complementing the existing reset-path test.
* refactor: move `record_failure_fields` to `impl PeerReputation`
* refactor: move `apply_score_change` to `impl PeerReputation`
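For orientation, a minimal sketch of the future-timestamp guard these commits converge on, assuming a standalone helper with the 10-second tolerance named in the log; the exact signature and return type in the crate may differ:

```rust
use std::time::{Duration, SystemTime};

/// Sketch only: discard a stored timestamp that lies further in the future
/// than a small skew tolerance, using checked arithmetic so a wildly broken
/// clock cannot panic (the `checked_sub`/`checked_add` fixes listed above).
fn clamp_future_system_time(ts: SystemTime, now: SystemTime) -> Option<SystemTime> {
    // Assumed 10-second tolerance, per the commit message.
    let tolerance = Duration::from_secs(10);
    match now.checked_add(tolerance) {
        Some(limit) if ts <= limit => Some(ts), // plausible timestamp: keep it
        _ => None,                              // future (or overflow): discard
    }
}
```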
Introduces `RequiredServices` in `dash-spv/src/network/required_services.rs`. The type wraps a `ServiceFlags` set derived from `ClientConfig`: `NETWORK` always, `COMPACT_FILTERS` when filter sync is enabled, `BLOOM` when mempool tracking uses the BIP37 strategy. `NODE_HEADERS_COMPRESSED` is intentionally excluded since it is a preferred-peer bonus rather than a hard requirement. `PeerNetworkManager` now builds `RequiredServices` once from `ClientConfig` at construction and threads it into every `select_best_peers` call so peers whose advertised `AddrV2.services` don't cover the required set are hard-filtered out. If no capable peers survive, selection returns empty rather than picking an incapable peer, preserving the maintenance loop's DNS-fallback path. The `ReputationAware` trait is scoped to `pub(crate)` since the private `RequiredServices` parameter is internal to the crate.
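A hedged sketch of the derivation this paragraph describes; the `ClientConfig` accessor for the mempool strategy is an assumption, and imports for `ServiceFlags` and `ClientConfig` are elided:

```rust
/// Sketch of the capability mask; wraps the bitcoin-style `ServiceFlags`.
pub(crate) struct RequiredServices(ServiceFlags);

impl RequiredServices {
    pub(crate) fn from_config(config: &ClientConfig) -> Self {
        let mut flags = ServiceFlags::NETWORK; // always required
        if config.enable_filters {
            flags |= ServiceFlags::COMPACT_FILTERS; // filter sync enabled
        }
        if config.mempool_uses_bloom_strategy() { // hypothetical accessor
            flags |= ServiceFlags::BLOOM; // BIP37 mempool tracking
        }
        // NODE_HEADERS_COMPRESSED is deliberately not required: it is only a
        // preferred-peer bonus in the composite score.
        RequiredServices(flags)
    }

    /// True when a peer's advertised services cover every required flag.
    pub(crate) fn satisfied_by(&self, advertised: ServiceFlags) -> bool {
        advertised.has(self.0)
    }
}
```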
Adds `cooldown()` and `in_cooldown()` helpers on `PeerReputation` that derive a backoff window from `consecutive_failures` (30s, 1m, 5m, 30m, 2h) anchored at `last_tried`. The per-step durations live in a module-scope `COOLDOWN_STEPS` constant. `select_best_peers` now also skips any peer still inside its cooldown window, closing the loop between persisted session outcomes and future reconnection attempts. Peers with a discarded `last_tried` (e.g. after `clamp_future_system_time` + `normalize_after_load`) are treated as not-in-cooldown so they become eligible again immediately.
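A minimal sketch of the backoff mapping described here, assuming the field types shown (the real `PeerReputation` has more fields, and the step indexing may differ):

```rust
use std::time::{Duration, SystemTime};

// Per-step backoff windows from the description: 30s, 1m, 5m, 30m, 2h.
const COOLDOWN_STEPS: [Duration; 5] = [
    Duration::from_secs(30),
    Duration::from_secs(60),
    Duration::from_secs(5 * 60),
    Duration::from_secs(30 * 60),
    Duration::from_secs(2 * 60 * 60),
];

struct PeerReputation {
    consecutive_failures: u32,
    last_tried: Option<SystemTime>,
    // ...remaining fields elided
}

impl PeerReputation {
    /// Backoff window keyed on the failure streak, saturating at the last step.
    fn cooldown(&self) -> Duration {
        if self.consecutive_failures == 0 {
            return Duration::ZERO;
        }
        let idx = (self.consecutive_failures as usize - 1).min(COOLDOWN_STEPS.len() - 1);
        COOLDOWN_STEPS[idx]
    }

    /// In cooldown while `now` is still inside the window anchored at `last_tried`.
    /// A missing (discarded) `last_tried` means immediately eligible again.
    fn in_cooldown(&self, now: SystemTime) -> bool {
        match self.last_tried {
            Some(last) => last
                .checked_add(self.cooldown())
                .map_or(false, |until| now < until),
            None => false,
        }
    }
}
```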
Replaces `select_best_peers`'s plain reputation-score sort with a composite score combining: the existing reputation score, the persisted success ratio (`successful_connections` / `connection_attempts`), a gossip-staleness penalty derived from `AddrV2.time`, and a preferred-services bonus for peers advertising `NODE_HEADERS_COMPRESSED`. Lower composite score is better, matching the reputation convention. Tunables live in a `scoring_weights` module so each factor can be adjusted without hunting for magic numbers. The staleness penalty is capped so a very old gossip entry cannot dominate the score. Tests exercise every factor in isolation (reputation ordering, success ratio, staleness, compressed-headers bonus), tie handling, the `count=0` edge case, the banned-peer exclusion, and the capability-aware required-services filter.
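A sketch of how such a composite score can be assembled. It references the `PeerReputation` and `AddrV2Message` fields named in this PR, assumes an integer `score` field, and uses placeholder weights (only the 15-point preferred-services bonus is cited later in the review); it also folds in the `addr.time` and `success_ratio` clamps adopted in the later review round.

```rust
mod scoring_weights {
    // Placeholder values; only PREFERRED_SERVICES_BONUS (15.0) is cited in the review.
    pub const SUCCESS_RATIO_WEIGHT: f64 = 20.0;
    pub const STALENESS_SECS_PER_POINT: f64 = 3600.0;
    pub const MAX_STALENESS_PENALTY: f64 = 50.0;
    pub const PREFERRED_SERVICES_BONUS: f64 = 15.0;
}

/// Lower is better, matching the reputation convention.
fn composite_score(rep: &PeerReputation, addr: &AddrV2Message, now_epoch_secs: u64) -> f64 {
    let mut score = rep.score as f64;

    // Persisted success ratio, clamped so diverging counters cannot run away.
    if rep.connection_attempts > 0 {
        let ratio = (rep.successful_connections as f64 / rep.connection_attempts as f64)
            .clamp(0.0, 1.0);
        score -= ratio * scoring_weights::SUCCESS_RATIO_WEIGHT;
    }

    // Gossip-staleness penalty: clamp a peer-controlled future time to now,
    // and cap the penalty so one very old entry cannot dominate the score.
    let addr_time = (addr.time as u64).min(now_epoch_secs);
    let staleness_secs = now_epoch_secs.saturating_sub(addr_time) as f64;
    score += (staleness_secs / scoring_weights::STALENESS_SECS_PER_POINT)
        .min(scoring_weights::MAX_STALENESS_PENALTY);

    // Preferred-services bonus for compressed-headers support.
    if addr.services.has(ServiceFlags::NODE_HEADERS_COMPRESSED) {
        score -= scoring_weights::PREFERRED_SERVICES_BONUS;
    }

    score
}
```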
Manki — Review complete · Planner · Review — 17 findings · Judge — 15 kept · 0 dropped (109s)
Three reviewers independently converged on the same lock-width bug: reputations.write() is held through decay calls, scoring, and the final sort — blocking concurrent record_failure_with_penalty and update_reputation calls for the full duration of every selection tick. That's the real issue here; the rest is largely test-coverage gaps against acceptance criteria that explicitly call for comprehensive coverage.
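Illustrative only: the shape of the narrowing the reviewers are asking for, with the lock held just long enough to snapshot per-peer state and the sort done outside it (names and the map's value type are simplified, not the PR's actual code):

```rust
use std::collections::HashMap;
use std::net::SocketAddr;
use tokio::sync::RwLock;

async fn select_candidates(
    reputations: &RwLock<HashMap<SocketAddr, f64>>,
    candidates: &[SocketAddr],
    count: usize,
) -> Vec<SocketAddr> {
    // Gather phase: hold the write lock only while snapshotting (and, in the
    // real code, applying decay), then let the guard drop.
    let mut scored: Vec<(SocketAddr, f64)> = {
        let map = reputations.write().await;
        candidates
            .iter()
            .map(|addr| (*addr, map.get(addr).copied().unwrap_or(0.0)))
            .collect()
    }; // guard dropped here, before any sorting

    // Sorting and truncation no longer block concurrent
    // record_failure_with_penalty / update_reputation callers.
    scored.sort_by(|a, b| a.1.partial_cmp(&b.1).unwrap_or(std::cmp::Ordering::Equal));
    scored.into_iter().take(count).map(|(addr, _)| addr).collect()
}
```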
📊 15 findings (8 suggestion, 7 nit) · 587 lines · 371s
Review stats
{
"model": "claude-sonnet-4-6",
"reviewTimeMs": 370788,
"diffLines": 587,
"diffAdditions": 569,
"diffDeletions": 18,
"filesReviewed": 5,
"agents": [
"Security & Safety",
"Architecture & Design",
"Correctness & Logic",
"Testing & Coverage",
"Performance & Efficiency"
],
"findingsRaw": 17,
"findingsKept": 15,
"findingsDropped": 2,
"severity": {
"required": 0,
"suggestion": 8,
"nit": 7
},
"verdict": "REQUEST_CHANGES",
"prNumber": 108,
"commitSha": "259b120cf0c0dfa816b5de2c7ca07292b04b06e9",
"agentMetrics": [
{
"name": "Security & Safety",
"findingsRaw": 3,
"findingsKept": 3,
"responseLength": 2913
},
{
"name": "Architecture & Design",
"findingsRaw": 4,
"findingsKept": 4,
"responseLength": 4173
},
{
"name": "Correctness & Logic",
"findingsRaw": 3,
"findingsKept": 3,
"responseLength": 2633
},
{
"name": "Testing & Coverage",
"findingsRaw": 4,
"findingsKept": 4,
"responseLength": 4848
},
{
"name": "Performance & Efficiency",
"findingsRaw": 3,
"findingsKept": 3,
"responseLength": 2627
}
],
"judgeMetrics": {
"confidenceDistribution": {
"high": 9,
"medium": 6,
"low": 0
},
"severityChanges": 15,
"mergedDuplicates": 2
},
"fileMetrics": {
"fileTypes": {
".rs": 5
},
"findingsPerFile": {
"dash-spv/src/network/reputation.rs": 9,
"dash-spv/src/network/required_services.rs": 1,
"dash-spv/src/network/reputation_tests.rs": 5
}
},
"reviewerModel": "claude-sonnet-4-6",
"judgeModel": "claude-sonnet-4-6"
}

    let rep = rep_with_streak(3, None);
    assert!(!rep.in_cooldown(SystemTime::now()));
}
💡 Suggestion [medium confidence]: Missing test: select_best_peers with empty available_peers list
There is no test for select_best_peers(required, vec![], n) where the input peer slice is empty. The acceptance criteria call for comprehensive edge-case coverage, and this is a distinct code path (the scoring loop never executes, scored is empty). While the code is correct today, a regression in the early-exit or capacity-allocation logic could silently break this case.
Suggested fix
#[tokio::test]
async fn test_select_empty_input_returns_empty() {
    let manager = PeerReputationManager::new();
    let selected = manager.select_best_peers(network_required(), vec![], 5).await;
    assert!(selected.is_empty());
}
AI context
{
"file": "dash-spv/src/network/reputation_tests.rs",
"line": 725,
"severity": "suggestion",
"confidence": "medium",
"flaggedBy": [
"Testing & Coverage"
],
"title": "Missing test: select_best_peers with empty available_peers list",
"fix": "#[tokio::test]\nasync fn test_select_empty_input_returns_empty() {\n let manager = PeerReputationManager::new();\n let selected = manager.select_best_peers(network_required(), vec![], 5).await;\n "
}

        .select_best_peers(network_required(), vec![stale.clone(), fresh.clone()], 2)
        .await;
    assert_eq!(selected, vec![fresh_addr, stale_addr]);
}
💡 Suggestion [medium confidence]: composite_score: future addr.time guard (saturating_sub) is untested
composite_score uses now_epoch_secs.saturating_sub(addr.time as u64) to prevent underflow when a peer's gossip timestamp is in the future. Every ranking test supplies a past or equal time, so the guard path is never exercised. A future-stamped peer should receive zero staleness penalty and therefore rank ahead of stale peers, but this is not verified. If the saturating_sub were accidentally removed or the cast changed to signed, negative staleness would silently mis-rank peers.
Suggested fix
}

#[tokio::test]
async fn test_select_future_addr_time_gets_no_staleness_penalty() {
    let manager = PeerReputationManager::new();
    let now = now_epoch_secs();
    // addr.time slightly ahead of now — should yield 0 staleness penalty
    let future = addr_msg("10.10.10.1", 9999, ServiceFlags::NETWORK, now.saturating_add(3600));
    let stale = addr_msg("10.10.10.2", 9999, ServiceFlags::NETWORK, now.saturating_sub(7200));
    let selected = manager
        .select_best_peers(network_required(), vec![stale.clone(), future.clone()], 2)
        .await;
    // future-stamped should rank first (no staleness penalty)
    assert_eq!(selected[0], future.socket_addr().unwrap());
}
AI context
{
"file": "dash-spv/src/network/reputation_tests.rs",
"line": 575,
"severity": "suggestion",
"confidence": "medium",
"flaggedBy": [
"Testing & Coverage"
],
"title": "composite_score: future addr.time guard (saturating_sub) is untested",
"fix": "#[tokio::test]\nasync fn test_select_future_addr_time_gets_no_staleness_penalty() {\n let manager = PeerReputationManager::new();\n let now = now_epoch_secs();\n // addr.time slightly ahead of no"
}

…best_peers`: Explicit peers from `ClientConfig.peers` are user-configured addresses whose real service flags are learned only after handshake. Integration tests seed the peer store with `NETWORK`-only flags but run with `enable_filters = true`, which requires `COMPACT_FILTERS`. The hard-filter was rejecting all seeded peers, so the client never dialed the test dashd node. Fix: snapshot explicit peer addresses in `PeerNetworkManager` as a `HashSet<SocketAddr>` and pass them to `select_best_peers`. Explicit peers bypass the services filter but are still subject to ban and cooldown checks (see the sketch after this list). Also bundle the remaining runtime correctness fixes in the same set of functions:
- Scope the `reputations` write lock to the gather phase only; release before sort to eliminate write-lock contention across concurrent reputation updates.
- Use `.get().cloned().unwrap_or_default()` instead of `.entry().or_default()` in the selection path to prevent map growth to the full gossip address book.
- Clamp `success_ratio` to `[0.0, 1.0]` in `composite_score` so diverging counters cannot produce an unbounded score bonus.
- Clamp `addr.time` to at most `now_epoch_secs` before computing staleness so a peer-controlled future timestamp cannot zero the staleness penalty.
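A sketch of the filter ordering this commit describes, reusing the `RequiredServices` and `PeerReputation` shapes sketched earlier; a per-reputation `is_banned` helper stands in for the manager-level ban check and is an assumption at this granularity:

```rust
use std::collections::HashSet;
use std::net::SocketAddr;
use std::time::SystemTime;

fn passes_selection_filters(
    addr: SocketAddr,
    advertised: ServiceFlags,
    rep: &PeerReputation,
    required: &RequiredServices,
    explicit_peers: &HashSet<SocketAddr>,
    now: SystemTime,
) -> bool {
    // Ban and cooldown apply to every candidate, explicit or not.
    if rep.is_banned() || rep.in_cooldown(now) {
        return false;
    }
    // Only the required-services hard filter is bypassed for user-configured
    // explicit peers, whose real flags are learned after the first handshake.
    explicit_peers.contains(&addr) || required.satisfied_by(advertised)
}
```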
…`addr.time`, combined filters): Add `no_explicit()` helper and thread the new `explicit_peers` arg through all existing `select_best_peers` call sites in the test module. Three new tests:
- `test_select_empty_available_peers_returns_empty`: verifies no panic and no map insertion when the candidate list is empty.
- `test_select_future_addr_time_does_not_panic_and_ranks_as_fresh`: confirms that a future `addr.time` is clamped to now (staleness = 0) and ranks ahead of a stale peer.
- `test_select_combined_services_and_cooldown_filters`: provides three peers where (a) lacks a required service and (b) is in cooldown, and asserts only (c) survives.
…ntegration test: `create_non_exclusive_test_config` was seeding the peer store with only `ServiceFlags::NETWORK`. With `enable_filters = true` (the default), the `RequiredServices` hard-filter in `select_best_peers` requires `NETWORK | COMPACT_FILTERS`, so the seeded peer was filtered out and the client never dialed the dashd test node. Seed with `NETWORK | COMPACT_FILTERS` to match what the regtest dashd node actually advertises. Real service flags are re-learned via `mark_seen` after the first successful handshake; the seed value only needs to pass selection.
Manki — Review complete · Planner (21s) · Review — 11 findings · Judge — 6 kept · 0 dropped (93s)
Codecov Report ❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## v0.42-dev #108 +/- ##
=============================================
+ Coverage 68.04% 68.08% +0.04%
=============================================
Files 319 320 +1
Lines 68117 68258 +141
=============================================
+ Hits 46348 46476 +128
- Misses 21769 21782 +13
The addr.time clamping, lock-narrowing, success_ratio clamping, and no-default-insertion issues from the last review are all cleanly addressed. The explicit_peers bypass — the PR's new feature — has zero test coverage: every new test passes no_explicit(), leaving the services-filter bypass and its safety invariants (ban, cooldown) entirely unverified despite the acceptance criteria requiring comprehensive coverage.
📊 6 findings (1 required, 4 suggestion, 1 nit) · 759 lines · 424s
Review stats
{
"model": "claude-sonnet-4-6",
"reviewTimeMs": 424490,
"diffLines": 759,
"diffAdditions": 730,
"diffDeletions": 29,
"filesReviewed": 6,
"agents": [
"Security & Safety",
"Correctness & Logic",
"Architecture & Design",
"Testing & Coverage"
],
"findingsRaw": 11,
"findingsKept": 6,
"findingsDropped": 5,
"severity": {
"required": 1,
"suggestion": 4,
"nit": 1
},
"verdict": "REQUEST_CHANGES",
"prNumber": 108,
"commitSha": "9edbc1269be185ece05a7210750b3ab8100b5fda",
"agentMetrics": [
{
"name": "Security & Safety",
"findingsRaw": 2,
"findingsKept": 2,
"responseLength": 2403
},
{
"name": "Correctness & Logic",
"findingsRaw": 3,
"findingsKept": 2,
"responseLength": 3477
},
{
"name": "Architecture & Design",
"findingsRaw": 3,
"findingsKept": 3,
"responseLength": 3174
},
{
"name": "Testing & Coverage",
"findingsRaw": 3,
"findingsKept": 1,
"responseLength": 4274
}
],
"judgeMetrics": {
"confidenceDistribution": {
"high": 4,
"medium": 2,
"low": 0
},
"severityChanges": 6,
"mergedDuplicates": 5
},
"fileMetrics": {
"fileTypes": {
".rs": 6
},
"findingsPerFile": {
"dash-spv/src/network/reputation.rs": 3,
"dash-spv/src/network/reputation_tests.rs": 2,
"dash-spv/tests/dashd_sync/setup.rs": 1
}
},
"reviewerModel": "claude-sonnet-4-6",
"judgeModel": "claude-sonnet-4-6"
}

    // zero out the staleness penalty.
    let addr_time = (addr.time as u64).min(now_epoch_secs);
    let staleness_secs = now_epoch_secs.saturating_sub(addr_time) as f64;
    let staleness_penalty = (staleness_secs / scoring_weights::STALENESS_SECS_PER_POINT)
💡 Suggestion [medium confidence]: Peer-advertised NODE_HEADERS_COMPRESSED grants unverifiable 15-point ranking advantage
The composite_score function subtracts PREFERRED_SERVICES_BONUS (15.0) when a peer's advertised services include NODE_HEADERS_COMPRESSED. Unlike addr.time (which this PR correctly mitigates via clamping to now_epoch_secs), service flags are fully peer-controlled and never verified before selection. An adversarial peer can falsely advertise this flag to consistently rank 15 points ahead of honest peers with identical reputation scores. After failing to serve compressed headers it accrues penalties, but a sybil peer that disconnects-and-reconnects before ban threshold can sustain the advantage indefinitely. The PR's own addr.time fix demonstrates awareness of this threat class. Consider gating the bonus on post-connection evidence, such as requiring successful_connections > 0, so newly-seen peers cannot claim it.
Suggested fix
let staleness_penalty = (staleness_secs / scoring_weights::STALENESS_SECS_PER_POINT)

if addr.services.has(ServiceFlags::NODE_HEADERS_COMPRESSED)
    && reputation.successful_connections > 0
{
    score -= scoring_weights::PREFERRED_SERVICES_BONUS;
}
AI context
{
"file": "dash-spv/src/network/reputation.rs",
"line": 679,
"severity": "suggestion",
"confidence": "medium",
"flaggedBy": [
"Security & Safety"
],
"title": "Peer-advertised NODE_HEADERS_COMPRESSED grants unverifiable 15-point ranking advantage",
"fix": "if addr.services.has(ServiceFlags::NODE_HEADERS_COMPRESSED)\n && reputation.successful_connections > 0\n{\n score -= scoring_weights::PREFERRED_SERVICES_BONUS;\n}"
}

    let mut rep = reputations.get(&socket_addr).cloned().unwrap_or_default();
    rep.apply_decay();

    // Persist decay back only for peers that already have an entry.
💡 Suggestion [high confidence]: Misleading double apply_decay call for stored peers in select_best_peers
For peers already present in the reputation map, apply_decay() is called twice in quick succession: once on the cloned value (rep) used for scoring, and once directly on the stored entry. The comment 'Persist decay back' implies writing the clone's computed state back to storage, but stored.apply_decay() recomputes decay independently from the same base rather than copying results from rep. Since DECAY_INTERVAL is one hour and the two calls are nanoseconds apart, the second call is always a no-op and the comment is actively misleading. Simply remove the second call; the stored entry's last_update is untouched until a genuine hour boundary arrives.
Suggested fix
// Persist decay back only for peers that already have an entry.

// Only one apply_decay call needed; no separate 'persist back' step required.
let mut rep = reputations.get(&socket_addr).cloned().unwrap_or_default();
rep.apply_decay();
// Remove: if let Some(stored) = reputations.get_mut(&socket_addr) { stored.apply_decay(); }
AI context
{
"file": "dash-spv/src/network/reputation.rs",
"line": 755,
"severity": "suggestion",
"confidence": "high",
"flaggedBy": [
"Security & Safety",
"Correctness & Logic",
"Architecture & Design"
],
"title": "Misleading double apply_decay call for stored peers in select_best_peers",
"fix": "// Only one apply_decay call needed; no separate 'persist back' step required.\nlet mut rep = reputations.get(&socket_addr).cloned().unwrap_or_default();\nrep.apply_decay();\n// Remove: if let Some(store"
}

@@ -408,6 +438,317 @@ mod tests {
    );
🚫 Required [high confidence]: No test for explicit_peers services-filter bypass
The PR introduces an explicit_peers bypass that allows user-configured peers to skip the required-services hard-filter, yet every new test passes no_explicit(). The bypass code path (explicit_peers.contains(&socket_addr)) has zero test coverage. The PR's own acceptance criteria states 'Comprehensive test coverage is a requirement' and explicitly calls out the required-services filter as a must-cover case; this gap directly violates that requirement.
Suggested fix
);

#[tokio::test]
async fn test_select_explicit_peer_bypasses_services_filter() {
    let manager = PeerReputationManager::new();
    let now = now_epoch_secs();
    // Peer advertises only NETWORK, missing COMPACT_FILTERS.
    let addr: SocketAddr = "40.0.0.1:9999".parse().unwrap();
    let msg = addr_msg("40.0.0.1", 9999, ServiceFlags::NETWORK, now);
    let required = RequiredServices::from_flags(
        ServiceFlags::NETWORK | ServiceFlags::COMPACT_FILTERS,
    );
    // Without explicit entry it must be filtered out.
    let without = manager
        .select_best_peers(required, vec![msg.clone()], 1, &no_explicit())
        .await;
    assert!(without.is_empty(), "should be filtered without explicit entry");
    // With explicit entry it must survive the services filter.
    let mut explicit = HashSet::new();
    explicit.insert(addr);
    let with = manager
        .select_best_peers(required, vec![msg], 1, &explicit)
        .await;
    assert_eq!(with, vec![addr], "explicit peer must bypass services filter");
}
AI context
{
"file": "dash-spv/src/network/reputation_tests.rs",
"line": 438,
"severity": "required",
"confidence": "high",
"flaggedBy": [
"Testing & Coverage"
],
"title": "No test for explicit_peers services-filter bypass",
"fix": "#[tokio::test]\nasync fn test_select_explicit_peer_bypasses_services_filter() {\n let manager = PeerReputationManager::new();\n let now = now_epoch_secs();\n // Peer advertises only NETWORK, miss"
}

        assert!(!selected.contains(&missing_service_addr));
        assert!(!selected.contains(&in_cooldown_addr));
    }
}
💡 Suggestion [high confidence]: No test verifying banned explicit peers are still excluded from selection
The ReputationAware trait documentation says explicit_peers bypass only the required-services filter, remaining subject to ban and cooldown filters. The implementation is correct, but none of the 24 new tests cover the scenario where a peer is both in explicit_peers and banned. A regression that accidentally lifts the ban check for explicit peers (e.g., mis-ordering the filter guards) would pass all current tests undetected.
Suggested fix
}

#[tokio::test]
async fn test_select_explicit_banned_peer_still_excluded() {
    let manager = PeerReputationManager::new();
    let now = now_epoch_secs();
    let addr: SocketAddr = "40.0.0.1:9999".parse().unwrap();
    manager.update_reputation(addr, misbehavior_scores::INVALID_MESSAGE * 10, "bad").await;
    assert!(manager.is_banned(&addr).await);
    let msg = addr_msg("40.0.0.1", 9999, ServiceFlags::NETWORK, now);
    let explicit = HashSet::from([addr]);
    let selected = manager
        .select_best_peers(network_required(), vec![msg], 1, &explicit)
        .await;
    assert!(selected.is_empty(), "banned explicit peer must still be excluded");
}
AI context
{
"file": "dash-spv/src/network/reputation_tests.rs",
"line": 862,
"severity": "suggestion",
"confidence": "high",
"flaggedBy": [
"Correctness & Logic"
],
"title": "No test verifying banned explicit peers are still excluded from selection",
"fix": "#[tokio::test]\nasync fn test_select_explicit_banned_peer_still_excluded() {\n let manager = PeerReputationManager::new();\n let now = now_epoch_secs();\n let addr: SocketAddr = \"40.0.0.1:9999\".p"
}| .expect("Failed to open peer storage"); | ||
| let msg = AddrV2Message::new(peer_addr, ServiceFlags::NETWORK); | ||
| let msg = AddrV2Message::new(peer_addr, ServiceFlags::NETWORK | ServiceFlags::COMPACT_FILTERS); | ||
| peer_store.save_peers(&[msg]).await.expect("Failed to seed peer store"); |
💡 Suggestion [medium confidence]: Integration test seeds peer with hardcoded service flags instead of deriving from RequiredServices
create_non_exclusive_test_config hardcodes ServiceFlags::NETWORK | ServiceFlags::COMPACT_FILTERS for the seeded peer. If ClientConfig::regtest() ever adds another required capability (e.g., BLOOM for a bloom-filter mempool strategy), the seed will no longer satisfy the RequiredServices hard-filter, silently breaking non-exclusive-mode integration tests. The root cause is that RequiredServices is pub(crate) and therefore inaccessible from integration test crates. Exposing RequiredServices publicly (or adding a ClientConfig::required_service_flags() -> ServiceFlags helper) would let test setup code auto-derive the correct mask and stay in sync with config changes.
Suggested fix
peer_store.save_peers(&[msg]).await.expect("Failed to seed peer store");

// In ClientConfig (or RequiredServices):
pub fn required_service_flags(&self) -> ServiceFlags {
    RequiredServices::from_config(self).0
}

// In setup.rs:
let msg = AddrV2Message::new(peer_addr, config.required_service_flags());
AI context
{
"file": "dash-spv/tests/dashd_sync/setup.rs",
"line": 408,
"severity": "suggestion",
"confidence": "medium",
"flaggedBy": [
"Architecture & Design"
],
"title": "Integration test seeds peer with hardcoded service flags instead of deriving from `RequiredServices`",
"fix": "// In ClientConfig (or RequiredServices):\npub fn required_service_flags(&self) -> ServiceFlags {\n RequiredServices::from_config(self).0\n}\n\n// In setup.rs:\nlet msg = AddrV2Message::new(peer_addr, co"
}
This PR has merge conflicts with the base branch. Please rebase or merge the base branch into your branch to resolve them.
66e5614 to 05b14a8
Summary
- `RequiredServices` derived from `ClientConfig` at client startup (always `NETWORK`, plus `COMPACT_FILTERS` when filter sync is enabled, plus `BLOOM` when bloom-filter sync is enabled). Threaded into `PeerNetworkManager`.
- `select_best_peers` with a tiered, capability-aware selector: hard-filters stored peers by required services, by exponential-backoff cooldown (30s → 1m → 5m → 30m → 2h keyed on `consecutive_failures` and `last_tried`), and by ban status. Remaining candidates are ranked by a single composite score combining reputation, success ratio, `AddrV2.time` staleness, and a `NODE_HEADERS_COMPRESSED` preferred-services bonus. When nothing survives the filters, the selection returns empty so DNS discovery can take over on the next tick.
- Foundational work for #105 (capability-driven churn in the maintenance loop), which will also consume `RequiredServices`.

Closes #104
Part of #102