feat: capability-aware peer selection with exponential-backoff cooldown #108

Open
xdustinface wants to merge 10 commits into v0.42-dev from feat/capability-aware-peer-selection

Conversation

@xdustinface
Owner

Summary

  • Adds RequiredServices derived from ClientConfig at client startup (always NETWORK, plus COMPACT_FILTERS when filter sync is enabled, plus BLOOM when bloom-filter sync is enabled). Threaded into PeerNetworkManager.
  • Replaces the misbehavior-score-only select_best_peers with a tiered, capability-aware selector (sketched after this list): hard-filters stored peers by required services, by exponential-backoff cooldown (30s → 1m → 5m → 30m → 2h, keyed on consecutive_failures and last_tried), and by ban status. Remaining candidates are ranked by a single composite score combining reputation, success ratio, AddrV2.time staleness, and a NODE_HEADERS_COMPRESSED preferred-services bonus. When nothing survives the filters, selection returns empty so DNS discovery can take over on the next tick.
  • 24 new unit tests covering each filter, each ranking factor, ties, empty-after-filter, zero-count, and the cooldown steps.
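
A minimal sketch of that selection flow, stitched together from the commit messages below; `lookup_reputation` is a hypothetical accessor, and this is a shape, not the exact implementation:

use std::net::SocketAddr;
use std::time::SystemTime;

fn select_best_peers(
    required: RequiredServices,
    candidates: Vec<AddrV2Message>,
    count: usize,
) -> Vec<SocketAddr> {
    let now = SystemTime::now();
    let mut scored: Vec<(SocketAddr, f64)> = candidates
        .into_iter()
        .filter_map(|addr| {
            let socket_addr = addr.socket_addr().ok()?;
            let rep = lookup_reputation(&socket_addr); // hypothetical accessor
            // Tier 1: hard filters on capability, ban status, and backoff cooldown.
            if !required.is_satisfied_by(addr.services)
                || rep.is_banned()
                || rep.in_cooldown(now)
            {
                return None;
            }
            // Tier 2: rank survivors; lower composite score is better.
            Some((socket_addr, composite_score(&rep, &addr, now_epoch_secs())))
        })
        .collect();
    scored.sort_by(|a, b| a.1.total_cmp(&b.1));
    // An empty result lets DNS discovery take over on the next maintenance tick.
    scored.into_iter().take(count).map(|(addr, _)| addr).collect()
}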

Foundational work for #105 (capability-driven churn in the maintenance loop) which will also consume RequiredServices.

Closes #104

Part of #102

xdustinface and others added 6 commits April 17, 2026 20:24
* chore: add Manki AI code review configuration

* fix: pass `claude_code_oauth_token` to Manki action
* feat: add session outcome fields to `PeerReputation`

Adds `last_success`, `last_tried`, and `consecutive_failures` to `PeerReputation`, plus a `record_connection_failure` method on `PeerReputationManager`. All new fields use `#[serde(default)]` so existing `reputations.json` files load without migration.

`record_connection_attempt` now sets `last_tried` and `record_successful_connection` sets `last_success` and resets `consecutive_failures`.
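
The shape of the change, as a sketch (field names come from this commit; the surrounding fields are elided, and a later commit in this PR removes the `#[serde(default)]` shims again):

use std::time::SystemTime;
use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize, Default)]
pub struct PeerReputation {
    // ... existing reputation fields elided ...
    #[serde(default)]
    pub last_success: Option<SystemTime>,
    #[serde(default)]
    pub last_tried: Option<SystemTime>,
    #[serde(default)]
    pub consecutive_failures: u32,
}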

* test: cover `PeerReputation` session outcome transitions

Adds unit tests for default values, `last_tried` on attempt, `last_success` plus `consecutive_failures` reset on success, failure streak increment preserving `last_success`, and legacy `reputations.json` decoding with missing fields.

* feat: bump `AddrV2.time` on successful handshake

Adds `AddrV2Handler::mark_seen` to refresh the stored timestamp for a directly observed peer, preserving existing services for known entries and inserting a fresh entry otherwise. `connect_to_peer` now calls `mark_seen` after a successful handshake so the `peers.dat` time reflects first-hand observation instead of gossip.

* feat: track peer connection outcomes in network manager

Calls `record_connection_failure` on both the TCP connect failure and the handshake failure paths in `PeerNetworkManager::connect_to_peer`, so the `consecutive_failures` streak reflects every unsuccessful attempt.

* refactor: drop backward-compat shims from `PeerReputation`

Removes `#[serde(default)]` on the new session outcome fields and the legacy-JSON load test. Backward compatibility of `reputations.json` across versions is no longer a requirement, so the shims and test are dead weight.

* refactor: thread peer-advertised services into `mark_seen`

* docs: clarify `last_connection` vs `last_tried` on `PeerReputation`

* refactor: extract `make_addr_message` helper in `addrv2`

* refactor: defensively set `last_tried` in `record_connection_failure`

* refactor: add atomic `record_failure_with_penalty` and use at failure sites

* chore: apply `cargo fmt`

* refactor: clamp `consecutive_failures` on deserialization

* refactor: extract `apply_score_change` and always update `last_tried` on failure

* test: cover `record_failure_with_penalty` directly

* refactor: clamp `consecutive_failures` at runtime and consolidate failure-field updates

Extract private `record_failure_fields` that applies `last_tried = now` and
`consecutive_failures.saturating_add(1).min(MAX_CONSECUTIVE_FAILURES)`. Both
`record_connection_failure` and `record_failure_with_penalty` now delegate to it,
eliminating duplicated mutations and capping the in-memory streak at the same 1000
limit enforced by the deserializer.
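
As a sketch, extending the `PeerReputation` shape above (the 1000 cap comes from this commit; the exact signature is assumed):

const MAX_CONSECUTIVE_FAILURES: u32 = 1000;

impl PeerReputation {
    // The single place that mutates the failure-bookkeeping fields.
    fn record_failure_fields(&mut self, now: SystemTime) {
        self.last_tried = Some(now);
        self.consecutive_failures = self
            .consecutive_failures
            .saturating_add(1)
            .min(MAX_CONSECUTIVE_FAILURES);
    }
}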

* test: cover `consecutive_failures` clamp, `last_success` preservation, `mark_seen` eviction

- Assert `last_success` is unchanged after `record_failure_with_penalty`
- Deserialize a `PeerReputation` with `consecutive_failures: 99999` and assert it clamps to `MAX_CONSECUTIVE_FAILURES`
- Fill `AddrV2Handler` to capacity and assert `mark_seen` stays bounded and includes the new entry

* refactor: remove unused `record_connection_failure`

* test: stabilize eviction test and cover runtime `consecutive_failures` saturation

* refactor: document and assert non-negative contract on `record_failure_with_penalty`

* test: cover `last_tried` preservation on success and update on failure

* fix: clamp negative `score_change` in `record_failure_with_penalty`

* test: cover happy-path attempt to success lifecycle

* refactor: tighten `clamp_future_system_time` bounds and enforce load invariants

Add a 30-day lower bound to `clamp_future_system_time` so stale or corrupted timestamps
(including epoch 0) are discarded on load, in addition to future ones.

Add `PeerReputation::normalize_after_load` and call it from the storage load path.
It resets `consecutive_failures` to 0 whenever `last_tried` is `None`, preventing
the inconsistent state where a non-zero failure streak has no temporal anchor.
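
The invariant reduces to a small guard, roughly:

impl PeerReputation {
    // A failure streak with no temporal anchor is inconsistent: reset it.
    fn normalize_after_load(&mut self) {
        if self.last_tried.is_none() {
            self.consecutive_failures = 0;
        }
    }
}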

* test: cover `clamp_future_system_time` edge cases

Add three tests: future timestamp rejected, epoch-zero rejected (exercising the new
lower bound), and recent-past timestamp preserved.

* docs: clarify zero-`score_change` contract on `record_failure_with_penalty`

A value of 0 is a deliberate no-op for the reputation score but still records
the failure counter and timestamp, which is useful for failures that should be
tracked without contributing toward a ban.

* fix: use `checked_sub` in `clamp_future_system_time` to avoid panic on broken clocks

* test: cover `normalize_after_load` via storage round-trip

* fix: use `checked_add` in `clamp_future_system_time` to avoid panic on far-future clocks

* test: also cover stale-timestamp path in `normalize_after_load` round-trip

* refactor: drop 30-day stale-timestamp floor from `clamp_future_system_time`

Remove `TIMESTAMP_MAX_AGE` and the `floor` computation that rejected timestamps older than 30 days. The future-timestamp guard (10-second tolerance) is the only meaningful constraint. Update the `normalize_after_load` doc comment to drop the "stale" reference.
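
After this change the helper reduces to the future guard alone; a sketch of the final shape, assuming the 10-second tolerance and an `Option` return (`checked_add` covers the far-future-clock fix from earlier in the series):

use std::time::{Duration, SystemTime};

fn clamp_future_system_time(ts: SystemTime, now: SystemTime) -> Option<SystemTime> {
    // Never panic, even if `now` is absurdly far in the future.
    let ceiling = now.checked_add(Duration::from_secs(10))?;
    if ts > ceiling {
        None // discard future timestamps rather than trust a broken clock
    } else {
        Some(ts)
    }
}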

* test: remove obsolete stale-timestamp tests and consolidate `clamp_future_system_time` coverage

Delete `test_normalize_after_load_via_storage_round_trip_stale` (tested the removed stale-floor path). Merge the three `test_clamp_future_system_time_*` tests into a single `test_clamp_future_system_time` covering future rejection and recent-past acceptance.

* fix: `mark_seen` now overwrites `services` on existing entries

Since round 1 the caller passes the actual handshake-negotiated services, so
preserving the gossip-sourced value was inverted. The handshake-observed value
is authoritative and is now written on both new and existing entries.

Rename `test_mark_seen_bumps_time_and_preserves_services` to
`test_mark_seen_bumps_time_and_updates_services` and update its assertion to
expect the handshake services, not the original gossip services.
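
A sketch of the resulting behaviour (the entries map and the bounded insert are assumptions about the handler's internals):

impl AddrV2Handler {
    // Refresh the stored entry for a peer observed first-hand.
    pub fn mark_seen(&mut self, addr: SocketAddr, services: ServiceFlags) {
        let now = now_epoch_secs() as u32; // AddrV2 carries u32 epoch seconds
        if let Some(entry) = self.entries.get_mut(&addr) {
            entry.time = now;
            entry.services = services; // handshake value overrides gossip
        } else {
            self.insert_bounded(addr, services, now); // hypothetical bounded insert
        }
    }
}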

* test: cover positive `normalize_after_load` branch

Add `test_normalize_after_load_preserves_failures_when_last_tried_valid`
to assert that a valid (non-future) `last_tried` and a non-zero
`consecutive_failures` are both preserved through the load round-trip,
complementing the existing reset-path test.

* refactor: move `record_failure_fields` to `impl PeerReputation`

* refactor: move `apply_score_change` to `impl PeerReputation`

Introduces `RequiredServices` in `dash-spv/src/network/required_services.rs`. The type wraps a `ServiceFlags` set derived from `ClientConfig`: `NETWORK` always, `COMPACT_FILTERS` when filter sync is enabled, `BLOOM` when mempool tracking uses the BIP37 strategy. `NODE_HEADERS_COMPRESSED` is intentionally excluded since it is a preferred-peer bonus rather than a hard requirement.

`PeerNetworkManager` now builds `RequiredServices` once from `ClientConfig` at construction and threads it into every `select_best_peers` call so peers whose advertised `AddrV2.services` don't cover the required set are hard-filtered out. If no capable peers survive, selection returns empty rather than picking an incapable peer, preserving the maintenance loop's DNS-fallback path.

The `ReputationAware` trait is scoped to `pub(crate)` since the private `RequiredServices` parameter is internal to the crate.
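
A sketch of the wrapper under those rules (the `ClientConfig` accessor names are assumptions; the `ServiceFlags` operations are the standard ones):

#[derive(Clone, Copy, Debug)]
pub(crate) struct RequiredServices(ServiceFlags);

impl RequiredServices {
    pub(crate) fn from_config(config: &ClientConfig) -> Self {
        let mut flags = ServiceFlags::NETWORK;
        if config.enable_filters {
            flags |= ServiceFlags::COMPACT_FILTERS;
        }
        if config.uses_bip37_mempool() {
            flags |= ServiceFlags::BLOOM;
        }
        // NODE_HEADERS_COMPRESSED stays a ranking bonus, never a requirement.
        Self(flags)
    }

    pub(crate) fn is_satisfied_by(&self, advertised: ServiceFlags) -> bool {
        advertised.has(self.0)
    }
}
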
Adds `cooldown()` and `in_cooldown()` helpers on `PeerReputation` that derive a backoff window from `consecutive_failures` (30s, 1m, 5m, 30m, 2h) anchored at `last_tried`. The per-step durations live in a module-scope `COOLDOWN_STEPS` constant.

`select_best_peers` now also skips any peer still inside its cooldown window, closing the loop between persisted session outcomes and future reconnection attempts. Peers with a discarded `last_tried` (e.g. after `clamp_future_system_time` + `normalize_after_load`) are treated as not-in-cooldown so they become eligible again immediately.
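
In code, roughly (the step values come from the commit message; the indexing and `SystemTime` handling are assumptions):

use std::time::{Duration, SystemTime};

const COOLDOWN_STEPS: [Duration; 5] = [
    Duration::from_secs(30),
    Duration::from_secs(60),
    Duration::from_secs(5 * 60),
    Duration::from_secs(30 * 60),
    Duration::from_secs(2 * 60 * 60),
];

impl PeerReputation {
    // Backoff window for the current streak; saturates at the last step.
    fn cooldown(&self) -> Option<Duration> {
        if self.consecutive_failures == 0 {
            return None;
        }
        let idx = (self.consecutive_failures as usize - 1).min(COOLDOWN_STEPS.len() - 1);
        Some(COOLDOWN_STEPS[idx])
    }

    // No `last_tried` anchor (e.g. discarded on load) means no cooldown.
    fn in_cooldown(&self, now: SystemTime) -> bool {
        match (self.cooldown(), self.last_tried) {
            (Some(window), Some(tried)) => {
                tried.checked_add(window).map_or(false, |until| now < until)
            }
            _ => false,
        }
    }
}
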
Replaces `select_best_peers`'s plain reputation-score sort with a composite score combining: the existing reputation score, the persisted success ratio (`successful_connections` / `connection_attempts`), a gossip-staleness penalty derived from `AddrV2.time`, and a preferred-services bonus for peers advertising `NODE_HEADERS_COMPRESSED`. Lower composite score is better, matching the reputation convention.

Tunables live in a `scoring_weights` module so each factor can be adjusted without hunting for magic numbers. The staleness penalty is capped so a very old gossip entry cannot dominate the score.
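
A sketch of the arithmetic as it stands after the round-2 fixes (the 15-point bonus and the two clamps are documented in the review threads below; the other weight values here are placeholders rather than the PR's actual numbers, and the `score` field name is assumed):

mod scoring_weights {
    pub const SUCCESS_RATIO_WEIGHT: f64 = 50.0; // placeholder
    pub const STALENESS_SECS_PER_POINT: f64 = 3600.0; // placeholder
    pub const STALENESS_CAP: f64 = 100.0; // placeholder
    pub const PREFERRED_SERVICES_BONUS: f64 = 15.0;
}

// Lower is better, matching the reputation convention.
fn composite_score(rep: &PeerReputation, addr: &AddrV2Message, now_epoch_secs: u64) -> f64 {
    let mut score = rep.score as f64;

    // Persisted success history, clamped so diverging counters cannot
    // produce an unbounded bonus.
    if rep.connection_attempts > 0 {
        let ratio = (rep.successful_connections as f64 / rep.connection_attempts as f64).min(1.0);
        score -= ratio * scoring_weights::SUCCESS_RATIO_WEIGHT;
    }

    // Gossip staleness: addr.time is clamped to now so a peer-controlled
    // future timestamp cannot zero the penalty, and the penalty is capped.
    let addr_time = (addr.time as u64).min(now_epoch_secs);
    let staleness_secs = now_epoch_secs.saturating_sub(addr_time) as f64;
    score += (staleness_secs / scoring_weights::STALENESS_SECS_PER_POINT)
        .min(scoring_weights::STALENESS_CAP);

    // Preferred (not required) capability: a bounded bonus.
    if addr.services.has(ServiceFlags::NODE_HEADERS_COMPRESSED) {
        score -= scoring_weights::PREFERRED_SERVICES_BONUS;
    }
    score
}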

Tests exercise every factor in isolation (reputation ordering, success ratio, staleness, compressed-headers bonus), tie handling, the `count=0` edge case, the banned-peer exclusion, and the capability-aware required-services filter.

@manki-review

manki-review Bot commented Apr 17, 2026

Manki — Review complete

Planner
    587 lines · 0 agents

Review — 17 findings

Judge — 15 kept · 0 dropped (109s)
    kept: 8 suggestion · 7 nit

Review metadata

Config:

  • Models: reviewer=claude-sonnet-4-6, judge=claude-sonnet-4-6
  • Review level: medium (auto, 587 lines)
  • Team: Security & Safety, Architecture & Design, Correctness & Logic, Testing & Coverage, Performance & Efficiency
  • Memory: disabled
  • Nit handling: issues

Judge decisions:

  • ✓ Kept: "Peer-controlled addr.time allows staleness-penalty bypass in composite scoring" (suggestion, medium confidence) — "Impact: Low-Medium (affects only ranking among survivors; staleness contribution is capped at STALENESS_CAP, bounding the advantage an adversarial peer can gain), Likelihood: Possible (requires deliberate gossip manipulation). The fix is a one-liner and removes the asymmetry cleanly. Single reviewer, but the finding is technically accurate."
  • ✓ Kept: "Write lock held across scoring sort — unnecessary latency under contention" (suggestion, high confidence) — "Impact: Medium (serialises all concurrent reputation writes for the full selection window; in a busy SPV client with inbound peer messages updating reputations on one task and the maintenance tick running selection on another, this is a real contention point), Likelihood: Probable (any concurrent reputation update hits the blocked lock). Three reviewers independently flagged this (Findings 2, 4, and 15 — merged). The sort fix is immediate; moving decay to a background sweep requires more design but fully resolves the write-lock-for-read-query smell."
  • ✓ Kept: "now_epoch_secs falls back to 0 silently on pre-epoch system clock" (nit, high confidence) — "Impact: Low (every peer gets max staleness penalty — degraded ranking but not incorrect filtering), Likelihood: Unlikely (no modern system has a pre-1970 clock). A tracing::warn is good defensive practice but not urgent. Single reviewer, reviewer correctly assessed as nit."
  • ✓ Kept: "ReputationAware narrowed from pub to pub(crate) — breaking public API" (suggestion, medium confidence) — "Impact: Medium (silently breaks any downstream crate using this trait at the type level; the library is 0.x so breaking changes are permitted, but this is unannounced), Likelihood: Possible (any real downstream user imports and dyn-dispatches over this trait). A changelog or commit note is the minimal actionable fix. Single reviewer; keeping at suggestion because the breakage is real even on an unstable API."
  • ✓ Kept: "select_best_peers inserts default reputation entries for every candidate peer" (suggestion, medium confidence) — "Impact: Medium (the in-memory map grows to the full gossip address book size as a side effect of a read-like selection query; in a long-running SPV client with a large address book this is a measurable memory regression), Likelihood: Probable (selection runs on every maintenance tick and processes every surviving peer). Single reviewer, but the analysis is accurate — using .get() for the scoring path avoids the side effect entirely."
  • ✓ Kept: "from_flags and flags constructors are #[cfg(test)]-only — consider pub(crate) for non-test use" (nit, high confidence) — "Impact: Low (future contributors may add a non-test from_flags inadvertently; no current production bug), Likelihood: Unlikely. A doc comment is the right fix. Single reviewer, correctly assessed as nit."
  • ✓ Kept: "success_ratio not clamped to [0.0, 1.0] in composite_score" (suggestion, high confidence) — "Impact: High (successful_connections > connection_attempts yields a ratio >> 1.0, producing a score swing of thousands of points that completely dominates ranking), Likelihood: Possible (requires record_successful_connection called without a prior record_connection_attempt, plausible via storage deserialization of corrupted or legacy data). The fix is a single .min(1.0) call. Single reviewer, but the arithmetic consequence is severe enough to keep at suggestion."
  • ✓ Kept: "cooldown() doc claims it returns None when last_tried is absent, but it doesn't check last_tried" (nit, high confidence) — "Impact: Low (misleading doc comment on an internal function; no runtime behaviour change), Likelihood: Certain (the doc is simply wrong). Single reviewer, correctly assessed as nit."
  • ✓ Kept: "now_epoch_secs() test helper truncates to u32, diverging from production u64 path" (nit, high confidence) — "Impact: Low (no numeric difference today; silent asymmetry that could confuse future test authors adjusting staleness windows), Likelihood: Unlikely. Single reviewer, correctly assessed as nit."
  • ✓ Kept: "Missing test: select_best_peers with empty available_peers list" (suggestion, medium confidence) — "Impact: Medium (the acceptance criteria explicitly require edge-case coverage and an empty input is a distinct code path from empty-after-filter), Likelihood: Certain (this path is never exercised). The acceptance criteria state 'Comprehensive test coverage is a requirement' — an unexercised distinct path against a required coverage target is a suggestion-level gap."
  • ✓ Kept: "composite_score: future addr.time guard (saturating_sub) is untested" (suggestion, medium confidence) — "Impact: Medium (if the guard were accidentally removed, future-stamped peers would receive incorrect staleness penalties silently), Likelihood: Certain (no test exercises the future-timestamp branch). The acceptance criteria require comprehensive coverage of scoring factors; this is a named scoring invariant with zero test coverage. Single reviewer, but the acceptance-criteria gap elevates it."
  • ✓ Kept: "No combined services-filter + cooldown-filter test" (suggestion, medium confidence) — "Impact: Medium (filter ordering could be swapped in a future refactor with no test catching the regression), Likelihood: Possible. The acceptance criteria call for comprehensive filter coverage; testing filters only in isolation leaves the interaction untested. Single reviewer, but the acceptance-criteria requirement is explicit."
  • ✓ Kept: "Test mutates private reputations field directly" (nit, high confidence) — "Impact: Low (test maintenance concern — a field rename or storage refactor silently breaks the test), Likelihood: Possible. A small #[cfg(test)] helper documents intent and survives refactors. Single reviewer, correctly assessed as nit."
  • ✓ Kept: "scored Vec pre-allocated to full input size before filtering" (nit, high confidence) — "Impact: Low (minor allocation waste when filters reject most candidates; no correctness impact), Likelihood: Probable. In typical steady-state most peers are filtered, so the allocation is wasted. Single reviewer, correctly assessed as nit."
  • ✓ Kept: "Use f64::total_cmp instead of partial_cmp with NaN fallback" (nit, high confidence) — "Impact: Low (NaN cannot arise from the current arithmetic, so this is idiomatic improvement only), Likelihood: Unlikely. total_cmp is the right API here and the change is one-line. Single reviewer, correctly assessed as nit."

Timing:

  • Parse: 1.6s
  • Review agents: 259.9s
  • Judge: 109.2s
  • Total: 370.8s

manki-review Bot previously requested changes Apr 17, 2026

@manki-review manki-review Bot left a comment

Three reviewers independently converged on the same lock-width bug: reputations.write() is held through decay calls, scoring, and the final sort — blocking concurrent record_failure_with_penalty and update_reputation calls for the full duration of every selection tick. That's the real issue here; the rest is largely test-coverage gaps against acceptance criteria that explicitly call for comprehensive coverage.

📊 15 findings (8 suggestion, 7 nit) · 587 lines · 371s

Review stats
{
  "model": "claude-sonnet-4-6",
  "reviewTimeMs": 370788,
  "diffLines": 587,
  "diffAdditions": 569,
  "diffDeletions": 18,
  "filesReviewed": 5,
  "agents": [
    "Security & Safety",
    "Architecture & Design",
    "Correctness & Logic",
    "Testing & Coverage",
    "Performance & Efficiency"
  ],
  "findingsRaw": 17,
  "findingsKept": 15,
  "findingsDropped": 2,
  "severity": {
    "required": 0,
    "suggestion": 8,
    "nit": 7
  },
  "verdict": "REQUEST_CHANGES",
  "prNumber": 108,
  "commitSha": "259b120cf0c0dfa816b5de2c7ca07292b04b06e9",
  "agentMetrics": [
    {
      "name": "Security & Safety",
      "findingsRaw": 3,
      "findingsKept": 3,
      "responseLength": 2913
    },
    {
      "name": "Architecture & Design",
      "findingsRaw": 4,
      "findingsKept": 4,
      "responseLength": 4173
    },
    {
      "name": "Correctness & Logic",
      "findingsRaw": 3,
      "findingsKept": 3,
      "responseLength": 2633
    },
    {
      "name": "Testing & Coverage",
      "findingsRaw": 4,
      "findingsKept": 4,
      "responseLength": 4848
    },
    {
      "name": "Performance & Efficiency",
      "findingsRaw": 3,
      "findingsKept": 3,
      "responseLength": 2627
    }
  ],
  "judgeMetrics": {
    "confidenceDistribution": {
      "high": 9,
      "medium": 6,
      "low": 0
    },
    "severityChanges": 15,
    "mergedDuplicates": 2
  },
  "fileMetrics": {
    "fileTypes": {
      ".rs": 5
    },
    "findingsPerFile": {
      "dash-spv/src/network/reputation.rs": 9,
      "dash-spv/src/network/required_services.rs": 1,
      "dash-spv/src/network/reputation_tests.rs": 5
    }
  },
  "reviewerModel": "claude-sonnet-4-6",
  "judgeModel": "claude-sonnet-4-6"
}

let rep = rep_with_streak(3, None);
assert!(!rep.in_cooldown(SystemTime::now()));
}


💡 Suggestion [medium confidence]: Missing test: select_best_peers with empty available_peers list

There is no test for select_best_peers(required, vec![], n) where the input peer slice is empty. The acceptance criteria call for comprehensive edge-case coverage, and this is a distinct code path (the scoring loop never executes, scored is empty). While the code is correct today, a regression in the early-exit or capacity-allocation logic could silently break this case.

Suggested fix:

#[tokio::test]
async fn test_select_empty_input_returns_empty() {
    let manager = PeerReputationManager::new();
    let selected = manager.select_best_peers(network_required(), vec![], 5).await;
    assert!(selected.is_empty());
}
AI context
{
  "file": "dash-spv/src/network/reputation_tests.rs",
  "line": 725,
  "severity": "suggestion",
  "confidence": "medium",
  "flaggedBy": [
    "Testing & Coverage"
  ],
  "title": "Missing test: select_best_peers with empty available_peers list",
  "fix": "#[tokio::test]\nasync fn test_select_empty_input_returns_empty() {\n    let manager = PeerReputationManager::new();\n    let selected = manager.select_best_peers(network_required(), vec![], 5).await;\n   "
}

.select_best_peers(network_required(), vec![stale.clone(), fresh.clone()], 2)
.await;
assert_eq!(selected, vec![fresh_addr, stale_addr]);
}

💡 Suggestion [medium confidence]: composite_score: future addr.time guard (saturating_sub) is untested

composite_score uses now_epoch_secs.saturating_sub(addr.time as u64) to prevent underflow when a peer's gossip timestamp is in the future. Every ranking test supplies a past or equal time, so the guard path is never exercised. A future-stamped peer should receive zero staleness penalty and therefore rank ahead of stale peers, but this is not verified. If the saturating_sub were accidentally removed or the cast changed to signed, negative staleness would silently mis-rank peers.

Suggested fix:

#[tokio::test]
async fn test_select_future_addr_time_gets_no_staleness_penalty() {
    let manager = PeerReputationManager::new();
    let now = now_epoch_secs();
    // addr.time slightly ahead of now — should yield 0 staleness penalty
    let future = addr_msg("10.10.10.1", 9999, ServiceFlags::NETWORK, now.saturating_add(3600));
    let stale = addr_msg("10.10.10.2", 9999, ServiceFlags::NETWORK, now.saturating_sub(7200));
    let selected = manager
        .select_best_peers(network_required(), vec![stale.clone(), future.clone()], 2)
        .await;
    // future-stamped should rank first (no staleness penalty)
    assert_eq!(selected[0], future.socket_addr().unwrap());
}
AI context
{
  "file": "dash-spv/src/network/reputation_tests.rs",
  "line": 575,
  "severity": "suggestion",
  "confidence": "medium",
  "flaggedBy": [
    "Testing & Coverage"
  ],
  "title": "composite_score: future addr.time guard (saturating_sub) is untested",
  "fix": "#[tokio::test]\nasync fn test_select_future_addr_time_gets_no_staleness_penalty() {\n    let manager = PeerReputationManager::new();\n    let now = now_epoch_secs();\n    // addr.time slightly ahead of no"
}

…best_peers`

Explicit peers from `ClientConfig.peers` are user-configured addresses whose real
service flags are learned only after handshake. Integration tests seed the peer store
with `NETWORK`-only flags but run with `enable_filters = true`, which requires
`COMPACT_FILTERS`. The hard-filter was rejecting all seeded peers so the client
never dialed the test dashd node.

Fix: snapshot explicit peer addresses in `PeerNetworkManager` as a `HashSet<SocketAddr>`
and pass them to `select_best_peers`. Explicit peers bypass the services filter but
are still subject to ban and cooldown checks (see the sketch after this list).

Also bundle the remaining runtime correctness fixes in the same set of functions:
- Scope the `reputations` write lock to the gather phase only; release before sort
  to eliminate write-lock contention across concurrent reputation updates.
- Use `.get().cloned().unwrap_or_default()` instead of `.entry().or_default()` in
  the selection path to prevent map growth to the full gossip address book.
- Clamp `success_ratio` to `[0.0, 1.0]` in `composite_score` so diverging counters
  cannot produce an unbounded score bonus.
- Clamp `addr.time` to at most `now_epoch_secs` before computing staleness so a
  peer-controlled future timestamp cannot zero the staleness penalty.
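
A sketch of the bypass guard under those rules (only `explicit_peers` and the bypass semantics come from this commit; the helper name and surrounding types are assumptions):

use std::collections::HashSet;
use std::net::SocketAddr;

// Capability gate only: explicit (user-configured) peers skip the services
// filter, while the ban and cooldown checks run separately and still apply.
fn passes_services_filter(
    socket_addr: &SocketAddr,
    advertised: ServiceFlags,
    required: &RequiredServices,
    explicit_peers: &HashSet<SocketAddr>,
) -> bool {
    explicit_peers.contains(socket_addr) || required.is_satisfied_by(advertised)
}
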
…`addr.time`, combined filters)

Add `no_explicit()` helper and thread the new `explicit_peers` arg through all
existing `select_best_peers` call sites in the test module.

Three new tests:
- `test_select_empty_available_peers_returns_empty`: verifies no panic and no map
  insertion when the candidate list is empty.
- `test_select_future_addr_time_does_not_panic_and_ranks_as_fresh`: confirms that a
  future `addr.time` is clamped to now (staleness = 0) and ranks ahead of a stale peer.
- `test_select_combined_services_and_cooldown_filters`: provides three peers where
  (a) lacks a required service and (b) is in cooldown, and asserts only (c) survives.
…ntegration test

`create_non_exclusive_test_config` was seeding the peer store with only
`ServiceFlags::NETWORK`. With `enable_filters = true` (the default), the
`RequiredServices` hard-filter in `select_best_peers` requires
`NETWORK | COMPACT_FILTERS`, so the seeded peer was filtered out and the client
never dialed the dashd test node.

Seed with `NETWORK | COMPACT_FILTERS` to match what the regtest dashd node
actually advertises. Real service flags are re-learned via `mark_seen` after
the first successful handshake; the seed value only needs to pass selection.

@manki-review

manki-review Bot commented Apr 17, 2026

Manki — Review complete

Planner (21s)
    feature · 759 lines · 4 agents
    review effort: medium · judge effort: high

Review — 11 findings
    ✅ Security & Safety — 2 (191s)
    ✅ Correctness & Logic — 3 (305s)
    ✅ Architecture & Design — 3 (160s)
    ✅ Testing & Coverage — 3 (87s)

Judge — 6 kept · 0 dropped (93s)
    kept: 1 required · 4 suggestion · 1 nit

Review metadata

Config:

  • Models: reviewer=claude-sonnet-4-6, judge=claude-sonnet-4-6
  • Review level: medium (auto, 759 lines)
  • Team: Security & Safety, Correctness & Logic, Architecture & Design, Testing & Coverage
  • Memory: disabled
  • Nit handling: issues

Judge decisions:

  • ✓ Kept: "Peer-advertised NODE_HEADERS_COMPRESSED grants unverifiable 15-point ranking advantage" (suggestion, medium confidence) — "Impact: Medium (biases peer selection toward adversarial peers), Likelihood: Possible (requires adversarial network participant deliberately advertising false flags). The PR already demonstrates awareness of this threat class via the addr.time clamp; gating the bonus on successful_connections > 0 is a targeted fix. Only one reviewer flagged this, but the concern is technically accurate and the fix is clear."
  • ✓ Kept: "Misleading double apply_decay call for stored peers in select_best_peers" (suggestion, high confidence) — "Impact: Medium (misleading comment, redundant work, future maintainability risk if decay logic changes), Likelihood: Certain (exists on every call for known peers). Three of four reviewers independently flagged this; that consensus warrants keeping it at suggestion. Functionally correct today due to DECAY_INTERVAL semantics, but the 'persist decay back' comment actively misrepresents what the second call does. Merged findings 2, 3, and 6 — same underlying issue."
  • ✓ Kept: "No test for explicit_peers services-filter bypass" (required, high confidence) — "Impact: High (untested security-relevant bypass; a regression would silently connect to incapable peers, reinstating the original bug this PR fixes), Likelihood: Certain (the gap exists — every new test passes no_explicit(), the bypass branch is dead code from a test perspective). The PR's acceptance criteria explicitly mandates coverage of the required-services filter, and this is an unmet acceptance criterion. One reviewer flagged it as required; the direct AC violation confirms that rating."
  • ✓ Kept: "No test verifying banned explicit peers are still excluded from selection" (suggestion, high confidence) — "Impact: High (a regression lifting ban/cooldown checks for explicit peers would silently allow connecting to malicious or recently-failed peers), Likelihood: Possible (requires a future refactor to mis-order filter guards). The doc comment explicitly states explicit peers are still subject to ban and cooldown — both invariants are untested. Merged findings 4, 10, and 11 — all cover the same class of missing explicit_peers safety tests. Three reviewers flagged variants of this gap."
  • ✓ Kept: "Integration test seeds peer with hardcoded service flags instead of deriving from RequiredServices" (suggestion, medium confidence) — "Impact: Medium (if ClientConfig::regtest() adds another required service, the seeded peer silently fails selection and non-exclusive integration tests break with no obvious cause), Likelihood: Possible (depends on future config changes, but the PR itself shows RequiredServices is actively evolving). Two reviewers flagged this independently. Merged findings 5 and 7."
  • ✓ Kept: "ReputationAware is now pub(crate) with a single in-crate implementor — trait adds indirection without benefit" (nit, high confidence) — "Impact: Low (minor boilerplate, callers must import both struct and trait), Likelihood: Certain. Correctly narrowed to pub(crate) and has one implementor with no dynamic dispatch. A design preference with no correctness impact."

Timing:

  • Parse: 1.6s
  • Review agents: 330.3s
  • Judge: 92.6s
  • Total: 424.5s

@manki-review manki-review Bot dismissed their stale review April 17, 2026 22:57

Superseded by new review

@codecov-commenter

Codecov Report

❌ Patch coverage is 98.06452% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 68.08%. Comparing base (bda8a95) to head (9edbc12).

Files with missing lines Patch % Lines
dash-spv/src/network/reputation.rs 95.58% 3 Missing ⚠️
Additional details and impacted files
@@              Coverage Diff              @@
##           v0.42-dev     #108      +/-   ##
=============================================
+ Coverage      68.04%   68.08%   +0.04%     
=============================================
  Files            319      320       +1     
  Lines          68117    68258     +141     
=============================================
+ Hits           46348    46476     +128     
- Misses         21769    21782      +13     
Flag Coverage Δ
core 75.25% <ø> (ø)
ffi 38.99% <ø> (ø)
rpc 20.00% <ø> (ø)
spv 85.62% <98.06%> (+0.05%) ⬆️
wallet 67.62% <ø> (ø)
Files with missing lines Coverage Δ
dash-spv/src/network/manager.rs 65.69% <100.00%> (-0.97%) ⬇️
dash-spv/src/network/mod.rs 77.14% <ø> (ø)
dash-spv/src/network/required_services.rs 100.00% <100.00%> (ø)
dash-spv/src/network/reputation.rs 82.41% <95.58%> (+2.80%) ⬆️

@manki-review manki-review Bot left a comment

The addr.time clamping, lock-narrowing, success_ratio clamping, and no-default-insertion issues from the last review are all cleanly addressed. The explicit_peers bypass — the PR's new feature — has zero test coverage: every new test passes no_explicit(), leaving the services-filter bypass and its safety invariants (ban, cooldown) entirely unverified despite the acceptance criteria requiring comprehensive coverage.

📊 6 findings (1 required, 4 suggestion, 1 nit) · 759 lines · 424s

Review stats
{
  "model": "claude-sonnet-4-6",
  "reviewTimeMs": 424490,
  "diffLines": 759,
  "diffAdditions": 730,
  "diffDeletions": 29,
  "filesReviewed": 6,
  "agents": [
    "Security & Safety",
    "Correctness & Logic",
    "Architecture & Design",
    "Testing & Coverage"
  ],
  "findingsRaw": 11,
  "findingsKept": 6,
  "findingsDropped": 5,
  "severity": {
    "required": 1,
    "suggestion": 4,
    "nit": 1
  },
  "verdict": "REQUEST_CHANGES",
  "prNumber": 108,
  "commitSha": "9edbc1269be185ece05a7210750b3ab8100b5fda",
  "agentMetrics": [
    {
      "name": "Security & Safety",
      "findingsRaw": 2,
      "findingsKept": 2,
      "responseLength": 2403
    },
    {
      "name": "Correctness & Logic",
      "findingsRaw": 3,
      "findingsKept": 2,
      "responseLength": 3477
    },
    {
      "name": "Architecture & Design",
      "findingsRaw": 3,
      "findingsKept": 3,
      "responseLength": 3174
    },
    {
      "name": "Testing & Coverage",
      "findingsRaw": 3,
      "findingsKept": 1,
      "responseLength": 4274
    }
  ],
  "judgeMetrics": {
    "confidenceDistribution": {
      "high": 4,
      "medium": 2,
      "low": 0
    },
    "severityChanges": 6,
    "mergedDuplicates": 5
  },
  "fileMetrics": {
    "fileTypes": {
      ".rs": 6
    },
    "findingsPerFile": {
      "dash-spv/src/network/reputation.rs": 3,
      "dash-spv/src/network/reputation_tests.rs": 2,
      "dash-spv/tests/dashd_sync/setup.rs": 1
    }
  },
  "reviewerModel": "claude-sonnet-4-6",
  "judgeModel": "claude-sonnet-4-6"
}

// zero out the staleness penalty.
let addr_time = (addr.time as u64).min(now_epoch_secs);
let staleness_secs = now_epoch_secs.saturating_sub(addr_time) as f64;
let staleness_penalty = (staleness_secs / scoring_weights::STALENESS_SECS_PER_POINT)
💡 Suggestion [medium confidence]: Peer-advertised NODE_HEADERS_COMPRESSED grants unverifiable 15-point ranking advantage

The composite_score function subtracts PREFERRED_SERVICES_BONUS (15.0) when a peer's advertised services include NODE_HEADERS_COMPRESSED. Unlike addr.time (which this PR correctly mitigates via clamping to now_epoch_secs), service flags are fully peer-controlled and never verified before selection. An adversarial peer can falsely advertise this flag to consistently rank 15 points ahead of honest peers with identical reputation scores. After failing to serve compressed headers it accrues penalties, but a sybil peer that disconnects-and-reconnects before ban threshold can sustain the advantage indefinitely. The PR's own addr.time fix demonstrates awareness of this threat class. Consider gating the bonus on post-connection evidence, such as requiring successful_connections > 0, so newly-seen peers cannot claim it.

Suggested fix:

if addr.services.has(ServiceFlags::NODE_HEADERS_COMPRESSED)
    && reputation.successful_connections > 0
{
    score -= scoring_weights::PREFERRED_SERVICES_BONUS;
}
AI context
{
  "file": "dash-spv/src/network/reputation.rs",
  "line": 679,
  "severity": "suggestion",
  "confidence": "medium",
  "flaggedBy": [
    "Security & Safety"
  ],
  "title": "Peer-advertised NODE_HEADERS_COMPRESSED grants unverifiable 15-point ranking advantage",
  "fix": "if addr.services.has(ServiceFlags::NODE_HEADERS_COMPRESSED)\n    && reputation.successful_connections > 0\n{\n    score -= scoring_weights::PREFERRED_SERVICES_BONUS;\n}"
}

let mut rep = reputations.get(&socket_addr).cloned().unwrap_or_default();
rep.apply_decay();

// Persist decay back only for peers that already have an entry.

💡 Suggestion [high confidence]: Misleading double apply_decay call for stored peers in select_best_peers

For peers already present in the reputation map, apply_decay() is called twice in quick succession: once on the cloned value (rep) used for scoring, and once directly on the stored entry. The comment 'Persist decay back' implies writing the clone's computed state back to storage, but stored.apply_decay() recomputes decay independently from the same base rather than copying results from rep. Since DECAY_INTERVAL is one hour and the two calls are nanoseconds apart, the second call is always a no-op and the comment is actively misleading. Simply remove the second call; the stored entry's last_update is untouched until a genuine hour boundary arrives.

Suggested fix:

// Only one apply_decay call needed; no separate 'persist back' step required.
let mut rep = reputations.get(&socket_addr).cloned().unwrap_or_default();
rep.apply_decay();
// Remove: if let Some(stored) = reputations.get_mut(&socket_addr) { stored.apply_decay(); }
AI context
{
  "file": "dash-spv/src/network/reputation.rs",
  "line": 755,
  "severity": "suggestion",
  "confidence": "high",
  "flaggedBy": [
    "Security & Safety",
    "Correctness & Logic",
    "Architecture & Design"
  ],
  "title": "Misleading double apply_decay call for stored peers in select_best_peers",
  "fix": "// Only one apply_decay call needed; no separate 'persist back' step required.\nlet mut rep = reputations.get(&socket_addr).cloned().unwrap_or_default();\nrep.apply_decay();\n// Remove: if let Some(store"
}

@@ -408,6 +438,317 @@ mod tests {
);

🚫 Required [high confidence]: No test for explicit_peers services-filter bypass

The PR introduces an explicit_peers bypass that allows user-configured peers to skip the required-services hard-filter, yet every new test passes no_explicit(). The bypass code path (explicit_peers.contains(&socket_addr)) has zero test coverage. The PR's own acceptance criteria states 'Comprehensive test coverage is a requirement' and explicitly calls out the required-services filter as a must-cover case; this gap directly violates that requirement.

Suggested fix:

#[tokio::test]
async fn test_select_explicit_peer_bypasses_services_filter() {
    let manager = PeerReputationManager::new();
    let now = now_epoch_secs();
    // Peer advertises only NETWORK, missing COMPACT_FILTERS.
    let addr: SocketAddr = "40.0.0.1:9999".parse().unwrap();
    let msg = addr_msg("40.0.0.1", 9999, ServiceFlags::NETWORK, now);
    let required = RequiredServices::from_flags(
        ServiceFlags::NETWORK | ServiceFlags::COMPACT_FILTERS,
    );
    // Without explicit entry it must be filtered out.
    let without = manager
        .select_best_peers(required, vec![msg.clone()], 1, &no_explicit())
        .await;
    assert!(without.is_empty(), "should be filtered without explicit entry");
    // With explicit entry it must survive the services filter.
    let mut explicit = HashSet::new();
    explicit.insert(addr);
    let with = manager
        .select_best_peers(required, vec![msg], 1, &explicit)
        .await;
    assert_eq!(with, vec![addr], "explicit peer must bypass services filter");
}
AI context
{
  "file": "dash-spv/src/network/reputation_tests.rs",
  "line": 438,
  "severity": "required",
  "confidence": "high",
  "flaggedBy": [
    "Testing & Coverage"
  ],
  "title": "No test for explicit_peers services-filter bypass",
  "fix": "#[tokio::test]\nasync fn test_select_explicit_peer_bypasses_services_filter() {\n    let manager = PeerReputationManager::new();\n    let now = now_epoch_secs();\n    // Peer advertises only NETWORK, miss"
}

assert!(!selected.contains(&missing_service_addr));
assert!(!selected.contains(&in_cooldown_addr));
}
}

💡 Suggestion [high confidence]: No test verifying banned explicit peers are still excluded from selection

The ReputationAware trait documentation says explicit_peers bypass only the required-services filter, remaining subject to ban and cooldown filters. The implementation is correct, but none of the 24 new tests cover the scenario where a peer is both in explicit_peers and banned. A regression that accidentally lifts the ban check for explicit peers (e.g., mis-ordering the filter guards) would pass all current tests undetected.

Suggested fix:

#[tokio::test]
async fn test_select_explicit_banned_peer_still_excluded() {
    let manager = PeerReputationManager::new();
    let now = now_epoch_secs();
    let addr: SocketAddr = "40.0.0.1:9999".parse().unwrap();
    manager.update_reputation(addr, misbehavior_scores::INVALID_MESSAGE * 10, "bad").await;
    assert!(manager.is_banned(&addr).await);
    let msg = addr_msg("40.0.0.1", 9999, ServiceFlags::NETWORK, now);
    let explicit = HashSet::from([addr]);
    let selected = manager
        .select_best_peers(network_required(), vec![msg], 1, &explicit)
        .await;
    assert!(selected.is_empty(), "banned explicit peer must still be excluded");
}
AI context
{
  "file": "dash-spv/src/network/reputation_tests.rs",
  "line": 862,
  "severity": "suggestion",
  "confidence": "high",
  "flaggedBy": [
    "Correctness & Logic"
  ],
  "title": "No test verifying banned explicit peers are still excluded from selection",
  "fix": "#[tokio::test]\nasync fn test_select_explicit_banned_peer_still_excluded() {\n    let manager = PeerReputationManager::new();\n    let now = now_epoch_secs();\n    let addr: SocketAddr = \"40.0.0.1:9999\".p"
}

.expect("Failed to open peer storage");
-let msg = AddrV2Message::new(peer_addr, ServiceFlags::NETWORK);
+let msg = AddrV2Message::new(peer_addr, ServiceFlags::NETWORK | ServiceFlags::COMPACT_FILTERS);
peer_store.save_peers(&[msg]).await.expect("Failed to seed peer store");

💡 Suggestion [medium confidence]: Integration test seeds peer with hardcoded service flags instead of deriving from RequiredServices

create_non_exclusive_test_config hardcodes ServiceFlags::NETWORK | ServiceFlags::COMPACT_FILTERS for the seeded peer. If ClientConfig::regtest() ever adds another required capability (e.g., BLOOM for a bloom-filter mempool strategy), the seed will no longer satisfy the RequiredServices hard-filter, silently breaking non-exclusive-mode integration tests. The root cause is that RequiredServices is pub(crate) and therefore inaccessible from integration test crates. Exposing RequiredServices publicly (or adding a ClientConfig::required_service_flags() -> ServiceFlags helper) would let test setup code auto-derive the correct mask and stay in sync with config changes.

Suggested fix:

// In ClientConfig (or RequiredServices):
pub fn required_service_flags(&self) -> ServiceFlags {
    RequiredServices::from_config(self).0
}

// In setup.rs:
let msg = AddrV2Message::new(peer_addr, config.required_service_flags());
AI context
{
  "file": "dash-spv/tests/dashd_sync/setup.rs",
  "line": 408,
  "severity": "suggestion",
  "confidence": "medium",
  "flaggedBy": [
    "Architecture & Design"
  ],
  "title": "Integration test seeds peer with hardcoded service flags instead of deriving from `RequiredServices`",
  "fix": "// In ClientConfig (or RequiredServices):\npub fn required_service_flags(&self) -> ServiceFlags {\n    RequiredServices::from_config(self).0\n}\n\n// In setup.rs:\nlet msg = AddrV2Message::new(peer_addr, co"
}

@github-actions

This PR has merge conflicts with the base branch. Please rebase or merge the base branch into your branch to resolve them.

@xdustinface force-pushed the v0.42-dev branch 2 times, most recently from 66e5614 to 05b14a8 (May 8, 2026 12:58)