[consensus] Latency-weighted leader reputation + tighter classifier (combined) #19341
Draft
danielxiangzl wants to merge 10 commits into
✅ No security or compliance issues detected. Reviewed everything up to be75836.
…onfig
Adds a continuous, per-validator weight scaling to LeaderReputation that
prefers validators with lower historical commit-to-commit interval as
proposers, gated behind a new on-chain config variant.
Heuristic (LatencyWeightedHeuristic in consensus/liveness/leader_reputation.rs):
- compute_round_times: split successful pairs 50/50 between newer and
older proposer; attribute timeout-spanning gaps in full to the failed
proposer(s) via failed_proposer_indices. Healthy adjacent proposers
no longer absorb others' timeouts.
- get_weights: aggregate per-validator round-time observations using
*mean* (not median, which discarded the failure tail). Scale active
validators by (max_mean / val_mean)^multiplier, with a per-validator
fallback when fewer than MIN_OBSERVATIONS=2 entries exist, a
MAX_LATENCY_RATIO=10 ceiling on the boost, and degenerate-case guards
(empty means / zero max_mean -> base weights).
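The attribution rule in the compute_round_times bullet can be sketched roughly as follows. This is a simplified illustration, not the code in consensus/liveness/leader_reputation.rs: `BlockEvent` and its fields are hypothetical stand-ins, the history is assumed to be ordered oldest-first, and the even split of a gap among several failed proposers is an assumption.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for a committed-block event.
struct BlockEvent {
    /// Validator index of the block's proposer.
    proposer: usize,
    /// Commit timestamp in milliseconds.
    timestamp_ms: u64,
    /// Proposers that timed out between the previous block and this one.
    failed_proposer_indices: Vec<usize>,
}

/// Attribute each commit-to-commit gap to validators.
/// Assumes `history` is ordered oldest-first.
fn compute_round_times(history: &[BlockEvent]) -> HashMap<usize, Vec<f64>> {
    let mut times: HashMap<usize, Vec<f64>> = HashMap::new();
    for pair in history.windows(2) {
        let (older, newer) = (&pair[0], &pair[1]);
        let gap = newer.timestamp_ms.saturating_sub(older.timestamp_ms) as f64;
        if newer.failed_proposer_indices.is_empty() {
            // Successful pair: split the interval 50/50.
            times.entry(older.proposer).or_default().push(gap / 2.0);
            times.entry(newer.proposer).or_default().push(gap / 2.0);
        } else {
            // Timeout-spanning gap: charge only the failed proposer(s).
            // Even split among several failures is an assumption here.
            let share = gap / newer.failed_proposer_indices.len() as f64;
            for &failed in &newer.failed_proposer_indices {
                times.entry(failed).or_default().push(share);
            }
        }
    }
    times
}
```

In this sketch the healthy proposers on either side of a timeout record nothing for that gap, which is the property the bullet describes.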
On-chain gating (types/on_chain_config/consensus_config.rs):
- New ProposerAndVoterV3 variant of LeaderReputationType carrying
ProposerAndVoterConfigV3 { base, use_latency_weighted,
latency_weight_multiplier_milli } so all validators deterministically
agree on whether to enable latency weighting and with what exponent
(BCS-friendly integer milli-units; 1000 = 1.0x). Without on-chain
gating, partial rollout would fork the chain.
- Version-agnostic proposer_and_voter_params() accessor returns the
base config plus the latency-weighted toggle for V1/V2 (toggle=false)
and V3.
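A rough sketch of the gating shape described above. The types are trimmed stand-ins (the real ProposerAndVoterConfig carries more fields), and the tuple return type of the accessor and the `multiplier_from_milli` helper are assumptions made for illustration.

```rust
/// Trimmed stand-in for the base config; only two fields are shown.
struct ProposerAndVoterConfig {
    failed_weight: u64,
    failure_threshold_percent: u32,
}

/// V3 payload as named in the commit message.
struct ProposerAndVoterConfigV3 {
    base: ProposerAndVoterConfig,
    use_latency_weighted: bool,
    /// Integer milli-units so the value serializes deterministically; 1000 = 1.0x.
    latency_weight_multiplier_milli: u32,
}

enum LeaderReputationType {
    ProposerAndVoter(ProposerAndVoterConfig),
    ProposerAndVoterV2(ProposerAndVoterConfig),
    ProposerAndVoterV3(ProposerAndVoterConfigV3),
}

impl LeaderReputationType {
    /// Version-agnostic view: base config plus the latency-weighted toggle.
    /// The return shape is an assumption, not the PR's actual signature.
    fn proposer_and_voter_params(&self) -> (&ProposerAndVoterConfig, bool) {
        match self {
            // V1/V2 never enable latency weighting.
            LeaderReputationType::ProposerAndVoter(c)
            | LeaderReputationType::ProposerAndVoterV2(c) => (c, false),
            LeaderReputationType::ProposerAndVoterV3(v3) => (&v3.base, v3.use_latency_weighted),
        }
    }
}

/// Decode the on-chain milli-unit exponent to a float (hypothetical helper).
fn multiplier_from_milli(milli: u32) -> f64 {
    f64::from(milli) / 1000.0
}
```

Keeping the exponent as an integer on chain and converting to f64 only at construction means every validator decodes the same bytes to the same value.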
Wiring:
- consensus/epoch_manager.rs: read base config + toggle through
proposer_and_voter_params(); decode multiplier-milli to f64 at
construction.
- consensus/dag/bootstrap.rs: handle V3 by using its base config (DAG
anchor election does not yet wire LatencyWeighted; TODO marker added).
- testsuite/smoke-test/{state_sync,aptos_cli/validator}.rs: cover V3 in
match arms.
- testsuite/forge-cli/realistic_environment.rs: use V3 in genesis with
use_latency_weighted=true, latency_weight_multiplier_milli=1000, and
proposer_window_num_validators_multiplier=50 (bumped from 10 so the
heuristic gets ~350 blocks of history -- enough samples for the mean
to stabilize on a 10%-failure validator).
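The window arithmetic behind the "~350 blocks" figure, as a small sketch. It assumes the history window is the validator count times the multiplier. The count of 7 validators is inferred from 350 / 50 and is not stated in the commit message. `proposer_window_blocks` is a hypothetical helper.

```rust
/// History window in blocks, assuming window = num_validators * multiplier.
fn proposer_window_blocks(num_validators: usize, multiplier: usize) -> usize {
    num_validators * multiplier
}

/// Compare the old and new multipliers for an inferred 7-validator test network.
fn window_before_and_after() -> (usize, usize) {
    (proposer_window_blocks(7, 10), proposer_window_blocks(7, 50))
}
```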
Tests: adds 7 unit tests for the heuristic (50/50 split, failure
attribution, multi-failure split, per-validator fallback, empty
history, ratio clamp, cross-epoch skip). All 18 leader-reputation lib
tests pass.
Prototype/experiment code -- not for merge to main; the canonical merge
PR will need governance migration of the on-chain config.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…=5%, window=100x

Makes #19566 vs #19341 a clean isolation of the latency heuristic's contribution: both branches now share the same binary classifier config, differing only in use_latency_weighted and latency_weight_multiplier_milli.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…alty + carry-forward

Root cause of 3.0× instability (analyzed via forge runs at multipliers 2.0× / 3.0×):

The previous formula was `weight = active_weight * (max_mean / val_mean)^multiplier`. This BOOSTED fast validators rather than PENALIZING slow ones: the slowest validator (V6 in our test) received the BASE active_weight, not the lowest weight. The exponentiation amplified small variations among healthy validators: a transient 20ms blip on a fast validator cliff-dropped its weight, redistributing load and triggering cascading instability. At multiplier=2.0× the system was barely stable; at 3.0× p99 exploded from ~1s to 8-41s with multi-minute oscillation cycles.

Compounding this, the MIN_OBSERVATIONS=2 fallback created a step function: when V6 was suppressed enough to drop below 2 observations in the window, V6 fell back to base active_weight, became selectable, failed, accumulated observations, and got re-suppressed. This is a textbook oscillation.

This commit redesigns the heuristic with two changes:

1. **Median-reference asymmetric penalty.** Use the median of observed per-validator means as the reference. Validators at or below median: factor = 1.0 (no change). Validators above median: factor = 1 / (val_mean / median)^multiplier, clamped at MAX_LATENCY_RATIO. The slowest validator now gets the LARGEST penalty, healthy validators are not destabilized by small noise, and higher multipliers no longer amplify variance among the good band.

2. **Carry-forward state for unobserved validators.** A `Mutex<HashMap<Author, f64>>` tracks the last computed weight factor per author. When a validator has too few fresh observations (because it was suppressed enough to drop out of selection), we apply the previously-computed factor instead of falling back to base active_weight. Newly-rotated-in validators (no prior factor) still default to 1.0 → base active_weight. This breaks the suppress→starve→reset oscillation.
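The two changes can be sketched together as below. This is a minimal illustration under stated assumptions, not the PR's implementation: `Author` is replaced by `String`, the clamp is applied to the exponentiated ratio (one reading of "clamped at MAX_LATENCY_RATIO", consistent with the active_weight / 10 floor in the tests), the median takes the upper-middle element for even counts, and `LatencyFactors` and `factor_for` are hypothetical names.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

const MAX_LATENCY_RATIO: f64 = 10.0;
const MIN_OBSERVATIONS: usize = 2;

/// Median of the per-validator means (upper-middle for even counts; assumption).
fn median(values: &[f64]) -> Option<f64> {
    if values.is_empty() {
        return None;
    }
    let mut sorted = values.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    Some(sorted[sorted.len() / 2])
}

/// Asymmetric penalty: 1.0 at or below the median, shrinking above it,
/// never below 1 / MAX_LATENCY_RATIO.
fn penalty_factor(val_mean: f64, median: f64, multiplier: f64) -> f64 {
    if median <= 0.0 || val_mean <= median {
        return 1.0;
    }
    let penalty = (val_mean / median).powf(multiplier).min(MAX_LATENCY_RATIO);
    1.0 / penalty
}

/// Remembers the last factor per author so a suppressed validator
/// does not snap back to the base weight when its samples run out.
struct LatencyFactors {
    last: Mutex<HashMap<String, f64>>,
}

impl LatencyFactors {
    fn new() -> Self {
        Self { last: Mutex::new(HashMap::new()) }
    }

    fn factor_for(&self, author: &str, observations: &[f64], median: f64, multiplier: f64) -> f64 {
        let mut last = self.last.lock().unwrap();
        if observations.len() < MIN_OBSERVATIONS {
            // Too few fresh samples: carry the previous factor forward,
            // or use 1.0 for a validator never seen before.
            return last.get(author).copied().unwrap_or(1.0);
        }
        let mean = observations.iter().sum::<f64>() / observations.len() as f64;
        let factor = penalty_factor(mean, median, multiplier);
        last.insert(author.to_string(), factor);
        factor
    }
}
```

The final weight in this sketch would be active_weight multiplied by the returned factor, so a validator at twice the median with multiplier 2.0 ends up at a quarter of the base weight.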
Tests:
- All 7 existing latency-weighted tests updated to match new formula.
- New `test_latency_weighted_carry_forward_for_unobserved_validator`: verifies V1's penalty is preserved across calls when V1 has too few fresh observations.
- `test_latency_weighted_max_ratio_clamp` updated to test the penalty floor (V1 weight = active_weight / 10) rather than the boost ceiling (gone).

Forge config restored to multiplier=2.0× to validate the redesigned heuristic at the previously-known-stable setting.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
If suppressing V6 too aggressively shifts the structural cut-off onto V5 (geographic asymmetry hypothesis), a milder multiplier should keep V6 reasonably suppressed without making V5 the new bottleneck. This is combined with #19341's strict classifier (failed_weight=0, threshold=5%), which still hard-bans V6 from leadership entirely.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
✅ Forge suite
Summary
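Combines two complementary improvements to leader reputation:
- Latency-weighted proposer weighting, so validators with lower historical commit-to-commit interval are preferred as proposers
- Tighter classifier (failed_weight=0, failure_threshold_percent=5), so a slow validator is reliably banned

Gated behind a new on-chain config variant so all validators deterministically agree on whether to enable it (no rollout fork).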
Heuristic
Splits successful pair intervals 50/50 between the newer and older proposer; attributes timeout-spanning gaps to the failed proposer(s) via failed_proposer_indices. Aggregates with mean (not median, which discarded the failure tail). Healthy adjacent proposers are no longer wrongly penalized for absorbing others' timeouts.
Guards:
- MIN_OBSERVATIONS=2
- MAX_LATENCY_RATIO=10×
On-chain gate
New LeaderReputationType::ProposerAndVoterV3(ProposerAndVoterConfigV3) with use_latency_weighted: bool and latency_weight_multiplier_milli: u32 (BCS-friendly milli-units; 1000 = 1.0×). V1/V2 default the toggle to false → no behavior change for existing payloads.
Forge config
Tests
7 new unit tests for the heuristic (50/50 split, failure attribution, multi-failure split, per-validator fallback, empty history, ratio clamp, cross-epoch skip). All 18 leader-reputation lib tests pass.
Experiment ladder
Forge results (run 2026-04-28 21:12-21:26 UTC, 14 min, 4k TPS, 1 slow validator)
Commit-accepted latency vs baseline #19330:
Winner across every percentile. The classifier provides a strong p90 floor; the heuristic flattens the p99 tail. Together they yield −29% p99 (1.27→0.90) vs. baseline.
Conclusions
Test plan
land_blocking, compare commit p50/p75/p90/p99 against baseline "[forge] Latency baseline with 1 slow validator at 4k TPS" #19330
⚠ Prototype/experiment code: not for merge to main as-is. Canonical merge requires governance migration of the on-chain config.
🤖 Generated with Claude Code