[forge][experiment] Window-only: proposer window multiplier 10× → 100×#19574

Draft
danielxiangzl wants to merge 3 commits into main from daniel/latency-window-only
@danielxiangzl

@danielxiangzl danielxiangzl commented Apr 28, 2026

Summary

Window-only experiment branch. Bumps proposer_window_num_validators_multiplier from the default 10 to 100 (700 blocks / ~35 sec window with 7 forge validators), keeping all other classifier parameters at baseline defaults (failed_weight=1, failure_threshold_percent=10, no latency-weighted heuristic).
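
The window arithmetic above can be checked with a short sketch. The function name is hypothetical, and the 50 ms block time is an assumption inferred from "700 blocks / ~35 sec", not a value taken from the codebase.

```python
def proposer_window_blocks(multiplier: int, num_validators: int) -> int:
    """Window length in blocks: multiplier times the validator count."""
    return multiplier * num_validators


# Assumption: ~50 ms per block, inferred from 700 blocks covering ~35 s.
ASSUMED_BLOCK_TIME_SECS = 0.05

for multiplier in (10, 100):
    blocks = proposer_window_blocks(multiplier, num_validators=7)
    approx_secs = blocks * ASSUMED_BLOCK_TIME_SECS
    print(f"{multiplier}x -> {blocks} blocks, ~{approx_secs:.1f} s")
```

Under that assumed block time, the default 10× window covers only a few seconds of history, while the 100× window covers roughly half a minute.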

Why

Acts as a control PR to cleanly decompose the latency-improvement experiment ladder:

| PR | window | threshold | failed_weight | heuristic |
|---|---|---|---|---|
| #19330 (baseline) | 10× | 10% | 1 | off |
| this PR (window-only) | 100× | 10% | 1 | off |
| #19566 (classifier) | 100× | 5% | 0 | off |
| #19567 (heuristic-only) | 100× | 10% | 1 | on (2×) |
| #19341 (classifier+heuristic) | 100× | 5% | 0 | on (2×) |

Hypothesis (CONFIRMED)

P90 should not improve, because the larger window stabilizes the measured failure rate of V6 (the slow validator) near the 10% threshold boundary, which equals V6's true failure rate. The result is a more consistent "active" classification instead of the helpful oscillation into "failed" that a noisier estimate produces. This confirms that the threshold/failed_weight changes, not the window bump, are the actual fix in #19566 and #19341.
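
A minimal sketch of why a longer window pins the classification, under loudly stated assumptions: `classify` is a hypothetical stand-in for the real classifier that uses a strict "greater than threshold" comparison and ignores votes; V6 is assumed to get roughly one seventh of the proposal slots in the window; and failures are modeled as independent draws at a 10% rate, which only approximates the deterministic "every 10th round" delay in the harness.

```python
import math


def classify(failed: int, succeeded: int, threshold_percent: int = 10) -> str:
    """Hypothetical classifier: 'failed' only when the measured failure
    rate strictly exceeds the threshold (assumed comparison)."""
    total = failed + succeeded
    if total == 0:
        return "inactive"
    if failed * 100 > total * threshold_percent:
        return "failed"
    return "active"


def failure_rate_std_error(true_rate: float, proposals: int) -> float:
    """Binomial standard error of the measured failure rate."""
    return math.sqrt(true_rate * (1.0 - true_rate) / proposals)


# Assumed proposal counts for V6: window size divided by 7 validators.
for window_blocks in (70, 700):
    proposals = window_blocks // 7
    err = failure_rate_std_error(0.10, proposals)
    print(f"window={window_blocks}: ~{proposals} proposals, std error ~{err:.3f}")

print(classify(failed=1, succeeded=9))    # exactly 10%: not above threshold
print(classify(failed=2, succeeded=8))    # one extra failure flips the result
print(classify(failed=10, succeeded=90))  # exactly 10% again
```

In this model a single extra failure in a 10-proposal window is enough to cross the threshold, whereas the 100-proposal window has a standard error about three times smaller, so the estimate stays close to 10% and, with a strict comparison, tends to remain "active".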

Forge results (run 2026-04-28 21:24-21:40 UTC, 16 min, 4k TPS, 1 slow validator)

Commit-accepted latency vs baseline #19330:

| | p50 | p75 | p90 | p99 |
|---|---|---|---|---|
| #19330 baseline | 0.209 | 0.298 | 0.658 | 1.268 |
| this PR | 0.220 | 0.310 | 0.685 | 1.281 |
| Δ vs baseline | +5% | +4% | +4% | +1% |

Every percentile is within 5% of baseline (and slightly worse, not better), which this single run treats as indistinguishable from baseline. Window increase alone provides no measurable benefit when the threshold equals the true failure rate.
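
The Δ row can be reproduced from the two latency rows with a few lines of Python. The dictionary and function names are illustrative only; the values are copied from the results above.

```python
baseline = {"p50": 0.209, "p75": 0.298, "p90": 0.658, "p99": 1.268}
this_pr = {"p50": 0.220, "p75": 0.310, "p90": 0.685, "p99": 1.281}


def relative_delta_percent(new: float, old: float) -> float:
    """Relative change of `new` versus `old`, in percent."""
    return (new - old) / old * 100.0


# Round to whole percent, matching the precision reported in the PR.
deltas = {
    q: round(relative_delta_percent(this_pr[q], baseline[q]))
    for q in baseline
}
print(deltas)  # {'p50': 5, 'p75': 4, 'p90': 4, 'p99': 1}
```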

Conclusion

Window size is not a useful lever in isolation. Decision-relevant value of this PR: it isolates "window's solo contribution = ~0%" so the gains in #19566/#19341 can be attributed entirely to the classifier/heuristic changes, not the window bump that they all also include.

⚠ Draft / experiment — not for merge to main.

🤖 Generated with Claude Code

danielxiangzl and others added 3 commits April 27, 2026 13:28
Forge harness for measuring commit-latency baseline against leader-
reputation experiments. Self-contained:
- realistic_env_max_load: ConstTps 4000 with 7 validators, 0 fullnodes
- 24-hour epoch_duration_secs (no epoch changes during the test)
- 20-min effective duration via with_duration_override(1200)
- Last validator (by ordered index) simulates a slow proposer: 1s
  proposal delay every 10th round, guaranteeing round timeout

Adds the duration_override plumbing to ForgeConfig/runner so individual
tests can override the CLI duration. No P90 latency or multi-region
changes; pure baseline for A/B against latency-weighted leader-rep.

Prototype/experiment code -- not for merge to main.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Isolates the contribution of just bumping the leader-reputation proposer
window from 10x (default, ~70 blocks) to 100x (~700 blocks / 35s). All
other classifier parameters (failed_weight=1, failure_threshold_percent=10)
match the baseline.

Acts as a control for #19567 (heuristic-only) and a decomposition step
for #19566 (which bundles window + threshold + failed_weight changes).
Hypothesis: window alone slightly regresses p90 because larger window
stabilizes the failure-rate estimate near the threshold boundary
(threshold = V6's true 10% rate).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@danielxiangzl danielxiangzl added the CICD:run-forge-e2e-perf Run the e2e perf forge only label Apr 28, 2026
@github-actions

✅ Forge suite realistic_env_max_load success on c93c846783dd6ae780a26abf58b50411ac91afec

two traffics test: inner traffic : committed: 4000.00 txn/s, latency: 462.04 ms, (p50: 300 ms, p70: 300, p90: 900 ms, p99: 1500 ms), latency samples: 55080
two traffics test : committed: 100.01 txn/s, latency: 399.79 ms, (p50: 300 ms, p70: 400, p90: 800 ms, p99: 1500 ms), latency samples: 3260
Latency breakdown for phase 0: ["MempoolToBlockCreation: max: 0.262, avg: 0.190", "ConsensusProposalToOrdered: max: 0.089, avg: 0.083", "ConsensusOrderedToCommit: max: 0.024, avg: 0.018", "ConsensusProposalToCommit: max: 0.109, avg: 0.101"]
Max non-epoch-change gap was: 1 rounds at version 5800 (avg 0.01) [limit 4], 1.24s no progress at version 3245865 (avg 0.05s) [limit 15].
Max epoch-change gap was: 0 rounds at version 0 (avg 0.00) [limit 4], 0.00s no progress at version 0 (avg 0.00s) [limit 16].
Test Ok

@danielxiangzl danielxiangzl changed the title [forge][experiment] Window-only: proposer window multiplier 10x -> 100x [forge][experiment] Window-only: proposer window multiplier 10× → 100× Apr 28, 2026