[consensus] HACK: Skip proposals to simulate proposer faults in P90 test by danielxiangzl · Pull Request #19328 · aptos-labs/aptos-core

danielxiangzl · 2026-04-04T00:45:26Z

Summary

~20% of validators (those whose last address byte mod 10 is < 2) skip both regular and optimistic proposals 50% of the time (on even rounds)
Lightweight alternative to killing pods: simulates proposer failures directly in consensus
Prototype/experiment code, not for merge to main
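
The selection rule in the summary can be modeled in a few lines. This is only an illustrative Python sketch, not the consensus code from the PR: the function names are hypothetical, and the "~20%" figure is checked under the assumption that the last address byte is uniformly distributed.

```python
def is_faulty_validator(address: bytes) -> bool:
    """Selection rule from the summary: last address byte mod 10 < 2."""
    return address[-1] % 10 < 2


def should_skip_proposal(address: bytes, round_number: int) -> bool:
    """A selected validator skips its proposal on even rounds only."""
    return is_faulty_validator(address) and round_number % 2 == 0


# Count the last-byte values (0..255) that satisfy the selection rule.
# Assumption: address bytes are uniformly distributed.
selected = sum(1 for b in range(256) if b % 10 < 2)
selected_fraction = selected / 256
```

Under that assumption, 52 of the 256 possible byte values are selected, which is about 20.3% and matches the "~20%" in the summary.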

Test plan

Run forge land_blocking (P90 latency test) and observe round timeout rate
Verify faulty validators trigger timeouts and leader reputation kicks in

🤖 Generated with Claude Code

Replaces the land_blocking forge suite with a latency-focused test that uses a mainnet-representative validator distribution (~70% EU, ~20% NA, ~10% Asia) instead of the previous even 25%/25%/25%/25% four-region split. The even split over-weights Asia (25% vs ~2% on mainnet) and under-weights EU, making P90 thresholds misleading. The new distribution causes EU proposers to dominate rounds as they do on mainnet, exercising the actual latency bottlenecks (distant proposers racing with EU batch arrival).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
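
To make the region percentages in the commit message above concrete, here is a rough sketch of how they could translate into per-region validator counts. The function name and the largest-remainder rounding are assumptions for illustration; the actual Forge suite configuration is not shown in this PR page.

```python
def allocate_by_region(total: int, weights: dict) -> dict:
    """Split `total` validators across regions in proportion to `weights`.

    Uses largest-remainder rounding so the counts always sum to `total`.
    Hypothetical helper, not part of Forge.
    """
    weight_sum = sum(weights.values())
    quotas = {region: total * w / weight_sum for region, w in weights.items()}
    counts = {region: int(q) for region, q in quotas.items()}
    leftover = total - sum(counts.values())
    # Hand out the remaining seats to the regions with the largest fractional part.
    by_remainder = sorted(weights, key=lambda r: quotas[r] - counts[r], reverse=True)
    for region in by_remainder[:leftover]:
        counts[region] += 1
    return counts


mainnet_like = allocate_by_region(20, {"EU": 70, "NA": 20, "Asia": 10})
```

For a 20-validator network, the ~70/20/10 split works out to 14 EU, 4 NA, and 2 Asia validators under this rounding.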

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Last 2 validators (by ordered index) skip both regular and optimistic proposals 50% of the time (even rounds). Uses last 2 to avoid EU nodes which dominate the front of the ordered list in mainnet-like distribution. This is prototype/experiment code, not for merge to main.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
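
The index-based rule in the commit message above could look roughly like this. It is a Python model with hypothetical names, not the Rust change itself, and it assumes validators are identified by their position in the ordered validator list.

```python
def faulty_indices(num_validators: int, num_faulty: int = 2) -> range:
    """Indices of the last `num_faulty` validators in the ordered list."""
    return range(max(0, num_validators - num_faulty), num_validators)


def skips_by_index(validator_index: int, round_number: int, num_validators: int) -> bool:
    """A validator at one of the last indices skips its proposal on even rounds."""
    return validator_index in faulty_indices(num_validators) and round_number % 2 == 0
```

For a 20-validator set this selects indices 18 and 19. As a back-of-envelope bound, if proposers were chosen uniformly, about 2/20 × 1/2 = 5% of rounds would have no proposal; leader reputation is expected to push the real rate lower by electing the faulty validators less often.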

github-actions · 2026-04-04T01:24:45Z

✅ Forge suite `realistic_env_max_load` success on `8aa9231232f954754acc0b39558bcc117684099e`

Forge report malformed: Expecting property name enclosed in double quotes: line 10 column 1 (char 182)
'{\n  "metrics": [\n    {\n      "test_name": "performance benchmark",\n      "metric": "submitted_txn",\n      "value": 1427880.0\n    },\n    {\n      "test_name": "performance benchmark",\n[2026-04-04T01:24:41Z INFO  aptos_forge::report] Test Ok\n      "metric": "expired_txn",\n      "value": 0.0\n    },\n    {\n      "test_name": "performance benchmark",\n      "metric": "avg_tps",\n      "value": 3500.1082648790402\n    },\n    {\n      "test_name": "performance benchmark",\n      "metric": "avg_latency",\n      "value": 292.21386173184356\n    },\n    {\n      "test_name": "performance benchmark",\n      "metric": "p50_latency",\n      "value": 200.0\n    },\n    {\n      "test_name": "performance benchmark",\n      "metric": "p90_latency",\n      "value": 300.0\n    },\n    {\n      "test_name": "performance benchmark",\n      "metric": "p99_latency",\n      "value": 1200.0\n    }\n  ],\n  "text": "performance benchmark : committed: 3500.11 txn/s, latency: 292.21 ms, (p50: 200 ms, p70: 200, p90: 300 ms, p99: 1200 ms), latency samples: 28640\\nLatency breakdown for phase 0: [\\"MempoolToBlockCreation: max: 0.151, avg: 0.111\\", \\"ConsensusProposalToOrdered: max: 0.086, avg: 0.084\\", \\"ConsensusOrderedToCommit: max: 0.016, avg: 0.015\\", \\"ConsensusProposalToCommit: max: 0.102, avg: 0.099\\"]\\nMax non-epoch-change gap was: 1 rounds at version 14301 (avg 0.00) [limit 10], 1.47s no progress at version 1651808 (avg 0.04s) [limit 30].\\nMax epoch-change gap was: 0 rounds at version 0 (avg 0.00) [limit 10], 0.00s no progress at version 0 (avg 0.00s) [limit 30].\\nTest Ok"\n}'
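
The parse error above appears to come from a Forge log line (`[2026-04-04T01:24:41Z INFO  aptos_forge::report] Test Ok`) being interleaved with the JSON report. The sketch below reproduces the failure on a shortened payload and shows one possible workaround, dropping lines that start with a bracketed timestamp before parsing. The helper name and regex are assumptions; this is not how the Forge report script handles it.

```python
import json
import re

# Matches lines that begin with a bracketed UTC timestamp, e.g. "[2026-04-04T01:24:41Z INFO ...".
LOG_LINE = re.compile(r"^\[\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z\s")


def strip_log_lines(raw: str) -> str:
    """Drop interleaved log lines so that only the JSON text remains."""
    return "\n".join(line for line in raw.split("\n") if not LOG_LINE.match(line))


# Shortened version of the malformed report, with the same interleaved log line.
raw_report = (
    '{\n'
    '  "metrics": [\n'
    '    {\n'
    '      "test_name": "performance benchmark",\n'
    '[2026-04-04T01:24:41Z INFO  aptos_forge::report] Test Ok\n'
    '      "metric": "expired_txn",\n'
    '      "value": 0.0\n'
    '    }\n'
    '  ]\n'
    '}'
)

try:
    json.loads(raw_report)
    parsed_raw = True
except json.JSONDecodeError:
    parsed_raw = False  # the stray log line breaks the object

report = json.loads(strip_log_lines(raw_report))
```

A line filter like this would also drop a legitimate JSON line that happened to start with such a timestamp, so writing logs and the report to separate streams would be the more robust fix.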
Trailing Log Lines:
networkchaos.chaos-mesh.org "4-gcp--as-southeast1-to-3-gcp--us-east4-netem" deleted from forge-e2e-pr-19328 namespace
[2026-04-04T01:24:30Z INFO  ureq::unit] sending request POST http://vmagent-victoria-metrics-agent.victoria-metrics.svc:8429/api/v1/import/prometheus
test CompositeNetworkTest ... ok
Test Statistics: 
performance benchmark : committed: 3500.11 txn/s, latency: 292.21 ms, (p50: 200 ms, p70: 200, p90: 300 ms, p99: 1200 ms), latency samples: 28640
Latency breakdown for phase 0: ["MempoolToBlockCreation: max: 0.151, avg: 0.111", "ConsensusProposalToOrdered: max: 0.086, avg: 0.084", "ConsensusOrderedToCommit: max: 0.016, avg: 0.015", "ConsensusProposalToCommit: max: 0.102, avg: 0.099"]
Max non-epoch-change gap was: 1 rounds at version 14301 (avg 0.00) [limit 10], 1.47s no progress at version 1651808 (avg 0.04s) [limit 30].
Max epoch-change gap was: 0 rounds at version 0 (avg 0.00) [limit 10], 0.00s no progress at version 0 (avg 0.00s) [limit 30].
Test Ok
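
Since the JSON report was malformed in this run, the latency percentiles can still be recovered from the plain-text statistics line above. The following is a minimal sketch; the helper name is hypothetical and the regex assumes the `pNN: value` layout shown in that line.

```python
import re

STATS_LINE = (
    "performance benchmark : committed: 3500.11 txn/s, latency: 292.21 ms, "
    "(p50: 200 ms, p70: 200, p90: 300 ms, p99: 1200 ms), latency samples: 28640"
)


def parse_percentiles(line: str) -> dict:
    """Map each percentile (e.g. 90) to its latency in milliseconds."""
    return {int(p): float(v) for p, v in re.findall(r"p(\d+): (\d+(?:\.\d+)?)", line)}


percentiles = parse_percentiles(STATS_LINE)
p90_ms = percentiles[90]
```

A P90 check would then compare `p90_ms` against whatever threshold the suite defines; that threshold is not stated on this page.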

=== BEGIN JUNIT ===
<?xml version="1.0" encoding="UTF-8"?>
<testsuites name="forge" tests="1" failures="0" errors="0" uuid="5d7cea94-8e0e-4c3c-94c1-ea09267acd5b">
    <testsuite name="local" tests="1" disabled="0" errors="0" failures="0">
        <testcase name="CompositeNetworkTest(network:multi-region-network-emulation(performance benchmark)) with ">
        </testcase>
    </testsuite>
</testsuites>
=== END JUNIT ===
[2026-04-04T01:24:41Z INFO  aptos_forge::backend::k8s::cluster_helper] Deleting namespace forge-e2e-pr-19328: Some(NamespaceStatus { conditions: None, phase: Some("Terminating") })
[2026-04-04T01:24:41Z INFO  aptos_forge::backend::k8s::cluster_helper] aptos-node resources for Forge removed in namespace: forge-e2e-pr-19328
[2026-04-04T01:24:41Z INFO  ureq::unit] sending request POST http://vmagent-victoria-metrics-agent.victoria-metrics.svc:8429/api/v1/import/prometheus

test result: ok. 1 passed; 0 soft failed; 0 hard failed; 0 filtered out

Debugging output:
NAME                                         READY   STATUS      RESTARTS   AGE
aptos-node-0-validator-0                     1/1     Running     0          14m
aptos-node-1-validator-0                     1/1     Running     0          14m
aptos-node-10-validator-0                    1/1     Running     0          14m
aptos-node-11-validator-0                    1/1     Running     0          14m
aptos-node-12-validator-0                    1/1     Running     0          14m
aptos-node-13-validator-0                    1/1     Running     0          14m
aptos-node-14-validator-0                    1/1     Running     0          14m
aptos-node-15-validator-0                    1/1     Running     0          14m
aptos-node-16-validator-0                    1/1     Running     0          14m
aptos-node-17-validator-0                    1/1     Running     0          14m
aptos-node-18-validator-0                    1/1     Running     0          14m
aptos-node-19-validator-0                    1/1     Running     0          14m
aptos-node-2-validator-0                     1/1     Running     0          14m
aptos-node-3-validator-0                     1/1     Running     0          14m
aptos-node-4-validator-0                     1/1     Running     0          14m
aptos-node-5-validator-0                     1/1     Running     0          14m
aptos-node-6-validator-0                     1/1     Running     0          14m
aptos-node-7-validator-0                     1/1     Running     0          14m
aptos-node-8-validator-0                     1/1     Running     0          14m
aptos-node-9-validator-0                     1/1     Running     0          14m
forge-testnet-deployer-4h6qq                 0/1     Completed   0          14m
genesis-aptos-genesis-eforge618dfd42-nvgdn   0/1     Completed   0          14m

danielxiangzl and others added 2 commits April 3, 2026 13:11

[ci] Increase forge land_blocking duration to 600s for P90 latency test

825c202

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

danielxiangzl added the CICD:run-forge-e2e-perf label (Run the e2e perf forge only) Apr 4, 2026

danielxiangzl force-pushed the daniel/latency-skip-proposal branch from d911a1b to 8aa9231 April 4, 2026 00:50

danielxiangzl closed this Apr 6, 2026

danielxiangzl wants to merge 3 commits into main from daniel/latency-skip-proposal
