
Commit 7513e78

test: mark tx_stats_bench 10 TPS as flake-retryable on merge-train/spartan (#23083)
## Motivation

The `verifies transactions at 10 TPS` sub-test of [`yarn-project/end-to-end/src/bench/tx_stats_bench.test.ts`](https://github.com/AztecProtocol/aztec-packages/blob/merge-train/spartan/yarn-project/end-to-end/src/bench/tx_stats_bench.test.ts) is now reliably flaking on the `bench all` step of `merge-train/spartan`. It has fired on at least two different merge-train commits hours apart, with no relation to either commit's diff:

| Run | Triggering merge-train commit | CI log |
|---|---|---|
| [25546251580](https://github.com/AztecProtocol/aztec-packages/actions/runs/25546251580) | #22934 (refactor(node-rpc)! removing deprecated AztecNode methods) | http://ci.aztec-labs.com/1778227975844707 |
| [25552992890](https://github.com/AztecProtocol/aztec-packages/actions/runs/25552992890) | #22405 (feat(p2p): detect and track announce IP changes at runtime) | http://ci.aztec-labs.com/1778237470322975 |

Both runs hit the same assertion:

```
● transaction benchmarks › verifies transactions at 10 TPS

  expect(received).toBe(expected) // Object.is equality

  Expected: true
  Received: false

  at bench/tx_stats_bench.test.ts:268
```

Sub-test failing log on the latest run: http://ci.aztec-labs.com/ca459ca73d02002c (`bench all` parent: http://ci.aztec-labs.com/90616bad7bf7ebaa).

The other three sub-tests in the suite (compression; single private verify x20 serial; single public verify x20 serial) pass cleanly against the same proven txs in both runs. The failure is in the stress sub-test that fires 600 IVC verifications at 10/s with 8 concurrent IVC verifiers (`BB_NUM_IVC_VERIFIERS=8`, `BB_IVC_CONCURRENCY=1`). At least one verification returns `valid: false` under load.

## Cause

Neither triggering commit touches the IVC verifier path:

- #22934 is a pure node-rpc surface refactor.
- #22405 is p2p / discv5 ENR plumbing.

The two failures sharing this signature across unrelated diffs is strong evidence that the flake is independent of the merge-train commit and stems from the bench infrastructure itself. The likely culprit is the recent bb-prover migration to the bb.js `NativeUnixSocket` backend (#21564), which spawns a fresh bb subprocess per Chonk verification via `withVerifierInstance`. Under 8x parallel verifications on the CPU-isolated bench host (each verifier requesting 16 threads, 8 × 16 = 128 threads on 56 isolated cores), transient verifier failures appear. The bench-output log shows continuous `bb.js - Received signal 15, shutting down gracefully...` traffic during the 10 TPS phase: verifier instances are being torn down rapidly, and at least one verification slips through with a stale/incomplete response.

Because the serial sub-tests (`numIterations = 20` sequential) pass cleanly in both runs, this is a stress-only interaction, not a correctness regression.

## Approach

Add `tx_stats_bench` to `.test_patterns.yml` with an `error_regex` anchored to the test file's stack-trace line (`tx_stats_bench.test.ts:<line>:<col>`), and assign `*charlie` as owner (author of the bb.js migration). With this entry, `ci3/run_test_cmd` retries the test once on failure and treats a single retry-pass as a flake instead of a hard fail, unblocking the merge train for unrelated commits while Charlie investigates the underlying concurrency interaction with the bb.js backend.

The `error_regex` is intentionally narrow (file + line + column from the stack trace) so other ways tx_stats_bench could fail (timeout, OOM, infra) are still surfaced as hard fails.

## Changes

- `.test_patterns.yml`: add a `tx_stats_bench` entry with an error_regex anchored to the test file's stack-trace line and `*charlie` as owner.

ClaudeBox logs:

- https://claudebox.work/s/6e7853d3a073145f?run=1 (initial diagnosis on #22934 failure)
- https://claudebox.work/s/c12a360275f05ad3?run=1 (this update on #22405 recurrence)
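As a rough illustration of how narrow the `error_regex` is, the sketch below applies the pattern (as it reads once YAML has unescaped the double-quoted string) to a few sample outputs. The sample strings, the column number `27`, and the helper name `isKnownFlake` are assumptions made up for this example; the real matching is done by the CI tooling, whose regex dialect may differ from JavaScript's.

```typescript
// Pattern from the .test_patterns.yml entry, after YAML turns "\\." into "\.".
const errorRegex = /tx_stats_bench\.test\.ts:[0-9]+:[0-9]+/;

// Hypothetical sample outputs. The column (27) is invented for illustration;
// the log quoted in this commit message shows only the line number (268).
const assertionTrace = "at bench/tx_stats_bench.test.ts:268:27";
const abbreviatedTrace = "at bench/tx_stats_bench.test.ts:268";
const timeoutOutput = "tx_stats_bench: Exceeded timeout of 600000 ms for a test";

// Hypothetical helper: true when the output contains a file:line:col frame
// from the test file.
function isKnownFlake(output: string): boolean {
  return errorRegex.test(output);
}

console.log(isKnownFlake(assertionTrace));   // true
console.log(isKnownFlake(abbreviatedTrace)); // false (no column present)
console.log(isKnownFlake(timeoutOutput));    // false
```

The pattern does not pin the line number, so any failure whose output contains a `file:line:col` frame from this test file would match, not only the assertion at line 268.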
1 parent b6bce5c commit 7513e78

1 file changed

Lines changed: 8 additions & 0 deletions


.test_patterns.yml

@@ -302,6 +302,14 @@ tests:
     owners:
       - *adam
 
+  # tx_stats_bench's 10 TPS sub-test occasionally returns valid:false from a single IVC
+  # verification under heavy concurrency (8 parallel verifiers, each spawning a fresh bb subprocess
+  # via the bb.js NativeUnixSocket backend introduced in #21564). The serial sub-tests pass.
+  - regex: "tx_stats_bench"
+    error_regex: "tx_stats_bench\\.test\\.ts:[0-9]+:[0-9]+"
+    owners:
+      - *charlie
+
   - regex: "src/e2e_token_bridge_tutorial.test.ts"
     error_regex: "Error: Unable to find low leaf for block"
     owners:
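
The retry behaviour the commit message attributes to `ci3/run_test_cmd` for an entry like the one added here can be sketched as follows. This is a minimal model under stated assumptions, not the actual script: the types, `classifyRun`, and `replay` are hypothetical names, and the only behaviour taken from the commit message is "retry once on a matching failure, count a retry-pass as a flake, keep other failures as hard fails".

```typescript
// Hypothetical types and names; this is not the ci3/run_test_cmd implementation.
type RunResult = { passed: boolean; output: string };
type Verdict = "pass" | "flake" | "fail";

function classifyRun(run: () => RunResult, pattern: RegExp): Verdict {
  const first = run();
  if (first.passed) return "pass";
  // Failures that do not match the pattern (timeout, OOM, infra) stay hard fails.
  if (!pattern.test(first.output)) return "fail";
  // Matching failure: retry exactly once; a pass on retry counts as a flake.
  return run().passed ? "flake" : "fail";
}

// Stub runner that replays a fixed sequence of results (repeats the last one).
function replay(results: RunResult[]): () => RunResult {
  let i = 0;
  return () => results[Math.min(i++, results.length - 1)];
}

const flakePattern = /tx_stats_bench\.test\.ts:[0-9]+:[0-9]+/;
// Hypothetical outputs; the column number is invented for illustration.
const assertionFailure: RunResult = { passed: false, output: "at bench/tx_stats_bench.test.ts:268:27" };
const cleanPass: RunResult = { passed: true, output: "" };
const oomKill: RunResult = { passed: false, output: "Killed" };

console.log(classifyRun(replay([assertionFailure, cleanPass]), flakePattern)); // flake
console.log(classifyRun(replay([assertionFailure, assertionFailure]), flakePattern)); // fail
console.log(classifyRun(replay([oomKill, cleanPass]), flakePattern)); // fail
```

In this model a non-matching failure is never retried, which is what keeps timeouts and OOM kills visible as hard fails.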
