Commit 7513e78
test: mark tx_stats_bench 10 TPS as flake-retryable on merge-train/spartan (#23083)
## Motivation
The `verifies transactions at 10 TPS` sub-test of
[`yarn-project/end-to-end/src/bench/tx_stats_bench.test.ts`](https://github.com/AztecProtocol/aztec-packages/blob/merge-train/spartan/yarn-project/end-to-end/src/bench/tx_stats_bench.test.ts)
is now flaking repeatedly on the `bench all` step of
`merge-train/spartan`. It has failed on at least two different
merge-train commits hours apart, with no relation to either commit's
diff:
| Run | Triggering merge-train commit | CI log |
|---|---|---|
| [25546251580](https://github.com/AztecProtocol/aztec-packages/actions/runs/25546251580) | #22934 (refactor(node-rpc)! removing deprecated AztecNode methods) | http://ci.aztec-labs.com/1778227975844707 |
| [25552992890](https://github.com/AztecProtocol/aztec-packages/actions/runs/25552992890) | #22405 (feat(p2p): detect and track announce IP changes at runtime) | http://ci.aztec-labs.com/1778237470322975 |
Both runs hit the same assertion:
```
● transaction benchmarks › verifies transactions at 10 TPS
expect(received).toBe(expected) // Object.is equality
Expected: true
Received: false
at bench/tx_stats_bench.test.ts:268
```
Log of the failing sub-test on the latest run:
http://ci.aztec-labs.com/ca459ca73d02002c (`bench all` parent:
http://ci.aztec-labs.com/90616bad7bf7ebaa).
The other three sub-tests in the suite (compression; single private
verify x20 serial; single public verify x20 serial) pass cleanly against
the same proven txs in both runs. The failure is in the stress sub-test
that fires 600 IVC verifications at 10/s with 8 concurrent IVC verifiers
(`BB_NUM_IVC_VERIFIERS=8`, `BB_IVC_CONCURRENCY=1`). At least one
verification returns `valid: false` under load.
## Cause
Neither triggering commit touches the IVC verifier path:
- #22934 is a pure node-rpc surface refactor.
- #22405 is p2p / discv5 ENR plumbing.
Two failures with the same signature across unrelated diffs are strong
evidence that the flake is independent of the merge-train commit and
stems from the bench infrastructure itself.
The likely culprit is the recent bb-prover migration to the bb.js
`NativeUnixSocket` backend (#21564), which spawns a fresh bb subprocess
per Chonk verification via `withVerifierInstance`. Under 8x parallel
verifications on the CPU-isolated bench host (each verifier requesting
16 threads, 8 × 16 = 128 threads on 56 isolated cores), transient
verifier failures appear. The bench-output log shows continuous `bb.js -
Received signal 15, shutting down gracefully...` traffic during the 10
TPS phase: verifier instances are being torn down rapidly, and at least
one verification slips through with a stale/incomplete response. Because
the serial sub-tests (`numIterations = 20` sequential) pass cleanly in
both runs, this is a stress-only interaction, not a correctness
regression.
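
To make the oversubscription concrete, the thread arithmetic above can be worked through in a few lines. This is only a sketch using the figures quoted in this description; the constant names are illustrative and do not come from the bench code.

```typescript
// Figures quoted in the description above; the names are illustrative only.
const concurrentVerifiers = 8; // BB_NUM_IVC_VERIFIERS
const threadsPerVerifier = 16; // threads each verifier requests
const isolatedCores = 56; // cores on the CPU-isolated bench host

// Total threads requested when all verifiers run at once.
const requestedThreads = concurrentVerifiers * threadsPerVerifier;

// Ratio of requested threads to available cores.
const oversubscription = requestedThreads / isolatedCores;

console.log(`${requestedThreads} threads requested on ${isolatedCores} cores`);
console.log(`oversubscription factor: ${oversubscription.toFixed(2)}x`);
```

With these numbers the host is asked for roughly 2.29 threads per isolated core, while the serial sub-tests request at most 16 threads at a time.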
## Approach
Add `tx_stats_bench` to `.test_patterns.yml` with an `error_regex`
anchored to the test file's stack-trace line
(`tx_stats_bench.test.ts:<line>:<col>`), and assign `*charlie` as owner
(author of the bb.js migration). With this entry, `ci3/run_test_cmd`
retries the test once on failure and treats a single retry-pass as a
flake instead of a hard fail, unblocking the merge train for unrelated
commits while Charlie investigates the underlying concurrency
interaction with the bb.js backend.
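
The retry behaviour described above can be modelled roughly as follows. This is a hypothetical sketch, not the logic of `ci3/run_test_cmd` itself: the `classify` function, the `RunResult` shape, and the rule that only failures matching the pattern are retried are assumptions drawn from this description, and the `:7` column in the sample stack frame is made up.

```typescript
// Hypothetical model of retry-once flake handling; not the real ci3 script.
type Outcome = "pass" | "flake" | "fail";

interface RunResult {
  passed: boolean;
  output: string; // captured test output
}

function classify(run: () => RunResult, errorRegex: RegExp): Outcome {
  const first = run();
  if (first.passed) return "pass";
  // Assumption: only failures whose output matches the pattern are retried.
  if (!errorRegex.test(first.output)) return "fail";
  // One retry; a pass on the retry is reported as a flake, not a hard fail.
  const second = run();
  return second.passed ? "flake" : "fail";
}

// Simulated run: the first attempt fails on the assertion, the retry passes.
const attempts: RunResult[] = [
  { passed: false, output: "at bench/tx_stats_bench.test.ts:268:7" },
  { passed: true, output: "" },
];
let attemptIndex = 0;
const result = classify(
  () => attempts[attemptIndex++],
  /tx_stats_bench\.test\.ts:\d+:\d+/,
);
console.log(result); // "flake"
```

A second consecutive failure, or a first failure that does not match the pattern, still comes back as `"fail"` in this model.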
The `error_regex` is intentionally narrow (file + line + column from the
stack trace) so other ways tx_stats_bench could fail (timeout, OOM,
infra) are still surfaced as hard fails.
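
As an illustration of how a pattern of this shape separates the assertion failure from other failure modes, here is a minimal sketch. The regex is an assumed form of the `file:<line>:<col>` anchor, not the exact string added to `.test_patterns.yml`, and the sample log lines (including the `:7` column) are invented for the example rather than copied from CI output.

```typescript
// Assumed shape of the anchor: test file name followed by :<line>:<col>.
const errorRegex = /tx_stats_bench\.test\.ts:\d+:\d+/;

// Hypothetical log lines, one per failure mode mentioned above.
const samples: { label: string; line: string }[] = [
  { label: "assertion stack frame", line: "at bench/tx_stats_bench.test.ts:268:7" },
  { label: "timeout", line: "Error: test timed out" },
  { label: "OOM", line: "Killed: out of memory" },
];

// A line is treated as the known flake only if it carries the stack frame.
function matchesFlakePattern(line: string): boolean {
  return errorRegex.test(line);
}

for (const sample of samples) {
  const verdict = matchesFlakePattern(sample.line) ? "retryable" : "hard fail";
  console.log(`${sample.label}: ${verdict}`);
}
```

Only the line carrying the test file's stack frame matches; the timeout and OOM samples fall through to a hard fail.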
## Changes
- `.test_patterns.yml`: add a `tx_stats_bench` entry with an `error_regex`
anchored to the test file's stack-trace line and `*charlie` as owner.
ClaudeBox logs:
- https://claudebox.work/s/6e7853d3a073145f?run=1 (initial diagnosis on
#22934 failure)
- https://claudebox.work/s/c12a360275f05ad3?run=1 (this update on #22405
recurrence)

1 parent b6bce5c commit 7513e78
1 file changed
Lines changed: 8 additions & 0 deletions