Commit 8a62f05
chore(ci): flag ProvingBroker "does not retry if job is stale" as flake (#23047)
Flagging `ProvingBroker > Retries > does not retry if job is stale` as a
flake in `.test_patterns.yml`. Failure surfaced on an unrelated wallet
PR — `dbanks12`'s wallet refactor — at
http://ci.aztec-labs.com/64a972aafaa40dd0.
## Failure
```
● ProvingBroker › Retries › does not retry if job is stale
Store is closed
> 99 | throw new Error('Store is closed');
| ^
at AztecLMDBStoreV2.transactionAsync (yarn-project/kv-store/dest/lmdb-v2/store.js:99:19)
at SingleEpochDatabase.transactionAsync [as batchWrite]
(yarn-project/prover-client/src/proving_broker/proving_broker_database/persisted.ts:45:22)
at KVBrokerDatabase.batchWrite [as commitWrites]
(yarn-project/prover-client/src/proving_broker/proving_broker_database/persisted.ts:120:14)
```
The broker tries to commit the final `reportProvingJobError` write after
the per-epoch LMDB store has already been closed (the test advances the
epoch from 1 → 3, which causes the epoch-1 store to be torn down). The
race is between the epoch advance / cleanup path and the final error
write — a timing flake, not a logic bug.
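The ordering described above can be reproduced in isolation. The following is a minimal sketch, not the broker's actual code: `FakeEpochStore` and `reproduceRace` are hypothetical names, and the fake store only imitates the "throw once closed" behaviour visible in the stack trace. It assumes the error write yields to the event loop at least once before committing, which gives the epoch cleanup a window to close the store first.

```typescript
// Hypothetical stand-in for a per-epoch store. It only mimics the
// "throw once closed" behaviour seen in the stack trace above.
class FakeEpochStore {
  private closed = false;

  close(): void {
    this.closed = true;
  }

  async transactionAsync<T>(fn: () => T): Promise<T> {
    if (this.closed) {
      throw new Error('Store is closed');
    }
    return fn();
  }
}

// Schedules a write, closes the store before the write runs, and
// returns either the write result or the error message.
async function reproduceRace(): Promise<string> {
  const store = new FakeEpochStore();

  // The final error write is in flight but has not committed yet:
  // it yields once (assumed) before touching the store.
  const pendingWrite = (async () => {
    await Promise.resolve();
    return store.transactionAsync(() => 'committed');
  })();

  // Meanwhile the epoch advances and the old store is torn down.
  store.close();

  try {
    return await pendingWrite;
  } catch (err) {
    return (err as Error).message;
  }
}

reproduceRace().then((msg) => console.log(msg)); // logs "Store is closed"
```

If the cleanup ran after the write had committed, the same code would return `committed`, which is why the failure only shows up under particular timing.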
## Owner
Test was authored by `@alexghr` in #9400 (`feat: new proving broker
implementation`) and most recently edited by `@alexghr` in #22508
(`fix(prover-client): don't mark in-progress epoch N jobs as stale when
epoch N+1 starts`). `@spypsy` has also recently fixed retries-related
races in this file (#21842, #22355). Pinging Alex as primary owner; tag
Spyros if it's actually a retry-counter race rather than a
store-lifecycle race.
## Other branches
Spot-checked the most recent failed runs on `merge-train/fairies` and
`merge-train/spartan` — none of them hit this same `proving_broker` /
`Store is closed` failure in the data window I sampled. The flake has
only been observed on the one wallet PR run linked above so far.
## Pattern entry
The new entry uses both `regex` (test file path) and `error_regex`
(`does not retry if job is stale|Store is closed`), so failures in
`proving_broker.test.ts` that match neither the test name nor the
`Store is closed` message still fail CI. Only matching failures get
quarantined to a Slack ping.
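To illustrate how combining the two fields narrows the match, here is a rough sketch. It assumes the CI matcher requires both `regex` and `error_regex` to match before treating a failure as a known flake; the real matcher is not shown in this commit, so `FlakePattern`, `isQuarantined`, the path regex, and the sample test path are all hypothetical.

```typescript
// Hypothetical shape of one pattern entry. Field names mirror the
// `regex` / `error_regex` keys mentioned above.
interface FlakePattern {
  regex: RegExp; // matched against the test file path
  errorRegex?: RegExp; // matched against the failing test's output
}

// Assumed semantics: a failure is quarantined only if the path matches
// and, when an error regex is present, the output matches it too.
function isQuarantined(pattern: FlakePattern, testPath: string, output: string): boolean {
  if (!pattern.regex.test(testPath)) {
    return false;
  }
  return pattern.errorRegex ? pattern.errorRegex.test(output) : true;
}

// Illustrative entry. The path regex is a guess; the error regex is
// the one quoted in the commit message.
const entry: FlakePattern = {
  regex: /proving_broker\.test\.ts/,
  errorRegex: /does not retry if job is stale|Store is closed/,
};

// Illustrative path, not taken from the commit.
const testPath = 'prover-client/src/proving_broker/proving_broker.test.ts';

const flaky = isQuarantined(entry, testPath, 'Store is closed');
const realBug = isQuarantined(entry, testPath, 'expected 2 retries, received 3');
console.log(flaky, realBug); // logs: true false
```

Because the error regex is an alternation, either the test name or the `Store is closed` message is enough to match under these assumed semantics.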
---
*Created by
[claudebox](https://claudebox.work/v2/sessions/b4b6eb63ff789d29) ·
group: `aztec`*

1 file changed
Lines changed: 8 additions & 0 deletions