Commit 8a62f05

chore(ci): flag ProvingBroker "does not retry if job is stale" as flake (#23047)
Flagging `ProvingBroker > Retries > does not retry if job is stale` as a flake in `.test_patterns.yml`. The failure surfaced on an unrelated wallet PR (`dbanks12`'s wallet refactor) at http://ci.aztec-labs.com/64a972aafaa40dd0.

## Failure

```
● ProvingBroker › Retries › does not retry if job is stale

  Store is closed

  > 99 |     throw new Error('Store is closed');
       |     ^

    at AztecLMDBStoreV2.transactionAsync (yarn-project/kv-store/dest/lmdb-v2/store.js:99:19)
    at SingleEpochDatabase.transactionAsync [as batchWrite] (yarn-project/prover-client/src/proving_broker/proving_broker_database/persisted.ts:45:22)
    at KVBrokerDatabase.batchWrite [as commitWrites] (yarn-project/prover-client/src/proving_broker/proving_broker_database/persisted.ts:120:14)
```

The broker tries to commit the final `reportProvingJobError` write after the per-epoch LMDB store has already been closed (the test advances the epoch from 1 → 3, which causes the epoch-1 store to be torn down). The race is between the epoch advance / cleanup path and the final error write: a timing flake, not a logic bug.

## Owner

Test was authored by `@alexghr` in #9400 (`feat: new proving broker implementation`) and most recently edited by `@alexghr` in #22508 (`fix(prover-client): don't mark in-progress epoch N jobs as stale when epoch N+1 starts`). `@spypsy` has also recently fixed retries-related races in this file (#21842, #22355). Pinging Alex as primary owner; tag Spyros if it's actually a retry-counter race rather than a store-lifecycle race.

## Other branches

Spot-checked the most recent failed runs on `merge-train/fairies` and `merge-train/spartan`. None of them hit this same `proving_broker` / `Store is closed` failure in the data window I sampled. The flake has only been observed on the one wallet PR run linked above so far.
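The ordering described under Failure can be reproduced in isolation. The sketch below is illustrative only: `EpochStore` and `reportAfterCleanup` are hypothetical names, not the real `AztecLMDBStoreV2` or broker API, and the deferred write is a stand-in for the broker's pending `reportProvingJobError` commit.

```typescript
// Hypothetical stand-in for a per-epoch store; NOT the real kv-store API.
class EpochStore {
  private closed = false;
  readonly writes: string[] = [];

  async write(entry: string): Promise<void> {
    // Mirrors the error message seen in the CI failure.
    if (this.closed) throw new Error("Store is closed");
    this.writes.push(entry);
  }

  close(): void {
    this.closed = true;
  }
}

// Simulates the failing order: the error write is scheduled first,
// but epoch cleanup closes the store before the write actually runs.
async function reportAfterCleanup(): Promise<string> {
  const store = new EpochStore();

  // Deferred to a microtask, like a commit that is still pending.
  const pendingWrite = Promise.resolve().then(() => store.write("job error"));

  // Epoch advanced: the old epoch's store is torn down synchronously.
  store.close();

  try {
    await pendingWrite;
    return "written";
  } catch (err) {
    return (err as Error).message;
  }
}
```

Calling `reportAfterCleanup()` resolves to `"Store is closed"`, because the close happens before the deferred write gets to run. In the real test the order depends on scheduling, which is why the failure is intermittent rather than constant.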
## Pattern entry

The new entry uses both `regex` (test file path) and `error_regex` (`does not retry if job is stale|Store is closed`), so unrelated failures in `proving_broker.test.ts` still fail CI; only this specific timing race gets quarantined to a Slack ping.

---

*Created by [claudebox](https://claudebox.work/v2/sessions/b4b6eb63ff789d29) · group: `aztec`*
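A rough model of how the two fields combine, assuming the CI matcher requires `regex` to match the test path and, when present, `error_regex` to match the captured output. The `FlakePattern` type and `isQuarantined` function are hypothetical; the actual CI implementation may differ (for example, it may use a different regex dialect), and the `owners` value here is a placeholder for the YAML alias.

```typescript
// Hypothetical shape of one .test_patterns.yml entry; field names mirror the YAML keys.
interface FlakePattern {
  regex: string; // matched against the test path
  error_regex?: string; // matched against the failing test's output, if present
  owners: string[];
}

// Assumed semantics: the entry applies only when `regex` matches the path
// AND (when given) `error_regex` matches the output.
function isQuarantined(pattern: FlakePattern, testPath: string, output: string): boolean {
  if (!new RegExp(pattern.regex).test(testPath)) return false;
  if (pattern.error_regex === undefined) return true;
  return new RegExp(pattern.error_regex).test(output);
}

const entry: FlakePattern = {
  regex: "prover-client/src/proving_broker/proving_broker.test.ts",
  error_regex: "does not retry if job is stale|Store is closed",
  owners: ["alex"], // placeholder for the `*alex` alias
};
```

Under this reading, a failure in the same file with different output (say, a plain assertion mismatch in another test) does not match and still fails CI. Because `error_regex` is an alternation, either the test name or the `Store is closed` message alone is enough to match.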
2 parents 5951e05 + 5b3e6d2 commit 8a62f05

1 file changed

Lines changed: 8 additions & 0 deletions

.test_patterns.yml

@@ -188,6 +188,14 @@ tests:
       - *phil
       - *palla
 
+  # http://ci.aztec-labs.com/64a972aafaa40dd0
+  # ProvingBroker › Retries › does not retry if job is stale — kv-store closes
+  # before the broker's final reportProvingJobError write lands.
+  - regex: "prover-client/src/proving_broker/proving_broker.test.ts"
+    error_regex: "does not retry if job is stale|Store is closed"
+    owners:
+      - *alex
+
   # Nightly GKE tests
   - regex: "spartan/bootstrap.sh"
     owners:
