Commit e0f791e

fix(ci): raise AVM check-circuit per-tx timeout to 120s
The avm-check-circuit job runs bb-avm avm_check_circuit on every dumped e2e AVM input in parallel, each wrapped in a 30s timeout (exec_test's timeout -v $TIMEOUT). The runner uses --halt now,fail=1, so a single timeout fails the whole job.

The e2e_multiple_blobs tx produces a ~700k-row AVM trace. On the default 2 CPUs, trace generation (~22s) plus the row check exceeded 30s and the check was killed with exit 124 (CI run 26755632012); every other input passed in 3-6s.

Raise the per-check timeout to a 120s default and make it overridable via AVM_CHECK_CIRCUIT_TIMEOUT, so the heaviest inputs complete with margin while the common case still finishes quickly. CPU allocation stays at the default 2 (the runner core count is tuned so the parallel job count saturates it at 2 CPUs each); only wall-clock budget was the constraint.

Supersedes the stale draft branch for #23662 (rebased onto current next).
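
For readers unfamiliar with the exit 124 mentioned above, the following is a minimal sketch of how a wall-clock budget turns into that status. It assumes GNU coreutils `timeout`; `run_with_budget` and the `sleep` stand-in are hypothetical and are not taken from the repository's exec_test script.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: run a command under a wall-clock budget.
# Assumes GNU coreutils `timeout`, which exits with 124 when the budget expires.
run_with_budget() {
  local budget=$1
  shift
  timeout "$budget" "$@"
}

# Stand-in for a heavy check: needs 3s but only gets 1s.
slow_status=0
run_with_budget 1s sleep 3 || slow_status=$?

# Stand-in for a light check: finishes well within its budget.
fast_status=0
run_with_budget 5s true || fast_status=$?

echo "slow=$slow_status fast=$fast_status"
```

With a fail-fast runner, the non-zero status from the slow case alone is enough to fail the whole job, which is why only the budget needed to change.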
1 parent cbc99df commit e0f791e

1 file changed

Lines changed: 9 additions & 5 deletions

yarn-project/end-to-end/bootstrap.sh

@@ -198,11 +198,15 @@ function avm_check_circuit_cmds {
 # Commands run from repo root via parallelize, so use path from top
 local dump_dir_from_top="yarn-project/end-to-end/$default_avm_inputs_dump_dir"

-# Specify timeout and resources
-# WARNING: theoretically, transactions could need more CPU and MEM than we allocate by default.
-# In that case, they might start timing out. For now, all of the e2e test txs seem to be relatively
-# small and the AVM can run check-circuit with limited resources.
-local prefix="$hash:ISOLATE=1:TIMEOUT=30s"
+# Specify timeout and resources.
+# Most e2e test txs are small and check-circuit on them finishes in a few seconds, but heavier txs
+# (e.g. e2e_multiple_blobs, whose trace is ~700k rows) need noticeably more time. On the default 2
+# CPUs, trace generation plus the row check on that input took >30s and hit the previous timeout,
+# getting killed (exit 124) and failing the whole job. We keep the default 2 CPUs (the runner is sized
+# so that the parallel job count fully utilizes its cores at 2 CPUs each) and instead give a generous
+# timeout so the heaviest inputs pass with margin while the common case still finishes quickly.
+local timeout=${AVM_CHECK_CIRCUIT_TIMEOUT:-120s}
+local prefix="$hash:ISOLATE=1:TIMEOUT=$timeout"

 # Find all .bin files in the dump directory (handles nested dirs)
 for input_file in "$default_avm_inputs_dump_dir"/*/*.bin "$default_avm_inputs_dump_dir"/*/*/*.bin; do
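
The override added in this diff relies on the shell's `${VAR:-default}` expansion. As a rough illustration, the sketch below wraps the two added lines in a hypothetical `build_prefix` function (the function name and the `abc123` hash are placeholders, not part of bootstrap.sh) to show how the prefix resolves when the variable is unset, empty, or set.

```shell
#!/usr/bin/env bash
# Hypothetical helper mirroring the two lines added in the diff.
# ${VAR:-default} falls back to the default when VAR is unset OR empty.
build_prefix() {
  local hash=$1
  local timeout=${AVM_CHECK_CIRCUIT_TIMEOUT:-120s}
  echo "$hash:ISOLATE=1:TIMEOUT=$timeout"
}

# Unset: the 120s default applies.
unset AVM_CHECK_CIRCUIT_TIMEOUT
build_prefix abc123   # prints abc123:ISOLATE=1:TIMEOUT=120s

# Empty: still the default, because of the ':' in ':-'.
AVM_CHECK_CIRCUIT_TIMEOUT=""
build_prefix abc123   # prints abc123:ISOLATE=1:TIMEOUT=120s

# Set: the caller's value wins.
AVM_CHECK_CIRCUIT_TIMEOUT=300s
build_prefix abc123   # prints abc123:ISOLATE=1:TIMEOUT=300s
```

Treating an empty value the same as unset means a CI configuration that defines the variable without a value still gets the 120s default instead of an empty TIMEOUT field.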
