Commit 61baf08

feat(ci3): run uploadable benchmarks on a dedicated on-demand instance (#24028)
> [!IMPORTANT]
> Depends on the IAM change aztec-labs-eng/iac#6 (grants `ci3-build-instance-role` the launch/SSM/PassRole surface). **That must apply first**, else the build instance's `create-fleet` hits `UnauthorizedOperation`.

## Problem

Spot diversification (create-fleet) means build instances now land on variable EC2 types: m6a/m7a/m6i/r6a/r7a at 16/32/48xlarge, AMD vs Intel. The in-build benchmark phase runs on that box, so wall-time numbers vary by hardware family far more than the 105% regression alert threshold → false regressions. (The instance type isn't even recorded in the bench JSON.)

## Approach

Only the canonical **merge-queue→next** series (the one used for real regression tracking) runs benches on a **dedicated, fixed, on-demand m6a.16xlarge**. PR `ci-full` runs keep running benches inline on the contended build box purely as a **breakage check**: no dedicated box, no upload.

Benches are scheduled by the existing test engine: when the build completes in `build_and_test` (full builds only),

- **upload runs** (`SHOULD_UPLOAD_BENCHMARKS=1`): launch the dedicated box via `./ci.sh bench` as a backgrounded, colored, denoised job (logged like the test engine) and `wait` on it (non-fatal) before returning;
- **otherwise**: `bench_cmds >> $test_cmds_file`; benches become ordinary test commands.

`ci.sh bench` → `bootstrap_ec2` blocks until the remote `ci-bench` finishes (ending in `cache_upload bench-<treehash>`), so the `wait` is the whole rendezvous. Results reach the GA `Upload benchmarks` step unchanged via that cache key (`ci3_success.sh` `gh-bench`).

## Changes

- **`bootstrap.sh`**: drop inline `bench` from `ci-full`/`ci-full-no-test-cache`; add the `build_and_test` launch/append hook + non-fatal `wait`; new `ci-bench` mode = cache-hit `make full` + `bench` (no test engine).
- **`ci.sh`**: new `bench` launcher: `AWS_INSTANCE=m6a.16xlarge NO_SPOT=1` (pins a fixed on-demand type; `CPUS` not needed since `AWS_INSTANCE` bypasses pool sizing).
- **`ci3/bench_engine`**: drop the 8-core OS isolation / HT-disable / pinning. Dedicated box → benches use the full machine, honouring per-bench `CPUS` via the strict scheduler (defaults to `nproc/2` without `BENCH_CPU_COUNT`). This is what lets the 64-vCPU 16xlarge satisfy the `CPUS=32` bb rollup bench.
- **`.github/ci3_labels_to_env.sh`**: scope `SHOULD_UPLOAD_BENCHMARKS` to merge-queue→next (it now also gates the dedicated box). **`ci3/bootstrap_ec2`**: pass it through to the instance.

## Notes

- **One-time baseline shift** in `bench/next`: different machine + no isolation changes absolute numbers once; stable thereafter. May want to annotate the series.
- **Soft failure**: a bench-box failure is logged and the run proceeds (no fresh numbers) rather than blocking the merge.
- **PR benches-as-tests**: `:PARALLEL=0` serial benches lose one-at-a-time isolation and run contended, which is fine for breakage-only; real numbers come from the dedicated box's `bench_engine` path.
- Validated: all touched scripts pass `bash -n`; the `AWS_INSTANCE`+`NO_SPOT` fixed-on-demand launch mechanism was verified live during the create-fleet work. Full e2e is exercised by a merge-queue→next run once the iac PR lands.
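
To make the false-regression arithmetic in the Problem section concrete, here is a minimal sketch. It assumes the 105% alert threshold is a plain ratio of the current wall time to the previous run's; `is_regression` and the timings are hypothetical and not part of the CI scripts.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when `current` exceeds `threshold_pct` percent of
# `baseline`. Integer arithmetic only, so pass whole numbers (e.g. milliseconds).
is_regression() {
  local baseline=$1 current=$2 threshold_pct=${3:-105}
  (( current * 100 > baseline * threshold_pct ))
}

# Same code measured on two hardware families (made-up timings in ms).
baseline_ms=1000   # previous run's box
current_ms=1080    # this run, on a slower family

if is_regression "$baseline_ms" "$current_ms"; then
  echo "alert: ${current_ms}ms vs ${baseline_ms}ms exceeds 105%"
else
  echo "ok"
fi
```

With these numbers an 8% hardware difference alone trips the alert, which is the kind of noise a fixed instance type is meant to remove.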
2 parents 3d1fc0a + 19de9f1 commit 61baf08

12 files changed

Lines changed: 235 additions & 99 deletions

.github/ci3_labels_to_env.sh

Lines changed: 6 additions & 2 deletions
@@ -149,9 +149,13 @@ function main {
   echo "CI_MODE=$ci_mode" >> $GITHUB_ENV
   echo "CI mode: $ci_mode"

-  # Determine if benchmarks should be uploaded (merge-queue, full, or full-no-test-cache modes)
+  # Benching modes run their benches on a dedicated, fixed-hardware box (stable numbers)
+  # and publish the result; ci-fast never benches. For grind runs (merge-queue-heavy fires
+  # ~10 instances) only the first instance keeps BENCH_UPLOAD=1 — multi_job_run forces the
+  # rest to 0 so they bench inline as a breakage check without racing the upload. The
+  # destination (bench/next vs bench/prs) is BENCH_BRANCH below.
   if [[ "$ci_mode" == "merge-queue" || "$ci_mode" == "merge-queue-heavy" || "$ci_mode" == "full" || "$ci_mode" == "full-no-test-cache" ]]; then
-    echo "SHOULD_UPLOAD_BENCHMARKS=1" >> $GITHUB_ENV
+    echo "BENCH_UPLOAD=1" >> $GITHUB_ENV
   fi

   # Determine the branch label for benchmark publishing.
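
The mode gating in this hunk can be sketched as a small standalone function. `bench_upload_for_mode` is a hypothetical name; note that the real script only writes `BENCH_UPLOAD=1` and leaves the variable unset otherwise, whereas this sketch echoes an explicit `0` for readability.

```shell
#!/usr/bin/env bash
# Hypothetical helper mirroring the gating above: echo 1 for modes that bench on
# the dedicated box and publish, 0 for everything else (e.g. "fast").
bench_upload_for_mode() {
  case "$1" in
    merge-queue|merge-queue-heavy|full|full-no-test-cache) echo 1 ;;
    *) echo 0 ;;
  esac
}

for mode in fast full merge-queue merge-queue-heavy; do
  echo "$mode -> BENCH_UPLOAD=$(bench_upload_for_mode "$mode")"
done
```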

.github/ci3_success.sh

Lines changed: 3 additions & 3 deletions
@@ -42,17 +42,17 @@ function handle_squash_merge {
 }

 function handle_benchmarks {
-  if [ "${SHOULD_UPLOAD_BENCHMARKS:-0}" -eq 0 ]; then
+  if [ "${BENCH_UPLOAD:-0}" -eq 0 ]; then
     return
   fi
   # Handle benchmarks download (internal only)
   echo "Downloading benchmarks..."
   if ./ci.sh gh-bench && [ -f "./bench-out/bench.json" ] && [ "$(cat ./bench-out/bench.json)" != "[]" ]; then
     echo "Benchmarks downloaded successfully"
-    echo "SHOULD_UPLOAD_BENCHMARKS=1" >> $GITHUB_ENV
+    echo "BENCH_UPLOAD=1" >> $GITHUB_ENV
   else
     echo "No benchmarks to upload"
-    echo "SHOULD_UPLOAD_BENCHMARKS=0" >> $GITHUB_ENV
+    echo "BENCH_UPLOAD=0" >> $GITHUB_ENV
   fi
 }
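
The "is there anything to upload" guard in `handle_benchmarks` boils down to two checks: the file exists, and it is not an empty JSON array. A minimal sketch, with `has_bench_results` and the sample files being hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the file exists and holds something other
# than an empty JSON array. $(cat ...) strips the trailing newline before comparing.
has_bench_results() {
  local file=$1
  [ -f "$file" ] && [ "$(cat "$file")" != "[]" ]
}

tmp=$(mktemp -d)
echo '[]' > "$tmp/empty.json"
echo '[{"name":"example","value":1,"unit":"ms"}]' > "$tmp/full.json"

for f in "$tmp/missing.json" "$tmp/empty.json" "$tmp/full.json"; do
  if has_bench_results "$f"; then
    echo "$(basename "$f"): upload"
  else
    echo "$(basename "$f"): skip"
  fi
done
rm -rf "$tmp"
```

Only `full.json` qualifies for upload; a missing file and an empty array are both treated as "no benchmarks".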

.github/workflows/ci3-external.yml

Lines changed: 1 addition & 1 deletion
@@ -128,7 +128,7 @@ jobs:
         if: always()
         env:
           SHOULD_SQUASH_MERGE: ${{ contains(github.event.pull_request.labels.*.name, 'ci-squash-and-merge') && '1' || '0' }}
-          SHOULD_UPLOAD_BENCHMARKS: "0"
+          BENCH_UPLOAD: "0"
           # For updating success cache.
           AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
           AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}

.github/workflows/ci3.yml

Lines changed: 1 addition & 1 deletion
@@ -158,7 +158,7 @@ jobs:
           fi

       - name: Upload benchmarks
-        if: env.SHOULD_UPLOAD_BENCHMARKS == '1'
+        if: env.BENCH_UPLOAD == '1'
         uses: benchmark-action/github-action-benchmark@52576c92bccf6ac60c8223ec7eb2565637cae9ba # v1.22.1
         with: &ci_benchmark_args
           name: Aztec Benchmarks

Makefile

Lines changed: 8 additions & 0 deletions
@@ -60,6 +60,14 @@ fast: release-image barretenberg boxes playground docs aztec-up \
 # Full bootstrap.
 full: fast bb-full-tests bb-cpp-full yarn-project-benches

+# Everything required to run the full benchmark suite (see bootstrap.sh bench_cmds),
+# and nothing more. yarn-project-benches transitively builds the bb native/wasm bench
+# binaries (via bb-ts -> bb-cpp-native/wasm-threads), the e2e bench inputs, noir-projects
+# and l1-contracts; bb-sol adds the Solidity gas benchmark's generated verifier; bb-acir
+# builds barretenberg/acir_tests, whose headless-test harness (ts-node) the bb browser
+# memory bench (ci_benchmark_browser_memory.sh) drives.
+bench: yarn-project-benches bb-sol bb-acir
+
 # Release. Everything plus copy bb cross compiles to ts projects.
 release: fast bb-cpp-release-dir bb-ts-cross-copy

bootstrap.sh

Lines changed: 41 additions & 7 deletions
@@ -435,6 +435,18 @@ function build_and_test {
     start_txes
     make noir-projects-txe-tests

+    # Benches (full builds only). Uploadable runs (BENCH_UPLOAD=1 — the first instance of
+    # a run) bench on a dedicated fixed-hardware box for stable numbers: launched here,
+    # logged like the test engine, waited on below, and the sole uploader. Everything
+    # else benches inline as ordinary tests — a breakage check only, no upload.
+    if [ "$1" == full ]; then
+      if [ "${BENCH_UPLOAD:-0}" == 1 ]; then
+        setsid color_prefix "bench" "denoise './ci.sh bench'" & bench_pid=$!
+      else
+        bench_cmds >> $test_cmds_file
+      fi
+    fi
+
     # Signal tests complete, handled by parallel -E STOP.
     echo STOP >> $test_cmds_file
   fi
@@ -447,6 +459,14 @@ function build_and_test {

   stop_txes

+  # Benches (full builds only). Inline benches above are a breakage check only — the
+  # dedicated box is the sole uploader. Wait on it here: fatal, matching the old inline
+  # `bench`, since a benchmark that fails to build/run is a real breakage.
+  if [ "$1" == full ] && [ -n "${bench_pid:-}" ]; then
+    echo "Waiting for dedicated bench run..."
+    wait "$bench_pid"
+  fi
+
   return 0
 }
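
The launch-then-wait rendezvous in `build_and_test` relies on `wait <pid>` returning the background job's exit status. A minimal sketch of that pattern follows; `run_with_background_job` and the commands passed to it are hypothetical stand-ins for the dedicated bench launch and the inline test run.

```shell
#!/usr/bin/env bash
# Hypothetical stand-ins: `background_cmd` plays the dedicated bench launch and
# `foreground_cmd` plays the inline test run.
run_with_background_job() {
  local background_cmd=$1 foreground_cmd=$2 pid status=0
  bash -c "$background_cmd" & pid=$!
  # If the foreground work fails we return early; the background job is left running.
  bash -c "$foreground_cmd" || return $?
  # `wait <pid>` returns the background job's exit status, so a failure there
  # becomes this function's failure (the "fatal" behaviour).
  wait "$pid" || status=$?
  return "$status"
}

run_with_background_job 'sleep 0.2; exit 0' 'echo "tests done"' && echo "bench ok"
run_with_background_job 'sleep 0.2; exit 3' 'echo "tests done"' || echo "bench failed with $?"
```

A non-fatal variant would log the status and return 0 instead of propagating it.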

@@ -468,6 +488,16 @@ function bench_merge {

 }

+# Merge all component bench-out/*.bench.json into one and upload it to the
+# bench-<treehash> cache key, which the GA "Upload benchmarks" step then publishes.
+# Used both by `bench` (dedicated box) and by the inline benches-as-tests path.
+function bench_publish {
+  rm -rf bench-out
+  mkdir -p bench-out
+  bench_merge
+  cache_upload bench-$(git rev-parse HEAD^{tree}).tar.gz bench-out/bench.json
+}
+
 function bench {
   # TODO bench for arm64.
   if [ $(arch) == arm64 ]; then
@@ -476,12 +506,7 @@ function bench {
   echo_header "bench all"
   bench_cmds > $bench_cmds_file
   denoise "bench_engine $bench_cmds_file"
-
-  rm -rf bench-out
-  mkdir -p bench-out
-  bench_merge
-  cache_upload bench-$(git rev-parse HEAD^{tree}).tar.gz bench-out/bench.json
-
+  bench_publish
 }

 ### RELEASING ##########################################################################################################
@@ -750,13 +775,22 @@ case "$cmd" in
     export USE_TEST_CACHE=1
     export CI_FULL=1
     build_and_test full
-    bench
     ;;
   "ci-full-no-test-cache")
     export CI=1
     export USE_TEST_CACHE=0
     export CI_FULL=1
     build_and_test full
+    ;;
+  "ci-bench")
+    # Run on a dedicated, fixed, on-demand instance (launched by the build
+    # instance via './ci.sh bench') for stable benchmark numbers. The build is a
+    # near-instant cache pull, as the launching build instance already populated
+    # the cache for this commit. No test engine; bench uploads bench-<treehash>.
+    export CI=1
+    export CI_FULL=1
+    prep
+    make bench
     bench
     ;;
   "ci-chonk-input-update")

ci.sh

Lines changed: 119 additions & 44 deletions
@@ -39,10 +39,10 @@ function print_usage {
   echo_cmd "chonk-input-update" "Spin up an EC2 instance to update pinned Chonk IVC inputs and push the diff."
   echo_cmd "release" "Spin up an EC2 instance and run bootstrap release."
   echo_cmd "shell-new" "Spin up an EC2 instance, clone the repo, and drop into a shell."
-  echo_cmd "shell" "Drop into a shell in the current running build instance container."
-  echo_cmd "shell-host" "Drop into a shell in the current running build host."
+  echo_cmd "shell-container" "Shell into a running build container. Optional filter tokens (e.g. 'pr-123 bench') select the instance; defaults to the current branch."
+  echo_cmd "shell-host" "Shell into a running build host. Same instance selection as shell-container."
   echo_cmd "log" "Display the log of the given log ID."
-  echo_cmd "kill" "Terminate running EC2 instance with instance_name."
+  echo_cmd "kill" "Terminate running build instances matching the filter tokens (default: current branch)."
   echo_cmd "draft" "Mark the current PR as draft (no automatic CI runs when pushing)."
   echo_cmd "ready" "Mark the current PR as ready (enable automatic CI runs when pushing)."
   echo_cmd "pr-url" "Print the URL of the current PR associated with the branch."
@@ -53,27 +53,69 @@ function print_usage {

 [ -n "$cmd" ] && shift

-# Keep this in sync with bootstrap_ec2's instance_name scheme (repo-scoped) so the
-# shell/kill/get-ip helpers find instances launched by a CI run for this repo.
-repo=${GITHUB_REPOSITORY:-aztec-packages}
-repo=${repo##*/}
-instance_name=${INSTANCE_NAME:-${repo}_$(echo -n "$BRANCH" | tr -c 'a-zA-Z0-9-' '_')_${arch}}
-[ -n "${INSTANCE_POSTFIX:-}" ] && instance_name+="_$INSTANCE_POSTFIX"
+# Connecting to a running build instance: discover by the Group=build-instance tag
+# and match filter tokens against the Name (which aws_instance_name builds as
+# <repo>_<ref>_<arch>[_<job>]), rather than reconstructing the exact name (which
+# varies by arch/job/count). This is what lets `shell-container pr-123 bench` etc. work.

-function get_ip_for_instance {
-  ip=$(aws ec2 describe-instances \
+# Echo running build instances as: <Name>\t<InstanceId>\t<PublicIp>\t<LaunchTime>
+function list_build_instances {
+  aws ec2 describe-instances \
     --region us-east-2 \
-    --filters "Name=tag:Name,Values=$instance_name" "Name=instance-state-name,Values=running" \
-    --query "Reservations[].Instances[0].PublicIpAddress" \
-    --output text)
+    --filters "Name=tag:Group,Values=build-instance" "Name=instance-state-name,Values=running" \
+    --query "Reservations[].Instances[].[Tags[?Key=='Name']|[0].Value, InstanceId, PublicIpAddress, LaunchTime]" \
+    --output text
 }

-function get_iid_for_instance {
-  iid=$(aws ec2 describe-instances \
-    --region us-east-2 \
-    --filters "Name=tag:Name,Values=$instance_name" "Name=instance-state-name,Values=running" \
-    --query "Reservations[].Instances[0].InstanceId" \
-    --output text | tr -d '\n\r' | xargs)
+# Echo running build instances whose Name matches every filter token (case-insensitive
+# substring). With no tokens, defaults to the current branch's canonical name.
+function filter_build_instances {
+  local filters=("$@") rows f
+  [ "${#filters[@]}" -eq 0 ] && filters=("$(aws_instance_name "$BRANCH" "$arch")")
+  rows=$(list_build_instances)
+  for f in "${filters[@]}"; do
+    # Sanitise the token the same way instance names are (e.g. a branch's '/' -> '_'),
+    # so passing a raw branch name like 'mv/f-669' still matches '..._mv_f-669_...'.
+    f=$(printf '%s' "$f" | tr -c 'a-zA-Z0-9-' '_')
+    rows=$(printf '%s\n' "$rows" | awk -v p="$f" 'index(tolower($1), tolower(p))')
+  done
+  printf '%s\n' "$rows" | sed '/^$/d'
+}
+
+# Resolve exactly one instance from the filter tokens; sets iid/ip/resolved_name.
+# 0 matches -> error + list everything; >1 -> interactive pick on a TTY, else error
+# listing the candidates so you can add a narrowing token (e.g. an arch or job id).
+function resolve_instance {
+  local matches chosen sel i
+  matches=$(filter_build_instances "$@")
+  if [ -z "$matches" ]; then
+    echo_stderr "No running build instance matches: ${*:-$BRANCH}"
+    echo_stderr "Running build instances:"
+    list_build_instances | awk '{print " " $1}' | sort || true
+    exit 1
+  fi
+  if [ "$(printf '%s\n' "$matches" | wc -l)" -eq 1 ]; then
+    chosen=$matches
+  elif [ -t 0 ]; then
+    echo_stderr "Multiple build instances match '${*:-$BRANCH}':"
+    i=1
+    while IFS= read -r line; do
+      echo_stderr " $i) $(printf '%s' "$line" | awk '{print $1}')"
+      i=$((i + 1))
+    done <<< "$matches"
+    read -r -p "select [1-$((i - 1))]: " sel
+    [[ "$sel" =~ ^[0-9]+$ ]] || { echo_stderr "Invalid selection."; exit 1; }
+    chosen=$(printf '%s\n' "$matches" | sed -n "${sel}p")
+    [ -z "$chosen" ] && echo_stderr "Invalid selection." && exit 1
+  else
+    echo_stderr "Multiple build instances match '${*}' — add a narrowing token (e.g. an arch or job id):"
+    printf '%s\n' "$matches" | awk '{print " " $1}'
+    exit 1
+  fi
+  resolved_name=$(printf '%s' "$chosen" | awk '{print $1}')
+  iid=$(printf '%s' "$chosen" | awk '{print $2}')
+  ip=$(printf '%s' "$chosen" | awk '{print $3}')
+  echo_stderr "Connecting to $resolved_name ($iid)."
 }

 function get_latest_run_id {
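
The token matching used by `filter_build_instances` can be tried offline, without AWS credentials. The sketch below keeps the same `tr`/`awk` pipeline but reads rows from stdin; `filter_rows`, `sample_rows`, and every instance name, id, and address in it are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical offline version of the token filter: rows arrive on stdin as
# <Name>\t<InstanceId>\t<PublicIp>\t<LaunchTime>; every token must match the Name.
filter_rows() {
  local rows f
  rows=$(cat)
  for f in "$@"; do
    # Sanitise the token the same way instance names are built ('/' -> '_').
    f=$(printf '%s' "$f" | tr -c 'a-zA-Z0-9-' '_')
    # Keep rows whose first field contains the token, ignoring case.
    rows=$(printf '%s\n' "$rows" | awk -v p="$f" 'index(tolower($1), tolower(p))')
  done
  printf '%s\n' "$rows" | sed '/^$/d'
}

# Made-up instance rows for illustration only.
sample_rows() {
  printf '%s\t%s\t%s\t%s\n' \
    "aztec-packages_pr-123_amd64_x-bench" "i-0aaa" "203.0.113.10" "2025-01-01T00:00:00Z" \
    "aztec-packages_pr-123_arm64_a1-fast" "i-0bbb" "203.0.113.11" "2025-01-01T00:01:00Z" \
    "aztec-packages_mv_f-669_amd64" "i-0ccc" "203.0.113.12" "2025-01-01T00:02:00Z"
}

sample_rows | filter_rows pr-123 bench | awk '{print $1, $2}'
sample_rows | filter_rows mv/f-669 | awk '{print $1, $2}'
```

The first call narrows two `pr-123` instances down to the bench box; the second shows a raw branch name with a `/` still matching its sanitised instance name.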
@@ -95,16 +137,30 @@ function multi_job_run {
   export AWS_SHUTDOWN_TIME_ARM=${AWS_SHUTDOWN_TIME_ARM:-90}
   export DENOISE=1
   export DENOISE_WIDTH=32
+  # Only the first job (the amd64 full build) runs the dedicated bench box and uploads;
+  # the rest bench inline as a breakage check (see bootstrap.sh build_and_test). This
+  # de-races grind runs (e.g. merge-queue-heavy fires ~10 instances) that would otherwise
+  # all upload to the same bench cache key.
+  local bench_primary=${1%% *}
+  export bench_primary
   run() {
     [ -n "${4:-}" ] && export REF_NAME=$4
-    PARENT_LOG_ID=$RUN_ID JOB_ID=$1 INSTANCE_POSTFIX=$1 ARCH=$2 exec denoise "bootstrap_ec2 './bootstrap.sh $3'"
+    local bench_upload=0
+    [ "$1" == "$bench_primary" ] && bench_upload=${BENCH_UPLOAD:-0}
+    # Timestamp the bootstrap_ec2 (instance request) sublog. denoise runs the command
+    # under pipefail and bootstrap_ec2 handles spot-eviction retry internally (exec), so
+    # piping through add_timestamps preserves its exit code. DENOISE_DISPLAY_NAME keeps
+    # the parent log's "Executing:" line free of the pipe.
+    PARENT_LOG_ID=$RUN_ID JOB_ID=$1 INSTANCE_POSTFIX=$1 ARCH=$2 BENCH_UPLOAD=$bench_upload \
+      DENOISE_DISPLAY_NAME="bootstrap_ec2 './bootstrap.sh $3'" \
+      exec denoise "bootstrap_ec2 './bootstrap.sh $3' 2>&1 | add_timestamps"
   }
   export -f run

   parallel --colsep ' ' --jobs 100 --termseq 'TERM,10000' \
     --tagstring '{1}' \
     --line-buffered --halt now,fail=1 \
-    'run {1} {2} {3} {4}' ::: "$@" | DUP=1 cache_log "CI run" $RUN_ID
+    'run {1} {2} {3} {4}' ::: "$@" | add_timestamps | DUP=1 cache_log "CI run" $RUN_ID
 }

 # Jobs in the ci dashboards are grouped on a single line by RUN_ID.
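
The primary-job selection in `multi_job_run` rests on `${1%% *}`, which strips everything from the first space onward and so yields the first job spec's id. A minimal sketch, assuming each spec is `<job_id> <arch> <bootstrap_cmd>`; `assign_bench_upload` and the job ids are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: only the first spec's job id (the "primary") inherits the
# caller's BENCH_UPLOAD; every other job is forced to 0.
assign_bench_upload() {
  local bench_primary=${1%% *} spec job_id
  for spec in "$@"; do
    job_id=${spec%% *}
    if [ "$job_id" == "$bench_primary" ]; then
      echo "$job_id BENCH_UPLOAD=${BENCH_UPLOAD:-0}"
    else
      echo "$job_id BENCH_UPLOAD=0"
    fi
  done
}

BENCH_UPLOAD=1 assign_bench_upload \
  "x1-full amd64 ci-full" \
  "x2-full amd64 ci-full" \
  "a1-fast arm64 ci-fast"
```

Only `x1-full` keeps `BENCH_UPLOAD=1`, so a grind run with many instances still has a single uploader.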
@@ -122,6 +178,21 @@ case "$cmd" in
     # GitHub status check name is unchanged.
     multi_job_run "x-$cmd amd64 ci-$cmd"
     ;;
+  bench)
+    # Launched by the build instance on uploadable runs to produce stable benchmark
+    # numbers on a dedicated instance of a FIXED type. AWS_INSTANCE pins the exact type
+    # (bypasses spot pool diversification) — that's what keeps numbers comparable. Spot
+    # vs on-demand is the same hardware, so we try spot first and fall back to on-demand
+    # (the default fleet behaviour); a mid-run spot reclaim is handled by bootstrap_ec2's
+    # internal on-demand retry. CI_DASHBOARD and PARENT_LOG_ID are inherited from the
+    # launching run so it nests as a sibling job.
+    # Timestamp the instance-request output. pipefail (in a subshell, since ci.sh doesn't
+    # set it globally) keeps bootstrap_ec2's exit code through add_timestamps — the
+    # launching build instance waits on this fatally.
+    ( set -o pipefail
+      AWS_INSTANCE=m6a.32xlarge JOB_ID=x-bench INSTANCE_POSTFIX=x-bench \
+        bootstrap_ec2 "./bootstrap.sh ci-bench" 2>&1 | add_timestamps )
+    ;;
   socket-fix)
     export CI_DASHBOARD="prs"
     export JOB_ID="x-socket-fix"
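
The subshell-scoped `pipefail` in the `bench)` case matters because a pipeline otherwise reports only its last command's status. A minimal sketch of the difference; `stamp_lines` stands in for `add_timestamps` and `failing_launch` for a failed `bootstrap_ec2`, both hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for add_timestamps: prefix each stdin line with the time.
stamp_lines() {
  local line
  while IFS= read -r line; do
    printf '%s %s\n' "$(date +%H:%M:%S)" "$line"
  done
}

# Hypothetical stand-in for a launch that prints progress and then fails.
failing_launch() {
  echo "requesting instance"
  return 7
}

# Run "$@" piped through stamp_lines. pipefail is set inside the subshell, so the
# caller's shell options are untouched and the command's exit code survives the pipe.
run_stamped() {
  ( set -o pipefail
    "$@" 2>&1 | stamp_lines )
}

rc=0; ( set +o pipefail; failing_launch 2>&1 | stamp_lines ) || rc=$?
echo "without pipefail: exit $rc"
rc=0; run_stamped failing_launch || rc=$?
echo "with pipefail: exit $rc"
```

Without `pipefail` the failure is masked (exit 0); with it the caller sees exit 7 and a fatal `wait` on the launcher can react.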
@@ -352,47 +423,51 @@ case "$cmd" in
     CI_USE_SSH=1 exec bootstrap_ec2 "$cmd"
     ;;
   shell-container)
-    # Drop into a shell in the current running build instance container.
+    # Drop into a zsh shell in a running build instance's container. Optional filter
+    # tokens select the instance, e.g.:
+    # ci.sh shell-container # the current branch's instance
+    # ci.sh shell-container pr-12345 bench # the bench box for that merge-queue run
+    # ci.sh shell-container pr-12345 arm64 # the arm build of that run
+    resolve_instance "$@"
+    container_cmd="docker start aztec_build &>/dev/null || true && docker exec -it --user aztec-dev aztec_build zsh"
     if [ "${CI_USE_SSH:-0}" -eq 1 ]; then
-      get_ip_for_instance
-      [ -z "$ip" ] && echo "No instance found: $instance_name" && exit 1
-      [ "$#" -eq 0 ] && set -- "zsh" || true
-      ssh -tq -F $ci3/aws/build_instance_ssh_config ubuntu@$ip \
-        "docker start aztec_build &>/dev/null || true && docker exec -it --user aztec-dev aztec_build $@"
+      if [ -z "$ip" ] || [ "$ip" = "None" ]; then echo_stderr "No public IP for $resolved_name."; exit 1; fi
+      ssh -tq -F $ci3/aws/build_instance_ssh_config ubuntu@$ip "$container_cmd"
     else
-      get_iid_for_instance
-      [ -z "$iid" ] || [ "$iid" = "None" ] && echo "No instance found: $instance_name" && exit 1
-      [ "$#" -eq 0 ] && set -- "zsh" || true
+      # SSM sessions run as the non-root ssm-user (which has passwordless sudo), so
+      # use sudo rather than runuser. Running docker as root is fine — the container
+      # itself drops to aztec-dev via --user.
       aws ssm start-session \
         --region us-east-2 \
         --target "$iid" \
         --document-name "AWS-StartInteractiveCommand" \
-        --parameters "{\"command\":[\"runuser -u ubuntu -- bash -c 'docker start aztec_build &>/dev/null || true && docker exec -it --user aztec-dev aztec_build $@'\"]}"
+        --parameters "{\"command\":[\"sudo bash -c '$container_cmd'\"]}"
     fi
     ;;
   shell-host)
-    # Drop into a shell in the current running build host.
+    # Drop into a shell on a running build host. Optional filter tokens select the
+    # instance (see shell-container).
+    resolve_instance "$@"
     if [ "${CI_USE_SSH:-0}" -eq 1 ]; then
-      get_ip_for_instance
-      [ -z "$ip" ] && echo "No instance found: $instance_name" && exit 1
+      if [ -z "$ip" ] || [ "$ip" = "None" ]; then echo_stderr "No public IP for $resolved_name."; exit 1; fi
       ssh -t -F $ci3/aws/build_instance_ssh_config ubuntu@$ip
     else
-      get_iid_for_instance
-      [ -z "$iid" ] || [ "$iid" = "None" ] && echo "No instance found: $instance_name" && exit 1
       aws ssm start-session \
         --region us-east-2 \
         --target "$iid"
     fi
     ;;
   kill)
-    existing_instance=$(aws ec2 describe-instances \
-      --region us-east-2 \
-      --filters "Name=tag:Name,Values=$instance_name" \
-      --query "Reservations[].Instances[?State.Name!='terminated'].InstanceId[]" \
-      --output text)
-    if [ -n "$existing_instance" ]; then
-      aws_terminate_instance $existing_instance
+    # Terminate ALL running build instances matching the filter tokens (default: the
+    # current branch). E.g. `ci.sh kill pr-12345` ends a whole merge-queue run.
+    kill_rows=$(filter_build_instances "$@")
+    if [ -z "$kill_rows" ]; then
+      echo "No running build instance matches: ${*:-$BRANCH}"
+      exit 0
     fi
+    echo "Terminating:"
+    printf '%s\n' "$kill_rows" | awk '{print " " $1 " (" $2 ")"}'
+    printf '%s\n' "$kill_rows" | awk '{print $2}' | xargs aws ec2 terminate-instances --region us-east-2 --instance-ids >/dev/null
     ;;

   ###################
