ci: shard alpine-test into parallel jobs to reduce CI time

The alpine-test CI job runs all ~483 zdtm tests sequentially three
times (normal, mntns-compat-mode, criu-config), followed by many
non-shardable tests. This dominates overall CI wait time. With only
2 jobs running in parallel (GCC and CLANG), the alpine tests take
around 30 minutes.

Use the --test-shard-index and --test-shard-count flags already
built into test/zdtm.py to split the zdtm test suite across four
parallel runners (shards 0-3). A fifth shard (index 4) runs all
non-shardable tests (lazy pages, fault injection, test/others/*,
rootless, compel, plugins, etc.) independently and in parallel with
the zdtm shards. This increases parallelism from 2 to 10 jobs and
reduces the alpine test wall-clock time from ~30 to ~10 minutes.

Changes:

- run-ci-tests.sh: Build SHARD_OPTS from ZDTM_SHARD_INDEX/COUNT
  env vars and pass them to zdtm.py. Extract all non-shardable
  tests into a run_non_shardable_tests() function. Dispatch based
  on shard index: 0-3 run zdtm slices, 4 runs non-shardable
  tests, unset runs everything sequentially (preserving existing
  behavior). Validate that ZDTM_SHARD_INDEX is set when
  ZDTM_SHARD_COUNT is set.

- Makefile: Pass ZDTM_SHARD_INDEX and ZDTM_SHARD_COUNT into the
  container when set. Split long container run command across
  multiple lines for readability.

- ci.yml: Add shard: [0, 1, 2, 3, 4] to the alpine-test matrix,
  producing 10 jobs (2 compilers x 5 shards). Job labels now show
  descriptive shard names (e.g. "zdtm 1/4", "non-zdtm") instead
  of raw indices.

When sharding is not configured the script behaves identically to
before, so other CI jobs (aarch64, compat, gcov, etc.) are
unaffected.

Assisted-by: Claude:claude-opus-4-6
Signed-off-by: Adrian Reber <areber@redhat.com>
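
The dispatch rule described for run-ci-tests.sh in the message above could look roughly like the sketch below. It is only an illustration, not the code from this commit: the helper names `shard_mode` and `shard_label` are hypothetical, the out-of-range error is an added assumption, and the mapping of shard index 0 to the label "zdtm 1/4" is inferred from the example label in the message.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the commit): classify a job from its
# shard index and shard count. Prints "all", "zdtm" or "non-zdtm" and
# returns non-zero on invalid input.
shard_mode() {
	local index="$1" count="$2"

	# No sharding configured: keep the old sequential behavior.
	if [ -z "$count" ] || [ "$count" -eq 0 ]; then
		echo "all"
		return 0
	fi

	# A shard count without a shard index is a configuration error.
	if [ -z "$index" ]; then
		echo "ZDTM_SHARD_INDEX must be set when ZDTM_SHARD_COUNT is set" >&2
		return 1
	fi

	if [ "$index" -lt "$count" ]; then
		echo "zdtm"      # shards 0..count-1 run a slice of the zdtm suite
	elif [ "$index" -eq "$count" ]; then
		echo "non-zdtm"  # the extra shard runs the non-shardable tests
	else
		# Assumption: indices above the count are rejected.
		echo "shard index $index out of range for count $count" >&2
		return 1
	fi
}

# Hypothetical helper: human-readable job label such as "zdtm 1/4".
shard_label() {
	local index="$1" count="$2"
	if [ "$index" -lt "$count" ]; then
		echo "zdtm $((index + 1))/$count"
	else
		echo "non-zdtm"
	fi
}
```

With a count of 4, indices 0 through 3 map to zdtm slices and index 4 maps to the non-shardable tests, which gives the five shards per compiler listed in the ci.yml change.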
-	if [ -d /sys/fs/selinux ] && command -v getenforce &>/dev/null; then
-		# Note: selinux in Enforcing mode prevents us from calling clone3() or writing to ns_last_pid on restore; hence set to Permissive for the test and then set back.
-		selinuxmode=$(getenforce)
-		if [ "$selinuxmode" != "Disabled" ]; then
-			setenforce Permissive
-		fi
 
+# When sharding is enabled, shards 0..count-1 run sharded zdtm tests and
+# shard "count" (the extra shard) runs only the non-shardable tests.
+# When sharding is not enabled, run everything sequentially.
+if [ -z "$ZDTM_SHARD_COUNT" ] || [ "$ZDTM_SHARD_COUNT" -eq 0 ]; then
+	# No sharding: run all zdtm tests followed by non-shardable tests
+	./test/zdtm.py run -a -p 2 --keep-going "${ZDTM_OPTS[@]}"
+	if criu/criu check --feature move_mount_set_group; then
+		./test/zdtm.py run -a -p 2 --mntns-compat-mode --keep-going "${ZDTM_OPTS[@]}"
 	fi
-	# Run it as non-root in a user namespace. Since CAP_CHECKPOINT_RESTORE behaves differently in non-user namespaces (e.g. no access to map_files) this tests that we can dump and restore
-	# under those conditions. Note that the "... && true" part is necessary; we need at least one statement after the tests so that bash can reap zombies in the user namespace,
-	# otherwise it will exec the last statement and get replaced and nobody will be left to reap our zombies.
-	sudo --user=#65534 --group=#65534 unshare -Ucfpm --mount-proc -- bash -c "./test/zdtm.py run -t zdtm/static/maps00 -f h --rootless && true"
-	if [ -d /sys/fs/selinux ] && command -v getenforce &>/dev/null; then
-		if [ "$selinuxmode" != "Disabled" ]; then
-			setenforce "$selinuxmode"
-		fi
-	fi
-	setcap -r criu/criu
+	./test/zdtm.py run -a -p 2 --keep-going --criu-config "${ZDTM_OPTS[@]}"
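
The hunk above shows only the unsharded branch. As a rough sketch of the sharded counterpart, the dry run below prints the three zdtm.py invocations with the shard flags appended instead of executing them. It assumes that SHARD_OPTS is a bash array, that a sharded job runs the same three passes, and that the flags take their values as separate arguments; none of these is confirmed by the excerpt. `build_shard_opts` and `print_zdtm_passes` are hypothetical names, and the `criu check --feature move_mount_set_group` guard around the mntns-compat pass is left out for brevity.

```shell
#!/usr/bin/env bash
# Dry-run sketch: print the zdtm.py command lines rather than run them.
# ZDTM_OPTS stands in for the array of the same name used in the diff.
ZDTM_OPTS=()

# Hypothetical helper: fill SHARD_OPTS only when sharding is configured.
build_shard_opts() {
	SHARD_OPTS=()
	if [ -n "${ZDTM_SHARD_COUNT:-}" ] && [ "$ZDTM_SHARD_COUNT" -ne 0 ]; then
		SHARD_OPTS=(--test-shard-index "$ZDTM_SHARD_INDEX"
			    --test-shard-count "$ZDTM_SHARD_COUNT")
	fi
}

# Hypothetical helper: one line per pass (normal, mntns-compat-mode,
# criu-config). $extra is left unquoted on purpose so that the empty
# value for the normal pass expands to no argument at all.
print_zdtm_passes() {
	build_shard_opts
	local extra
	for extra in "" "--mntns-compat-mode" "--criu-config"; do
		echo ./test/zdtm.py run -a -p 2 $extra --keep-going \
			"${SHARD_OPTS[@]}" "${ZDTM_OPTS[@]}"
	done
}
```

Calling `print_zdtm_passes` with ZDTM_SHARD_INDEX=1 and ZDTM_SHARD_COUNT=4 set prints three command lines that each end in `--test-shard-index 1 --test-shard-count 4`. With both variables unset, the shard flags are omitted.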