Commit 2f0138e

Switch from pytest-split to pytest-xdist for parallel test execution
Previously, CPU tests were distributed across multiple workers using pytest-split, which assigns the same number of tests to each worker. Because test runtimes differ, some workers finish early and sit idle while others run for a long time, so the workers are not fully utilized. This change replaces the use of pytest-split with pytest-xdist, which dynamically assigns tests to workers.
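The idle-worker problem described above can be illustrated with a toy model. This is only a sketch under stated assumptions: the durations are made up, the static split uses contiguous equal-count chunks, and the dynamic mode is modeled as "the next test goes to whichever worker is free first", which is a simplification and not pytest-xdist's actual scheduler. The function names are hypothetical.

```shell
#!/usr/bin/env bash
# Toy comparison of static equal-count splitting vs. dynamic assignment.
# Durations (in seconds) are invented for illustration only.
durations=(9 8 7 1 1 1 1 1 1 1 1 1)
workers=3

# Static: each worker gets a contiguous chunk with the same number of tests.
# The job finishes when the slowest chunk finishes.
static_makespan() {
  local n=${#durations[@]}
  local per=$(( (n + workers - 1) / workers ))
  local max=0 w i sum
  for (( w = 0; w < workers; w++ )); do
    sum=0
    for (( i = w * per; i < (w + 1) * per && i < n; i++ )); do
      sum=$(( sum + durations[i] ))
    done
    if (( sum > max )); then max=$sum; fi
  done
  echo "$max"
}

# Dynamic: tests are handed out in order to the currently least-loaded worker,
# which models workers pulling the next test as soon as they become free.
dynamic_makespan() {
  local -a load
  local w i min max=0
  for (( w = 0; w < workers; w++ )); do load[w]=0; done
  for (( i = 0; i < ${#durations[@]}; i++ )); do
    min=0
    for (( w = 1; w < workers; w++ )); do
      if (( load[w] < load[min] )); then min=$w; fi
    done
    load[min]=$(( load[min] + durations[i] ))
  done
  for (( w = 0; w < workers; w++ )); do
    if (( load[w] > max )); then max=${load[w]}; fi
  done
  echo "$max"
}

echo "static: $(static_makespan)s, dynamic: $(dynamic_makespan)s"
# prints "static: 25s, dynamic: 11s"
```

With these invented durations, the static split leaves two workers idle after 4 seconds while the third runs for 25, whereas the dynamic model finishes in 11.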
1 parent 56a7fd8 commit 2f0138e

2 files changed

Lines changed: 5 additions & 4 deletions

.github/workflows/build_and_test_maxtext.yml

Lines changed: 2 additions & 2 deletions
@@ -51,7 +51,7 @@ jobs:
       fail-fast: false # don't cancel all jobs on failure
       matrix:
         image_type: ["py312"]
-        worker_group: [1, 2, 3, 4]
+        worker_group: [1, 2, 3]
     with:
       device_type: cpu
       device_name: X64
@@ -63,7 +63,7 @@ jobs:
       container_resource_option: "--privileged"
       is_scheduled_run: ${{ github.event_name == 'schedule' }}
       worker_group: ${{ matrix.worker_group }}
-      total_workers: 4
+      total_workers: 3
 
   maxtext_tpu_unit_tests:
     needs: build_and_upload_maxtext_package

.github/workflows/run_tests_against_package.yml

Lines changed: 3 additions & 2 deletions
@@ -71,6 +71,7 @@ jobs:
         TF_FORCE_GPU_ALLOW_GROWTH: ${{ inputs.tf_force_gpu_allow_growth }}
         TPU_SKIP_MDS_QUERY: ${{ inputs.device_type == 'cpu' && '1' || '' }}
         MAXTEXT_PACKAGE_EXTRA: ${{ inputs.device_type == 'cpu' && 'tpu' || inputs.device_type }}
+        ALLOW_MULTIPLE_LIBTPU_LOAD: ${{ inputs.device_type == 'cpu' && 'true' || '' }} # bypass /tmp/libtpu_lockfile check for cpu tests, which don't actually use accelerators (to allow concurrency)
       options: ${{ inputs.container_resource_option }}
     steps:
       - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
@@ -109,8 +110,8 @@ jobs:
             export LIBTPU_INIT_ARGS='--xla_tpu_scoped_vmem_limit_kib=65536'
           fi
           if [ "${{ inputs.total_workers }}" -gt 1 ]; then
-            .venv/bin/python3 -m pip install --quiet pytest-split
-            SPLIT_ARGS="--splits ${{ inputs.total_workers }} --group ${{ inputs.worker_group }}"
+            .venv/bin/python3 -m pip install --quiet pytest-split pytest-xdist
+            SPLIT_ARGS="--splits ${{ inputs.total_workers }} --group ${{ inputs.worker_group }} -n auto"
           else
             SPLIT_ARGS=""
           fi
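
As the hunk above shows, the pytest-split flags (`--splits`, `--group`) are still passed, and the pytest-xdist flag `-n auto` is appended, so each of the three worker groups appears to run its share of the tests across multiple local processes. The argument-building logic can be sketched as a standalone function. `build_split_args` is a hypothetical name used here for illustration; it is not part of the workflow.

```shell
#!/usr/bin/env bash
# Hypothetical helper mirroring the SPLIT_ARGS logic in the workflow step.
# $1: total number of worker groups, $2: this job's worker group (1-based).
build_split_args() {
  local total_workers=$1 worker_group=$2
  if [ "$total_workers" -gt 1 ]; then
    # --splits / --group are pytest-split options;
    # -n auto is the pytest-xdist option that starts one process per CPU.
    echo "--splits $total_workers --group $worker_group -n auto"
  else
    # A single worker runs the whole suite with no extra arguments.
    echo ""
  fi
}

build_split_args 3 2
# prints "--splits 3 --group 2 -n auto"
```

Note that in this sketch, as in the diff, `-n auto` is only added when there is more than one worker group; a single-group run stays serial.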
