Commit 10e2eec

SS-JIA and claude committed
Skip AOTI tests on macOS CI and bump job timeout to 120 min
Summary: AOTI tests (llama3_2_vision and select extension/llm tests) hang indefinitely on macOS CI runners after the PyTorch 2.12 pin update. The hang is in native C/C++ code (inductor compilation / dlopen), which prevents faulthandler from producing a traceback. Diagnosis is ongoing in pytorch#19886.

Skip the affected tests and bump the macOS job timeout from the default 90 to 120 minutes to add margin (observed completion at ~79 min with skips applied).

Co-Authored-By: Claude <noreply@anthropic.com>
1 parent 915a82d commit 10e2eec

2 files changed

Lines changed: 14 additions & 2 deletions

.ci/scripts/unittest-macos-cmake.sh

Lines changed: 13 additions & 2 deletions
@@ -12,8 +12,19 @@ set -eux
 export TORCHINDUCTOR_CACHE_DIR="$(mktemp -d "${RUNNER_TEMP:-/tmp}/torchinductor_cache_XXXXXX")"
 trap 'rm -rf "${TORCHINDUCTOR_CACHE_DIR}"' EXIT
 
-# Run pytest with coverage
-${CONDA_RUN} pytest -n auto --cov=./ --cov-report=xml
+# TODO(SS-JIA): AOTI tests hang on macOS CI runners — the thread blocks in
+# native C/C++ code (dlopen / inductor compilation) so faulthandler cannot
+# even produce a traceback. Diagnosis ongoing in #19886.
+AOTI_SKIPS=(
+  --ignore=examples/models/llama3_2_vision/preprocess/test_preprocess.py
+  --ignore=examples/models/llama3_2_vision/vision_encoder/test/test_vision_encoder.py
+  --ignore=examples/models/llama3_2_vision/text_decoder/test/test_text_decoder.py
+  --deselect=extension/llm/modules/test/test_position_embeddings.py::TilePositionalEmbeddingTest::test_tile_positional_embedding_aoti
+  --deselect=extension/llm/modules/test/test_position_embeddings.py::TiledTokenPositionalEmbeddingTest::test_tiled_token_positional_embedding_aoti
+  --deselect=extension/llm/modules/test/test_attention.py::AttentionTest::test_attention_aoti
+)
+
+${CONDA_RUN} pytest -n auto --cov=./ --cov-report=xml "${AOTI_SKIPS[@]}"
 # Run gtest
 LLVM_PROFDATA="xcrun llvm-profdata" LLVM_COV="xcrun llvm-cov" \
 ${CONDA_RUN} test/run_oss_cpp_tests.sh
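
The quoted `"${AOTI_SKIPS[@]}"` expansion in the script above hands each array element to pytest as a separate argument. The sketch below illustrates that behavior in isolation. The `count_args` helper, the `SKIPS` array, and the placeholder paths are hypothetical and do not appear in the commit.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical helper (not part of the commit): prints how many
# arguments it received.
count_args() {
  echo "$#"
}

# Placeholder entries standing in for the real skip list.
SKIPS=(
  --ignore=path/to/test_a.py
  --deselect=path/to/test_b.py::SomeTest::test_case
)

# Quoted [@] expansion yields one argument per array element.
quoted_count="$(count_args "${SKIPS[@]}")"

# Quoted [*] expansion joins all elements into a single argument.
joined_count="$(count_args "${SKIPS[*]}")"

echo "quoted [@]: ${quoted_count}, quoted [*]: ${joined_count}"
```

Keeping the flags in an array rather than a single string means each `--ignore` (skip collecting a whole file) and `--deselect` (drop one test by node ID) stays a distinct argument, and individual entries can be removed one at a time once the hang is resolved.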

.github/workflows/_unittest.yml

Lines changed: 1 addition & 0 deletions
@@ -49,6 +49,7 @@ jobs:
       python-version: '3.11'
       submodules: 'recursive'
       ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
+      timeout: 120
       script: |
         set -eux
         # This is needed to get the prebuilt PyTorch wheel from S3
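
To make the margin described in the commit message concrete, the following sketch computes the headroom before and after the change, using only the figures quoted there (default 90 minutes, new limit 120 minutes, observed completion at roughly 79 minutes). The variable names are illustrative and not taken from the workflow.

```shell
#!/usr/bin/env bash
set -eu

# Figures from the commit message; the observed runtime is approximate.
observed_min=79
default_timeout_min=90
new_timeout_min=120

# Headroom is the gap between the timeout and the observed runtime.
old_headroom=$(( default_timeout_min - observed_min ))
new_headroom=$(( new_timeout_min - observed_min ))

echo "headroom before: ${old_headroom} min, after: ${new_headroom} min"
```

Under these numbers the job would have had about 11 minutes of slack at the default limit and has about 41 minutes at the new one.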
