Commit 0ee31fc

Bump iOS XCTest timeout for ExecuTorchLLMTests (#19354)
Summary:
The 13 XCTestCase methods in `xplat/executorch/extension/llm/apple:ExecuTorchLLMTests` (testLLaMA, testPhi4, testGemma, testLLaVA, testVoxtral and their reset variants) regularly hit the 1800-second per-test ceiling enforced by `fbobjc/Tools/xctest_runner` for the `long_running` label. LLM inference on iOS-sim CPU (1B-class models, 128-768 token sequences, each test calls `generate()` twice) routinely exceeds 30 minutes per test method, producing spurious "Test timed out after 1800 seconds" flakes on the test-issues dashboard for owner `ai_infra_mobile_platform`.

Per the runner formula `TEST_CASE_TIMEOUT(60s) * label_multiplier * 3`:

| label          | multiplier | per-XCTestCase budget |
|----------------|-----------:|----------------------:|
| long_running   |        x10 |                 1800s |
| glacial (here) |        x30 |                 5400s |

Switching to `glacial` (the highest tier supported by the runner) gives each test 90 minutes. Adding `test_test_rule_timeout_ms = 14400000` sets the bundle-level wall-clock budget to 4h, which is comfortable headroom for ~5 testcases at 90 min each plus xctest setup/teardown.

Note: this diff is unrelated to T269848646. T269848646 tracks a separate cluster of 446 iOS-sim test-run *cancellations* (`duration: 0.00`, "test execution was cancelled because the test run was cancelled") that is owned by testinfra and is not addressed here.

Reviewed By: shoumikhin

Differential Revision: D104147313
1 parent 1414bc1 commit 0ee31fc

1 file changed

Lines changed: 11 additions & 1 deletion


extension/llm/apple/BUCK

@@ -16,7 +16,17 @@ non_fbcode_target(_kind = fb_apple_library,
     ],
     sdks = IOS,
     visibility = EXECUTORCH_CLIENTS,
-    test_labels = ["long_running"],
+    # `glacial` raises the per-XCTestCase timeout from 1800s -> 5400s (90 min)
+    # via fbobjc/Tools/xctest_runner: TEST_CASE_TIMEOUT(60s) * 30 * 3.
+    # Required because LLM inference (LLaMA, Phi4, Gemma, LLaVA, Voxtral)
+    # on iOS-sim CPU regularly exceeds 30 minutes for a full forward pass.
+    test_labels = ["glacial"],
+    # Rule-level wall-clock for the whole auto-generated test bundle:
+    # ExecuTorchLLMTests currently contains 13 XCTestCase methods, and
+    # individual methods can exceed 30 minutes on iOS-sim CPU. This 4h
+    # budget is intended as the total bundle/shard wall-clock, including
+    # xctest setup/teardown overhead; it is not based on "5 testcases".
+    test_test_rule_timeout_ms = 14400000,
     test_deps = [
         ":ExecuTorchLLMTestResource",
         "//xplat/executorch/backends/xnnpack:xnnpack_backendApple",
