
Commit 64751ea

zhangyue207 and zhangyue authored
fix(tests): skip crashing torch ops on ascend (#614)
* fix(tests): skip crashing torch ops on ascend
* ci: update CI submodule for dynamic device allocation
* fix(tests): skip smooth l1 loss on ascend
* ci: use host-level device leases
* ci: route local runners through leases
* ci: use agent local CI tools
* fix(tests): narrow Ascend torch op crash skips
* fix(tests): skip Ascend soft margin loss crash
* fix(tests): restore Ascend smooth l1 crash skip
* ci: stop queue watchdog after jobs start
* ci: require Ascend device lease before docker run
* ci: remove queued job watchdog checks
* fix(ci): keep iluvatar corex compiler in image
* fix(ci): pass platform to v2 shadow runner

---------

Co-authored-by: zhangyue <zhangyue@localhost.localdomain>
1 parent 1400daf commit 64751ea

5 files changed

Lines changed: 9 additions & 6 deletions

.github/ci_config.yml

Lines changed: 0 additions & 1 deletion
@@ -47,7 +47,6 @@ platforms:
       - "--ipc=host"
     volumes:
       - /dev:/dev
-      - /usr/local/corex-4.3.0/bin:/usr/local/corex-4.3.0.20250624/bin:ro
       - /lib/firmware:/lib/firmware
       - /usr/src:/usr/src
       - /lib/modules:/lib/modules

.github/workflows/ci_test.yml

Lines changed: 2 additions & 2 deletions
@@ -27,10 +27,10 @@ on:
 
 jobs:
   ci:
-    uses: InfiniTensor/ci/.github/workflows/infiniops-ci.yml@c6bf369739f16e759f46fd466f586c51c59f26c3
+    uses: InfiniTensor/ci/.github/workflows/infiniops-ci.yml@45d1046ec42ea73d1a75d5a96eb1a8204d47cbcd
     with:
       config_path: .github/ci_config.yml
-      ci_ref: c6bf369739f16e759f46fd466f586c51c59f26c3
+      ci_ref: 45d1046ec42ea73d1a75d5a96eb1a8204d47cbcd
       max_parallel: 10
       platform: ${{ github.event_name == 'workflow_dispatch' && (inputs.platform == 'all' && 'nvidia,iluvatar,metax,moore,cambricon,ascend' || inputs.platform) || 'nvidia,iluvatar,metax,moore,cambricon,ascend' }}
     secrets: inherit

.github/workflows/ci_v2_shadow.yml

Lines changed: 2 additions & 2 deletions
@@ -28,10 +28,10 @@ on:
 
 jobs:
   ci-v2-shadow:
-    uses: InfiniTensor/ci/.github/workflows/infiniops-ci-v2-shadow.yml@c6bf369739f16e759f46fd466f586c51c59f26c3
+    uses: InfiniTensor/ci/.github/workflows/infiniops-ci-v2-shadow.yml@45d1046ec42ea73d1a75d5a96eb1a8204d47cbcd
     with:
       config_path: .github/ci_config.yml
-      ci_ref: c6bf369739f16e759f46fd466f586c51c59f26c3
+      ci_ref: 45d1046ec42ea73d1a75d5a96eb1a8204d47cbcd
       max_parallel: 10
       platform: ${{ github.event_name == 'workflow_dispatch' && (inputs.platform == 'all' && 'nvidia,iluvatar,metax,moore,cambricon,ascend' || inputs.platform == 'active' && 'nvidia,iluvatar,metax,moore,cambricon,ascend' || inputs.platform) || 'nvidia,iluvatar,metax,moore,cambricon,ascend' }}
     secrets: inherit
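
The `platform:` expressions in both workflows use the `cond && value || fallback` idiom from GitHub Actions expressions. As a rough illustration of how they appear to resolve, here is a small Python sketch. The function name `resolve_platform` and its `aliases` parameter are hypothetical and are not part of this repository; only the platform list and the `all` / `active` keywords are taken from the diff.

```python
# Hypothetical model of the `platform:` expression; not code from the repository.
ALL_PLATFORMS = "nvidia,iluvatar,metax,moore,cambricon,ascend"


def resolve_platform(event_name, input_platform="", aliases=("all",)):
    """Return the comma-separated platform list a run would target.

    `aliases` holds the input values that expand to every platform:
    ("all",) models ci_test.yml, ("all", "active") models ci_v2_shadow.yml.
    """
    # Any trigger other than a manual dispatch falls through to the full list.
    if event_name != "workflow_dispatch":
        return ALL_PLATFORMS
    # An alias expands to the full list.
    if input_platform in aliases:
        return ALL_PLATFORMS
    # An empty input is falsy, so the trailing `|| '...'` fallback applies.
    return input_platform or ALL_PLATFORMS


if __name__ == "__main__":
    print(resolve_platform("push"))
    print(resolve_platform("workflow_dispatch", "ascend"))
    print(resolve_platform("workflow_dispatch", "active", aliases=("all", "active")))
```

Under this reading, a manual dispatch with a single platform name runs only that platform, while every other trigger runs all six.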

tests/test_torch_ops.py

Lines changed: 4 additions & 0 deletions
@@ -208,7 +208,11 @@ def _list_default(aten_type):
 _VENDOR_CRASH_OPS = frozenset(
     {
         ("npu", "mish"),
+        ("npu", "mse_loss"),
+        ("npu", "nonzero"),
         ("npu", "nuclear_norm"),
+        ("npu", "smooth_l1_loss"),
+        ("npu", "soft_margin_loss"),
         ("npu", "_linalg_svd"),
         ("npu", "svd"),
     }
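
The diff shows only the contents of `_VENDOR_CRASH_OPS`, not how the test file consumes it. A minimal sketch of one plausible use, assuming the set is checked with a `(device type, op name)` lookup before an op is exercised, might look like this. The helpers `crashes_on_vendor` and `partition_ops` are hypothetical names introduced for illustration.

```python
# Set contents copied from the diff; the helper functions are hypothetical.
_VENDOR_CRASH_OPS = frozenset(
    {
        ("npu", "mish"),
        ("npu", "mse_loss"),
        ("npu", "nonzero"),
        ("npu", "nuclear_norm"),
        ("npu", "smooth_l1_loss"),
        ("npu", "soft_margin_loss"),
        ("npu", "_linalg_svd"),
        ("npu", "svd"),
    }
)


def crashes_on_vendor(device_type, op_name):
    """Return True when the (device, op) pair is listed as crashing."""
    return (device_type, op_name) in _VENDOR_CRASH_OPS


def partition_ops(device_type, op_names):
    """Split op names into (runnable, skipped) lists for one device type."""
    runnable, skipped = [], []
    for name in op_names:
        if crashes_on_vendor(device_type, name):
            skipped.append(name)
        else:
            runnable.append(name)
    return runnable, skipped


if __name__ == "__main__":
    runnable, skipped = partition_ops("npu", ["add", "mse_loss", "svd"])
    print("run:", runnable)
    print("skip:", skipped)
```

Keying on the pair rather than on the op name alone keeps the skip scoped to one backend, so the same ops would still run on other device types.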
