Commit db3a1b3

[hardware] fix: update architecture check and CANN toolkit path retrieval in device.py (verl-project#5142)
### What does this PR do?

Prioritize obtaining the CANN version information file path from the `ASCEND_HOME_PATH` environment variable instead of hard-coding it.

### Checklist Before Starting

- [x] Search for similar PRs. Paste at least one query link here: ...
- [x] Format the PR title as `[{modules}] {type}: {description}` (This will be checked by the CI)
  - `{modules}` include `fsdp`, `megatron`, `veomni`, `sglang`, `vllm`, `rollout`, `trainer`, `ci`, `training_utils`, `recipe`, `hardware`, `deployment`, `ray`, `worker`, `single_controller`, `misc`, `perf`, `model`, `algo`, `env`, `tool`, `ckpt`, `doc`, `data`, `cfg`, `reward`
  - If this PR involves multiple modules, separate them with `,` like `[megatron, fsdp, doc]`
  - `{type}` is in `feat`, `fix`, `refactor`, `chore`, `test`
  - If this PR breaks any API (CLI arguments, config, function signature, etc.), add `[BREAKING]` to the beginning of the title.
  - Example: `[BREAKING][fsdp, megatron] feat: dynamic batching`

### Test

> For changes that can not be tested by CI (e.g., algorithm implementation, new model support), validate by experiment(s) and show results like training curve plots, evaluation results, etc.

### API and Usage Example

> Demonstrate how the API changes if any, and provide usage example(s) if possible.

```python
# Add code snippet or script demonstrating how to use this
```

### Design & Code Changes

> Demonstrate the high-level design if this PR is complex, and list the specific changes.

### Checklist Before Submitting

> [!IMPORTANT]
> Please check all the following items before requesting a review, otherwise the reviewer might deprioritize this PR for review.

- [x] Read the [Contribute Guide](https://github.com/volcengine/verl/blob/main/CONTRIBUTING.md).
- [x] Apply [pre-commit checks](https://github.com/volcengine/verl/blob/main/CONTRIBUTING.md#code-linting-and-formatting): `pre-commit install && pre-commit run --all-files --show-diff-on-failure --color=always`
- [ ] Add / Update [the documentation](https://github.com/volcengine/verl/tree/main/docs).
- [ ] Add unit or end-to-end test(s) to [the CI workflow](https://github.com/volcengine/verl/tree/main/.github/workflows) to cover all the code. If not feasible, explain why: ...
- [ ] Once your PR is ready for CI, send a message in [the `ci-request` channel](https://verl-project.slack.com/archives/C091TCESWB1) in [the `verl` Slack workspace](https://join.slack.com/t/verl-project/shared_invite/zt-3855yhg8g-CTkqXu~hKojPCmo7k_yXTQ). (If not accessible, please try [the Feishu group (飞书群)](https://applink.larkoffice.com/client/chat/chatter/add_by_link?link_token=772jd4f1-cd91-441e-a820-498c6614126a).)
- [ ] If your PR is related to the `recipe` submodule, please also update the reference to the submodule commit via `git submodule update --remote` or `cd recipe && git pull origin main`.

Signed-off-by: jianjunzhong <jianjunzhong@foxmail.com>
1 parent 7f4b76a commit db3a1b3

2 files changed

Lines changed: 15 additions & 6 deletions

tests/utils/test_check_ipc_version_support_on_npu.py

Lines changed: 12 additions & 3 deletions
@@ -1,7 +1,16 @@
-# Copyright 2025 Bytedance Ltd. and/or its affiliates
+# Copyright 2024 Bytedance Ltd. and/or its affiliates
 #
-# This code is licensed under the MIT-style license found in the
-# LICENSE file in the root directory of this source tree.
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
 
 import logging
 import unittest

verl/utils/device.py

Lines changed: 3 additions & 3 deletions
@@ -211,11 +211,11 @@ def get_npu_versions() -> tuple[str, str]:
 
     # Check CANN toolkit version
     arch = platform.machine()
-    if arch not in ["aarch64", "x86_64"]:
+    if arch not in ["arm64", "aarch64", "x86_64"]:
         raise RuntimeError(f"Unsupported architecture: {arch}")
 
-    # NOTE: if user install CANN toolkit in custom path, this check may fail
-    cann_path = os.path.join("/usr/local/Ascend/ascend-toolkit/latest", f"{arch}-linux")
+    ascend_home = os.environ.get("ASCEND_HOME_PATH", "/usr/local/Ascend/ascend-toolkit/latest")
+    cann_path = os.path.join(ascend_home, f"{arch}-linux")
 
     if not os.path.exists(cann_path):
         raise RuntimeError(f"CANN toolkit path does not exist: {cann_path}")
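
The lookup order introduced in this hunk can be sketched in isolation as follows. This is a minimal illustration, not code from the commit: the helper name `resolve_cann_path`, its `env` and `arch` parameters, and the two module-level constants are hypothetical and exist only so the behavior can be exercised without an Ascend installation. The default path and the architecture list are copied from the diff.

```python
import os
import platform

# Values copied from the diff; the constant names themselves are hypothetical.
DEFAULT_ASCEND_HOME = "/usr/local/Ascend/ascend-toolkit/latest"
SUPPORTED_ARCHS = ("arm64", "aarch64", "x86_64")


def resolve_cann_path(env=None, arch=None):
    """Hypothetical helper mirroring the path lookup in get_npu_versions().

    env:  mapping to read ASCEND_HOME_PATH from (defaults to os.environ).
    arch: machine architecture string (defaults to platform.machine()).
    """
    env = os.environ if env is None else env
    arch = platform.machine() if arch is None else arch

    if arch not in SUPPORTED_ARCHS:
        raise RuntimeError(f"Unsupported architecture: {arch}")

    # Prefer the environment variable; fall back to the default install prefix.
    ascend_home = env.get("ASCEND_HOME_PATH", DEFAULT_ASCEND_HOME)
    return os.path.join(ascend_home, f"{arch}-linux")


if __name__ == "__main__":
    # Custom install prefix supplied through the environment variable.
    print(resolve_cann_path({"ASCEND_HOME_PATH": "/opt/cann"}, "aarch64"))
    # Variable unset: the default prefix from the diff is used.
    print(resolve_cann_path({}, "x86_64"))
```

Two properties of this logic may be worth keeping in mind when reviewing. First, `dict.get` only falls back when the key is absent, so an `ASCEND_HOME_PATH` that is set to an empty string yields a relative path such as `aarch64-linux`. Second, when `platform.machine()` returns `arm64`, the computed directory is `<ascend_home>/arm64-linux`, so the later `os.path.exists` check still decides whether that layout is actually present.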
