Commit bec05d7

Merge pull request #167 from erieaton-amd/build-guide
Update build directions for AMD GPUs
2 parents 9452570 + e48d8f6 commit bec05d7

1 file changed

Lines changed: 16 additions & 9 deletions

docs/build.md

@@ -8,8 +8,8 @@
 We recommend installation in [Nvidia PyTorch container](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch/tags).
 
 #### if for AMD GPU:
-- ROCM 6.3.0
-- Torch 2.4.1 with ROCM support
+- ROCM 7.1
+- Torch 2.7.1 with ROCM support
 
 
 
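Reviewer note: the hunk above raises the AMD minimums to ROCM 7.1 and Torch 2.7.1. A minimal sketch of how an environment could be checked against those minimums follows. The helper names (`strip_suffix`, `version_ge`, `report`) are hypothetical and not part of the repository, and the sketch assumes GNU `sort -V` is available, as it is in Ubuntu-based images.

```sh
# Sketch only: compare dotted version strings against a required minimum.
# Assumes GNU coreutils `sort -V`; helper names are illustrative.

# Drop a local build suffix such as "+rocm7.1" or "-abc123",
# leaving only the dotted numeric part.
strip_suffix() {
    local v="${1%%+*}"
    printf '%s\n' "${v%%-*}"
}

# Return 0 when version $1 is greater than or equal to version $2.
version_ge() {
    local have want lowest
    have="$(strip_suffix "$1")"
    want="$(strip_suffix "$2")"
    # The smaller of the two versions sorts first; if that is the
    # required minimum, the installed version is new enough.
    lowest="$(printf '%s\n%s\n' "$have" "$want" | sort -V | head -n 1)"
    [ "$lowest" = "$want" ]
}

# Print a one-line verdict for a named component.
report() {
    local name="$1" have="$2" want="$3"
    if version_ge "$have" "$want"; then
        echo "OK: $name $have >= $want"
    else
        echo "TOO OLD: $name $have < $want"
    fi
}
```

One possible use, assuming PyTorch is already installed: `report torch "$(python3 -c 'import torch; print(torch.__version__)')" 2.7.1`. On ROCm builds, `torch.version.hip` could be passed the same way to compare against 7.1.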

@@ -139,7 +139,7 @@ bash .codebase/scripts/nvidia/run_tutorial_test.sh
 See examples in the `tutorials` directory at the project root.
 
 ## To use Triton-distributed with the AMD backend:
-Starting from the rocm/pytorch:rocm6.1_ubuntu22.04_py3.10_pytorch_2.4 Docker container
+Starting from the rocm/pytorch:rocm7.1_ubuntu24.04_py3.12_pytorch_release_2.7.1 Docker container
 #### AMD Build Steps
 1. Clone the repo
 ```sh
@@ -150,14 +150,21 @@ git clone https://github.com/ByteDance-Seed/Triton-distributed.git
 cd Triton-distributed/
 git submodule update --init --recursive
 ```
+If you are updating an old repo, there may be issues if the rocshmem submodule is still present. Erase it if necessary:
+```sh
+rm -rf 3rdparty/rocshmem # only for updated repo
+```
 3. Install dependencies
 ```sh
-sudo apt-get update -y
-sudo apt install -y libopenmpi-dev
-pip3 install --pre torch --index-url https://download.pytorch.org/whl/nightly/rocm6.3 --no-deps
-bash ./shmem/rocshmem_bind/build.sh
-python3 -m pip install -i https://test.pypi.org/simple hip-python>=6.3.0 # (or whatever Rocm version you have)
+export TRITON_BUILD_WITH_CLANG_LLD=TRUE
+export TRITON_USE_ASSERT_ENABLED_LLVM=TRUE
+export TRITON_BUILD_PROTON=0
+rm -f /usr/local/bin/cmake
+apt-get update -y
+apt install -y libopenmpi-dev git cython3 ibverbs-utils openmpi-bin libopenmpi-dev libpci-dev libdw1 locales cmake miopen-hip autoconf libtool flex ninja-build clang lld
+python3 -m pip install -i https://test.pypi.org/simple hip-python>=7.1 # (or whatever Rocm version you have)
 pip3 install pybind11
+bash ./shmem/rocshmem_bind/build.sh
 ```
 4. Build Triton-distributed
 ```sh
@@ -167,7 +174,7 @@ pip3 install -e python --verbose --no-build-isolation --use-pep517
 #### GEMM ReduceScatter example on single node
 ```sh
 bash ./scripts/launch_amd.sh ./python/triton_dist/test/amd/test_ag_gemm_intra_node.py 8192 8192 29568
-```
+```
 and see the following (reduced) output
 ```sh
 ✅ Triton and Torch match
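
Reviewer note: the updated dependency step installs a larger toolchain (cmake, ninja-build, clang, lld, and others) before `build.sh` runs. A preflight check along the following lines could surface a missing tool before the build starts. This is a sketch, not part of the commit: `missing_tools` and `preflight` are hypothetical helpers, and the command names passed to them are assumptions about which binaries those apt packages provide.

```sh
# Sketch only: report build tools that are not on PATH.

# Print each command from the argument list that cannot be found.
# Returns 0 when every command is found, 1 otherwise.
missing_tools() {
    local status=0 tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            status=1
        fi
    done
    return "$status"
}

# Hypothetical wrapper: the command names below are assumptions about
# the binaries provided by the packages in the dependency step.
preflight() {
    local missing
    if missing="$(missing_tools git cmake ninja clang ld.lld flex autoconf mpirun)"; then
        echo "preflight: all tools found"
    else
        echo "preflight: missing: $(echo "$missing" | tr '\n' ' ')" >&2
        return 1
    fi
}
```

Calling `preflight` before `bash ./shmem/rocshmem_bind/build.sh` would stop early with a list of missing commands instead of failing partway through the build.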
