We recommend installation in the [Nvidia PyTorch container](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch/tags).
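A typical way to enter that container is sketched below; the image tag is an assumed placeholder, so pick a current one from the catalog linked above.

```sh
# Assumed example: the tag below is a placeholder, not a pinned requirement.
IMAGE="nvcr.io/nvidia/pytorch:25.01-py3"  # choose a current tag from the NGC catalog

# Launch an interactive container with GPU access and host IPC (helpful for NCCL).
docker run --gpus all -it --rm --ipc=host "$IMAGE"
```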

#### For AMD GPU:
- ROCm 7.1
- Torch 2.7.1 with ROCm support

See examples in the `tutorials` directory at the project root.

## To use Triton-distributed with the AMD backend:
Start from the rocm/pytorch:rocm7.1_ubuntu24.04_py3.12_pytorch_release_2.7.1 Docker container.
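Launching that container usually requires passing the ROCm device nodes through to it; a minimal sketch, following common ROCm container practice (adjust the flags for your system):

```sh
# Sketch: run the ROCm PyTorch container with GPU access.
# --device flags expose the ROCm kernel driver (/dev/kfd) and render nodes (/dev/dri).
docker run -it --rm \
  --device=/dev/kfd --device=/dev/dri \
  --group-add video --ipc=host \
  rocm/pytorch:rocm7.1_ubuntu24.04_py3.12_pytorch_release_2.7.1
```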
#### AMD Build Steps
1. Clone the repo
```sh
git clone https://github.com/ByteDance-Seed/Triton-distributed.git
cd Triton-distributed/
git submodule update --init --recursive
```
If you are updating an existing clone, a stale rocshmem submodule can cause problems. Remove it if necessary:
```sh
rm -rf 3rdparty/rocshmem  # only needed for a previously cloned repo
```
3. Install dependencies
```sh
export TRITON_BUILD_WITH_CLANG_LLD=TRUE
export TRITON_USE_ASSERT_ENABLED_LLVM=TRUE
export TRITON_BUILD_PROTON=0
rm -f /usr/local/bin/cmake
apt-get update -y
apt install -y libopenmpi-dev git cython3 ibverbs-utils openmpi-bin libpci-dev libdw1 locales cmake miopen-hip autoconf libtool flex ninja-build clang lld
python3 -m pip install -i https://test.pypi.org/simple "hip-python>=7.1"  # (or whatever ROCm version you have)
pip3 install pybind11
bash ./shmem/rocshmem_bind/build.sh
```
4. Build Triton-distributed
```sh
pip3 install -e python --verbose --no-build-isolation --use-pep517
```
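A quick smoke test after the build; the module name `triton_dist` is taken from the test paths used below, so adjust it if your layout differs.

```sh
# Smoke test: the editable install should make triton_dist importable.
if python3 -c "import triton_dist" 2>/dev/null; then
  echo "triton_dist import OK"
else
  echo "triton_dist not importable; check the build logs"
fi
```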
#### GEMM ReduceScatter example on single node
```sh
bash ./scripts/launch_amd.sh ./python/triton_dist/test/amd/test_ag_gemm_intra_node.py 8192 8192 29568
```
and see the following (reduced) output:
```sh
✅ Triton and Torch match
```