Merged
25 changes: 16 additions & 9 deletions docs/build.md
@@ -8,8 +8,8 @@
We recommend installation in [Nvidia PyTorch container](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch/tags).

#### if for AMD GPU:
-- ROCM 6.3.0
-- Torch 2.4.1 with ROCM support
+- ROCM 7.1
+- Torch 2.7.1 with ROCM support
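
The AMD requirements above (ROCM 7.1, Torch 2.7.1) can be checked before starting the build. The following is a minimal sketch, not part of this change. It assumes the ROCm version is recorded in `/opt/rocm/.info/version` (a common location, but an assumption here) and treats 7.1 as a minimum rather than an exact match. `version_ge` and `ROCM_VERSION_FILE` are hypothetical names introduced for illustration.

```sh
# Return success when version $1 is greater than or equal to version $2.
# Relies on GNU sort's -V (version sort), available in Ubuntu's coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Assumed location of the ROCm version file; override it for other installs.
ROCM_VERSION_FILE="${ROCM_VERSION_FILE:-/opt/rocm/.info/version}"

if [ -r "$ROCM_VERSION_FILE" ]; then
    # Drop any build suffix after a dash before comparing.
    rocm_version="$(cut -d- -f1 "$ROCM_VERSION_FILE")"
    if version_ge "$rocm_version" "7.1"; then
        echo "ROCm $rocm_version satisfies >= 7.1"
    else
        echo "ROCm $rocm_version is older than 7.1"
    fi
else
    echo "No ROCm version file found at $ROCM_VERSION_FILE"
fi
```

For the PyTorch side, `python3 -c "import torch; print(torch.__version__, torch.version.hip)"` prints the Torch version and the HIP version it was built against (`None` on non-ROCm builds).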



@@ -139,7 +139,7 @@ bash .codebase/scripts/nvidia/run_tutorial_test.sh
See examples in the `tutorials` directory at the project root.

## To use Triton-distributed with the AMD backend:
-Starting from the rocm/pytorch:rocm6.1_ubuntu22.04_py3.10_pytorch_2.4 Docker container
+Starting from the rocm/pytorch:rocm7.1_ubuntu24.04_py3.12_pytorch_release_2.7.1 Docker container
#### AMD Build Steps
1. Clone the repo
```sh
@@ -150,14 +150,21 @@ git clone https://github.com/ByteDance-Seed/Triton-distributed.git
cd Triton-distributed/
git submodule update --init --recursive
```
+If you are updating an old repo, there may be issues if the rocshmem submodule is still present. Erase it if necessary:
+```sh
+rm -rf 3rdparty/rocshmem # only for updated repo
+```
3. Install dependencies
```sh
-sudo apt-get update -y
-sudo apt install -y libopenmpi-dev
-pip3 install --pre torch --index-url https://download.pytorch.org/whl/nightly/rocm6.3 --no-deps
-bash ./shmem/rocshmem_bind/build.sh
-python3 -m pip install -i https://test.pypi.org/simple hip-python>=6.3.0 # (or whatever Rocm version you have)
+export TRITON_BUILD_WITH_CLANG_LLD=TRUE
+export TRITON_USE_ASSERT_ENABLED_LLVM=TRUE
+export TRITON_BUILD_PROTON=0

Contributor:
Is there any workaround for this? I see a CUDA header when compiling the Proton files; is that why you suggest disabling it?

Collaborator (Author):
Proton isn't supported on AMD hardware yet, so this can cause a build error.

+rm -f /usr/local/bin/cmake
+apt-get update -y
+apt install -y libopenmpi-dev git cython3 ibverbs-utils openmpi-bin libopenmpi-dev libpci-dev libdw1 locales cmake miopen-hip autoconf libtool flex ninja-build clang lld
+python3 -m pip install -i https://test.pypi.org/simple hip-python>=7.1 # (or whatever Rocm version you have)
+pip3 install pybind11
bash ./shmem/rocshmem_bind/build.sh
```
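
A note on the `hip-python>=7.1` line in the install step above: in a POSIX shell an unquoted `>` starts an output redirection, so as written pip receives only `hip-python` and its output is written to a file named `=7.1`. The sketch below is not part of this change. It shows the quoted form that passes the constraint through intact. `show_args` is a hypothetical helper used only to make the argument splitting visible.

```sh
# Print each argument on its own line, to show what a command would receive.
show_args() { printf '[%s]\n' "$@"; }

# Quoted: the version constraint reaches the command as a single argument.
show_args install "hip-python>=7.1"

# Quoted form of the install line (left commented out to avoid side effects;
# 7.1 mirrors the ROCm version in this change, use the one you have):
# python3 -m pip install -i https://test.pypi.org/simple "hip-python>=7.1"
```

Single quotes work equally well here, since the requirement string contains nothing that needs expansion.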
4. Build Triton-distributed
```sh
@@ -167,7 +174,7 @@ pip3 install -e python --verbose --no-build-isolation --use-pep517
#### GEMM ReduceScatter example on single node
```sh
bash ./scripts/launch_amd.sh ./python/triton_dist/test/amd/test_ag_gemm_intra_node.py 8192 8192 29568
-```
+```
and see the following (reduced) output
```sh
✅ Triton and Torch match