Commit 2fe342e

Updating README by introducing HaloBox into OS Environment setup
1 parent ca04058 commit 2fe342e

1 file changed

Lines changed: 59 additions & 4 deletions

playbooks/supplemental/pytorch-kernels/README.md

@@ -120,13 +120,18 @@ PyTorch also exposes `torch.cuda._compile_kernel()`, a high-level shortcut to JI
 ---
 
 ## Setup
-<!-- @os:windows -->
-### Prerequisites - Windows
-- Install latest: [AMD Adrenalin Software](https://www.amd.com/en/products/software/adrenalin.html)
+
+### System Detection
+Before setup, the script checks whether the system is a HaloBox. If it is, some steps, such as creating a virtual environment, are skipped because HaloBox comes with the driver pre-installed.
+
+<!-- @os:halobox -->
+### HaloBox: Skipping Virtual Environment Setup
+No virtual environment is needed because the necessary driver and configurations are pre-installed.
 <!-- @os:end -->
 
 ### Create a Virtual Environment
 <!-- @os:linux -->
+#### Linux
 ```bash
 sudo apt install -y python3-venv
 python3 -m venv ~/rocm-env
@@ -136,6 +141,12 @@ source ~/rocm-env/bin/activate
 <!-- @os:end -->
 
 <!-- @os:windows -->
+### Prerequisites - Windows
+- Install latest: [AMD Adrenalin Software](https://www.amd.com/en/products/software/adrenalin.html)
+<!-- @os:end -->
+
+<!-- @os:windows -->
+#### Windows
 ```bash
 python -m venv rocm-env
 rocm-env\Scripts\activate
@@ -146,7 +157,15 @@ rocm-env\Scripts\activate
 ---
 
 ### Installing Dependencies
+<!-- @os:halobox -->
+#### HaloBox
+```bash
+pip install --upgrade pip setuptools wheel --break-system-packages
+```
+<!-- @os:end -->
+
 <!-- @os:linux -->
+#### Linux
 ```bash
 source ~/rocm-env/bin/activate
 
@@ -161,6 +180,7 @@ pip install --pre --index-url https://rocm.nightlies.amd.com/v2/gfx1151/ torch==
 <!-- @os:end -->
 
 <!-- @os:windows -->
+#### Windows
 ```bash
 rocm-env\Scripts\activate
 
@@ -179,6 +199,7 @@ pip install --pre --index-url https://rocm.nightlies.amd.com/v2/gfx1151/ torch==
 
 #### Set Environment Variables
 <!-- @os:linux -->
+#### Linux
 ```bash
 rocm-sdk init # Initialize the devel libraries
 
@@ -188,7 +209,18 @@ export PATH = "$ROCM_HOME/bin:$PATH"
 ```
 <!-- @os:end -->
 
+<!-- @os:linux&halobox -->
+#### Linux & HaloBox
+```bash
+# Set compiler and build settings
+export CC=clang
+export CXX=clang
+export DISTUTILS_USE_SDK=1
+```
+<!-- @os:end -->
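
The `CC` and `CXX` exports in the Linux & HaloBox block only take effect if the named compiler can actually be found when the build runs. The following is a minimal Python sketch of a pre-build sanity check, not part of the README or its scripts; the function name `missing_compilers` is hypothetical, and the check assumes each variable holds a single executable name or path with no extra arguments.

```python
import os
import shutil


def missing_compilers(env=None, names=("CC", "CXX")):
    """Return the variables in `names` that are unset or do not resolve to an executable."""
    env = os.environ if env is None else env
    missing = []
    for name in names:
        value = env.get(name, "")
        # A bare name such as "clang" is searched on PATH; a value that
        # contains a directory part is checked directly by shutil.which.
        if not value or shutil.which(value, path=env.get("PATH")) is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    problems = missing_compilers()
    if problems:
        print("Unset or unresolvable compiler variables:", ", ".join(problems))
    else:
        print("CC and CXX both resolve to executables")
```

Running the script in the same shell, after the exports and before the build step, would surface a missing `clang` early instead of partway through compilation.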
+
 <!-- @os:windows -->
+#### Windows
 ```bash
 rocm-sdk init # Initialize the devel libraries
 
@@ -299,13 +331,14 @@ On Windows, `rocm-smi` is not supported. To track GPU utilization, you can use T
 The full manual path: write the kernel and Python binding in a single `.cu` file, compile it as a native extension using PyTorch's build system, then import and call it from Python.
 
 **Files:**
-
+#### Windows
 <!-- @os:windows -->
 | File | Role |
 |---|---|
 | [add_one_kernel.cu](assets/Vector_Addition/add_one_kernel.cu) | Kernel + launcher + pybind11 binding, everything in one file |
 | [setup.py](assets/Vector_Addition/setup.py) | Build script, uses `CUDAExtension` to compile the `.cu` into a `.pyd` |
 <!-- @os:end -->
+#### Linux
 <!-- @os:linux -->
 | File | Role |
 |---|---|
@@ -349,16 +382,26 @@ the CPU immediately continues executing the next instruction without waiting for
 
 **Step 2: Build**
 
+<!-- @os:halobox -->
+#### HaloBox
+```bash
+pip install --no-build-isolation -v . --break-system-packages
+```
+<!-- @os:end -->
+
+#### Linux & Windows
 ```bash
 pip install --no-build-isolation -v .
 ```
 
 `CUDAExtension` is a CUDA build helper from `torch.utils.cpp_extension`. On AMD with ROCm, PyTorch **remaps `CUDAExtension` to use `hipcc`** instead of `nvcc`, so the same `setup.py` that would build a CUDA extension on NVIDIA compiles to AMD GPU code without any changes. This is the key mechanism that makes CUDA extension code portable to AMD: PyTorch's ROCm build intercepts the build path and routes it through the HIP compiler. Produces these in the same directory:
 <!-- @os:windows -->
+#### Windows
 - `build/`: directory with the `.pyd` files
 - `add_one_kernel.hip`: the HIP source generated by hipifying the `.cu` file; this is what `hipcc` actually compiled
 <!-- @os:end -->
 <!-- @os:linux -->
+#### Linux
 - `build/`: directory with the `.so` files
 - `add_one_kernel.hip`: the HIP source generated by hipifying the `.cu` file; this is what `hipcc` actually compiled
 <!-- @os:end -->
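
The two build commands in this hunk differ only in `--break-system-packages`, the pip flag that permits installing into a Python installation marked as externally managed. The sketch below shows one way a helper could choose between them. It assumes that the absence of a virtual environment is what makes the flag necessary; the README does not say how a HaloBox is detected, and `in_virtualenv` and `build_command` are hypothetical names.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter is running inside a venv or virtualenv."""
    # Inside a virtual environment sys.prefix points at the environment,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def build_command(use_venv=None):
    """Return the pip argument list for building the extension in the current directory."""
    if use_venv is None:
        use_venv = in_virtualenv()
    cmd = [sys.executable, "-m", "pip", "install", "--no-build-isolation", "-v", "."]
    if not use_venv:
        # Assumption: without a venv the target is the system Python,
        # so pip needs the explicit override to install into it.
        cmd.append("--break-system-packages")
    return cmd


if __name__ == "__main__":
    print(" ".join(build_command()))
```

The returned list can be passed to `subprocess.run` unchanged; building the command as a list avoids shell quoting problems on both Linux and Windows.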
@@ -530,12 +573,14 @@ Average GPU Utilization: 55.00%
 The full manual path: write the kernel and Python binding in a `.cu` file, compile it as a native extension, then import and call it from Python. Mirrors the structure of `add_one_kernel.cu` exactly, only the kernel signature and launcher logic differ.
 
 **Files:**
+#### Windows
 <!-- @os:windows -->
 | File | Role |
 |---|---|
 | [matmul_kernel.cu](assets/Matrix_Multiplication/matmul_kernel.cu) | Kernel + launcher + pybind11 binding |
 | [setup.py](assets/Matrix_Multiplication/setup.py) | Build script, uses `CUDAExtension` to compile the `.cu` into a `.pyd` |
 <!-- @os:end -->
+#### Linux
 <!-- @os:linux -->
 | File | Role |
 |---|---|
@@ -591,16 +636,26 @@ Compared to `add_one_launcher` in Walkthrough 1, the launcher here:
 
 **Step 2: Build**
 
+<!-- @os:halobox -->
+#### HaloBox
+```bash
+pip install --no-build-isolation -v . --break-system-packages
+```
+<!-- @os:end -->
+
+#### Linux & Windows
 ```bash
 pip install --no-build-isolation -v .
 ```
 
 Produces these in the same directory:
 <!-- @os:windows -->
+#### Windows
 - `build/`: directory with the `.pyd` files
 - `matmul_kernel.hip`: the HIP source generated by hipifying the `.cu` file; this is what `hipcc` actually compiled
 <!-- @os:end -->
 <!-- @os:linux -->
+#### Linux
 - `build/`: directory with the `.so` files
 - `matmul_kernel.hip`: the HIP source generated by hipifying the `.cu` file; this is what `hipcc` actually compiled
 <!-- @os:end -->
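
The build output is listed as `.pyd` files on Windows and `.so` files on Linux. As a small illustration, the standard library can report which suffix the running interpreter expects for compiled extension modules. This sketch only queries the interpreter and does not inspect `build/`; the helper name `native_extension_suffix` is hypothetical.

```python
import sysconfig


def native_extension_suffix():
    """Return the filename suffix this interpreter uses for compiled extension modules."""
    # EXT_SUFFIX ends in ".so" on Linux and ".pyd" on Windows, usually
    # preceded by an interpreter and platform tag.
    return sysconfig.get_config_var("EXT_SUFFIX")


if __name__ == "__main__":
    print("Look under build/ for files ending in", native_extension_suffix())
```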
