Commit 50cef42
Fix BNB_WARP_SIZE detection for HIP host compilation pass

__GFX9__ is only defined during the device compilation pass, not
during host compilation. This caused BNB_WARP_SIZE to be 32 on
the host pass even for gfx942 (CDNA, warp=64), making the
conditional WARP_TRANSPOSE vs DIRECT selection wrong.

Use __AMDGCN_WAVEFRONT_SIZE instead, which the HIP compiler
defines correctly in both host and device passes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

1 parent 0b33411
1 file changed: +5 -4 lines
Old lines 15-17 replaced by new lines 15-18; old line 19 replaced by new line 20.
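
The commit message describes switching the warp-size detection from an architecture macro to a wavefront-size macro. A minimal sketch of that idea follows; it is not the actual diff from this commit. The helper names `WARP_SIZE_FROM_ARCH`, `warp_size_from_arch`, and `bnb_warp_size`, and the fallback value of 32 when `__AMDGCN_WAVEFRONT_SIZE` is undefined, are assumptions made for illustration.

```cpp
// Hypothetical sketch of the approach described in the commit message.
// It is NOT the code changed by this commit.

// Fragile variant: __GFX9__ is an architecture macro that is set only while
// compiling device code, so the host pass falls through to 32.
#if defined(__GFX9__)
#define WARP_SIZE_FROM_ARCH 64
#else
#define WARP_SIZE_FROM_ARCH 32
#endif

// Variant described in the commit: take the value from
// __AMDGCN_WAVEFRONT_SIZE whenever the compiler defines it.
#if defined(__AMDGCN_WAVEFRONT_SIZE)
#define BNB_WARP_SIZE __AMDGCN_WAVEFRONT_SIZE
#else
#define BNB_WARP_SIZE 32  // assumed fallback (e.g. a non-HIP compiler)
#endif

// Catch an unexpected value at compile time in every pass.
static_assert(BNB_WARP_SIZE == 32 || BNB_WARP_SIZE == 64,
              "unexpected warp size");

// Small accessors so both variants can be compared in ordinary C++ code.
constexpr int warp_size_from_arch() { return WARP_SIZE_FROM_ARCH; }
constexpr int bnb_warp_size() { return BNB_WARP_SIZE; }
```

Any compile-time choice that depends on the warp size, such as the WARP_TRANSPOSE vs DIRECT selection mentioned above, needs the host and device passes to see the same value, which is why the macro has to be defined in both.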