Commit 9950ef9
ggml-cuda : fix CDNA2 compute capability constant for gfx90a (MI210)
GGML_CUDA_CC_CDNA2 was set to 0x910 which corresponds to gfx910 (RDNA3),
not gfx90a (CDNA2/MI210). This caused CDNA2 GPUs to be misidentified,
skipping CDNA2-specific code paths such as MFMA acc register renaming.
Fix by setting the constant to 0x90a to match the actual gfx90a ISA.
1 parent c08d28d commit 9950ef9
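To illustrate the mismatch the commit message describes, here is a minimal sketch and not the actual ggml-cuda code. It assumes the device id is obtained by reading the part of the architecture name after `gfx` as a hexadecimal number, and that the CDNA2 code path is gated by a "greater than or equal" comparison against the constant. The names `parse_gfx_id`, `takes_cdna2_path`, `CC_CDNA2_BEFORE`, `CC_CDNA2_AFTER`, and `check_gfx90a_example` are hypothetical; only the values 0x910 and 0x90a come from the commit message.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Values taken from the commit message; the constant names are hypothetical.
constexpr int CC_CDNA2_BEFORE = 0x910; // value before this commit
constexpr int CC_CDNA2_AFTER  = 0x90a; // value after this commit

// Hypothetical helper: read the part after "gfx" as a hexadecimal number,
// so "gfx90a" becomes 0x90a. Returns -1 if the "gfx" prefix is missing.
static int parse_gfx_id(const std::string & arch) {
    if (arch.compare(0, 3, "gfx") != 0) {
        return -1;
    }
    // strtol stops at the first non-hex character, for example the ':' in
    // "gfx90a:sramecc+:xnack-".
    return static_cast<int>(std::strtol(arch.c_str() + 3, nullptr, 16));
}

// Assumed gating rule: a device takes the CDNA2 path when its id is at
// least the CDNA2 constant.
static bool takes_cdna2_path(int device_id, int cdna2_constant) {
    return device_id >= cdna2_constant;
}

// Small self-check of the behaviour described in the commit message.
static void check_gfx90a_example() {
    const int id = parse_gfx_id("gfx90a");
    assert(id == 0x90a);
    assert(!takes_cdna2_path(id, CC_CDNA2_BEFORE)); // 0x90a < 0x910: path skipped
    assert(takes_cdna2_path(id, CC_CDNA2_AFTER));   // 0x90a >= 0x90a: path taken
}
```

Under these assumptions, calling `check_gfx90a_example()` passes: a gfx90a device fails the comparison against 0x910 and passes it against 0x90a, which matches the skipped code path described above.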
1 file changed, +1 -1 lines changed

| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 65 | 65 | |
| 66 | 66 | |
| 67 | 67 | |
| 68 | | - |
| | 68 | + |
| 69 | 69 | |
| 70 | 70 | |
| 71 | 71 | |