Commit bb0e0eb

docs: update hardware support table (#2842)
* docs: update hardware support table
* docs: trim npu support note
1 parent b0f9680 commit bb0e0eb

1 file changed

Lines changed: 10 additions & 8 deletions

README.md

@@ -266,16 +266,18 @@ Prism Bonsai GGUF checkpoints are supported for inference only through GPT-QMode
 
 GPT-QModel is validated on Linux, macOS, and Windows 11:
 
-| Platform | Device | | Optimized Arch | Kernels |
-|-----------------|---------------| --- | ------------ |-----------------------------------------------|
-| 🐧 Linux | NVIDIA GPU || `Turing+` | Marlin, Exllama V2, Exllama V1, Triton, Torch |
-| 🐧 Linux | AMD GPU || `7900XT+`, `ROCm 6.2+` | Exllama V2, Exllama V1, Torch |
-| 🐧 Linux | Intel XPU || `Arc`, `Datacenter Max` | TorchFused, TorchFusedAWQ, Torch |
-| 🐧 Linux | Intel/AMD CPU || `avx`, `amx` | TorchFused, TorchFusedAWQ, Torch |
-| 🍎 macOS | GPU (Metal) / CPU || `Apple Silicon`, `M1+` | Torch, MLX via conversion |
-| 🪟 Windows | GPU (NVIDIA) / CPU || `NVIDIA` | Torch |
+| Platform | Device | | Optimized Arch | Kernels |
+|---|---|---|---|---|
+| 🐧 Linux | NVIDIA GPU || `Turing+` (`sm_75+`) | Machete, Marlin, Exllama V3 / EXL3, Exllama V2, AWQ GEMM/GEMV, ParoQuant CUDA/Triton, GGUF CUDA/Triton, QQQ, BitBLAS, Triton, BitsAndBytes, Torch |
+| 🐧 Linux | AMD GPU || `7900XT+`, `ROCm 6.2+` | Exllama V2, AWQ GEMM/GEMV, QQQ, FP8 Torch, Torch |
+| 🐧 Linux | Huawei Ascend NPU || `Ascend 910B`, `torch-npu` / `CANN` | Native Torch kernels for GPTQ, AWQ, ParoQuant, GGUF, QQQ, and EXL3 |
+| 🐧 Linux | Intel XPU || `Arc`, `Datacenter Max` | TorchFused, TorchFusedAWQ, FP8 Torch, Torch |
+| 🐧 Linux | Intel/AMD CPU || `avx`, `amx` | TorchFused, TorchFusedAWQ, TorchAten int4, TorchInt8, GGUF C++, BitsAndBytes, Torch |
+| 🍎 macOS | GPU (Metal) / CPU || `Apple Silicon`, `M1+` | Torch, FP8 Torch, MLX via conversion |
+| 🪟 Windows | GPU (NVIDIA) / CPU || `NVIDIA` | Torch |
 
 `Marlin` and JIT CUDA kernels now support NVIDIA `Turing+` (`sm_75+`) GPUs.
+Huawei Ascend NPU support uses native Torch kernels through `torch-npu` / `CANN`.
 
 
 ## Install
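
Reviewer note on the `Turing+` (`sm_75+`) floor in the table: the check below is a minimal sketch of how that threshold can be read, not code from GPT-QModel. The names `MIN_SM`, `meets_turing_floor`, and `sm_label` are hypothetical. It assumes the device capability is available as a `(major, minor)` tuple, which is the shape `torch.cuda.get_device_capability()` returns on a machine with PyTorch and a CUDA device.

```python
from typing import Tuple

# Turing corresponds to compute capability 7.5, written sm_75.
# MIN_SM is a hypothetical constant for this sketch, not a GPT-QModel name.
MIN_SM: Tuple[int, int] = (7, 5)


def meets_turing_floor(capability: Tuple[int, int]) -> bool:
    """Return True when a (major, minor) compute capability is sm_75 or newer."""
    # Tuple comparison is lexicographic: major first, then minor.
    return tuple(capability) >= MIN_SM


def sm_label(capability: Tuple[int, int]) -> str:
    """Format a (major, minor) capability in the sm_XY style used in the table."""
    major, minor = capability
    return f"sm_{major}{minor}"


if __name__ == "__main__":
    # Sample capabilities: Volta (7.0), Turing (7.5), Ampere (8.6).
    for cap in [(7, 0), (7, 5), (8, 6)]:
        print(sm_label(cap), meets_turing_floor(cap))
```

Comparing tuples rather than a single concatenated integer keeps the ordering correct if a minor version ever reaches two digits.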
