docs: update hardware support table (#2842)

Qubitium · web-flow · commit bb0e0ebc9960 · 2026-04-29T12:22:42.000+08:00
* docs: update hardware support table

* docs: trim npu support note
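
Note on the `Turing+` (`sm_75+`) floor in the updated table: a minimal sketch of how a user might check a local GPU against it, assuming `sm_75` corresponds to the CUDA compute capability tuple `(7, 5)`. The helper names `meets_turing_floor` and `local_gpu_meets_floor` are hypothetical and are not part of GPT-QModel; only `torch.cuda.is_available` and `torch.cuda.get_device_capability` are real PyTorch calls.

```python
from typing import Tuple

# Assumption: the `sm_75` floor maps to compute capability (major=7, minor=5).
MIN_CAPABILITY: Tuple[int, int] = (7, 5)


def meets_turing_floor(capability: Tuple[int, int]) -> bool:
    """Return True if a (major, minor) compute capability is sm_75 or newer."""
    return tuple(capability) >= MIN_CAPABILITY


def local_gpu_meets_floor() -> bool:
    """Check the first visible CUDA device; False if PyTorch or CUDA is missing."""
    try:
        import torch  # imported lazily so the pure helper works without PyTorch
    except ImportError:
        return False
    if not torch.cuda.is_available():
        return False
    return meets_turing_floor(torch.cuda.get_device_capability(0))
```

This only compares capability numbers; it does not confirm that any particular kernel listed in the table is built or selected at runtime.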
diff --git a/README.md b/README.md
@@ -266,16 +266,18 @@ Prism Bonsai GGUF checkpoints are supported for inference only through GPT-QMode
 
 GPT-QModel is validated on Linux, macOS, and Windows 11:
 
-| Platform        | Device        |     |  Optimized Arch          | Kernels                                       |
-|-----------------|---------------| --- | ------------ |-----------------------------------------------| 
-| 🐧 Linux           | NVIDIA GPU    | ✅       | `Turing+` | Marlin, Exllama V2, Exllama V1, Triton, Torch |
-| 🐧 Linux | AMD GPU     | ✅             |   `7900XT+`,  `ROCm 6.2+` | Exllama V2, Exllama V1, Torch                 |
-| 🐧 Linux | Intel XPU     | ✅             |   `Arc`, `Datacenter Max` | TorchFused, TorchFusedAWQ, Torch              |
-| 🐧 Linux           | Intel/AMD CPU | ✅          | `avx`, `amx` | TorchFused, TorchFusedAWQ, Torch                           |
-| 🍎 macOS | GPU (Metal) / CPU          | ✅             |   `Apple Silicon`, `M1+` | Torch, MLX via conversion                     |
-| 🪟 Windows | GPU (NVIDIA) / CPU       | ✅             |   `NVIDIA`  | Torch                                         |
+| Platform | Device |  | Optimized Arch | Kernels |
+|---|---|---|---|---|
+| 🐧 Linux | NVIDIA GPU | ✅ | `Turing+` (`sm_75+`) | Machete, Marlin, Exllama V3 / EXL3, Exllama V2, AWQ GEMM/GEMV, ParoQuant CUDA/Triton, GGUF CUDA/Triton, QQQ, BitBLAS, Triton, BitsAndBytes, Torch |
+| 🐧 Linux | AMD GPU | ✅ | `7900XT+`, `ROCm 6.2+` | Exllama V2, AWQ GEMM/GEMV, QQQ, FP8 Torch, Torch |
+| 🐧 Linux | Huawei Ascend NPU | ✅ | `Ascend 910B`, `torch-npu` / `CANN` | Native Torch kernels for GPTQ, AWQ, ParoQuant, GGUF, QQQ, and EXL3 |
+| 🐧 Linux | Intel XPU | ✅ | `Arc`, `Datacenter Max` | TorchFused, TorchFusedAWQ, FP8 Torch, Torch |
+| 🐧 Linux | Intel/AMD CPU | ✅ | `avx`, `amx` | TorchFused, TorchFusedAWQ, TorchAten int4, TorchInt8, GGUF C++, BitsAndBytes, Torch |
+| 🍎 macOS | GPU (Metal) / CPU | ✅ | `Apple Silicon`, `M1+` | Torch, FP8 Torch, MLX via conversion |
+| 🪟 Windows | GPU (NVIDIA) / CPU | ✅ | `NVIDIA` | Torch |
 
 `Marlin` and JIT CUDA kernels now support NVIDIA `Turing+` (`sm_75+`) GPUs.
+Huawei Ascend NPU support uses native Torch kernels through `torch-npu` / `CANN`.
 
 
 ## Install