Commit 96df945
fix(wasm): update INT4 quantization to use matmul_nbits_quantizer API
The onnxruntime API changed from matmul_4bits_quantizer to matmul_nbits_quantizer
with more generic n-bit quantization support.
API Changes:
- matmul_4bits_quantizer → matmul_nbits_quantizer module
- DefaultWeightOnlyQuantConfig → RTNWeightOnlyQuantConfig
- MatMul4BitsQuantizer → MatMulNBitsQuantizer
This fixes the ImportError: cannot import name 'matmul_4bits_quantizer' that
was preventing AI model INT4 quantization from working with onnxruntime >= 1.20.1.

Parent: 2348268
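The renames above can be handled with a small compatibility shim so the script works on both old and new onnxruntime releases. This is a hedged sketch: the module and class names come from the commit message, and the fallback structure (and the `QuantConfig`/`Quantizer` alias names) are illustrative, not part of the original change.

```python
# Compatibility shim for the onnxruntime INT4 quantization API rename.
# onnxruntime >= 1.20 ships matmul_nbits_quantizer (generic n-bit support);
# older releases ship the 4-bit-specific matmul_4bits_quantizer module.
try:
    from onnxruntime.quantization import matmul_nbits_quantizer as _q
    QuantConfig = _q.RTNWeightOnlyQuantConfig
    Quantizer = _q.MatMulNBitsQuantizer
except ImportError:
    try:
        from onnxruntime.quantization import matmul_4bits_quantizer as _q
        QuantConfig = _q.DefaultWeightOnlyQuantConfig
        Quantizer = _q.MatMul4BitsQuantizer
    except ImportError:
        # onnxruntime is not installed at all
        QuantConfig = Quantizer = None
```

With the shim in place, the rest of the quantization script can refer to `QuantConfig` and `Quantizer` without caring which onnxruntime version is installed.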
File tree (2 files changed: +5 −4 lines)
- .github/workflows
- packages/models/scripts