Commit 62d3e69

add int4_gptaq benchmark (#51)
1 parent b7c43b4 commit 62d3e69

1 file changed

Lines changed: 33 additions & 0 deletions

docs/source/performance/quantization/benchmarks.md

@@ -330,3 +330,36 @@ `FP8-Block-Wise` and `W4A8-FP8` of the DeepSeek-R1-0528 model on `GPQA Diamond`, `AIME
 +-------------------------------+--------------+---------+---------+---------+

 ```
+
+
+## INT4-GPTAQ
+
+The evaluation results of INT4-GPTAQ on `GSM8K`, `HUMANEVAL`, and `GPQA Diamond` are as follows:
+
+```{eval-rst}
+.. table::
+   :align: center
+   :name: table-INT4-GPTAQ-performance
+
+   +-----------+--------------+-------+-----------+--------------+
+   | Model     | Quantization | GSM8K | HUMANEVAL | GPQA Diamond |
+   +===========+==============+=======+===========+==============+
+   | Qwen3-4B  | BF16         | 85.37 | 72.56     | 37.88        |
+   +           +--------------+-------+-----------+--------------+
+   |           | INT4-GPTQ    | 81.65 | 61.59     | 35.35        |
+   +           +--------------+-------+-----------+--------------+
+   |           | INT4-GPTAQ   | 82.56 | 64.02     | 39.39        |
+   +-----------+--------------+-------+-----------+--------------+
+   | Qwen3-8B  | BF16         | 87.79 | 63.41     | 32.32        |
+   +           +--------------+-------+-----------+--------------+
+   |           | INT4-GPTQ    | 86.43 | 62.20     | 34.85        |
+   +           +--------------+-------+-----------+--------------+
+   |           | INT4-GPTAQ   | 86.66 | 64.02     | 33.33        |
+   +-----------+--------------+-------+-----------+--------------+
+   | Qwen3-32B | BF16         | 74.53 | 37.80     | 40.40        |
+   +           +--------------+-------+-----------+--------------+
+   |           | INT4-GPTQ    | 65.58 | 43.29     | 40.40        |
+   +           +--------------+-------+-----------+--------------+
+   |           | INT4-GPTAQ   | 69.52 | 37.20     | -            |
+   +-----------+--------------+-------+-----------+--------------+
+```
