Commit e8c90e9
Reject QDQ Gemm→QGemm fusion when alpha != 1 with bias (#28130)

The Gemm→QGemm QDQ fusion selector only validated beta == 1, so Gemms
with alpha != 1 and a bias were still fused. QGemm broadcasts the int32
bias into the accumulator before applying the alpha*sa*sb output scale,
so the bias is scaled by alpha as well, which produces incorrect outputs
when alpha != 1 (a zero bias masks the issue).
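
The arithmetic behind the bug can be checked with a small numeric sketch. This is a simplified model and not the QGemm kernel: it assumes zero points of 0, a single output element, a float output, and a bias quantized with scale sa*sb. The function names are hypothetical.

```python
def gemm_reference(a, b, c, alpha=1.0, beta=1.0):
    """Float Gemm for one output element: alpha * dot(a, b) + beta * c."""
    return alpha * sum(x * y for x, y in zip(a, b)) + beta * c


def qgemm_model(qa, qb, q_bias, sa, sb, alpha=1.0):
    """Simplified fused path (assumption: zero points are 0).

    The int32 bias is added to the accumulator first, then the whole
    accumulator is multiplied by the alpha*sa*sb output scale.
    """
    acc = sum(x * y for x, y in zip(qa, qb)) + q_bias
    return alpha * sa * sb * acc


# Scales chosen as powers of two so every value is exactly representable.
sa, sb = 0.5, 0.25
qa, qb = [2, 4], [8, 4]
a = [q * sa for q in qa]          # dequantized A row
b = [q * sb for q in qb]          # dequantized B column
bias = 1.0
q_bias = round(bias / (sa * sb))  # bias quantized with scale sa*sb

for alpha in (1.0, 2.0):
    ref = gemm_reference(a, b, bias, alpha=alpha)
    fused = qgemm_model(qa, qb, q_bias, sa, sb, alpha=alpha)
    print(alpha, ref, fused)
# prints:
# 1.0 5.0 5.0
# 2.0 9.0 10.0
```

With alpha == 2 the modeled fused result is off by (alpha - 1) * bias, and the gap disappears when the bias is zero, matching the description above.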

Add an alpha == 1 check alongside the existing beta == 1 check in
GemmNodeGroupSelector::Check. The check applies only when a bias is
present, since without a bias the fused path is still exact. Extend
QDQTransformerGemmTests and the fastmath variant with an alpha_not_one
parameter so the regression is covered.
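
The selection rule described above can be modeled as a small predicate. This is a sketch of the logic only, not the C++ in GemmNodeGroupSelector::Check; `can_fuse_gemm` and its parameters are hypothetical, and it assumes the beta == 1 check is also applied only when a bias is present, which the commit message does not state.

```python
def can_fuse_gemm(alpha, beta, has_bias):
    """Hypothetical model of the fusion decision for a QDQ Gemm group."""
    if not has_bias:
        # Without a bias, alpha only scales the product, so the fused
        # path stays exact (assumption: beta is irrelevant here too).
        return True
    # With a bias, both attributes must be exactly 1 for the fusion
    # to reproduce alpha * A @ B + beta * C.
    return alpha == 1.0 and beta == 1.0


cases = [
    (1.0, 1.0, True),   # fusable
    (2.0, 1.0, True),   # rejected by the new alpha check
    (1.0, 0.5, True),   # rejected by the existing beta check
    (2.0, 1.0, False),  # no bias, still fusable
]
for alpha, beta, has_bias in cases:
    print(alpha, beta, has_bias, can_fuse_gemm(alpha, beta, has_bias))
```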

Follow-up tracked in the issue: absorb alpha into the int32 bias in
GemmReplaceWithQuant so alpha != 1 cases can keep the fusion.

1 parent 0e72188 commit e8c90e9
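
The follow-up named in the commit message (absorbing alpha into the int32 bias) can be sketched as follows. This illustrates the idea under the same simplifications as before (zero points of 0, one output element, float output). It is not the planned GemmReplaceWithQuant change, and all names are hypothetical.

```python
def quantize_bias(bias, sa, sb, alpha=1.0):
    """Quantize the bias with scale alpha*sa*sb instead of sa*sb.

    Assumes alpha != 0. Dividing by alpha here cancels the later
    multiplication by alpha in the output scale.
    """
    return round(bias / (alpha * sa * sb))


def qgemm_model(qa, qb, q_bias, sa, sb, alpha=1.0):
    """Simplified fused path: add int32 bias, then apply alpha*sa*sb."""
    acc = sum(x * y for x, y in zip(qa, qb)) + q_bias
    return alpha * sa * sb * acc


sa, sb, alpha = 0.5, 0.25, 2.0
qa, qb = [2, 4], [8, 4]
bias = 1.0

# Float reference: alpha * dot(a, b) + bias, with a = qa*sa and b = qb*sb.
reference = alpha * sum((x * sa) * (y * sb) for x, y in zip(qa, qb)) + bias

naive = qgemm_model(qa, qb, quantize_bias(bias, sa, sb), sa, sb, alpha)
absorbed = qgemm_model(qa, qb, quantize_bias(bias, sa, sb, alpha), sa, sb, alpha)
print(reference, naive, absorbed)
# prints: 9.0 10.0 9.0
```

In this example the rescaled bias is exactly representable. In general the rounding step changes, so the quantized bias may differ by a rounding error from the alpha == 1 case.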
3 files changed
Lines changed: 31 additions & 4 deletions
File tree
- onnxruntime
  - core/optimizer/qdq_transformer/selectors_actions
  - test/optimizer
Lines changed per file:
- 7 additions & 0 deletions
- 12 additions & 2 deletions
- 12 additions & 2 deletions