Commit e010a75
[GAP9] gate NE16AdjustGEMMWeightLayoutPass on engine == "NE16"

The pass was added during the NE16 Linear PR integration (6c8ae2b) and
matches every Gemm/RequantizedGemm node without checking the engine
attribute, so cluster-bound GEMMs (e.g. MLPerf AnomalyDetection's 10
Gemm+RQ layers — they never run on NE16) had their mul/bias rewritten
into NE16 scale/scale_n/shift-diff layout. The cluster pulp_nn_linear
kernel then consumed the rewritten constants under its original integer
contract and produced ±1 mismatches versus the int8 reference outputs.
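
For a sense of scale, a difference of exactly one output unit is what a half-divisor rounding term (the `bias += div/2` compensation) contributes in integer requantization. The sketch below only illustrates that arithmetic. It assumes a power-of-two divisor and a requantization of the form `(acc * mul + bias) >> shift`; the function names and constants are hypothetical, and it neither models the NE16 scale/scale_n/shift-diff layout nor claims to reproduce the exact failure.

```python
def requant_floor(acc: int, mul: int, bias: int, shift: int) -> int:
    """Requantize without rounding compensation (floors toward -inf)."""
    return (acc * mul + bias) >> shift


def requant_rounded(acc: int, mul: int, bias: int, shift: int) -> int:
    """Requantize with the bias += div/2 compensation (round half up)."""
    div = 1 << shift  # assumption: the divisor is a power of two
    return (acc * mul + bias + div // 2) >> shift


# Illustrative constants only; not taken from any real model.
MUL, BIAS, SHIFT = 3, 5, 4

# Collect the per-input difference between the two contracts over int8 inputs.
diffs = {
    requant_rounded(a, MUL, BIAS, SHIFT) - requant_floor(a, MUL, BIAS, SHIFT)
    for a in range(-128, 128)
}
print(sorted(diffs))  # the two variants agree or differ by exactly one
```

Constants prepared for one contract and consumed under another can therefore disagree by a single unit on some inputs while matching on the rest.
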
Mirror the existing NE16AdjustWeightMemoryLayoutPass: bail out for
nodes whose engine attr isn't "NE16". Pure-GAP9 cluster Gemms keep
Deeploy's Generic + PULPGEMMRequantMergePass layout (including the
bias += div/2 rounding compensation), matching the reference.
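
The gate described above can be sketched as follows. This is a minimal, self-contained illustration and not the code from the diff: `Node`, `adjust_gemm_weight_layout`, `run_pass`, and the `layout` field are hypothetical stand-ins, and the only details taken from this commit are the matched op names (Gemm/RequantizedGemm) and the `engine == "NE16"` check.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Node:
    """Hypothetical stand-in for a graph node; not Deeploy's real class."""
    name: str
    op: str
    attrs: Dict[str, Any] = field(default_factory=dict)
    layout: str = "generic"  # placeholder for the constants' layout


GEMM_OPS = {"Gemm", "RequantizedGemm"}


def adjust_gemm_weight_layout(node: Node) -> bool:
    """Rewrite a GEMM node into NE16 layout; return True if rewritten."""
    if node.op not in GEMM_OPS:
        return False
    # The gate: leave nodes alone unless they are bound to the NE16 engine.
    # Nodes with no engine attribute are treated as non-NE16 here.
    if node.attrs.get("engine") != "NE16":
        return False
    node.layout = "ne16"  # stands in for the real mul/bias rewrite
    return True


def run_pass(nodes: List[Node]) -> List[str]:
    """Apply the gated rewrite and return the names of rewritten nodes."""
    return [n.name for n in nodes if adjust_gemm_weight_layout(n)]


nodes = [
    Node("fc_ne16", "RequantizedGemm", {"engine": "NE16"}),
    Node("fc_cluster", "RequantizedGemm", {"engine": "Cluster"}),
    Node("fc_unbound", "Gemm"),
]
rewritten = run_pass(nodes)
print(rewritten)
```

With the gate in place, only `fc_ne16` is rewritten; the cluster-bound and unbound nodes keep their original layout. The engine name `"Cluster"` is a placeholder chosen for this example.
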

gvsoc gap9.evk (Models/MLPerf/AnomalyDetection L1=64000):
- before: 33/640 errors (all ±1), Runtime 89110 cycles
- after: 0/640 errors, Runtime 79332 cycles
- devel base 3b011bb (where the bug doesn't exist): 0/640 errors, Runtime 78500 cycles

gap9_tiled L2 single-buffer models go from 9/11 to 10/11 passing. The
remaining failure (MLPerf/ImageClassification, parser backtracking on
a standalone RequantShift node) is unrelated to GEMM and pre-dates
this fix.

1 parent 438f100 commit e010a75
1 file changed
Lines changed: 7 additions & 0 deletions
Diff hunk (content not shown): lines 45-51 added; original lines 42-44 and 45-47 appear as context at new lines 42-44 and 52-54.