Commit 42a5446
Qualcomm AI Engine Direct - remove prefill calibration (#17805)
### Summary
- calibrate kv text decoder only to reduce calibration time
- deprecate outdated implementation & use deterministic example inputs for llm
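As a rough illustration of what "deterministic example inputs" can mean, the sketch below builds the same token ids on every call by using a privately seeded generator. It is not the code from this commit: `make_example_inputs`, `vocab_size`, `seq_len`, and `seed` are hypothetical names chosen for the example, and a real pipeline would more likely build tensors than Python lists.

```python
import random


def make_example_inputs(vocab_size: int, seq_len: int, seed: int = 0) -> list[int]:
    """Return a reproducible list of token ids (hypothetical helper)."""
    # A private generator keeps the result independent of global random state.
    rng = random.Random(seed)
    return [rng.randrange(vocab_size) for _ in range(seq_len)]


# Two calls with the same arguments yield identical inputs.
first = make_example_inputs(vocab_size=32000, seq_len=8)
second = make_example_inputs(vocab_size=32000, seq_len=8)
print(first == second)  # True
```

With inputs fixed this way, repeated runs feed identical data, so differences between runs are easier to attribute to code changes rather than input noise.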
**Total Quantization Time**
| Model | Before (s) | After (s) | Improvement |
| :---: | :---: | :---: | :---: |
| gemma-2b | 2203.399 | 999.512 | 54.64% |
| gemma2-2b | 2177.285 | 1001.248 | 54.01% |
| gemma3-1b | 1776.861 | 548.312 | 69.14% |
| glm-1_5b | 1434.780 | 677.257 | 52.8% |
| granite_3_3-2b | 59566.790 | 6165.443 | 89.65% |
| llama3_2-1b | 4528.620 | 2953.233 | 34.79% |
| llama3_2-3b | 5744.429 | 1652.157 | 71.24% |
| phi_4_mini | 7005.601 | 2071.634 | 70.43% |
| qwen2_5-0_5b | 480.508 | 372.076 | 22.57% |
| qwen2_5-1_5b | 2064.333 | 899.164 | 56.44% |
| qwen3-0_6b | 1673.150 | 1124.149 | 32.81% |
| qwen3-1_7b | 3253.723 | 1148.511 | 64.7% |
| smollm2_135m | 502.779 | 414.510 | 17.56% |
| smollm3-3b | 4663.057 | 1613.516 | 65.4% |
| smolvlm_500m_instruct | 288.246 | 170.829 | 40.73% |
| internvl3_1b | 256.624 | 170.811 | 33.44% |
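The Improvement column appears to be the relative reduction in quantization time, (before - after) / before. A minimal sketch for recomputing it from the timings follows; `improvement_pct` is a hypothetical helper name, and the three rows are copied from the table.

```python
def improvement_pct(before_s: float, after_s: float) -> float:
    """Percentage reduction in time relative to the original run."""
    if before_s <= 0:
        raise ValueError("before_s must be positive")
    return (before_s - after_s) / before_s * 100.0


# Timings in seconds, copied from the table.
rows = {
    "gemma-2b": (2203.399, 999.512),
    "llama3_2-3b": (5744.429, 1652.157),
    "qwen2_5-0_5b": (480.508, 372.076),
}

for model, (before_s, after_s) in rows.items():
    print(f"{model}: {improvement_pct(before_s, after_s):.2f}%")
# gemma-2b: 54.64%
# llama3_2-3b: 71.24%
# qwen2_5-0_5b: 22.57%
```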
### Test plan
`python backends/qualcomm/tests/test_qnn_delegate.py -k TestExampleLLMScript / TestExampleMultimodalityScript`
1 parent 6c02866 commit 42a5446
5 files changed
Lines changed: 130 additions & 338 deletions
File tree
- backends/qualcomm/quantizer
- examples/qualcomm/oss_scripts/llama
- runner/multimodal_runner
- wrappers
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 24 | 24 | 1 line replaced |
| 29 | | 1 line removed |
| 95-128 | | 34 lines removed |
Lines changed: 0 additions & 142 deletions
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 297-398 | | 102 lines removed |
| 602-641 | | 40 lines removed |
Lines changed: 0 additions & 10 deletions
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 144-153 | | 10 lines removed |
Lines changed: 0 additions & 31 deletions
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 45-75 | | 31 lines removed |