Qualcomm AI Engine Direct - Add fp16a8w quantization config #19537

shewu-quic wants to merge 1 commit
Conversation
Summary:
- Add a pass `insert_cast_for_fp_act_quantized_weight.py` that casts fp32 -> fp16, due to a constraint in QNN HTP
- Add a test case that runs conv2d and linear with fp16a8w
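The idea behind such a cast-insertion pass can be sketched on a toy graph IR. This is a minimal illustration in plain Python, not the actual ExecuTorch/QNN implementation; all node fields and the `insert_casts` helper are hypothetical names chosen for the example:

```python
# Illustrative sketch only: walk a toy op list and insert an fp32 -> fp16
# cast node ahead of any op that has a quantized weight but an fp32
# activation input, mimicking the intent of
# insert_cast_for_fp_act_quantized_weight.py. All names are hypothetical.

def insert_casts(nodes):
    """Return a new node list with fp32 -> fp16 casts inserted before
    ops whose weights are quantized (QNN HTP expects fp16 activations
    in that configuration)."""
    out = []
    for node in nodes:
        if node.get("weight_quantized") and node.get("act_dtype") == "fp32":
            # Insert a cast feeding this op's activation input.
            out.append({
                "op": "cast",
                "from": "fp32",
                "to": "fp16",
                "target": node["name"],
            })
            node = dict(node, act_dtype="fp16")  # op now consumes fp16
        out.append(node)
    return out

# Toy graph resembling the tested pattern: conv2d and linear with
# quantized weights, separated by a plain fp32 op.
graph = [
    {"name": "conv2d", "op": "conv2d", "weight_quantized": True, "act_dtype": "fp32"},
    {"name": "relu", "op": "relu", "act_dtype": "fp32"},
    {"name": "linear", "op": "linear", "weight_quantized": True, "act_dtype": "fp32"},
]

new_graph = insert_casts(graph)
ops = [n["op"] for n in new_graph]
# Cast nodes now precede both weight-quantized ops:
# ["cast", "conv2d", "relu", "cast", "linear"]
```

In the real pass the rewrite operates on the exported program's graph rather than a list of dicts, but the shape of the transformation is the same: detect the fp32-activation / quantized-weight mismatch and splice in a cast.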
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19537

Note: Links to docs will display an error until the docs builds have completed. ❗ There is 1 currently active SEV; if your PR is affected, please view it. ✅ No failures as of commit d69272f with merge base d939b9b. (This comment was automatically generated by Dr. CI and updates every 15 minutes.)
Hi @psiddh,

Bug Fix PR:
Op Enablement PR:
Claude Skill PR:
LLM Related PR:
Debugging Related PR:
Others:
Thanks for listing all the PRs. I will start looking at them.