Qualcomm AI Engine Direct - Add fp16a8w quantization config #19537

shewu-quic wants to merge 1 commit
Conversation
Summary:
- Add a pass `insert_cast_for_fp_act_quantized_weight.py` that casts fp32 -> fp16, due to a constraint in QNN HTP
- Add a test case that runs conv2d and linear with fp16a8w
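The idea behind such a cast-insertion pass can be sketched on a toy graph IR. This is a minimal illustration in plain Python, not the actual ExecuTorch/QNN implementation; all node fields and the `insert_casts` helper are hypothetical names chosen for the example:

```python
# Illustrative sketch only: walk a toy op list and insert an fp32 -> fp16
# cast node ahead of any op that has a quantized weight but an fp32
# activation input, mimicking the intent of
# insert_cast_for_fp_act_quantized_weight.py. All names are hypothetical.

def insert_casts(nodes):
    """Return a new node list with fp32 -> fp16 casts inserted before
    ops whose weights are quantized (QNN HTP expects fp16 activations
    in that configuration)."""
    out = []
    for node in nodes:
        if node.get("weight_quantized") and node.get("act_dtype") == "fp32":
            # Insert a cast feeding this op's activation input.
            out.append({
                "op": "cast",
                "from": "fp32",
                "to": "fp16",
                "target": node["name"],
            })
            node = dict(node, act_dtype="fp16")  # op now consumes fp16
        out.append(node)
    return out

# Toy graph resembling the tested pattern: conv2d and linear with
# quantized weights, separated by a plain fp32 op.
graph = [
    {"name": "conv2d", "op": "conv2d", "weight_quantized": True, "act_dtype": "fp32"},
    {"name": "relu", "op": "relu", "act_dtype": "fp32"},
    {"name": "linear", "op": "linear", "weight_quantized": True, "act_dtype": "fp32"},
]

new_graph = insert_casts(graph)
ops = [n["op"] for n in new_graph]
# Cast nodes now precede both weight-quantized ops:
# ["cast", "conv2d", "relu", "cast", "linear"]
```

In the real pass the rewrite operates on the exported program's graph rather than a list of dicts, but the shape of the transformation is the same: detect the fp32-activation / quantized-weight mismatch and splice in a cast.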
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19537

Note: Links to docs will display an error until the docs builds have completed. ❗ There is 1 currently active SEV; if your PR is affected, please view it. ✅ No failures as of commit d69272f with merge base d939b9b. (This comment was automatically generated by Dr. CI and updates every 15 minutes.)
Hi @psiddh,

Bug Fix PR:
Op Enablement PR:
Claude Skill PR:
LLM Related PR:
Debugging Related PR:
Others:
Thanks for listing all the PRs. I will start looking at them.