Add 16A8W quantization configuration utility for ARM backend #13893

Ninja91 wants to merge 1 commit into `gh/Ninja91/16/base`
Conversation
This diff implements a 16A8W (16-bit activations, 8-bit weights) quantization configuration utility for the ExecuTorch ARM backend, following the feedback from D79746479.

## Key Changes

**1. New Quantization Configuration Function**

- Add `get_16a8w_quantization_config()` in `fbcode/executorch/backends/arm/quantizer/arm_quantizer.py`
- Provides 16-bit activations with HistogramObserver (better precision than 8A8W)
- Maintains 8-bit weights with MinMaxObserver/PerChannelMinMaxObserver (memory efficient)
- **Supported by TOSA through the [EXT-INT16 extension/profile](https://www.mlplatform.org/tosa/tosa_spec.html#_conv2d)**

## Benefits

- **Better Precision**: 16-bit activations provide higher precision than 8-bit activations, which is useful for carrying precision through recurrent neural networks.

ghstack-source-id: 305991462
@exported-using-ghexport
@bypass-github-export-checks
@bypass-github-pytorch-ci-checks
@bypass-github-executorch-ci-checks

Differential Revision: [D81550512](https://our.internmc.facebook.com/intern/diff/D81550512/)

[ghstack-poisoned]
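To make the 16A8W scheme concrete, the sketch below computes affine quantization parameters for asymmetric int16 activations and symmetric int8 weights. This is a minimal, self-contained illustration only; it does not use the ExecuTorch/ARM quantizer API, and all function names and calibration ranges here are hypothetical.

```python
def affine_qparams(vmin, vmax, qmin, qmax, symmetric=False):
    """Compute (scale, zero_point) mapping the observed float range
    [vmin, vmax] onto the integer range [qmin, qmax]."""
    if symmetric:
        # Symmetric (typical for weights): zero_point fixed at 0,
        # range centered on zero.
        bound = max(abs(vmin), abs(vmax))
        return bound / qmax, 0
    scale = (vmax - vmin) / (qmax - qmin)
    zero_point = round(qmin - vmin / scale)
    return scale, zero_point


def quantize(x, scale, zero_point, qmin, qmax):
    """Quantize one float value and clamp to the integer range."""
    q = round(x / scale) + zero_point
    return max(qmin, min(qmax, q))


# 16-bit activations (asymmetric; a histogram-based observer would
# calibrate vmin/vmax from data -- 0.0..6.0 is an illustrative range).
act_scale, act_zp = affine_qparams(0.0, 6.0, -32768, 32767)

# 8-bit weights (symmetric per-tensor; a min/max observer would
# calibrate the range -- -0.5..0.25 is an illustrative range).
w_scale, w_zp = affine_qparams(-0.5, 0.25, -127, 127, symmetric=True)
```

The int16 grid has 256x more steps than the int8 grid over the same float range, which is the precision advantage 16A8W buys for activations while weights stay at the memory-efficient 8 bits.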
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/13893

Note: Links to docs will display an error until the docs builds have been completed.

❌ 3 New Failures, 2 Unrelated Failures as of commit 4632410 with merge base 02da205.

NEW FAILURES - The following jobs have failed:

BROKEN TRUNK - The following jobs failed but were present on the merge base. 👉 Rebase onto the `viable/strict` branch to avoid these failures.

This comment was automatically generated by Dr. CI and updates every 15 minutes.
This PR needs a
Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as
Is this still needed?
We think this is done and can be closed, please reopen if you don't agree.
cc @freddan80 @per @zingo @oscarandersson8218 @digantdesai