[ET-VK][qlinear] Look through output view_copy when detecting output quantization by SS-JIA · Pull Request #18014 · pytorch/executorch

SS-JIA · 2026-03-09T16:14:19Z

Stack from ghstack (oldest at bottom):

When `aten.linear` has 3D+ inputs, it decomposes into
`view_copy -> mm -> view_copy`. The output `view_copy` sits between `mm` and the
subsequent `quantize_per_tensor` node, which prevented the pattern matcher
from detecting output quantization, so the match fell through
to `linear_q8ta_q8csw` instead of `q8ta_linear_gemv`. That in turn caused a
dtype mismatch during FakeTensor re-tracing in `FusePatternsPass`, because
`linear_q8ta_q8csw`'s composite implementation does not dequantize its
input and so produces int8 output where float32 was expected.
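
To make the decomposition concrete, here is a minimal shape-only sketch of the `view_copy -> mm -> view_copy` chain. It assumes a bias-free linear layer whose weight is stored as `(out_features, in_features)`; the function name `decomposed_linear_shapes` is hypothetical and not part of ExecuTorch, and the real exported graph may contain extra nodes (for example a weight permute).

```python
from math import prod


def decomposed_linear_shapes(input_shape, weight_shape):
    """Return the shapes produced by each step of view_copy -> mm -> view_copy.

    Hypothetical helper for illustration only. Assumes no bias and a weight
    laid out as (out_features, in_features).
    """
    out_features, in_features = weight_shape
    if input_shape[-1] != in_features:
        raise ValueError("last input dim must equal in_features")

    # Input view_copy: collapse all leading dims so mm sees a 2D matrix.
    flattened = (prod(input_shape[:-1]), in_features)
    # mm against the transposed weight: (rows, in) x (in, out) -> (rows, out).
    mm_out = (flattened[0], out_features)
    # Output view_copy: restore the original leading dims.
    restored = (*input_shape[:-1], out_features)
    return [flattened, mm_out, restored]


shapes = decomposed_linear_shapes((2, 3, 8), (16, 8))
print(shapes)
```

The last entry is the shape the following `quantize_per_tensor` node would see, which is why that node consumes the output `view_copy` rather than `mm` directly.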

Mirror the existing input-side `view_copy` handling (lines 99-104) on the
output side so that the quantize node is found through the `view_copy`.
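
The look-through idea can be sketched as follows. This is not the code from this PR: `Node`, `link`, and `find_output_quantize` are hypothetical stand-ins for a graph node with an op name and a list of consumers, and the single-consumer checks are an assumption about what a safe match would require.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Node:
    """Hypothetical stand-in for a graph node: an op name plus its consumers."""
    op: str
    users: List["Node"] = field(default_factory=list)


def link(*nodes: Node) -> Node:
    """Chain nodes so that each one is consumed by the next; return the first."""
    for producer, consumer in zip(nodes, nodes[1:]):
        producer.users.append(consumer)
    return nodes[0]


def find_output_quantize(mm_node: Node) -> Optional[Node]:
    """Return the quantize node fed by mm_node, looking through one view_copy."""
    if len(mm_node.users) != 1:
        return None
    user = mm_node.users[0]
    # Skip over an output view_copy, assuming it has exactly one consumer.
    if user.op == "view_copy" and len(user.users) == 1:
        user = user.users[0]
    return user if user.op == "quantize_per_tensor" else None


mm, view, quant = Node("mm"), Node("view_copy"), Node("quantize_per_tensor")
link(mm, view, quant)
print(find_output_quantize(mm) is quant)
```

Without the `view_copy` branch, the 3D+ case would return `None` and a matcher built on this helper would treat the output as unquantized, which corresponds to the fall-through described above.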

Differential Revision: D95807075

cc @manuelcandales @digantdesai @cbilgin

pytorch-bot · 2026-03-09T16:14:24Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18014

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 5 New Failures, 1 Cancelled Job

As of commit d7eda9c with merge base f09bd55:

NEW FAILURES - The following jobs have failed:

- pull / unittest-arm-backend-with-no-deps (test_pytest_models_tosa) / linux-job (gh)
  `RuntimeError: Command docker exec -t 7c5eb6e1f76841d365a0b5948a88f492f6ce7aeca8e418feb381ee5944d9e86e /exec failed with exit code 1`
- pull / unittest-editable / macos / macos-job (gh)
  `export/tests/test_target_recipes.py::TestTargetRecipes::test_mv3_model`
- Test Metal Backend / export-model-metal-artifact (mistralai, Voxtral-Mini-3B-2507, quantized-int4-metal) / macos-job (gh)
  `RuntimeError: Command bash /Users/ec2-user/runner/_work/_temp/exec_script failed with exit code 1`
- Test Metal Backend / export-model-metal-artifact (openai, whisper-large-v3-turbo, non-quantized) / macos-job (gh)
  `RuntimeError: Command bash /Users/ec2-user/runner/_work/_temp/exec_script failed with exit code 1`
- Test Metal Backend / export-model-metal-artifact (openai, whisper-large-v3-turbo, quantized-int4-metal) / macos-job (gh)
  `RuntimeError: Command bash /Users/ec2-user/runner/_work/_temp/exec_script failed with exit code 1`

CANCELLED JOB - The following job was cancelled. Please retry:

- pull / test-samsung-models-linux / linux-job (gh)
  `##[error]The operation was canceled.`

This comment was automatically generated by Dr. CI and updates every 15 minutes.

github-actions · 2026-03-09T16:15:47Z

This PR needs a `release notes:` label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with `release notes:`. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

Pull Request resolved: #18014 · ghstack-source-id: 349646653 · @exported-using-ghexport

SS-JIA mentioned this pull request Mar 9, 2026

[ET-VK][ez] Fix duplicate placeholder target in create_constant_placeholder #18013

Merged

meta-cla Bot added the CLA Signed label (this label is managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed) Mar 9, 2026

This was referenced Mar 9, 2026

[ET-VK][qdq] Support high-dimensional tensors in quantize/dequantize per tensor #18015

Merged

[ET-VK][qconv] Add q8ta_conv2d_transposed operator #18016

Merged

[ET-VK][qlinear] Add bmm support to quantized linear pattern detector #18017

Merged

meta-codesync Bot added the fb-exported and meta-exported labels Mar 9, 2026

manuelcandales approved these changes Mar 9, 2026

ssjia added 2 commits March 9, 2026 12:11

digantdesai added the module: vulkan label (issues related to the Vulkan delegate and code under backends/vulkan/) Mar 10, 2026

meta-codesync Bot merged commit 420ce2c into gh/SS-JIA/461/base Mar 10, 2026
213 of 221 checks passed

meta-codesync Bot deleted the gh/SS-JIA/461/head branch March 10, 2026 08:53

meta-codesync Bot temporarily deployed to cherry-pick-bot March 10, 2026 08:53 Inactive

pytorchbot mentioned this pull request Mar 10, 2026

[ET-VK][qlinear] Look through output view_copy when detecting output quantization #18032

Merged

meta-codesync[bot] merged 4 commits into gh/SS-JIA/461/base from gh/SS-JIA/461/head
