[ET-VK][qconv] Add software fallback for dotPacked4x8AccSatEXT in q8ta_ shaders by SS-JIA · Pull Request #17704 · pytorch/executorch

SS-JIA · 2026-02-25T14:18:07Z

Stack from ghstack (oldest at bottom):

Devices that lack VK_KHR_shader_integer_dot_product (older GPUs, emulators)
currently fail with ShaderNotSupportedError when running int8-quantized
conv2d/linear because the q8ta_ shaders unconditionally require
GL_EXT_integer_dot_product. This adds fallback SPIR-V variants that use a
pure-GLSL software implementation so those devices can still execute the
operators at a performance cost.

Approach: compile-time macro USE_INT8_DOT_PRODUCT_EXT selects the
implementation. Each affected YAML file gains a *_fallback shader variant
compiled with USE_INT8_DOT_PRODUCT_EXT=0. At C++ dispatch time,
adapter_ptr()->supports_int8_dot_product() picks the matching variant.
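
The runtime half of this approach can be sketched as below. This is a minimal illustration, not the code from the PR: `select_q8ta_variant` is a hypothetical helper, and appending `_fallback` to the end of the base name is an assumption (real shader names may carry further dtype or storage suffixes). In the PR the boolean comes from `adapter_ptr()->supports_int8_dot_product()`.

```cpp
#include <string>

// Hypothetical helper (not from the PR): choose which precompiled shader
// variant to dispatch based on a device capability flag.
// Assumption: the fallback variant is named "<base_name>_fallback".
inline std::string select_q8ta_variant(
    const std::string& base_name,
    bool has_int8_dot_product) {
  // Hardware path: the variant compiled with USE_INT8_DOT_PRODUCT_EXT=1.
  // Software path: the variant compiled with USE_INT8_DOT_PRODUCT_EXT=0.
  return has_int8_dot_product ? base_name : base_name + "_fallback";
}
```

Because both variants are compiled ahead of time, the only per-dispatch cost of this scheme is the capability check and the name lookup.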

Changes:

- common.glslh: add dotPacked4x8Acc_fallback() and dotPacked4x8AccSat() dispatch macro
- linear_fp_output_tile_int8_int8_compute.glslh: guard extension + use macro
- q8ta_conv2d/pw/linear/linear_gemv .glsl: inject USE_INT8_DOT_PRODUCT_EXT template define, guard extension, replace direct EXT calls with macro
- q8ta_conv2d/pw/linear/linear_gemv .yaml: add USE_INT8_DOT_PRODUCT_EXT parameter and *_fallback shader variants
- Q8taConv2d/PW/Linear/LinearGemv .cpp: call supports_int8_dot_product() to select hardware vs. fallback variant at runtime
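
For reference, the arithmetic a software fallback has to reproduce can be sketched on the CPU as follows. This is a host-side C++ illustration of the signed form of `dotPacked4x8AccSatEXT` (four packed signed 8-bit lanes, dot product, saturating add into a 32-bit accumulator), not the GLSL from the PR. The function names are hypothetical, and whether `dotPacked4x8Acc_fallback()` in the PR saturates is not stated in the description above.

```cpp
#include <cstdint>
#include <limits>

// Extract lane i (0 = least significant byte) of a packed word and
// sign-extend it from 8 bits to 32 bits.
inline int32_t unpack_int8_lane(uint32_t packed, int i) {
  const int32_t v = static_cast<int32_t>((packed >> (8 * i)) & 0xFFu);
  return v >= 128 ? v - 256 : v;
}

// Reference semantics: acc + sum_i(a[i] * b[i]), clamped to the int32 range.
// The sum is formed in 64 bits so the clamp sees the exact value.
inline int32_t dot_packed_4x8_acc_sat_ref(uint32_t a, uint32_t b, int32_t acc) {
  int64_t sum = acc;
  for (int i = 0; i < 4; ++i) {
    sum += static_cast<int64_t>(unpack_int8_lane(a, i)) * unpack_int8_lane(b, i);
  }
  const int64_t lo = std::numeric_limits<int32_t>::min();
  const int64_t hi = std::numeric_limits<int32_t>::max();
  if (sum < lo) return std::numeric_limits<int32_t>::min();
  if (sum > hi) return std::numeric_limits<int32_t>::max();
  return static_cast<int32_t>(sum);
}
```

A function like this is also a convenient oracle when comparing the output of the hardware and fallback shader variants on the same inputs.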

Differential Revision: D94314256


pytorch-bot · 2026-02-25T14:18:11Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/17704

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 3 New Failures, 1 Cancelled Job

As of commit 616c756 with merge base 63f9724:

NEW FAILURES - The following jobs have failed:

- Build Presets / windows (pybind) / build (gh)
- pull / android / run-emulator (gh)
  The process '/opt/android/sdk/platform-tools/adb' failed with exit code 224
- Test CUDA Builds / test-model-cuda-e2e (openai, whisper-small, non-quantized) / linux-job (gh)
  RuntimeError: Command docker exec -t 429afbe58116d55f0680b06bd104bb667a1bf1bd53d4dfb14c69538a816fe93f /exec failed with exit code 1

CANCELLED JOB - The following job was cancelled. Please retry:

- pull / unittest-buck / macos / macos-job (gh)
  ##[error]The operation was canceled.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

github-actions · 2026-02-25T14:19:00Z

This PR needs a `release notes:` label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

ghstack-source-id: 344664015
Pull Request resolved: #17704

This was referenced Feb 25, 2026

[ET-VK][qconv] Read weight buffer as int in pack_q8_conv2d_weights shader #17703

Merged

[ET-VK][testing] Add ETVK_FORCE_NO_EXTENSIONS debug flag #17705

Merged

meta-cla Bot added the CLA Signed label Feb 25, 2026

manuelcandales approved these changes Feb 25, 2026


SS-JIA merged commit 51e1a4c into gh/SS-JIA/447/base Feb 25, 2026
200 of 206 checks passed

SS-JIA deleted the gh/SS-JIA/447/head branch February 25, 2026 19:53

SS-JIA temporarily deployed to cherry-pick-bot February 25, 2026 19:53 — with GitHub Actions Inactive
