[ET-VK] Insert prepack nodes for constant primary inputs of prepacking ops#17850

Merged
meta-codesync[bot] merged 1 commit into gh/SS-JIA/459/base from gh/SS-JIA/459/head on Mar 4, 2026
Conversation

@SS-JIA (Contributor) commented Mar 4, 2026

Stack from ghstack (oldest at bottom):

The insert_prepack_nodes pass was skipping prepack node insertion for all
constant tensor args of ops with supports_prepacking=True. However, these ops
only handle prepacking for weight/bias tensors internally; the primary input
tensor is still expected to be a GPU tensor. If the primary input happens to be
a constant tensor (serialized as TensorRef), the op throws an exception at
runtime.

Fix this by detecting the primary input index directly in insert_prepack_nodes.
Most prepacking ops have the primary input at arg 0, but embedding uses arg 1
since its signature is embedding(weight, indices, ...). The pass now checks
whether a constant tensor is used as the primary input of a prepacking op, and
if so, still inserts a prepack node for it.

Differential Revision: [D95217949](https://our.internmc.facebook.com/intern/diff/D95217949/)
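The corrected pass logic described above can be sketched as follows. This is a minimal illustration, not the actual ExecuTorch Vulkan implementation: the `Node` class, the op-name strings, and the `PRIMARY_INPUT_IDX_OVERRIDES` table are simplified stand-ins; only the decision rule (skip constant args of `supports_prepacking` ops unless the constant is the op's primary input) mirrors the description.

```python
# Hypothetical sketch of the fixed insert_prepack_nodes decision logic.
# Node and the op-name strings are illustrative stand-ins, not real APIs.
from dataclasses import dataclass, field

# Ops whose primary input is not at arg 0; per the description,
# embedding(weight, indices, ...) takes its primary input at arg 1.
PRIMARY_INPUT_IDX_OVERRIDES = {"aten.embedding.default": 1}

def primary_input_idx(op_name: str) -> int:
    """Most prepacking ops have their primary input at arg 0."""
    return PRIMARY_INPUT_IDX_OVERRIDES.get(op_name, 0)

@dataclass
class Node:
    name: str
    op_name: str = ""
    is_constant: bool = False
    args: list = field(default_factory=list)

def insert_prepack_nodes(graph: list[Node],
                         supports_prepacking: set[str]) -> list[Node]:
    """Return a new node list with prepack nodes inserted where needed.

    Ops in `supports_prepacking` handle prepacking of constant weight/bias
    args internally, so those args are skipped -- EXCEPT when the constant
    is the op's primary input, which the op still expects as a GPU tensor.
    """
    out: list[Node] = []
    for node in graph:
        new_args = []
        for idx, arg in enumerate(node.args):
            needs_prepack = arg.is_constant and (
                node.op_name not in supports_prepacking
                or idx == primary_input_idx(node.op_name)
            )
            if needs_prepack:
                # Route the constant through an explicit prepack node.
                prepack = Node(name=f"{arg.name}_prepack",
                               op_name="prepack", args=[arg])
                out.append(prepack)
                new_args.append(prepack)
            else:
                new_args.append(arg)
        node.args = new_args
        out.append(node)
    return out
```

For example, an `embedding` call with a constant weight (arg 0, handled internally) and a constant indices tensor (arg 1, the primary input) would get a prepack node only for the indices, which is exactly the case the original pass missed.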

@pytorch-bot Bot commented Mar 4, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/17850

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 4 Unrelated Failures

As of commit 2d65b78 with merge base 1a75394:

NEW FAILURE - The following job has failed:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

SS-JIA pushed a commit that referenced this pull request Mar 4, 2026
ghstack-source-id: 347411473
Pull Request resolved: #17850
@meta-cla Bot added the CLA Signed label Mar 4, 2026
@github-actions Bot commented Mar 4, 2026

This PR needs a release notes: label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

@meta-codesync meta-codesync Bot merged commit 8c4d7a9 into gh/SS-JIA/459/base Mar 4, 2026
208 of 224 checks passed
@meta-codesync meta-codesync Bot deleted the gh/SS-JIA/459/head branch March 4, 2026 23:44
@meta-codesync meta-codesync Bot temporarily deployed to cherry-pick-bot March 4, 2026 23:44 Inactive
SS-JIA pushed a commit that referenced this pull request Mar 5, 2026
SS-JIA pushed a commit that referenced this pull request Mar 5, 2026
SS-JIA pushed a commit that referenced this pull request Mar 5, 2026
jpiat pushed a commit to jpiat/executorch that referenced this pull request Mar 17, 2026

Labels

CLA Signed, fb-exported, meta-exported
