Fix Conv1d w8a32 operator (#16607)
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/16607
Note: Links to docs will display an error until the docs builds have been completed.
✅ You can merge normally! (2 Unrelated Failures) As of commit d041e3f with merge base b5cf3c3.
BROKEN TRUNK - The following jobs failed but were present on the merge base. 👉 Rebase onto the `viable/strict` branch to avoid these failures.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
This PR needs a
Force-pushed 727b730 to 5d8c391
Summary: This diff fixes the Conv1d w8a32 operator by adding a transformation to the `val` attribute of the `other_inputs[0].meta` dictionary. Specifically, the `permute` operation is applied to the `original_val` tensor under the `fake_mode` context, and the resulting `transposed_val` is assigned to `transposed_inputs.meta["val"]`.
Reviewed By: mcremon-meta
Differential Revision: D89863750
Pull request overview
This PR fixes the Conv1d w8a32 operator by adding metadata propagation for transposed tensors and adding input validation to prevent unsupported configurations.
Changes:
- Added metadata propagation for the `val` attribute when creating transposed inputs and weights in the Conv1d w8a32 operator
- Added validation in patterns.py to bail early when the input length doesn't equal the kernel size (marked as "not yet supported")
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| backends/cadence/aot/quantizer/fusion_pass.py | Adds proper fake_mode-aware metadata propagation for transposed_inputs and transposed_weights when transforming Conv1d tensors from NCL to NLC format |
| backends/cadence/aot/quantizer/patterns.py | Adds validation to reject Conv1d operations where input length doesn't equal kernel size (3), marking this configuration as not yet supported |
```python
# Bail if length != kernel size - Not yet supported
if inputs_shape[-1] != cnn_weights_shape[2]:
    return (
        PartitionAnchors(
            empty=True,
        ),
        conv_layer,
    )
```
This check restricts the w8a32_conv pattern to matching only when the input length equals the kernel size (3). While the comment indicates this is intentionally not yet supported, it is quite restrictive: standard convolution operations support any input length greater than or equal to the kernel size. The reference implementation in ref_implementations.py (lines 926-970) and the test in test_ref_implementations.py (lines 1156-1166, which use length=5 with kernel=3) both support arbitrary input lengths. Consider whether this restriction is necessary, or whether it could be relaxed to input_length >= kernel_size to enable the optimization in more cases.
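The arithmetic behind this comment can be sketched with the standard Conv1d output-length formula (stdlib-only, no torch; `conv1d_output_length` is a hypothetical helper for illustration):

```python
# Standard Conv1d (no padding/dilation): output length is (L - K) // stride + 1,
# so any L >= K is valid, and L == K is just the degenerate single-output case.
def conv1d_output_length(L: int, K: int, stride: int = 1) -> int:
    assert L >= K, "input shorter than kernel is unsupported"
    return (L - K) // stride + 1

print(conv1d_output_length(5, 3))  # 3  (the L=5, K=3 case from the tests)
print(conv1d_output_length(3, 3))  # 1  (the only case the guard allows)
```

Under this formula the `!=` guard rejects every multi-output convolution, which is what the comment is pointing out.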
```python
# Propagate val metadata for transposed_inputs
if "val" in other_inputs[0].meta:
    original_val = other_inputs[0].meta["val"]
    fake_mode = original_val.fake_mode
    if fake_mode is not None:
        with fake_mode:
            transposed_val = torch.ops.aten.permute.default(
                original_val, [0, 2, 1]
            )
        transposed_inputs.meta["val"] = transposed_val
    else:
        transposed_inputs.meta["val"] = torch.ops.aten.permute.default(
            original_val, [0, 2, 1]
        )
    copy_node_metadata(transposed_inputs, other_inputs[0])

transposed_weights = graph_module.graph.call_function(
    torch.ops.aten.permute.default,
    (weights_inputs[0], [2, 0, 1]),  # NCL -> LNC
)
# Propagate val metadata for transposed_weights
if "val" in weights_inputs[0].meta:
    original_val = weights_inputs[0].meta["val"]
    fake_mode = original_val.fake_mode
    if fake_mode is not None:
        with fake_mode:
            transposed_val = torch.ops.aten.permute.default(
                original_val, [2, 0, 1]
            )
        transposed_weights.meta["val"] = transposed_val
    else:
        transposed_weights.meta["val"] = torch.ops.aten.permute.default(
            original_val, [2, 0, 1]
        )
```
The metadata propagation logic for transposed_inputs (lines 435-448) and transposed_weights (lines 455-468) is duplicated with only minor variations. This pattern also appears elsewhere in the codebase (e.g., lines 164-176, 189-200, 376-385, 653-671). Consider extracting this into a helper function to reduce code duplication and improve maintainability. The helper function could take parameters like the node, transformation operation, and transformation arguments.
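The suggested helper could look roughly like this. This is a sketch, not the repo's code: `Node` is a stdlib-only stand-in for `torch.fx.Node` (only its `meta` dict matters here), and `propagate_val_meta` is a hypothetical name; in the real pass the nodes, fake mode, and `torch.ops.aten.permute.default` come from the surrounding fusion code.

```python
from typing import Any, Callable, Sequence


class Node:
    """Minimal stand-in for torch.fx.Node (only the `meta` dict is used)."""

    def __init__(self, meta=None):
        self.meta = dict(meta or {})


def propagate_val_meta(
    dst: Node,
    src: Node,
    transform: Callable[..., Any],
    args: Sequence[Any],
) -> None:
    """Copy src.meta['val'] through transform(val, *args) into dst.meta['val']."""
    if "val" not in src.meta:
        return
    original_val = src.meta["val"]
    fake_mode = getattr(original_val, "fake_mode", None)
    if fake_mode is not None:
        # Run under the fake mode so the result stays a FakeTensor.
        with fake_mode:
            dst.meta["val"] = transform(original_val, *args)
    else:
        dst.meta["val"] = transform(original_val, *args)


# Usage with a toy "val" (a shape-like list) and a permute-like transform:
def permute(val, dims):
    return [val[d] for d in dims]  # toy permute over a shape list


src = Node({"val": ["N", "C", "L"]})
dst = Node()
propagate_val_meta(dst, src, permute, ([0, 2, 1],))
print(dst.meta["val"])  # ['N', 'L', 'C']
```

Each call site would then collapse to one line, e.g. `propagate_val_meta(transposed_inputs, other_inputs[0], torch.ops.aten.permute.default, ([0, 2, 1],))`.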
Force-pushed 5d8c391 to 7850292
Force-pushed 7850292 to 24f179a
Force-pushed 24f179a to d870e83
Force-pushed d870e83 to 54e7f94
Force-pushed 54e7f94 to f01d1a1
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
```python
# Bail if length != kernel size - Not yet supported
if inputs_shape[-1] != cnn_weights_shape[2]:
```
The new input-length guard inputs_shape[-1] != cnn_weights_shape[2] looks incorrect for Conv1d: inputs_shape[-1] is the sequence length (L), while cnn_weights_shape[2] is the kernel size (K=3). cadence::quantized_w8a32_conv (and its meta/ref implementations) support L > K (output length is L - K + 1), and existing tests cover L=5, K=3. This condition would incorrectly bail out for normal Conv1d inputs and prevent the pattern from ever matching.
Consider removing this guard, or (if needed) only bailing when L < K (or other truly unsupported cases).
Suggested change:

```diff
-# Bail if length != kernel size - Not yet supported
-if inputs_shape[-1] != cnn_weights_shape[2]:
+# Bail only when the input length is smaller than the kernel size.
+# Conv1d supports input lengths greater than the kernel size.
+if inputs_shape[-1] < cnn_weights_shape[2]:
```
```python
inputs = conv_layer.args[0]
if "tensor_meta" in inputs.meta:
    inputs_shape = inputs.meta["tensor_meta"].shape
    # Bail if length != kernel size - Not yet supported
```
This new shape-validation block is gated by if hasattr(cnn_weights.meta, "tensor_meta") above. Since fx.Node.meta is a dict, hasattr(..., "tensor_meta") will always be false, so none of the weight/input shape checks (including the new input-length check) will run.
Use a dict-key check (e.g., "tensor_meta" in cnn_weights.meta) so the validations actually execute when metadata is available.
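The distinction is easy to demonstrate with plain Python, since `fx.Node.meta` is just a dict:

```python
# hasattr() checks for *attributes* of the dict object, never its keys,
# so a hasattr-based guard on node.meta is always False.
meta = {"tensor_meta": object()}

print(hasattr(meta, "tensor_meta"))  # False - not a dict attribute
print("tensor_meta" in meta)         # True  - the correct key check
```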
```python
# Propagate val metadata for transposed_inputs
if "val" in other_inputs[0].meta:
    original_val = other_inputs[0].meta["val"]
    fake_mode = original_val.fake_mode
    if fake_mode is not None:
        with fake_mode:
            transposed_val = torch.ops.aten.permute.default(
                original_val, [0, 2, 1]
            )
        transposed_inputs.meta["val"] = transposed_val
```
get_args_and_kwargs_mixed_w8a32_conv now conditionally propagates meta["val"] for the inserted permute nodes (and adds a fake_mode fallback). There doesn't appear to be any unit/integration test coverage exercising QuantFusion on a Conv1d->quantized_w8a32_conv rewrite, so regressions here (e.g., missing/incorrect meta causing later passes to fail) may go unnoticed.
Add a small test that runs QuantFusion on a minimal Conv1d graph and asserts the resulting graph contains the expected permutes + cadence::quantized_w8a32_conv, and that the graph can run FakeTensor/meta propagation without errors.
Force-pushed f01d1a1 to 065e5a9
Force-pushed 065e5a9 to 51c81d0
Force-pushed aced046 to cf1a951
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.
Force-pushed cf1a951 to 36e3d08
Force-pushed 36e3d08 to 61d10e8
Force-pushed 61d10e8 to d19f640
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
```python
# Propagate val metadata for transposed_inputs
if "val" in other_inputs[0].meta:
    original_val = other_inputs[0].meta["val"]
    fake_mode = original_val.fake_mode
    if fake_mode is not None:
        with fake_mode:
            transposed_val = torch.ops.aten.permute.default(original_val, [0, 2, 1])
        transposed_inputs.meta["val"] = transposed_val
    else:
        transposed_inputs.meta["val"] = torch.ops.aten.permute.default(
            original_val, [0, 2, 1]
        )
```
This change adds new behavior for when original_val.fake_mode is None (falling back to a real aten.permute to populate meta["val"]). There doesn’t appear to be a unit test covering the fake_mode is None path for MixedW8A32 conv input/weight metadata propagation; adding one would help prevent regressions where the fusion pass reintroduces assertions or produces missing/incorrect val metadata.
Force-pushed d19f640 to c4cf850
Force-pushed c4cf850 to b3ae2b1
Force-pushed b3ae2b1 to d041e3f