Add extended support for the Clamp operator in the new Neutron C flow #19510
Conversation
Pull request overview
This PR moves the use_new_flow_neutron_c flag from CustomDelegationOptions onto NeutronTargetSpec so it can be threaded through the quantizer and converters as part of target configuration, and extends the new Neutron‑C flow's clamp support beyond Relu‑shaped bounds by emitting Maximum/Minimum ops with quantized constants.
Changes:
- Move `use_new_flow_neutron_c` from `CustomDelegationOptions` to `NeutronTargetSpec`, updating all call sites and converters that read the flag.
- Extend `ClampConverter` to lower arbitrary clamp bounds to a `Max` + `Min` sequence under the new flow (see the sketch after this list), factoring `_is_convertible_to_relu` out as a helper used by both the converter and the new `ClampPattern.get_anchors`.
- Refactor `QuantizationPattern` to require `neutron_quantizer` in the base constructor, propagate this through all subclasses, and rework `ClampPattern` to use a shared-spec anchor under the new flow.
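For intuition, the `Max` + `Min` lowering in the second bullet relies on the identity `clamp(x, min, max) == Minimum(Maximum(x, min), max)`. A minimal NumPy sketch of that identity (not the PR's code, which emits Neutron IR `Maximum`/`Minimum` ops with quantized constant tensors):

```python
import numpy as np

def clamp_via_max_min(x: np.ndarray, min_value, max_value) -> np.ndarray:
    lower = np.maximum(x, min_value)     # raise everything below the lower bound
    return np.minimum(lower, max_value)  # then cap everything above the upper bound

x = np.arange(-5, 25, dtype=np.int32)
assert np.array_equal(clamp_via_max_min(x, 10, 17), np.clip(x, 10, 17))
```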
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| backends/nxp/backend/custom_delegation_options.py | Removes use_new_flow_neutron_c from CustomDelegationOptions. |
| backends/nxp/backend/neutron_target_spec.py | Adds use_new_flow_neutron_c constructor argument and attribute. |
| backends/nxp/nxp_backend.py | Threads flag into NeutronTargetSpec constructions; drops it from CustomDelegationOptions. |
| backends/nxp/backend/ir/converter/node_converters/ops_converters/abs_converter.py | Reads new flag from target spec instead of delegation options. |
| backends/nxp/backend/ir/converter/node_converters/ops_converters/avg_pool_2d_converter.py | Same flag-source switch. |
| backends/nxp/backend/ir/converter/node_converters/ops_converters/max_pool2d_with_indices_converter.py | Same flag-source switch. |
| backends/nxp/backend/ir/converter/node_converters/ops_converters/mul_tensor_converter.py | Same flag-source switch. |
| backends/nxp/backend/ir/converter/node_converters/ops_converters/clamp_converter.py | Adds Max/Min lowering for non‑Relu clamp bounds under new flow; introduces _is_convertible_to_relu. |
| backends/nxp/quantizer/patterns.py | QuantizationPattern now takes neutron_quantizer; ClampPattern adds shared‑spec anchors when new flow is on. |
| backends/nxp/quantizer/neutron_quantizer.py | Passes self into all pattern constructors. |
| backends/nxp/tests/executorch_pipeline.py | Construct NeutronTargetSpec with the new flag; minor formatting. |
| backends/nxp/tests/ir/converter/node_converter/test_clamp_converter.py | Adds parameterized tests for clamp under the new Neutron‑C flow. |
| examples/nxp/aot_neutron_compile.py | Forwards CLI flag into NeutronTargetSpec. |
Comments suppressed due to low confidence (1)
backends/nxp/backend/ir/converter/node_converters/ops_converters/clamp_converter.py:169
- The min/max constant tensors are hard-coded to `np.int8`, but `_is_supported_on_target` returns `True` unconditionally when `use_new_flow_neutron_c` is set, with no dtype filter. If the input is `uint8` (which is listed as a supported type by other converters in this flow, e.g. `abs_converter.py`, `avg_pool_2d_converter.py`), values > 127 will silently overflow when stored as `int8`, producing incorrect outputs. The dtype passed to `np.array` should be derived from the dequant node's dtype (or from `quant_min`/`quant_max`).
```python
min_tensor = self.builder.create_tensor_for_data(
    np.array([min_value], np.int8), "min"
)
self.propagate_quantization(x, min_tensor)
max_tensor = self.builder.create_tensor_for_data(
    np.array([max_value], np.int8), "max"
)
```
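A minimal repro of the wrap-around described above, with an assumed uint8-range bound of 200 (`astype` is used to force the cast deterministically; newer NumPy versions may reject out-of-range values passed directly to `np.array` with an integer dtype):

```python
import numpy as np

max_value = 200  # representable in uint8, out of range for int8
print(np.array([max_value]).astype(np.int8))   # [-56]  silent wrap-around
print(np.array([max_value]).astype(np.uint8))  # [200]  as intended
```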
```python
        return True

    if (
        self.neutron_quantizer.neutron_target_spec.use_new_flow_neutron_c
        and not _is_convertible_to_relu(node)
    ):
        # Shared spec pattern
        assert len(fused_partition[0].input_nodes) == 1
        prev_node = fused_partition[0].input_nodes[0]

        # Previous node was not quantized => we are not able to share q-params
        if Q_ANNOTATION_KEY not in prev_node.meta:
            return None

        qspec = SharedQuantizationSpec(prev_node)

        return PartitionAnchors(
            inputs=[(node, NodeArgsIdx(0))],
            weights=[],
            biases=[],
            output=[
                (node, qspec),
            ],
        )
```
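For context, `SharedQuantizationSpec(prev_node)` (from `torch.ao.quantization.quantizer`) tells the quantizer to reuse the quantization parameters observed for `prev_node` on the annotated output, which is what later allows the converter to assume identical input/output q-params for the `Max`/`Min` lowering.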
```python
class BatchNormPattern(QuantizationPattern):
    def __init__(self, is_qat: bool):
        super().__init__(is_qat=is_qat)
```
Why was this removed?
It was not needed, as it was already identical to the `QuantizationPattern` init. Since `QuantizationPattern` now also takes the `neutron_quantizer` parameter, this init would override the `QuantizationPattern` one with an inconsistent number of parameters, so I removed it.
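A hedged sketch of the refactor being discussed (signatures assumed for illustration, not copied from the PR): once the base class takes `neutron_quantizer`, a subclass `__init__` that merely forwards its arguments can be dropped entirely.

```python
class NeutronQuantizer:  # stand-in for the real quantizer class
    pass

class QuantizationPattern:
    def __init__(self, neutron_quantizer: NeutronQuantizer, is_qat: bool = False):
        self.neutron_quantizer = neutron_quantizer
        self.is_qat = is_qat

class BatchNormPattern(QuantizationPattern):
    # No __init__ override: the inherited constructor already does the right
    # thing, and redefining it with fewer parameters would break call sites.
    pass
```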
```python
if bounds not in ClampConverter.SUPPORTED_BOUNDS.values():
    return False
if neutron_target_spec.use_new_flow_neutron_c:
    return True
```
Please add type checking here.
```python
if not NodeConverter.uses_quantization_type_for_io(
    node,
    supported_types=[torch.int8, torch.uint8],
    input_indices=[0],
    output_indices=[0],
):
    return False
```
```python
class ClampConverter(NodeConverter):
    SUPPORTED_BOUNDS = {
```
This should be renamed now, as these are no longer the only supported bounds.
```python
@staticmethod
def propagate_quantization(from_node, to_node):
    to_node.quantization = copy(from_node.quantization)
```
Please add type hints.
A torch.fx `Node` doesn't have a `quantization` attribute.
Edit:
This method is used with Neutron IR tensors in the code, so the parameters should be renamed. There already is a function which does this: `quantization_utils.py:propagate_quantization()`.
Since you used the same name, were you aware of this, and was there a reason you wrote this method instead?
I didn't know the function existed. There is no need to reimplement it as it does the same thing.
```python
        return np.clip(rescaled_value, quant_min, quant_max)

    def convert(self, node: Node):
        """Convert the `aten.clamp.default` operator to Neutron IR `Relu*` operators.
```
This is no longer true. It can be converted to other Neutron IR operators.
```python
if x.quantization != y.quantization:
    raise AssertionError(
        "Input and output quantization should be same in order to convert to max/min."
    )
```
This check should be in the `is_supported_on_target()` method.
I understand you updated the clamp pattern to quantize with this in mind, but someone could use a different quantizer (for example, when combining backends), and we don't want to raise an exception in that case.
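To illustrate the suggested behavior with stand-in types (`Quantization` and `clamp_is_supported` are hypothetical, not the backend's actual classes): returning `False` from the support check keeps a differently-quantized clamp un-delegated, whereas the `AssertionError` above would abort conversion.

```python
from dataclasses import dataclass

@dataclass
class Quantization:  # simplified stand-in for Neutron IR tensor quantization
    scale: float
    zero_point: int

def clamp_is_supported(x_q: Quantization, y_q: Quantization) -> bool:
    # The Max/Min lowering reuses the input q-params for the constant bound
    # tensors, so input and output quantization must match exactly.
    return x_q == y_q

assert clamp_is_supported(Quantization(0.1, 0), Quantization(0.1, 0))
assert not clamp_is_supported(Quantization(0.1, 0), Quantization(0.2, 0))
```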
```python
# Make sure the `clamp` was NOT delegated.
assert graph_contains_any_of_ops(delegated_ep.graph, [Clamp])
```
Please add the new tests inside a class. It's a new convention we have started with the new flow: `class TestClampNewNeutronFlow:`. We can then use shorter test names with improved clarity.
Please draw inspiration from other open PRs which enable new ops (e.g. here). The `TestClampNewNeutronFlow` class should test all aspects of the operator with the new flow, because eventually we will remove the old tests with the old flow. So please add more tests.
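A hypothetical skeleton of that convention (test name, parameters, and body are illustrative, not taken from the PR):

```python
import pytest

class TestClampNewNeutronFlow:
    @pytest.mark.parametrize(
        "min_, max_",
        [(10, 17), (0, 1), (-1, 1), (0, None)],
    )
    def test_quantized_clamp(self, min_, max_):
        ...  # build the model, lower it, and compare outputs here
```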
```python
# Make sure the tested program contains the `clamp`.
assert graph_contains_any_of_ops(intermediate_ep.graph, [Clamp])

convert_run_compare(
```
Please use `lower_run_compare` instead. We no longer want to compare the Neutron IR model with the edge model. Instead, we are now always testing the full models with Neutron delegate calls (with NSYS).
Again, you can draw inspiration from other similar PRs.
```python
pytest.param(10, 17, id="min = 10, max = 17 (Max/Min)"),
pytest.param(0, 1, id="min = 0, max = 1 (Relu0To1)"),
pytest.param(-1, 1, id="min = -1, max = 1 (ReluN1To1)"),
pytest.param(0, None, id="min = 0, max = None (Relu)"),
```
It would be nice to see some assertions that the node is indeed converted to a Relu variant when possible.
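One way such an assertion could be structured, as an illustrative-only helper (names assumed, not from the PR): map each parameterized bound pair to the Neutron IR operator the converter is expected to emit, then assert on it in the test.

```python
def expected_neutron_ops(min_, max_) -> str:
    # Bound pairs with a dedicated Relu variant, per the param ids above;
    # everything else falls back to the Maximum + Minimum sequence.
    relu_variants = {
        (0, None): "Relu",
        (0, 1): "Relu0To1",
        (-1, 1): "ReluN1To1",
    }
    return relu_variants.get((min_, max_), "Maximum+Minimum")

assert expected_neutron_ops(0, None) == "Relu"
assert expected_neutron_ops(10, 17) == "Maximum+Minimum"
```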
Summary
Moved the `use_new_flow_neutron_c` flag from `CustomCompileConfig` to `NeutronTargetSpec`.
Test plan
New test cases for Clamp are introduced. The relocation of the new flag is covered by existing unit tests.
cc @robert-kalmar @JakeStevens @digantdesai @rascani