[Relax][FRONTEND][ONNX] Support Softmax, LogSoftmax and Hardmax when opset version ≤ 12 #19428
cchung100m wants to merge 1 commit into apache:main
Conversation
Code Review
This pull request implements legacy semantics for ONNX Softmax, LogSoftmax, and Hardmax operators (opset <= 12), which require flattening the input tensor into a 2D matrix based on the axis attribute. The review feedback correctly identifies that the current fallback logic for tensors with unknown static rank incorrectly defaults to modern (opset >= 13) behavior. It is recommended to raise a ValueError in these cases to prevent silent calculation errors for dynamic-rank tensors.
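The legacy flattening described above can be illustrated with a plain NumPy sketch (the helper name `legacy_softmax` and the example shapes are illustrative, not taken from the PR):

```python
import numpy as np

def legacy_softmax(x, axis=1):
    # Legacy (opset <= 12) semantics: coerce the input to 2D by folding
    # all dims before `axis` into rows and all remaining dims into
    # columns, softmax over the columns, then restore the original shape.
    axis = axis % x.ndim  # normalize a negative axis
    rows = int(np.prod(x.shape[:axis], dtype=np.int64))
    flat = x.reshape(rows, -1)
    shifted = flat - flat.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(shifted)
    return (e / e.sum(axis=1, keepdims=True)).reshape(x.shape)

x = np.arange(24, dtype=np.float32).reshape(2, 3, 4)
# With axis=1, each of the 2 rows normalizes over 3*4 = 12 elements;
# the modern (opset >= 13) softmax would normalize over only 3.
legacy = legacy_softmax(x, axis=1)
```

Note that only a statically known rank lets the converter compute the row/column split at compile time, which is why the review asks for an error rather than a fallback.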
    if prepared is None:
        return relax.op.nn.softmax(inputs[0], axis=axis)
The fallback implementation for legacy Softmax (opset <= 12) is incorrect when the input tensor's rank is not statically known. The fallback relax.op.nn.softmax(inputs[0], axis=axis) implements the modern (opset >= 13) softmax behavior, which does not perform the required flattening operation based on the axis attribute. This will lead to silently incorrect results for models with dynamic-rank tensors.
Instead of falling back to an incorrect implementation, it would be better to raise an error indicating that static rank is required for legacy softmax.
Suggested change:
-    if prepared is None:
-        return relax.op.nn.softmax(inputs[0], axis=axis)
+    if prepared is None:
+        raise ValueError(
+            "Softmax with opset <= 12 requires a statically known input rank "
+            "to apply legacy semantics."
+        )
    if prepared is None:
        return relax.op.nn.log_softmax(inputs[0], axis=axis)
The fallback implementation for legacy LogSoftmax (opset <= 12) is incorrect when the input tensor's rank is not statically known. The fallback relax.op.nn.log_softmax(inputs[0], axis=axis) implements the modern (opset >= 13) log_softmax behavior, which does not perform the required flattening operation based on the axis attribute. This will lead to silently incorrect results for models with dynamic-rank tensors.
Instead of falling back to an incorrect implementation, it would be better to raise an error indicating that static rank is required for legacy log_softmax.
Suggested change:
-    if prepared is None:
-        return relax.op.nn.log_softmax(inputs[0], axis=axis)
+    if prepared is None:
+        raise ValueError(
+            "LogSoftmax with opset <= 12 requires a statically known input rank "
+            "to apply legacy semantics."
+        )
    if prepared is None:
        return cls._hardmax_impl(inputs[0], axis)
The fallback implementation for legacy Hardmax (opset <= 12) is incorrect when the input tensor's rank is not statically known. The fallback calls _hardmax_impl(inputs[0], axis), which implements the modern (opset >= 13) hardmax behavior. It does not perform the required flattening operation and uses axis normalization logic (_normalize_constant_axes) that is incorrect for legacy opsets. This will lead to silently incorrect results for models with dynamic-rank tensors.
Instead of falling back to an incorrect implementation, it would be better to raise an error indicating that static rank is required for legacy hardmax.
Suggested change:
-    if prepared is None:
-        return cls._hardmax_impl(inputs[0], axis)
+    if prepared is None:
+        raise ValueError(
+            "Hardmax with opset <= 12 requires a statically known input rank "
+            "to apply legacy semantics."
+        )
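For reference, the legacy Hardmax semantics at issue can be sketched in NumPy as follows (the helper name `legacy_hardmax` is illustrative, not from the PR): each row of the flattened 2D view gets a one-hot vector at its argmax.

```python
import numpy as np

def legacy_hardmax(x, axis=1):
    # Legacy (opset <= 12): flatten to 2D around `axis`, place a 1 at the
    # argmax of each flattened row and 0 elsewhere, then restore the shape.
    axis = axis % x.ndim
    rows = int(np.prod(x.shape[:axis], dtype=np.int64))
    flat = x.reshape(rows, -1)
    out = np.zeros_like(flat)
    out[np.arange(rows), flat.argmax(axis=1)] = 1
    return out.reshape(x.shape)

x = np.arange(24, dtype=np.float32).reshape(2, 3, 4)
out = legacy_hardmax(x, axis=1)  # exactly one 1 per flattened row of 12
```

A modern (opset >= 13) hardmax would instead put a one-hot vector along axis 1 of every (2, 4) slice, so the two behaviors produce structurally different outputs.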
Hi Committers,
This PR fixes issue #19420. Any suggestions would be appreciated.