
[Relax][FRONTEND][ONNX] Support Softmax, LogSoftmax and Hardmax when opset version ≤12#19428

Draft
cchung100m wants to merge 1 commit into apache:main from cchung100m:issue-19420

Conversation

@cchung100m
Contributor

Hi Committers,

This PR attempts to fix issue #19420. Any suggestions would be appreciated.

Contributor

@gemini-code-assist (bot) left a comment


Code Review

This pull request implements legacy semantics for ONNX Softmax, LogSoftmax, and Hardmax operators (opset <= 12), which require flattening the input tensor into a 2D matrix based on the axis attribute. The review feedback correctly identifies that the current fallback logic for tensors with unknown static rank incorrectly defaults to modern (opset >= 13) behavior. It is recommended to raise a ValueError in these cases to prevent silent calculation errors for dynamic-rank tensors.
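To make the behavioral difference concrete, here is a minimal NumPy sketch (not the PR's actual Relax implementation) of the two semantics: opset <= 12 coerces the input to a 2D matrix at `axis` and normalizes over all flattened trailing dimensions, while opset >= 13 normalizes along the single given axis.

```python
import numpy as np

def softmax_legacy(x, axis=1):
    """ONNX opset <= 12 semantics: flatten to 2D at `axis`,
    softmax over each flattened row, then reshape back."""
    shape = x.shape
    rows = int(np.prod(shape[:axis])) if axis > 0 else 1
    flat = x.reshape(rows, -1)
    e = np.exp(flat - flat.max(axis=1, keepdims=True))
    return (e / e.sum(axis=1, keepdims=True)).reshape(shape)

def softmax_modern(x, axis=-1):
    """ONNX opset >= 13 semantics: softmax along a single axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

x = np.arange(24, dtype="float32").reshape(2, 3, 4)
# With axis=1, legacy softmax normalizes over 3*4 = 12 elements per
# flattened row, while modern softmax normalizes over only the 3
# elements along axis 1 - the results differ for rank > 2 inputs.
legacy = softmax_modern(softmax_legacy(x, axis=1), axis=1)
```

This is why silently falling back to the modern operator for dynamic-rank inputs would produce wrong numbers rather than an error.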

Comment on lines +806 to +807
if prepared is None:
    return relax.op.nn.softmax(inputs[0], axis=axis)
Contributor


Severity: high

The fallback implementation for legacy Softmax (opset <= 12) is incorrect when the input tensor's rank is not statically known. The fallback relax.op.nn.softmax(inputs[0], axis=axis) implements the modern (opset >= 13) softmax behavior, which does not perform the required flattening operation based on the axis attribute. This will lead to silently incorrect results for models with dynamic-rank tensors.

Instead of falling back to an incorrect implementation, it would be better to raise an error indicating that static rank is required for legacy softmax.

Suggested change:

- if prepared is None:
-     return relax.op.nn.softmax(inputs[0], axis=axis)
+ if prepared is None:
+     raise ValueError(
+         "Softmax with opset <= 12 requires a statically known input rank "
+         "to apply legacy semantics."
+     )

Comment on lines +828 to +829
if prepared is None:
    return relax.op.nn.log_softmax(inputs[0], axis=axis)
Contributor


Severity: high

The fallback implementation for legacy LogSoftmax (opset <= 12) is incorrect when the input tensor's rank is not statically known. The fallback relax.op.nn.log_softmax(inputs[0], axis=axis) implements the modern (opset >= 13) log_softmax behavior, which does not perform the required flattening operation based on the axis attribute. This will lead to silently incorrect results for models with dynamic-rank tensors.

Instead of falling back to an incorrect implementation, it would be better to raise an error indicating that static rank is required for legacy log_softmax.

Suggested change:

- if prepared is None:
-     return relax.op.nn.log_softmax(inputs[0], axis=axis)
+ if prepared is None:
+     raise ValueError(
+         "LogSoftmax with opset <= 12 requires a statically known input rank "
+         "to apply legacy semantics."
+     )

Comment on lines +859 to +860
if prepared is None:
    return cls._hardmax_impl(inputs[0], axis)
Contributor


Severity: high

The fallback implementation for legacy Hardmax (opset <= 12) is incorrect when the input tensor's rank is not statically known. The fallback calls _hardmax_impl(inputs[0], axis), which implements the modern (opset >= 13) hardmax behavior. It does not perform the required flattening operation and uses axis normalization logic (_normalize_constant_axes) that is incorrect for legacy opsets. This will lead to silently incorrect results for models with dynamic-rank tensors.

Instead of falling back to an incorrect implementation, it would be better to raise an error indicating that static rank is required for legacy hardmax.

Suggested change:

- if prepared is None:
-     return cls._hardmax_impl(inputs[0], axis)
+ if prepared is None:
+     raise ValueError(
+         "Hardmax with opset <= 12 requires a statically known input rank "
+         "to apply legacy semantics."
+     )
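For reference, a minimal NumPy sketch (not the PR's Relax code) of the legacy Hardmax semantics: the input is coerced to 2D at `axis`, each flattened row gets a 1 at its argmax and 0 elsewhere, and the result is reshaped back.

```python
import numpy as np

def hardmax_legacy(x, axis=1):
    """ONNX opset <= 12 Hardmax: coerce to 2D at `axis`, set 1 at the
    argmax of each flattened row, 0 elsewhere, then reshape back."""
    shape = x.shape
    rows = int(np.prod(shape[:axis])) if axis > 0 else 1
    flat = x.reshape(rows, -1)
    out = np.zeros_like(flat)
    out[np.arange(flat.shape[0]), flat.argmax(axis=1)] = 1
    return out.reshape(shape)

x = np.arange(24, dtype="float32").reshape(2, 3, 4)
# Each flattened row of 3*4 = 12 elements contains exactly one 1,
# unlike the modern per-axis hardmax, which places one 1 per slice.
y = hardmax_legacy(x, axis=1)
```

This also illustrates why the modern `_hardmax_impl` fallback cannot stand in for the legacy behavior on dynamic-rank inputs.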

