[ET-VK][ops] Add eq.Scalar operator #20383
Pull Request resolved: #20383

Adds Vulkan support for `aten.eq.Scalar`. This is the second of two ops needed to collapse the Llama4-mini TISO en_US backbone export to a single Vulkan partition (after `bitwise_or`): the discrete-speech mask compares the int token-id tensor against scalar constants via `aten.eq.Scalar`, which previously had no Vulkan implementation and forced a CPU fallback that split the delegated graph.

Implemented by extending the existing tensor-scalar binary-op path with a comparison-output variant:

- `binary_scalar_buffer.glsl` / `binary_scalar_texture.glsl` gain an `IS_COMPARISON_OP` code path that writes a `uint8` (bool) output while leaving the existing arithmetic (e.g. `pow`) path unchanged.
- `binary_scalar_buffer.yaml` / `binary_scalar_texture.yaml` add an `eq_scalar` variant (half/float/int32; the texture variant uses `equal(X, Y)` for per-lane `bvec4`, the buffer variant uses scalar `X == Y`).
- `BinaryScalarOp.cpp` adds an `eq_tensor_scalar` dispatch and `VK_REGISTER_OP(aten.eq.Scalar, eq_tensor_scalar)`.
- `op_registry.py` registers `aten.eq.Scalar` `OpFeatures` (FP/INT tensor input, bool output).

The int64 token tensor is serialized to int32 via the existing `downcast_64_bit` path, so the dispatch resolves to the int32 shader variant; no dtype-conversion pass is added.

This change was authored with Claude.

ghstack-source-id: 396618180
@exported-using-ghexport

Differential Revision: [D108457791](https://our.internmc.facebook.com/intern/diff/D108457791/)
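For reviewers who want a quick mental model of the comparison path, here is a minimal pure-Python sketch of the semantics described above. It is not the shader code from this PR. The function names (`downcast_to_int32`, `eq_scalar_buffer`, `eq_scalar_texture`) and the token-id values are hypothetical and exist only for illustration. The sketch assumes the downcast is a plain range-preserving narrowing, and that a texel holds four lanes.

```python
from array import array

INT32_MIN, INT32_MAX = -(2**31), 2**31 - 1


def downcast_to_int32(values):
    """Illustrative stand-in for narrowing int64 data to int32.

    Assumption: values that do not fit in int32 are rejected here; the real
    serialization path may handle them differently.
    """
    for v in values:
        if not INT32_MIN <= v <= INT32_MAX:
            raise OverflowError(f"{v} does not fit in int32")
    return list(values)


def eq_scalar_buffer(x, scalar):
    """Buffer-style variant: one scalar `X == Y` per element, stored as uint8."""
    return array("B", (1 if v == scalar else 0 for v in x))


def eq_scalar_texture(x, scalar):
    """Texture-style variant: compare up to four lanes at a time.

    Mirrors the shape of GLSL `equal(X, Y)` returning a `bvec4`. A real texel
    is always four lanes wide; this sketch simply lets the last slice be short.
    """
    out = array("B")
    for start in range(0, len(x), 4):
        texel = x[start:start + 4]
        lanes = [v == scalar for v in texel]  # per-lane boolean result
        out.extend(1 if lane else 0 for lane in lanes)
    return out


# Hypothetical token ids; 128004 stands in for a scalar constant in the mask.
token_ids = downcast_to_int32([7, 128004, 7, 42, 128004])
mask = eq_scalar_buffer(token_ids, 128004)
print(list(mask))  # [0, 1, 0, 0, 1]
```

Both variants produce the same `uint8` mask; they differ only in how many elements are compared per step, which is the distinction the buffer and texture shader variants draw.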