feat: expand OpenCL operator support (where, channel_shuffle, ceil_mode) #2

Open
cafeTechne wants to merge 2 commits into softcookiepp:master from cafeTechne:feat/operator-expansion

@cafeTechne

This PR expands the OpenCL backend with several native operator implementations to reduce CPU fallbacks and improve model compatibility:

  • Implemented aten::where: Uses a native ternary broadcasted pointwise kernel.
  • Implemented aten::channel_shuffle: Native decomposition (view/transpose/reshape) ensuring data stays on GPU.
  • Enabled pooling ceil_mode: Verified and unlocked forward/backward for avg/max pooling.
  • Fixed masked_select broadcasting: Added expansion logic to support arbitrary mask shapes.
  • Verification: All changes verified with test_op.py on AMD RX 6500 XT.
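
To illustrate the channel_shuffle item above, here is a minimal pure-Python sketch of the view/transpose/reshape decomposition, applied to channel indices only. The helper name `channel_shuffle_order` is hypothetical and is not part of this PR; the backend itself performs these steps on GPU tensors.

```python
def channel_shuffle_order(channels: int, groups: int) -> list[int]:
    """Return, for each output channel, the input channel it reads from.

    Hypothetical helper for illustration; it mirrors the
    view -> transpose -> reshape decomposition on indices only.
    """
    if channels % groups != 0:
        raise ValueError("channels must be divisible by groups")
    per_group = channels // groups
    # view: split the flat channel axis into (groups, per_group)
    grid = [[g * per_group + i for i in range(per_group)] for g in range(groups)]
    # transpose: swap the two axes to get (per_group, groups)
    transposed = [[grid[g][i] for g in range(groups)] for i in range(per_group)]
    # reshape: flatten back to a single channel axis
    return [c for row in transposed for c in row]


print(channel_shuffle_order(6, 2))  # [0, 3, 1, 4, 2, 5]
```

Because the shuffle reduces to a fixed permutation of the channel axis, it can be expressed with shape operations that never leave the device, which is the point of the native decomposition.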

Also includes submodule updates for dlprimitives that fix kernel parameter syntax.
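
For the pooling ceil_mode item, the effect on output shape can be sketched as below. This follows the output-size formula PyTorch documents for its pooling layers, plus the rule that the last window must start inside the input or its left padding. The function name `pool_output_size` is hypothetical, and whether the kernels in this PR compute the size exactly this way is an assumption.

```python
def pool_output_size(size: int, kernel: int, stride: int,
                     padding: int = 0, dilation: int = 1,
                     ceil_mode: bool = False) -> int:
    """Output length of one pooled dimension (hypothetical helper)."""
    numerator = size + 2 * padding - dilation * (kernel - 1) - 1
    if ceil_mode:
        # Integer ceiling division: round the window count up.
        out = -(-numerator // stride) + 1
        # Drop the last window if it would start past the input
        # plus left padding, i.e. entirely inside the right padding.
        if (out - 1) * stride >= size + padding:
            out -= 1
    else:
        # Floor division: ignore any trailing partial window.
        out = numerator // stride + 1
    return out


print(pool_output_size(5, kernel=2, stride=2))                  # 2
print(pool_output_size(5, kernel=2, stride=2, ceil_mode=True))  # 3
```

With ceil_mode enabled, the trailing partial window is kept, so a length-5 input pooled with kernel 2 and stride 2 yields 3 outputs instead of 2.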
