flash-attn2: prefer using add_op_namespace_prefix #877

Open
drbh wants to merge 5 commits into main from fix-op-prefixes



@drbh
Collaborator

@drbh drbh commented May 19, 2026

This PR fixes flash-attn2 to correctly register fake ops

sayakpaul previously approved these changes May 19, 2026
Member

@sayakpaul sayakpaul left a comment

Thanks!

@drbh
Collaborator Author

drbh commented May 20, 2026

Note: this PR has been updated to remove the unused ops folder, which contained non-flash-attn code. It also cleans up the exposed functions by removing the unused low-level functions in `__init__`.

The kernel now only exposes the core top-level functions (listed in `__all__`) and passes the `python kernels/nix-builder/pkgs/torch-ops-check/torch-ops-check-hook.py kernels-community/flash-attn2/torch-ext` check added in huggingface/kernels#569.
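
As a rough illustration of what listing the public functions in `__all__` does in Python, the sketch below builds a toy module and shows which names a star import picks up. The module name `toy_flash_attn` and the function `flash_attn_func` are illustrative assumptions; `fwd` simply mirrors the low-level function visible in the review diff. This is not the actual flash-attn2 `__init__`.

```python
import sys
import types

# Toy module source; names are illustrative, not the real kernel's API.
TOY_SOURCE = '''
__all__ = ["flash_attn_func"]

def flash_attn_func():
    return "top-level entry point"

def fwd():
    return "low-level helper"
'''

# Build the toy module in memory and make it importable.
toy = types.ModuleType("toy_flash_attn")
exec(TOY_SOURCE, toy.__dict__)
sys.modules["toy_flash_attn"] = toy

# A star import copies only the names listed in __all__.
star_namespace = {}
exec("from toy_flash_attn import *", star_namespace)
exported = sorted(name for name in star_namespace if not name.startswith("__"))

print(exported)             # ['flash_attn_func']
print(hasattr(toy, "fwd"))  # True: __all__ alone does not remove the attribute
```

Note that `__all__` only controls star imports. A function left out of `__all__` but still defined in the module remains reachable by attribute access, so deleting it outright changes what callers can import.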

Member

@sayakpaul sayakpaul left a comment

Any reason why the windows build would fail?

@drbh
Collaborator Author

drbh commented May 20, 2026

> Any reason why the windows build would fail?

Not 100% sure at the moment, but it seems to be related to the XPU path on Windows. In general, the Windows build workflow may need some tweaks, since it has custom logic that diverges from the standard kernel-builder Nix path.

I'm going to take a look and see if a small change resolves it; otherwise, fixing the workflow may be best tackled in another PR.

@sayakpaul
Member

Works for me.

@drbh
Collaborator Author

drbh commented May 20, 2026

Added a small PR (#885) to skip the Windows XPU backend for flash-attn2, since there seems to be a bug related to the CUTLASS fork. It's possible that merging that PR and rebasing this one on top will avoid the XPU Windows build and let the Windows CUDA build succeed.

danieldk previously approved these changes May 20, 2026
@drbh drbh force-pushed the fix-op-prefixes branch from 70249ba to 780299f on May 22, 2026 06:57
)


def fwd(
Member

This looks like an API break; we need to bump the version if we remove these. Is removal necessary?
