Skip numeric drop-out when PComputeWindow is a null_tile_window in Bl… #7256

Open
qianfengz wants to merge 2 commits into develop from users/qianfengz/ck/block_dropout_no_drop
Conversation

@qianfengz
Contributor

The BlockDropout implementation already provides complete logic for generating random numbers and applying dropout to the P tensor after the first attention GEMM, with support for both 32x32 and 16x16 warp-GEMM tile shapes and for both wave32 and wave64 architectures.

But in some situations we only need the block-level process to generate the random numbers, without also applying dropout in real time to the vgpr tile. For example, xformers' test_mem_eff_attention.py::test_dropout_ck requires the host reference implementation of attention forward with dropout to use the same random numbers as the device-side implementation in order to compare and verify the results, so a standalone kernel that only generates random numbers is required.

This PR enables xformers' random-value generating kernel (in file ck_tiled_rand_uniform_kernel.h) to rely entirely on BlockDropout's Run() operator to generate random numbers for a [MPerBlock, NPerBlock] tile during the tile iteration, so there is no need to replicate BlockDropout's logic in the xformers kernel.

Comment thread: projects/composablekernel/include/ck_tile/ops/fmha/block/block_dropout.hpp (outdated)

