Commit bb4e73f

Update on "Limit softmax to causally-valid elements in cpu_sdpa"
Instead of setting masked positions to -inf and computing max/exp/normalize over all kvSize elements, limit the softmax to only the causally-valid range per row. This avoids unnecessary computation on masked positions and zero-fills them for GEMM 2.

Differential Revision: [D96044307](https://our.internmc.facebook.com/intern/diff/D96044307/)

[ghstack-poisoned]
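The idea in the description can be illustrated with a minimal Python sketch. This is not the code from the commit: the function names, the `start_pos` offset, and the rule `start_pos + row + 1` for the valid length are assumptions made for illustration. It compares a softmax restricted to the causally-valid prefix of a row (with the tail zero-filled) against the usual approach of masking with -inf and normalizing over the whole row.

```python
import math


def valid_len(row, start_pos, kv_size):
    # Hypothetical convention: query row `row` sits at absolute position
    # start_pos + row and may attend to keys 0..start_pos + row inclusive.
    return min(kv_size, start_pos + row + 1)


def causal_softmax_row(scores, num_valid):
    # Softmax over scores[:num_valid] only; the masked tail is zero-filled
    # so a later matrix multiply sees exact zeros there.
    n = max(0, min(num_valid, len(scores)))
    out = [0.0] * len(scores)
    if n == 0:
        return out
    row_max = max(scores[:n])
    exps = [math.exp(s - row_max) for s in scores[:n]]
    total = sum(exps)
    for j in range(n):
        out[j] = exps[j] / total
    return out


def masked_softmax_row(scores, num_valid):
    # Reference approach: write -inf into masked positions, then run
    # max/exp/normalize over every element of the row.
    masked = [s if j < num_valid else -math.inf for j, s in enumerate(scores)]
    row_max = max(masked)
    exps = [math.exp(s - row_max) for s in masked]
    total = sum(exps)
    return [e / total for e in exps]
```

Both functions should agree for any row with at least one valid position; the restricted version simply never touches the masked elements beyond writing zeros into them.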
2 parents 4fc0c00 + 92586a5 commit bb4e73f

0 file changed
