Revert skip-softmax threshold formula change: restore * sm_scale
The * sm_scale factor is intentional: it scales the tile-skip threshold
relative to head dimension, so larger head_dim (smaller sm_scale) produces
more aggressive sparsity for the same lambda value. The previous 'fix' was
incorrect.
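
Illustration only, not the kernel code: a minimal sketch of why keeping
* sm_scale makes the same lambda skip more tiles at larger head_dim. It
assumes a log-space threshold of the form log(lambda) * sm_scale, the
conventional sm_scale = 1/sqrt(head_dim), and a skip test on the gap
between a tile's max score and the running row max. The function names
and the gap values are hypothetical.

```python
import math

def skip_threshold(lam: float, head_dim: int) -> float:
    """Hypothetical log-space skip threshold, scaled by sm_scale."""
    # Assumption: sm_scale follows the usual 1/sqrt(head_dim) convention.
    sm_scale = 1.0 / math.sqrt(head_dim)
    # log(lam) is negative for lam < 1; a smaller sm_scale moves the
    # threshold toward zero, so more tiles fall below it.
    return math.log(lam) * sm_scale

def skipped_fraction(gaps, lam, head_dim):
    """Fraction of tiles whose (tile_max - running_max) gap is below the threshold."""
    thr = skip_threshold(lam, head_dim)
    return sum(g < thr for g in gaps) / len(gaps)

# Made-up gaps, for illustration only.
gaps = [-0.2, -0.5, -1.0, -2.0, -4.0]
frac_64 = skipped_fraction(gaps, 1e-3, 64)
frac_256 = skipped_fraction(gaps, 1e-3, 256)
print(frac_64, frac_256)
```

With these made-up gaps, head_dim=256 skips a larger fraction of tiles
than head_dim=64 for the same lambda.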
Signed-off-by: Ye Yu <yeyu@nvidia.com>