Optimize attention softmax buffer reuse #2066

Merged
jordimas merged 1 commit into OpenNMT:master from jordimas:opt/whisper-attn-buffer-reuse on Jul 1, 2026

Conversation

@jordimas

Collaborator

When the raw attention is not needed, softmax now runs in place on the output buffer, avoiding an extra allocation. A separate attn buffer is allocated only when callers want the unnormalized attention.
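
For readers who want the idea without opening the diff, here is a minimal sketch of the pattern in plain Python. It is not the code from this PR: `softmax_rows`, `attention_probs`, and the `keep_scores` flag are hypothetical names chosen for illustration, and flat Python lists stand in for the real tensor buffers.

```python
import math


def softmax_rows(src, dst, row_len):
    """Row-wise softmax over a flat buffer; dst may be the same object as src."""
    for start in range(0, len(src), row_len):
        end = start + row_len
        # The row max is read before anything is written, so in-place use is safe.
        row_max = max(src[i] for i in range(start, end))
        total = 0.0
        for i in range(start, end):
            # Each element is read from src[i] before dst[i] is overwritten.
            dst[i] = math.exp(src[i] - row_max)
            total += dst[i]
        for i in range(start, end):
            dst[i] /= total


def attention_probs(scores, row_len, keep_scores=False):
    """Return (probs, preserved_scores_or_None).

    Hypothetical helper: keep_scores=False overwrites `scores` in place,
    keep_scores=True allocates a second buffer and leaves `scores` untouched.
    """
    if keep_scores:
        probs = [0.0] * len(scores)  # the only extra allocation
        softmax_rows(scores, probs, row_len)
        return probs, scores
    softmax_rows(scores, scores, row_len)  # reuse the input buffer
    return scores, None


# Two rows of three scores each, normalized in place.
buf = [1.0, 2.0, 3.0, 0.0, 0.0, 0.0]
probs, kept = attention_probs(buf, row_len=3)
```

In the default path `probs` is the same list object as `buf`, so no second buffer exists; the extra allocation is paid only when the caller asks to keep the pre-softmax values.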

@jordimas jordimas merged commit 3562f5f into OpenNMT:master Jul 1, 2026
32 of 44 checks passed