Optimize attention softmax buffer reuse by jordimas · Pull Request #2066 · OpenNMT/CTranslate2

jordimas · 2026-06-20T15:43:24Z

Reuses the output buffer in place for the softmax when raw attention is not needed, avoiding an extra allocation. A separate attn buffer is allocated only when callers want unnormalized attention.
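
For illustration, the buffer-reuse idea can be sketched in plain C++ as below. This is a minimal sketch and not the code from this PR: `softmax_rows`, `normalize_scores`, and the `keep_raw` flag are hypothetical names, and `std::vector<float>` stands in for the library's own tensor type. It assumes that "raw attention not needed" means the pre-softmax scores may be overwritten.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Row-wise softmax. `in` and `out` may point to the same memory: the row
// maximum is computed before any write, and each element is read before
// the same position is overwritten.
void softmax_rows(const float* in, float* out, std::size_t rows, std::size_t cols) {
  for (std::size_t r = 0; r < rows; ++r) {
    const float* x = in + r * cols;
    float* y = out + r * cols;
    const float max_val = *std::max_element(x, x + cols);
    float sum = 0.f;
    for (std::size_t c = 0; c < cols; ++c) {
      y[c] = std::exp(x[c] - max_val);  // subtract the max for numerical stability
      sum += y[c];
    }
    for (std::size_t c = 0; c < cols; ++c)
      y[c] /= sum;
  }
}

// Hypothetical helper (not a CTranslate2 function).
// keep_raw == false: softmax overwrites `scores` in place, no extra allocation.
// keep_raw == true:  probabilities go to `separate`, `scores` stays untouched.
// Returns a pointer to the normalized probabilities.
float* normalize_scores(std::vector<float>& scores,
                        std::size_t rows,
                        std::size_t cols,
                        bool keep_raw,
                        std::vector<float>& separate) {
  assert(cols > 0 && scores.size() == rows * cols);
  float* out = scores.data();
  if (keep_raw) {
    separate.resize(scores.size());  // the only path that allocates
    out = separate.data();
  }
  softmax_rows(scores.data(), out, rows, cols);
  return out;
}
```

In the common path (`keep_raw == false`) the returned pointer aliases `scores.data()`, so the second buffer is never sized; the allocation cost is paid only by callers that ask to keep the unnormalized values.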

Optimize attention softmax buffer reuse

11f1402

jordimas merged commit 3562f5f into OpenNMT:master Jul 1, 2026
32 of 44 checks passed

jordimas merged 1 commit into OpenNMT:master from jordimas:opt/whisper-attn-buffer-reuse