4 changes: 3 additions & 1 deletion src/transformers/modeling_utils.py
@@ -867,8 +867,10 @@ def invert_attention_mask(self, encoder_attention_mask: Tensor) -> Tensor:
"""
if encoder_attention_mask.dim() == 3:
encoder_extended_attention_mask = encoder_attention_mask[:, None, :, :]
if encoder_attention_mask.dim() == 2:
elif encoder_attention_mask.dim() == 2:
encoder_extended_attention_mask = encoder_attention_mask[:, None, None, :]
else:
raise ValueError(f"Wrong shape for encoder_attention_mask (shape {encoder_attention_mask.shape})")
Comment on lines +870 to +873
Contributor
Yea, we try to remove these tbh, see #43924 (I really should get back to that PR). So I honestly don't want to add extra logic that may result in other issues.

         # T5 has a mask that can compare sequence ids, we can simulate this here with this transposition
         # encoder_extended_attention_mask = (encoder_extended_attention_mask ==
         #                                    encoder_extended_attention_mask.transpose(-1, -2))
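For context, here is a minimal standalone sketch of the shape handling this hunk guards, together with the additive-mask inversion that the surrounding method performs. The helper name and the simplified dtype handling are illustrative, not the exact library code:

```python
import torch

def invert_attention_mask_sketch(
    encoder_attention_mask: torch.Tensor, dtype: torch.dtype = torch.float32
) -> torch.Tensor:
    # Broadcast the keep-mask to (batch, num_heads, query_len, key_len) form.
    if encoder_attention_mask.dim() == 3:
        # (batch, query_len, key_len) -> add a head axis
        extended = encoder_attention_mask[:, None, :, :]
    elif encoder_attention_mask.dim() == 2:
        # (batch, key_len) -> add head and query axes
        extended = encoder_attention_mask[:, None, None, :]
    else:
        raise ValueError(f"Wrong shape for encoder_attention_mask (shape {encoder_attention_mask.shape})")

    # Turn the {0, 1} keep-mask into an additive mask: kept positions become 0.0,
    # masked positions a large negative value that vanishes after softmax.
    extended = extended.to(dtype=dtype)
    return (1.0 - extended) * torch.finfo(dtype).min

# A batch of two sequences; the second is padded at its last position.
mask = torch.tensor([[1, 1, 1], [1, 1, 0]])
print(invert_attention_mask_sketch(mask).shape)  # torch.Size([2, 1, 1, 3])
```

The `elif` makes the two branches mutually exclusive, and the new `else` turns a mask of any other rank into a clear `ValueError` instead of an `UnboundLocalError` further down, where the unassigned `encoder_extended_attention_mask` would otherwise first be used.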