Hi,
when working with RoFormer models, I've noticed that `**kwargs` is not handled correctly. Most classes accept `**kwargs` but do not pass them further into the model. For example:

    encoder_outputs = self.encoder(
        embedding_output,
        attention_mask=attention_mask,
        encoder_hidden_states=encoder_hidden_states,
        encoder_attention_mask=encoder_attention_mask,
        past_key_values=past_key_values,
        use_cache=use_cache,
        output_attentions=output_attentions,
        output_hidden_states=output_hidden_states,
        return_dict=return_dict,
    )

This is very annoying, since I want to implement a custom attention and pass the inputs it needs through `**kwargs`.
The fix is trivial, and I would be happy to open a PR for this if the maintainers agree.
The `encoder_outputs = self.encoder(...)` call quoted above is from transformers/src/transformers/models/roformer/modeling_roformer.py, lines 737 to 747 in 5206626.
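
For illustration, here is a minimal sketch of the behaviour described and of what forwarding would look like. It uses toy stand-in classes, not the real RoFormer code: every class name and the `custom_bias` argument are hypothetical, and the actual change in `modeling_roformer.py` may need to differ.

```python
# Toy stand-ins, NOT the real RoFormer classes. All names are hypothetical.


class ToyAttention:
    """Attention layer that understands one extra, custom keyword argument."""

    def __call__(self, hidden_states, attention_mask=None, custom_bias=None, **kwargs):
        # custom_bias stands in for any extra input a custom attention needs.
        if custom_bias is None:
            return list(hidden_states)
        return [h + custom_bias for h in hidden_states]


class ToyEncoder:
    def __init__(self):
        self.attention = ToyAttention()

    def __call__(self, hidden_states, attention_mask=None, **kwargs):
        # Forward the extra keyword arguments instead of dropping them.
        return self.attention(hidden_states, attention_mask=attention_mask, **kwargs)


class ToyModelDropsKwargs:
    """Mirrors the reported behaviour: **kwargs is accepted but not forwarded."""

    def __init__(self):
        self.encoder = ToyEncoder()

    def __call__(self, inputs, attention_mask=None, **kwargs):
        return self.encoder(inputs, attention_mask=attention_mask)


class ToyModelForwardsKwargs:
    """Sketch of the proposed fix: pass **kwargs through to the encoder."""

    def __init__(self):
        self.encoder = ToyEncoder()

    def __call__(self, inputs, attention_mask=None, **kwargs):
        return self.encoder(inputs, attention_mask=attention_mask, **kwargs)


inputs = [1.0, 2.0]
dropped = ToyModelDropsKwargs()(inputs, custom_bias=0.5)
forwarded = ToyModelForwardsKwargs()(inputs, custom_bias=0.5)
print(dropped)    # custom_bias never reaches the attention layer
print(forwarded)  # custom_bias is applied by the attention layer
```

In the first model the caller's `custom_bias` is silently discarded, which matches the problem reported here. In the second, the only change is adding `**kwargs` to the inner call.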