1 parent fa0ae17 commit 8581fba
1 file changed
src/transformers/optimization_tf.py
@@ -176,7 +176,7 @@ class AdamWeightDecay(Adam):
 with the m and v parameters in strange ways as shown in [Decoupled Weight Decay
 Regularization](https://arxiv.org/abs/1711.05101).

- Instead we want ot decay the weights in a manner that doesn't interact with the m/v parameters. This is equivalent
+ Instead we want to decay the weights in a manner that doesn't interact with the m/v parameters. This is equivalent
 to adding the square of the weights to the loss with plain (non-momentum) SGD.

 Args:
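
For context on the docstring being corrected, here is a minimal NumPy sketch of the equivalence it describes. It is not the code in `optimization_tf.py`: the helper names (`sgd_step_l2`, `sgd_step_decoupled`, `adam_direction`) and the sample values are hypothetical, the penalty is assumed to be `(wd / 2) * sum(w**2)`, and the Adam step is simplified to the first bias-corrected update from zero state.

```python
import numpy as np


def sgd_step_l2(w, grad, lr, wd):
    """Plain SGD on loss + (wd / 2) * sum(w**2): the penalty adds wd * w to the gradient."""
    return w - lr * (grad + wd * w)


def sgd_step_decoupled(w, grad, lr, wd):
    """Plain SGD on the unpenalized loss, then shrink the weights directly."""
    return w - lr * grad - lr * wd * w


def adam_direction(grad, eps=1e-8):
    """First bias-corrected Adam step from zero state: m_hat = grad, v_hat = grad**2."""
    return grad / (np.sqrt(grad ** 2) + eps)


# Illustrative values only (assumptions, not taken from the commit).
w = np.array([1.0, -2.0, 0.5])
grad = np.array([0.1, 0.3, -0.2])
lr, wd = 0.1, 0.01

sgd_l2 = sgd_step_l2(w, grad, lr, wd)
sgd_dec = sgd_step_decoupled(w, grad, lr, wd)

# Adam with the L2 term folded into the gradient: the penalty is rescaled by v.
adam_l2 = w - lr * adam_direction(grad + wd * w)
# Adam with decoupled decay: the decay term bypasses m and v entirely.
adam_dec = w - lr * adam_direction(grad) - lr * wd * w

print(np.allclose(sgd_l2, sgd_dec))    # the two SGD variants coincide
print(np.allclose(adam_l2, adam_dec))  # the two Adam variants do not
```

With plain SGD the two updates are algebraically identical, which is the equivalence the docstring states. Under Adam, the folded-in penalty passes through the m/v normalization, so the two updates diverge.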