Commit e93f2d9 (1 parent: 5cf8c1c)

Remove duplicated optimizer.zero_grad() lines

3 files changed; lines changed: 0 additions & 9 deletions
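
The duplicated call being removed is dead code rather than a bug: once gradients have been cleared, a second zero_grad() has nothing left to do. A minimal check, with a throwaway parameter made up for this note:

    import torch

    p = torch.nn.Parameter(torch.ones(2))
    optimizer = torch.optim.SGD([p], lr=0.1)

    p.sum().backward()
    optimizer.zero_grad()  # clears the gradient (sets it to None by default
                           # in recent PyTorch versions)
    optimizer.zero_grad()  # no-op: there is nothing left to clear

This is why the fix touches only the documentation examples and the doc tests that mirror them.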

docs/source/examples/amp.rst (0 additions & 1 deletion)
@@ -53,7 +53,6 @@ following example shows the resulting code for a multi-task learning use-case.
     scaler.step(optimizer)
     scaler.update()
     optimizer.zero_grad()
-    optimizer.zero_grad()

 .. hint::
     Within the ``torch.autocast`` context, some operations may be done in ``float16`` type. For
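
For reference, the surviving call order is the standard torch.amp pattern. A minimal sketch of such a loop, with a placeholder model and synthetic data that are not part of the documented example:

    import torch

    device = "cuda"  # GradScaler only has an effect on CUDA devices
    model = torch.nn.Linear(10, 1).to(device)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    scaler = torch.cuda.amp.GradScaler()

    for _ in range(10):
        x = torch.randn(8, 10, device=device)
        with torch.autocast(device_type=device, dtype=torch.float16):
            loss = model(x).square().mean()  # some ops run in float16 here
        scaler.scale(loss).backward()  # scale the loss to avoid float16 underflow
        scaler.step(optimizer)         # unscale gradients, then step
        scaler.update()                # adapt the scale factor
        optimizer.zero_grad()          # a single call suffices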

docs/source/examples/iwrm.rst (0 additions & 3 deletions)
@@ -69,7 +69,6 @@ batch of data. When minimizing per-instance losses (IWRM), we use either autojac

     optimizer.step()
     optimizer.zero_grad()
-    optimizer.zero_grad()

 In this baseline example, the update may negatively affect the loss of some elements of the
 batch.
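
That claim is easy to verify numerically: when per-sample gradients conflict, a step along the averaged gradient can increase one of the losses. A toy illustration with a made-up one-parameter pair of losses, not taken from the example:

    import torch

    theta = torch.zeros(1, requires_grad=True)
    optimizer = torch.optim.SGD([theta], lr=0.1)

    losses = torch.stack([theta.sum(), -2 * theta.sum()])  # gradients: +1 and -2
    losses.mean().backward()  # averaged gradient: -0.5
    optimizer.step()          # theta moves from 0.0 to +0.05
    optimizer.zero_grad()

    print(theta.item())  # 0.05: the first per-sample loss (= theta) increased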
@@ -105,7 +104,6 @@ batch of data. When minimizing per-instance losses (IWRM), we use either autojac

     optimizer.step()
     optimizer.zero_grad()
-    optimizer.zero_grad()

 Here, we compute the Jacobian of the per-sample losses with respect to the model parameters
 and use it to update the model such that no loss from the batch is (locally) increased.
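
To make "the Jacobian of the per-sample losses" concrete: it is the matrix with one gradient row per sample. A sketch of that computation using torch.func, purely illustrative and not the autojac implementation:

    import torch
    from torch.func import functional_call, grad, vmap

    model = torch.nn.Linear(3, 1)
    params = dict(model.named_parameters())
    x, y = torch.randn(8, 3), torch.randn(8, 1)

    def sample_loss(p, xi, yi):
        pred = functional_call(model, p, (xi.unsqueeze(0),))
        return (pred - yi).square().sum()

    # One gradient per sample; stacked, these are the rows of the Jacobian,
    # so the whole matrix is held in memory at once.
    jac = vmap(grad(sample_loss), in_dims=(None, 0, 0))(params, x, y)

Materializing the full Jacobian is exactly what makes this approach memory-hungry, which motivates the alternative in the next hunk.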
@@ -141,7 +139,6 @@ batch of data. When minimizing per-instance losses (IWRM), we use either autojac
     losses.backward(weights)
     optimizer.step()
     optimizer.zero_grad()
-    optimizer.zero_grad()

 Here, the per-sample gradients are never fully stored in memory, leading to large
 improvements in memory usage and speed compared to autojac, in most practical cases. The
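
The losses.backward(weights) line visible in these diffs is where that saving comes from: for fixed weights w, the update direction J^T w is a vector-Jacobian product, which reverse-mode autodiff computes in one backward pass without ever building J. A sketch with placeholder uniform weights (the real ones are computed upstream in the example):

    import torch

    model = torch.nn.Linear(3, 1)
    x, y = torch.randn(8, 3), torch.randn(8, 1)
    losses = (model(x) - y).square().squeeze(1)  # one loss per sample, shape (8,)

    weights = torch.full((8,), 1 / 8)  # placeholder weights for this sketch
    losses.backward(weights)           # accumulates J^T w without materializing J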

tests/doc/test_rst.py (0 additions & 5 deletions)
@@ -47,7 +47,6 @@ def test_amp():
     scaler.step(optimizer)
     scaler.update()
     optimizer.zero_grad()
-    optimizer.zero_grad()


 def test_basic_usage():
@@ -122,7 +121,6 @@ def test_iwmtl():
     losses.backward(weights)
     optimizer.step()
     optimizer.zero_grad()
-    optimizer.zero_grad()


 def test_iwrm():
@@ -146,7 +144,6 @@ def test_autograd():
         loss.backward()
         optimizer.step()
         optimizer.zero_grad()
-        optimizer.zero_grad()

     def test_autojac():
         import torch
@@ -201,7 +198,6 @@ def test_autogram():
         losses.backward(weights)
         optimizer.step()
         optimizer.zero_grad()
-        optimizer.zero_grad()

     test_autograd()
     test_autojac()
@@ -399,7 +395,6 @@ def test_partial_jd():
     losses.backward(weights)
     optimizer.step()
     optimizer.zero_grad()
-    optimizer.zero_grad()


 def test_rnn():
