
Commit 0e8add2

Free .jacs earlier to divide by two peak memory
@PierreQuinton this was a big issue that we didn't spot earlier. I don't think the jacobian_matrix can be a view of the concatenated Jacobians, so keeping both the individual matrices and the combined matrix alive at the same time means using double the memory. With this _free_jacs call made much earlier, if the garbage collector is reactive, we should avoid doubling the peak memory usage for no reason. We should check that this PR doesn't introduce a big memory efficiency regression; can't merge without doing that.
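The "cannot be a view" concern can be illustrated with a small NumPy sketch (shapes and names here are illustrative, not taken from the torchjd code, which operates on torch Tensors): concatenation must copy its inputs into fresh storage, so the individual Jacobians and the combined matrix occupy separate memory for as long as both are referenced.

```python
import numpy as np

# Stand-ins for two per-tensor Jacobians with the same number of rows
# (hypothetical shapes, for illustration only).
jacobians = [np.ones((4, 3)), np.ones((4, 5))]

# Concatenation allocates fresh storage: the result cannot be a view of
# its inputs, so inputs + result together use roughly double the memory.
jacobian_matrix = np.concatenate(jacobians, axis=1)

assert jacobian_matrix.shape == (4, 8)
assert not any(np.shares_memory(jacobian_matrix, j) for j in jacobians)

# Dropping the last references to the individual matrices as early as
# possible lets reference counting reclaim them, lowering peak memory.
del jacobians
```

The same reasoning applies to `torch.cat` on contiguous tensors: the output owns new storage, so freeing the per-tensor Jacobians before any further large allocation reduces the window during which both copies are alive.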
1 parent 8a0fb0e commit 0e8add2

1 file changed

Lines changed: 3 additions & 3 deletions

File tree

src/torchjd/autojac/_jac_to_grad.py

```diff
@@ -70,14 +70,14 @@ def jac_to_grad(
     if not all([jacobian.shape[0] == jacobians[0].shape[0] for jacobian in jacobians[1:]]):
         raise ValueError("All Jacobians should have the same number of rows.")
 
+    if not retain_jac:
+        _free_jacs(tensors_)
+
     jacobian_matrix = _unite_jacobians(jacobians)
     gradient_vector = aggregator(jacobian_matrix)
     gradients = _disunite_gradient(gradient_vector, jacobians, tensors_)
     accumulate_grads(tensors_, gradients)
 
-    if not retain_jac:
-        _free_jacs(tensors_)
-
 
 def _unite_jacobians(jacobians: list[Tensor]) -> Tensor:
     jacobian_matrices = [jacobian.reshape(jacobian.shape[0], -1) for jacobian in jacobians]
```
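For context, the reshape-then-concatenate step shown in `_unite_jacobians` can be sketched in NumPy (a hypothetical stand-in; the real function works on torch Tensors): each Jacobian keeps its row count and has its trailing dimensions flattened, and the flattened matrices are joined along the column axis.

```python
import numpy as np

def unite_jacobians(jacobians: list[np.ndarray]) -> np.ndarray:
    """Flatten each Jacobian's trailing dims and join along columns.

    NumPy sketch of the pattern in _unite_jacobians; not the torchjd code.
    """
    matrices = [j.reshape(j.shape[0], -1) for j in jacobians]
    return np.concatenate(matrices, axis=1)

# Two Jacobians sharing 4 rows: one for a (2, 3) parameter, one for a (5,).
a = np.zeros((4, 2, 3))
b = np.zeros((4, 5))
united = unite_jacobians([a, b])
assert united.shape == (4, 11)  # 6 + 5 flattened columns
```

Because the column blocks keep their order, a later step can slice the aggregated gradient vector back into per-tensor gradients, which is what `_disunite_gradient` appears to do with the same `jacobians` list.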
