
Commit 0e8add2

Free .jacs earlier to divide by two peak memory
@PierreQuinton this was a big issue that we didn't spot earlier. I don't think the jacobian_matrix can be a view of the concatenated Jacobians, so keeping both the individual matrices and the combined matrix alive at the same time means using double the memory. With this _free_jacs call made much earlier, if the garbage collector is reactive, we should avoid doubling the peak memory usage for no reason. We should check that this PR doesn't introduce a big memory efficiency regression; can't merge without doing that.
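The "cannot be a view" concern can be illustrated with a small NumPy sketch (shapes and names here are illustrative, not taken from the torchjd code, which operates on torch Tensors): concatenation must copy its inputs into fresh storage, so the individual Jacobians and the combined matrix occupy separate memory for as long as both are referenced.

```python
import numpy as np

# Stand-ins for two per-tensor Jacobians with the same number of rows
# (hypothetical shapes, for illustration only).
jacobians = [np.ones((4, 3)), np.ones((4, 5))]

# Concatenation allocates fresh storage: the result cannot be a view of
# its inputs, so inputs + result together use roughly double the memory.
jacobian_matrix = np.concatenate(jacobians, axis=1)

assert jacobian_matrix.shape == (4, 8)
assert not any(np.shares_memory(jacobian_matrix, j) for j in jacobians)

# Dropping the last references to the individual matrices as early as
# possible lets reference counting reclaim them, lowering peak memory.
del jacobians
```

The same reasoning applies to `torch.cat` on contiguous tensors: the output owns new storage, so freeing the per-tensor Jacobians before any further large allocation reduces the window during which both copies are alive.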
1 parent 8a0fb0e commit 0e8add2

1 file changed

Lines changed: 3 additions & 3 deletions

File tree

src/torchjd/autojac/_jac_to_grad.py

```diff
@@ -70,14 +70,14 @@ def jac_to_grad(
     if not all([jacobian.shape[0] == jacobians[0].shape[0] for jacobian in jacobians[1:]]):
         raise ValueError("All Jacobians should have the same number of rows.")
 
+    if not retain_jac:
+        _free_jacs(tensors_)
+
     jacobian_matrix = _unite_jacobians(jacobians)
     gradient_vector = aggregator(jacobian_matrix)
     gradients = _disunite_gradient(gradient_vector, jacobians, tensors_)
     accumulate_grads(tensors_, gradients)
 
-    if not retain_jac:
-        _free_jacs(tensors_)
-
 
 def _unite_jacobians(jacobians: list[Tensor]) -> Tensor:
     jacobian_matrices = [jacobian.reshape(jacobian.shape[0], -1) for jacobian in jacobians]
```
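For context, the reshape-then-concatenate step shown in `_unite_jacobians` can be sketched in NumPy (a hypothetical stand-in; the real function works on torch Tensors): each Jacobian keeps its row count and has its trailing dimensions flattened, and the flattened matrices are joined along the column axis.

```python
import numpy as np

def unite_jacobians(jacobians: list[np.ndarray]) -> np.ndarray:
    """Flatten each Jacobian's trailing dims and join along columns.

    NumPy sketch of the pattern in _unite_jacobians; not the torchjd code.
    """
    matrices = [j.reshape(j.shape[0], -1) for j in jacobians]
    return np.concatenate(matrices, axis=1)

# Two Jacobians sharing 4 rows: one for a (2, 3) parameter, one for a (5,).
a = np.zeros((4, 2, 3))
b = np.zeros((4, 5))
united = unite_jacobians([a, b])
assert united.shape == (4, 11)  # 6 + 5 flattened columns
```

Because the column blocks keep their order, a later step can slice the aggregated gradient vector back into per-tensor gradients, which is what `_disunite_gradient` appears to do with the same `jacobians` list.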
