Commit e42d677

fix(aggregation): Fix type mismatch bug in NashMTL (#317)
* Move the cast of `alpha` to `torch.Tensor` out of the condition on the step value
* Add changelog entry

Notes:
* `self.prvs_alpha` is always a NumPy array, so in both cases (`if (self.step % self.update_weights_every) == 0:` and `else`), `alpha` has to be cast to a Tensor.
* In the original implementation (https://github.com/AvivNavon/nash-mtl/blob/main/methods/weight_methods.py#L238), there was already a type mismatch: `alpha` was a tensor when entering the condition `if (self.step % self.update_weights_every) == 0`, and a NumPy array otherwise. However, the following line (`weighted_loss = sum([losses[i] * alpha[i] for i in range(len(alpha))])`) made it work regardless.
1 parent 19a4ed4 commit e42d677
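The bug and its fix can be illustrated with a minimal, self-contained sketch. The class, the stand-in solver, and the update logic below are simplified assumptions for illustration, not the actual `torchjd` implementation:

```python
import numpy as np
import torch


class Weighting:
    """Simplified stand-in for the NashMTL weighting logic (not the real class)."""

    def __init__(self, n_tasks: int, update_weights_every: int = 2):
        self.step = 0
        self.update_weights_every = update_weights_every
        # prvs_alpha is always stored as a NumPy array between steps.
        self.prvs_alpha = np.ones(n_tasks) / n_tasks

    def _solve(self) -> np.ndarray:
        # Stand-in for the optimization solver, which also returns a NumPy array.
        return self.prvs_alpha

    def forward(self, matrix: torch.Tensor) -> torch.Tensor:
        if self.step % self.update_weights_every == 0:
            alpha = self._solve()  # NumPy array from the solver
        else:
            alpha = self.prvs_alpha  # NumPy array reused from the previous step
        self.step += 1  # simplified: incremented unconditionally in this sketch
        # The fix: cast after BOTH branches. When the cast lived only inside the
        # `if` branch, `alpha` stayed a NumPy array on reuse steps, so the
        # tensor operations below failed with a type mismatch.
        alpha = torch.from_numpy(alpha).to(device=matrix.device, dtype=matrix.dtype)
        self.prvs_alpha = alpha.detach().cpu().numpy()
        return alpha @ matrix


w = Weighting(n_tasks=2, update_weights_every=2)
m = torch.ones(2, 3)
out1 = w.forward(m)  # solver step
out2 = w.forward(m)  # reuse step: this is the case that failed before the fix
```

With `update_weights_every=1`, the `else` branch is never taken, which is why the bug only surfaced for values greater than 1.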

File tree

2 files changed: +4 -1 lines changed


CHANGELOG.md

Lines changed: 2 additions & 0 deletions
@@ -35,6 +35,8 @@ changes that do not affect the user.
 - Removed arbitrary exception handling in `IMTLG` and `AlignedMTL` when the computation fails. In
   practice, this fix should only affect some matrices with extremely large values, which should
   not usually happen.
+- Fixed a bug in `NashMTL` that made it fail (due to a type mismatch) when `update_weights_every`
+  was more than 1.
 
 ## [0.5.0] - 2025-02-01

src/torchjd/aggregation/nash_mtl.py

Lines changed: 2 additions & 1 deletion
@@ -197,11 +197,12 @@ def forward(self, matrix: Tensor) -> Tensor:
             self.normalization_factor = torch.norm(GTG).detach().cpu().numpy().reshape((1,))
             GTG = GTG / self.normalization_factor.item()
             alpha = self._solve_optimization(GTG.cpu().detach().numpy())
-            alpha = torch.from_numpy(alpha).to(device=matrix.device, dtype=matrix.dtype)
         else:
             self.step += 1
             alpha = self.prvs_alpha
 
+        alpha = torch.from_numpy(alpha).to(device=matrix.device, dtype=matrix.dtype)
+
         if self.max_norm > 0:
             norm = torch.linalg.norm(alpha @ matrix)
             if norm > self.max_norm:
