
Commit 33bce01

Merge branch 'main' into add-autojac-jac

2 parents: e666c49 + 2c2a58f


81 files changed: +1600 −879 lines

.github/workflows/check-todos.yml

Lines changed: 34 additions & 0 deletions (new file)

```yaml
name: "Check for TODOs"

on:
  pull_request:

jobs:
  check-todos:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Scan for TODO strings
        run: |
          echo "Scanning codebase for TODOs..."

          git grep -nE "TODO" -- . ':(exclude).github/workflows/*' > todos_found.txt || true

          if [ -s todos_found.txt ]; then
            echo "❌ ERROR: Found TODOs in the following files:"
            echo "-------------------------------------------"

            while IFS=: read -r file line content; do
              echo "::error file=$file,line=$line::TODO found at $file:$line - must be resolved before merge:%0A$content"
            done < todos_found.txt

            echo "-------------------------------------------"
            echo "Please resolve these TODOs or track them in an issue before merging."

            exit 1
          else
            echo "✅ No TODOs found. Codebase is clean!"
            exit 0
          fi
```
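The scan-and-annotate pattern used in this workflow can be exercised locally. Below is a minimal sketch: it uses plain `grep -rn` over a temporary directory instead of `git grep` in a checkout, and `annotations.txt` is a hypothetical output file used here only to capture the generated annotations.

```shell
# Sketch of the TODO-scan step: find TODOs, then emit one GitHub Actions
# "::error" annotation per match, with file and line parsed from grep output.
dir=$(mktemp -d)
mkdir "$dir/src"   # scan only src/, so our own output files are excluded,
                   # analogous to the workflow excluding .github/workflows/*
printf 'x = 1  # TODO: rename this\nok = 2\n' > "$dir/src/sample.py"

grep -rnE "TODO" "$dir/src" > "$dir/todos_found.txt" || true

if [ -s "$dir/todos_found.txt" ]; then
  # grep -rn output is "file:line:content"; split on ':' like the workflow does.
  while IFS=: read -r file line content; do
    echo "::error file=$file,line=$line::TODO found at $file:$line"
  done < "$dir/todos_found.txt" > "$dir/annotations.txt"
fi
```

Each emitted `::error` line is a GitHub Actions workflow command, which the runner turns into an inline annotation on the offending file and line.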

.github/workflows/claude.yml

Lines changed: 49 additions & 0 deletions (new file)

```yaml
name: Claude Code

on:
  issue_comment:
    types: [created]
  pull_request_review_comment:
    types: [created]
  issues:
    types: [opened, assigned]
  pull_request_review:
    types: [submitted]

jobs:
  claude:
    if: |
      (github.event_name == 'issue_comment' && contains(github.event.comment.body, '@claude')) ||
      (github.event_name == 'pull_request_review_comment' && contains(github.event.comment.body, '@claude')) ||
      (github.event_name == 'pull_request_review' && contains(github.event.review.body, '@claude')) ||
      (github.event_name == 'issues' && (contains(github.event.issue.body, '@claude') || contains(github.event.issue.title, '@claude')))
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: read
      issues: read
      id-token: write
      actions: read # Required for Claude to read CI results on PRs
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
        with:
          fetch-depth: 1

      - name: Run Claude Code
        id: claude
        uses: anthropics/claude-code-action@v1
        with:
          claude_code_oauth_token: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}

          # This is an optional setting that allows Claude to read CI results on PRs
          additional_permissions: |
            actions: read

          # Optional: Give a custom prompt to Claude. If this is not specified, Claude will
          # perform the instructions specified in the comment that tagged it.
          # prompt: 'Update the pull request description to include a summary of changes.'

          # Optional: Add claude_args to customize behavior and configuration
          # See https://github.com/anthropics/claude-code-action/blob/main/docs/usage.md
          # or https://code.claude.com/docs/en/cli-reference for available options
          # claude_args: '--allowed-tools Bash(gh pr:*)'
```
.github/workflows/tests.yml

Lines changed: 1 addition & 2 deletions

```diff
@@ -52,8 +52,7 @@ jobs:
        env:
          PYTEST_TORCH_DTYPE: ${{ matrix.dtype || 'float32' }}

-      - &upload-codecov
-        name: Upload results to Codecov
+      - name: Upload results to Codecov
        uses: codecov/codecov-action@v4
        with:
          token: ${{ secrets.CODECOV_TOKEN }}
```

.gitignore

Lines changed: 3 additions & 0 deletions

```diff
@@ -1,3 +1,6 @@
+# Profiling results
+traces/
+
 # uv
 uv.lock

```

.pre-commit-config.yaml

Lines changed: 2 additions & 2 deletions

```diff
@@ -18,7 +18,7 @@ repos:
        ]

  - repo: https://github.com/pycqa/isort
-    rev: 6.1.0
+    rev: 7.0.0
    hooks:
      - id: isort # Sort imports.
        args: [
@@ -31,7 +31,7 @@ repos:
        ]

  - repo: https://github.com/psf/black-pre-commit-mirror
-    rev: 25.9.0
+    rev: 25.12.0
    hooks:
      - id: black # Format code.
        args: [--line-length=100]
```

CHANGELOG.md

Lines changed: 43 additions & 3 deletions

````diff
@@ -3,8 +3,8 @@
 All notable changes to this project will be documented in this file.

 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
-and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html). This changelog does not include internal
-changes that do not affect the user.
+and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html). This
+changelog does not include internal changes that do not affect the user.

 ## [Unreleased]

@@ -13,7 +13,47 @@ changes that do not affect the user.
 - Added the function `torchjd.autojac.jac` to compute the Jacobian of some outputs with respect to
   some inputs, without doing any aggregation. Its interface is very similar to
   `torch.autograd.grad`.
-- Added `__all__` in the `__init__.py` of packages. This should prevent PyLance from triggering warnings when importing from `torchjd`.
+- Added a `scale_mode` parameter to `AlignedMTL` and `AlignedMTLWeighting`, allowing users to
+  choose between `"min"`, `"median"`, and `"rmse"` scaling.
+
+### Changed
+
+- **BREAKING**: Removed from `backward` and `mtl_backward` the responsibility to aggregate the
+  Jacobian. Now, these functions compute and populate the `.jac` fields of the parameters, and a new
+  function `torchjd.autojac.jac_to_grad` should then be called to aggregate those `.jac` fields into
+  `.grad` fields.
+  This means that users now have more control over what they do with the Jacobians (they can easily
+  aggregate them group by group or even param by param if they want), but it now requires an extra
+  line of code to do the Jacobian descent step. To update, please change:
+  ```python
+  backward(losses, aggregator)
+  ```
+  to
+  ```python
+  backward(losses)
+  jac_to_grad(model.parameters(), aggregator)
+  ```
+  and
+  ```python
+  mtl_backward(losses, features, aggregator)
+  ```
+  to
+  ```python
+  mtl_backward(losses, features)
+  jac_to_grad(shared_module.parameters(), aggregator)
+  ```
+
+- Removed an unnecessary memory duplication. This should significantly improve the memory efficiency
+  of `autojac`.
+- Removed an unnecessary internal cloning of gradients. This should slightly improve the memory
+  efficiency of `autojac`.
+
+## [0.8.1] - 2026-01-07
+
+### Added
+
+- Added `__all__` in the `__init__.py` of packages. This should prevent PyLance from triggering
+  warnings when importing from `torchjd`.

 ## [0.8.0] - 2025-11-13

````
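The two-step split introduced by this breaking change (store Jacobians, then aggregate them into gradients) can be illustrated with a small dependency-free sketch. The names `Param` and `mean_rows` are hypothetical stand-ins for torch parameters and a torchjd `Aggregator` such as `UPGrad`; this is not the library's implementation.

```python
# Dependency-free sketch of the new two-step API: a "backward"-like step
# stores the full Jacobian (one gradient row per loss) in .jac, and a
# "jac_to_grad"-like step aggregates those rows into .grad.
class Param:
    def __init__(self):
        self.jac = None   # filled by backward()
        self.grad = None  # filled by jac_to_grad()

def backward(param, per_loss_grads):
    # Store the Jacobian instead of summing gradients immediately.
    param.jac = [list(row) for row in per_loss_grads]

def mean_rows(jac):
    # Placeholder aggregator: average the per-loss gradients column-wise.
    return [sum(col) / len(jac) for col in zip(*jac)]

def jac_to_grad(params, aggregate):
    for p in params:
        p.grad = aggregate(p.jac)
        p.jac = None  # free the Jacobian once it has been aggregated

p = Param()
backward(p, per_loss_grads=[[1.0, 2.0], [3.0, 4.0]])  # 2 losses, 2 coordinates
jac_to_grad([p], mean_rows)
print(p.grad)  # [2.0, 3.0]
```

Because aggregation is now a separate call, different parameter groups could be passed to `jac_to_grad` with different aggregators, which is exactly the extra control the changelog entry describes.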

README.md

Lines changed: 3 additions & 3 deletions

````diff
@@ -111,11 +111,11 @@ using [UPGrad](https://torchjd.org/stable/docs/aggregation/upgrad/).
     loss1 = loss_fn(output1, target1)
     loss2 = loss_fn(output2, target2)

-    optimizer.zero_grad()
     - loss = loss1 + loss2
     - loss.backward()
     + mtl_backward(losses=[loss1, loss2], features=features, aggregator=aggregator)
     optimizer.step()
+    optimizer.zero_grad()
 ```

 > [!NOTE]
@@ -150,12 +150,12 @@ Jacobian descent using [UPGrad](https://torchjd.org/stable/docs/aggregation/upgr
     - loss = loss_fn(output, target) # shape [1]
     + losses = loss_fn(output, target) # shape [16]

-    optimizer.zero_grad()
     - loss.backward()
     + gramian = engine.compute_gramian(losses) # shape: [16, 16]
     + weights = weighting(gramian) # shape: [16]
     + losses.backward(weights)
     optimizer.step()
+    optimizer.zero_grad()
 ```

 Lastly, you can even combine the two approaches by considering multiple tasks and each element of
@@ -201,10 +201,10 @@ for input, target1, target2 in zip(inputs, task1_targets, task2_targets):
     # Obtain the weights that lead to no conflict between reweighted gradients
     weights = weighting(gramian) # shape: [16, 2]

-    optimizer.zero_grad()
     # Do the standard backward pass, but weighted using the obtained weights
     losses.backward(weights)
     optimizer.step()
+    optimizer.zero_grad()
 ```

 > [!NOTE]
````
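The Gramian-based path shown in the README (compute the Gramian of per-loss gradients, derive weights from it, then do a weighted backward pass) can be sketched without torch. Here `uniform_weights` is a deliberately naive placeholder, not the UPGrad weighting, and `combine` plays the role of `losses.backward(weights)`:

```python
# Sketch of: per-loss gradients -> Gramian of pairwise dot products ->
# weights -> weighted combination of the gradients.
def gramian(grads):
    # G[i][j] = <grad_i, grad_j>
    return [[sum(a * b for a, b in zip(gi, gj)) for gj in grads] for gi in grads]

def uniform_weights(g):
    # Placeholder weighting: equal weight for every loss. A real torchjd
    # Weighting would instead solve for non-conflicting weights from g.
    return [1.0 / len(g)] * len(g)

def combine(grads, weights):
    # Weighted sum of the per-loss gradients, one value per coordinate.
    return [sum(wi * gi[k] for wi, gi in zip(weights, grads))
            for k in range(len(grads[0]))]

grads = [[1.0, 0.0], [0.0, 1.0]]  # two orthogonal per-loss gradients
G = gramian(grads)
w = uniform_weights(G)
print(G)                  # [[1.0, 0.0], [0.0, 1.0]]
print(combine(grads, w))  # [0.5, 0.5]
```

The Gramian is all the weighting needs, which is why the README's engine computes a `[16, 16]` Gramian rather than materializing the full Jacobian.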

docs/source/docs/autojac/index.rst

Lines changed: 1 addition & 0 deletions

```diff
@@ -11,3 +11,4 @@ autojac
    backward.rst
    mtl_backward.rst
    jac.rst
+   jac_to_grad.rst
```

Lines changed: 6 additions & 0 deletions (new file)

```rst
:hide-toc:

jac_to_grad
===========

.. autofunction:: torchjd.autojac.jac_to_grad
```

docs/source/examples/amp.rst

Lines changed: 5 additions & 4 deletions

```diff
@@ -12,15 +12,15 @@ case, the losses) should preferably be scaled with a `GradScaler
 following example shows the resulting code for a multi-task learning use-case.

 .. code-block:: python
-    :emphasize-lines: 2, 17, 27, 34, 36-38
+    :emphasize-lines: 2, 17, 27, 34-35, 37-38

     import torch
     from torch.amp import GradScaler
     from torch.nn import Linear, MSELoss, ReLU, Sequential
     from torch.optim import SGD

     from torchjd.aggregation import UPGrad
-    from torchjd.autojac import mtl_backward
+    from torchjd.autojac import mtl_backward, jac_to_grad

     shared_module = Sequential(Linear(10, 5), ReLU(), Linear(5, 3), ReLU())
     task1_module = Linear(3, 1)
@@ -48,10 +48,11 @@ following example shows the resulting code for a multi-task learning use-case.
         loss2 = loss_fn(output2, target2)

         scaled_losses = scaler.scale([loss1, loss2])
-        optimizer.zero_grad()
-        mtl_backward(losses=scaled_losses, features=features, aggregator=aggregator)
+        mtl_backward(losses=scaled_losses, features=features)
+        jac_to_grad(shared_module.parameters(), aggregator)
         scaler.step(optimizer)
         scaler.update()
+        optimizer.zero_grad()

 .. hint::
    Within the ``torch.autocast`` context, some operations may be done in ``float16`` type. For
```
