Skip to content

Commit fbf0d8e

Browse files
committed
minor fixes
1 parent 7fd5107 commit fbf0d8e

2 files changed

Lines changed: 48 additions & 24 deletions

File tree

skills/implement-method/SKILL.md

Lines changed: 45 additions & 21 deletions
Original file line numberDiff line numberDiff line change
@@ -60,24 +60,29 @@ Always do this — it is the step we invariably end up needing. Read the relevan
6060
algorithm box in the paper, then read the official and best-known third-party implementations at the
6161
exact files/lines from the tracking row.
6262

63-
Reconcile any discrepancies between them (a different normalization, an extra factor, an
64-
initialization value, a stabilization trick, a sign convention). Decide which version to follow,
65-
note **why**, and surface the disagreement to the user. The implementation should be faithful to a
66-
clearly-stated source, not an unexplained blend.
63+
Reconcile any discrepancies between them. The ones that most often bite:
64+
65+
- **Minimization vs maximization.** TorchJD minimizes losses; much MOO/evolutionary work is written
66+
for maximization, with the minimization form buried in a footnote. Find it, and check the sign of
67+
every reference / ideal-point subtraction.
68+
- **Normalization.** A direction or weight vector may be normalized (`w / ‖w‖`) in the code but not
69+
the paper, or vice versa.
70+
- **Dead arguments.** An impl may accept a parameter (e.g. a reference point) yet silently ignore it.
71+
- **Droppable terms.** An `abs` / `clamp` / `max(0, ·)` in the paper may be unnecessary under the
72+
method's preconditions (e.g. non-negative weights); drop it only with a justification.
73+
- **Other:** an extra factor, an init value, a stabilization / epsilon trick.
74+
75+
Decide which to follow, note **why**, and surface the disagreement to the user — the implementation
76+
should be faithful to a clearly-stated source, not an unexplained blend.
6777

6878
### Step 4: Settle the interface and design decisions
6979

70-
Using the research findings (Step 1) and the comparison (Step 3), decide how each non-standard
71-
aspect maps onto TorchJD, reusing the closest existing pattern:
72-
73-
- **Stateful, trainable** parameter(s) → `nn.Parameter` + the `Stateful` mixin's `reset()` (mirror
74-
`UW` / `IMTL-L`).
75-
- **Stateful, non-trainable** history/buffer → a registered buffer + an explicit update method (e.g.
76-
`step()`), with no `nn.Parameter` (mirror `DWA`).
77-
- **Internal optimizer / multi-call update protocol** (mirror `FAMO`).
78-
- **Preference / reference vector** argument (mirror `STCH` / `COSMOS` / `PBI`).
79-
- **Preconditions** (e.g. positivity): decide whether to enforce them (raise `ValueError`) or only
80-
document them, and how `nan`/`inf` should propagate.
80+
Using the research findings (Step 1) and the comparison (Step 3), map each non-standard aspect onto
81+
the closest existing pattern from the reference loaded in Step 2 (statefulness, trainable parameters,
82+
an internal optimizer, a preference/reference vector, ...). Then settle, for any method type:
83+
84+
- **Preconditions** (e.g. positivity): enforce them (raise `ValueError`) or only document them, and
85+
how `nan`/`inf` should propagate.
8186
- Which constructor arguments are **required vs optional**, and their **defaults**.
8287

8388
List the non-standard parts and your proposed handling, and **confirm the design with the user
@@ -91,15 +96,34 @@ license header to the source file and an entry to `NOTICES` (see the reference).
9196

9297
### Step 6: Verify
9398

94-
Run the checks listed in the reference (unit tests with `-W error`, lint, type-check, and the docs
99+
Run the checks listed in the reference (unit tests with `-W error`, lint, and the docs
95100
build/doctest). GPU tests require a CUDA device; if you cannot run them, provide the exact commands
96101
for the user to run on their GPU and report back the results.
97102

98-
### Step 7: Open the PR
103+
### Step 7: Self-review the code you produced
104+
105+
Before opening anything, re-read your own diff against the requirements and improve what can be
106+
improved. Check that:
107+
108+
- The class follows the closest existing method's conventions (the reference's checklist): correct
109+
base class(es), `forward(self, values, /)` returning a 0-dim scalar, shape validation, `reset()`
110+
for stateful methods, a correct `__repr__`, and the docstring conventions (`r"""` only with LaTeX,
111+
`:class:` cross-ref, `.. math::` + bullet list, a usage doctest, `:param:` for each argument).
112+
- The design decisions settled in Step 4 are actually reflected in the code, and any discrepancy
113+
between the paper and the existing implementations (Step 3) is resolved deliberately, with a
114+
comment or docstring note where it is non-obvious.
115+
- The tests cover the documented edge cases and contracts, not just the happy path.
116+
- All six files are present and consistent (class, `__init__.py`, `.rst`, toctree, test,
117+
`CHANGELOG.md`), plus `NOTICES` + a license header if you adapted code.
118+
119+
Apply the fixes you find, then re-run the relevant checks from Step 6.
120+
121+
### Step 8: Open a draft PR
99122

100-
Create a new branch, commit, and open a pull request targeting `main`, following the repository's
101-
PR conventions (a `CHANGELOG.md` entry under `[Unreleased] > ### Added`; when asked for a PR
102-
description, output raw GitHub-flavored markdown in a fenced code block, with GitHub math syntax
103-
`$...$` / `$$...$$` and no em dashes). Return the PR URL when done.
123+
Create a new branch, commit, and open a **draft** pull request targeting `main`, following the
124+
repository's PR conventions (a `CHANGELOG.md` entry under `[Unreleased] > ### Added`; when asked for
125+
a PR description, output raw GitHub-flavored markdown in a fenced code block, with GitHub math syntax
126+
`$...$` / `$$...$$` and no em dashes). Keep it a draft so the contributor can read the code
127+
themselves before requesting maintainer review. Return the PR URL when done.
104128

105129
---

skills/implement-method/references/scalarizers.md

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -103,6 +103,6 @@ PYTEST_TORCH_DEVICE=cuda:0 uv run pytest tests/unit -W error # GPU (needs CUDA)
103103

104104
- If `uv run` re-syncs unexpectedly, prefix with `UV_NO_SYNC=1`. Docs build is strict (`-W -n`), so
105105
an `.rst` title underline must match its title length.
106-
- `test_dualproj.py::test_permutation_invariant` and `test_upgrad.py::test_permutation_invariant`
107-
are known flaky off-Linux (~1 float32 ULP, quadprog), pre-existing and unrelated. CI (Linux) is
108-
the source of truth.
106+
- Treat CI as the source of truth. A pre-existing test unrelated to your change can fail by
107+
a tiny float tolerance on other platforms; confirm your new tests pass and that nothing you
108+
touched regressed, rather than chasing it.

0 commit comments

Comments
 (0)