@@ -60,24 +60,29 @@ Always do this — it is the step we invariably end up needing. Read the relevan
6060algorithm box in the paper, then read the official and best-known third-party implementations at the
6161exact files/lines from the tracking row.
6262
63- Reconcile any discrepancies between them (a different normalization, an extra factor, an
64- initialization value, a stabilization trick, a sign convention). Decide which version to follow,
65- note ** why** , and surface the disagreement to the user. The implementation should be faithful to a
66- clearly-stated source, not an unexplained blend.
63+ Reconcile any discrepancies between them. The ones that most often bite:
64+
65+ - ** Minimization vs maximization.** TorchJD minimizes losses; much MOO/evolutionary work is written
66+ for maximization, with the minimization form buried in a footnote. Find it, and check the sign of
67+ every reference / ideal-point subtraction.
68+ - ** Normalization.** A direction or weight vector may be normalized (` w / ‖w‖ ` ) in the code but not
69+ the paper, or vice versa.
70+ - ** Dead arguments.** An impl may accept a parameter (e.g. a reference point) yet silently ignore it.
71+ - ** Droppable terms.** An ` abs ` / ` clamp ` / ` max(0, ·) ` in the paper may be unnecessary under the
72+ method's preconditions (e.g. non-negative weights); drop it only with a justification.
73+ - ** Other:** an extra factor, an init value, a stabilization / epsilon trick.
74+
75+ Decide which to follow, note ** why** , and surface the disagreement to the user — the implementation
76+ should be faithful to a clearly-stated source, not an unexplained blend.
6777
6878### Step 4: Settle the interface and design decisions
6979
70- Using the research findings (Step 1) and the comparison (Step 3), decide how each non-standard
71- aspect maps onto TorchJD, reusing the closest existing pattern:
72-
73- - ** Stateful, trainable** parameter(s) → ` nn.Parameter ` + the ` Stateful ` mixin's ` reset() ` (mirror
74- ` UW ` / ` IMTL-L ` ).
75- - ** Stateful, non-trainable** history/buffer → a registered buffer + an explicit update method (e.g.
76- ` step() ` ), with no ` nn.Parameter ` (mirror ` DWA ` ).
77- - ** Internal optimizer / multi-call update protocol** (mirror ` FAMO ` ).
78- - ** Preference / reference vector** argument (mirror ` STCH ` / ` COSMOS ` / ` PBI ` ).
79- - ** Preconditions** (e.g. positivity): decide whether to enforce them (raise ` ValueError ` ) or only
80- document them, and how ` nan ` /` inf ` should propagate.
80+ Using the research findings (Step 1) and the comparison (Step 3), map each non-standard aspect onto
81+ the closest existing pattern from the reference loaded in Step 2 (statefulness, trainable parameters,
82+ an internal optimizer, a preference/reference vector, ...). Then settle, for any method type:
83+
84+ - ** Preconditions** (e.g. positivity): enforce them (raise ` ValueError ` ) or only document them, and
85+ how ` nan ` /` inf ` should propagate.
8186- Which constructor arguments are ** required vs optional** , and their ** defaults** .
8287
8388List the non-standard parts and your proposed handling, and ** confirm the design with the user
@@ -91,15 +96,34 @@ license header to the source file and an entry to `NOTICES` (see the reference).
9196
9297### Step 6: Verify
9398
94- Run the checks listed in the reference (unit tests with ` -W error ` , lint, type-check, and the docs
99+ Run the checks listed in the reference (unit tests with ` -W error ` , lint, and the docs
95100build/doctest). GPU tests require a CUDA device; if you cannot run them, provide the exact commands
96101for the user to run on their GPU and report back the results.
97102
98- ### Step 7: Open the PR
103+ ### Step 7: Self-review the code you produced
104+
105+ Before opening anything, re-read your own diff against the requirements and improve what can be
106+ improved. Check that:
107+
108+ - The class follows the closest existing method's conventions (the reference's checklist): correct
109+ base class(es), ` forward(self, values, /) ` returning a 0-dim scalar, shape validation, ` reset() `
110+ for stateful methods, a correct ` __repr__ ` , and the docstring conventions (` r""" ` only with LaTeX,
111+ ` :class: ` cross-ref, ` .. math:: ` + bullet list, a usage doctest, ` :param: ` for each argument).
112+ - The design decisions settled in Step 4 are actually reflected in the code, and any discrepancy
113+ between the paper and the existing implementations (Step 3) is resolved deliberately, with a
114+ comment or docstring note where it is non-obvious.
115+ - The tests cover the documented edge cases and contracts, not just the happy path.
116+ - All six files are present and consistent (class, ` __init__.py ` , ` .rst ` , toctree, test,
117+ ` CHANGELOG.md ` ), plus ` NOTICES ` + a license header if you adapted code.
118+
119+ Apply the fixes you find, then re-run the relevant checks from Step 6.
120+
121+ ### Step 8: Open a draft PR
99122
100- Create a new branch, commit, and open a pull request targeting ` main ` , following the repository's
101- PR conventions (a ` CHANGELOG.md ` entry under ` [Unreleased] > ### Added ` ; when asked for a PR
102- description, output raw GitHub-flavored markdown in a fenced code block, with GitHub math syntax
103- ` $...$ ` / ` $$...$$ ` and no em dashes). Return the PR URL when done.
123+ Create a new branch, commit, and open a ** draft** pull request targeting ` main ` , following the
124+ repository's PR conventions (a ` CHANGELOG.md ` entry under ` [Unreleased] > ### Added ` ; when asked for
125+ a PR description, output raw GitHub-flavored markdown in a fenced code block, with GitHub math syntax
126+ ` $...$ ` / ` $$...$$ ` and no em dashes). Keep it a draft so the contributor can read the code
127+ themselves before requesting maintainer review. Return the PR URL when done.
104128
105129---
0 commit comments