Refactor generated doc exports and remove centralized trait tests (#676)
* initial
* clean up reduction_graph.json
* refactor-remove-conflicting files
* update skills and prompts
* relax model test naming to flexible coverage with >= 3 test floor
Replace rigid per-model test function name checklist with flexible
coverage guidance enforced by a minimum of 3 test functions. All three
files (CLAUDE.md, add-model skill, structural reviewer) are now
consistent on the threshold.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* remove conflicting generated files
These files were intentionally removed earlier in this branch;
re-delete after merge conflict resolution.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
-- Model tests: `test_<model>_basic`, `test_<model>_serialization`
+- Model tests: descriptive names — e.g., `test_<model>_creation`, `test_<model>_evaluate_*`, `test_<model>_direction`, `test_<model>_solver`, `test_<model>_serialization`. Use whichever are relevant; there is no fixed per-model naming set.
 - Solver tests: `test_<solver>_<problem>`

 ### Key Testing Patterns
@@ -218,6 +218,8 @@ See Key Patterns above for solver API signatures. Follow the reference files for
 Unit tests in `src/unit_tests/` linked via `#[path]` (see Core Modules above). Integration tests in `tests/suites/`, consolidated through `tests/main.rs`. Canonical example-db coverage lives in `src/unit_tests/example_db.rs`.

+Model review automation checks for a dedicated test file under `src/unit_tests/models/...` with at least 3 test functions. The exact split of coverage is judged per model during review.
+
 ## Documentation Locations
 - `README.md` — Project overview and quickstart
 - `.claude/` — Claude Code instructions and skills
@@ -258,7 +260,7 @@ Also add to the `display-name` dictionary:
]
```
-Every directed reduction in the graph needs its own `reduction-rule` entry. The paper auto-checks completeness against `reduction_graph.json`.
+Every directed reduction in the graph needs its own `reduction-rule` entry. The paper auto-checks completeness against the generated `reduction_graph.json` export.
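The completeness rule in this hunk amounts to a set difference over directed edges. The sketch below illustrates only that logic in Rust; it is not how the paper performs the check, and `missing_rule_entries` plus both input slices are hypothetical stand-ins for the exported graph edges and the documented `reduction-rule` entries.

```rust
use std::collections::BTreeSet;

/// Returns the directed reductions (source, target) that have no documented entry.
/// Direction matters: ("A", "B") does not cover ("B", "A").
fn missing_rule_entries<'a>(
    graph_edges: &[(&'a str, &'a str)],
    documented: &[(&'a str, &'a str)],
) -> Vec<(&'a str, &'a str)> {
    // Collect the documented pairs into a set for fast membership tests.
    let documented: BTreeSet<_> = documented.iter().copied().collect();
    graph_edges
        .iter()
        .copied()
        .filter(|edge| !documented.contains(edge))
        .collect()
}
```

An empty result means every directed reduction has its own entry.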
-- `test_<name>_paper_example` -- **use the same instance from the paper example** (Step 6), verify the issue's expected outcome is valid/optimal and the solution count matches
+Every model needs **at least 3 test functions** (the structural reviewer enforces this). Choose from the coverage areas below — pick whichever are relevant to the model:

-The `test_<name>_paper_example` test is critical for consistency between code and paper. It must:
-1. Construct the exact same instance shown in the paper's example figure
+- **Evaluation** — valid and invalid configs so the feasibility boundary is explicit.
+- **Direction** — verify optimization direction (optimization problems only).
+- **Solver** — brute-force solver finds correct solutions (when the model is small enough).
+- **Serialization** — round-trip serde (when the model is used in CLI/example-db flows).
+- **Paper example** — verify the worked example from the paper entry (see below).
+
+When you add `test_<name>_paper_example`, it should:
+1. Construct the same instance shown in the paper's example figure
 2. Evaluate the solution from the issue's **Expected Outcome** section as shown in the paper and assert it is valid (and optimal for optimization problems)
-3. Use `BruteForce` to find all optimal/satisfying solutions and assert the count matches the paper's claim
+3. Use `BruteForce` to confirm the claimed optimum/satisfying solution count when the instance is small enough for unit tests

-This test should be written **after** Step 6 (paper entry), once the example instance and expected outcome are finalized. If writing tests before the paper, use the issue's Example Instance + Expected Outcome as the source of truth and come back to verify consistency.
+This test is usually written **after** Step 6 (paper entry), once the example instance and expected outcome are finalized. If writing tests before the paper, use the issue's Example Instance + Expected Outcome as the source of truth and come back to verify consistency.

 Link the test file via `#[cfg(test)] #[path = "..."] mod tests;` at the bottom of the model file.
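The three paper-example steps above can be sketched on a toy model. This is an illustration under stated assumptions, not the crate's API: `ToyIndependentSet`, `brute_force_optimal`, and `paper_example` are hypothetical stand-ins (the real `BruteForce` solver is not shown here), and the test module is written inline rather than linked via `#[path]` so the block stays in one file.

```rust
/// Hypothetical toy model (not a crate type): maximum independent set on a small graph.
struct ToyIndependentSet {
    num_vertices: usize,
    edges: Vec<(usize, usize)>,
}

impl ToyIndependentSet {
    /// Returns the set size for a valid configuration, or `None` if the length is
    /// wrong or two chosen vertices share an edge. Edge endpoints are assumed in range.
    fn evaluate(&self, config: &[bool]) -> Option<usize> {
        if config.len() != self.num_vertices {
            return None;
        }
        let conflict = self.edges.iter().any(|&(u, v)| config[u] && config[v]);
        if conflict {
            None
        } else {
            Some(config.iter().filter(|&&chosen| chosen).count())
        }
    }
}

/// Stand-in for a brute-force solver: enumerates all 2^n configurations
/// and keeps every configuration that attains the best value.
fn brute_force_optimal(problem: &ToyIndependentSet) -> Vec<Vec<bool>> {
    let n = problem.num_vertices;
    let mut best = 0;
    let mut optimal = Vec::new();
    for mask in 0..(1u32 << n) {
        let config: Vec<bool> = (0..n).map(|i| (mask >> i) & 1 == 1).collect();
        if let Some(size) = problem.evaluate(&config) {
            if size > best {
                best = size;
                optimal.clear();
            }
            if size == best {
                optimal.push(config);
            }
        }
    }
    optimal
}

/// Step 1: build the example instance once (here a path 0-1-2-3).
fn paper_example() -> ToyIndependentSet {
    ToyIndependentSet { num_vertices: 4, edges: vec![(0, 1), (1, 2), (2, 3)] }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_toy_paper_example() {
        let problem = paper_example();
        // Step 2: the claimed solution is valid and has the claimed value.
        let claimed = [true, false, true, false];
        assert_eq!(problem.evaluate(&claimed), Some(2));
        // Step 3: brute force confirms the optimum and the number of optimal solutions.
        let optimal = brute_force_optimal(&problem);
        assert_eq!(optimal.len(), 3);
        assert!(optimal.iter().all(|c| problem.evaluate(c) == Some(2)));
    }
}
```

Building the instance in one helper keeps the test and any other consumer of the example on the same graph, which is the consistency the paper-example test is meant to protect.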
-## Step 5.5: Add trait_consistency entry
-
-Add the new problem to `src/unit_tests/trait_consistency.rs`:
-
-1. **`test_all_problems_implement_trait_correctly`** — add a `check_problem_trait(...)` call with a small instance
-2. **`test_direction`** (optimization problems only) — add an `assert_eq!(...direction(), Direction::Minimize/Maximize)` entry
-
-This is **required** for every new model — it ensures the Problem trait implementation is well-formed.
-
 ## Step 6: Document in paper

 Write a `problem-def` entry in `docs/paper/reductions.typ`. **Reference example:** search for `problem-def("MaximumIndependentSet")` to see the gold-standard entry — use it as a template.
 | Missing from CLI help table | Must add entry to "Flags by problem type" table in `cli.rs` `after_help` |
 | Schema lists derived fields | Schema should list constructor params, not internal fields (e.g., `matrix, k` not `matrix, m, n, k`) |
 | Missing canonical model example | Add a builder in `src/example_db/model_builders.rs` and keep it aligned with paper/example workflows |
-| Forgetting trait_consistency | Must add entry in `test_all_problems_implement_trait_correctly` (and `test_direction` for optimization) in `src/unit_tests/trait_consistency.rs` |
 | Paper example not tested | Must include `test_<name>_paper_example` that verifies the exact instance, solution, and solution count shown in the paper |
.claude/skills/final-review/SKILL.md (0 additions, 1 deletion)
@@ -184,7 +184,6 @@ Verify the PR includes all required components. Check:
 - [ ] Canonical model example function in the model file
 - [ ] Paper section in `docs/paper/reductions.typ` (`problem-def` entry)
 - [ ] `display-name` entry in paper
-- [ ] `trait_consistency.rs` entry in `src/unit_tests/trait_consistency.rs` (`test_all_problems_implement_trait_correctly`, plus `test_direction` for optimization)
 - [ ] Aliases: if provided, verify they are standard literature abbreviations (not made up); if empty, confirm no well-known abbreviation is missing; check no conflict with existing aliases
.claude/skills/issue-to-pr/SKILL.md (3 additions, 4 deletions)
@@ -97,7 +97,7 @@ The plan MUST reference the appropriate implementation skill and follow its step
 Include the concrete details from the issue (problem definition, reduction algorithm, example, etc.) mapped onto each step.

 **Plan batching:** The paper writing step (add-model Step 6 / add-rule Step 5) MUST be in a **separate batch** from the implementation steps, so it gets its own subagent with fresh context. It depends on the implementation being complete (needs exports). Example batch structure for a `[Model]` plan:
.claude/skills/review-pipeline/SKILL.md (1 addition, 1 deletion)
@@ -329,7 +329,7 @@ Completed: 2/2 | All moved to Final review
 | Guessing on an issue card with multiple linked repo PRs | Stop, show options to the user, and recommend the most likely correct OPEN PR |
 | Picking a PR before Copilot has reviewed | Inspect the checked-out diff and PR body first. If the PR is incomplete, comment and move it back to Ready. If it is review-ready, request Copilot review and switch to another item instead of waiting |