Skip to content

Commit 24d7b80

Browse files
isPANNclaude
andcommitted
Revert review-structural Step 4b; let auto-pipeline Step 3 cover it
Step 4b mixed three different concerns into review-structural (a read-only structural skill): - 4b-2 ran `cargo test --exact <closed_loop>` for the new rule — redundant with auto-pipeline Step 3's workspace-wide `make check`. - 4b-3 read the test source to score it against four criteria — that's a test-quality judgment, not a structural check; review-quality is the right home for that and it already covers test quality. - 4b-4 ran `pred solve` end-to-end via the new rule — overlaps with the agentic feature tests review-pipeline already runs. - 4b-1 (grep for the closed_loop test) is subsumed by Step 4's existing checklist of required rule artefacts. Restoring review-structural to its main version. Also removing the Phase 4 prompt's mandatory-Step-4b clause from auto-pipeline. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
1 parent 69d7f0c commit 24d7b80

2 files changed

Lines changed: 0 additions & 58 deletions

File tree

.claude/skills/auto-pipeline/SKILL.md

Lines changed: 0 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -373,9 +373,6 @@ Run /review-pipeline on PR #<PR>. Follow
373373
.claude/skills/review-pipeline/SKILL.md exactly; it always moves the
374374
PR to Final review at the end.
375375
376-
For PRs adding a reduction rule, review-structural Step 4b (round-trip
377-
execution) is mandatory — see that skill for the four criteria.
378-
379376
Return ONLY this JSON shape:
380377
{
381378
"outcome": "success" | "failure",

.claude/skills/review-structural/SKILL.md

Lines changed: 0 additions & 55 deletions
Original file line numberDiff line numberDiff line change
@@ -118,55 +118,6 @@ Report pass/fail. If tests fail, identify which tests. **Do NOT fix anything**
118118
3. **Example quality** — Is it tutorial-style? Does the JSON export include both source and target data?
119119
4. **Paper quality** — Is the reduction-rule statement precise? Is the proof sketch sound?
120120

121-
## Step 4b: Round-trip Execution (Rule reviews only) — MANDATORY
122-
123-
Run the rule **end to end** on its canonical example(s), not just confirm a `closed_loop` test exists. The goal is to catch reductions that compile but silently lose information or drop corner cases.
124-
125-
### 4b-1: Locate the test
126-
127-
```bash
128-
# The closed-loop test for this rule. The reference helper is:
129-
# crate::rules::test_helpers::assert_optimization_round_trip_from_optimization_target
130-
# Find every test in the new test file that exercises the round-trip:
131-
grep -nE "fn test_.*(closed_loop|round_trip|jl_parity)" src/unit_tests/rules/{R}.rs
132-
```
133-
134-
### 4b-2: Run it by name and confirm it actually executes
135-
136-
```bash
137-
cargo test --lib --no-fail-fast -- --exact test_{rule_module}::test_{...}_closed_loop
138-
```
139-
140-
- Confirm the binary prints `test result: ok. 1 passed` for **each** named test.
141-
- "0 passed; 0 failed; 0 ignored" means the test name was wrong → flag as ISSUE.
142-
- If the closed-loop test is `#[ignore]`'d → FAIL.
143-
144-
### 4b-3: Confirm the round-trip is real, not a type check
145-
146-
Read each closed-loop test and verify it does ALL FOUR of the following:
147-
148-
| # | What the test must do | Red flag |
149-
|---|---|---|
150-
| 1 | Build a concrete source instance with **multiple feasible solutions of different objective values** | Trivial instance (single edge, single clause, empty graph) — anything where the answer is unique by inspection |
151-
| 2 | Call `ReduceTo::<Target>::reduce_to(&source)` to produce the target | Test only asserts on `target_problem()` fields without actually solving |
152-
| 3 | Solve the target (BruteForce / ILP / appropriate solver) AND solve the source independently | Test compares only target value, never source value |
153-
| 4 | Use `extract_solution` (or the appropriate helper, e.g. `assert_optimization_round_trip_from_optimization_target`) to map the target witness back, then assert the extracted source configuration is **optimal** for the source | Test asserts only that `extract_solution` returns `Some(_)` or has the right length |
154-
155-
If any of (1)-(4) is missing → ISSUE (test does not verify the round-trip).
156-
157-
### 4b-4: Spot-check on a fresh instance (when feasible)
158-
159-
The rule's own closed-loop test is author-supplied — it may use exactly the input that makes the reduction look right. To catch that class of bug, drive a different instance through the CLI:
160-
161-
```bash
162-
pred path <Source> <Target> --json # confirm the new rule is on the chosen path
163-
pred create --example <rule_example_id> --output /tmp/src.json
164-
pred solve /tmp/src.json # direct source-side optimum
165-
pred solve /tmp/src.json --via <Target> # through the new reduction
166-
```
167-
168-
The two `pred solve` values must agree. If `--via` isn't wired for this path, skip 4b-4 and note it in the report — do not pad with a re-statement of 4b-2.
169-
170121
## Step 5: Issue Compliance Review
171122

172123
Only if a linked issue was provided.
@@ -212,12 +163,6 @@ Flag any deviation as ISSUE.
212163
### Semantic Review
213164
- [check]: OK / ISSUE — description
214165
215-
### Round-trip Execution (rules only)
216-
- Closed-loop test located: PASS / FAIL — name(s) and file
217-
- Closed-loop test runs and passes: PASS / FAIL — paste the `test result: ok. N passed` line
218-
- Test exercises real round-trip (4 criteria): PASS / FAIL — note any missing criterion
219-
- Spot-check with pred / scratch: PASS / FAIL — record method used and observed values
220-
221166
### Issue Compliance (if linked issue found)
222167
| # | Check | Status |
223168
|---|-------|--------|

0 commit comments

Comments
 (0)