Revert review-structural Step 4b; let auto-pipeline Step 3 cover it

isPANN · claude · isPANN · commit 24d7b80e6cb8 · 2026-05-26T21:43:03.000+08:00
Step 4b mixed three different concerns into review-structural (a
read-only structural skill):
- 4b-2 ran `cargo test --exact &lt;closed_loop&gt;` for the new rule —
  redundant with auto-pipeline Step 3's workspace-wide `make check`.
- 4b-3 read the test source to score it against four criteria — that's
  a test-quality judgment, not a structural check; review-quality is
  the right home for that and it already covers test quality.
- 4b-4 ran `pred solve` end-to-end via the new rule — overlaps with
  the agentic feature tests review-pipeline already runs.
- 4b-1 (grep for the closed_loop test) is subsumed by Step 4's
  existing checklist of required rule artefacts.

Restoring review-structural to its main version. Also removing the
Phase 4 prompt's mandatory-Step-4b clause from auto-pipeline.

Co-Authored-By: Claude Opus 4.7 &lt;noreply@anthropic.com&gt;
diff --git a/.claude/skills/auto-pipeline/SKILL.md b/.claude/skills/auto-pipeline/SKILL.md
@@ -373,9 +373,6 @@ Run /review-pipeline on PR #<PR>. Follow
 .claude/skills/review-pipeline/SKILL.md exactly; it always moves the
 PR to Final review at the end.
 
-For PRs adding a reduction rule, review-structural Step 4b (round-trip
-execution) is mandatory — see that skill for the four criteria.
-
 Return ONLY this JSON shape:
 {
   "outcome": "success" | "failure",
diff --git a/.claude/skills/review-structural/SKILL.md b/.claude/skills/review-structural/SKILL.md
@@ -118,55 +118,6 @@ Report pass/fail. If tests fail, identify which tests. **Do NOT fix anything** 
 3. **Example quality** — Is it tutorial-style? Does the JSON export include both source and target data?
 4. **Paper quality** — Is the reduction-rule statement precise? Is the proof sketch sound?
 
-## Step 4b: Round-trip Execution (Rule reviews only) — MANDATORY
-
-Run the rule **end to end** on its canonical example(s), not just confirm a `closed_loop` test exists. The goal is to catch reductions that compile but silently lose information or drop corner cases.
-
-### 4b-1: Locate the test
-
-```bash
-# The closed-loop test for this rule. The reference helper is:
-#   crate::rules::test_helpers::assert_optimization_round_trip_from_optimization_target
-# Find every test in the new test file that exercises the round-trip:
-grep -nE "fn test_.*(closed_loop|round_trip|jl_parity)" src/unit_tests/rules/{R}.rs
-```
-
-### 4b-2: Run it by name and confirm it actually executes
-
-```bash
-cargo test --lib --no-fail-fast -- --exact test_{rule_module}::test_{...}_closed_loop
-```
-
-- Confirm the binary prints `test result: ok. 1 passed` for **each** named test.
-- "0 passed; 0 failed; 0 ignored" means the test name was wrong → flag as ISSUE.
-- If the closed-loop test is `#[ignore]`'d → FAIL.
-
-### 4b-3: Confirm the round-trip is real, not a type check
-
-Read each closed-loop test and verify it does ALL FOUR of the following:
-
-| # | What the test must do | Red flag |
-|---|---|---|
-| 1 | Build a concrete source instance with **multiple feasible solutions of different objective values** | Trivial instance (single edge, single clause, empty graph) — anything where the answer is unique by inspection |
-| 2 | Call `ReduceTo::<Target>::reduce_to(&source)` to produce the target | Test only asserts on `target_problem()` fields without actually solving |
-| 3 | Solve the target (BruteForce / ILP / appropriate solver) AND solve the source independently | Test compares only target value, never source value |
-| 4 | Use `extract_solution` (or the appropriate helper, e.g. `assert_optimization_round_trip_from_optimization_target`) to map the target witness back, then assert the extracted source configuration is **optimal** for the source | Test asserts only that `extract_solution` returns `Some(_)` or has the right length |
-
-If any of (1)-(4) is missing → ISSUE (test does not verify the round-trip).
-
-### 4b-4: Spot-check on a fresh instance (when feasible)
-
-The rule's own closed-loop test is author-supplied — it may use exactly the input that makes the reduction look right. To catch that class of bug, drive a different instance through the CLI:
-
-```bash
-pred path <Source> <Target> --json     # confirm the new rule is on the chosen path
-pred create --example <rule_example_id> --output /tmp/src.json
-pred solve /tmp/src.json                # direct source-side optimum
-pred solve /tmp/src.json --via <Target> # through the new reduction
-```
-
-The two `pred solve` values must agree. If `--via` isn't wired for this path, skip 4b-4 and note it in the report — do not pad with a re-statement of 4b-2.
-
 ## Step 5: Issue Compliance Review
 
 Only if a linked issue was provided.
@@ -212,12 +163,6 @@ Flag any deviation as ISSUE.
 ### Semantic Review
 - [check]: OK / ISSUE — description
 
-### Round-trip Execution (rules only)
-- Closed-loop test located: PASS / FAIL — name(s) and file
-- Closed-loop test runs and passes: PASS / FAIL — paste the `test result: ok. N passed` line
-- Test exercises real round-trip (4 criteria): PASS / FAIL — note any missing criterion
-- Spot-check with pred / scratch: PASS / FAIL — record method used and observed values
-
 ### Issue Compliance (if linked issue found)
 | # | Check | Status |
 |---|-------|--------|

Original file line number	Diff line number	Diff line change
`@@ -373,9 +373,6 @@ Run /review-pipeline on PR #<PR>. Follow`
`373`	`373`	`.claude/skills/review-pipeline/SKILL.md exactly; it always moves the`
`374`	`374`	`PR to Final review at the end.`
`375`	`375`
`376`		`-For PRs adding a reduction rule, review-structural Step 4b (round-trip`
`377`		`-execution) is mandatory — see that skill for the four criteria.`
`378`		`-`
`379`	`376`	`Return ONLY this JSON shape:`
`380`	`377`	`{`
`381`	`378`	`"outcome": "success" \| "failure",`