Commit 07caf14
feat: add verify-reduction skill (#979)
* docs: add design spec for proposed reductions Typst note
Covers 9 reductions: 2 NP-hardness chain extensions (#973, #198),
4 Tier 1a blocked issues (#379, #380, #888, #822), and
3 Tier 1b blocked issues (#892, #894, #890).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: add verify-reduction skill for mathematical verification of reductions
New skill: /verify-reduction <issue-number>
End-to-end pipeline that takes a reduction rule issue and produces:
1. Typst proof (Construction/Correctness/Extraction/Overhead + YES/NO examples)
2. Python verification script (7 mandatory sections, ≥5000 checks, exhaustive n≤5)
3. Lean 4 lemmas (non-trivial structural proofs required)
Follows issue-to-pr conventions: creates worktree, works in isolation, submits PR.
Strict quality gates (zero tolerance):
- No "trivial" category — every reduction ≥5000 checks
- 7 mandatory Python sections including NO (infeasible) example
- Non-trivial Lean required (rfl/omega tautologies rejected)
- Zero hand-waving in Typst ("clearly", "obviously" → rejected)
- Mandatory gap analysis: every proof claim must have a test
- Self-review checklist with 20+ items across 4 categories
Developed and validated through PR #975 (800K+ checks, 3 bugs caught)
and tested on issues #868 (caught wrong example) and #841 (35K checks).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* style: add YAML frontmatter + professional tone to verify-reduction skill
- Added frontmatter (name, description) matching other skills' convention
- Toned down aggressive language ("ZERO TOLERANCE", "THE HARSHEST STEP",
"NON-NEGOTIABLE") to professional but firm language
- All quality gates unchanged — same strictness, better presentation
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: adversarial multi-agent verification in verify-reduction skill
Replaces Lean-required gates with adversarial second agent:
- Step 5: Adversary agent independently implements reduce() and
extract_solution() from theorem statement only (not constructor's script)
- Step 5c: Cross-comparison of both implementations on 1000 instances
- Lean downgraded from required to optional
- hypothesis property-based testing for n up to 50
- Quality gates: 2 independent scripts ≥5000 checks each + cross-comparison
Design rationale (docs/superpowers/specs/2026-04-01-adversarial-verification-design.md):
- Same agent writing proof + test is the #1 risk for AI verification
- Two independent implementations agreeing > one + trivial Lean
- Lean caught 0 bugs in PR #975; Python caught all 4
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: design spec for verify-reduction skill enhancements
Typst↔Python auto-matching, test vectors JSON for downstream
consumption by add-rule and review-pipeline, adversary tailoring
by reduction type, compositional verification via pred CLI.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: implementation plan for verify-reduction enhancements
5 tasks: update verify-reduction (Step 4.5 auto-matching, Step 5 typed
adversary, Step 8 downstream artifacts), create add-reduction skill,
register in CLAUDE.md.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: enhance verify-reduction with test vectors export, typed adversary, pipeline integration
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: create add-reduction skill — consumes verified artifacts from verify-reduction
* feat: register add-reduction skill in CLAUDE.md, update verify-reduction description
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: three improvements to verify-reduction and add-reduction skills
1. verify-reduction Step 1: type compatibility gate — checks source/target
Value types before proceeding. Stops and comments on issue if types are
incompatible (e.g., optimization → decision needs K parameter).
2. add-reduction Step 7: mandatory cleanup of verification artifacts from
docs/paper/verify-reductions/ — Python scripts, JSON, Typst, PDF must
not get into the library.
3. add-reduction Steps 4/4b/5: mandatory requirements from #974 —
canonical example in rule_builders.rs (Check 9), example-db lookup
test (Check 10), paper reduction-rule entry (Check 11).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor: concise verify-reduction (761→124 lines) + self-contained add-reduction
verify-reduction: removed verbose templates, condensed checklists into
prose, kept all requirements but removed boilerplate code blocks that
the agent can derive from context.
add-reduction: integrated add-rule Steps 1-6, write-rule-in-paper
Steps 1-6, and #974 requirements (Checks 9/10/11) into a single
self-contained skill. No need to read 3 other skills.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: restore structural requirements in verify-reduction (124→274 lines)
The previous rewrite over-condensed the skill, removing gates that agents
need to follow: 7-section descriptions with table, minimum check count
table, check count audit template, gap analysis format, common mistakes
table, and self-review checklist with checkboxes.
Restored: all structural gates and requirements.
Kept concise: no verbose Python/Typst code templates (agent derives these).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: harden add-reduction with file-level verification gates
Steps 4, 4b, 5 now have HARD GATE labels with verification commands
that check the SPECIFIC required files appear in `git diff --name-only`.
Step 8 has a pre-commit gate that lists all 6 required files and blocks
commit if any is missing.
Root cause: subagents skipped Steps 4 (put example in rule file instead
of rule_builders.rs) and 5 (skipped paper entry entirely) because the
skill said "MANDATORY" but had no mechanical enforcement.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: add CI-equivalent checks to add-reduction pre-commit gate
Root cause: PRs #985 and #991 failed CI because:
1. Local clippy doesn't use -D warnings but CI does (caught needless_range_loop)
2. New reductions can create paths that dominate existing direct reductions
(test_find_dominated_rules_returns_known_set has hardcoded known set)
Added to Step 6:
- Mandatory `cargo clippy -- -D warnings` (not just `cargo clippy`)
- Mandatory `cargo test` (full suite, not filtered)
- Explicit dominated-rules gate with fix instructions
Added to Common Mistakes:
- clippy without -D warnings
- dominated rules test
- skipping full cargo test
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: correct add-reduction HARD GATE for canonical examples
rule_builders.rs is a 4-line pass-through — canonical examples are
registered via canonical_rule_example_specs() in each rule file,
wired through mod.rs. Updated Step 4 to match actual architecture.
Also added analysis.rs to git add list (for dominated-rules updates).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: parent-side verification + pre-commit hook for add-reduction
Two enforcement mechanisms that don't rely on subagent compliance:
1. Parent-side verification (Step 8a): After subagent reports DONE,
the parent runs file gate checks independently. If any required
file is missing, sends the subagent back — doesn't trust self-report.
2. Pre-commit hook (.claude/hooks/add-reduction-precommit.sh):
Mechanically blocks commits of new rule files unless example_db.rs,
reductions.typ, and mod.rs are also staged. Subagents cannot bypass.
Root cause: subagents skip HARD GATE steps despite skill text saying
"MANDATORY". Text-based enforcement doesn't work — need mechanical
checks that run after the subagent, not instructions the subagent reads.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: strengthen type compatibility gate in verify-reduction skill
Expanded the type compatibility table to explicitly list all incompatible
pairs (Min→Or, Max→Or, Max→Min, Min→Max, Or→Sum, etc.) with concrete
examples from batch verification (#198 MVC→HamCircuit, #890 MaxCut→OLA).
Added common mistake entry for proceeding past the type gate.
Learned from batch run: 5 out of 50 reductions were mathematically
verified but turned out to be unimplementable as ReduceTo due to
type mismatches that the original gate didn't catch.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Delete docs/superpowers/plans/2026-04-01-verify-reduction-enhancements.md
* Delete docs/superpowers/specs/2026-03-31-proposed-reductions-note-design.md
* Delete .claude/CLAUDE.md
* Revert "Delete .claude/CLAUDE.md"
This reverts commit 71c1444.
* chore: remove docs/superpowers/specs/ directory
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor: integrate verify-reduction into add-rule pipeline
- Delete /add-reduction skill and pre-commit hook (absorbed into /add-rule)
- /add-rule now runs /verify-reduction by default; --no-verify to skip
- /verify-reduction simplified: no worktree, no PR, no saved artifacts
- /issue-to-pr passes --no-verify flag through to /add-rule
- Update CLAUDE.md skill descriptions
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Jinguo Liu <cacate0129@gmail.com>1 parent 90fcc85 commit 07caf14
4 files changed
Lines changed: 320 additions & 16 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
12 | 12 | | |
13 | 13 | | |
14 | 14 | | |
15 | | - | |
| 15 | + | |
16 | 16 | | |
17 | 17 | | |
18 | 18 | | |
| |||
29 | 29 | | |
30 | 30 | | |
31 | 31 | | |
| 32 | + | |
32 | 33 | | |
33 | 34 | | |
34 | 35 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
5 | 5 | | |
6 | 6 | | |
7 | 7 | | |
8 | | - | |
| 8 | + | |
| 9 | + | |
| 10 | + | |
| 11 | + | |
| 12 | + | |
| 13 | + | |
| 14 | + | |
| 15 | + | |
| 16 | + | |
| 17 | + | |
9 | 18 | | |
10 | 19 | | |
11 | 20 | | |
| |||
27 | 36 | | |
28 | 37 | | |
29 | 38 | | |
| 39 | + | |
| 40 | + | |
| 41 | + | |
| 42 | + | |
| 43 | + | |
| 44 | + | |
| 45 | + | |
| 46 | + | |
| 47 | + | |
| 48 | + | |
| 49 | + | |
| 50 | + | |
| 51 | + | |
| 52 | + | |
| 53 | + | |
| 54 | + | |
| 55 | + | |
| 56 | + | |
| 57 | + | |
| 58 | + | |
30 | 59 | | |
31 | 60 | | |
32 | 61 | | |
| |||
35 | 64 | | |
36 | 65 | | |
37 | 66 | | |
38 | | - | |
| 67 | + | |
| 68 | + | |
| 69 | + | |
| 70 | + | |
| 71 | + | |
| 72 | + | |
| 73 | + | |
| 74 | + | |
| 75 | + | |
| 76 | + | |
| 77 | + | |
| 78 | + | |
| 79 | + | |
39 | 80 | | |
40 | 81 | | |
41 | 82 | | |
| |||
67 | 108 | | |
68 | 109 | | |
69 | 110 | | |
| 111 | + | |
70 | 112 | | |
71 | 113 | | |
72 | 114 | | |
| |||
78 | 120 | | |
79 | 121 | | |
80 | 122 | | |
81 | | - | |
| 123 | + | |
| 124 | + | |
| 125 | + | |
82 | 126 | | |
83 | 127 | | |
84 | 128 | | |
85 | 129 | | |
86 | 130 | | |
87 | 131 | | |
88 | 132 | | |
89 | | - | |
| 133 | + | |
90 | 134 | | |
91 | 135 | | |
92 | 136 | | |
93 | 137 | | |
94 | 138 | | |
95 | | - | |
| 139 | + | |
96 | 140 | | |
97 | 141 | | |
98 | 142 | | |
| |||
105 | 149 | | |
106 | 150 | | |
107 | 151 | | |
| 152 | + | |
| 153 | + | |
108 | 154 | | |
109 | 155 | | |
110 | 156 | | |
| |||
117 | 163 | | |
118 | 164 | | |
119 | 165 | | |
120 | | - | |
| 166 | + | |
121 | 167 | | |
122 | 168 | | |
123 | 169 | | |
124 | | - | |
| 170 | + | |
125 | 171 | | |
126 | 172 | | |
127 | 173 | | |
128 | 174 | | |
129 | 175 | | |
130 | | - | |
| 176 | + | |
| 177 | + | |
| 178 | + | |
131 | 179 | | |
132 | 180 | | |
133 | 181 | | |
| |||
140 | 188 | | |
141 | 189 | | |
142 | 190 | | |
143 | | - | |
| 191 | + | |
144 | 192 | | |
145 | 193 | | |
146 | 194 | | |
| |||
158 | 206 | | |
159 | 207 | | |
160 | 208 | | |
161 | | - | |
| 209 | + | |
162 | 210 | | |
163 | 211 | | |
164 | 212 | | |
| |||
170 | 218 | | |
171 | 219 | | |
172 | 220 | | |
173 | | - | |
| 221 | + | |
174 | 222 | | |
175 | 223 | | |
176 | 224 | | |
177 | 225 | | |
178 | 226 | | |
179 | 227 | | |
180 | 228 | | |
181 | | - | |
| 229 | + | |
182 | 230 | | |
183 | 231 | | |
184 | 232 | | |
| |||
187 | 235 | | |
188 | 236 | | |
189 | 237 | | |
190 | | - | |
| 238 | + | |
191 | 239 | | |
192 | 240 | | |
193 | 241 | | |
| |||
229 | 277 | | |
230 | 278 | | |
231 | 279 | | |
| 280 | + | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
9 | 9 | | |
10 | 10 | | |
11 | 11 | | |
12 | | - | |
| 12 | + | |
| 13 | + | |
13 | 14 | | |
14 | 15 | | |
15 | 16 | | |
| |||
37 | 38 | | |
38 | 39 | | |
39 | 40 | | |
| 41 | + | |
40 | 42 | | |
41 | 43 | | |
42 | 44 | | |
| |||
91 | 93 | | |
92 | 94 | | |
93 | 95 | | |
94 | | - | |
| 96 | + | |
95 | 97 | | |
96 | 98 | | |
97 | 99 | | |
| |||
0 commit comments