Commit ff8ae70

isPANN and claude authored
Paper review sessions 1-6: entries 1-60 (#1051)
* fix completeness check: use covered-rules.final() instead of .get()

  The warning falsely reported 4 missing rules because .get() only sees state accumulated before the current location. Rules defined after the check were invisible. Using .final() sees all rules in the document.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* update arxiv

* Update arXiv link

* add citation section to README

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* paper review session 5: entries 41-50 (MinimumMaximalMatching–MinimumSetCovering)

  - MinimumMaximalMatching: add @xiao2014 citation, O*(1.3160^n) complexity
  - PartitionIntoPathsOfLength2: upgrade canonical example from trivial 6-vertex (two disjoint paths) to 9-vertex 3×3 grid with 12 edges and 10 distinct valid groupings; add figure, pred commands, fixture
  - MinimumSumMulticenter: replace prose citations with @karivhakimi1979, @cohenaddad2022
  - MinMaxMulticenter: replace prose citations with @karivhakimi1979, @hochbaumshmoys1985, @hsunemhauser1979; add per-vertex distances
  - MultipleCopyFileAllocation: upgrade degenerate canonical example (all vertices host copies) to path P_6 with varied usage/storage showing storage-vs-access tradeoff; enrich background with CDN, database replication, UFL connection

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* add VC and VertexCover as aliases for DecisionMinimumVertexCover

  Closes #1050. Users can now refer to the decision vertex cover problem as "VC", "VertexCover", or "DMVC" (legacy). The internal registered name remains DecisionMinimumVertexCover to avoid breaking the proc macro name extraction in #[reduction].

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* move display math punctuation inside formula blocks

  Move trailing punctuation (periods, commas) from outside display math closing `$` to inside the formula content, e.g. `$.\n` → content line gets `.` appended and closing becomes `$\n`. Affects 100 instances across both multi-line and single-line display equations.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* add review-paper skill for evaluating quality of problem definitions and reduction rules

  This new skill allows users to review the Typst paper for quality issues, evaluating 10 entries per session and generating structured reports on mechanical and critical issues. The skill includes detailed checklists for both problem definitions and reduction rules, ensuring thorough evaluations without modifying any files.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* paper review session 6: entries 51-60 (MinimumHittingSet–MinimumCardinalityKey)

  Add 5 figures (ConsecutiveSets, ExactCoverBy3Sets, ThreeDimensionalMatching, ThreeMatroidIntersection, MinimumCardinalityKey), fix SetSplitting citation (Lovász 1973 via NAE-3SAT), fix PrimeAttributeName citation (@lucchesi1978keys), add @lovasz1973 bib entry.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent d4060d6 commit ff8ae70

8 files changed

Lines changed: 941 additions & 238 deletions

Lines changed: 142 additions & 0 deletions
@@ -0,0 +1,142 @@
---
name: review-paper
description: Review the Typst paper (docs/paper/reductions.typ) for quality issues; evaluates 10 entries per session and reports mechanical and critical issues without fixing them
---

# Review Paper

Evaluate the quality of problem definitions and reduction rules in `docs/paper/reductions.typ`. Each session reviews **10 entries** (problems or rules) and produces a structured report. **Read-only: do not modify any files.**

## Usage

```
/review-paper                 # review next 10 unreviewed problem-defs
/review-paper rules           # review next 10 unreviewed reduction-rules
/review-paper ProblemName     # review a specific problem-def
/review-paper Source Target   # review a specific reduction-rule
```

## Step 0: Determine Scope

Parse the argument:
- No argument or `problems` → review problem-defs
- `rules` → review reduction-rules
- A specific name → review that single entry (one name selects a problem-def; a `Source Target` pair selects a reduction-rule)

To pick which 10 to review, scan `docs/paper/reductions.typ` for all `problem-def(...)` or `reduction-rule(...)` entries. Start from the beginning of the file and skip any entry that was reviewed in a previous session (check memory for `paper-review-progress`). If all entries have been reviewed, report completion.
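
The scope parsing and batch selection described in Step 0 can be sketched as follows. This is a minimal illustration, not part of the skill: the function names, the `Source -> Target` label format, and the regular expressions are assumptions, and the patterns assume each entry opens with `problem-def("Name"` or `reduction-rule("Source", "Target"` on one line.

```python
import re

# Hypothetical patterns: assume the entry name(s) appear as the first
# string argument(s) of problem-def(...) / reduction-rule(...).
ENTRY_PATTERNS = {
    "problems": re.compile(r'problem-def\(\s*"([^"]+)"'),
    "rules": re.compile(r'reduction-rule\(\s*"([^"]+)"\s*,\s*"([^"]+)"'),
}


def parse_scope(args):
    """Map the skill arguments to (kind, specific_entry_or_None)."""
    if not args or args == ["problems"]:
        return ("problems", None)
    if args == ["rules"]:
        return ("rules", None)
    if len(args) == 1:
        return ("problems", args[0])
    if len(args) == 2:
        return ("rules", f"{args[0]} -> {args[1]}")
    raise ValueError(f"unrecognized arguments: {args!r}")


def list_entries(paper_text, kind):
    """Return entry labels in file order; rules become 'Source -> Target'."""
    return [" -> ".join(m.groups()) for m in ENTRY_PATTERNS[kind].finditer(paper_text)]


def next_batch(paper_text, kind, reviewed, size=10):
    """First `size` entries, in file order, that are not in `reviewed`."""
    pending = [name for name in list_entries(paper_text, kind) if name not in reviewed]
    return pending[:size]
```

An empty result from `next_batch` corresponds to the "report completion" case.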

## Step 1: Load Gold Standard

Read the reference examples before reviewing:
- **Problem gold standard:** search for `problem-def("MaximumIndependentSet")` in `reductions.typ` and note its structure, depth, and components
- **Rule gold standard:** search for `reduction-rule("MaximumIndependentSet", "MinimumVertexCover"` and note its proof depth and example

## Step 2: Review Each Entry

For each of the 10 entries, read the full entry text and evaluate against the checklists below.

### Problem-Def Checklist

**Mechanical checks** (objective, can be verified by reading):

| Check | Criterion |
|-------|-----------|
| M1. Display name | Entry exists in `display-name` dictionary |
| M2. Formal definition | `def` parameter is present and non-empty |
| M3. Self-contained notation | Every symbol in `def` is defined before first use |
| M4. Background text | Body contains at least 2 sentences of background/motivation |
| M5. Example present | Body contains `*Example.*` or `Example.` |
| M6. Example from fixture | Example data matches `src/example_db/fixtures/examples.json` (not invented); check by loading the JSON and comparing |
| M7. Figure present | Body contains `#figure(` |
| M8. Pred commands | Body contains `pred-commands(` or `pred create` |
| M9. Algorithm citation | Complexity claims have `@citation` or a footnote explaining absence |
| M10. Evaluation shown | Example shows how the objective/verifier computes the value |

**Critical checks** (require judgment):

| Check | Criterion |
|-------|-----------|
| C1. Definition correctness | Does the formal definition accurately describe the problem? Compare with the Rust implementation (`src/models/`) and literature |
| C2. Background quality | Is the background informative? Does it mention applications, history, special cases, or algorithmic context? |
| C3. Example pedagogy | Is the example small enough to verify by hand? Does it illustrate the key aspects of the problem? |
| C4. Completeness | Are there important aspects of the problem that are missing (e.g., well-known special cases, relationship to other problems)? |
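
Several of the mechanical checks above (M5, M7, M8) are plain substring tests on the entry body, so they can be illustrated with a short sketch. The function name and the PASS/FAIL return shape are assumptions for illustration; the remaining checks need the fixture JSON, the `display-name` dictionary, or human reading and are not covered here.

```python
def substring_checks(body):
    """Run the substring-level checks M5, M7 and M8 on one problem-def body."""

    def status(ok):
        return "PASS" if ok else "FAIL"

    return {
        # "Example." also matches the emphasized form "*Example.*"
        "M5": status("Example." in body),
        "M7": status("#figure(" in body),
        "M8": status("pred-commands(" in body or "pred create" in body),
    }
```

A body consisting only of "This is NP-hard." fails all three checks under this sketch.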

### Reduction-Rule Checklist

**Mechanical checks:**

| Check | Criterion |
|-------|-----------|
| M1. Theorem statement | Rule body describes the construction |
| M2. Proof present | Proof body is non-empty |
| M3. Proof length | Proof is at least 3 sentences (not just "trivial" or a one-liner) |
| M4. Overhead documented | Overhead is auto-generated from JSON (verify edge exists in `reduction_graph.json`) |
| M5. Example present | `example: true` and example renders correctly |
| M6. Example from fixture | Example data matches `src/example_db/fixtures/examples.json` |
| M7. Pred commands | Example section contains `pred-commands(` with create/reduce/evaluate pipeline |
| M8. Both directions | If the reverse rule also exists in the graph, check it has its own entry |

**Critical checks:**

| Check | Criterion |
|-------|-----------|
| C1. Construction correctness | Does the theorem statement accurately describe what `reduce_to()` does? Read `src/rules/<source>_<target>.rs` to verify |
| C2. Proof correctness | Does the proof correctly argue that the reduction preserves solutions? |
| C3. Example clarity | Does the example clearly show source → target → solution extraction? |
| C4. Proof-only flag | If this is a proof-only reduction (not solver-executable), is that stated? |
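
Two of the rule checks above, M3 (proof length) and M8 (both directions), lend themselves to a small sketch. The helper names are hypothetical, the sentence count is a rough heuristic that abbreviations such as "e.g." will inflate, and the graph edges are passed in as already-parsed `(source, target)` pairs because the layout of `reduction_graph.json` is not specified here.

```python
import re


def proof_length_check(proof_text, min_sentences=3):
    """M3: rough sentence count via terminal punctuation before whitespace or end."""
    count = len(re.findall(r"[.!?](?=\s|$)", proof_text.strip()))
    return "PASS" if count >= min_sentences else "FAIL"


def missing_reverse_entries(graph_edges, paper_entries):
    """M8: reverse edges that exist in the graph but have no paper entry.

    Both arguments are sets of (source, target) name pairs.
    """
    missing = []
    for source, target in sorted(paper_entries):
        reverse = (target, source)
        if reverse in graph_edges and reverse not in paper_entries:
            missing.append(reverse)
    return missing
```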

## Step 3: Generate Report

Present results **one entry at a time** in this format:

```
### [N/10] ProblemName (or Source → Target)

**Mechanical Issues:**
- [PASS] M1. Display name
- [FAIL] M5. Example present: no worked example in body
- [WARN] M9. Algorithm citation: complexity claim "O*(2^n)" has no @citation

**Critical Issues:**
- [FAIL] C2. Background quality: body is only one sentence ("This is NP-hard.")
  with no applications, history, or algorithmic context
- [PASS] C1. Definition correctness: matches Rust implementation

**Verdict:** 1 mechanical fail, 1 mechanical warning, 1 critical fail: needs improvement
```

After each entry, pause and ask: **"Continue to next entry, or discuss this one?"**

Use these severity levels:
- **PASS**: meets criterion
- **WARN**: minor issue, could be improved but acceptable
- **FAIL**: does not meet criterion, should be fixed
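
As an illustration of the report format and severity levels, the sketch below renders one entry from lists of `(level, check, note)` tuples. The function name, the `: ` separator before the note, and the verdict wording are assumptions chosen for the example, not a fixed part of the skill.

```python
from collections import Counter


def render_entry(index, total, name, mechanical, critical):
    """Render one report entry; each result is (level, check, note), level in PASS/WARN/FAIL."""

    def bullet_lines(results):
        return [
            f"- [{level}] {check}" + (f": {note}" if note else "")
            for level, check, note in results
        ]

    mech = Counter(level for level, _, _ in mechanical)
    crit = Counter(level for level, _, _ in critical)
    needs_work = mech["FAIL"] > 0 or crit["FAIL"] > 0
    verdict = (
        f"{mech['FAIL']} mechanical fail(s), {mech['WARN']} mechanical warning(s), "
        f"{crit['FAIL']} critical fail(s): "
        + ("needs improvement" if needs_work else "good")
    )
    return "\n".join(
        [f"### [{index}/{total}] {name}", "", "**Mechanical Issues:**"]
        + bullet_lines(mechanical)
        + ["", "**Critical Issues:**"]
        + bullet_lines(critical)
        + ["", f"**Verdict:** {verdict}"]
    )
```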

## Step 4: Session Summary

After all 10 entries, print a summary table:

```
## Session Summary

| Entry | Mechanical | Critical | Verdict |
|-------|-----------|----------|---------|
| ProblemA | 9/10 pass | 4/4 pass | Good |
| ProblemB | 7/10 pass | 3/4 pass | Needs work |
| ... | ... | ... | ... |

Overall: X/10 entries pass all checks.
Top priorities for improvement: [list the 3 worst entries]
```
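
The summary table above can be produced from per-entry pass counts, as in this sketch. The row layout and the ranking of "worst" entries by number of non-passing checks are assumptions; the verdict text is supplied by the reviewer rather than computed.

```python
def session_summary(rows):
    """rows: dicts with 'entry', 'mech' (passed, total), 'crit' (passed, total), 'verdict'."""

    def not_passed(row):
        return (row["mech"][1] - row["mech"][0]) + (row["crit"][1] - row["crit"][0])

    lines = [
        "## Session Summary",
        "",
        "| Entry | Mechanical | Critical | Verdict |",
        "|-------|-----------|----------|---------|",
    ]
    for row in rows:
        lines.append(
            f"| {row['entry']} | {row['mech'][0]}/{row['mech'][1]} pass "
            f"| {row['crit'][0]}/{row['crit'][1]} pass | {row['verdict']} |"
        )
    clean = sum(1 for row in rows if not_passed(row) == 0)
    # Assumed ranking: most non-passing checks first, at most three entries.
    ranked = sorted(rows, key=not_passed, reverse=True)
    worst = [row["entry"] for row in ranked if not_passed(row) > 0][:3]
    lines += [
        "",
        f"Overall: {clean}/{len(rows)} entries pass all checks.",
        "Top priorities for improvement: " + ", ".join(worst),
    ]
    return "\n".join(lines)
```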

## Step 5: Save Progress

Save progress to memory so the next session can continue where this one left off. Record which entries have been reviewed and their verdicts.
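
One possible shape for the saved progress is a mapping from entry kind to reviewed entries and their verdicts, sketched below. The record layout and helper names are hypothetical; how the record is actually written to memory (under the `paper-review-progress` key mentioned in Step 0) depends on the agent environment and is not shown.

```python
import json


def record_session(progress, kind, verdicts):
    """Return a new progress record with this session's verdicts merged in."""
    updated = {key: dict(value) for key, value in progress.items()}
    updated.setdefault(kind, {}).update(verdicts)
    return updated


def reviewed_entries(progress, kind):
    """Entries of the given kind that earlier sessions already covered."""
    return set(progress.get(kind, {}))


def serialize(progress):
    """Stable JSON text suitable for storing as a single memory note."""
    return json.dumps(progress, indent=2, sort_keys=True)
```

The set returned by `reviewed_entries` is what the next session would skip when choosing its 10 entries.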

## Important Rules

1. **Do not modify any files.** This skill is read-only.
2. **Do not invent issues.** Only report problems you can verify by reading the source.
3. **Check the Rust source** for critical checks; don't guess whether the math is right.
4. **Be specific.** "Background is thin" is not useful. "Background is one sentence with no applications or algorithmic context" is useful.
5. **Compare to gold standard.** The MIS (`MaximumIndependentSet`) entry is the reference. Entries don't need to be as long, but they should cover the same structural elements.