Commit 60de10e

chore: add mypy CI, extract review prompt, add component tests (#4312)
## Summary

Three high-impact improvements from the TAC audit (scored 7.8/10, Level 4 — Orchestrated):

- **Add mypy type checking to CI** (LP6, Impact 8) — adds `[tool.mypy]` config to `pyproject.toml` with gradual-adoption overrides for 11 modules with systemic library-interaction issues. New mypy step in `ci-lint.yml` checks `api/` and `core/` on every Python change.
- **Extract impl-review.yml inline prompt** (LP3, Impact 7) — consolidates the ~131-line inline prompt into `prompts/workflow-prompts/ai-quality-review.md` and replaces it with a thin 5-line wrapper, matching the pattern used by `impl-generate.yml`.
- **Add frontend component tests** (LP9, Impact 7) — installs testing-library + jsdom, configures vitest for component testing, adds 25 new tests across 4 files: `useLocalStorage` (7), `useFilterState` (5), `ErrorBoundary` (5), `ImageCard` (8).

## Test plan

- [x] `uv run --extra typecheck mypy api core --pretty` — 0 errors (32 files checked)
- [x] `uv run ruff check core/ api/ && uv run ruff format --check core/ api/` — all passing
- [x] `uv run pytest tests/unit/ -v` — 1044 passed
- [x] `cd app && yarn test` — 44 passed (6 suites)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
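The component-test setup described in the summary needs a DOM environment for testing-library renders. A minimal sketch of what such a vitest config might look like; the file layout, setup-file path, and option choices here are assumptions for illustration, not the actual config added by this commit:

```typescript
// Hypothetical app/vitest.config.ts sketch (not the actual file from this
// commit). `defineConfig`, `test.environment`, and `setupFiles` are standard
// vitest config options; the setup-file path is an assumption.
import { defineConfig } from 'vitest/config'

export default defineConfig({
  test: {
    environment: 'jsdom',                 // DOM APIs for component rendering
    globals: true,                        // describe/it/expect without imports
    setupFiles: ['./src/test/setup.ts'],  // assumed location for jest-dom setup
  },
})
```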
1 parent 8ce8211 commit 60de10e

File tree

18 files changed: +1813 −773 lines


.github/workflows/ci-lint.yml

Lines changed: 4 additions & 0 deletions

```diff
@@ -73,6 +73,10 @@ jobs:
         if: steps.check.outputs.should_lint == 'true'
         run: uv run ruff format --check .
 
+      - name: Run type checking
+        if: steps.check.outputs.should_lint == 'true'
+        run: uv run --extra typecheck mypy api core --pretty
+
       - name: Skip notice
         if: steps.check.outputs.should_lint == 'false'
         run: echo "::notice::Linting skipped - no Python files changed"
```
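The summary mentions that the new CI step is backed by a `[tool.mypy]` section in `pyproject.toml` with gradual-adoption overrides. For reference, such a setup usually looks something like the following sketch; the version, flags, module names, and error codes here are placeholders, not the actual contents of this repo's `pyproject.toml`:

```toml
# Illustrative gradual-adoption mypy config (not this repo's actual file).
[tool.mypy]
python_version = "3.12"      # assumed version
check_untyped_defs = true

# Per-module overrides relax specific checks for modules with known
# library-interaction issues until they can be typed properly.
[[tool.mypy.overrides]]
module = ["core.legacy_module", "api.third_party_glue"]  # placeholder names
disable_error_code = ["attr-defined", "arg-type"]
```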

.github/workflows/impl-review.yml

Lines changed: 6 additions & 129 deletions

````diff
@@ -132,136 +132,13 @@ jobs:
           claude_args: "--model opus"
           allowed_bots: '*'
           prompt: |
-            ## Task: AI Quality Review for **${{ steps.pr.outputs.library }}** (Attempt ${{ steps.attempts.outputs.display }}/3)
+            Read `prompts/workflow-prompts/ai-quality-review.md` and follow those instructions.
 
-            Review the implementation and evaluate if it meets quality standards.
-
-            ### Your Task
-
-            1. **Read the specification**: `plots/${{ steps.pr.outputs.specification_id }}/specification.md`
-
-            2. **Read the implementation**:
-               `plots/${{ steps.pr.outputs.specification_id }}/implementations/${{ steps.pr.outputs.library }}.py`
-
-            3. **Read library rules**: `prompts/library/${{ steps.pr.outputs.library }}.md`
-
-            4. **Read impl-tags guide**: `prompts/impl-tags-generator.md` (for step 8)
-
-            5. **MANDATORY: View the plot image**
-               - You MUST use the Read tool to open `plot_images/plot.png`
-               - Visually analyze the image - this is critical for the review
-               - DO NOT skip this step - a review without seeing the image is invalid
-               - If the image cannot be read, STOP and report the error
-
-            6. **Evaluate against quality criteria** from `prompts/quality-criteria.md`
-
-            7. **Post verdict as PR comment** on PR #${{ steps.pr.outputs.pr_number }}:
-
-               ```markdown
-               ## AI Review - Attempt ${{ steps.attempts.outputs.display }}/3
-
-               ### Image Description
-               > Describe what you see in the plot: colors used, axis labels, title, data representation, overall layout.
-               > This proves you actually looked at the image.
-
-               ### Quality Score: XX/100
-
-               ### Criteria Checklist
-               **Visual Quality (40 pts)**
-               - [ ] VQ-01: Text Legibility (10) - all text readable at full size
-               - [ ] VQ-02: No Overlap (8) - no overlapping text
-               - [ ] VQ-03: Element Visibility (8) - markers/lines sized for data density
-               - [ ] VQ-04: Color Accessibility (5) - colorblind-safe
-               - [ ] VQ-05: Layout Balance (5) - good proportions
-
-               **Spec Compliance (25 pts)**
-               - [ ] SC-01: Plot Type (8) - correct chart type
-               - [ ] SC-02: Data Mapping (5) - X/Y correctly assigned
-               - [ ] SC-03: Required Features (5) - all spec features present
-               - [ ] SC-06: Title Format (2) - uses {spec-id} · {library} · pyplots.ai
-
-               **Data Quality (20 pts)**
-               - [ ] DQ-01: Feature Coverage (8) - shows ALL aspects of plot type
-               - [ ] DQ-02: Realistic Context (7) - plausible scenario
-               - [ ] DQ-03: Appropriate Scale (5) - sensible values
-
-               **Code Quality (10 pts)**
-               - [ ] CQ-01: KISS Structure (3) - no functions/classes
-               - [ ] CQ-02: Reproducibility (3) - fixed seed
-
-               **Library Features (5 pts)**
-               - [ ] LF-01: Uses distinctive library features
-
-               ### Strengths
-               - Strength 1 (keep these aspects)
-               - Strength 2
-
-               ### Weaknesses
-               - Weakness 1 (AI will fix these - let it decide HOW)
-
-               ### Verdict: APPROVED / REJECTED
-               ```
-
-            8. **Save review data to files** (for the workflow to parse):
-               ```bash
-               echo "XX" > quality_score.txt
-
-               # Save structured feedback as JSON (one array per file)
-               echo '["Strength 1", "Strength 2"]' > review_strengths.json
-               echo '["Weakness 1"]' > review_weaknesses.json
-
-               # Save verdict
-               echo "APPROVED" > review_verdict.txt  # or "REJECTED"
-
-               # Save image description (multi-line text)
-               cat > review_image_description.txt << 'EOF'
-               The plot shows a scatter plot with blue markers...
-               [Your full image description here]
-               EOF
-
-               # Save criteria checklist as structured JSON
-               cat > review_checklist.json << 'EOF'
-               {
-                 "visual_quality": {
-                   "score": 36,
-                   "max": 40,
-                   "items": [
-                     {"id": "VQ-01", "name": "Text Legibility", "score": 10, "max": 10, "passed": true, "comment": "All text readable"},
-                     {"id": "VQ-02", "name": "No Overlap", "score": 8, "max": 8, "passed": true, "comment": "No overlapping elements"}
-                   ]
-                 },
-                 "spec_compliance": {"score": 23, "max": 25, "items": [...]},
-                 "data_quality": {"score": 18, "max": 20, "items": [...]},
-                 "code_quality": {"score": 10, "max": 10, "items": [...]},
-                 "library_features": {"score": 5, "max": 5, "items": [...]}
-               }
-               EOF
-               ```
-
-            9. **Generate impl_tags** (based on prompts/impl-tags-generator.md):
-               Analyze the implementation code and create impl_tags with 5 dimensions:
-               - `dependencies`: External packages beyond numpy/pandas/plotting library
-               - `techniques`: Visualization techniques (twin-axes, colorbar, etc.)
-               - `patterns`: Code patterns (data-generation, iteration-over-groups, etc.)
-               - `dataprep`: Data transformations (kde, binning, correlation-matrix, etc.)
-               - `styling`: Visual style (publication-ready, alpha-blending, etc.)
-
-               ```bash
-               cat > review_impl_tags.json << 'EOF'
-               {
-                 "dependencies": [],
-                 "techniques": ["colorbar", "annotations"],
-                 "patterns": ["data-generation"],
-                 "dataprep": [],
-                 "styling": ["publication-ready"]
-               }
-               EOF
-               ```
-
-            10. **DO NOT add ai-approved or ai-rejected labels** - the workflow will add them after updating metadata.
-
-            **IMPORTANT**: Your review MUST include the "Image Description" section. A review without an image description will be considered invalid.
-            **IMPORTANT**: All review data (strengths, weaknesses, image_description, criteria_checklist) is saved to metadata for future regeneration. Be specific!
+            Variables for this run:
+            - LIBRARY: ${{ steps.pr.outputs.library }}
+            - SPEC_ID: ${{ steps.pr.outputs.specification_id }}
+            - PR_NUMBER: ${{ steps.pr.outputs.pr_number }}
+            - ATTEMPT: ${{ steps.attempts.outputs.display }}
 
       - name: Extract quality score
         id: score
````
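The "Extract quality score" step that follows the prompt presumably parses the files the review writes (`quality_score.txt`, `review_verdict.txt`). A minimal sketch of that kind of parsing, where the file names come from the prompt but the pass threshold and label logic are invented for illustration:

```shell
# Illustrative sketch only: the file names match what the review prompt
# writes, but the threshold (60) and labels are assumptions, not this
# workflow's actual logic.
set -eu
workdir=$(mktemp -d)
cd "$workdir"

# Stand-in files, as the review prompt would have written them
echo "87" > quality_score.txt
echo "APPROVED" > review_verdict.txt

score=$(cat quality_score.txt)
verdict=$(cat review_verdict.txt)

# Derive a PR label from score and verdict (hypothetical rule)
if [ "$score" -ge 60 ] && [ "$verdict" = "APPROVED" ]; then
  label=ai-approved
else
  label=ai-rejected
fi
echo "$label"
```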
