Commit e7f89a9

feat: implementation-level tags for code analysis (#2434) (#3179)
## Summary

- Add implementation-level tags (`impl_tags`) to describe HOW code implements a plot
- Backfill all 1,476 metadata files with impl_tags
- Add 5 new filter categories in the UI: uses, technique, pattern, dataprep, style
- Short URL keys for filters: `dep`, `tech`, `pat`, `prep`, `style`
- Add tooltips for all 11 filter categories
- Track impl_tags in Plausible analytics

## Changes

### Backend

- New impl_tags structure in metadata YAML files
- API returns impl-level filter counts alongside spec-level
- Updated FilterCountsResponse schema

### Frontend

- 11 filter categories in dropdown (6 spec-level + 5 impl-level)
- Tooltips describing each filter category
- Plausible tracking for all filter types

### Data

- 1,476 metadata files updated with impl_tags
- Tags: dependencies, techniques, patterns, dataprep, styling

## Test plan

- [x] Unit tests pass (874 tests)
- [x] Ruff check and format pass
- [x] Manual testing: all 11 categories appear in dropdown
- [x] Tooltips display correctly

Closes #2434

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
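The short URL keys listed in the summary could be translated back into `impl_tags` dimensions roughly as follows. This is a minimal sketch, not the project's frontend code: the key-to-dimension pairing is assumed from the order in which the commit message lists them, and `parse_impl_filters` and the comma-separated value convention are hypothetical.

```python
from urllib.parse import parse_qs

# Assumed pairing of short URL keys to impl_tags dimensions (inferred from the
# order in the commit message; the real mapping lives in the frontend).
SHORT_KEY_TO_DIMENSION = {
    "dep": "dependencies",
    "tech": "techniques",
    "pat": "patterns",
    "prep": "dataprep",
    "style": "styling",
}


def parse_impl_filters(query_string: str) -> dict[str, list[str]]:
    """Translate short URL keys into impl_tags dimensions; ignore other keys."""
    filters: dict[str, list[str]] = {}
    for key, values in parse_qs(query_string).items():
        dimension = SHORT_KEY_TO_DIMENSION.get(key)
        if dimension is None:
            continue  # not an impl-level filter key
        # Assumption: several values for one key are comma-separated.
        for value in values:
            filters.setdefault(dimension, []).extend(
                v for v in value.split(",") if v
            )
    return filters
```

Keys outside the five impl-level filters are skipped, so spec-level filters can share the same query string without interfering.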
1 parent 502373c commit e7f89a9

1,498 files changed: +19155 -414 lines changed

.github/workflows/impl-review.yml

Lines changed: 39 additions & 5 deletions
@@ -144,15 +144,17 @@ jobs:

 3. **Read library rules**: `prompts/library/${{ steps.pr.outputs.library }}.md`

-4. **MANDATORY: View the plot image**
+4. **Read impl-tags guide**: `prompts/impl-tags-generator.md` (for step 8)
+
+5. **MANDATORY: View the plot image**
    - You MUST use the Read tool to open `plot_images/plot.png`
    - Visually analyze the image - this is critical for the review
    - DO NOT skip this step - a review without seeing the image is invalid
    - If the image cannot be read, STOP and report the error

-5. **Evaluate against quality criteria** from `prompts/quality-criteria.md`
+6. **Evaluate against quality criteria** from `prompts/quality-criteria.md`

-6. **Post verdict as PR comment** on PR #${{ steps.pr.outputs.pr_number }}:
+7. **Post verdict as PR comment** on PR #${{ steps.pr.outputs.pr_number }}:

    ```markdown
    ## AI Review - Attempt ${{ steps.attempts.outputs.display }}/3
@@ -199,7 +201,7 @@ jobs:
    ### Verdict: APPROVED / REJECTED
    ```

-7. **Save review data to files** (for the workflow to parse):
+8. **Save review data to files** (for the workflow to parse):
    ```bash
    echo "XX" > quality_score.txt

@@ -235,7 +237,27 @@ jobs:
    EOF
    ```

-8. **DO NOT add ai-approved or ai-rejected labels** - the workflow will add them after updating metadata.
+9. **Generate impl_tags** (based on prompts/impl-tags-generator.md):
+   Analyze the implementation code and create impl_tags with 5 dimensions:
+   - `dependencies`: External packages beyond numpy/pandas/plotting library
+   - `techniques`: Visualization techniques (twin-axes, colorbar, etc.)
+   - `patterns`: Code patterns (data-generation, iteration-over-groups, etc.)
+   - `dataprep`: Data transformations (kde, binning, correlation-matrix, etc.)
+   - `styling`: Visual style (publication-ready, alpha-blending, etc.)
+
+   ```bash
+   cat > review_impl_tags.json << 'EOF'
+   {
+     "dependencies": [],
+     "techniques": ["colorbar", "annotations"],
+     "patterns": ["data-generation"],
+     "dataprep": [],
+     "styling": ["publication-ready"]
+   }
+   EOF
+   ```
+
+10. **DO NOT add ai-approved or ai-rejected labels** - the workflow will add them after updating metadata.

 **IMPORTANT**: Your review MUST include the "Image Description" section. A review without an image description will be considered invalid.
 **IMPORTANT**: All review data (strengths, weaknesses, image_description, criteria_checklist) is saved to metadata for future regeneration. Be specific!
@@ -314,6 +336,7 @@ jobs:
 image_description = None
 criteria_checklist = None
 verdict = None
+impl_tags = None

 if Path('review_strengths.json').exists():
     try:
@@ -350,6 +373,13 @@ jobs:
     except:
         pass

+if Path('review_impl_tags.json').exists():
+    try:
+        with open('review_impl_tags.json') as f:
+            impl_tags = json.load(f)
+    except:
+        pass
+
 # Load existing metadata
 with open(metadata_file, 'r') as f:
     data = yaml.safe_load(f)
@@ -372,6 +402,10 @@ jobs:
 if verdict:
     data['review']['verdict'] = verdict

+# Add impl_tags (issue #2434)
+if impl_tags:
+    data['impl_tags'] = impl_tags
+
 def str_representer(dumper, data):
     if isinstance(data, str) and data.endswith('Z') and 'T' in data:
         return dumper.represent_scalar('tag:yaml.org,2002:str', data, style="'")
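The workflow script above stores whatever JSON the reviewer wrote to `review_impl_tags.json`. A stricter loader might check the five dimensions before the value reaches the metadata file. The following is a minimal sketch under that assumption; `load_impl_tags` and `IMPL_TAG_DIMENSIONS` are hypothetical names that do not appear in the commit.

```python
import json
from pathlib import Path

# The five dimensions named in the review prompt.
IMPL_TAG_DIMENSIONS = ("dependencies", "techniques", "patterns", "dataprep", "styling")


def load_impl_tags(path):
    """Return a normalized impl_tags dict, or None if the file is missing or malformed."""
    file = Path(path)
    if not file.exists():
        return None
    try:
        raw = json.loads(file.read_text())
    except (OSError, ValueError):  # JSONDecodeError is a ValueError subclass
        return None
    if not isinstance(raw, dict):
        return None
    normalized = {}
    for dimension in IMPL_TAG_DIMENSIONS:
        values = raw.get(dimension, [])
        if not isinstance(values, list):
            return None
        # Keep only string tags and drop duplicates while preserving order.
        normalized[dimension] = list(dict.fromkeys(v for v in values if isinstance(v, str)))
    return normalized
```

Returning `None` for malformed input keeps the existing `if impl_tags:` guard meaningful, since a rejected file leaves the metadata untouched.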

.github/workflows/spec-create.yml

Lines changed: 2 additions & 2 deletions
@@ -117,7 +117,7 @@ jobs:
 6. **Create specification files:**
    - Read template: `prompts/templates/specification.md`
    - Read metadata template: `prompts/templates/specification.yaml`
-   - Read tagging guide: `docs/reference/tagging-system.md`
+   - Read tagging guide: `prompts/spec-tags-generator.md`
    - Create directory: `plots/{specification-id}/`
    - Create: `plots/{specification-id}/specification.md` (follow template structure)
    - Create: `plots/{specification-id}/specification.yaml` with:
@@ -213,7 +213,7 @@ jobs:
 6. **Create specification files:**
    - Read template: `prompts/templates/specification.md`
    - Read metadata template: `prompts/templates/specification.yaml`
-   - Read tagging guide: `docs/reference/tagging-system.md`
+   - Read tagging guide: `prompts/spec-tags-generator.md`
    - Create directory: `plots/{specification-id}/`
    - Create: `plots/{specification-id}/specification.md` (follow template structure)
    - Create: `plots/{specification-id}/specification.yaml` with:

CLAUDE.md

Lines changed: 13 additions & 3 deletions
@@ -311,6 +311,14 @@ preview_html: null
 # Quality
 quality_score: 92

+# Implementation-level tags (describes HOW the code implements the plot)
+impl_tags:
+  dependencies: []  # External packages (scipy, sklearn, etc.)
+  techniques: [colorbar, annotations]  # Visualization techniques
+  patterns: [data-generation]  # Code patterns
+  dataprep: []  # Data transformations
+  styling: [publication-ready]  # Visual style
+
 # Review feedback (used for regeneration)
 review:
   # AI's visual description of the generated plot
@@ -364,8 +372,8 @@ Quality: 92/100 | Created: 2025-01-10
 - **Review feedback** stored in metadata for regeneration (AI reads previous feedback to improve)
 - **Extended review data**: `image_description`, `criteria_checklist`, and `verdict` for targeted fixes
 - Contributors credited via `suggested` field
-- Tags are at spec level (same for all libraries)
-- Per-library metadata updated automatically by `impl-review.yml` (quality score, review feedback)
+- **Two-level tagging**: Spec-level `tags` describe WHAT is visualized (same for all libraries), impl-level `impl_tags` describe HOW code implements it (per-library)
+- Per-library metadata updated automatically by `impl-review.yml` (quality score, review feedback, impl_tags)
 - `sync-postgres.yml` workflow syncs to database on push to main
 - Database stores full spec content (markdown) and implementation code (Python source)

@@ -473,7 +481,7 @@ uv run python -c "from core.database import is_db_configured; print(is_db_config
 - Spec content (full markdown from specification.md)
 - Spec metadata (title, description, tags, structured_tags from specification.yaml)
 - Implementation code (full Python source)
-- Implementation metadata (library, variant, quality score, generation info from metadata/*.yaml)
+- Implementation metadata (library, variant, quality score, impl_tags, generation info from metadata/*.yaml)
 - GCS URLs for preview images

 **What's in Repository** (source of truth):
@@ -512,6 +520,8 @@ The `prompts/` directory contains AI agent prompts for code generation, quality
 | `quality-evaluator.md` | AI quality evaluation prompt |
 | `spec-validator.md` | Validates plot request issues |
 | `spec-id-generator.md` | Assigns unique spec IDs |
+| `spec-tags-generator.md` | AI rules for spec-level tag assignment |
+| `impl-tags-generator.md` | AI rules for impl-level tag assignment |

 ### Using Prompts

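Because `impl_tags` are stored per library implementation, impl-level filter counts can be tallied per dimension across implementations. The sketch below illustrates that idea with plain dictionaries shaped like the metadata YAML. It is an assumption about how such counts might be derived, not the API's actual code, and `count_impl_tags` is a hypothetical name.

```python
from collections import Counter

IMPL_TAG_DIMENSIONS = ("dependencies", "techniques", "patterns", "dataprep", "styling")


def count_impl_tags(implementations):
    """Count, per dimension, how many implementations carry each tag."""
    counts = {dimension: Counter() for dimension in IMPL_TAG_DIMENSIONS}
    for impl in implementations:
        impl_tags = impl.get("impl_tags") or {}  # older metadata may lack the key
        for dimension in IMPL_TAG_DIMENSIONS:
            # set() so a tag repeated within one implementation counts once
            counts[dimension].update(set(impl_tags.get(dimension, [])))
    return counts
```

Implementations without `impl_tags` simply contribute nothing, which matches the nullable nature of the field during a backfill.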
Lines changed: 42 additions & 0 deletions
@@ -0,0 +1,42 @@
+"""add_impl_tags
+
+Add impl_tags JSONB column to impls table for issue #2434:
+Implementation-level tags describing HOW code is implemented.
+5 dimensions: dependencies, techniques, patterns, dataprep, styling
+
+Revision ID: a2f4b8c91d23
+Revises: 6345896e2e90
+Create Date: 2026-01-07
+
+"""
+
+from typing import Sequence, Union
+
+import sqlalchemy as sa
+from sqlalchemy.dialects import postgresql
+
+from alembic import op
+
+
+# revision identifiers, used by Alembic.
+revision: str = "a2f4b8c91d23"
+down_revision: Union[str, None] = "6345896e2e90"
+branch_labels: Union[str, Sequence[str], None] = None
+depends_on: Union[str, Sequence[str], None] = None
+
+
+def upgrade() -> None:
+    """Add impl_tags JSONB column and GIN index to impls table."""
+    # Add impl_tags column (JSONB for 5 tag dimensions)
+    op.add_column("impls", sa.Column("impl_tags", postgresql.JSONB(), nullable=True))
+
+    # GIN index for fast JSONB containment/existence queries
+    # Enables efficient filtering by any impl_tags dimension:
+    # - dependencies, techniques, patterns, dataprep, styling
+    op.execute("CREATE INDEX ix_impls_impl_tags ON impls USING GIN (impl_tags)")
+
+
+def downgrade() -> None:
+    """Remove impl_tags column and GIN index from impls table."""
+    op.execute("DROP INDEX ix_impls_impl_tags")
+    op.drop_column("impls", "impl_tags")
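The GIN index created in this migration supports the JSONB containment operator `@>`, so a filter such as "implementations that use the `colorbar` technique" can be expressed as a containment check against `impl_tags`. The sketch below only builds the SQL text and parameter. The `%s` placeholder assumes a psycopg-style driver, and `impl_tags_containment_query` is a hypothetical helper rather than part of the commit.

```python
import json

IMPL_TAG_DIMENSIONS = ("dependencies", "techniques", "patterns", "dataprep", "styling")


def impl_tags_containment_query(dimension, tags):
    """Build a parameterized query matching rows whose impl_tags contain all given tags."""
    if dimension not in IMPL_TAG_DIMENSIONS:
        raise ValueError(f"unknown impl_tags dimension: {dimension!r}")
    # JSONB containment: {"techniques": ["colorbar"]} matches any row whose
    # techniques array includes "colorbar".
    payload = json.dumps({dimension: sorted(set(tags))})
    sql = "SELECT * FROM impls WHERE impl_tags @> %s::jsonb"
    return sql, (payload,)
```

Passing the JSON document as a bound parameter, instead of formatting it into the SQL string, keeps tag values from being interpreted as SQL.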
