feat: implementation-level tags for code analysis (#2434) #3179
MarkusNeusinger merged 9 commits into main
- Introduced impl_tags to describe implementation details
- Updated workflows and scripts to handle new impl_tags structure
- Enhanced filtering and counting mechanisms for impl-level tags
Added impl_tags section with 5 dimensions:
- dependencies: 14 unique (selenium, scipy, sklearn most common)
- techniques: 24 unique (html-export, annotations, layer-composition)
- patterns: 10 unique (data-generation in 93% of files)
- dataprep: 9 unique (kde, normalization, regression)
- styling: 6 unique (alpha-blending, grid-styling, minimal-chrome)
Cleaned up styling tags:
- Removed 'publication-ready' (too vague, not meaningful)
- Fixed 'custom-colormap' to only tag actual cmap= usage
- Added 'minimal-chrome' where axis('off') is used
Updated tagging rules in prompts/impl-tags-generator.md
Issue #2434
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
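The per-dimension figures in this commit message (unique tag counts, and the share of files carrying a given tag) can be reproduced with a short script. This is a minimal sketch, assuming each metadata file has already been loaded into a dict with an `impl_tags` mapping from the five dimensions to lists of strings. The function names and the sample records are hypothetical and do not come from the repository.

```python
from collections import Counter

# The five impl_tags dimensions named in the commit message.
DIMENSIONS = ("dependencies", "techniques", "patterns", "dataprep", "styling")


def tag_counts(records: list[dict], dimension: str) -> Counter:
    """Count in how many records each tag of one dimension appears."""
    counts: Counter = Counter()
    for record in records:
        tags = record.get("impl_tags", {}).get(dimension, [])
        counts.update(set(tags))  # set() so a tag repeated in one file counts once
    return counts


def coverage(records: list[dict], dimension: str, tag: str) -> float:
    """Share of records (0.0 to 1.0) whose given dimension contains the tag."""
    if not records:
        return 0.0
    return tag_counts(records, dimension)[tag] / len(records)


# Hypothetical sample data, for illustration only.
sample = [
    {"impl_tags": {"patterns": ["data-generation"], "styling": ["alpha-blending"]}},
    {"impl_tags": {"patterns": ["data-generation", "iteration-over-groups"]}},
    {"impl_tags": {"dependencies": ["scipy"], "styling": ["minimal-chrome"]}},
    {"impl_tags": {"patterns": ["data-generation"], "dataprep": ["kde"]}},
]

unique_per_dimension = {dim: len(tag_counts(sample, dim)) for dim in DIMENSIONS}
```

Calling `coverage(sample, "patterns", "data-generation")` gives the fraction of sample records tagged `data-generation`, which is the same kind of number as the "93% of files" figure above.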
Added _create_direct_engine_sync() function that uses pg8000 driver for sync database operations. This allows running the sync script locally with DATABASE_URL instead of only in Cloud SQL environments.
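For readers unfamiliar with driver selection in SQLAlchemy, the idea behind a pg8000-based sync engine can be sketched as follows. This is not the code from connection.py: `to_pg8000_url` and `create_sync_engine` are hypothetical names, and the sketch assumes DATABASE_URL is a standard PostgreSQL URL, possibly carrying an async driver suffix such as `+asyncpg`.

```python
from urllib.parse import urlsplit, urlunsplit


def to_pg8000_url(database_url: str) -> str:
    """Rewrite a PostgreSQL URL so SQLAlchemy selects the sync pg8000 driver."""
    parts = urlsplit(database_url)
    base_scheme = parts.scheme.split("+", 1)[0]  # drop any "+driver" suffix
    if base_scheme not in ("postgres", "postgresql"):
        raise ValueError(f"not a PostgreSQL URL: {parts.scheme!r}")
    return urlunsplit(parts._replace(scheme="postgresql+pg8000"))


def create_sync_engine(database_url: str):
    """Build a synchronous engine (requires sqlalchemy and pg8000 installed)."""
    from sqlalchemy import create_engine  # imported lazily; third-party dependency

    return create_engine(to_pg8000_url(database_url), pool_pre_ping=True)
```

Keeping the URL rewrite separate from engine creation means the rewrite can be unit-tested without a database or the driver installed.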
- Update _image_matches_groups tests to include impl_lookup parameter
- Update _calculate_contextual_counts test to include impl_lookup
- Update _calculate_or_counts tests to include impl_lookup parameter
- Update init_db_sync test to mock _create_direct_engine_sync
- Update app version/description assertions to match current values
Add impl_dep, impl_tech, impl_pat, impl_prep, impl_style to the tracked filter categories for Plausible analytics.
Pull request overview
This PR introduces implementation-level tags to describe HOW code implements plots, complementing existing spec-level tags that describe WHAT is visualized. The system adds 5 new filter dimensions (dependencies, techniques, patterns, dataprep, styling) with corresponding UI filters and analytics tracking.
Key Changes:
- New two-level tagging architecture separating specification intent from implementation details
- Backend support for impl_tags in database schema, API responses, and filtering logic
- Frontend UI with 11 total filter categories (6 spec-level + 5 impl-level) including tooltips
- Backfilled 1,476 metadata files with implementation tags
Reviewed changes
Copilot reviewed 300 out of 1496 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| docs/reference/tagging-system.md | Documents two-level tagging system with naming conventions and examples |
| core/database/models.py | Adds impl_tags JSONB column to Impl model |
| core/database/connection.py | Adds sync engine creation for local development |
| automation/scripts/sync_to_postgres.py | Syncs impl_tags from YAML to database |
| alembic/versions/a2f4b8c91d23_add_impl_tags.py | Migration adding impl_tags column with GIN index |
| api/schemas.py | Adds impl_tags field and 5 new filter count categories |
| api/routers/plots.py | Implements filtering and counting logic for impl-level tags |
| app/src/types/index.ts | Defines 5 new filter categories with labels and tooltips |
| app/src/hooks/useAnalytics.ts | Tracks impl-level filter usage in Plausible |
| app/src/components/FilterBar.tsx | Displays tooltips for all filter categories |
| plots//metadata/.yaml | 1,476 files backfilled with impl_tags structure |
| .github/workflows/spec-create.yml | Updates tagging guide reference |
| CLAUDE.md | Documents impl_tags in metadata structure |
      def _create_direct_engine():
    -     """Create engine using direct DATABASE_URL connection."""
    +     """Create async engine using direct DATABASE_URL connection."""
Corrected docstring: function creates a sync engine, not async.
    -     filter_groups: list[dict], all_images: list[dict], spec_id_to_tags: dict, spec_lookup: dict
    +     filter_groups: list[dict], all_images: list[dict], spec_id_to_tags: dict, spec_lookup: dict, impl_lookup: dict
      ) -> list[dict]:
          """Calculate OR preview counts for each filter group."""
The function signature has 5 parameters but the docstring only documents the first 3. Add documentation for spec_lookup and impl_lookup parameters.
    -     """Calculate OR preview counts for each filter group."""
    +     """Calculate OR preview counts for each filter group.
    +
    +     Args:
    +         filter_groups: List of filter group dictionaries defining categories and values for filtering.
    +         all_images: List of image dictionaries to evaluate against the filter groups.
    +         spec_id_to_tags: Mapping from specification IDs to their associated tag metadata.
    +         spec_lookup: Mapping from specification IDs to full specification metadata, including tags.
    +         impl_lookup: Mapping from (spec_id, library) pairs to implementation-level tag metadata.
    +
    +     Returns:
    +         A list of dictionaries, one per filter group, mapping each possible value in that group
    +         to the number of images that would match when that value is applied with the other groups.
    +     """
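The behaviour this docstring describes can be illustrated with a simplified sketch. It is not the implementation in api/routers/plots.py: the data shapes, the `IMPL_KEYS` mapping, the helper names, and the sample spec IDs are all assumptions made for illustration, and the `spec_id_to_tags` argument is left out to keep the example short.

```python
from collections import Counter

# Assumed mapping from short impl-level filter keys to impl_tags dimensions.
IMPL_KEYS = {"dep": "dependencies", "tech": "techniques", "pat": "patterns",
             "prep": "dataprep", "style": "styling"}


def image_values(image, category, spec_lookup, impl_lookup):
    """Return the set of values an image carries for one filter category."""
    if category in IMPL_KEYS:
        # Impl-level tags are looked up per (spec_id, library) pair.
        impl_tags = impl_lookup.get((image["spec_id"], image["library"]), {})
        return set(impl_tags.get(IMPL_KEYS[category], []))
    if category == "lib":
        return {image["library"]}
    spec_tags = spec_lookup.get(image["spec_id"], {}).get("tags", {})
    return set(spec_tags.get(category, []))


def matches(image, group, spec_lookup, impl_lookup):
    """A group matches when the image has at least one of its values (OR)."""
    values = image_values(image, group["category"], spec_lookup, impl_lookup)
    return bool(values & set(group["values"]))


def or_preview_counts(filter_groups, all_images, spec_lookup, impl_lookup):
    """For each group, count images per value with all OTHER groups applied."""
    results = []
    for index, group in enumerate(filter_groups):
        others = filter_groups[:index] + filter_groups[index + 1:]
        counts = Counter()
        for image in all_images:
            if all(matches(image, g, spec_lookup, impl_lookup) for g in others):
                counts.update(
                    image_values(image, group["category"], spec_lookup, impl_lookup)
                )
        results.append(dict(counts))
    return results


# Hypothetical sample data.
images = [
    {"spec_id": "scatter-basic", "library": "matplotlib"},
    {"spec_id": "scatter-basic", "library": "bokeh"},
    {"spec_id": "heatmap-basic", "library": "matplotlib"},
]
spec_lookup = {
    "scatter-basic": {"tags": {"plot": ["scatter"]}},
    "heatmap-basic": {"tags": {"plot": ["heatmap"]}},
}
impl_lookup = {
    ("scatter-basic", "matplotlib"): {"dependencies": ["scipy"]},
    ("scatter-basic", "bokeh"): {"techniques": ["html-export"]},
    ("heatmap-basic", "matplotlib"): {"dependencies": ["scipy", "sklearn"]},
}
groups = [
    {"category": "lib", "values": ["matplotlib"]},
    {"category": "dep", "values": ["scipy"]},
]
preview = or_preview_counts(groups, images, spec_lookup, impl_lookup)
```

An image whose (spec_id, library) pair is missing from `impl_lookup` simply carries no impl-level values, so it never matches an impl-level group.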
    - custom-legend
    - bezier-curves
    - columndatasource
    - html-export
    patterns:
    - data-generation
    - iteration-over-groups
    - columndatasource
The value columndatasource appears in both techniques (line 20) and patterns (line 25). Based on the tagging documentation, columndatasource is a code structure pattern, not a visualization technique. Remove it from the techniques list.
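A check like this is easy to automate across the metadata files. The sketch below is a hypothetical helper, not part of the repository's tooling. It assumes the four list items shown before `patterns:` in the excerpt all belong to `techniques`.

```python
from collections import defaultdict


def tags_in_multiple_dimensions(impl_tags: dict[str, list[str]]) -> dict[str, list[str]]:
    """Map each tag that occurs in more than one dimension to those dimensions."""
    seen = defaultdict(list)
    for dimension, tags in impl_tags.items():
        for tag in set(tags):  # ignore repeats inside a single dimension
            seen[tag].append(dimension)
    return {tag: sorted(dims) for tag, dims in seen.items() if len(dims) > 1}


# Tags from the excerpt under review (other dimensions omitted).
impl_tags = {
    "techniques": ["custom-legend", "bezier-curves", "columndatasource", "html-export"],
    "patterns": ["data-generation", "iteration-over-groups", "columndatasource"],
}

duplicates = tags_in_multiple_dimensions(impl_tags)
```

Here `duplicates` flags `columndatasource` as present in both `patterns` and `techniques`, which is the case the review comment points out.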
6efc090 to 4016f0f
- Use short URL keys for impl-level filters: dep, tech, pat, prep, style (consistent with spec-level filters like lib, plot, data)
- Add FILTER_TOOLTIPS with descriptions for all 11 filter categories
- Update Plausible analytics orderedKeys for tracking
- Remove publication-ready from tagging docs example

Issue #2434
- Fix docstring: sync engine, not async (connection.py)
- Add missing docstring params for spec_lookup and impl_lookup (plots.py)
- Remove duplicate columndatasource from techniques (bokeh.yaml)
- Add 5 impl-level URL params: dep, tech, pat, prep, style
- Update category values reference with spec/impl distinction
- Add filter_* properties for og:image tracking
- Update API response example with impl-level counts
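To show how the five short URL params might map back to the `impl_tags` dimensions, here is a minimal parsing sketch. The key-to-dimension pairing is inferred from the names, the comma-separated value convention is an assumption, and `impl_filters_from_query` is a hypothetical helper rather than a function from the codebase.

```python
from urllib.parse import parse_qs

# Short URL keys from the commit message, mapped to impl_tags dimensions.
# The pairing is inferred from the key names and is an assumption.
IMPL_URL_KEYS = {
    "dep": "dependencies",
    "tech": "techniques",
    "pat": "patterns",
    "prep": "dataprep",
    "style": "styling",
}


def impl_filters_from_query(query: str) -> dict[str, list[str]]:
    """Extract impl-level filters from a URL query string, keyed by dimension."""
    params = parse_qs(query.lstrip("?"))
    filters: dict[str, list[str]] = {}
    for key, dimension in IMPL_URL_KEYS.items():
        # Assumed convention: several values for one key are comma-separated.
        values = [v for raw in params.get(key, []) for v in raw.split(",") if v]
        if values:
            filters[dimension] = values
    return filters
```

For example, `impl_filters_from_query("?lib=bokeh&dep=scipy,sklearn&style=minimal-chrome")` keeps only the impl-level keys and ignores the spec-level `lib` param.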
- Add tests for dep, tech, pat, prep, style filter matching
- Add test for impl_not_in_lookup case
- Add test for global_counts with impl_tags
- Add test for contextual_counts with impl_tags
    def _create_direct_engine_sync():
        """Create sync engine using direct DATABASE_URL connection (for sync scripts)."""
The function docstring says it's for 'sync scripts', but this is more specific - it's for local development scripts that need sync database access. Consider updating to match the explanation in init_db_sync() which mentions 'scripts like sync_to_postgres.py'.
    -     """Create sync engine using direct DATABASE_URL connection (for sync scripts)."""
    +     """Create sync engine using direct DATABASE_URL connection.
    +
    +     Used for local development scripts that need synchronous database access,
    +     such as scripts like sync_to_postgres.py.
    +     """
          filter_groups: list[dict], all_images: list[dict], spec_id_to_tags: dict, spec_lookup: dict, impl_lookup: dict
      ) -> list[dict]:
    -     """Calculate OR preview counts for each filter group."""
    +     """Calculate OR preview counts for each filter group.
    +
    +     Args:
    +         filter_groups: List of filter group dictionaries defining categories and values.
    +         all_images: List of image dictionaries to evaluate against the filter groups.
    +         spec_id_to_tags: Mapping from specification IDs to their associated tag metadata.
    +         spec_lookup: Mapping from specification IDs to full specification metadata.
    +         impl_lookup: Mapping from (spec_id, library) pairs to implementation-level tags.
    +
    +     Returns:
    +         List of dicts, one per filter group, mapping values to matching image counts.
    +     """
While the docstring was added, other functions in this file like _calculate_global_counts, _calculate_contextual_counts, _filter_images, and _image_matches_groups also gained new parameters but did not receive updated docstrings. For consistency, consider adding similar documentation to those functions.
Summary
impl_tags) to describe HOW code implements a plot
dep, tech, pat, prep, style
Changes
Backend
Frontend
Data
Test plan
Closes #2434