Commit 416b98a

Author: jgstern
Merge pull request 'Release v0.9.1' (#302) from jgstern-agent/release/0.9.1 into main
Reviewed-on: https://codeberg.org/iterabloom/hypergumbo/pulls/302
2 parents 6d18947 + 5e31b18 commit 416b98a

69 files changed

Lines changed: 13241 additions & 544 deletions

CHANGELOG.md

Lines changed: 16 additions & 2 deletions
Original file line number | Diff line number | Diff line change
@@ -2,14 +2,28 @@
22

33
All notable changes to hypergumbo are documented in this file. The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).
44

5-
- Released **tool** is at: v0.9.0
5+
- Released **tool** is at: v0.9.1
66
- Released **schema** is at: v0.2.0
77

88
This changelog tracks the **tool version** (package releases). The **schema version** (output format) is tracked separately in `schema.py` as `SCHEMA_VERSION`. The schema version only changes when the JSON output format has breaking changes.
99

1010
## [Unreleased]
1111

12-
## [0.9.0] - 2026-01-09
12+
## [0.9.1] - 2026-01-09
13+
14+
### Fixed
15+
- **Incomplete v0.9.0 release:** The v0.9.0 release was accidentally built from the wrong branch
16+
and did not include the ADR-0003 implementation (framework patterns, `--frameworks` flag,
17+
semantic entry detection, etc.). This release includes all the features described in the
18+
v0.9.0 changelog. Users should upgrade from v0.9.0 to v0.9.1.
19+
20+
### Added
21+
- Regenerated `docs/ARCHITECTURE.md` with current codebase metrics.
22+
23+
## [0.9.0] - 2026-01-09 (INCOMPLETE RELEASE)
24+
25+
> **Warning:** This release was built from the wrong branch and is missing most features
26+
> described below. Please use v0.9.1 instead.
1327
1428
### Changed (Breaking)
1529
- **Schema version 0.2.0:** The output schema version bumped from 0.1.0 to 0.2.0.
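Reviewer note, not part of this commit: the changelog header and `pyproject.toml` are bumped in lockstep here, so a small pre-tag check could confirm the two agree. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, and the regexes rely only on the "Released **tool** is at: vX.Y.Z" line and the `version = "X.Y.Z"` line visible in this diff. It would catch version-string drift only; it would not detect a build from the wrong branch.

```python
import re
from typing import Optional

# Matches the changelog header line shown in this diff.
CHANGELOG_RE = re.compile(r"Released \*\*tool\*\* is at: v(\d+\.\d+\.\d+)")
# Matches the [project] version line in pyproject.toml.
PYPROJECT_RE = re.compile(r'^version\s*=\s*"(\d+\.\d+\.\d+)"', re.MULTILINE)


def changelog_version(text: str) -> Optional[str]:
    """Return the tool version announced in the changelog header, if any."""
    match = CHANGELOG_RE.search(text)
    return match.group(1) if match else None


def pyproject_version(text: str) -> Optional[str]:
    """Return the package version declared in pyproject.toml, if any."""
    match = PYPROJECT_RE.search(text)
    return match.group(1) if match else None


def versions_agree(changelog_text: str, pyproject_text: str) -> bool:
    """True only when both versions are present and identical."""
    announced = changelog_version(changelog_text)
    declared = pyproject_version(pyproject_text)
    return announced is not None and announced == declared
```

In use, both files would be read from the checkout being tagged and `versions_agree` asserted before the tag is created.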

docs/ARCHITECTURE.md

Lines changed: 94 additions & 26 deletions
@@ -5,25 +5,25 @@
55
66
<!--
77
GENERATION METADATA (for drift detection):
8-
commit: 7adfcdc07367
9-
hypergumbo: 0.6.0
8+
commit: 83810f78db4d
9+
hypergumbo: 0.9.0
1010
python: 3.12.3
1111
-->
1212

1313
## Self-Analysis Summary
1414

1515
hypergumbo analyzed its own source code and found:
16-
- **105** Python modules (68 analyzers, 14 linkers)
17-
- **1638** symbols (functions, classes, methods)
18-
- **5966** edges (calls, imports, instantiates)
16+
- **107** Python modules (68 analyzers, 14 linkers)
17+
- **1688** symbols (functions, classes, methods)
18+
- **6143** edges (calls, imports, instantiates)
1919

2020
## Sketch (hypergumbo on hypergumbo)
2121

2222
```markdown
2323
# src
2424

2525
## Overview
26-
Python (100%) · 109 files · ~45,031 LOC
26+
Python (98%), Yaml (2%) · 126 files · ~48,452 LOC
2727

2828
## Structure
2929

@@ -46,10 +46,12 @@ Python (100%) · 109 files · ~45,031 LOC
4646
- `hypergumbo/cli.py`
4747
- `hypergumbo/metrics.py`
4848
- `hypergumbo/compact.py`
49+
- `hypergumbo/framework_patterns.py`
4950
- `hypergumbo/slice.py`
5051
- `hypergumbo/entrypoints.py`
5152
- `hypergumbo/build_grammars.py`
5253
- `hypergumbo/__main__.py`
54+
- `hypergumbo/sketch_embeddings.py`
5355
- `hypergumbo/llm_assist.py`
5456
- `hypergumbo/profile.py`
5557
- `hypergumbo/plan.py`
@@ -63,9 +65,7 @@ Python (100%) · 109 files · ~45,031 LOC
6365
- `hypergumbo/analyze/sql.py`
6466
- `hypergumbo/analyze/capnp.py`
6567
- `hypergumbo/analyze/groovy.py`
66-
- `hypergumbo/analyze/registry.py`
67-
- `hypergumbo/analyze/xml_config.py`
68-
- ... and 79 more files
68+
- ... and 81 more files
6969

7070
## Entry Points
7171

@@ -81,23 +81,24 @@ Python (100%) · 109 files · ~45,031 LOC
8181
- `Edge` (class) — A relationship between two symbols (e.g., function calls).
8282

8383
### `hypergumbo/analyze/base.py`
84+
- `node_text(node: 'tree_sitter.Node', source: bytes) -> str` (function) ★ — Extract text content for a tree-sitter node.
8485
- `iter_tree(root: 'tree_sitter.Node') -> Iterator['tree_sitter.Node']` (function) — Iterate over all nodes in a tree-sitter tree without recursion.
85-
- `node_text(node: 'tree_sitter.Node', source: bytes) -> str` (function) — Extract text content for a tree-sitter node.
8686

8787
### `hypergumbo/discovery.py`
8888
- `find_files(repo_root: Path, patterns: list[str], excludes: list[str] …` (function) — Find files matching patterns while respecting exclude rules.
8989

9090
### `hypergumbo/catalog.py`
9191
- `Pass` (class) — An analysis pass that can be applied to source code.
9292

93-
### `hypergumbo/analyze/julia.py`
94-
- `_find_child_by_type(node: 'tree_sitter.Node', type_name: str) -> Optional['tre…` (function) — Find first child of given type.
95-
- `_node_text(node: 'tree_sitter.Node', source: bytes) -> str` (function) — Extract text for a tree-sitter node.
96-
9793
### `hypergumbo/entrypoints.py`
9894
- `Entrypoint` (class) — A detected entrypoint in the codebase.
95+
- `_emit_path_deprecation_warning(framework: str) -> None` (function) — Emit a deprecation warning for path-based entrypoint detection.
9996
- `_get_filename(path: str) -> str` (function) — Extract filename from path.
10097

98+
### `hypergumbo/analyze/julia.py`
99+
- `_find_child_by_type(node: 'tree_sitter.Node', type_name: str) -> Optional['tre…` (function) — Find first child of given type.
100+
- `_node_text(node: 'tree_sitter.Node', source: bytes) -> str` (function) — Extract text for a tree-sitter node.
101+
101102
### `hypergumbo/analyze/rust.py`
102103
- `_find_child_by_field(node: 'tree_sitter.Node', field_name: str) -> Optional['tr…` (function) — Find child by field name.
103104

@@ -107,10 +108,13 @@ Python (100%) · 109 files · ~45,031 LOC
107108
### `hypergumbo/linkers/registry.py`
108109
- `LinkerResult` (class) — Result from running a linker.
109110

110-
(... and 1499 more symbols across 87 other files)
111+
### `hypergumbo/analyze/py.py`
112+
- `_format_annotation(node: ast.expr) -> str` (function) — Format a type annotation node to a readable string.
113+
114+
(... and 1554 more symbols across 89 other files)
111115

112116
The following symbols, for brevity shown only once above, would have appeared multiple times:
113-
- `_node_text` - we omitted 8 appearances of `_node_text`
117+
- `_node_text` - we omitted 6 appearances of `_node_text`
114118
- `_find_child_by_type` - we omitted 5 appearances
115119

116120
## All Files
@@ -167,13 +171,7 @@ The following symbols, for brevity shown only once above, would have appeared mu
167171
- `hypergumbo/analyze/php.py`
168172
- `hypergumbo/analyze/powershell.py`
169173
- `hypergumbo/analyze/proto.py`
170-
- `hypergumbo/analyze/py.py`
171-
- `hypergumbo/analyze/r_lang.py`
172-
- `hypergumbo/analyze/registry.py`
173-
- `hypergumbo/analyze/ruby.py`
174-
- `hypergumbo/analyze/rust.py`
175-
- `hypergumbo/analyze/scala.py`
176-
- ... and 51 more files
174+
- ... and 74 more files
177175
```
178176

179177
## Data Flow
@@ -188,7 +186,7 @@ Source Files
188186
│ │
189187
▼ ▼
190188
┌─────────────┐ ┌─────────────┐
191-
│ analyzers │────▶│ IR │ 1638 Symbols + 5966 Edges
189+
│ analyzers │────▶│ IR │ 1688 Symbols + 6143 Edges
192190
└─────────────┘ └─────────────┘
193191
│ │
194192
▼ ▼
@@ -210,12 +208,12 @@ These symbols have the highest in-degree (most referenced by other symbols):
210208

211209
| Symbol | Kind | In-Degree | Location |
212210
|--------|------|-----------|----------|
213-
| `Symbol` | class | 333 | ir.py |
211+
| `Symbol` | class | 334 | ir.py |
214212
| `Span` | class | 322 | ir.py |
215213
| `iter_tree` | function | 160 | base.py |
216214
| `find_files` | function | 147 | discovery.py |
215+
| `node_text` | function | 131 | base.py |
217216
| `Edge` | class | 127 | ir.py |
218-
| `node_text` | function | 98 | base.py |
219217
| `AnalysisRun` | class | 92 | ir.py |
220218
| `Pass` | class | 66 | catalog.py |
221219
| `_find_child_by_type` | function | 30 | julia.py |
@@ -229,6 +227,7 @@ These symbols have the highest in-degree (most referenced by other symbols):
229227
- **`compact`**: Compact output mode with coverage-based truncation and residual sum...
230228
- **`discovery`**: File discovery with exclude patterns.
231229
- **`entrypoints`**: Entrypoint detection heuristics for code analysis.
230+
- **`framework_patterns`**: Framework pattern matching for symbol enrichment (ADR-0003 v0.8.x).
232231
- **`ir`**: Internal Representation (IR) for code analysis.
233232
- **`limits`**: Limits tracking for behavior map output.
234233
- **`llm_assist`**: LLM-assisted capsule plan generation.
@@ -238,6 +237,7 @@ These symbols have the highest in-degree (most referenced by other symbols):
238237
- **`selection.filters`**: Path classification and symbol kind filtering for selection.
239238
- **`selection.language_proportional`**: Language-proportional symbol selection utilities.
240239
- **`selection.token_budget`**: Token estimation and budget management for LLM-aware output.
240+
- **`sketch_embeddings`**: Embedding-based config extraction for sketch generation.
241241
- **`slice`**: Graph slicing for LLM context extraction.
242242
- **`supply_chain`**: Supply chain classification for code analysis.
243243
- **`user_config`**: User configuration management for hypergumbo.
@@ -383,6 +383,74 @@ Provenance tracking for reproducibility:
383383
4. Create cross-language edges
384384
5. Add tests in `tests/test_<name>_linker.py`
385385

386+
## Framework Patterns Architecture (ADR-0003)
387+
388+
> **Note:** This section is manually maintained. See `docs/adr/0003-architectural-analysis-and-revision-plan.md` for the full design rationale.
389+
390+
ADR-0003 introduced a layered architecture for framework-aware analysis:
391+
392+
```
393+
Source Files
394+
395+
396+
┌─────────────────┐
397+
│ Analyzers │ Extract language-level metadata only
398+
│ (py.py, etc.) │ - Decorators, annotations, base classes
399+
└─────────────────┘ - No framework-specific interpretation
400+
401+
402+
┌─────────────────────────┐
403+
│ FRAMEWORK_PATTERNS │ Data-driven symbol enrichment
404+
│ (framework_patterns.py)│ - YAML pattern files define framework patterns
405+
└─────────────────────────┘ - Symbols enriched with concept metadata
406+
407+
408+
┌─────────────────┐
409+
│ Linkers │ Cross-language edge creation
410+
│ (http.py, etc)│ - Use concept metadata for matching
411+
└─────────────────┘ - Fall back to legacy meta.route_path
412+
```
413+
414+
### Key Components
415+
416+
- **`framework_patterns.py`**: Loads and applies YAML pattern files
417+
- **`frameworks/*.yaml`**: Pattern definitions for each framework
418+
- **`meta.concepts`**: List of matched concepts on enriched symbols
419+
- **`meta.annotations`/`meta.decorators`**: Raw metadata for pattern matching
420+
421+
### Adding a New Framework Pattern
422+
423+
1. Create `src/hypergumbo/frameworks/<framework>.yaml`
424+
2. Define patterns matching decorator/annotation names
425+
3. Specify concept types (route, model, task, etc.)
426+
4. Add extraction methods for path/method if needed
427+
5. Add tests in `tests/test_framework_patterns.py`
428+
429+
Example pattern:
430+
```yaml
431+
id: myframework
432+
language: python
433+
434+
patterns:
435+
- concept: route
436+
decorator: "^myapp\\.(get|post|put|delete)$"
437+
extract_path: "args[0]"
438+
extract_method: "decorator_suffix"
439+
```
440+
441+
### Migration Status (v1.0.x)
442+
443+
Analyzer-level route detection is deprecated. Deprecation warnings fire when:
444+
- Spring Boot/JAX-RS routes detected in Java
445+
- Django URL patterns detected in Python
446+
- ASP.NET Core routes detected in C#
447+
- Axum/Actix routes detected in Rust
448+
- Rails routes detected in Ruby
449+
- Laravel routes detected in PHP
450+
- Express routes detected in JavaScript/TypeScript
451+
452+
Use `--frameworks` flag with YAML patterns instead.
453+
386454
---
387455

388456
*Generated by `./scripts/generate-architecture` using hypergumbo self-analysis.*
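Reviewer note, not part of this commit: the example pattern added to `docs/ARCHITECTURE.md` above can be read as a regex applied to a decorator name. The sketch below is an illustration only and is not the code in `framework_patterns.py`. The function name, the returned dictionary shape, the reading of `decorator_suffix` as "the part after the last dot", and the upper-casing of the method are all assumptions. The regex itself is taken from the example, with the YAML double-quoted `\\.` unescaped to `\.`.

```python
import re
from typing import Optional

# Regex from the example pattern; YAML turns "\\." into "\." (a literal dot).
ROUTE_DECORATOR = re.compile(r"^myapp\.(get|post|put|delete)$")


def match_route(decorator_name: str, args: list) -> Optional[dict]:
    """Return hypothetical concept metadata if the decorator matches."""
    match = ROUTE_DECORATOR.match(decorator_name)
    if match is None:
        return None
    return {
        "concept": "route",
        # extract_path: "args[0]" -> first decorator argument, if present.
        "path": args[0] if args else None,
        # extract_method: "decorator_suffix" -> assumed to mean the name
        # after the last dot; upper-casing is also an assumption.
        "method": match.group(1).upper(),
    }
```

Because the dot is escaped and the pattern is anchored at both ends, a name such as `myappXget` or `myapp.patch` would not match.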

docs/RELEASE_SOP.md

Lines changed: 3 additions & 1 deletion
@@ -214,9 +214,11 @@ This script:
214214

215215
#### Step 3: Human Creates Signed Tag
216216

217+
**Before running:** Have both your Codeberg SSH key passphrase and GPG signing key passphrase ready. The script makes multiple SSH connections and a GPG signing operation in quick succession, and passphrase timeouts between steps can cause the push to fail.
218+
217219
```bash
218220
# Human runs this after merging the PR
219-
./scripts/tag-release 0.8.0
221+
./scripts/tag-release 0.9.1
220222
```
221223

222224
This script:

docs/hypergumbo-spec.md

Lines changed: 25 additions & 7 deletions
@@ -1636,13 +1636,31 @@ This prefers well-connected entries, producing richer slices.
16361636

16371637
See [ADR-0003 §5.2](adr/0003-architectural-analysis-and-revision-plan.md#52-migration-path) for the authoritative migration plan.
16381638

1639-
| Version | Focus | Entry Detection Impact |
1640-
|---------|-------|------------------------|
1641-
| **v0.6.x** | Path heuristics + exclusions | Current state |
1642-
| **v0.7.x** | Foundation: metadata enrichment, `--frameworks` flag | Analyzers capture richer metadata |
1643-
| **v0.8.x** | FRAMEWORK_PATTERNS phase (YAML-driven) | Symbols enriched with concept metadata |
1644-
| **v0.9.x** | Semantic entry detection | `entrypoints.py` queries enriched metadata; path heuristics deprecated (retained only for `main()` fallback) |
1645-
| **v1.0.x** | Complete extraction | All frameworks as YAML; all analyzers pure |
1639+
| Version | Focus | Entry Detection Impact | Status |
1640+
|---------|-------|------------------------|--------|
1641+
| **v0.6.x** | Path heuristics + exclusions | Current state | 🟩 |
1642+
| **v0.7.x** | Foundation: metadata enrichment, `--frameworks` flag | Analyzers capture richer metadata | 🟩 |
1643+
| **v0.8.x** | FRAMEWORK_PATTERNS phase (YAML-driven) | Symbols enriched with concept metadata | 🟩 |
1644+
| **v0.9.x** | Semantic entry detection | `entrypoints.py` queries enriched metadata; path heuristics deprecated (retained only for `main()` fallback) | 🟩 |
1645+
| **v1.0.x** | Complete extraction | All frameworks as YAML; all analyzers pure | 🟨 |
1646+
1647+
**v0.8.x (complete):**
1648+
- 🟩 `framework_patterns.py` module with YAML-driven pattern matching
1649+
- 🟩 FastAPI patterns YAML (`fastapi.yaml`)
1650+
- 🟩 `enrich_symbols()` called in CLI pipeline after analyzers
1651+
- 🟩 Linkers respect activation conditions (conditional execution)
1652+
- 🟩 Flask, NestJS, Spring Boot, Django, Express, Celery, Rails, Phoenix, Laravel, Go Web patterns
1653+
- 🟩 HTTP linker supports concept metadata from FRAMEWORK_PATTERNS phase
1654+
1655+
**v0.9.x (complete):**
1656+
- 🟩 Semantic entry detection via `_detect_from_concepts()` in entrypoints.py
1657+
- 🟩 Path heuristics deprecated with warnings
1658+
- 🟩 CLI `main()` detection retained as non-deprecated fallback
1659+
- 🟩 ASP.NET Core, Rust web (Actix, Axum, Rocket, Diesel, SeaORM), Hapi, Koa patterns
1660+
1661+
**v1.0.x (in progress):**
1662+
- 🟩 Deprecation warnings added to all analyzer-level route detection
1663+
- 🟧 Remaining: purify analyzers by removing framework logic entirely (post-deprecation)
16461664

16471665
## 9) Testing & quality bar
16481666

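Reviewer note, not part of this commit: the v0.9.x row describes entry detection that queries enriched metadata and keeps `main()` detection as a fallback. The sketch below shows one way that ordering could look. It is a guess at the shape, not the body of `_detect_from_concepts()`. The `Symbol` stand-in, the set of entry concepts, the representation of `meta.concepts` as a list of dictionaries with a `concept` key, and the choice to consult `main()` only when no concept matches are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Symbol:
    """Hypothetical stand-in; not hypergumbo's ir.Symbol."""
    name: str
    kind: str
    meta: dict = field(default_factory=dict)


# Assumed set of concepts that mark a symbol as an entry point.
ENTRY_CONCEPTS = {"route", "task"}


def detect_entrypoints(symbols: list) -> list:
    """Prefer concept-based detection; fall back to functions named main."""
    semantic = [
        s for s in symbols
        if any(c.get("concept") in ENTRY_CONCEPTS
               for c in s.meta.get("concepts", []))
    ]
    if semantic:
        return semantic
    # Fallback: plain main() functions, with no path heuristics involved.
    return [s for s in symbols if s.kind == "function" and s.name == "main"]
```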
pyproject.toml

Lines changed: 12 additions & 3 deletions
@@ -4,7 +4,7 @@ build-backend = "hatchling.build"
44

55
[project]
66
name = "hypergumbo"
7-
version = "0.9.0"
7+
version = "0.9.1"
88
description = "Local-first repo behavior map generator (MVP)"
99
readme = "README.md"
1010
requires-python = ">=3.10"
@@ -99,6 +99,15 @@ filterwarnings = [
9999
"ignore:.*analysis skipped.*requires tree-sitter:UserWarning",
100100
# tree-sitter deprecation warnings (older grammar packages)
101101
"ignore:int argument support is deprecated:DeprecationWarning",
102+
# Pack deprecation warnings (ADR-0003)
103+
"ignore:Packs are deprecated:DeprecationWarning",
104+
"ignore:PackConfig is deprecated:DeprecationWarning",
105+
]
106+
107+
[tool.coverage.run]
108+
# Omit optional modules that require extra dependencies
109+
omit = [
110+
"*/sketch_embeddings.py", # Requires sentence-transformers
102111
]
103112

104113
[tool.hatch.build.targets.wheel]
@@ -131,8 +140,8 @@ ignore = [
131140
]
132141

133142
[tool.ruff.lint.per-file-ignores]
134-
# Tests can use assert, hardcoded values, subprocess, temp paths
135-
"tests/**/*.py" = ["S101", "S105", "S106", "S603", "S108", "E741", "F841", "RUF005"]
143+
# Tests can use assert, hardcoded values, subprocess, temp paths, side-effect imports
144+
"tests/**/*.py" = ["S101", "S105", "S106", "S603", "S108", "E741", "F401", "F841", "RUF005"]
136145
# Conditional imports for optional tree-sitter deps
137146
"src/hypergumbo/analyze/*.py" = ["E402"]
138147
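Reviewer note, not part of this commit: the two `filterwarnings` entries added above use the `action:message:category` form. The sketch below reproduces the first one with the standard-library `warnings` module, to show that only warnings whose text starts with the given message are suppressed. The function name and the exact warning texts are hypothetical, and the sketch does not run pytest itself.

```python
import warnings


def recorded_messages() -> list:
    """Emit two deprecation warnings and return the ones not filtered out."""
    with warnings.catch_warnings(record=True) as caught:
        # Record everything by default inside this block.
        warnings.simplefilter("always")
        # Counterpart of "ignore:Packs are deprecated:DeprecationWarning".
        # Added after "always", so it is checked first.
        warnings.filterwarnings(
            "ignore",
            message="Packs are deprecated",
            category=DeprecationWarning,
        )
        # Hypothetical warning texts, for illustration only.
        warnings.warn("Packs are deprecated; use framework patterns",
                      DeprecationWarning)
        warnings.warn("some other deprecation", DeprecationWarning)
    return [str(w.message) for w in caught]
```

The message is matched as a regular expression against the start of the warning text, so unrelated deprecation warnings still surface in test output.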
