release: v0.10.0 (#213)

RaghavChamadiya · web-flow · commit b09126b679fb · 2026-05-18T23:18:51.000+05:30
diff --git a/docs/CHANGELOG.md b/docs/CHANGELOG.md
@@ -9,6 +9,42 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ---
 
+## [0.10.0] — 2026-05-18
+
+### Added
+- **Code health layer — a fifth intelligence layer alongside graph / git / docs / decisions.** New `repowise health` command, `health_*` SQLite tables, biomarker engine, and `/repos/[id]/health` web UI surface what hotspots actually *cost* to maintain. Tree-sitter complexity walker feeds 10+ biomarkers across structural (`large_method`, `nested_complexity`, `bumpy_road`, `complex_method`, `brain_method`, `primitive_obsession`), organizational (`developer_congestion`, `knowledge_loss`), test (`untested_hotspot`, `coverage_gap`), and DRY (`dry_violation`, backed by a tokenizer + Rabin-Karp clone detector with co-change correlation) categories. A composite 0–10 score is rolled up per file, per module (NLOC-weighted), and per repo. Findings persist with deterministic refactoring suggestions and an `acknowledged | resolved | false_positive` lifecycle. `HealthSnapshot` writer + trend detector surface declining-health alerts. `.repowise/health-rules.json` supports per-file overrides. `repowise update` runs an incremental upsert so the dashboard stays fresh without re-running the full pipeline. New `repowise health --trend`, `--refactoring-targets`, `--module`, `--file`, and `--coverage` flags; `repowise status` prints a one-line health digest. Parallel biomarker analysis via `asyncio.gather`. Full architecture deep-dive in `docs/CODE_HEALTH.md` (#212).
+- **Coverage ingestion (LCOV / Cobertura / Clover).** `repowise health --coverage report.lcov` parses one or more reports (format auto-detected, override with `--coverage-format`), persists per-file line + branch + covered-line sets to a new `coverage_files` table, and feeds two coverage-aware biomarkers — `untested_hotspot` (hotspot files with low coverage *or*, when no coverage is ingested, no paired test file) and `coverage_gap` (significant uncovered surface area on non-test files). New `/api/repos/{id}/health/coverage` endpoint + `/repos/[id]/health/coverage` page with risk × coverage scatter, module rollup, and per-file drilldown (#212).
+- **Code-health web UX overhaul — tabbed chrome, trend, scatter quadrants, file drawer.** `/health` is now a four-tab surface (Overview / Trend / Coverage / Refactoring) with shared `HealthPageChrome`, sparklines pulled from `/health/trend`, a 5th "Hotspot Health" KPI card, server-side paginated file table with sortable headers + filter chips (Hotspots / Untested / Failing) + path search, biomarker glossary tooltips, severity-distribution bars, slide-over `HealthFileDrawer` with score breakdown by category, impact × effort quadrant on the refactoring page, and one-click status mutation (Acknowledge / Resolved / False positive) wired to a new `PATCH /health/findings/{id}` endpoint. Inline `HealthBadge` chips appear on the hotspots / ownership / graph views so health context follows you across the app (#212).
+- **AI fix / test prompts on the refactoring and coverage pages.** Per-row `AI fix prompt` / `AI test prompt` buttons open a modal that picks a target agent (Generic / Claude Code / Cursor), previews the generated prompt (with biomarkers, line ranges, severities, score deductions, suggested directions, hard constraints, completion contract — and the explicit "verify each finding against the real code; treat analyzer output as leads, not ground truth" preamble), and copies to clipboard. Prompt builder is generic (`buildAiPrompt` / `buildCoverageAiPrompt`) so future surfaces can reuse it (#212).
+- **New `get_health` MCP tool + health enrichment on existing tools.** `get_health(include=['coverage'])` returns the score + biomarker breakdown + coverage summary; `get_context(include=['health'])` surfaces per-file score, top two biomarkers, and linked coverage row; `get_risk` rows gain `coverage_pct` / `branch_coverage_pct`; `get_overview` exposes hotspot-health KPIs. Auto-generated `CLAUDE.md` gains a Code Health section listing critical biomarkers so agents see the health context on every invocation (#212).
+- **C4 architecture diagrams (L1 System Context / L2 Containers / L3 Components).** New `/repos/[id]/c4` page (React Flow + ELK layout, URL-synced via `?level=` and `?container=` with nuqs), backed by `services/c4_builder/` and three endpoints under `/api/graph/{repo_id}/c4/{l1,l2,l3}`. Container detection re-uses manifest paths from a new `external_systems` table (manifest parsers for npm / PyPI / Cargo / Go / NuGet that capture name + version + ecosystem + heuristic category). Containers fall back to top-level dirs on repos without manifests. L3 components are subdirs inside a container. Inspector panel surfaces module-health context per component. SVG / PNG / Mermaid export menu with a `/c4/mermaid` server endpoint. Shared UI lives in `packages/ui/src/c4/` so the hosted product can reuse it (#204, closes #203).
+- **MCP tool surface bumped 7 → 8 with `get_symbol` exposed.** `get_symbol("path::Name")` returns raw source bytes for one indexed symbol with exact line bounds — cheaper and safer than `Read` + offset math. `get_context` was trimmed to a triage card (title, summary, signatures, `hotspot` bit, `decision_records` titles, `symbol_id` pointers) — the `include=["source"]` mode was removed; agents should pipe `symbol_id` into `get_symbol` instead. `get_risk` PR mode now emits a structured `directive` block (`will_break` / `missing_cochanges` / `missing_tests`) with capped co-change / transitive lists. `search_codebase` gains a `kind` filter and a per-result `search_method` (`embedding` vs `bm25` fallback) plus a bareword-identifier grep hint. Every response carries an `_meta` envelope (`index_age_days`, `indexed_commit`); a `stale_warning` fires only when the indexed HEAD actually diverges from `.git/HEAD`, so silence is trustworthy (#210).
+- **`get_answer` rewritten as a hybrid retrieval pipeline.** FTS + vector store run in parallel, merged via reciprocal-rank fusion, PageRank-biased, expanded one graph hop to rescue near-misses, fused with decision records on "why"-shaped questions, and prepended with a structured prelude (top symbols, recent significant commits, decision titles). Confidence and `retrieval_quality` report independently so synthesis quality and retrieval quality don't get conflated. Low-confidence returns now ship `best_guesses` with one-line justifications instead of an empty answer. Schema-versioned cache auto-invalidates earlier-pipeline payloads (#210).
+- **Doc-generation upgrade — enforced coverage budget, faster runs, wiki interlinking.** New `generation/selection/` package is the single source of truth for which pages get emitted; `PageGenerator` and `cost_estimator` both consume it, so the pre-run estimate can no longer drift from the actual run. `GenerationConfig.coverage_pct` (default 0.20) is the user-facing knob, with per-bucket shares across `file_page` / `symbol_spotlight` / `module_page` / `api_contract` / `infra_page` / `scc_page` — no more bypass paths around the budget. New interactive coverage chooser shows per-bucket counts and a cost range (10 / 15 / 20 / 30 / 40 / 50 %) computed from live ingestion data, with self-calibration from prior runs' `wiki_pages`; `--coverage` CLI flag for non-TTY use. Prompt caching via provider-agnostic `CacheHint` (Anthropic emits `cache_control`, OpenAI surfaces `cached_tokens`). Persistent cross-run page cache short-circuits the LLM call when `source_hash + model` match. Module pages now group by graph community (default `min_module_size=3`) instead of top-level dir — handful of generic per-directory pages → 30–80 genuinely scoped module pages on large repos. Dead-code findings, decisions, and external systems now flow into `file_page` / `module_page` / `repo_overview` contexts. New `api_contract_detector` routes FastAPI routers and ASP.NET controllers through the dedicated `api_contract` template (#208).
+- **Onboarding documentation collection — 8 curated pages, default-on at `repowise init`.** New `generation/onboarding/` subpackage emits up to eight gated subkinds (`codebase_map` always; `getting_started`, `key_concepts`, `how_it_works`, `development_guide`, `active_landscape` gated on signals like manifest presence, PageRank P90 symbols, execution-flow depth, suffix patterns, recent commit volume) plus two promoted slots (`project_overview`, `architecture_guide`) that reuse the existing `repo_overview` / `architecture_diagram` pages via `metadata.onboarding_slot`. UI renders an "Onboarding" folder at the top of the docs tree (Compass icon, auto-expanded, canonical slot order). `--onboarding` / `--no-onboarding` flag on `repowise init`, persisted to `config.yaml` (#208).
+- **Wiki interlinking.** Post-gen regex scan resolves backtick refs to other pages' `page_id`s and populates `metadata.wiki_links` + reverse-index `metadata.backlinks`. New `WikiLink` MDX component renders resolved refs as clickable anchors; new `BacklinksPanel` in the wiki sidebar surfaces pages linking into the current one (#208).
+- **Pipeline progress for previously silent phases.** Added phase events around `tsconfig` (TS path-alias resolver init), `dynamic_hints` (HintRegistry edge extraction), and `external_systems` (manifest parsing). The two graph aggregations (`graph.metrics`, `graph.communities`) now emit per-algorithm completion lines as each `asyncio.gather` task finishes — output makes it obvious which algorithm is the bottleneck (almost always betweenness on the symbol graph for medium+ repos) (#206).
+
+### Changed
+- **Dead-code analyzer cuts ~390 false positives across resolver, parser, and analyzer.** Alembic migration scripts under `*/alembic/versions/*.py` are never-flagged (reflectively loaded). Click / Typer decorators on locally-named groups (`@my_cli.command("add")`) are now recognised via suffix matching. `unused_internal` counts an incoming `imports` edge whose `imported_names` lists the symbol — catches dispatch-table patterns (`HANDLERS = {"python": _extract_python, ...}`). Entry-point allowlist extends to WSGI / ASGI / Flask / FastAPI factory conventions (`create_app`, `make_app`, `application`, `get_asgi_application`, …). Bare relative Python imports (`from . import a, b`) now expand into per-name `Import` objects so plugin barrels resolve. Symbol extraction skips any AST node nested inside a callable, so React handler closures and async-method-local coroutines no longer hoist to the top-level symbol list. `.tsx` files now parse with tree-sitter's JSX-aware grammar. JSX elements (`<Component />`) register as call sites for the named component. Public symbols in files imported as namespaces (`from . import cargo`, `import * as cargo from "./cargo"`) are rescued — the static graph can't tell which attribute is being dispatched, so flagging individuals yields guaranteed false positives. TS workspace resolver honours `package.json#exports` (conditional, wildcard, longest-prefix) so turborepo / nx / pnpm monorepos resolve through subpath exports — verified `−23 %` dead-code findings on the dogfooded monorepo. Win32 entry points (`wWinMain`, `WinMain`, `wmain`, `ServiceMain`, `LowLevelKeyboardProc`, MSTest macro family) and never-flag globs for precompiled-header anchors (`pch.h` / `stdafx.cpp`), COM `*ClassFactory.cpp`, and broader test-project conventions skipped on C++ codebases — roughly 520 high-confidence findings cleared on PowerToys (#194, #207).
+- **Ingestion treats nested git repos as traversal boundaries by default.** When a working tree physically contains other independent git repositories as subdirectories (workspace roots that are themselves versioned, sibling repos checked out under a parent), the traverser walked into them and pulled in their entire file trees. Now a `.git` entry (directory, submodule file, or external gitdir pointer) is a hard traversal boundary. Opt-in `include_nested_repos=True` preserves the old behaviour. New `skipped_nested_repo` counter surfaced in the filtering summary (#205).
+- **CLI editor setup refactored into an integrations package** with per-editor strategy classes (Claude / Cursor / generic) so adding a new editor is a new file rather than edits across the suite (#199).
+- **Dynamic-hint extraction is no longer the wall-clock-stall phase.** The 13 dynamic-hint extractors used to call `repo_root.rglob(pattern)` independently — each descended into `node_modules`, `.venv`, `.next`, `__pycache__`, and on Windows followed directory junctions into infinite loops. New `_walk.iter_glob` helper does `os.walk` with in-place dirname pruning, `followlinks=False`, realpath-based cycle detection, and a hard depth cap of 64. `HintRegistry.extract_all` runs the 13 extractors in a `ThreadPoolExecutor`. Walk completes in ~2 min on polyrepos with recursive junctions vs. the prior indefinite stall (#208).
+- **Embedder no longer blocks the LLM critical path.** Post-LLM embed-and-upsert spawned as a background task so the next wave's LLM calls start immediately; the level still drains pending embeds before advancing so the next level's RAG search sees a fully-indexed store. New `enable_rag_context` and `rag_min_store_size` config knobs short-circuit the RAG search on cheap models and on early pages before the store has enough indexed material to return useful hits (#208).
+- **OpenAI default model bumped from `gpt-4.1` to `gpt-5.4-nano`** in both the interactive provider picker and the web settings placeholder, to match the in-app cost-tier recommendation (#208).
+- **Tighter README tagline and corpus framing.** New tagline: *"The codebase intelligence layer for your AI coding agent."* Drops the misleading "500 commits" references (the cap is per-file, not a global corpus cap) and softens engineering-team-only framing so solo devs see themselves in the README (#196). Refreshed `webui.gif`, compressed 16.6 MB → 8.5 MB (#198).
+
+### Fixed
+- **`repowise update` post-commit auto-sync rewrite.** The hook was racing with itself — concurrent invocations from rapid commits all started from the same stale base, took 12+ minutes each, never converged, and discarded output to `/dev/null`. `repowise update` now enforces single-flight via `.repowise/.update.lock`; if another update is running, the new invocation writes `.update.pending` with the current HEAD and exits, and the running update rolls forward to it. Hook pre-writes `.update.queued` synchronously before backgrounding so the augment hook sees an in-flight marker during the start-up window. Augment hook emits *"Wiki update in background — started Ns ago, target X"* instead of *"Wiki is stale"* when a marker is present. Stdout / stderr captured to `.repowise/.update.log` (rotated to 64 KB tail when it exceeds 256 KB) so silent failures are diagnosable. Hook installer upgrades the marker block in place when the body differs (previously bailed with *"already installed"* and left users stuck on the buggy hook after a repowise upgrade). Cross-platform — git always runs hooks under POSIX sh, so the same script body works on Linux / macOS / Windows (#211).
+- **CI integration test failures introduced by the doc-generation PR.** `_TrackingProvider` mock didn't accept `cache_hints`; `_SlowVectorStore` mock didn't implement `get_page_summaries_by_paths`; `test_level_values_in_range` asserted level ≤ 7 but onboarding has been level 8 since phase 3. Also: `select_pages` now allocates all candidates when total supply ≤ budget (so `coverage_pct=1.0` on tiny repos returns pages instead of zero), and `score_file` applies a tiny per-symbol floor so leaf modules with zero PageRank still enter the candidate pool (#209).
+- **Packaging: three `__init__.py` files silently dropped by a local `_*.py` exclude rule** in `.git/info/exclude` — `selection`, `cost_estimator`, `onboarding`, `external_systems`, and `dynamic_hints/_walk.py` all hit "module has no attribute" import errors on CI before this was caught. The `c4_builder` `__init__.py` was hit by the same rule and force-added in #204.
+
+### Documentation
+- **`CODE_HEALTH.md` user guide** + CLI reference entry + READMEs touched up to mention the fifth intelligence layer and the 8-tool MCP surface. The MCP-tools table now leads each row with *what only that tool answers* and surfaces the new signals (`retrieval_quality`, `best_guesses`, `search_method`, `hotspot` bit, `decision_records` pointer, PR-mode directive block, `_meta.stale_warning`) (#210, #212).
+- **Removed `AUDIT_NOTES.md`** — internal scratchpad, not intended to ship in the public repo (#197).
+
+---
+
 ## [0.9.1] — 2026-05-13
 
 ### Fixed
diff --git a/packages/cli/src/repowise/cli/__init__.py b/packages/cli/src/repowise/cli/__init__.py
@@ -6,4 +6,4 @@
 AI-generated documentation.
 """
 
-__version__ = "0.9.1"
+__version__ = "0.10.0"
diff --git a/packages/core/src/repowise/core/__init__.py b/packages/core/src/repowise/core/__init__.py
@@ -6,4 +6,4 @@
 Namespace package: repowise.core is part of the repowise namespace.
 """
 
-__version__ = "0.9.1"
+__version__ = "0.10.0"
diff --git a/packages/server/src/repowise/server/__init__.py b/packages/server/src/repowise/server/__init__.py
@@ -7,4 +7,4 @@
     - Background job scheduler (APScheduler)
 """
 
-__version__ = "0.9.1"
+__version__ = "0.10.0"
diff --git a/pyproject.toml b/pyproject.toml
@@ -7,8 +7,8 @@ build-backend = "setuptools.build_meta"
 
 [project]
 name = "repowise"
-version = "0.9.1"
-description = "Codebase intelligence for developers and AI — generates and maintains a structured wiki for any codebase"
+version = "0.10.0"
+description = "The codebase intelligence layer for your AI coding agent — dependency graph, git history, dead code, decisions, and code health, exposed over MCP"
 readme = "README.md"
 requires-python = ">=3.11"
 license = "AGPL-3.0-only"
diff --git a/uv.lock b/uv.lock