Commit b88cc11

feat: add coverage join and golden-fixture clone exclusions
- join external Cobertura coverage.xml into current-run metrics using stdlib XML parsing
- surface coverage join facts consistently in CLI, reports, MCP, SARIF, and HTML
- distinguish measured coverage hotspots from coverage scope gaps in canonical findings
- add project-level golden_fixture_paths policy and carry excluded clone groups as suppressed report facts
- surface suppressed golden fixtures in Clones UI/CLI without affecting health or gates
- fix cached segment projection branch behavior and benchmark/cache regressions
- preserve cached public API parameter order and keep warm/cold API diffs stable
- bump canonical report schema and refresh docs, MCP contracts, and regression tests
1 parent 7ef49d0 commit b88cc11

80 files changed, +5757 -13929 lines changed

CHANGELOG.md

Lines changed: 35 additions & 48 deletions
@@ -2,56 +2,43 @@

 ## [2.0.0b5]

+Expands the canonical contract with adoption, API-surface, and coverage-join layers; clarifies run interpretation
+across MCP/HTML/clients; tightens MCP launcher/runtime behavior.
+
 ### Contracts, metrics, and review surfaces

-- Bump canonical report schema to `2.5` for `metrics.families.coverage_adoption`
-and `metrics.families.api_surface`.
-- Bump clone baseline schema to `2.1` and standalone metrics-baseline schema to
-`1.2` for compact `api_surface` wire payloads (`local_name` on disk,
-reconstructed full qualnames in runtime) while keeping read-compatibility for
-earlier `2.0` / `1.1` baseline files in the current b5 line.
-- Add shared public/private visibility classification for public-symbol-aware
-metrics, without changing clone/fingerprint semantics.
-- Add canonical type/docstring adoption coverage:
-parameter coverage, return coverage, public docstring coverage, and explicit
-`Any` counts.
-- Add opt-in public API surface inventory and baseline diff:
-public symbol snapshots, added symbols, and breaking changes against a
-trusted metrics baseline.
-- Add new gates:
-`--min-typing-coverage`, `--min-docstring-coverage`,
-`--fail-on-typing-regression`, `--fail-on-docstring-regression`,
-`--fail-on-api-break`.
-- Surface adoption and API metrics compactly in MCP summaries/detail, the HTML
-Overview tab, and canonical report payloads without adding a new HTML tab.
-- Extend the normal CLI `Metrics` block with adoption coverage and public API
-facts, while keeping the quiet compact metrics line unchanged.
-- Make unified clone baselines preserve embedded metrics and optional
-`api_surface` payloads safely across saves.
-
-### MCP, HTML, and docs
-
-- Surface the effective runtime analysis profile (`min_loc`, `min_stmt`, block, and segment thresholds) in canonical
-report metadata, MCP summary/triage projections, and the HTML Executive Summary subtitle.
-- Clarify MCP interpretation with compact `health_scope`, `focus`, and `new_by_source_kind` fields in summary/triage
-projections.
-- Make baseline mismatch handling more explicit in MCP and the VS Code client by surfacing baseline/runtime python tags
-and whether comparison is proceeding without a valid baseline.
-- Make the Claude Desktop bundle and Codex plugin prefer workspace-local launchers before `PATH`, with Poetry environment fallback for
-python-tag-safe MCP startup.
-- Add `workspace_root` user-config field to the Claude Desktop bundle: setting it to the project directory forces the
-launcher to prefer `.venv` inside that path even when Claude Desktop starts with a different working directory
-(fixes python-tag mismatch caused by system-wide interpreter fallback).
-- Validate `git_diff_ref` inputs as safe single revision expressions in both
-CLI and MCP before invoking `git diff`.
-- Replace the segment-group raw digest `repr()` payload with canonical JSON
-bytes for cross-version-safe determinism.
-- Align the tests workflow coverage gate with the canonical `fail_under = 99`
-policy and refresh the remaining `actions/checkout` pin in `codeclone.yml`.
-- Refresh branch metadata and client docs for the `2.0.0b5` line.
-- Update the README repository health badge to `87 (B)`.
-
-## [2.0.0b4]
+- Report schema `2.8`: add `coverage_adoption`, `api_surface`, `coverage_join`, and optional
+`clones.suppressed.*` (for `golden_fixture_paths`); separate coverage hotspots vs scope gaps.
+- Baselines: clone `2.1`, metrics `1.2`; compact `api_surface` payload (`local_name` on disk, qualnames at runtime);
+read-compatible with `2.0` / `1.1`.
+- Add public/private visibility classification for public-symbol metrics (no clone/fingerprint changes).
+- Add annotation/docstring adoption coverage: parameter, return, public docstrings, explicit `Any`.
+- Add opt-in API surface inventory + baseline diff (snapshots, additions, breaking changes).
+- Add coverage join (`--coverage`): per-function facts + findings for below-threshold or missing-in-scope functions;
+current-run only (not baseline truth, no fingerprint impact).
+- Add `golden_fixture_paths`: exclude matching clone groups from health/gates while keeping suppressed facts.
+- Add gates: `--min-typing-coverage`, `--min-docstring-coverage`, `--fail-on-typing-regression`,
+`--fail-on-docstring-regression`, `--fail-on-api-break`, `--fail-on-untested-hotspots`, `--coverage-min`.
+- Surface adoption/API/coverage-join in MCP, CLI Metrics, report payloads, and HTML (Overview + Quality subtab).
+- Preserve embedded metrics and optional `api_surface` in unified baselines.
+- Cache `2.4`: drop stale API-surface entries; preserve parameter order; align warm/cold API diffs.
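
Review note: as a rough illustration of the coverage join listed in the entries above, the sketch below reads Cobertura-style XML with the standard library and computes line coverage for a function span. It is not CodeClone's implementation. The helper names (`covered_lines_by_file`, `function_coverage`), the inline sample document, and the rule that a span with no measured lines yields `None` instead of 0% are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Tiny Cobertura-style document (hypothetical data) so the sketch is self-contained.
SAMPLE_COBERTURA = """<coverage>
  <packages>
    <package name="pkg">
      <classes>
        <class name="mod.py" filename="pkg/mod.py">
          <lines>
            <line number="1" hits="1"/>
            <line number="2" hits="0"/>
            <line number="3" hits="4"/>
            <line number="7" hits="0"/>
          </lines>
        </class>
      </classes>
    </package>
  </packages>
</coverage>
"""


def covered_lines_by_file(xml_text: str) -> dict:
    """Map each filename to {line_number: hits} from Cobertura-style XML."""
    root = ET.fromstring(xml_text)
    hits = {}
    for cls in root.iter("class"):
        filename = cls.get("filename")
        if filename is None:
            continue
        per_file = hits.setdefault(filename, {})
        for line in cls.iter("line"):
            number = int(line.get("number", "0"))
            count = int(line.get("hits", "0"))
            # A line can be listed more than once; keep the highest hit count.
            per_file[number] = max(per_file.get(number, 0), count)
    return hits


def function_coverage(line_hits: dict, start: int, end: int) -> Optional[float]:
    """Percent of measured lines in [start, end] with hits > 0.

    Returns None when no line in the span was measured (assumed to mean a
    scope gap rather than an untested hotspot).
    """
    measured = [n for n in line_hits if start <= n <= end]
    if not measured:
        return None
    covered = sum(1 for n in measured if line_hits[n] > 0)
    return 100.0 * covered / len(measured)


hits = covered_lines_by_file(SAMPLE_COBERTURA)
print(function_coverage(hits["pkg/mod.py"], 1, 3))    # measured span
print(function_coverage(hits["pkg/mod.py"], 10, 12))  # unmeasured span
```

Returning `None` for unmeasured spans keeps "missing in scope" distinguishable from "measured but below threshold", which is one way to model the hotspot versus scope-gap split named in the changelog.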
+
+### MCP, HTML, and client interpretation
+
+- Surface effective analysis profile in report meta, MCP summary/triage, and HTML subtitle.
+- Add `health_scope`, `focus`, `new_by_source_kind` to MCP summary/triage.
+- Make baseline mismatch explicit (python tags + no-valid-baseline signal).
+- Prefer workspace-local launchers over `PATH` (Poetry fallback).
+- Add `workspace_root` to force project `.venv` selection.
+
+### Safety and maintenance
+
+- Validate `git_diff_ref` as safe single-revision expressions.
+- Replace segment digest `repr()` with canonical JSON bytes (determinism).
+- Align CI coverage gate (`fail_under = 99`) and refresh `actions/checkout` pin.
+- Refresh branch metadata/docs for `2.0.0b5`; update README badge to `89 (B)`.
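
Review note: the switch from a `repr()` payload to canonical JSON bytes can be sketched as follows. This is a minimal illustration and not the project's code: the function names, the payload fields, and the choice of SHA-256 are assumptions. The point it shows is that sorted keys and fixed separators make the digest independent of dict insertion order, which `repr()` does not guarantee.

```python
import hashlib
import json


def canonical_json_bytes(payload: object) -> bytes:
    """Serialize to bytes that do not depend on dict insertion order."""
    return json.dumps(
        payload,
        sort_keys=True,          # stable key order
        separators=(",", ":"),   # no optional whitespace
        ensure_ascii=True,       # identical bytes regardless of locale
    ).encode("utf-8")


def group_digest(payload: object) -> str:
    """Hex digest over the canonical bytes (SHA-256 is an assumed choice)."""
    return hashlib.sha256(canonical_json_bytes(payload)).hexdigest()


# Hypothetical segment-group payloads with the same content in different key order.
a = {"segment": "s1", "lines": [10, 20], "file": "a.py"}
b = {"file": "a.py", "lines": [10, 20], "segment": "s1"}

print(group_digest(a) == group_digest(b))  # canonical form agrees
print(repr(a) == repr(b))                  # repr() follows insertion order
```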
+
+## [2.0.0b4] - 2026-04-05

 ### MCP server
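
Review note: the `golden_fixture_paths` policy in this changelog excludes matching clone groups from health and gates while keeping them as suppressed facts. A minimal sketch of that partition is below. It assumes glob-style matching via `fnmatch` and assumes a group is suppressed only when every member path matches; neither detail is confirmed by the commit, and both helper names are hypothetical.

```python
from fnmatch import fnmatch


def is_golden_fixture(path: str, patterns: list) -> bool:
    """True if the path or any parent directory matches a pattern.

    Glob semantics are an assumption made for this sketch.
    """
    parts = path.split("/")
    prefixes = ["/".join(parts[: i + 1]) for i in range(len(parts))]
    return any(fnmatch(prefix, pattern) for prefix in prefixes for pattern in patterns)


def split_clone_groups(groups: list, patterns: list) -> tuple:
    """Partition clone groups into (active, suppressed).

    Assumed rule: suppress a group only when all of its member paths are
    golden fixtures, so a clone shared with production code stays active.
    """
    active, suppressed = [], []
    for group in groups:
        if group and all(is_golden_fixture(p, patterns) for p in group):
            suppressed.append(group)
        else:
            active.append(group)
    return active, suppressed


patterns = ["tests/fixtures/golden_*"]
groups = [
    ["tests/fixtures/golden_a/x.py", "tests/fixtures/golden_b/x.py"],
    ["tests/fixtures/golden_a/x.py", "src/app.py"],
]
active, suppressed = split_clone_groups(groups, patterns)
print(len(active), len(suppressed))
```

Keeping the suppressed groups in a separate list, rather than dropping them, matches the stated intent that suppressed facts remain visible in reports without affecting gates.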

README.md

Lines changed: 15 additions & 6 deletions
@@ -16,7 +16,7 @@
 <a href="https://github.com/orenlab/codeclone/actions/workflows/tests.yml"><img src="https://github.com/orenlab/codeclone/actions/workflows/tests.yml/badge.svg?branch=main&style=flat-square" alt="Tests"></a>
 <a href="https://github.com/orenlab/codeclone/actions/workflows/benchmark.yml"><img src="https://github.com/orenlab/codeclone/actions/workflows/benchmark.yml/badge.svg?style=flat-square" alt="Benchmark"></a>
 <a href="https://pypi.org/project/codeclone/"><img src="https://img.shields.io/pypi/pyversions/codeclone.svg?style=flat-square" alt="Python"></a>
-<a href="https://github.com/orenlab/codeclone"><img src="https://img.shields.io/badge/codeclone-87%20(B)-green" alt="codeclone 87 (B)"></a>
+<a href="https://github.com/orenlab/codeclone"><img src="https://img.shields.io/badge/codeclone-89%20(B)-green" alt="codeclone 89 (B)"></a>
 <a href="#license"><img src="https://img.shields.io/badge/license-MPL--2.0-brightgreen?style=flat-square" alt="License"></a>
 </p>

@@ -43,8 +43,8 @@ Live sample report:
 - **Clone detection** — function (CFG fingerprint), block (statement windows), and segment (report-only) clones
 - **Structural findings** — duplicated branch families, clone guard/exit divergence and clone-cohort drift (report-only)
 - **Quality metrics** — cyclomatic complexity, coupling (`CBO`), cohesion (`LCOM4`), dependency cycles, dead code,
-health score, type/docstring adoption coverage, public API surface diff, and report-only `Overloaded Modules`
-profiling
+health score, type/docstring adoption coverage, current-run Cobertura coverage join, public API surface diff, and
+report-only `Overloaded Modules` profiling
 - **Baseline governance** — separates accepted **legacy** debt from **new regressions** and lets CI fail **only** on
 what changed
 - **Reports** — interactive HTML, deterministic JSON/TXT plus Markdown and SARIF projections from one canonical report
@@ -148,11 +148,17 @@ codeclone . --min-typing-coverage 80 --min-docstring-coverage 60
 codeclone . --fail-on-typing-regression --fail-on-docstring-regression
 codeclone . --api-surface --update-metrics-baseline
 codeclone . --fail-on-api-break
+
+# Current-run Cobertura hotspot gate
+codeclone . --coverage coverage.xml --fail-on-untested-hotspots --coverage-min 50
 ```

 In normal full-mode CLI output, CodeClone now surfaces adoption coverage
 (`params`, `returns`, `docstrings`, `Any`) in the main `Metrics` block, and it
-adds a `Public API` line when `--api-surface` facts are collected.
+adds a `Public API` line when `--api-surface` facts are collected. Passing
+`--coverage FILE` adds a `Coverage` line from external Cobertura XML, surfaces
+joined details under HTML `Quality -> Coverage Join` and MCP/report
+`coverage_join`, and does not update the clone baseline.

 ### Pre-commit

@@ -208,6 +214,7 @@ CodeClone can load project-level configuration from `pyproject.toml`:
 min_loc = 10
 min_stmt = 6
 baseline = "codeclone.baseline.json"
+golden_fixture_paths = ["tests/fixtures/golden_*"]
 skip_metrics = false
 quiet = false
 html_out = ".cache/codeclone/report.html"
@@ -288,11 +295,11 @@ class Middleware: # codeclone: ignore[dead-code]
 Dynamic/runtime false positives are resolved via explicit inline suppressions, not via broad heuristics.

 <details>
-<summary>Canonical JSON report shape (v2.5)</summary>
+<summary>Canonical JSON report shape (v2.8)</summary>

 ```json
 {
-"report_schema_version": "2.5",
+"report_schema_version": "2.8",
 "meta": {
 "codeclone_version": "2.0.0b5",
 "project_name": "...",
@@ -362,11 +369,13 @@ Dynamic/runtime false positives are resolved via explicit inline suppressions, n
 "summary": {
 "...": "...",
 "coverage_adoption": { "...": "..." },
+"coverage_join": { "...": "..." },
 "api_surface": { "...": "..." }
 },
 "families": {
 "...": "...",
 "coverage_adoption": { "...": "..." },
+"coverage_join": { "...": "..." },
 "api_surface": { "...": "..." }
 }
 },

benchmarks/__init__.py

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+# This Source Code Form is subject to the terms of the Mozilla Public
+# License, v. 2.0. If a copy of the MPL was not distributed with this
+# file, You can obtain one at https://mozilla.org/MPL/2.0/.
+# SPDX-License-Identifier: MPL-2.0
+# Copyright (c) 2026 Den Rozhnovskiy

benchmarks/run_benchmark.py

Lines changed: 40 additions & 0 deletions
@@ -176,6 +176,45 @@ def _run_cli_once(
     )


+def _validate_inventory_sample(
+    *,
+    scenario: Scenario,
+    measurement: RunMeasurement,
+) -> None:
+    if measurement.files_found <= 0:
+        raise RuntimeError(
+            f"scenario {scenario.name} produced an empty inventory sample; "
+            "benchmark target is invalid"
+        )
+    if measurement.files_skipped > 0:
+        raise RuntimeError(
+            f"scenario {scenario.name} skipped {measurement.files_skipped} files; "
+            "benchmark run is invalid"
+        )
+    if scenario.mode == "cold":
+        if measurement.files_cached != 0:
+            raise RuntimeError(
+                f"cold scenario {scenario.name} unexpectedly used cache: "
+                f"cached={measurement.files_cached}"
+            )
+        if measurement.files_analyzed <= 0:
+            raise RuntimeError(
+                f"cold scenario {scenario.name} analyzed no files: "
+                f"found={measurement.files_found} analyzed={measurement.files_analyzed}"
+            )
+        return
+    if measurement.files_cached <= 0:
+        raise RuntimeError(
+            f"warm scenario {scenario.name} did not use cache: "
+            f"cached={measurement.files_cached}"
+        )
+    if measurement.files_analyzed != 0:
+        raise RuntimeError(
+            f"warm scenario {scenario.name} analyzed files unexpectedly: "
+            f"analyzed={measurement.files_analyzed}"
+        )
+
+
 def _scenario_result(
     *,
     scenario: Scenario,
@@ -230,6 +269,7 @@ def _scenario_result(
             report_path=scenario_dir / f"run-report-{idx}.json",
             extra_args=scenario.extra_args,
         )
+        _validate_inventory_sample(scenario=scenario, measurement=measurement)
         measurements.append(measurement)

     digests = sorted({m.digest for m in measurements})
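
Review note: the invariants enforced by `_validate_inventory_sample` can be exercised in isolation with a small stand-in. The sketch below is not the benchmark module itself: `Scenario` and `RunMeasurement` are reduced, hypothetical versions carrying only the fields the check reads, and `inventory_problem` returns a message instead of raising `RuntimeError` so the cases are easy to compare.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Scenario:  # hypothetical stand-in; the real class has more fields
    name: str
    mode: str  # "cold" or "warm"


@dataclass(frozen=True)
class RunMeasurement:  # hypothetical stand-in with only the counters used here
    files_found: int
    files_analyzed: int
    files_cached: int
    files_skipped: int


def inventory_problem(scenario: Scenario, m: RunMeasurement) -> Optional[str]:
    """Return the first violated invariant, or None when the sample is valid."""
    if m.files_found <= 0:
        return "empty inventory sample"
    if m.files_skipped > 0:
        return "files were skipped"
    if scenario.mode == "cold":
        # Cold runs must analyze everything and read nothing from cache.
        if m.files_cached != 0:
            return "cold run used cache"
        if m.files_analyzed <= 0:
            return "cold run analyzed no files"
        return None
    # Any other mode is treated as warm: cache hits only, no fresh analysis.
    if m.files_cached <= 0:
        return "warm run did not use cache"
    if m.files_analyzed != 0:
        return "warm run analyzed files"
    return None


cold = Scenario(name="cold_full", mode="cold")
warm = Scenario(name="warm_full", mode="warm")
print(inventory_problem(cold, RunMeasurement(10, 10, 0, 0)))
print(inventory_problem(warm, RunMeasurement(10, 3, 7, 0)))
```

A partially cached warm run (the second call) is rejected, which is the case that would otherwise blur the warm/cold timing comparison.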
