Commit 2c8b3ad

feat(mcp): deliver b3 diff-aware agent workflows, SARIF hardening, and changed-scope CLI UX
- add optional read-only MCP server with diff-aware analysis, compare-runs, remediation payloads, reviewed state, and stable resources
- complete b3 MCP and SARIF review spec work without changing baseline semantics
- add changed-scope CLI flags and render changed-scope as a first-class summary block
- add HTML IDE deep links and stable finding anchors for agent/IDE navigation
- sync docs, AGENTS, and changelog with the new MCP/CLI/report surface
1 parent 7ac6183 commit 2c8b3ad

39 files changed

Lines changed: 5210 additions & 278 deletions

AGENTS.md

Lines changed: 17 additions & 1 deletion
@@ -61,6 +61,8 @@ Key artifacts:
 - `.cache/codeclone/cache.json` — analysis cache (integrity-checked)
 - `.cache/codeclone/report.html|report.json|report.md|report.sarif|report.txt` — reports
 - `codeclone-mcp` — optional read-only MCP server (install via `codeclone[mcp]`)
+- MCP runs are in-memory only; review markers are session-local and must never
+  leak into baseline/cache/report artifacts
 - `docs/`, `mkdocs.yml`, `.github/workflows/docs.yml` — published documentation site and docs build pipeline

 ---
@@ -170,6 +172,7 @@ Reports come in:

 MCP is a separate optional interface, not a report format. It must remain a
 read-only agent layer over the same canonical report/baseline/cache contracts.
+Session review markers are allowed only as ephemeral MCP process state.

 ### Report invariants

@@ -179,6 +182,10 @@ read-only agent layer over the same canonical report/baseline/cache contracts.
 - baseline fingerprint + schema versions
 - baseline generator version
 - cache path / cache used
+- SARIF `partialFingerprints.primaryLocationLineHash` must remain stable across
+  line-only shifts for the same finding identity.
+- SARIF `automationDetails.id` must be unique per run; result `kind` should be
+  explicit when emitted.

 ### Explainability contract (core owns facts)

@@ -256,6 +263,13 @@ Agents must preserve these semantics:
 - **3** — analysis gating failure (e.g., `--fail-threshold` exceeded or new clones in `--ci` as designed)
 - **5** — internal error (unexpected exception escaped top-level CLI handling)

+Changed-scope flags are contract-sensitive:
+
+- `--changed-only` keeps the canonical analysis/report full, but applies clone
+  summary/threshold evaluation to the changed-files projection.
+- `--diff-against` requires `--changed-only`.
+- `--paths-from-git-diff` implies `--changed-only`.
+
 If you introduce a new exit reason, document it and add tests.

 ---
@@ -349,7 +363,8 @@ Use this map to route changes to the right owner module.
 - `codeclone/report/*.py` (other modules) — deterministic projections/format transforms (
   text/markdown/sarif/derived/findings/suggestions); avoid injecting new analysis heuristics here.
 - `codeclone/mcp_service.py` — typed, in-process MCP service adapter over the current pipeline/report contracts; keep
-  it read-only and deterministic; do not move shell UX or `sys.exit` behavior here.
+  it deterministic; allow only session-local in-memory state such as reviewed markers, and never move shell UX or
+  `sys.exit` behavior here.
 - `codeclone/mcp_server.py` — optional MCP launcher/server wiring, transport config, and MCP tool/resource
   registration; keep dependency loading lazy so base installs/CI do not require MCP runtime packages.
 - `codeclone/html_report.py` — public HTML facade/re-export surface; preserve backward-compatible imports here; do not
@@ -453,6 +468,7 @@ Policy:
 - Canonical report JSON schema/payload semantics (`REPORT_SCHEMA_VERSION` contract family).
 - Documented report projections and their machine/user-facing semantics (HTML/Markdown/SARIF/Text).
 - Documented MCP launcher/install behavior, tool names, resource URIs, and read-only semantics.
+- Session-local MCP review state semantics (`mark_finding_reviewed`, `exclude_reviewed`) as documented public behavior.
 - Documented finding families/kinds/ids and suppression-facing report fields.
 - Metrics baseline schema/compatibility where used by CI/gating.
 - Benchmark schema/outputs if consumed as a reproducible contract surface.
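
The AGENTS.md changes above restrict review markers to ephemeral MCP process state. A minimal sketch of that idea follows. It assumes a hypothetical `ReviewSession` class and dict-shaped findings with an `id` key; neither is taken from `codeclone/mcp_service.py`, and only the names `mark_finding_reviewed` and `exclude_reviewed` come from the commit.

```python
from dataclasses import dataclass, field


@dataclass
class ReviewSession:
    """Hypothetical in-memory holder for reviewed markers (illustration only)."""

    # Lives only in this process; never written to baseline/cache/report files.
    _reviewed: set[str] = field(default_factory=set)

    def mark_finding_reviewed(self, finding_id: str) -> None:
        self._reviewed.add(finding_id)

    def list_findings(
        self, findings: list[dict[str, str]], *, exclude_reviewed: bool = False
    ) -> list[dict[str, str]]:
        # Always return a new list so the canonical findings payload is not mutated.
        if not exclude_reviewed:
            return list(findings)
        return [f for f in findings if f["id"] not in self._reviewed]


# The finding ids below are made up for the example.
findings = [{"id": "clone:a"}, {"id": "clone:b"}]
session = ReviewSession()
session.mark_finding_reviewed("clone:a")
remaining = session.list_findings(findings, exclude_reviewed=True)
```

Because the markers live in a plain set on the session object, dropping the session (or restarting the process) discards them, which is the property the contract asks for.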

CHANGELOG.md

Lines changed: 23 additions & 1 deletion
@@ -6,7 +6,29 @@

 - Add optional `codeclone[mcp]` extra and `codeclone-mcp` launcher.
 - Add a deterministic, read-only MCP server over the canonical pipeline and report contracts.
-- Expose MCP tools/resources for repository analysis, run summaries, report sections, findings, hotlists, and gate previews.
+- Expose diff-aware MCP tools/resources for changed-files analysis, run comparison, report sections, findings,
+  remediation payloads, hotlists, granular checks, and gate previews.
+- Add stable MCP resources for latest-run summary/report/health/gates/changed projections and schema discovery.
+- Add session-local reviewed-finding state for long AI-agent workflows without mutating baseline or repo state.
+- Add stable HTML deep-link anchors (`finding-{finding_id}`) for clone and structural finding cards.
+
+### CLI
+
+- Add `--changed-only`, `--diff-against`, and `--paths-from-git-diff` for changed-scope clone review and gating over a
+  full canonical analysis.
+- Render changed-scope results as a first-class summary block in normal CLI output while keeping quiet mode compact.
+
+### SARIF
+
+- Stabilize `primaryLocationLineHash` across line-only shifts by hashing finding identity without line numbers.
+- Emit run-unique `automationDetails.id`, optional `startTimeUtc`, and explicit result `kind: "fail"`.
+- Move ancillary finding identity fields to SARIF `properties` and keep `partialFingerprints` minimal.
+
+### HTML
+
+- Add IDE picker with persistent selection (localStorage) supporting PyCharm, IntelliJ IDEA, VS Code, Cursor, Fleet, and
+  Zed.
+- Make file paths across Clones, Quality, Suggestions, Dead Code, and Findings tabs clickable IDE deep links.

 ## [2.0.0b2]
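
The SARIF entry above says `primaryLocationLineHash` is stabilized by hashing finding identity without line numbers. A minimal sketch of that technique follows. The identity fields (`rule_id`, `path`, `qualname`), the SHA-256 digest, and the 16-character truncation are assumptions for illustration; the fields and digest CodeClone actually uses are not shown in this diff.

```python
import hashlib


def line_stable_hash(rule_id: str, path: str, qualname: str) -> str:
    """Hash a finding identity that deliberately excludes line numbers."""
    # The NUL separator avoids ambiguous concatenations ("a" + "bc" vs "ab" + "c").
    identity = "\0".join((rule_id, path, qualname))
    return hashlib.sha256(identity.encode("utf-8")).hexdigest()[:16]


def partial_fingerprints(finding: dict[str, object]) -> dict[str, str]:
    # start_line is intentionally ignored, so a line-only shift keeps the hash.
    return {
        "primaryLocationLineHash": line_stable_hash(
            str(finding["rule_id"]), str(finding["path"]), str(finding["qualname"])
        )
    }


# Hypothetical finding records; "CC001" is a made-up rule id.
before = {"rule_id": "CC001", "path": "pkg/mod.py", "qualname": "pkg.mod:f", "start_line": 10}
after = {**before, "start_line": 42}
```

Here `before` and `after` differ only in `start_line`, so both produce the same fingerprint, while changing any identity field produces a different one.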

README.md

Lines changed: 45 additions & 14 deletions
@@ -26,6 +26,13 @@ Docs: [orenlab.github.io/codeclone](https://orenlab.github.io/codeclone/) ·
 Live sample report:
 [orenlab.github.io/codeclone/examples/report/](https://orenlab.github.io/codeclone/examples/report/)

+> [!NOTE]
+> This README and docs site track the in-development `v2.0.x` line from `main`.
+> For the latest stable CodeClone documentation (`v1.4.4`), see the
+> [`v1.4.4` README](https://github.com/orenlab/codeclone/blob/v1.4.4/README.md)
+> and the
+> [`v1.4.4` docs tree](https://github.com/orenlab/codeclone/tree/v1.4.4/docs).
+
 ## Features

 - **Clone detection** — function (CFG fingerprint), block (statement windows), and segment (report-only) clones
@@ -48,6 +55,8 @@ codeclone . --html # generate HTML report
 codeclone . --html --open-html-report # generate and open HTML report
 codeclone . --json --md --sarif --text # generate machine-readable reports
 codeclone . --html --json --timestamped-report-paths # keep timestamped report snapshots
+codeclone . --changed-only --diff-against main # changed-scope clone gating against git diff
+codeclone . --paths-from-git-diff HEAD~1 # shorthand diff source for changed-scope review
 codeclone . --ci # CI mode (--fail-on-new --no-color --quiet)
 ```

@@ -80,8 +89,29 @@ For local command-based clients, prefer `stdio`. Use `streamable-http` only
 when the client expects a remote MCP endpoint.

 CodeClone MCP is read-only and baseline-aware. It exposes deterministic tools
-for analysis, summaries, findings, hotspots, report sections, and gate previews
-without mutating source files or baselines.
+for:
+
+- full repository analysis and changed-files analysis
+- run summaries and run-to-run comparison
+- findings, hotspots, remediation payloads, and PR summaries
+- granular clone / complexity / coupling / cohesion / dead-code checks
+- session-local review markers for long agent workflows
+
+It never mutates source files, baselines, or repo state.
+Diff-aware MCP calls use repo-relative `changed_paths` lists (or `git_diff_ref`)
+and may reuse the same `run_id` when the canonical report digest stays
+unchanged.
+Focused `check_*` MCP tools may trigger a full analysis first when no stored run
+exists yet.
+
+Latest-run resources are also available for MCP-capable clients:
+
+- `codeclone://latest/summary`
+- `codeclone://latest/report.json`
+- `codeclone://latest/health`
+- `codeclone://latest/gates`
+- `codeclone://latest/changed`
+- `codeclone://schema`

 Docs:
 [MCP interface contract](https://orenlab.github.io/codeclone/book/20-mcp-interface/)
@@ -240,6 +270,7 @@ Dynamic/runtime false positives are resolved via explicit inline suppressions, n
       "...": "..."
     },
     "runtime": {
+      "analysis_started_at_utc": "...",
       "report_generated_at_utc": "..."
     }
   },
@@ -329,20 +360,20 @@ CFG semantics: [CFG semantics](https://orenlab.github.io/codeclone/cfg/)

 ## Documentation

-| Topic | Link |
-|----------------------------|----------------------------------------------------------------------------------------------------|
-| Contract book (start here) | [Contracts and guarantees](https://orenlab.github.io/codeclone/book/00-intro/) |
-| Exit codes | [Exit codes and failure policy](https://orenlab.github.io/codeclone/book/03-contracts-exit-codes/) |
-| Configuration | [Config and defaults](https://orenlab.github.io/codeclone/book/04-config-and-defaults/) |
-| Baseline contract | [Baseline contract](https://orenlab.github.io/codeclone/book/06-baseline/) |
-| Cache contract | [Cache contract](https://orenlab.github.io/codeclone/book/07-cache/) |
-| Report contract | [Report contract](https://orenlab.github.io/codeclone/book/08-report/) |
+| Topic | Link |
+|----------------------------|-----------------------------------------------------------------------------------------------------|
+| Contract book (start here) | [Contracts and guarantees](https://orenlab.github.io/codeclone/book/00-intro/) |
+| Exit codes | [Exit codes and failure policy](https://orenlab.github.io/codeclone/book/03-contracts-exit-codes/) |
+| Configuration | [Config and defaults](https://orenlab.github.io/codeclone/book/04-config-and-defaults/) |
+| Baseline contract | [Baseline contract](https://orenlab.github.io/codeclone/book/06-baseline/) |
+| Cache contract | [Cache contract](https://orenlab.github.io/codeclone/book/07-cache/) |
+| Report contract | [Report contract](https://orenlab.github.io/codeclone/book/08-report/) |
 | Metrics & quality gates | [Metrics and quality gates](https://orenlab.github.io/codeclone/book/15-metrics-and-quality-gates/) |
-| Dead code | [Dead-code contract](https://orenlab.github.io/codeclone/book/16-dead-code-contract/) |
-| Docker benchmark contract | [Benchmarking contract](https://orenlab.github.io/codeclone/book/18-benchmarking/) |
-| Determinism | [Determinism policy](https://orenlab.github.io/codeclone/book/12-determinism/) |
+| Dead code | [Dead-code contract](https://orenlab.github.io/codeclone/book/16-dead-code-contract/) |
+| Docker benchmark contract | [Benchmarking contract](https://orenlab.github.io/codeclone/book/18-benchmarking/) |
+| Determinism | [Determinism policy](https://orenlab.github.io/codeclone/book/12-determinism/) |

-## * Benchmarking
+## * Benchmarking

 <details>
 <summary>Reproducible Docker Benchmark</summary>

codeclone/_cli_args.py

Lines changed: 17 additions & 0 deletions
@@ -130,6 +130,23 @@ def build_parser(version: str) -> _ArgumentParser:
         default=DEFAULT_PROCESSES,
         help=ui.HELP_PROCESSES,
     )
+    _add_bool_optional_argument(
+        analysis_group,
+        flag="--changed-only",
+        help_text=ui.HELP_CHANGED_ONLY,
+    )
+    analysis_group.add_argument(
+        "--diff-against",
+        default=None,
+        metavar="GIT_REF",
+        help=ui.HELP_DIFF_AGAINST,
+    )
+    analysis_group.add_argument(
+        "--paths-from-git-diff",
+        default=None,
+        metavar="GIT_REF",
+        help=ui.HELP_PATHS_FROM_GIT_DIFF,
+    )
     _add_optional_path_argument(
         analysis_group,
         flag="--cache-path",
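
The parser hunk above only registers the three flags; the contract in AGENTS.md (`--diff-against` requires `--changed-only`, `--paths-from-git-diff` implies it) has to be enforced after parsing. A minimal sketch of such a check with plain `argparse` follows. `normalize_changed_scope` is a hypothetical helper, `argparse.BooleanOptionalAction` stands in for the project's `_add_bool_optional_argument` (whose behavior is not shown here), and raising `ValueError` is a placeholder for whatever error path the CLI really uses.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="codeclone-sketch")
    # Stand-in for the project's _add_bool_optional_argument helper (assumption).
    parser.add_argument(
        "--changed-only", action=argparse.BooleanOptionalAction, default=False
    )
    parser.add_argument("--diff-against", default=None, metavar="GIT_REF")
    parser.add_argument("--paths-from-git-diff", default=None, metavar="GIT_REF")
    return parser


def normalize_changed_scope(args: argparse.Namespace) -> argparse.Namespace:
    """Hypothetical post-parse check for the documented flag contract."""
    if args.paths_from_git_diff is not None:
        # --paths-from-git-diff implies --changed-only.
        args.changed_only = True
    if args.diff_against is not None and not args.changed_only:
        # --diff-against requires --changed-only.
        raise ValueError("--diff-against requires --changed-only")
    return args


implied = normalize_changed_scope(
    build_parser().parse_args(["--paths-from-git-diff", "HEAD~1"])
)
explicit = normalize_changed_scope(
    build_parser().parse_args(["--changed-only", "--diff-against", "main"])
)
```

Passing `--diff-against main` on its own raises in this sketch, which mirrors the "requires" wording; how the two diff-source flags interact when both are given is not specified in the diff and is left out.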

codeclone/_cli_meta.py

Lines changed: 3 additions & 0 deletions
@@ -67,6 +67,7 @@ class ReportMeta(TypedDict):
     health_grade: str | None
     analysis_mode: str
     metrics_computed: list[str]
+    analysis_started_at_utc: str | None
     report_generated_at_utc: str


@@ -91,6 +92,7 @@ def _build_report_meta(
     health_grade: str | None,
     analysis_mode: str,
     metrics_computed: tuple[str, ...],
+    analysis_started_at_utc: str | None,
     report_generated_at_utc: str,
 ) -> ReportMeta:
     project_name = scan_root.name or str(scan_root)
@@ -133,5 +135,6 @@ def _build_report_meta(
         "health_grade": health_grade,
         "analysis_mode": analysis_mode,
         "metrics_computed": list(metrics_computed),
+        "analysis_started_at_utc": analysis_started_at_utc,
         "report_generated_at_utc": report_generated_at_utc,
     }

codeclone/_cli_summary.py

Lines changed: 39 additions & 0 deletions
@@ -25,6 +25,14 @@ class MetricsSnapshot:
     suppressed_dead_code_count: int = 0


+@dataclass(frozen=True, slots=True)
+class ChangedScopeSnapshot:
+    paths_count: int
+    findings_total: int
+    findings_new: int
+    findings_known: int
+
+
 class _Printer(Protocol):
     def print(self, *objects: object, **kwargs: object) -> None: ...

@@ -149,3 +157,34 @@ def _print_metrics(
             suppressed=metrics.suppressed_dead_code_count,
         )
     )
+
+
+def _print_changed_scope(
+    *,
+    console: _Printer,
+    quiet: bool,
+    changed_scope: ChangedScopeSnapshot,
+) -> None:
+    if quiet:
+        console.print(
+            ui.fmt_changed_scope_compact(
+                paths=changed_scope.paths_count,
+                findings=changed_scope.findings_total,
+                new=changed_scope.findings_new,
+                known=changed_scope.findings_known,
+            )
+        )
+        return
+
+    from rich.rule import Rule
+
+    console.print()
+    console.print(Rule(title=ui.CHANGED_SCOPE_TITLE, style="dim", characters="\u2500"))
+    console.print(ui.fmt_changed_scope_paths(count=changed_scope.paths_count))
+    console.print(
+        ui.fmt_changed_scope_findings(
+            total=changed_scope.findings_total,
+            new=changed_scope.findings_new,
+            known=changed_scope.findings_known,
+        )
+    )

codeclone/_html_css.py

Lines changed: 37 additions & 1 deletion
@@ -1075,6 +1075,11 @@
 .theme-toggle{font-size:0;gap:0;width:32px;height:32px;
 padding:0;align-items:center;justify-content:center}
 .theme-toggle svg{width:16px;height:16px}
+.ide-picker-btn{font-size:0;gap:0;width:32px;height:32px;
+padding:0;align-items:center;justify-content:center}
+.ide-picker-btn svg{width:16px;height:16px}
+.ide-picker-label{display:none}
+.ide-menu{right:0;min-width:140px}
 .main-tabs-wrap{position:sticky;top:0;z-index:90;padding:var(--sp-2) 0 0}
 .main-tabs{padding:var(--sp-1);gap:2px;
 background:
@@ -1091,10 +1096,41 @@
 .brand-logo{width:28px;height:28px}
 }

+/* IDE link */
+.ide-link{color:inherit;text-decoration:none;cursor:default}
+[data-ide]:not([data-ide=""]) .ide-link{cursor:pointer;color:var(--accent-primary);
+text-decoration-line:underline;text-decoration-style:dotted;text-underline-offset:2px}
+[data-ide]:not([data-ide=""]) .ide-link:hover{text-decoration-style:solid}
+
+/* IDE picker dropdown */
+.ide-picker{position:relative;display:inline-flex}
+.ide-picker-btn{display:inline-flex;align-items:center;gap:var(--sp-1);
+padding:var(--sp-1) var(--sp-3);background:none;border:1px solid var(--border);
+border-radius:var(--radius-md);cursor:pointer;color:var(--text-muted);font-size:.85rem;
+font-weight:500;font-family:inherit;transition:all var(--dur-fast) var(--ease);
+white-space:nowrap}
+.ide-picker-btn:hover{color:var(--text-primary);background:var(--bg-raised);border-color:var(--border-strong)}
+.ide-picker-btn svg{width:16px;height:16px;flex-shrink:0}
+.ide-picker-btn[aria-expanded="true"]{color:var(--accent-primary);border-color:var(--accent-primary)}
+.ide-menu{display:none;position:absolute;top:100%;right:0;margin-top:var(--sp-1);
+min-width:160px;background:var(--bg-surface);border:1px solid var(--border);
+border-radius:var(--radius);box-shadow:0 4px 12px rgba(0,0,0,.15);
+z-index:100;padding:var(--sp-1) 0;list-style:none}
+.ide-menu[data-open]{display:block}
+.ide-menu li{padding:0}
+.ide-menu button{display:flex;align-items:center;gap:var(--sp-2);width:100%;
+padding:var(--sp-1) var(--sp-3);background:none;border:none;color:var(--text-primary);
+font-size:.8rem;font-family:var(--font-sans);cursor:pointer;text-align:left}
+.ide-menu button:hover{background:var(--bg-alt)}
+.ide-menu button[aria-checked="true"]{color:var(--accent-primary);font-weight:600}
+.ide-menu button[aria-checked="true"]::before{content:'\\2713';font-size:.7rem;
+width:14px;text-align:center;flex-shrink:0}
+.ide-menu button[aria-checked="false"]::before{content:'';width:14px;flex-shrink:0}
+
 /* Print */
 @media print{
 .topbar,.toolbar,.pagination,.theme-toggle,.toast-container,
-.novelty-tabs,.clear-btn,.btn{display:none!important}
+.novelty-tabs,.clear-btn,.btn,.ide-picker{display:none!important}
 .tab-panel{display:block!important;break-inside:avoid}
 .group-body{display:block!important}
 body{background:#fff;color:#000}
