Commit e30100e

MCP server, canonical report tightening, and MPL-2.0 relicensing (#18)
This PR prepares `codeclone` `2.0.0b3`. The beta line now lands as a coherent platform release:

- optional read-only MCP server for agent workflows
- canonical-report-first tightening across MCP and report projections
- changed-scope review and CI-oriented delivery surfaces
- MPL-2.0 relicensing for code, with docs kept under MIT

## Highlights

- add optional `codeclone[mcp]` with `codeclone-mcp` launcher (`stdio` and `streamable-http`)
- ship `20` MCP tools, `7` fixed resources, and `3` run-scoped URI templates
- make MCP budget-aware and triage-first: compact first-pass workflow, bounded drill-down, and explicit guidance for low-cost agent usage
- slim MCP payloads without changing canonical report truth
- harden MCP safety semantics: absolute `root` required for analysis tools, honest `compare_runs`, session-local review state only
- move threshold-aware design findings fully into the canonical report and record effective thresholds in `meta.analysis_thresholds.design_findings`
- add canonical `derived.overview.directory_hotspots` and render `Hotspots by Directory` in Overview
- bump report schema to `2.2` and cache schema to `2.3`
- fix stale cache reuse after semantic analysis changes
- fix AST normalization side effects that corrupted downstream cohesion metrics
- refresh baseline/health after analysis fixes (`85 (B)` on the repo)
- polish HTML report navigation, badges, overview rhythm, and hotspot presentation
- relicense repository code to `MPL-2.0` while keeping documentation under `MIT`

## Contracts and compatibility

- baseline schema remains `2.0`
- fingerprint version remains `1`
- report schema is now `2.2`
- cache schema is now `2.3`
- MCP remains read-only with respect to repo state and persisted artifacts
1 parent b5059ab commit e30100e

188 files changed, +17632 -1194 lines changed

Lines changed: 172 additions & 5 deletions
@@ -1,11 +1,178 @@
 # CodeClone GitHub Action

-Runs CodeClone to detect architectural code duplication in Python projects.
+Baseline-aware structural code quality analysis for Python with:

-## Usage
+- configurable CI gating
+- SARIF upload for GitHub Code Scanning
+- PR summary comments
+- deterministic JSON report generation
+
+This action is designed for PR and CI workflows where you want CodeClone to act
+as a non-LLM review bot: run analysis, upload SARIF, post a concise summary,
+and propagate the real gate result.
+
+## What it does
+
+The v2 action flow is:
+
+1. set up Python
+2. install `codeclone`
+3. optionally require a committed baseline
+4. run CodeClone with JSON + optional SARIF output
+5. optionally upload SARIF to GitHub Code Scanning
+6. optionally post or update a PR summary comment
+7. return the real CodeClone exit code as the job result
+
+When the action is used from the checked-out CodeClone repository itself
+(`uses: ./.github/actions/codeclone`), it installs CodeClone from the repo
+source under test. Remote consumers still install from PyPI.
+
+## Basic usage

 ```yaml
-- uses: orenlab/codeclone/.github/actions/codeclone@v1
+- uses: orenlab/codeclone/.github/actions/codeclone@main
   with:
-    path: .
-    fail-on-new: true
+    fail-on-new: "true"
+```
+
+For released references, prefer pinning to a major version tag such as `@v2`
+or to an immutable commit SHA.
+
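As a sketch of the pinning advice above, the fragment below shows the same step pinned to a major tag, with a SHA-pinned variant left as a comment. It assumes a `v2` tag has been published (the text only names it as an example), and `<full-commit-sha>` is a placeholder rather than a real revision.

```yaml
steps:
  # Option 1: major version tag (assumes a `v2` tag exists)
  - uses: orenlab/codeclone/.github/actions/codeclone@v2
    with:
      fail-on-new: "true"

  # Option 2: immutable commit SHA; replace the placeholder with a real
  # 40-character SHA before uncommenting
  # - uses: orenlab/codeclone/.github/actions/codeclone@<full-commit-sha>
  #   with:
  #     fail-on-new: "true"
```

A tag can be moved by the maintainers while a commit SHA cannot, which is why the SHA form is the stricter of the two.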
+## PR workflow example
+
+```yaml
+name: CodeClone
+
+on:
+  pull_request:
+    types: [ opened, synchronize, reopened ]
+    paths: [ "**/*.py" ]
+
+permissions:
+  contents: read
+  security-events: write
+  pull-requests: write
+
+jobs:
+  codeclone:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+        with:
+          fetch-depth: 0
+
+      - uses: orenlab/codeclone/.github/actions/codeclone@main
+        with:
+          fail-on-new: "true"
+          fail-health: "60"
+          sarif: "true"
+          pr-comment: "true"
+```
+
+## Inputs
+
+| Input                   | Default                         | Purpose                                                                                                           |
+|-------------------------|---------------------------------|-------------------------------------------------------------------------------------------------------------------|
+| `python-version`        | `3.13`                          | Python version used to run the action                                                                             |
+| `package-version`       | `""`                            | CodeClone version from PyPI for remote installs; ignored when the action runs from the checked-out CodeClone repo |
+| `path`                  | `.`                             | Project root to analyze                                                                                           |
+| `json-path`             | `.cache/codeclone/report.json`  | JSON report output path                                                                                           |
+| `sarif`                 | `true`                          | Generate SARIF and try to upload it                                                                               |
+| `sarif-path`            | `.cache/codeclone/report.sarif` | SARIF output path                                                                                                 |
+| `pr-comment`            | `true`                          | Post or update a PR summary comment                                                                               |
+| `fail-on-new`           | `true`                          | Fail if new clone groups are detected                                                                             |
+| `fail-on-new-metrics`   | `false`                         | Fail if metrics regress vs baseline                                                                               |
+| `fail-threshold`        | `-1`                            | Max allowed function+block clone groups                                                                           |
+| `fail-complexity`       | `-1`                            | Max cyclomatic complexity                                                                                         |
+| `fail-coupling`         | `-1`                            | Max coupling CBO                                                                                                  |
+| `fail-cohesion`         | `-1`                            | Max cohesion LCOM4                                                                                                |
+| `fail-cycles`           | `false`                         | Fail on dependency cycles                                                                                         |
+| `fail-dead-code`        | `false`                         | Fail on high-confidence dead code                                                                                 |
+| `fail-health`           | `-1`                            | Minimum health score                                                                                              |
+| `require-baseline`      | `true`                          | Fail early if the baseline file is missing                                                                        |
+| `baseline-path`         | `codeclone.baseline.json`       | Baseline path passed to CodeClone                                                                                 |
+| `metrics-baseline-path` | `codeclone.baseline.json`       | Metrics baseline path passed to CodeClone                                                                         |
+| `extra-args`            | `""`                            | Additional CodeClone CLI arguments                                                                                |
+| `no-progress`           | `true`                          | Disable progress output                                                                                           |
+
+For numeric gate inputs, `-1` means "disabled".
+
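A minimal sketch of how the gate inputs from the table might be combined. The threshold values (`20`, `10`, `70`) are illustrative assumptions, not recommendations from the project. Numeric gates that are left out keep their `-1` default and stay disabled.

```yaml
- uses: orenlab/codeclone/.github/actions/codeclone@main
  with:
    fail-on-new: "true"
    # Numeric gates: -1 (the default) means disabled
    fail-complexity: "20"   # illustrative: max cyclomatic complexity
    fail-coupling: "10"     # illustrative: max coupling (CBO)
    fail-health: "70"       # illustrative: minimum health score
    # fail-threshold and fail-cohesion are omitted, so they stay at -1
    # Boolean gates default to false and are switched on explicitly
    fail-cycles: "true"
    fail-dead-code: "true"
```

Values are quoted as strings, matching the style of the usage examples in this README.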
+## Outputs
+
+| Output          | Meaning                                                    |
+|-----------------|------------------------------------------------------------|
+| `exit-code`     | CodeClone process exit code                                |
+| `json-path`     | Resolved JSON report path                                  |
+| `sarif-path`    | Resolved SARIF report path                                 |
+| `pr-comment-id` | PR comment id when the action updated or created a comment |
+
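Later steps can read these outputs once the action step has an `id`. The sketch below shows one possible use and is not prescribed by this README: the step id `codeclone` and the artifact name are hypothetical, and `if: always()` is there because a failing gate would otherwise skip the following step.

```yaml
steps:
  - uses: actions/checkout@v4

  - id: codeclone   # hypothetical step id
    uses: orenlab/codeclone/.github/actions/codeclone@main
    with:
      fail-on-new: "true"

  # Keep the JSON report as a workflow artifact even when the gate fails
  - if: always()
    uses: actions/upload-artifact@v4
    with:
      name: codeclone-report   # hypothetical artifact name
      path: ${{ steps.codeclone.outputs.json-path }}
      # The default report path sits under `.cache/`; recent v4 releases of
      # upload-artifact skip hidden paths unless this is enabled
      include-hidden-files: true
```

If the action stops before writing a report (for example, when the baseline is missing), the output may be empty and the upload step will only emit a warning.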
+## Exit behavior
+
+The action propagates the real CodeClone exit code at the end:
+
+- `0` — success
+- `2` — contract error
+- `3` — gating failure
+- `5` — internal error
+
+SARIF upload and PR comment posting are treated as additive integrations. The
+final job result is still driven by the CodeClone analysis exit code.
+
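One way to surface the exit codes listed above in the job log, sketched under assumptions: the step id `codeclone` is hypothetical, and the follow-up step only reads the documented `exit-code` output. It does not change the job result, which is still decided by the action step itself.

```yaml
steps:
  - id: codeclone   # hypothetical step id
    uses: orenlab/codeclone/.github/actions/codeclone@main

  - name: Explain CodeClone result
    if: always()    # run even when the action step failed
    shell: bash
    env:
      CODECLONE_EXIT: ${{ steps.codeclone.outputs.exit-code }}
    run: |
      case "$CODECLONE_EXIT" in
        0) echo "CodeClone: success" ;;
        2) echo "CodeClone: contract error" ;;
        3) echo "CodeClone: gating failure" ;;
        5) echo "CodeClone: internal error" ;;
        *) echo "CodeClone: unexpected or missing exit code: '$CODECLONE_EXIT'" ;;
      esac
```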
+## Permissions
+
+Recommended permissions:
+
+```yaml
+permissions:
+  contents: read
+  security-events: write
+  pull-requests: write
+```
+
+Notes:
+
+- `security-events: write` is required for SARIF upload
+- `pull-requests: write` is required for PR comments
+- if you only want gating and JSON output, you can disable `sarif` and
+  `pr-comment`
+
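Following the last note, a gating-only job might look like the sketch below. It assumes `contents: read` is sufficient once both integrations are disabled, which the notes imply but do not state outright.

```yaml
permissions:
  contents: read

jobs:
  codeclone:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: orenlab/codeclone/.github/actions/codeclone@main
        with:
          sarif: "false"        # no SARIF upload, so no security-events scope
          pr-comment: "false"   # no PR comment, so no pull-requests scope
          fail-on-new: "true"
```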
+## Stable vs prerelease installs
+
+Stable:
+
+```yaml
+with:
+  package-version: ""
+```
+
+Explicit prerelease:
+
+```yaml
+with:
+  package-version: "2.0.0b3"
+```
+
+Local/self-repo validation:
+
+- `uses: ./.github/actions/codeclone` installs CodeClone from the checked-out
+  repository source, so beta branches and unreleased commits do not depend on
+  PyPI publication.
+
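For the local/self-repo case, a workflow inside the CodeClone repository could be sketched as follows. The job name is hypothetical. The checkout step comes first because a local action path (`./...`) only resolves against files already present in the workspace.

```yaml
jobs:
  self-check:   # hypothetical job name
    runs-on: ubuntu-latest
    steps:
      # The local action path below is read from the checked-out workspace
      - uses: actions/checkout@v4

      - uses: ./.github/actions/codeclone
        with:
          fail-on-new: "true"
          # package-version is ignored in this mode, per the Inputs table
```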
+## Notes and limitations
+
+- For private repositories without GitHub Advanced Security, SARIF upload may
+  not be available. In that case, set `sarif: "false"` and rely on the PR
+  comment + exit code.
+- The baseline file must exist in the repository when `require-baseline: true`.
+- The action always generates a canonical JSON report, even if SARIF is
+  disabled.
+- PR comments are updated in place using a hidden marker, so repeated runs do
+  not keep adding duplicate comments.
+- Analysis has a 10-minute timeout. For very large repositories, consider
+  using `extra-args: "--skip-metrics"` or narrowing the scan scope.
+
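Two of the notes above translate directly into inputs. The sketch below combines them for a large private repository. The `path` value is a hypothetical subdirectory, and how a narrowed path interacts with an existing baseline is not covered here, so treat that as something to verify.

```yaml
- uses: orenlab/codeclone/.github/actions/codeclone@main
  with:
    path: src/mypackage            # hypothetical subdirectory to narrow the scan
    extra-args: "--skip-metrics"   # flag named in the notes above
    sarif: "false"                 # e.g. private repo without GitHub Advanced Security
    pr-comment: "true"             # rely on the PR comment and exit code instead
```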
+## See also
+
+- [CodeClone repository](https://github.com/orenlab/codeclone)
+- [Documentation](https://orenlab.github.io/codeclone/)
+- [SARIF integration](https://orenlab.github.io/codeclone/sarif/)
