oceanusXXD
diff --git a/CHANGELOG.md b/CHANGELOG.md
Lines changed: 2 additions & 0 deletions
diff --git a/MANIFEST.in b/MANIFEST.in
Lines changed: 2 additions & 0 deletions
diff --git a/README.md b/README.md
Lines changed: 21 additions & 0 deletions
diff --git a/README.zh-CN.md b/README.zh-CN.md
Lines changed: 21 additions & 0 deletions
@@ -5,6 +5,8 @@ Detailed notes for each tagged release live under [`docs/releases/`](./docs/rele
 
 ## Unreleased
 
+- Added a reproducible structural evidence benchmark with path-only and AST-symbol baselines, README chart, result JSON, and benchmark notes.
+- Fixed Python entrypoint role scoring so `main.py` and similar entry files are not mislabeled as root configuration files.
 - Added semantic Python extraction for call targets, dynamic imports, type references, raised exceptions, class attributes, and lightweight data-flow edges.
 - Improved internal import resolution for detailed `from ... import ...` records and dynamic imports.
 - Fed call/type/data-flow evidence into file prioritization, planning prompts, generated skeletons, and project summaries.
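
The changelog entries above separate symbol-level facts from semantic evidence such as raised exceptions and main guards. A minimal sketch of that gap, using only the standard `ast` module, might look like the following. It is not code2skill's implementation: the sample source and the helper names `symbol_names`, `raised_exceptions`, and `has_main_guard` are hypothetical and exist only for illustration.

```python
import ast

# Hypothetical sample module, used only for this illustration.
SOURCE = '''
import importlib

class OrderError(Exception):
    pass

def load(name):
    if not name:
        raise OrderError("empty name")
    return importlib.import_module(name)

if __name__ == "__main__":
    load("json")
'''


def symbol_names(tree):
    """Symbol-level view: names of functions and classes only."""
    kinds = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)
    return sorted(node.name for node in ast.walk(tree) if isinstance(node, kinds))


def raised_exceptions(tree):
    """Names of exceptions raised as `raise Name` or `raise Name(...)`."""
    names = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Raise) and node.exc is not None:
            target = node.exc.func if isinstance(node.exc, ast.Call) else node.exc
            if isinstance(target, ast.Name):
                names.add(target.id)
    return sorted(names)


def has_main_guard(tree):
    """True if the module has a top-level `if __name__ == "__main__":` block."""
    for node in tree.body:
        if not (isinstance(node, ast.If) and isinstance(node.test, ast.Compare)):
            continue
        test = node.test
        if (
            isinstance(test.left, ast.Name)
            and test.left.id == "__name__"
            and len(test.ops) == 1
            and isinstance(test.ops[0], ast.Eq)
            and len(test.comparators) == 1
            and isinstance(test.comparators[0], ast.Constant)
            and test.comparators[0].value == "__main__"
        ):
            return True
    return False


tree = ast.parse(SOURCE)
print(symbol_names(tree))
print(raised_exceptions(tree))
print(has_main_guard(tree))
```

A symbol listing alone reports `OrderError` and `load` but says nothing about where the exception is raised or whether the file is an entry point, which is the kind of signal the entrypoint role fix relies on.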
 
@@ -3,7 +3,9 @@ include README.md
 include README.zh-CN.md
 include CHANGELOG.md
 include docs/*.md
+recursive-include docs/assets *.svg
 recursive-include docs/releases *.md
+recursive-include benchmarks *.py *.json
 prune docs/superpowers
 recursive-include src/code2skill py.typed
 prune .code2skill
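
To sanity-check which files the two new `recursive-include` lines are meant to pick up, a rough approximation of the rule can be sketched in Python. This is a simplified model, not setuptools' own matching logic: the helper `recursive_include` is hypothetical, and the paths `docs/assets/notes.txt` and `benchmarks/results/raw.csv` are invented as counterexamples.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def recursive_include(paths, directory, patterns):
    """Approximate `recursive-include <directory> <patterns...>`.

    Keeps every path that lives anywhere under `directory` and whose
    file name matches at least one of the glob patterns.
    """
    root = PurePosixPath(directory)
    selected = []
    for raw in paths:
        path = PurePosixPath(raw)
        under_root = root in path.parents
        if under_root and any(fnmatchcase(path.name, p) for p in patterns):
            selected.append(raw)
    return selected


# Candidate files; the .txt and .csv entries are made up for contrast.
candidates = [
    "docs/assets/structural-evidence-benchmark.svg",
    "docs/assets/notes.txt",
    "benchmarks/evaluate_structural_evidence.py",
    "benchmarks/results/structural-evidence-benchmark.json",
    "benchmarks/results/raw.csv",
]

svg_files = recursive_include(candidates, "docs/assets", ["*.svg"])
benchmark_files = recursive_include(candidates, "benchmarks", ["*.py", "*.json"])
print(svg_files)
print(benchmark_files)
```

Building an sdist and listing its contents remains the authoritative check; the sketch only shows the intent of the patterns.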
 
@@ -41,6 +41,26 @@ Use it when a Python project needs coding assistants to follow the current modul
 | Platform automation | A DevEx team runs the workflow across many Python services | Python API returns structured results and readiness status |
 | Contributor onboarding | New contributors need project-specific implementation rules | Generated Skills and docs describe the repo's working contracts |
 
+## Benchmark
+
+`code2skill` is evaluated on structural evidence extraction before any LLM call. The benchmark compares two simple baselines against the semantic scanner used by the Skill generation pipeline.
+
+![Structural evidence benchmark](docs/assets/structural-evidence-benchmark.svg)
+
+| Method | Gold evidence recall |
+|---|---:|
+| Path-only baseline | 0.048 |
+| AST symbols baseline | 0.357 |
+| code2skill semantic scanner | 1.000 |
+
+The gold set covers route decorators, service calls, type references, data-flow edges, dynamic imports, raised exceptions, main guards, and internal dependency edges. Reproduce it with:
+
+```bash
+python benchmarks/evaluate_structural_evidence.py
+```
+
+Details: [Benchmark Notes](https://github.com/oceanusXXD/code2skill/blob/main/docs/benchmarks.md), [result JSON](https://github.com/oceanusXXD/code2skill/blob/main/benchmarks/results/structural-evidence-benchmark.json).
+
 ## Install
 
 Requires Python 3.10 or newer.
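
For readers unfamiliar with the metric in the table above, gold evidence recall can be read as the fraction of hand-labeled evidence items that a method recovers. The sketch below assumes evidence items are hashable `(kind, location)` tuples; that representation, the function name `evidence_recall`, and the toy data are assumptions for illustration and are not taken from `benchmarks/evaluate_structural_evidence.py`.

```python
def evidence_recall(gold, predicted):
    """Fraction of gold evidence items that also appear in the predictions."""
    gold_set = set(gold)
    if not gold_set:
        raise ValueError("gold set must not be empty")
    return len(gold_set & set(predicted)) / len(gold_set)


# Toy gold set with four hypothetical evidence items.
gold = {
    ("route_decorator", "app/routes.py:list_orders"),
    ("raised_exception", "app/service.py:OrderError"),
    ("dynamic_import", "app/plugins.py:importlib.import_module"),
    ("main_guard", "main.py"),
}

# A path-based method might only infer the entry point from the file name.
path_only = {("main_guard", "main.py")}

# Extra predictions outside the gold set do not change recall.
semantic = set(gold) | {("type_reference", "app/models.py:Order")}

print(evidence_recall(gold, path_only))
print(evidence_recall(gold, semantic))
```

Recall ignores false positives by construction, so a perfect score says the method found every labeled item, not that everything it reported was correct.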
@@ -241,6 +261,7 @@ For lower-level automation, use `create_scan_config(...)` with `scan_repository(
 - [Python API](https://github.com/oceanusXXD/code2skill/blob/main/docs/python-api.md)
 - [Output Layout](https://github.com/oceanusXXD/code2skill/blob/main/docs/output-layout.md)
 - [Algorithm Notes](https://github.com/oceanusXXD/code2skill/blob/main/docs/algorithm-notes.md)
+- [Benchmark Notes](https://github.com/oceanusXXD/code2skill/blob/main/docs/benchmarks.md)
 - [Release Guide](https://github.com/oceanusXXD/code2skill/blob/main/docs/release.md)
 - [Changelog](https://github.com/oceanusXXD/code2skill/blob/main/CHANGELOG.md)
 
 
@@ -41,6 +41,26 @@
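 | Platform automation | A DevEx team runs the same workflow across many Python services | Python API returns structured results and readiness |
 | Open-source contributor onboarding | New contributors need the project's implementation rules before changing code | Generated Skills and docs describe the repo's working contracts |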
 
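+## Benchmark
+
+`code2skill` is evaluated on its ability to extract structural evidence before any LLM call. This benchmark compares two simple baselines against the semantic scanner used by the Skill generation pipeline.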
+
+![Structural evidence benchmark](docs/assets/structural-evidence-benchmark.svg)
+
+| Method | Gold evidence recall |
+|---|---:|
+| Path-only baseline | 0.048 |
+| AST symbols baseline | 0.357 |
+| code2skill semantic scanner | 1.000 |
+
+The gold set covers route decorators, service calls, type references, data-flow edges, dynamic imports, raised exceptions, main guards, and internal dependency edges. Reproduce it with:
+
+```bash
+python benchmarks/evaluate_structural_evidence.py
+```
+
+Details: [Benchmark Notes](https://github.com/oceanusXXD/code2skill/blob/main/docs/benchmarks.md), [result JSON](https://github.com/oceanusXXD/code2skill/blob/main/benchmarks/results/structural-evidence-benchmark.json).
+
 ## Install
 
 Requires Python 3.10 or newer.
@@ -241,6 +261,7 @@ print(readiness.ready, readiness.score)
 - [Python API](https://github.com/oceanusXXD/code2skill/blob/main/docs/python-api.md)
 - [Output Layout](https://github.com/oceanusXXD/code2skill/blob/main/docs/output-layout.md)
 - [Algorithm Notes](https://github.com/oceanusXXD/code2skill/blob/main/docs/algorithm-notes.md)
+- [Benchmark Notes](https://github.com/oceanusXXD/code2skill/blob/main/docs/benchmarks.md)
 - [Release Guide](https://github.com/oceanusXXD/code2skill/blob/main/docs/release.md)
 - [Changelog](https://github.com/oceanusXXD/code2skill/blob/main/CHANGELOG.md)