docs: v2.95 skill-scoped trust scores documentation

J.A.R.V.I.S. · J.A.R.V.I.S. · commit 78c36d3f679c · 2026-04-10T03:56:58.000+08:00
- ROADMAP.md: v2.87→v2.95 completed table; v2.96 candidate list; header version updated
- spec/peer-trust-v2.34.md: §11 Skill-Scoped Trust Scores (data model, algorithm, API, compat, SS01-16 test table)
- whats-new.md: v2.95 entry with curl examples and response shapes
diff --git a/docs/ROADMAP.md b/docs/ROADMAP.md
@@ -1,7 +1,7 @@
 # ACP Protocol R&D Roadmap
 
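 > Continuously updated. J.A.R.V.I.S. automatically scans competitor activity every week and ships one new version each month.
-> Last updated: 2026-04-10 (v2.94.0: principal_diversity_defense, GET /trust/bilateral-ir/diversity colluding-pair penalty; current version 2.94.0, commit b9f638e)
+> Last updated: 2026-04-10 (v2.95.0: skill-scoped trust scores, governance_metadata.trust_scores dict + QuerySkill skill_trust_score; current version 2.95.0, commit 070e0d3)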
 
 ---
 
@@ -772,10 +772,10 @@ APS:  https://github.com/aeoess/agent-passport-system  (Ed25519 identity, v0.8
 
 **Test results**: scenarios A+B+F+G+H all PASS; card_signature 11/11; full identity suite passes
 
-### ✅ v2.87–v2.94 (completed: 2026-04-08 to 2026-04-10)
+### ✅ v2.87–v2.95 (completed: 2026-04-08 to 2026-04-10)
 **Theme: deepening the trust infrastructure (identity, governance, anti-manipulation)**
 
-> Last updated: 2026-04-10 (v2.94 principal_diversity_defense completed)
+> Last updated: 2026-04-10 (v2.95 skill-scoped trust scores completed)
 
 | Version | Theme | Key deliverables | Commit |
 |------|------|---------|--------|
@@ -787,21 +787,22 @@ APS:  https://github.com/aeoess/agent-passport-system  (Ed25519 identity, v0.8
 | v2.92 | RFC-003 governance metadata | `GET /governance-metadata`: `derivation_rights` + `credential_lifecycle`; aligned with A2A #1717 and aeoess SDK v1.37.0; all 16 tests (GM01-16) pass | - |
 | v2.93 | RFC-004 identity without a CA | `docs/rfc/identity-without-ca.md`: Ed25519 self-signing, three-layer trust model, 9-dimension comparison against CA, multi-provider DID; A2A #1712 community draft | `f384752` |
 | v2.94 | Principal diversity defense | `GET /trust/bilateral-ir/diversity`: colluding-pair penalty (concentration > 60% → 0.10x weight); `principal_diversity_defense: true`; all 16 tests (PD01-16) pass | `b9f638e` |
+| v2.95 | Skill trust scores | `_compute_skill_trust_scores()` + `GET /trust/skill-scores` + QuerySkill `skill_trust_score` + `governance_metadata.trust_scores` dict; `skill_scoped_v1` algorithm; all 16 tests (SS01-16) pass | `070e0d3` |
 
-**Current version**: `2.94.0` | **Latest commit**: `b9f638e`
+**Current version**: `2.95.0` | **Latest commit**: `070e0d3`
 
 ---
 
-### 🔮 v2.95 (candidate, target: 2026-04-15)
-**Theme: Show HN launch + 2-Agent demo + full QuerySkill() implementation**
+### 🔮 v2.96 (candidate, target: 2026-04-15)
+**Theme: Show HN launch + 2-Agent demo**
 
 | Candidate feature | Priority | Notes |
 |---------|--------|------|
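 | 2-Agent demo terminal recording | P0 | Live demo of bidirectional Alpha↔Beta communication over curl (asciinema or gif) |
 | Hacker News launch | P0 | Best time: Monday/Tuesday morning, 9-10 AM ET; requires final approval from Mr. Stark |
 | Embed demo gif in README | P1 | Makes the first screen more intuitive and lowers friction for new visitors |
 | Post A2A #1712 comment | P1 | `docs/community/a2a-1712-comment.md` → post to GitHub (timing is right: the #1672 CA author has moved to a hybrid approach) |
-| Full QuerySkill() implementation | P2 | A2A #1655 (9 comments) is still open; ACP already has a draft, so complete the implementation |
+| Fix BUG-007/BUG-009/BUG-003b | P1 | Backlog of three P1 bugs to clear before Show HN |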
 
 ---
 
diff --git a/docs/spec/peer-trust-v2.34.md b/docs/spec/peer-trust-v2.34.md
@@ -306,4 +306,110 @@ All 10 tests pass as of v2.34.0.
 
 ---
 
+## 11. Skill-Scoped Trust Scores (v2.95)
+
+> **A2A reference**: Issue #1717 — governance_metadata skill-scoped trust (community convergence 2026-04-09)
+
+The global `trust_score` in §4 measures overall peer trustworthiness. As of v2.95, ACP introduces
+**per-skill trust scores** derived from bilateral IR (interaction record) evidence, so callers can assess
+how trustworthy a specific skill is rather than relying solely on the aggregate peer score.
+
+### 11.1 Data Model
+
+`governance_metadata.trust_scores`:
+
+```json
+{
+  "trust_scores": {
+    "text.summarize": 0.525,
+    "code.review":    0.435
+  },
+  "trust_score_method": "skill_scoped_v1",
+  "trust_score": 0.48
+}
+```
+
+- `trust_scores` — dict of `skill_id → float [0.0, 1.0]`; empty `{}` = no bilateral IR evidence yet
+- `trust_score_method` — always `"skill_scoped_v1"` as of v2.95
+- `trust_score` — global scalar retained for backward compatibility (A2A #1717 v1 spec)
+
+### 11.2 Score Algorithm (`skill_scoped_v1`)
+
+```
+score(skill_id) =
+  clamp(
+    0.3
+    + min(unique_callers(skill_id), 10) * 0.04
+    + min(bilateral_count(skill_id), 50) * 0.005,
+    0.0, 1.0
+  )
+```
+
+Where:
+- `unique_callers` — number of distinct `caller_did` values in bilateral IR records for this skill
+- `bilateral_count` — number of bilateral IR records (`bilateral: true`) for this skill
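+- Base score `0.3` is the minimum for any skill with IR evidence
+- Caller diversity (max `+0.40`, reached at 10 unique callers) rewards broad adoption over narrow usage
+- Volume (max `+0.25`, reached at 50 bilateral records) rewards sustained usage
+- The highest attainable score is therefore `0.95`, so the clamp to `[0.0, 1.0]` acts only as a safety bound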
+
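The formula above can be sketched in Python as follows. This is a minimal illustration, not the project's `_compute_skill_trust_scores()` implementation: the function names and the record shape (a dict with `skill_id`, `caller_did`, and `bilateral` keys) are assumptions made for the example.

```python
from collections import defaultdict

BASE = 0.3
CALLER_WEIGHT, CALLER_CAP = 0.04, 10
VOLUME_WEIGHT, VOLUME_CAP = 0.005, 50


def skill_score(unique_callers: int, bilateral_count: int) -> float:
    """Apply the skill_scoped_v1 formula to one skill's counters."""
    raw = (BASE
           + min(unique_callers, CALLER_CAP) * CALLER_WEIGHT
           + min(bilateral_count, VOLUME_CAP) * VOLUME_WEIGHT)
    return max(0.0, min(raw, 1.0))  # clamp to [0.0, 1.0]


def compute_skill_trust_scores(ir_records):
    """Hypothetical helper: group bilateral IR records by skill and score each."""
    callers = defaultdict(set)
    counts = defaultdict(int)
    for rec in ir_records:
        if not rec.get("bilateral"):
            continue  # only bilateral records count as evidence
        skill = rec["skill_id"]
        callers[skill].add(rec["caller_did"])
        counts[skill] += 1
    # Skills without bilateral evidence get no entry at all.
    return {s: skill_score(len(callers[s]), counts[s]) for s in counts}


# Assumed sample data: 5 distinct callers for one skill, 3 for the other.
records = (
    [{"skill_id": "text.summarize", "caller_did": f"did:example:{i}", "bilateral": True}
     for i in range(5)]
    + [{"skill_id": "code.review", "caller_did": f"did:example:{i}", "bilateral": True}
       for i in range(3)]
)
scores = compute_skill_trust_scores(records)
print({k: round(v, 3) for k, v in scores.items()})
```

With these assumed inputs, five callers with one record each give `0.3 + 5 * 0.04 + 5 * 0.005 = 0.525`, and three callers with one record each give `0.435`, matching the example values in §11.1.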
+### 11.3 API
+
+| Endpoint | Method | Description |
+|----------|--------|-------------|
+| `/trust/skill-scores` | GET | All per-skill scores from bilateral IR evidence |
+| `/skills/query` | POST | Returns `skill_trust_score` field per queried skill |
+
+`GET /trust/skill-scores` response:
+
+```json
+{
+  "ok": true,
+  "trust_scores": { "text.summarize": 0.525 },
+  "method": "skill_scoped_v1",
+  "algorithm": {
+    "base": 0.3,
+    "caller_diversity": "min(unique_callers, 10) * 0.04",
+    "volume": "min(bilateral_count, 50) * 0.005",
+    "max": 1.0
+  },
+  "skill_count": 1,
+  "ir_count": 5,
+  "version": "2.95.0"
+}
+```
+
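A client could sanity-check this response shape along the lines below. This is a sketch under assumptions: the required-field list is taken from the sample response above rather than from a published schema, the function name is hypothetical, and the `skill_count` check assumes every counted skill also has a score.

```python
# Field names copied from the sample response; treating them all as required is an assumption.
REQUIRED_FIELDS = ("ok", "trust_scores", "method", "algorithm",
                   "skill_count", "ir_count", "version")
ALGORITHM_KEYS = ("base", "caller_diversity", "volume", "max")


def check_skill_scores_response(payload: dict) -> list:
    """Return a list of problems found; an empty list means the shape looks right."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in payload]
    if problems:
        return problems
    problems += [f"missing algorithm key: {key}"
                 for key in ALGORITHM_KEYS if key not in payload["algorithm"]]
    scores = payload["trust_scores"]
    for skill_id, score in scores.items():
        if not 0.0 <= score <= 1.0:
            problems.append(f"score out of range for {skill_id}: {score}")
    # Assumption: skill_count equals the number of scored skills.
    if payload["skill_count"] != len(scores):
        problems.append("skill_count does not match number of scored skills")
    return problems


sample = {
    "ok": True,
    "trust_scores": {"text.summarize": 0.525},
    "method": "skill_scoped_v1",
    "algorithm": {
        "base": 0.3,
        "caller_diversity": "min(unique_callers, 10) * 0.04",
        "volume": "min(bilateral_count, 50) * 0.005",
        "max": 1.0,
    },
    "skill_count": 1,
    "ir_count": 5,
    "version": "2.95.0",
}
print(check_skill_scores_response(sample))
```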
+### 11.4 Backward Compatibility
+
+- Global `trust_score` scalar is preserved in `governance_metadata`
+- When bilateral IR evidence exists, `trust_score` is updated to the average of the per-skill scores
+- When no IR evidence exists, `trust_score` retains the configured/startup value
+- Clients that only read `trust_score` continue to work without modification
+
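The compatibility rules above can be illustrated with a short sketch. It assumes "average" means the unweighted arithmetic mean of the per-skill scores, and both function names are hypothetical.

```python
def resolve_global_trust_score(skill_scores: dict, configured_score: float) -> float:
    """Pick the global scalar: mean of per-skill scores, else the configured value."""
    if not skill_scores:
        return configured_score  # no IR evidence: keep the configured/startup value
    # Assumption: unweighted arithmetic mean.
    return sum(skill_scores.values()) / len(skill_scores)


def build_governance_metadata(skill_scores: dict, configured_score: float) -> dict:
    """Assemble the three trust fields described in §11.1."""
    return {
        "trust_scores": dict(skill_scores),
        "trust_score_method": "skill_scoped_v1",
        "trust_score": resolve_global_trust_score(skill_scores, configured_score),
    }


no_evidence = build_governance_metadata({}, configured_score=0.75)
with_evidence = build_governance_metadata(
    {"text.summarize": 0.525, "code.review": 0.435}, configured_score=0.75
)
print(no_evidence["trust_score"], round(with_evidence["trust_score"], 3))
```

A legacy client that reads only `trust_score` sees a float in both cases, which is what keeps it working unmodified.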
+### 11.5 Test Coverage
+
+| Test ID | Description |
+|---------|-------------|
+| SS01 | VERSION == 2.95.0 |
+| SS02 | `capabilities.skill_scoped_trust_scores: true` declared |
+| SS03 | `endpoints.skill_trust_scores` declared in AgentCard |
+| SS04 | Existing endpoints (bilateral_ir_log/diversity) still declared |
+| SS05 | `/trust/skill-scores` returns an empty `trust_scores` dict (`{}`) when there are no IR records |
+| SS06 | Response schema contains all required fields |
+| SS07 | `algorithm` block contains base/caller_diversity/volume/max |
+| SS08 | Single skill score computed correctly from IR evidence |
+| SS09 | Two skills produce separate independent scores |
+| SS10 | All scores clamped to [0.0, 1.0] |
+| SS11 | `skill_count` matches unique skill_ids in IR records |
+| SS12 | QuerySkill response contains `skill_trust_score` field |
+| SS13 | `skill_trust_score == null` when no IR evidence |
+| SS14 | `skill_trust_score` populated after bilateral IR for that skill |
+| SS15 | `/governance-metadata` includes `trust_scores` dict + `trust_score_method` |
+| SS16 | Global `trust_score` backward compat — configured value retained when no IR |
+
+All 16 tests pass as of v2.95.0 (`tests/test_skill_scoped_trust_v295.py`).
+
+---
+
 *ACP is built by Kickflip73 + J.A.R.V.I.S. · [GitHub](https://github.com/Kickflip73/agent-communication-protocol)*
diff --git a/docs/whats-new.md b/docs/whats-new.md
@@ -1,12 +1,66 @@
 # What's New in ACP — Last 7 Days
 
-> Last updated: 2026-04-07
+> Last updated: 2026-04-10
 > For the full history see [CHANGELOG.md](../CHANGELOG.md)
 
 ---
 
+### v2.95.0 — Skill-Scoped Trust Scores (2026-04-10)
+
+ACP v2.95 introduces **per-skill trust scores** derived from bilateral interaction record evidence — enabling callers to evaluate trust at the skill level rather than relying solely on aggregate peer scores. Aligned with A2A Issue #1717 community convergence.
+
+**New endpoint: `GET /trust/skill-scores`**
+
+```bash
+curl http://localhost:18900/trust/skill-scores
+```
+```json
+{
+  "ok": true,
+  "trust_scores": {
+    "text.summarize": 0.525,
+    "code.review":    0.435
+  },
+  "method": "skill_scoped_v1",
+  "algorithm": {
+    "base": 0.3,
+    "caller_diversity": "min(unique_callers, 10) * 0.04",
+    "volume": "min(bilateral_count, 50) * 0.005",
+    "max": 1.0
+  },
+  "skill_count": 2,
+  "ir_count": 8,
+  "version": "2.95.0"
+}
+```
+
+**QuerySkill now returns `skill_trust_score`:**
+
+```bash
+curl -X POST http://localhost:18900/skills/query \
+  -H "Content-Type: application/json" \
+  -d '{"skill_id": "text.summarize"}'
+# → {"skill_trust_score": 0.525, "support_level": "supported", ...}
+```
+
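The same call can be made from Python with the standard library. The sketch below is illustrative: the function names, the `0.5` threshold, and the policy of treating a missing or `null` score as untrusted are assumptions, not part of ACP.

```python
import json
import urllib.request


def query_skill(base_url: str, skill_id: str) -> dict:
    """POST /skills/query and return the decoded JSON body."""
    req = urllib.request.Request(
        f"{base_url}/skills/query",
        data=json.dumps({"skill_id": skill_id}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return json.load(resp)


def is_skill_trusted(response: dict, threshold: float = 0.5) -> bool:
    """Hypothetical caller-side policy on top of skill_trust_score."""
    score = response.get("skill_trust_score")
    if score is None:
        # null means no bilateral IR evidence yet; this sketch treats that as untrusted.
        return False
    return score >= threshold


# Requires a running ACP node, so the network call is left commented out:
# resp = query_skill("http://localhost:18900", "text.summarize")
# print(is_skill_trusted(resp))
```

A caller that prefers to fall back to the global `trust_score` when the per-skill value is `null` could do so inside `is_skill_trusted`.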
+**`governance_metadata` updated:**
+
+```json
+{
+  "trust_scores": {"text.summarize": 0.525},
+  "trust_score_method": "skill_scoped_v1",
+  "trust_score": 0.525
+}
+```
+
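+The global `trust_score` is retained for backward compatibility. When bilateral IR evidence exists, it is updated to the average of the per-skill scores; otherwise it keeps the configured value. An empty `trust_scores` dict (`{}`) means there is no evidence yet and is not an error.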
+
+Test coverage: **SS01–SS16 = 16/16 PASS** | [See CHANGELOG](../CHANGELOG.md)
+
 ---
 
+### v2.94.0 — Principal Diversity Defense (2026-04-10)
+
 ---
 
 ### v2.79.0 — Protocol Binding Declaration / A2A §5.8 CPB (2026-04-07)