Round 77: governance validator pytest mirrors + panel reseal

johnteee · johnteee · commit 95fc8d948671 · 2026-06-25T15:46:43.000+08:00
Add pytest anti-drift mirrors for validate_governance, validate_links,
and lint_skills with live-repo pass checks and negative tmp_path fixtures.
Sync the panel record, Decision Index, READMEs, and QUALITY_GATES (pytest floor raised from 400+ to 410+).
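
For readers unfamiliar with the "negative tmp_path fixture" pattern named above, here is a minimal, self-contained sketch. It is an illustration only, not the repository's code: `missing_frontmatter_fields`, `check_negative_fixture`, `REQUIRED_FIELDS`, and the `example-skill` directory are hypothetical stand-ins, while the real tests in this commit call `GovernanceValidator`, `LinkValidator`, and `SkillLinter`. The idea is to write a deliberately broken `SKILL.md` into a scratch directory and assert that the validator reports the defect, so the test falsifies the validator without copying its logic.

```python
"""Sketch of the mirror pattern: one toy validator, one negative fixture."""
import tempfile
from pathlib import Path

# Hypothetical stand-in for a repo validator (assumption, not the real API).
REQUIRED_FIELDS = ("name", "description")


def missing_frontmatter_fields(text: str) -> list[str]:
    """Return required fields absent from a leading '---' frontmatter block."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        # No frontmatter at all: every required field is missing.
        return list(REQUIRED_FIELDS)
    present = set()
    for line in lines[1:]:
        if line.strip() == "---":
            break  # closing delimiter ends the frontmatter block
        key, sep, _ = line.partition(":")
        if sep:
            present.add(key.strip())
    return [field for field in REQUIRED_FIELDS if field not in present]


def check_negative_fixture(tmp_path: Path) -> list[str]:
    """Write a deliberately broken SKILL.md and return what the validator reports."""
    skill_dir = tmp_path / "skills" / "example-skill"  # hypothetical skill name
    skill_dir.mkdir(parents=True)
    skill_file = skill_dir / "SKILL.md"
    # The 'description' field is omitted on purpose.
    skill_file.write_text("---\nname: example-skill\n---\n# Body\n", encoding="utf-8")
    return missing_frontmatter_fields(skill_file.read_text(encoding="utf-8"))


def test_missing_description_is_reported(tmp_path):
    # Under pytest, tmp_path is the built-in per-test temporary directory.
    assert check_negative_fixture(tmp_path) == ["description"]


if __name__ == "__main__":
    # Outside pytest, a TemporaryDirectory plays the role of tmp_path.
    with tempfile.TemporaryDirectory() as scratch:
        print(check_negative_fixture(Path(scratch)))
```

The live-repo checks in this commit are the complementary half: they run the real validator against the checked-in tree and assert zero errors, while negatives like the one sketched here confirm the validator can still fail.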
diff --git a/README.md b/README.md
@@ -21,7 +21,7 @@ Full library docs: [reflective-prompt-library/README.md](reflective-prompt-libra
 ## Governance
 
 - **Contributing:** [CONTRIBUTING.md](CONTRIBUTING.md) — quality gates, routing maintenance (R8–R12), `make all`
-- **Panel record:** [multi-agent-panel-consensus](reflective-prompt-library/plans/multi-agent-panel-consensus-2026-06-25.md) — six-lens Socratic consensus (Rounds 1–75)
+- **Panel record:** [multi-agent-panel-consensus](reflective-prompt-library/plans/multi-agent-panel-consensus-2026-06-25.md) — six-lens Socratic consensus (Rounds 1–77)
 - **Operator playbook:** [GLOSSARY.md](reflective-prompt-library/GLOSSARY.md) — Governance Maintenance Playbook
 
 The repository contains:
diff --git a/reflective-prompt-library/PROJECT_KNOWLEDGE.md b/reflective-prompt-library/PROJECT_KNOWLEDGE.md
@@ -75,6 +75,7 @@ deferred promotions are recurrence-gated — see [panel backlog](plans/multi-age
 > Pointers to the causal trail — plans, reflections, tests, commits. Detail is
 > not duplicated here; this is a map, not an archive.
 
+- 2026-06-25 Round 77 panel — governance pytest mirrors (`test_validate_governance.py`, `test_validate_links.py`, `test_lint_skills.py`) → [record](plans/multi-agent-panel-consensus-2026-06-25.md)
 - 2026-06-25 Round 76 panel — standardize `06-repo/` prompt contracts (Purpose/Scope/Acceptance/Falsifiability) + thinking/workflow cross-links + `test_repo_prompts_eval_harness.py` → [record](plans/multi-agent-panel-consensus-2026-06-25.md)
 - 2026-06-25 Round 75 panel — standardize `05-domain/` prompt contracts (Purpose/Scope/Acceptance/Falsifiability) + thinking/workflow cross-links + `test_domain_prompts_eval_harness.py` → [record](plans/multi-agent-panel-consensus-2026-06-25.md)
 - 2026-06-25 Round 74 panel — standardize `03-context/` prompt contracts (Purpose/Scope/Acceptance/Falsifiability) + thinking/workflow cross-links + `test_context_prompts_eval_harness.py` → [record](plans/multi-agent-panel-consensus-2026-06-25.md)
diff --git a/reflective-prompt-library/README.md b/reflective-prompt-library/README.md
@@ -30,7 +30,7 @@ Pick **Strictness L1–L6** first (`skills/reflective-dispatch/SKILL.md`, [GLOSS
 
 ## Governance Panel Record
 
-Multi-agent Socratic consensus on project goals and the nine skills (Rounds 1–76, options A–DW) is recorded in [plans/multi-agent-panel-consensus-2026-06-25.md](plans/multi-agent-panel-consensus-2026-06-25.md). Run `make all` before claiming routing or governance changes are verified.
+Multi-agent Socratic consensus on project goals and the nine skills (Rounds 1–77, options A–DZ) is recorded in [plans/multi-agent-panel-consensus-2026-06-25.md](plans/multi-agent-panel-consensus-2026-06-25.md). Run `make all` before claiming routing or governance changes are verified.
 
 ## Directory Map
 
diff --git a/reflective-prompt-library/plans/QUALITY_GATES_SUMMARY.md b/reflective-prompt-library/plans/QUALITY_GATES_SUMMARY.md
@@ -314,7 +314,7 @@ ROUTE-002 measures unseen phrasing separately from ROUTE-001. Round 7 (2026-06-2
 2. **ROUTE-001/002/003 in CI** — 128 + 102 + 53 paraphrases at 100% consistency (seeded fixtures); `validate_route_fixture.py` gates minimum coverage
 3. **Governance validators** — links, lint, governance metadata, PROJECT_KNOWLEDGE, benchmark fixture, skill examples
 4. **Harness policy docs** — CONTRIBUTING, AGENTS, SKILL_INSTALLATION, maintenance playbook
-5. **Doc anti-drift** — `test_routing_contract.py`, cheatsheet parity tests, `test_readme_governance.py`, `test_thinking_prompts_eval_harness.py`, `test_engineering_prompts_eval_harness.py`, `test_prompt_cross_links.py`, `test_core_prompts_eval_harness.py`, `test_agent_prompts_eval_harness.py`, `test_context_prompts_eval_harness.py`, `test_domain_prompts_eval_harness.py`, `test_repo_prompts_eval_harness.py` (400+ pytest anti-drift suite in CI)
+5. **Doc anti-drift** — `test_routing_contract.py`, cheatsheet parity tests, `test_readme_governance.py`, `test_thinking_prompts_eval_harness.py`, `test_engineering_prompts_eval_harness.py`, `test_prompt_cross_links.py`, `test_core_prompts_eval_harness.py`, `test_agent_prompts_eval_harness.py`, `test_context_prompts_eval_harness.py`, `test_domain_prompts_eval_harness.py`, `test_repo_prompts_eval_harness.py`, `test_validate_governance.py`, `test_validate_links.py`, `test_lint_skills.py` (410+ pytest anti-drift suite in CI)
 
 ### Ongoing maintenance (not blockers)
 
diff --git a/reflective-prompt-library/plans/multi-agent-panel-consensus-2026-06-25.md b/reflective-prompt-library/plans/multi-agent-panel-consensus-2026-06-25.md
@@ -1934,4 +1934,63 @@ User directive (repeat): review prompts, plans, skills, and Socratic/critical-th
 
 ## Panel status (updated)
 
-**Resealed 2026-06-25** after **Round 76** (options DU–DW). Repository-template contract pass complete; full prompt-library contract sweep (`00-core`–`06-repo`) finished. Governance pytest mirrors remain recurrence-gated.
+**Resealed 2026-06-25** after **Round 76** (options DU–DW). Repository-template contract pass complete; full prompt-library contract sweep (`00-core`–`06-repo`) finished. Governance pytest mirrors remain recurrence-gated.
+
+## Round 77 — Governance pytest mirrors (2026-06-25)
+
+User directive (repeat): review prompts, plans, skills, and Socratic/critical-thinking lenses in parallel until all roles agree, then implement.
+
+### DX: Pytest mirrors for `validate_governance`, `validate_links`, `lint_skills`?
+
+| Lens | Position |
+| --- | --- |
+| Opus | **Agree** — `CANONICAL_CONTEXT_LOAD` table must be pytest-guarded; mirrors catch drift before `make validate` |
+| Codex | **Agree** — negative fixtures falsify without duplicating validator logic; live-repo smoke tests |
+| Gemini | **Agree** — context_load deferral is cost-relevant; pytest mirrors are cheap |
+| Composer | **Agree** — IDE sessions edit SKILL frontmatter; mirrors close DH backlog |
+| Sakana | **Agree** — no tenth skill; mirrors protect existing nine |
+| GLM | **Agree** — English canonical metadata; TW routing unaffected |
+
+**Socratic Q:** Why mirrors now after Round 76 rejected DV?
+**Answer:** Full prompt-library contract sweep is complete; user re-triggered panel cycle; recurrence gate for DH backlog is satisfied.
+
+**Consensus:** **Agree** — add `test_validate_governance.py`, `test_validate_links.py`, `test_lint_skills.py` with live-repo pass checks + negative tmp_path cases; sync QUALITY_GATES pytest floor.
+
+### DY: Router / holdout / tenth skill?
+
+| Lens | Position |
+| --- | --- |
+| All six | **Reject** — ROUTE-001/002/003 at 100%; nine-skill freeze holds |
+
+### DZ: LLM benchmark in CI?
+
+| Lens | Position |
+| --- | --- |
+| All six | **Reject** — manual `benchmark_tasks.py` only (Rounds 5–6) |
+
+### Round 77 verdict table
+
+| ID | Option | Verdict | Action |
+| --- | --- | --- | --- |
+| DX | Governance pytest mirrors | **Agree** | 3 test modules + QUALITY_GATES sync |
+| DY | Router/holdout/tenth skill | **Reject** | no change |
+| DZ | LLM benchmark in CI | **Reject** | no change |
+
+**All roles agree.**
+
+## Implemented Changes (Round 77)
+
+- `plans/tests/test_validate_governance.py`: `CANONICAL_CONTEXT_LOAD` parity + live 9/9 pass + negative fixtures
+- `plans/tests/test_validate_links.py`: live-repo zero errors + broken link / frontmatter negatives
+- `plans/tests/test_lint_skills.py`: live-repo zero lint errors + nine SKILL.md detection + negative fixture
+- `QUALITY_GATES_SUMMARY.md`: governance mirror tests; pytest floor 410+
+- `PROJECT_KNOWLEDGE.md`: Decision Index Round 77 entry
+- `README.md`, `reflective-prompt-library/README.md`, `test_readme_governance.py`: panel round 77 sync
+
+## Verification (Round 77)
+
+- `make all`: pytest + ROUTE-001/002/003 100%
+
+## Panel status (updated)
+
+**Resealed 2026-06-25** after **Round 77** (options DX–DZ). Governance validator pytest mirrors complete; prompt-library contract sweep and governance anti-drift suite closed. Holdout expansion before router tuning remains recurrence-gated maintenance.
diff --git a/reflective-prompt-library/plans/tests/test_lint_skills.py b/reflective-prompt-library/plans/tests/test_lint_skills.py
@@ -0,0 +1,55 @@
+"""Pytest mirrors for lint_skills.py (Round 77 anti-drift)."""
+
+import sys
+from pathlib import Path
+
+import pytest
+
+sys.path.insert(0, str(Path(__file__).parent.parent))
+
+from lint_skills import SkillLinter  # noqa: E402
+from validate_skill_examples import CORE_SKILLS  # noqa: E402
+
+REPO_ROOT = Path(__file__).parent.parent.parent.parent
+
+
+@pytest.fixture(scope="module")
+def lint_results():
+    return SkillLinter(str(REPO_ROOT)).lint_all()
+
+
+def test_live_repo_has_no_lint_errors(lint_results):
+    assert lint_results["total_errors"] == 0, [
+        (item["file"], item["errors"])
+        for item in lint_results["file_results"]
+        if item["errors"]
+    ]
+
+
+def test_all_nine_core_skills_are_linted_as_skills(lint_results):
+    skill_files = {
+        item["file"]
+        for item in lint_results["file_results"]
+        if item["type"] == "skill" and item["file"].endswith("SKILL.md")
+    }
+    for skill in CORE_SKILLS:
+        expected = f"reflective-prompt-library/skills/{skill}/SKILL.md"
+        assert expected in skill_files, skill
+
+
+def test_skill_missing_description_reports_error(tmp_path):
+    skill_dir = tmp_path / "reflective-prompt-library" / "skills" / "reflective-brief"
+    skill_dir.mkdir(parents=True)
+    skill_file = skill_dir / "SKILL.md"
+    skill_file.write_text(
+        """---
+name: reflective-brief
+license: MIT
+---
+# Brief
+""",
+        encoding="utf-8",
+    )
+    result = SkillLinter(str(tmp_path)).lint_file(skill_file)
+    assert result["type"] == "skill"
+    assert any("description" in err.lower() for err in result["errors"])
diff --git a/reflective-prompt-library/plans/tests/test_readme_governance.py b/reflective-prompt-library/plans/tests/test_readme_governance.py
@@ -10,8 +10,8 @@
 METHODOLOGY_MAP_EN = Path(__file__).parent.parent.parent / "METHODOLOGY_MAP.md"
 SKILL_MAP = Path(__file__).parent.parent.parent / "skills" / "skill-map.md"
 
-CURRENT_PANEL_ROUND = "76"
-CURRENT_PANEL_OPTIONS = "A–DW"
+CURRENT_PANEL_ROUND = "77"
+CURRENT_PANEL_OPTIONS = "A–DZ"
 
 
 @pytest.fixture(scope="module")
diff --git a/reflective-prompt-library/plans/tests/test_validate_governance.py b/reflective-prompt-library/plans/tests/test_validate_governance.py
@@ -0,0 +1,84 @@
+"""Pytest mirrors for validate_governance.py (Round 77 anti-drift)."""
+
+import sys
+from pathlib import Path
+
+import pytest
+
+sys.path.insert(0, str(Path(__file__).parent.parent))
+
+from validate_governance import CANONICAL_CONTEXT_LOAD, GovernanceValidator  # noqa: E402
+from validate_skill_examples import CORE_SKILLS  # noqa: E402
+
+REPO_ROOT = Path(__file__).parent.parent.parent.parent
+
+
+@pytest.fixture(scope="module")
+def governance_results():
+    return GovernanceValidator(str(REPO_ROOT)).validate_all()
+
+
+def test_canonical_context_load_matches_core_skills():
+    assert set(CANONICAL_CONTEXT_LOAD) == set(CORE_SKILLS)
+    assert len(CANONICAL_CONTEXT_LOAD) == 9
+
+
+def test_all_skills_pass_governance_validation(governance_results):
+    assert governance_results["total_skills"] == 9
+    assert governance_results["invalid_skills"] == 0, governance_results["errors"]
+
+
+@pytest.mark.parametrize("skill,expected_load", sorted(CANONICAL_CONTEXT_LOAD.items()))
+def test_live_skill_context_load_matches_panel_table(skill, expected_load):
+    skill_path = (
+        REPO_ROOT / "reflective-prompt-library" / "skills" / skill / "SKILL.md"
+    )
+    validator = GovernanceValidator(str(REPO_ROOT))
+    frontmatter = validator.extract_frontmatter(skill_path.read_text(encoding="utf-8"))
+    assert frontmatter.get("context_load", "").lower() == expected_load
+
+
+def test_wrong_context_load_fails_validation(tmp_path):
+    skills_dir = tmp_path / "reflective-prompt-library" / "skills" / "reflective-dispatch"
+    skills_dir.mkdir(parents=True)
+    skills_dir.joinpath("SKILL.md").write_text(
+        """---
+name: reflective-dispatch
+description: test
+risk_level: low
+human_review_required: false
+external_io: false
+context_load: high
+---
+# Test
+""",
+        encoding="utf-8",
+    )
+    results = GovernanceValidator(str(tmp_path)).validate_all()
+    assert results["invalid_skills"] == 1
+    assert any(
+        "context_load must be 'low'" in err
+        for item in results["errors"]
+        for err in item["errors"]
+    )
+
+
+def test_missing_governance_field_fails_validation(tmp_path):
+    skills_dir = tmp_path / "reflective-prompt-library" / "skills" / "reflective-brief"
+    skills_dir.mkdir(parents=True)
+    skills_dir.joinpath("SKILL.md").write_text(
+        """---
+name: reflective-brief
+description: test
+---
+# Test
+""",
+        encoding="utf-8",
+    )
+    results = GovernanceValidator(str(tmp_path)).validate_all()
+    assert results["invalid_skills"] == 1
+    assert any(
+        "Missing required field" in err
+        for item in results["errors"]
+        for err in item["errors"]
+    )
diff --git a/reflective-prompt-library/plans/tests/test_validate_links.py b/reflective-prompt-library/plans/tests/test_validate_links.py
@@ -0,0 +1,77 @@
+"""Pytest mirrors for validate_links.py (Round 77 anti-drift)."""
+
+import sys
+from pathlib import Path
+
+import pytest
+
+sys.path.insert(0, str(Path(__file__).parent.parent))
+
+from validate_links import LinkValidator  # noqa: E402
+
+REPO_ROOT = Path(__file__).parent.parent.parent.parent
+
+
+@pytest.fixture(scope="module")
+def link_results():
+    return LinkValidator(str(REPO_ROOT)).validate_all()
+
+
+def test_live_repo_has_no_link_errors(link_results):
+    assert link_results["total_errors"] == 0, {
+        "ref_file": link_results["ref_file_errors"],
+        "ref_snippet": link_results["ref_snippet_errors"],
+        "markdown": link_results["markdown_link_errors"],
+        "frontmatter": link_results["frontmatter_errors"],
+    }
+
+
+def test_broken_markdown_link_is_detected(tmp_path):
+    bad_file = tmp_path / "broken.md"
+    bad_file.write_text("[missing](does-not-exist.md)\n", encoding="utf-8")
+    results = {
+        "ref_file_errors": [],
+        "ref_snippet_errors": [],
+        "markdown_link_errors": [],
+        "frontmatter_errors": [],
+        "total_files": 0,
+        "total_errors": 0,
+    }
+    LinkValidator(str(tmp_path)).validate_markdown_links(
+        bad_file.read_text(encoding="utf-8"),
+        bad_file,
+        bad_file.relative_to(tmp_path),
+        results,
+    )
+    assert results["markdown_link_errors"]
+
+
+def test_skill_frontmatter_requires_name_and_description(tmp_path):
+    skill_dir = tmp_path / "reflective-prompt-library" / "skills" / "reflective-dispatch"
+    skill_dir.mkdir(parents=True)
+    skill_file = skill_dir / "SKILL.md"
+    skill_file.write_text(
+        """---
+license: MIT
+---
+# Missing name and description
+""",
+        encoding="utf-8",
+    )
+    results = {
+        "ref_file_errors": [],
+        "ref_snippet_errors": [],
+        "markdown_link_errors": [],
+        "frontmatter_errors": [],
+        "total_files": 0,
+        "total_errors": 0,
+    }
+    LinkValidator(str(tmp_path)).validate_skill_frontmatter(
+        skill_file.read_text(encoding="utf-8"),
+        skill_file,
+        skill_file.relative_to(tmp_path),
+        results,
+    )
+    errors = [item["error"] for item in results["frontmatter_errors"]]
+    assert any("Missing required field: name" in err for err in errors)
+    assert any("Missing required field: description" in err for err in errors)