Commit 95fc8d9

Round 77: governance validator pytest mirrors + panel reseal
Add pytest anti-drift mirrors for validate_governance, validate_links, and lint_skills with live-repo pass checks and negative tmp_path fixtures. Sync panel record, Decision Index, READMEs, and QUALITY_GATES (410+ pytest).
1 parent 3cf341c commit 95fc8d9

9 files changed

Lines changed: 282 additions & 6 deletions
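
The commit message pairs a "live-repo pass check" with "negative tmp_path fixtures". A minimal sketch of that pairing follows, assuming a toy validator; `skills_without_frontmatter`, `make_skill`, and `run_mirror_checks` are hypothetical names, not part of this repository, and `tempfile` stands in for pytest's `tmp_path` so the sketch runs on its own.

```python
import tempfile
from pathlib import Path


def skills_without_frontmatter(root: Path) -> list[str]:
    """Toy validator (hypothetical): name the skills whose SKILL.md lacks '---' frontmatter."""
    flagged = []
    for skill_file in sorted(root.glob("skills/*/SKILL.md")):
        if not skill_file.read_text(encoding="utf-8").startswith("---\n"):
            flagged.append(skill_file.parent.name)
    return flagged


def make_skill(root: Path, name: str, text: str) -> None:
    """Write root/skills/<name>/SKILL.md, the way a tmp_path fixture would."""
    skill_dir = root / "skills" / name
    skill_dir.mkdir(parents=True)
    (skill_dir / "SKILL.md").write_text(text, encoding="utf-8")


def run_mirror_checks() -> dict[str, list[str]]:
    """Run one passing tree and one deliberately broken tree through the validator."""
    with tempfile.TemporaryDirectory() as good_root, tempfile.TemporaryDirectory() as bad_root:
        make_skill(Path(good_root), "alpha", "---\nname: alpha\n---\n# Alpha\n")
        make_skill(Path(bad_root), "beta", "# Beta (no frontmatter)\n")
        return {
            "good": skills_without_frontmatter(Path(good_root)),
            "bad": skills_without_frontmatter(Path(bad_root)),
        }


results = run_mirror_checks()
assert results["good"] == []        # pass check: a valid tree yields no findings
assert results["bad"] == ["beta"]   # negative check: the broken tree must be flagged
```

The negative case is what makes the mirror falsifiable: if the validator were weakened until it accepted everything, the pass check alone would still succeed.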

README.md

Lines changed: 1 addition & 1 deletion
@@ -21,7 +21,7 @@ Full library docs: [reflective-prompt-library/README.md](reflective-prompt-libra
 ## Governance

 - **Contributing:** [CONTRIBUTING.md](CONTRIBUTING.md) — quality gates, routing maintenance (R8–R12), `make all`
-- **Panel record:** [multi-agent-panel-consensus](reflective-prompt-library/plans/multi-agent-panel-consensus-2026-06-25.md) — six-lens Socratic consensus (Rounds 1–75)
+- **Panel record:** [multi-agent-panel-consensus](reflective-prompt-library/plans/multi-agent-panel-consensus-2026-06-25.md) — six-lens Socratic consensus (Rounds 1–77)
 - **Operator playbook:** [GLOSSARY.md](reflective-prompt-library/GLOSSARY.md) — Governance Maintenance Playbook

 The repository contains:

reflective-prompt-library/PROJECT_KNOWLEDGE.md

Lines changed: 1 addition & 0 deletions
@@ -75,6 +75,7 @@ deferred promotions are recurrence-gated — see [panel backlog](plans/multi-age
 > Pointers to the causal trail — plans, reflections, tests, commits. Detail is
 > not duplicated here; this is a map, not an archive.

+- 2026-06-25 Round 77 panel — governance pytest mirrors (`test_validate_governance.py`, `test_validate_links.py`, `test_lint_skills.py`) → [record](plans/multi-agent-panel-consensus-2026-06-25.md)
 - 2026-06-25 Round 76 panel — standardize `06-repo/` prompt contracts (Purpose/Scope/Acceptance/Falsifiability) + thinking/workflow cross-links + `test_repo_prompts_eval_harness.py` → [record](plans/multi-agent-panel-consensus-2026-06-25.md)
 - 2026-06-25 Round 75 panel — standardize `05-domain/` prompt contracts (Purpose/Scope/Acceptance/Falsifiability) + thinking/workflow cross-links + `test_domain_prompts_eval_harness.py` → [record](plans/multi-agent-panel-consensus-2026-06-25.md)
 - 2026-06-25 Round 74 panel — standardize `03-context/` prompt contracts (Purpose/Scope/Acceptance/Falsifiability) + thinking/workflow cross-links + `test_context_prompts_eval_harness.py` → [record](plans/multi-agent-panel-consensus-2026-06-25.md)

reflective-prompt-library/README.md

Lines changed: 1 addition & 1 deletion
@@ -30,7 +30,7 @@ Pick **Strictness L1–L6** first (`skills/reflective-dispatch/SKILL.md`, [GLOSS

 ## Governance Panel Record

-Multi-agent Socratic consensus on project goals and the nine skills (Rounds 1–76, options A–DW) is recorded in [plans/multi-agent-panel-consensus-2026-06-25.md](plans/multi-agent-panel-consensus-2026-06-25.md). Run `make all` before claiming routing or governance changes are verified.
+Multi-agent Socratic consensus on project goals and the nine skills (Rounds 1–77, options A–DZ) is recorded in [plans/multi-agent-panel-consensus-2026-06-25.md](plans/multi-agent-panel-consensus-2026-06-25.md). Run `make all` before claiming routing or governance changes are verified.

 ## Directory Map

reflective-prompt-library/plans/QUALITY_GATES_SUMMARY.md

Lines changed: 1 addition & 1 deletion
@@ -314,7 +314,7 @@ ROUTE-002 measures unseen phrasing separately from ROUTE-001. Round 7 (2026-06-2
 2. **ROUTE-001/002/003 in CI** — 128 + 102 + 53 paraphrases at 100% consistency (seeded fixtures); `validate_route_fixture.py` gates minimum coverage
 3. **Governance validators** — links, lint, governance metadata, PROJECT_KNOWLEDGE, benchmark fixture, skill examples
 4. **Harness policy docs** — CONTRIBUTING, AGENTS, SKILL_INSTALLATION, maintenance playbook
-5. **Doc anti-drift** — `test_routing_contract.py`, cheatsheet parity tests, `test_readme_governance.py`, `test_thinking_prompts_eval_harness.py`, `test_engineering_prompts_eval_harness.py`, `test_prompt_cross_links.py`, `test_core_prompts_eval_harness.py`, `test_agent_prompts_eval_harness.py`, `test_context_prompts_eval_harness.py`, `test_domain_prompts_eval_harness.py`, `test_repo_prompts_eval_harness.py` (400+ pytest anti-drift suite in CI)
+5. **Doc anti-drift** — `test_routing_contract.py`, cheatsheet parity tests, `test_readme_governance.py`, `test_thinking_prompts_eval_harness.py`, `test_engineering_prompts_eval_harness.py`, `test_prompt_cross_links.py`, `test_core_prompts_eval_harness.py`, `test_agent_prompts_eval_harness.py`, `test_context_prompts_eval_harness.py`, `test_domain_prompts_eval_harness.py`, `test_repo_prompts_eval_harness.py`, `test_validate_governance.py`, `test_validate_links.py`, `test_lint_skills.py` (410+ pytest anti-drift suite in CI)

 ### Ongoing maintenance (not blockers)

reflective-prompt-library/plans/multi-agent-panel-consensus-2026-06-25.md

Lines changed: 60 additions & 1 deletion
@@ -1934,4 +1934,63 @@ User directive (repeat): review prompts, plans, skills, and Socratic/critical-th

 ## Panel status (updated)

-**Resealed 2026-06-25** after **Round 76** (options DU–DW). Repository-template contract pass complete; full prompt-library contract sweep (`00-core`–`06-repo`) finished. Governance pytest mirrors remain recurrence-gated.
+**Resealed 2026-06-25** after **Round 76** (options DU–DW). Repository-template contract pass complete; full prompt-library contract sweep (`00-core`–`06-repo`) finished. Governance pytest mirrors remain recurrence-gated.
+
+## Round 77 — Governance pytest mirrors (2026-06-25)
+
+User directive (repeat): review prompts, plans, skills, and Socratic/critical-thinking lenses in parallel until all roles agree, then implement.
+
+### DX: Pytest mirrors for `validate_governance`, `validate_links`, `lint_skills`?
+
+| Lens | Position |
+| --- | --- |
+| Opus | **Agree** — `CANONICAL_CONTEXT_LOAD` table must be pytest-guarded; mirrors catch drift before `make validate` |
+| Codex | **Agree** — negative fixtures falsify without duplicating validator logic; live-repo smoke tests |
+| Gemini | **Agree** — context_load deferral is cost-relevant; pytest mirrors are cheap |
+| Composer | **Agree** — IDE sessions edit SKILL frontmatter; mirrors close DH backlog |
+| Sakana | **Agree** — no tenth skill; mirrors protect existing nine |
+| GLM | **Agree** — English canonical metadata; TW routing unaffected |
+
+**Socratic Q:** Why mirrors now after Round 76 rejected DV?
+**Answer:** Full prompt-library contract sweep is complete; user re-triggered panel cycle; recurrence gate for DH backlog is satisfied.
+
+**Consensus:** **Agree** — add `test_validate_governance.py`, `test_validate_links.py`, `test_lint_skills.py` with live-repo pass checks + negative tmp_path cases; sync QUALITY_GATES pytest floor.
+
+### DY: Router / holdout / tenth skill?
+
+| Lens | Position |
+| --- | --- |
+| All six | **Reject** — ROUTE-001/002/003 at 100%; nine-skill freeze holds |
+
+### DZ: LLM benchmark in CI?
+
+| Lens | Position |
+| --- | --- |
+| All six | **Reject** — manual `benchmark_tasks.py` only (Rounds 5–6) |
+
+### Round 77 verdict table
+
+| ID | Option | Verdict | Action |
+| --- | --- | --- | --- |
+| DX | Governance pytest mirrors | **Agree** | 3 test modules + QUALITY_GATES sync |
+| DY | Router/holdout/tenth skill | **Reject** | no change |
+| DZ | LLM benchmark in CI | **Reject** | no change |
+
+**All roles agree.**
+
+## Implemented Changes (Round 77)
+
+- `plans/tests/test_validate_governance.py`: `CANONICAL_CONTEXT_LOAD` parity + live 9/9 pass + negative fixtures
+- `plans/tests/test_validate_links.py`: live-repo zero errors + broken link / frontmatter negatives
+- `plans/tests/test_lint_skills.py`: live-repo zero lint errors + nine SKILL.md detection + negative fixture
+- `QUALITY_GATES_SUMMARY.md`: governance mirror tests; pytest floor 410+
+- `PROJECT_KNOWLEDGE.md`: Decision Index Round 77 entry
+- `README.md`, `reflective-prompt-library/README.md`, `test_readme_governance.py`: panel round 77 sync
+
+## Verification (Round 77)
+
+- `make all`: pytest + ROUTE-001/002/003 100%
+
+## Panel status (updated)
+
+**Resealed 2026-06-25** after **Round 77** (options DX–DZ). Governance validator pytest mirrors complete; prompt-library contract sweep and governance anti-drift suite closed. Holdout expansion before router tuning remains recurrence-gated maintenance.

reflective-prompt-library/plans/tests/test_lint_skills.py

Lines changed: 55 additions & 0 deletions
@@ -0,0 +1,55 @@
+"""Pytest mirrors for lint_skills.py (Round 77 anti-drift)."""
+
+import sys
+from pathlib import Path
+
+import pytest
+
+sys.path.insert(0, str(Path(__file__).parent.parent))
+
+from lint_skills import SkillLinter  # noqa: E402
+from validate_skill_examples import CORE_SKILLS  # noqa: E402
+
+REPO_ROOT = Path(__file__).parent.parent.parent.parent
+
+
+@pytest.fixture(scope="module")
+def lint_results():
+    return SkillLinter(str(REPO_ROOT)).lint_all()
+
+
+def test_live_repo_has_no_lint_errors(lint_results):
+    assert lint_results["total_errors"] == 0, [
+        (item["file"], item["errors"])
+        for item in lint_results["file_results"]
+        if item["errors"]
+    ]
+
+
+def test_all_nine_core_skills_are_linted_as_skills(lint_results):
+    skill_files = {
+        item["file"]
+        for item in lint_results["file_results"]
+        if item["type"] == "skill" and item["file"].endswith("SKILL.md")
+    }
+    for skill in CORE_SKILLS:
+        expected = f"reflective-prompt-library/skills/{skill}/SKILL.md"
+        assert expected in skill_files, skill
+
+
+def test_skill_missing_description_reports_error(tmp_path):
+    skill_dir = tmp_path / "reflective-prompt-library" / "skills" / "reflective-brief"
+    skill_dir.mkdir(parents=True)
+    skill_file = skill_dir / "SKILL.md"
+    skill_file.write_text(
+        """---
+name: reflective-brief
+license: MIT
+---
+# Brief
+""",
+        encoding="utf-8",
+    )
+    result = SkillLinter(str(tmp_path)).lint_file(skill_file)
+    assert result["type"] == "skill"
+    assert any("description" in err.lower() for err in result["errors"])
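
The negative fixture above writes frontmatter with `name` and `license` but no `description`, and expects an error that mentions the missing field. `lint_skills.py` itself is not part of this diff, so the following is only a sketch of what such a check could look like. `parse_frontmatter` and `lint_skill_text` are hypothetical helpers, and the parser assumes flat `key: value` frontmatter rather than full YAML.

```python
def parse_frontmatter(text: str) -> dict[str, str]:
    """Collect flat 'key: value' pairs between the leading '---' fences (not full YAML)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":  # closing fence ends the frontmatter
            break
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def lint_skill_text(text: str, required=("name", "description")) -> list[str]:
    """Return one error per required field that is absent or empty (hypothetical rule set)."""
    fields = parse_frontmatter(text)
    return [f"Missing required field: {name}" for name in required if not fields.get(name)]


# Same frontmatter as the negative fixture in the test module above.
fixture_text = "---\nname: reflective-brief\nlicense: MIT\n---\n# Brief\n"
errors = lint_skill_text(fixture_text)
assert any("description" in err.lower() for err in errors)
```

Asserting on a substring of the message, as the test does, keeps the mirror loosely coupled to the linter's exact wording.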

reflective-prompt-library/plans/tests/test_readme_governance.py

Lines changed: 2 additions & 2 deletions
@@ -10,8 +10,8 @@
 METHODOLOGY_MAP_EN = Path(__file__).parent.parent.parent / "METHODOLOGY_MAP.md"
 SKILL_MAP = Path(__file__).parent.parent.parent / "skills" / "skill-map.md"

-CURRENT_PANEL_ROUND = "76"
-CURRENT_PANEL_OPTIONS = "A–DW"
+CURRENT_PANEL_ROUND = "77"
+CURRENT_PANEL_OPTIONS = "A–DZ"


 @pytest.fixture(scope="module")
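
The two constants bumped above have to agree with the README sentence "(Rounds 1–77, options A–DZ)". The tests that consume them are outside this hunk, so this is only a guess at the kind of parity check involved; `panel_range` is a hypothetical helper, and the pattern assumes the exact phrasing used in `reflective-prompt-library/README.md`.

```python
import re

# \u2013 is the en dash used in "Rounds 1–77" and "A–DZ".
PANEL_RANGE_RE = re.compile(r"Rounds 1\u2013(\d+), options A\u2013([A-Z]+)")


def panel_range(text: str):
    """Return (last_round, last_option) if the text states a panel range, else None."""
    match = PANEL_RANGE_RE.search(text)
    return match.groups() if match else None


readme_sentence = (
    "Multi-agent Socratic consensus on project goals and the nine skills "
    "(Rounds 1\u201377, options A\u2013DZ) is recorded in the panel record."
)
assert panel_range(readme_sentence) == ("77", "DZ")
```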

reflective-prompt-library/plans/tests/test_validate_governance.py

Lines changed: 84 additions & 0 deletions
@@ -0,0 +1,84 @@
+"""Pytest mirrors for validate_governance.py (Round 77 anti-drift)."""
+
+import sys
+from pathlib import Path
+
+import pytest
+
+sys.path.insert(0, str(Path(__file__).parent.parent))
+
+from validate_governance import CANONICAL_CONTEXT_LOAD, GovernanceValidator  # noqa: E402
+from validate_skill_examples import CORE_SKILLS  # noqa: E402
+
+REPO_ROOT = Path(__file__).parent.parent.parent.parent
+
+
+@pytest.fixture(scope="module")
+def governance_results():
+    return GovernanceValidator(str(REPO_ROOT)).validate_all()
+
+
+def test_canonical_context_load_matches_core_skills():
+    assert set(CANONICAL_CONTEXT_LOAD) == set(CORE_SKILLS)
+    assert len(CANONICAL_CONTEXT_LOAD) == 9
+
+
+def test_all_skills_pass_governance_validation(governance_results):
+    assert governance_results["total_skills"] == 9
+    assert governance_results["invalid_skills"] == 0, governance_results["errors"]
+
+
+@pytest.mark.parametrize("skill,expected_load", sorted(CANONICAL_CONTEXT_LOAD.items()))
+def test_live_skill_context_load_matches_panel_table(skill, expected_load):
+    skill_path = (
+        REPO_ROOT / "reflective-prompt-library" / "skills" / skill / "SKILL.md"
+    )
+    validator = GovernanceValidator(str(REPO_ROOT))
+    frontmatter = validator.extract_frontmatter(skill_path.read_text(encoding="utf-8"))
+    assert frontmatter.get("context_load", "").lower() == expected_load
+
+
+def test_wrong_context_load_fails_validation(tmp_path):
+    skills_dir = tmp_path / "reflective-prompt-library" / "skills" / "reflective-dispatch"
+    skills_dir.mkdir(parents=True)
+    skills_dir.joinpath("SKILL.md").write_text(
+        """---
+name: reflective-dispatch
+description: test
+risk_level: low
+human_review_required: false
+external_io: false
+context_load: high
+---
+# Test
+""",
+        encoding="utf-8",
+    )
+    results = GovernanceValidator(str(tmp_path)).validate_all()
+    assert results["invalid_skills"] == 1
+    assert any(
+        "context_load must be 'low'" in err
+        for item in results["errors"]
+        for err in item["errors"]
+    )
+
+
+def test_missing_governance_field_fails_validation(tmp_path):
+    skills_dir = tmp_path / "reflective-prompt-library" / "skills" / "reflective-brief"
+    skills_dir.mkdir(parents=True)
+    skills_dir.joinpath("SKILL.md").write_text(
+        """---
+name: reflective-brief
+description: test
+---
+# Test
+""",
+        encoding="utf-8",
+    )
+    results = GovernanceValidator(str(tmp_path)).validate_all()
+    assert results["invalid_skills"] == 1
+    assert any(
+        "Missing required field" in err
+        for item in results["errors"]
+        for err in item["errors"]
+    )
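
The tests above treat `CANONICAL_CONTEXT_LOAD` as a skill-to-value table and expect a `context_load must be 'low'` error when `reflective-dispatch` declares `high`. The validator source is not shown in this diff, so the sketch below only illustrates a table-driven check of that shape. `EXAMPLE_CONTEXT_LOAD` and `check_context_load` are hypothetical, and the `reflective-brief` value is invented for illustration; only the `low` value for `reflective-dispatch` is implied by the test.

```python
# Hypothetical stand-in for the real table; the 'medium' entry is an assumption.
EXAMPLE_CONTEXT_LOAD = {
    "reflective-dispatch": "low",
    "reflective-brief": "medium",
}


def check_context_load(skill: str, frontmatter: dict[str, str]) -> list[str]:
    """Compare a skill's declared context_load against the example table."""
    expected = EXAMPLE_CONTEXT_LOAD.get(skill)
    if expected is None:
        return [f"{skill}: not in canonical table"]
    actual = str(frontmatter.get("context_load", "")).lower()
    if not actual:
        return ["Missing required field: context_load"]
    if actual != expected:
        return [f"context_load must be '{expected}' (got '{actual}')"]
    return []


# Mirrors the negative fixture: reflective-dispatch declares 'high'.
errors = check_context_load("reflective-dispatch", {"context_load": "high"})
assert any("context_load must be 'low'" in err for err in errors)
```

Keeping the expected values in one table is what lets a parametrized test walk every skill without repeating validator logic.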

reflective-prompt-library/plans/tests/test_validate_links.py

Lines changed: 77 additions & 0 deletions
@@ -0,0 +1,77 @@
+"""Pytest mirrors for validate_links.py (Round 77 anti-drift)."""
+
+import sys
+from pathlib import Path
+
+import pytest
+
+sys.path.insert(0, str(Path(__file__).parent.parent))
+
+from validate_links import LinkValidator  # noqa: E402
+
+REPO_ROOT = Path(__file__).parent.parent.parent.parent
+
+
+@pytest.fixture(scope="module")
+def link_results():
+    return LinkValidator(str(REPO_ROOT)).validate_all()
+
+
+def test_live_repo_has_no_link_errors(link_results):
+    assert link_results["total_errors"] == 0, {
+        "ref_file": link_results["ref_file_errors"],
+        "ref_snippet": link_results["ref_snippet_errors"],
+        "markdown": link_results["markdown_link_errors"],
+        "frontmatter": link_results["frontmatter_errors"],
+    }
+
+
+def test_broken_markdown_link_is_detected(tmp_path):
+    bad_file = tmp_path / "broken.md"
+    bad_file.write_text("[missing](does-not-exist.md)\n", encoding="utf-8")
+    results = {
+        "ref_file_errors": [],
+        "ref_snippet_errors": [],
+        "markdown_link_errors": [],
+        "frontmatter_errors": [],
+        "total_files": 0,
+        "total_errors": 0,
+    }
+    LinkValidator(str(tmp_path)).validate_markdown_links(
+        bad_file.read_text(encoding="utf-8"),
+        bad_file,
+        bad_file.relative_to(tmp_path),
+        results,
+    )
+    assert results["markdown_link_errors"]
+
+
+def test_skill_frontmatter_requires_name_and_description(tmp_path):
+    skill_dir = tmp_path / "reflective-prompt-library" / "skills" / "reflective-dispatch"
+    skill_dir.mkdir(parents=True)
+    skill_file = skill_dir / "SKILL.md"
+    skill_file.write_text(
+        """---
+license: MIT
+---
+# Missing name and description
+""",
+        encoding="utf-8",
+    )
+    results = {
+        "ref_file_errors": [],
+        "ref_snippet_errors": [],
+        "markdown_link_errors": [],
+        "frontmatter_errors": [],
+        "total_files": 0,
+        "total_errors": 0,
+    }
+    LinkValidator(str(tmp_path)).validate_skill_frontmatter(
+        skill_file.read_text(encoding="utf-8"),
+        skill_file,
+        skill_file.relative_to(tmp_path),
+        results,
+    )
+    errors = [item["error"] for item in results["frontmatter_errors"]]
+    assert any("Missing required field: name" in err for err in errors)
+    assert any("Missing required field: description" in err for err in errors)
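
The broken-link test above feeds `[missing](does-not-exist.md)` to the validator and expects a markdown link error. `validate_links.py` is not included in this diff; the sketch below shows one plausible way to detect such a link, assuming relative targets resolve against the linking file's directory. `LINK_RE`, `broken_relative_links`, and `demo` are hypothetical names, and the regex covers only simple inline links.

```python
import re
import tempfile
from pathlib import Path

# Matches simple inline links like [text](target); reference-style links are ignored.
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def broken_relative_links(md_file: Path) -> list[str]:
    """Return relative link targets in md_file that do not exist on disk."""
    broken = []
    text = md_file.read_text(encoding="utf-8")
    for target in LINK_RE.findall(text):
        if target.startswith(("http://", "https://", "mailto:", "#")):
            continue  # external links and same-page anchors are out of scope here
        path_part = target.split("#", 1)[0]  # drop any '#fragment' suffix
        if not (md_file.parent / path_part).exists():
            broken.append(target)
    return broken


def demo() -> list[str]:
    """Build a throwaway tree with one valid and one broken link, then check it."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "ok.md").write_text("# ok\n", encoding="utf-8")
        page = root / "page.md"
        page.write_text(
            "[fine](ok.md) [missing](does-not-exist.md) [web](https://example.com)\n",
            encoding="utf-8",
        )
        return broken_relative_links(page)


assert demo() == ["does-not-exist.md"]
```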
