Round 82: strict Primary workflow surfaces graph parity

johnteee · johnteee · commit e30d3582eccd · 2026-06-25T16:15:13.000+08:00
Six-lens panel (options EQ–ET) agreed to align thinking-lens Purpose
preambles exactly with SKILL_THINKING_SOURCES consumers, add a pytest
guard, and move adjacent workflow notes to Scope. Governance docs synced
to Round 82.
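
To illustrate the parity rule described above, here is a minimal sketch rather than the repository's test. It assumes the consumer set for `counterargument.md` is the three skills left on the trimmed line in this commit; `EXPECTED_CONSUMERS` and `primary_surface_skills` are hypothetical names, and the `OLD`/`NEW` strings are abbreviated copies of the before and after lines in the diff below.

```python
import re

# Hypothetical stand-in for one lens's consumer set. Inferred from the trimmed
# counterargument.md line in this commit, not read from the repository.
EXPECTED_CONSUMERS = (
    "reflective-implement",
    "reflective-minimality",
    "reflective-review",
)


def primary_surface_skills(purpose_text: str) -> tuple[str, ...]:
    """Return the sorted, de-duplicated skills on the Primary workflow surfaces line."""
    # Without re.DOTALL, "." stops at the newline, so only that line is captured.
    match = re.search(r"Primary workflow surfaces:(.*)", purpose_text)
    if match is None:
        raise ValueError("missing Primary workflow surfaces line")
    return tuple(sorted(set(re.findall(r"`(reflective-[a-z-]+)`", match.group(1)))))


# Abbreviated before/after Purpose lines from counterargument.md.
OLD = (
    "Primary workflow surfaces: `reflective-implement` for disputed implementation "
    "choices; `reflective-review` and `reflective-minimality` for critique and "
    "anti-bloat; escalate to `reflective-risk` when trust-boundary signals appear."
)
NEW = (
    "Primary workflow surfaces: `reflective-implement` for disputed implementation "
    "choices; `reflective-review` and `reflective-minimality` for critique and "
    "anti-bloat."
)

old_listed = primary_surface_skills(OLD)
new_listed = primary_surface_skills(NEW)

# Skills the old line named that are not consumers under the assumed graph.
extra = sorted(set(old_listed) - set(EXPECTED_CONSUMERS))
```

Under these assumptions the old line over-lists `reflective-risk`, which is why the escalation note has to live outside the primary line for an exact-match check to pass.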
diff --git a/README.md b/README.md
@@ -21,7 +21,7 @@ Full library docs: [reflective-prompt-library/README.md](reflective-prompt-libra
 ## Governance
 
 - **Contributing:** [CONTRIBUTING.md](CONTRIBUTING.md) — quality gates, routing maintenance (R8–R12), `make all`
-- **Panel record:** [multi-agent-panel-consensus](reflective-prompt-library/plans/multi-agent-panel-consensus-2026-06-25.md) — six-lens Socratic consensus (Rounds 1–81)
+- **Panel record:** [multi-agent-panel-consensus](reflective-prompt-library/plans/multi-agent-panel-consensus-2026-06-25.md) — six-lens Socratic consensus (Rounds 1–82)
 - **Operator playbook:** [GLOSSARY.md](reflective-prompt-library/GLOSSARY.md) — Governance Maintenance Playbook
 
 The repository contains:
diff --git a/reflective-prompt-library/01-thinking/counterargument.md b/reflective-prompt-library/01-thinking/counterargument.md
@@ -4,7 +4,7 @@ Use this to prevent excessive optimism, overengineering, or AI flattery.
 
 ## Purpose
 
-Stress-test optimism, overengineering, and AI flattery before committing resources. Primary workflow surfaces: `reflective-implement` for disputed implementation choices; `reflective-review` and `reflective-minimality` for critique and anti-bloat; escalate to `reflective-risk` when trust-boundary or blast-radius signals appear.
+Stress-test optimism, overengineering, and AI flattery before committing resources. Primary workflow surfaces: `reflective-implement` for disputed implementation choices; `reflective-review` and `reflective-minimality` for critique and anti-bloat.
 
 ## Scope
 
diff --git a/reflective-prompt-library/01-thinking/socratic-reviewer.md b/reflective-prompt-library/01-thinking/socratic-reviewer.md
@@ -4,12 +4,13 @@ Suitable for requirements interviews, life decisions, product direction, busines
 
 ## Purpose
 
-Clarify the real question before choosing a direction. Primary workflow surfaces: `reflective-brief` for goal and assumption clarification; `reflective-dispatch` for routing; `reflective-research` for multi-voice synthesis; `reflective-handoff-retro` for session transfer; escalate to `reflective-spec-plan` when scope is clear enough to plan.
+Clarify the real question before choosing a direction. Primary workflow surfaces: `reflective-dispatch` for routing; `reflective-research` for multi-voice synthesis; `reflective-handoff-retro` for session transfer.
 
 ## Scope
 
 - In scope: requirements interviews, product direction, technical selection, learning strategy, research question definition.
-- Out of scope: code implementation (`reflective-implement`), production risk gating (`reflective-risk`), source-backed external research (`reflective-research`).
+- Out of scope: code implementation (`reflective-implement`), production risk gating (`reflective-risk`), source-backed external research (`reflective-research`), ticket/spec drafting (`reflective-spec-plan` — use after dispatch when scope is clear).
+- Adjacent: pair with `reflective-brief` when assumptions are still open; dispatch selects strictness before spec or implement work.
 
 ## Acceptance Criteria
 
diff --git a/reflective-prompt-library/01-thinking/why-what-how-done.md b/reflective-prompt-library/01-thinking/why-what-how-done.md
@@ -4,12 +4,13 @@ Use this as the core gate prompt before committing to a direction.
 
 ## Purpose
 
-Gate a task through Why / What / How / Done before choosing strictness or workflow depth. Primary workflow surfaces: `reflective-brief` then `reflective-dispatch` for orchestration level selection; `reflective-spec-plan` when framing becomes ticket or spec work.
+Gate a task through Why / What / How / Done before choosing strictness or workflow depth. Primary workflow surfaces: `reflective-brief` for pre-commitment gating; `reflective-spec-plan` when framing becomes ticket or spec work.
 
 ## Scope
 
 - In scope: pre-commitment checks on goal, scope, method, and completion evidence for a single task or feature.
 - Out of scope: post-hoc code review (`reflective-review`), handoff retros (`reflective-handoff-retro`), or detailed test implementation.
+- Adjacent: after brief framing, `reflective-dispatch` selects orchestration level before deeper spec or implement work.
 
 ## Acceptance Criteria
 
diff --git a/reflective-prompt-library/GLOSSARY.md b/reflective-prompt-library/GLOSSARY.md
@@ -337,7 +337,7 @@ Curated top-of-cheatsheet summary of high-confusion routing traps (ROUTE-002 hol
 
 ## Governance Maintenance Playbook / 治理維護手冊
 
-Ongoing upkeep after panel close (Rounds 1–81). Not agent instructions — operator checklist.
+Ongoing upkeep after panel close (Rounds 1–82). Not agent instructions — operator checklist.
 
 **Operational test:** Before router tuning, add fresh ROUTE-002/003 holdout phrases; run `make all`; record decisions in `PROJECT_KNOWLEDGE.md` Decision Index when governance surface changes.
 
@@ -354,3 +354,4 @@ Ongoing upkeep after panel close (Rounds 1–81). Not agent instructions — ope
 11. When changing Module Contract subsections on workflow skills, keep `Escalation:` present and run `test_skill_module_contract.py`.
 12. When adding or editing `01-thinking/` lenses, keep `## Human Review` in the preamble (routes to `reflective-risk`) and run `test_thinking_prompts_eval_harness.py`.
 13. When editing workflow skill Escalation bullets, cite only frozen `reflective-*` skills; run `test_skill_module_contract.py` escalation route guard.
+14. When editing `01-thinking/` Purpose preambles, keep `Primary workflow surfaces` aligned exactly with `SKILL_THINKING_SOURCES` via `test_thinking_lens_primary_surfaces_match_consumer_graph`; put escalations and adjacent workflow notes in Scope or Human Review, not on the primary line.
diff --git a/reflective-prompt-library/PROJECT_KNOWLEDGE.md b/reflective-prompt-library/PROJECT_KNOWLEDGE.md
@@ -75,6 +75,7 @@ deferred promotions are recurrence-gated — see [panel backlog](plans/multi-age
 > Pointers to the causal trail — plans, reflections, tests, commits. Detail is
 > not duplicated here; this is a map, not an archive.
 
+- 2026-06-25 Round 82 panel — strict Primary workflow surfaces ↔ skill graph parity + preamble trim → [record](plans/multi-agent-panel-consensus-2026-06-25.md)
 - 2026-06-25 Round 81 panel — thinking-lens Human Review preamble guards + Escalation route-target anti-drift → [record](plans/multi-agent-panel-consensus-2026-06-25.md)
 - 2026-06-25 Round 80 panel — Module Contract Escalation anti-drift + thinking-lens preamble consumer guards → [record](plans/multi-agent-panel-consensus-2026-06-25.md)
 - 2026-06-25 Round 79 panel — bidirectional thinking-lens ↔ workflow skill preamble cross-links + reciprocal pytest → [record](plans/multi-agent-panel-consensus-2026-06-25.md)
diff --git a/reflective-prompt-library/README.md b/reflective-prompt-library/README.md
@@ -30,7 +30,7 @@ Pick **Strictness L1–L6** first (`skills/reflective-dispatch/SKILL.md`, [GLOSS
 
 ## Governance Panel Record
 
-Multi-agent Socratic consensus on project goals and the nine skills (Rounds 1–81, options A–EP) is recorded in [plans/multi-agent-panel-consensus-2026-06-25.md](plans/multi-agent-panel-consensus-2026-06-25.md). Run `make all` before claiming routing or governance changes are verified.
+Multi-agent Socratic consensus on project goals and the nine skills (Rounds 1–82, options A–ET) is recorded in [plans/multi-agent-panel-consensus-2026-06-25.md](plans/multi-agent-panel-consensus-2026-06-25.md). Run `make all` before claiming routing or governance changes are verified.
 
 ## Directory Map
 
diff --git a/reflective-prompt-library/plans/QUALITY_GATES_SUMMARY.md b/reflective-prompt-library/plans/QUALITY_GATES_SUMMARY.md
@@ -314,7 +314,7 @@ ROUTE-002 measures unseen phrasing separately from ROUTE-001. Round 7 (2026-06-2
 2. **ROUTE-001/002/003 in CI** — 128 + 102 + 53 paraphrases at 100% consistency (seeded fixtures); `validate_route_fixture.py` gates minimum coverage
 3. **Governance validators** — links, lint, governance metadata, PROJECT_KNOWLEDGE, benchmark fixture, skill examples
 4. **Harness policy docs** — CONTRIBUTING, AGENTS, SKILL_INSTALLATION, maintenance playbook
-5. **Doc anti-drift** — `test_routing_contract.py`, cheatsheet parity tests, `test_readme_governance.py`, `test_thinking_prompts_eval_harness.py`, `test_engineering_prompts_eval_harness.py`, `test_prompt_cross_links.py`, `test_core_prompts_eval_harness.py`, `test_agent_prompts_eval_harness.py`, `test_context_prompts_eval_harness.py`, `test_domain_prompts_eval_harness.py`, `test_repo_prompts_eval_harness.py`, `test_validate_governance.py`, `test_validate_links.py`, `test_lint_skills.py`, `test_skill_module_contract.py` (Escalation subsection + Trigger/Methods/Output/Never; 450+ pytest anti-drift suite in CI); reciprocal thinking-lens ↔ skill checks in `test_prompt_cross_links.py`; Human Review + Escalation route-target guards in thinking/skill contract tests
+5. **Doc anti-drift** — `test_routing_contract.py`, cheatsheet parity tests, `test_readme_governance.py`, `test_thinking_prompts_eval_harness.py`, `test_engineering_prompts_eval_harness.py`, `test_prompt_cross_links.py`, `test_core_prompts_eval_harness.py`, `test_agent_prompts_eval_harness.py`, `test_context_prompts_eval_harness.py`, `test_domain_prompts_eval_harness.py`, `test_repo_prompts_eval_harness.py`, `test_validate_governance.py`, `test_validate_links.py`, `test_lint_skills.py`, `test_skill_module_contract.py` (Escalation subsection + Trigger/Methods/Output/Never; 450+ pytest anti-drift suite in CI); reciprocal thinking-lens ↔ skill checks in `test_prompt_cross_links.py` (including strict Primary workflow surfaces parity via `test_thinking_lens_primary_surfaces_match_consumer_graph`); Human Review + Escalation route-target guards in thinking/skill contract tests
 
 ### Ongoing maintenance (not blockers)
 
@@ -384,4 +384,4 @@ Phase 1 quality-gate tooling and documentation are **complete**. Routing consist
 - ✅ Benchmark fixture gate plus optional manual benchmark runs
 - ✅ Research-backed design decisions
 
-The project is positioned to grow sustainably with quality discipline built in from the start. **No open implementation blockers** remain from panel Rounds 1–81; work is recurrence-gated maintenance per playbook. The next measurable quality target is **holdout expansion before router tuning** and optional manual baseline-vs-skill benchmark runs — not shipping new core skills without promotion evidence.
+The project is positioned to grow sustainably with quality discipline built in from the start. **No open implementation blockers** remain from panel Rounds 1–82; work is recurrence-gated maintenance per playbook. The next measurable quality target is **holdout expansion before router tuning** and optional manual baseline-vs-skill benchmark runs — not shipping new core skills without promotion evidence.
diff --git a/reflective-prompt-library/plans/multi-agent-panel-consensus-2026-06-25.md b/reflective-prompt-library/plans/multi-agent-panel-consensus-2026-06-25.md
@@ -2196,3 +2196,46 @@ User directive (repeat): review prompts, plans, skills, and Socratic/critical-th
 **Resealed 2026-06-25** after **Round 81** (options EM–EP). Thinking-lens Human Review preambles complete; Escalation route-target anti-drift closed. Holdout expansion remains recurrence-gated maintenance.
 
 
+---
+
+## Round 82 — Strict Primary workflow surfaces graph parity (2026-06-25)
+
+**Options EQ–ET** | Six-lens panel (Opus, Codex, Gemini, Composer, Sakana, GLM)
+
+### Round 82 options
+
+| ID | Proposal | Verdict |
+| --- | --- | --- |
+| EQ | Strict `Primary workflow surfaces` ↔ `SKILL_THINKING_SOURCES` parity + preamble trim + pytest | **Agree** |
+| ER | Expand skill Prompt Sources to match narrative overlisting | **Reject** |
+| ES | ROUTE holdout expansion | **Defer** |
+| ET | Router / tenth skill / benchmark CI | **Reject** |
+
+### Round 82 verdict table
+
+| ID | Option | Verdict | Action |
+| --- | --- | --- | --- |
+| EQ | Primary surfaces exact graph | **Agree** | trim preambles + `test_thinking_lens_primary_surfaces_match_consumer_graph` |
+| ER | Expand graph to match prose | **Reject** | `SKILL_THINKING_SOURCES` stays authoritative from skill Prompt Sources |
+| ES | Holdout expansion | **Defer** | maintenance |
+| ET | Router/tenth skill/benchmark CI | **Reject** | no change |
+
+**All roles agree.**
+
+## Implemented Changes (Round 82)
+
+- `01-thinking/counterargument.md`, `socratic-reviewer.md`, `why-what-how-done.md`: Primary workflow surfaces trimmed to graph consumers; adjacent workflow notes moved to Scope
+- `plans/tests/test_prompt_cross_links.py`: `_primary_workflow_surfaces_skills` + `test_thinking_lens_primary_surfaces_match_consumer_graph`
+- `GLOSSARY.md`: playbook Rounds 1–82; step 14 for strict primary-surfaces parity
+- `QUALITY_GATES_SUMMARY.md`: primary-surfaces parity note; panel Rounds 1–82
+- `PROJECT_KNOWLEDGE.md`: Decision Index Round 82 entry
+- `README.md`, `reflective-prompt-library/README.md`, `test_readme_governance.py`: panel round 82 sync
+
+## Verification (Round 82)
+
+- `make all`: pytest + ROUTE-001/002/003 100%
+
+## Panel status (updated)
+
+**Resealed 2026-06-25** after **Round 82** (options EQ–ET). Thinking-lens Primary workflow surfaces now match the inverted skill graph exactly. Holdout expansion remains recurrence-gated maintenance.
+
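
The "inverted skill graph" mentioned in the panel status can be pictured with the following sketch. The real shape of `SKILL_THINKING_SOURCES` is not shown in this diff, so the mapping below (skill name to cited lens files) is an assumption, and its entries are illustrative values inferred from the trimmed Purpose lines in this commit. `invert_skill_graph` is a hypothetical helper name.

```python
from collections import defaultdict

# Assumed shape: skill name -> thinking lenses cited in its Prompt Sources.
# Entries are illustrative, inferred from the trimmed preambles in this commit.
SKILL_THINKING_SOURCES = {
    "reflective-brief": ("why-what-how-done.md",),
    "reflective-spec-plan": ("why-what-how-done.md",),
    "reflective-dispatch": ("socratic-reviewer.md",),
    "reflective-research": ("socratic-reviewer.md",),
    "reflective-handoff-retro": ("socratic-reviewer.md",),
    "reflective-implement": ("counterargument.md",),
    "reflective-review": ("counterargument.md",),
    "reflective-minimality": ("counterargument.md",),
}


def invert_skill_graph(sources: dict[str, tuple[str, ...]]) -> dict[str, tuple[str, ...]]:
    """Turn skill -> lenses into lens -> sorted tuple of consumer skills."""
    consumers: dict[str, set[str]] = defaultdict(set)
    for skill, lenses in sources.items():
        for lens in lenses:
            consumers[lens].add(skill)
    # Sorting the consumers makes an exact tuple comparison order-independent.
    return {lens: tuple(sorted(skills)) for lens, skills in sorted(consumers.items())}


THINKING_LENS_SKILL_CONSUMERS = invert_skill_graph(SKILL_THINKING_SOURCES)
```

Deriving the lens-side mapping from the skill-side source keeps a single authoritative graph, which matches the Round 82 decision to reject option ER (expanding the graph to fit the prose).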
diff --git a/reflective-prompt-library/plans/tests/test_glossary_structure.py b/reflective-prompt-library/plans/tests/test_glossary_structure.py
@@ -30,10 +30,10 @@ def test_round_boundary_terms_present(glossary_text: str):
         assert heading in glossary_text, f"missing glossary section: {heading}"
 
 
-def test_maintenance_playbook_references_round_81(glossary_text: str):
+def test_maintenance_playbook_references_round_82(glossary_text: str):
     playbook = glossary_text.split("## Governance Maintenance Playbook", 1)[1]
-    assert "Rounds 1–81" in playbook or "Rounds 1-80" in playbook
-    assert "Rounds 1–80" not in playbook and "Rounds 1-79" not in playbook
+    assert "Rounds 1–82" in playbook or "Rounds 1-82" in playbook
+    assert "Rounds 1–81" not in playbook and "Rounds 1-81" not in playbook
 
 
 
diff --git a/reflective-prompt-library/plans/tests/test_prompt_cross_links.py b/reflective-prompt-library/plans/tests/test_prompt_cross_links.py
@@ -1,5 +1,6 @@
 """Anti-drift: thinking lenses, engineering/agent/context/domain/repo prompts, and workflow skills cross-link."""
 
+import re
 from pathlib import Path
 
 import pytest
@@ -239,6 +240,16 @@ def _preamble(path: Path) -> str:
     return path.read_text(encoding="utf-8").split("```", 1)[0]
 
 
+
+def _primary_workflow_surfaces_skills(preamble: str) -> tuple[str, ...]:
+    """Skills named on the Purpose Primary workflow surfaces line only."""
+    purpose = preamble.split("## Scope", 1)[0]
+    match = re.search(r"Primary workflow surfaces:(.*)", purpose, re.DOTALL)
+    assert match, "missing Primary workflow surfaces line in Purpose preamble"
+    line = match.group(1).split("\n", 1)[0]
+    return tuple(sorted(set(re.findall(r"`(reflective-[a-z-]+)`", line))))
+
+
 def _prompt_sources_section(skill_path: Path) -> str:
     text = skill_path.read_text(encoding="utf-8")
     marker = "## Prompt Sources"
@@ -302,6 +313,19 @@ def test_thinking_lens_preamble_lists_consumer_skills(lens_ref: str, consumer_sk
         assert skill in preamble, f"{lens_ref} preamble should reference consumer {skill}"
 
 
+
+
+@pytest.mark.parametrize("lens_ref,consumer_skills", THINKING_LENS_SKILL_CONSUMERS.items())
+def test_thinking_lens_primary_surfaces_match_consumer_graph(
+    lens_ref: str, consumer_skills: tuple[str, ...]
+):
+    """Primary workflow surfaces must list exactly the skills that cite this lens."""
+    preamble = _preamble(THINKING_DIR / Path(lens_ref).name)
+    listed = _primary_workflow_surfaces_skills(preamble)
+    assert listed == consumer_skills, (
+        f"{lens_ref} Primary workflow surfaces {listed} != graph {consumer_skills}"
+    )
+
 @pytest.mark.parametrize("prompt_path", THINKING_PROMPTS, ids=lambda p: p.name)
 def test_thinking_prompt_maps_to_workflow_skill(prompt_path: Path):
     preamble = _preamble(prompt_path)
diff --git a/reflective-prompt-library/plans/tests/test_readme_governance.py b/reflective-prompt-library/plans/tests/test_readme_governance.py
@@ -10,8 +10,8 @@
 METHODOLOGY_MAP_EN = Path(__file__).parent.parent.parent / "METHODOLOGY_MAP.md"
 SKILL_MAP = Path(__file__).parent.parent.parent / "skills" / "skill-map.md"
 
-CURRENT_PANEL_ROUND = "81"
-CURRENT_PANEL_OPTIONS = "A–EP"
+CURRENT_PANEL_ROUND = "82"
+CURRENT_PANEL_OPTIONS = "A–ET"
 
 
 @pytest.fixture(scope="module")