Round 87: Human Review helper DRY + GLOSSARY playbook repair

johnteee · johnteee · commit a720c33b045d · 2026-06-25T16:42:45.000+08:00
diff --git a/README.md b/README.md
@@ -21,7 +21,7 @@ Full library docs: [reflective-prompt-library/README.md](reflective-prompt-libra
 ## Governance
 
 - **Contributing:** [CONTRIBUTING.md](CONTRIBUTING.md) — quality gates, routing maintenance (R8–R12), `make all`
-- **Panel record:** [multi-agent-panel-consensus](reflective-prompt-library/plans/multi-agent-panel-consensus-2026-06-25.md) — six-lens Socratic consensus (Rounds 1–86)
+- **Panel record:** [multi-agent-panel-consensus](reflective-prompt-library/plans/multi-agent-panel-consensus-2026-06-25.md) — six-lens Socratic consensus (Rounds 1–87)
 - **Operator playbook:** [GLOSSARY.md](reflective-prompt-library/GLOSSARY.md) — Governance Maintenance Playbook
 
 The repository contains:
diff --git a/reflective-prompt-library/GLOSSARY.md b/reflective-prompt-library/GLOSSARY.md
@@ -337,7 +337,7 @@ Curated top-of-cheatsheet summary of high-confusion routing traps (ROUTE-002 hol
 
 ## Governance Maintenance Playbook / 治理維護手冊
 
-Ongoing upkeep after panel close (Rounds 1–86). Not agent instructions — operator checklist.
+Ongoing upkeep after panel close (Rounds 1–87). Not agent instructions — operator checklist.
 
 **Operational test:** Before router tuning, add fresh ROUTE-002/003 holdout phrases; run `make all`; record decisions in `PROJECT_KNOWLEDGE.md` Decision Index when governance surface changes.
 
@@ -357,4 +357,6 @@ Ongoing upkeep after panel close (Rounds 1–86). Not agent instructions — ope
 14. When editing `01-thinking/` Purpose preambles, keep `Primary workflow surfaces` aligned exactly with `SKILL_THINKING_SOURCES` via `test_thinking_lens_primary_surfaces_match_consumer_graph`; put escalations and adjacent workflow notes in Scope or Human Review, not on the primary line.
 15. When editing composable prompts (`02-engineering`–`06-repo`), keep `Primary workflow surface(s)` aligned with `*_SKILL_LINKS` in `test_prompt_cross_links.py`; use Supporting lens for cross-cutting lenses like `runtime-trust-boundary.md`; put escalate/pair notes in Scope.
 16. When editing `00-core/` prompts, keep `Primary workflow surface(s)` aligned with `CORE_SKILL_LINKS` in `test_prompt_cross_links.py`; put pair/escalation skills in Scope or Human Review, not on the primary line.
-17. When editing composable prompts (`00-core`–`06-repo`), keep `Primary workflow surface(s)` / Supporting-lens preamble lines and run `test_*_prompts_eval_harness.py` primary-surface guards.18. When adding or editing composable prompts (`02-engineering`–`06-repo`) with `## Human Review`, keep preamble escalation routed to `reflective-risk` and run Human Review guards in `test_*_prompts_eval_harness.py` (exact heading match via `prompt_eval_helpers.py`).
+17. When editing composable prompts (`00-core`–`06-repo`), keep `Primary workflow surface(s)` / Supporting-lens preamble lines and run `test_*_prompts_eval_harness.py` primary-surface guards.
+18. When adding or editing composable prompts (`02-engineering`–`06-repo`) with `## Human Review`, keep preamble escalation routed to `reflective-risk` and run Human Review guards in `test_*_prompts_eval_harness.py` (exact heading match via `prompt_eval_helpers.py`).
+19. When editing Human Review guards, use `prompt_eval_helpers.assert_human_review_preamble` in all `test_*_prompts_eval_harness.py` files (thinking lenses + composable categories).
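The step 17/18 merge repaired in this hunk is what `test_maintenance_playbook_steps_on_separate_lines` protects against. The sketch below runs that test's pattern (`guards\.\d+\.`) on a merged and a repaired sample. The sample strings are shortened stand-ins, not the real GLOSSARY text. `steps_missing_own_line` is a hypothetical broader check that is not part of this commit; it is shown only because the committed pattern fires solely when the preceding step ends in "guards.".

```python
import re

# Pattern copied from test_maintenance_playbook_steps_on_separate_lines.
MERGED_AFTER_GUARDS = re.compile(r"guards\.\d+\.")

# Shortened stand-ins for the playbook text (assumption: not the real GLOSSARY wording).
merged = "17. Run primary-surface guards.18. Run Human Review guards.\n"
repaired = "17. Run primary-surface guards.\n18. Run Human Review guards.\n"


def steps_missing_own_line(playbook: str, steps: range) -> list[int]:
    """Hypothetical helper: list step numbers that do not start a line."""
    return [
        n
        for n in steps
        if not re.search(rf"^{n}\. ", playbook, flags=re.MULTILINE)
    ]


# The committed pattern flags the merged text and accepts the repaired text.
print(MERGED_AFTER_GUARDS.search(merged) is not None)
print(MERGED_AFTER_GUARDS.search(repaired) is None)

# The broader check reports which step lost its line break.
print(steps_missing_own_line(merged, range(17, 19)))
print(steps_missing_own_line(repaired, range(17, 19)))
```

A line-anchored check of this kind would also catch a merge after a step that ends in other wording, such as step 18, which ends with a parenthesis. Whether that extra coverage is worth adding is left to the panel.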
diff --git a/reflective-prompt-library/PROJECT_KNOWLEDGE.md b/reflective-prompt-library/PROJECT_KNOWLEDGE.md
@@ -73,6 +73,7 @@ deferred promotions are recurrence-gated — see [panel backlog](plans/multi-age
 ## Decision Index
 
 - 2026-06-25 Round 85 panel — composable prompt Primary workflow surface preamble guards (`test_*_prompts_eval_harness.py`) + Supporting-lens exemption → [record](plans/multi-agent-panel-consensus-2026-06-25.md)
+- 2026-06-25 Round 87 panel — Human Review helper DRY + GLOSSARY playbook step repair → [record](plans/multi-agent-panel-consensus-2026-06-25.md)
 - 2026-06-25 Round 86 panel — composable Human Review preamble guards + `reflective-risk` routing alignment → [record](plans/multi-agent-panel-consensus-2026-06-25.md)
 - 2026-06-25 Round 84 panel — `00-core` Primary workflow surface parity + primary-line trim → [record](plans/multi-agent-panel-consensus-2026-06-25.md)
 - 2026-06-25 Round 83 panel — composable prompt Primary workflow surface parity (`02-engineering`–`06-repo`) + supporting-lens exemption → [record](plans/multi-agent-panel-consensus-2026-06-25.md)
diff --git a/reflective-prompt-library/README.md b/reflective-prompt-library/README.md
@@ -30,7 +30,7 @@ Pick **Strictness L1–L6** first (`skills/reflective-dispatch/SKILL.md`, [GLOSS
 
 ## Governance Panel Record
 
-Multi-agent Socratic consensus on project goals and the nine skills (Rounds 1–86, options A–FJ) is recorded in [plans/multi-agent-panel-consensus-2026-06-25.md](plans/multi-agent-panel-consensus-2026-06-25.md). Run `make all` before claiming routing or governance changes are verified.
+Multi-agent Socratic consensus on project goals and the nine skills (Rounds 1–87, options A–FO) is recorded in [plans/multi-agent-panel-consensus-2026-06-25.md](plans/multi-agent-panel-consensus-2026-06-25.md). Run `make all` before claiming routing or governance changes are verified.
 
 ## Directory Map
 
diff --git a/reflective-prompt-library/plans/multi-agent-panel-consensus-2026-06-25.md b/reflective-prompt-library/plans/multi-agent-panel-consensus-2026-06-25.md
@@ -2409,4 +2409,49 @@ User directive (repeat): review prompts, plans, skills, and Socratic/critical-th
 
 **Resealed 2026-06-25** after **Round 86** (options FG–FJ). Composable prompts with `## Human Review` now have eval_harness preamble guards matching thinking-lens pattern (R81); full library contract parity closed (graph + primary surface + Human Review). Holdout expansion remains recurrence-gated maintenance.
 
+---
+
+## Round 87 — Human Review helper DRY + GLOSSARY playbook repair (2026-06-25)
+
+**Options FK–FO** | Six-lens panel (Opus, Codex, Gemini, Composer, Sakana, GLM)
+
+### Round 87 options
+
+| ID | Proposal | Verdict |
+| --- | --- | --- |
+| FK | Fix GLOSSARY playbook step 17/18 newline merge + `test_maintenance_playbook_steps_on_separate_lines` | **Agree** |
+| FL | DRY Human Review guards via `prompt_eval_helpers.assert_human_review_preamble` across all `test_*_prompts_eval_harness.py` | **Agree** |
+| FM | GLOSSARY playbook step 19 + governance sync | **Agree** |
+| FN | ROUTE holdout expansion | **Defer** |
+| FO | Router / tenth skill / benchmark CI | **Reject** |
+
+### Round 87 verdict table
+
+| ID | Option | Verdict | Action |
+| --- | --- | --- | --- |
+| FK | Playbook repair | **Agree** | split merged steps; anti-drift pytest |
+| FL | Shared HR helper | **Agree** | migrate thinking + composable harness files |
+| FM | Playbook + docs | **Agree** | step 19; panel round 87 sync |
+| FN | Holdout expansion | **Defer** | maintenance |
+| FO | Router/tenth skill/benchmark CI | **Reject** | no change |
+
+**All roles agree.**
+
+## Implemented Changes (Round 87)
+
+- `GLOSSARY.md`: split merged steps 17 and 18 back onto separate lines; added playbook step 19 for the shared Human Review helper
+- `plans/tests/test_glossary_structure.py`: `test_maintenance_playbook_steps_on_separate_lines`
+- `plans/tests/prompt_eval_helpers.py`: `assert_human_review_preamble`
+- `plans/tests/test_{thinking,engineering,context,agent,domain,repo}_prompts_eval_harness.py`: use shared Human Review helper
+- `QUALITY_GATES_SUMMARY.md`: shared HR helper note; panel Rounds 1–87; 550+ pytest floor
+- `PROJECT_KNOWLEDGE.md`: Decision Index Round 87 entry
+- `README.md`, `reflective-prompt-library/README.md`, `test_readme_governance.py`: panel round 87 sync
+
+## Verification (Round 87)
+
+- `make all`: pytest + ROUTE-001/002/003 100%
+
+## Panel status (updated)
+
+**Resealed 2026-06-25** after **Round 87** (options FK–FO). Human Review guards now share one helper across thinking lenses and composable prompts; GLOSSARY playbook formatting anti-drift closed. Holdout expansion remains recurrence-gated maintenance.
 
diff --git a/reflective-prompt-library/plans/tests/prompt_eval_helpers.py b/reflective-prompt-library/plans/tests/prompt_eval_helpers.py
@@ -16,3 +16,14 @@ def has_human_review_preamble(prompt_path: Path) -> bool:
 
 def prompts_with_human_review(prompts: tuple[Path, ...]) -> tuple[Path, ...]:
     return tuple(p for p in prompts if has_human_review_preamble(p))
+
+
+def assert_human_review_preamble(prompt_path: Path) -> None:
+    """Human Review sections must live in preamble and route to reflective-risk."""
+    preamble = prompt_preamble(prompt_path)
+    assert has_human_review_preamble(prompt_path), (
+        f"{prompt_path.name} missing ## Human Review preamble outside template block"
+    )
+    assert "reflective-risk" in preamble, (
+        f"{prompt_path.name} Human Review should route to reflective-risk"
+    )
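A minimal, self-contained sketch of how the shared helper behaves, for readers without the rest of `prompt_eval_helpers.py` at hand. The bodies of `prompt_preamble` and `has_human_review_preamble` are not shown in this diff. Here they are assumed to split on the first code fence (as the removed inline checks did) and to match the heading line exactly. `check` and the sample prompts are hypothetical.

```python
import tempfile
from pathlib import Path

FENCE = "`" * 3  # built programmatically so this example contains no literal fence


def prompt_preamble(prompt_path: Path) -> str:
    """Assumed behaviour: everything before the first code fence."""
    return prompt_path.read_text(encoding="utf-8").split(FENCE, 1)[0]


def has_human_review_preamble(prompt_path: Path) -> bool:
    """Assumed behaviour: an exact '## Human Review' heading line in the preamble."""
    return any(
        line.strip() == "## Human Review"
        for line in prompt_preamble(prompt_path).splitlines()
    )


def assert_human_review_preamble(prompt_path: Path) -> None:
    """Same two assertions as the helper added in this commit."""
    preamble = prompt_preamble(prompt_path)
    assert has_human_review_preamble(prompt_path), (
        f"{prompt_path.name} missing ## Human Review preamble outside template block"
    )
    assert "reflective-risk" in preamble, (
        f"{prompt_path.name} Human Review should route to reflective-risk"
    )


def check(text: str) -> str:
    """Hypothetical driver: write text to a temp prompt and report the verdict."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "example-prompt.md"
        path.write_text(text, encoding="utf-8")
        try:
            assert_human_review_preamble(path)
        except AssertionError as exc:
            return f"fail: {exc}"
        return "pass"


# Heading and routing both sit before the first fence.
good = (
    "# Purpose\n\n## Human Review\n\nEscalate to reflective-risk.\n\n"
    + FENCE + "\ntemplate body\n" + FENCE + "\n"
)
# Heading appears only inside the template block, so the preamble lacks it.
template_only = (
    "# Purpose\n\n" + FENCE + "\n## Human Review\nreflective-risk\n" + FENCE + "\n"
)

print(check(good))
print(check(template_only))
```

Keeping both assertions in one function means each `test_*_prompts_eval_harness.py` file reduces its Human Review test body to a single call, so a future change to the preamble rule touches only the helper.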
diff --git a/reflective-prompt-library/plans/tests/test_agent_prompts_eval_harness.py b/reflective-prompt-library/plans/tests/test_agent_prompts_eval_harness.py
@@ -9,7 +9,7 @@
 sys.path.insert(0, str(Path(__file__).parent))
 
 from eval_harness import EvalHarness  # noqa: E402
-from prompt_eval_helpers import prompts_with_human_review  # noqa: E402
+from prompt_eval_helpers import assert_human_review_preamble, prompts_with_human_review  # noqa: E402
 
 AGENT_DIR = Path(__file__).parent.parent.parent / "04-agent"
 REPO_ROOT = str(Path(__file__).parent.parent.parent.parent)
@@ -86,8 +86,4 @@ def test_agent_prompts_have_workflow_surface_preamble_line():
 )
 def test_agent_prompt_has_human_review_section(prompt_path: Path):
     """Prompts with Human Review declare escalation outside zh-TW templates."""
-    preamble = prompt_path.read_text(encoding="utf-8").split("```", 1)[0]
-    assert "## Human Review" in preamble, f"{prompt_path.name} missing Human Review preamble"
-    assert "reflective-risk" in preamble, (
-        f"{prompt_path.name} Human Review should route to reflective-risk"
-    )
+    assert_human_review_preamble(prompt_path)
diff --git a/reflective-prompt-library/plans/tests/test_context_prompts_eval_harness.py b/reflective-prompt-library/plans/tests/test_context_prompts_eval_harness.py
@@ -9,7 +9,7 @@
 sys.path.insert(0, str(Path(__file__).parent))
 
 from eval_harness import EvalHarness  # noqa: E402
-from prompt_eval_helpers import prompts_with_human_review  # noqa: E402
+from prompt_eval_helpers import assert_human_review_preamble, prompts_with_human_review  # noqa: E402
 
 CONTEXT_DIR = Path(__file__).parent.parent.parent / "03-context"
 REPO_ROOT = str(Path(__file__).parent.parent.parent.parent)
@@ -79,8 +79,4 @@ def test_context_prompts_have_primary_workflow_surfaces_line():
 )
 def test_context_prompt_has_human_review_section(prompt_path: Path):
     """Prompts with Human Review declare escalation outside zh-TW templates."""
-    preamble = prompt_path.read_text(encoding="utf-8").split("```", 1)[0]
-    assert "## Human Review" in preamble, f"{prompt_path.name} missing Human Review preamble"
-    assert "reflective-risk" in preamble, (
-        f"{prompt_path.name} Human Review should route to reflective-risk"
-    )
+    assert_human_review_preamble(prompt_path)
diff --git a/reflective-prompt-library/plans/tests/test_domain_prompts_eval_harness.py b/reflective-prompt-library/plans/tests/test_domain_prompts_eval_harness.py
@@ -9,7 +9,7 @@
 sys.path.insert(0, str(Path(__file__).parent))
 
 from eval_harness import EvalHarness  # noqa: E402
-from prompt_eval_helpers import prompts_with_human_review  # noqa: E402
+from prompt_eval_helpers import assert_human_review_preamble, prompts_with_human_review  # noqa: E402
 
 DOMAIN_DIR = Path(__file__).parent.parent.parent / "05-domain"
 REPO_ROOT = str(Path(__file__).parent.parent.parent.parent)
@@ -80,8 +80,4 @@ def test_domain_prompts_have_primary_workflow_surfaces_line():
 )
 def test_domain_prompt_has_human_review_section(prompt_path: Path):
     """Prompts with Human Review declare escalation outside zh-TW templates."""
-    preamble = prompt_path.read_text(encoding="utf-8").split("```", 1)[0]
-    assert "## Human Review" in preamble, f"{prompt_path.name} missing Human Review preamble"
-    assert "reflective-risk" in preamble, (
-        f"{prompt_path.name} Human Review should route to reflective-risk"
-    )
+    assert_human_review_preamble(prompt_path)
diff --git a/reflective-prompt-library/plans/tests/test_engineering_prompts_eval_harness.py b/reflective-prompt-library/plans/tests/test_engineering_prompts_eval_harness.py
@@ -9,7 +9,7 @@
 sys.path.insert(0, str(Path(__file__).parent))
 
 from eval_harness import EvalHarness  # noqa: E402
-from prompt_eval_helpers import prompts_with_human_review  # noqa: E402
+from prompt_eval_helpers import assert_human_review_preamble, prompts_with_human_review  # noqa: E402
 
 ENGINEERING_DIR = Path(__file__).parent.parent.parent / "02-engineering"
 REPO_ROOT = str(Path(__file__).parent.parent.parent.parent)
@@ -80,8 +80,4 @@ def test_engineering_prompts_have_primary_workflow_surfaces_line():
 )
 def test_engineering_prompt_has_human_review_section(prompt_path: Path):
     """Prompts with Human Review declare escalation outside zh-TW templates."""
-    preamble = prompt_path.read_text(encoding="utf-8").split("```", 1)[0]
-    assert "## Human Review" in preamble, f"{prompt_path.name} missing Human Review preamble"
-    assert "reflective-risk" in preamble, (
-        f"{prompt_path.name} Human Review should route to reflective-risk"
-    )
+    assert_human_review_preamble(prompt_path)
diff --git a/reflective-prompt-library/plans/tests/test_glossary_structure.py b/reflective-prompt-library/plans/tests/test_glossary_structure.py
@@ -30,10 +30,20 @@ def test_round_boundary_terms_present(glossary_text: str):
         assert heading in glossary_text, f"missing glossary section: {heading}"
 
 
-def test_maintenance_playbook_references_round_86(glossary_text: str):
+def test_maintenance_playbook_references_round_87(glossary_text: str):
     playbook = glossary_text.split("## Governance Maintenance Playbook", 1)[1]
-    assert "Rounds 1–86" in playbook
-    assert "Rounds 1–85" not in playbook and "Rounds 1-85" not in playbook
+    assert "Rounds 1–87" in playbook
+    assert "Rounds 1–86" not in playbook and "Rounds 1-86" not in playbook
+
+
+def test_maintenance_playbook_steps_on_separate_lines(glossary_text: str):
+    """Numbered playbook steps must not merge onto one line (Round 86 corruption guard)."""
+    playbook = glossary_text.split("## Governance Maintenance Playbook", 1)[1]
+    assert re.search(r"guards\.\d+\.", playbook) is None, (
+        "playbook steps merged without newline between numbers"
+    )
+    for step in ("17.", "18.", "19."):
+        assert step in playbook
 
 
 
diff --git a/reflective-prompt-library/plans/tests/test_readme_governance.py b/reflective-prompt-library/plans/tests/test_readme_governance.py
@@ -10,8 +10,8 @@
 METHODOLOGY_MAP_EN = Path(__file__).parent.parent.parent / "METHODOLOGY_MAP.md"
 SKILL_MAP = Path(__file__).parent.parent.parent / "skills" / "skill-map.md"
 
-CURRENT_PANEL_ROUND = "86"
-CURRENT_PANEL_OPTIONS = "A–FJ"
+CURRENT_PANEL_ROUND = "87"
+CURRENT_PANEL_OPTIONS = "A–FO"
 
 
 @pytest.fixture(scope="module")
diff --git a/reflective-prompt-library/plans/tests/test_repo_prompts_eval_harness.py b/reflective-prompt-library/plans/tests/test_repo_prompts_eval_harness.py
@@ -9,7 +9,7 @@
 sys.path.insert(0, str(Path(__file__).parent))
 
 from eval_harness import EvalHarness  # noqa: E402
-from prompt_eval_helpers import prompts_with_human_review  # noqa: E402
+from prompt_eval_helpers import assert_human_review_preamble, prompts_with_human_review  # noqa: E402
 
 REPO_DIR = Path(__file__).parent.parent.parent / "06-repo"
 REPO_ROOT = str(Path(__file__).parent.parent.parent.parent)
@@ -84,8 +84,4 @@ def test_repo_prompts_have_primary_workflow_surfaces_line():
 )
 def test_repo_prompt_has_human_review_section(prompt_path: Path):
     """Prompts with Human Review declare escalation outside zh-TW templates."""
-    preamble = prompt_path.read_text(encoding="utf-8").split("```", 1)[0]
-    assert "## Human Review" in preamble, f"{prompt_path.name} missing Human Review preamble"
-    assert "reflective-risk" in preamble, (
-        f"{prompt_path.name} Human Review should route to reflective-risk"
-    )
+    assert_human_review_preamble(prompt_path)
diff --git a/reflective-prompt-library/plans/tests/test_thinking_prompts_eval_harness.py b/reflective-prompt-library/plans/tests/test_thinking_prompts_eval_harness.py
@@ -5,6 +5,9 @@
 
 import pytest
 
+sys.path.insert(0, str(Path(__file__).parent))
+from prompt_eval_helpers import assert_human_review_preamble  # noqa: E402
+
 sys.path.insert(0, str(Path(__file__).parent.parent))
 
 from eval_harness import EvalHarness  # noqa: E402
@@ -62,9 +65,5 @@ def test_thinking_prompts_have_primary_workflow_surfaces_line():
 @pytest.mark.parametrize("prompt_path", THINKING_PROMPTS, ids=lambda p: p.name)
 def test_thinking_prompt_has_human_review_section(prompt_path: Path):
     """All 01-thinking lenses declare Human Review escalation outside zh-TW templates."""
-    preamble = prompt_path.read_text(encoding="utf-8").split("```", 1)[0]
-    assert "## Human Review" in preamble, f"{prompt_path.name} missing Human Review preamble"
-    assert "reflective-risk" in preamble, (
-        f"{prompt_path.name} Human Review should route to reflective-risk"
-    )
+    assert_human_review_preamble(prompt_path)