Commit 0f799db

Round 91: cross-category Human Review library registry
1 parent 64e82e0 commit 0f799db

15 files changed

Lines changed: 199 additions & 16 deletions

README.md

Lines changed: 1 addition & 1 deletion
@@ -21,7 +21,7 @@ Full library docs: [reflective-prompt-library/README.md](reflective-prompt-libra
 ## Governance

 - **Contributing:** [CONTRIBUTING.md](CONTRIBUTING.md) — quality gates, routing maintenance (R8–R12), `make all`
-- **Panel record:** [multi-agent-panel-consensus](reflective-prompt-library/plans/multi-agent-panel-consensus-2026-06-25.md) — six-lens Socratic consensus (Rounds 1–90)
+- **Panel record:** [multi-agent-panel-consensus](reflective-prompt-library/plans/multi-agent-panel-consensus-2026-06-25.md) — six-lens Socratic consensus (Rounds 1–91)
 - **Operator playbook:** [GLOSSARY.md](reflective-prompt-library/GLOSSARY.md) — Governance Maintenance Playbook

 The repository contains:
The repository contains:

reflective-prompt-library/GLOSSARY.md

Lines changed: 2 additions & 1 deletion
@@ -337,7 +337,7 @@ Curated top-of-cheatsheet summary of high-confusion routing traps (ROUTE-002 hol

 ## Governance Maintenance Playbook / 治理維護手冊

-Ongoing upkeep after panel close (Rounds 1–90). Not agent instructions — operator checklist.
+Ongoing upkeep after panel close (Rounds 1–91). Not agent instructions — operator checklist.

 **Operational test:** Before router tuning, add fresh ROUTE-002/003 holdout phrases; run `make all`; record decisions in `PROJECT_KNOWLEDGE.md` Decision Index when governance surface changes.

@@ -363,3 +363,4 @@ Ongoing upkeep after panel close (Rounds 1–90). Not agent instructions — ope
 20. When adding or editing risk-bearing `00-core/` prompts with `## Human Review`, keep preamble escalation routed to `reflective-risk` and run `test_core_prompts_eval_harness.py` Human Review guards via `prompt_eval_helpers.py`.
 21. When editing `00-core/` Human Review coverage, keep `CORE_HUMAN_REVIEW_REQUIRED` and `CORE_HUMAN_REVIEW_EXEMPT` in `test_core_prompts_eval_harness.py` aligned with preamble `## Human Review` sections; run core HR parity tests.
 22. When editing Human Review coverage on thinking lenses or composable prompts (`01-thinking`–`06-repo`), keep frozen `*_HUMAN_REVIEW_REQUIRED` / `*_HUMAN_REVIEW_EXEMPT` sets in `test_*_prompts_eval_harness.py` aligned with preamble `## Human Review` sections; use `prompt_eval_helpers.assert_human_review_*` parity helpers and run HR set partition tests.
+23. When adding composable prompts or new categories, keep `PROMPT_LIBRARY_CATEGORIES` and `test_human_review_library_registry.py` aligned so frozen HR sets cover every `00-core`–`06-repo` prompt exactly once.
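
Playbook step 23 asks that the frozen HR sets cover every prompt "exactly once", which amounts to a partition check: no prompt sits in both the required and exempt sets, none is missing from both, and neither set names a prompt that no longer exists. A minimal sketch of that check, assuming prompts are identified by file name; `check_partition` and its result keys are hypothetical names for illustration, not the repository's helpers.

```python
from __future__ import annotations


def check_partition(
    all_prompts: frozenset[str],
    required: frozenset[str],
    exempt: frozenset[str],
) -> dict[str, frozenset[str]]:
    """Report every way the two frozen sets fail to partition all_prompts.

    Hypothetical helper: the three keys are illustrative, not the repo's API.
    """
    covered = required | exempt
    return {
        # Listed as both required and exempt: covered twice.
        "overlap": required & exempt,
        # Present in the library but in neither set: covered zero times.
        "missing": all_prompts - covered,
        # Named in a frozen set but no longer present in the library.
        "stale": covered - all_prompts,
    }


def is_partition(report: dict[str, frozenset[str]]) -> bool:
    """True when every category of violation is empty."""
    return not any(report.values())
```

A test built on this shape can assert `is_partition(...)` per category and print the non-empty keys on failure, so the operator sees which prompt drifted rather than only that something did.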

reflective-prompt-library/PROJECT_KNOWLEDGE.md

Lines changed: 1 addition & 0 deletions
@@ -73,6 +73,7 @@ deferred promotions are recurrence-gated — see [panel backlog](plans/multi-age
 ## Decision Index

 - 2026-06-25 Round 85 panel — composable prompt Primary workflow surface preamble guards (`test_*_prompts_eval_harness.py`) + Supporting-lens exemption → [record](plans/multi-agent-panel-consensus-2026-06-25.md)
+- 2026-06-25 Round 91 panel — cross-category Human Review library registry (`test_human_review_library_registry.py`, `PROMPT_LIBRARY_CATEGORIES`) → [record](plans/multi-agent-panel-consensus-2026-06-25.md)
 - 2026-06-25 Round 90 panel — library-wide Human Review required/exempt set parity (`01-thinking`–`06-repo`) + DRY `prompt_eval_helpers` HR set guards → [record](plans/multi-agent-panel-consensus-2026-06-25.md)
 - 2026-06-25 Round 89 panel — `00-core` Human Review required/exempt set parity (`CORE_HUMAN_REVIEW_REQUIRED` / `CORE_HUMAN_REVIEW_EXEMPT`) → [record](plans/multi-agent-panel-consensus-2026-06-25.md)
 - 2026-06-25 Round 88 panel — `00-core` Human Review preamble guards on risk-bearing prompts + `test_core_prompts_eval_harness.py` → [record](plans/multi-agent-panel-consensus-2026-06-25.md)

reflective-prompt-library/README.md

Lines changed: 1 addition & 1 deletion
@@ -30,7 +30,7 @@ Pick **Strictness L1–L6** first (`skills/reflective-dispatch/SKILL.md`, [GLOSS

 ## Governance Panel Record

-Multi-agent Socratic consensus on project goals and the nine skills (Rounds 1–90, options A–GB) is recorded in [plans/multi-agent-panel-consensus-2026-06-25.md](plans/multi-agent-panel-consensus-2026-06-25.md). Run `make all` before claiming routing or governance changes are verified.
+Multi-agent Socratic consensus on project goals and the nine skills (Rounds 1–91, options A–GE) is recorded in [plans/multi-agent-panel-consensus-2026-06-25.md](plans/multi-agent-panel-consensus-2026-06-25.md). Run `make all` before claiming routing or governance changes are verified.

 ## Directory Map


reflective-prompt-library/plans/QUALITY_GATES_SUMMARY.md

Lines changed: 2 additions & 2 deletions
@@ -314,7 +314,7 @@ ROUTE-002 measures unseen phrasing separately from ROUTE-001. Round 7 (2026-06-2
 2. **ROUTE-001/002/003 in CI** — 128 + 102 + 53 paraphrases at 100% consistency (seeded fixtures); `validate_route_fixture.py` gates minimum coverage
 3. **Governance validators** — links, lint, governance metadata, PROJECT_KNOWLEDGE, benchmark fixture, skill examples
 4. **Harness policy docs** — CONTRIBUTING, AGENTS, SKILL_INSTALLATION, maintenance playbook
-5. **Doc anti-drift** — `test_routing_contract.py`, cheatsheet parity tests, `test_readme_governance.py`, `test_thinking_prompts_eval_harness.py`, `test_engineering_prompts_eval_harness.py`, `test_prompt_cross_links.py`, `test_core_prompts_eval_harness.py`, `test_agent_prompts_eval_harness.py`, `test_context_prompts_eval_harness.py`, `test_domain_prompts_eval_harness.py`, `test_repo_prompts_eval_harness.py`, `test_validate_governance.py`, `test_validate_links.py`, `test_lint_skills.py`, `test_skill_module_contract.py` (Escalation subsection + Trigger/Methods/Output/Never; 580+ pytest anti-drift suite in CI); reciprocal thinking-lens ↔ skill checks and `00-core` + composable `Primary workflow surface(s)` ↔ `*_SKILL_LINKS` parity in `test_prompt_cross_links.py` (including strict Primary workflow surfaces parity via `test_thinking_lens_primary_surfaces_match_consumer_graph`); Human Review + Escalation route-target guards in thinking/skill contract tests; composable `Primary workflow surface(s)` / Supporting-lens preamble guards and composable `## Human Review` preamble guards (route to `reflective-risk`) via `prompt_eval_helpers.assert_human_review_preamble` in `test_*_prompts_eval_harness.py`; frozen `*_HUMAN_REVIEW_REQUIRED` / `*_HUMAN_REVIEW_EXEMPT` set parity across all prompt categories (Round 90)
+5. **Doc anti-drift** — `test_routing_contract.py`, cheatsheet parity tests, `test_readme_governance.py`, `test_thinking_prompts_eval_harness.py`, `test_engineering_prompts_eval_harness.py`, `test_prompt_cross_links.py`, `test_core_prompts_eval_harness.py`, `test_human_review_library_registry.py`, `test_agent_prompts_eval_harness.py`, `test_context_prompts_eval_harness.py`, `test_domain_prompts_eval_harness.py`, `test_repo_prompts_eval_harness.py`, `test_validate_governance.py`, `test_validate_links.py`, `test_lint_skills.py`, `test_skill_module_contract.py` (Escalation subsection + Trigger/Methods/Output/Never; 590+ pytest anti-drift suite in CI); reciprocal thinking-lens ↔ skill checks and `00-core` + composable `Primary workflow surface(s)` ↔ `*_SKILL_LINKS` parity in `test_prompt_cross_links.py` (including strict Primary workflow surfaces parity via `test_thinking_lens_primary_surfaces_match_consumer_graph`); Human Review + Escalation route-target guards in thinking/skill contract tests; composable `Primary workflow surface(s)` / Supporting-lens preamble guards and composable `## Human Review` preamble guards (route to `reflective-risk`) via `prompt_eval_helpers.assert_human_review_preamble` in `test_*_prompts_eval_harness.py`; frozen `*_HUMAN_REVIEW_REQUIRED` / `*_HUMAN_REVIEW_EXEMPT` set parity across all prompt categories (Round 90)

 ### Ongoing maintenance (not blockers)

@@ -384,4 +384,4 @@ Phase 1 quality-gate tooling and documentation are **complete**. Routing consist
 - ✅ Benchmark fixture gate plus optional manual benchmark runs
 - ✅ Research-backed design decisions

-The project is positioned to grow sustainably with quality discipline built in from the start. **No open implementation blockers** remain from panel Rounds 1–90; work is recurrence-gated maintenance per playbook. The next measurable quality target is **holdout expansion before router tuning** and optional manual baseline-vs-skill benchmark runs — not shipping new core skills without promotion evidence.
+The project is positioned to grow sustainably with quality discipline built in from the start. **No open implementation blockers** remain from panel Rounds 1–91; work is recurrence-gated maintenance per playbook. The next measurable quality target is **holdout expansion before router tuning** and optional manual baseline-vs-skill benchmark runs — not shipping new core skills without promotion evidence.

reflective-prompt-library/plans/multi-agent-panel-consensus-2026-06-25.md

Lines changed: 46 additions & 0 deletions
@@ -2586,3 +2586,49 @@ User directive (repeat): review prompts, plans, skills, and Socratic/critical-th

 **Resealed 2026-06-25** after **Round 90** (options FX–GB). Human Review coverage is now explicit via frozen required/exempt sets across all prompt categories (`00-core`–`06-repo`). Holdout expansion remains recurrence-gated maintenance.

+---
+
+## Round 91 — cross-category Human Review library registry (2026-06-25)
+
+**Options GC–GG** | Six-lens panel (Opus, Codex, Gemini, Composer, Sakana, GLM)
+
+### Round 91 options
+
+| ID | Proposal | Verdict |
+| --- | --- | --- |
+| GC | `PROMPT_LIBRARY_CATEGORIES` + `test_human_review_library_registry.py` cross-category HR registry pytest | **Agree** |
+| GD | Remove duplicate `*_PROMPTS_WITH_HUMAN_REVIEW` assignments in composable harness files | **Agree** |
+| GE | GLOSSARY playbook step 23 + governance sync | **Agree** |
+| GF | ROUTE holdout expansion | **Defer** |
+| GG | Router / tenth skill / benchmark CI | **Reject** |
+
+### Round 91 verdict table
+
+| ID | Option | Verdict | Action |
+| --- | --- | --- | --- |
+| GC | HR library registry | **Agree** | `PROMPT_LIBRARY_CATEGORIES`; registry imports all frozen HR sets; library glob parity |
+| GD | Harness dedupe | **Agree** | drop duplicate `prompts_with_human_review` lines |
+| GE | Playbook + docs | **Agree** | step 23; panel round 91 sync |
+| GF | Holdout expansion | **Defer** | maintenance |
+| GG | Router/tenth skill/benchmark CI | **Reject** | no change |
+
+**All roles agree.**
+
+## Implemented Changes (Round 91)
+
+- `plans/tests/prompt_eval_helpers.py`: `PROMPT_LIBRARY_CATEGORIES` tuple
+- `plans/tests/test_human_review_library_registry.py`: cross-category HR registry + library glob parity
+- `plans/tests/test_{agent,context,domain,engineering,repo}_prompts_eval_harness.py`: dedupe duplicate HR prompt tuples
+- `GLOSSARY.md`: playbook Rounds 1–91; step 23 for HR library registry
+- `QUALITY_GATES_SUMMARY.md`: HR registry note; panel Rounds 1–91; 590+ pytest floor
+- `PROJECT_KNOWLEDGE.md`: Decision Index Round 91 entry
+- `README.md`, `reflective-prompt-library/README.md`, `test_readme_governance.py`: panel round 91 sync
+
+## Verification (Round 91)
+
+- `make all`: pytest + ROUTE-001/002/003 100%
+
+## Panel status (updated)
+
+**Resealed 2026-06-25** after **Round 91** (options GC–GG). Human Review frozen sets are now cross-checked by a single library registry (`00-core`–`06-repo`). Holdout expansion remains recurrence-gated maintenance.
+
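
The "library glob parity" action for option GC can be pictured as comparing what is on disk under each category directory with what the registry lists. A minimal sketch under stated assumptions: prompts are `*.md` files directly inside each category directory, and the registry is a plain mapping from category to file names. `discover_prompts` and `registry_drift` are hypothetical names; the actual `test_human_review_library_registry.py` is not shown in this commit view and may work differently.

```python
from __future__ import annotations

from pathlib import Path

# Copied from the tuple this commit adds to prompt_eval_helpers.py.
PROMPT_LIBRARY_CATEGORIES = (
    "00-core",
    "01-thinking",
    "02-engineering",
    "03-context",
    "04-agent",
    "05-domain",
    "06-repo",
)


def discover_prompts(library_root: Path) -> dict[str, frozenset[str]]:
    """Map each category to the Markdown file names found directly inside it.

    Assumption: one flat directory of *.md prompts per category.
    """
    return {
        category: frozenset(p.name for p in (library_root / category).glob("*.md"))
        for category in PROMPT_LIBRARY_CATEGORIES
    }


def registry_drift(
    library_root: Path, registry: dict[str, frozenset[str]]
) -> dict[str, frozenset[str]]:
    """Return, per category, names that are on disk or in the registry but not both."""
    on_disk = discover_prompts(library_root)
    drift = {}
    for category in PROMPT_LIBRARY_CATEGORIES:
        mismatch = on_disk[category] ^ registry.get(category, frozenset())
        if mismatch:
            drift[category] = mismatch
    return drift
```

An empty result means disk and registry agree for every category; a non-empty one names the category and the files to reconcile.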

reflective-prompt-library/plans/tests/prompt_eval_helpers.py

Lines changed: 10 additions & 0 deletions
@@ -5,6 +5,16 @@

 HUMAN_REVIEW_HEADING = re.compile(r"^## Human Review\s*$", re.MULTILINE)

+PROMPT_LIBRARY_CATEGORIES = (
+    "00-core",
+    "01-thinking",
+    "02-engineering",
+    "03-context",
+    "04-agent",
+    "05-domain",
+    "06-repo",
+)
+


 def prompt_preamble(prompt_path: Path) -> str:
     return prompt_path.read_text(encoding="utf-8").split("```", 1)[0]
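
The heading regex and the preamble split in this hunk combine naturally: only a `## Human Review` heading that appears before the first code fence counts. A minimal sketch of that combination working on strings instead of files; `preamble_of` and `has_human_review` are hypothetical names, and the repository's own `prompts_with_human_review` helper is not shown here and may differ.

```python
import re

# Same pattern as in prompt_eval_helpers.py: the heading must fill its whole line.
HUMAN_REVIEW_HEADING = re.compile(r"^## Human Review\s*$", re.MULTILINE)

# Built at runtime so this example contains no literal fence of its own.
FENCE = "`" * 3


def preamble_of(text: str) -> str:
    """Return everything before the first code fence (all of it if there is none)."""
    return text.split(FENCE, 1)[0]


def has_human_review(text: str) -> bool:
    """True when the preamble, not the fenced prompt body, has the heading."""
    return HUMAN_REVIEW_HEADING.search(preamble_of(text)) is not None
```

Restricting the search to the preamble keeps a heading quoted inside the fenced prompt body from counting as coverage.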

reflective-prompt-library/plans/tests/test_agent_prompts_eval_harness.py

Lines changed: 0 additions & 1 deletion
@@ -39,7 +39,6 @@
     "workflow-recipes.md",
 })

-AGENT_PROMPTS_WITH_HUMAN_REVIEW = prompts_with_human_review(AGENT_PROMPTS)




reflective-prompt-library/plans/tests/test_context_prompts_eval_harness.py

Lines changed: 0 additions & 1 deletion
@@ -36,7 +36,6 @@
     "medium-context.md",
 })

-CONTEXT_PROMPTS_WITH_HUMAN_REVIEW = prompts_with_human_review(CONTEXT_PROMPTS)




reflective-prompt-library/plans/tests/test_domain_prompts_eval_harness.py

Lines changed: 0 additions & 1 deletion
@@ -36,7 +36,6 @@
     "writing-article.md",
 })

-DOMAIN_PROMPTS_WITH_HUMAN_REVIEW = prompts_with_human_review(DOMAIN_PROMPTS)



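
The harness hunks above each drop one repeated module-level assignment (option GD). One way to keep such repeats from returning is a static check over the harness source. A minimal sketch using the standard `ast` module; `duplicate_module_assignments` is a hypothetical helper and is not part of this commit.

```python
import ast
from collections import Counter


def duplicate_module_assignments(source: str) -> list[str]:
    """Return names bound more than once by plain top-level `name = ...` statements.

    The source is parsed, never executed, so undefined helpers in it are harmless.
    """
    counts: Counter[str] = Counter()
    for node in ast.parse(source).body:
        if isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    counts[target.id] += 1
    return sorted(name for name, count in counts.items() if count > 1)
```

A guard test could read each `test_*_prompts_eval_harness.py` file as text and assert that this returns an empty list.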
