You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: reflective-prompt-library/GLOSSARY.md
+2-2Lines changed: 2 additions & 2 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -337,7 +337,7 @@ Curated top-of-cheatsheet summary of high-confusion routing traps (ROUTE-002 hol
337
337
338
338
## Governance Maintenance Playbook / 治理維護手冊
339
339
340
-
Ongoing upkeep after panel close (Rounds 1–97). Not agent instructions — operator checklist.
340
+
Ongoing upkeep after panel close (Rounds 1–98). Not agent instructions — operator checklist.
341
341
342
342
**Operational test:** Before router tuning, add fresh ROUTE-002/003 holdout phrases; run `make all`; record decisions in `PROJECT_KNOWLEDGE.md` Decision Index when governance surface changes.
343
343
@@ -369,5 +369,5 @@ Ongoing upkeep after panel close (Rounds 1–97). Not agent instructions — ope
369
369
26. When editing composable prompt Purpose preambles, keep `Primary workflow surface(s)` / Supporting-lens lines via `assert_primary_workflow_surface_preamble` in `prompt_eval_helpers.py`; update `SUPPORTING_LENS_PRIMARY_SURFACE_BY_CATEGORY` for exemptions; run `test_prompt_primary_workflow_surface_library_registry.py` plus per-category `test_*_prompts_eval_harness.py` guards.
370
370
27. When editing category workflow skill coverage tuples, keep frozen `*_COVER_WORKFLOW_SKILLS` in `test_*_prompts_eval_harness.py` aligned with `assert_category_workflow_skill_coverage`; `01-thinking` stays exempt (consumer graph); run `test_workflow_skill_coverage_library_registry.py`.
371
371
28. When editing eval_harness score floors, keep `PROMPT_EVAL_MIN_SCORE` in `prompt_eval_helpers.py` and use `assert_prompt_meets_eval_harness_floor` in per-category `test_*_prompts_eval_harness.py` guards; run `test_prompt_eval_harness_score_library_registry.py`.
372
-
28. When editing eval_harness score floors, keep `PROMPT_EVAL_MIN_SCORE` in `prompt_eval_helpers.py` and use `assert_prompt_meets_eval_harness_floor` in per-category `test_*_prompts_eval_harness.py` guards; run `test_prompt_eval_harness_score_library_registry.py`.
373
372
29. When editing per-category `reference_workflow_skills` guards, use `assert_prompt_references_workflow_skill` in `prompt_eval_helpers.py` (preamble-scoped, not fenced templates); run `test_prompt_workflow_skill_reference_library_registry.py` plus per-category harness guards.
373
+
30. When editing per-category eval_harness fixtures, keep `PROMPT_LIBRARY_REPO_ROOT` and `make_category_eval_harness_fixture` in `prompt_eval_helpers.py`; run `test_prompt_eval_harness_fixture_library_registry.py` plus per-category harness guards.
Copy file name to clipboardExpand all lines: reflective-prompt-library/README.md
+1-1Lines changed: 1 addition & 1 deletion
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -30,7 +30,7 @@ Pick **Strictness L1–L6** first (`skills/reflective-dispatch/SKILL.md`, [GLOSS
30
30
31
31
## Governance Panel Record
32
32
33
-
Multi-agent Socratic consensus on project goals and the nine skills (Rounds 1–97, options A–HK) is recorded in [plans/multi-agent-panel-consensus-2026-06-25.md](plans/multi-agent-panel-consensus-2026-06-25.md). Run `make all` before claiming routing or governance changes are verified.
33
+
Multi-agent Socratic consensus on project goals and the nine skills (Rounds 1–98, options A–HP) is recorded in [plans/multi-agent-panel-consensus-2026-06-25.md](plans/multi-agent-panel-consensus-2026-06-25.md). Run `make all` before claiming routing or governance changes are verified.
5. **Doc anti-drift** — `test_routing_contract.py`, cheatsheet parity tests, `test_readme_governance.py`, `test_thinking_prompts_eval_harness.py`, `test_engineering_prompts_eval_harness.py`, `test_prompt_cross_links.py`, `test_core_prompts_eval_harness.py`, `test_human_review_library_registry.py`, `test_prompt_skill_links_library_registry.py`, `test_prompt_contract_library_registry.py`, `test_prompt_primary_workflow_surface_library_registry.py`, `test_workflow_skill_coverage_library_registry.py`, `test_prompt_eval_harness_score_library_registry.py`, `test_prompt_workflow_skill_reference_library_registry.py`, `test_agent_prompts_eval_harness.py`, `test_context_prompts_eval_harness.py`, `test_domain_prompts_eval_harness.py`, `test_repo_prompts_eval_harness.py`, `test_validate_governance.py`, `test_validate_links.py`, `test_lint_skills.py`, `test_skill_module_contract.py` (Escalation subsection + Trigger/Methods/Output/Never; 660+ pytest anti-drift suite in CI); reciprocal thinking-lens ↔ skill checks and `00-core` + composable `Primary workflow surface(s)` ↔ `*_SKILL_LINKS` parity in `test_prompt_cross_links.py` (including strict Primary workflow surfaces parity via `test_thinking_lens_primary_surfaces_match_consumer_graph`); Human Review + Escalation route-target guards in thinking/skill contract tests; composable `Primary workflow surface(s)` / Supporting-lens preamble guards and composable `## Human Review` preamble guards (route to `reflective-risk`) via `prompt_eval_helpers.assert_human_review_preamble` in `test_*_prompts_eval_harness.py`; frozen `*_HUMAN_REVIEW_REQUIRED` / `*_HUMAN_REVIEW_EXEMPT` set parity across all prompt categories (Round 90); library-wide contract heading registry (`PROMPT_CONTRACT_HEADINGS`, Round 93); workflow skill coverage registry (`*_COVER_WORKFLOW_SKILLS`, Round 95); eval_harness score floor registry (`PROMPT_EVAL_MIN_SCORE`, Round 96); workflow skill reference registry (`assert_prompt_references_workflow_skill`, Round 97)
317
+
5. **Doc anti-drift** — `test_routing_contract.py`, cheatsheet parity tests, `test_readme_governance.py`, `test_thinking_prompts_eval_harness.py`, `test_engineering_prompts_eval_harness.py`, `test_prompt_cross_links.py`, `test_core_prompts_eval_harness.py`, `test_human_review_library_registry.py`, `test_prompt_skill_links_library_registry.py`, `test_prompt_contract_library_registry.py`, `test_prompt_primary_workflow_surface_library_registry.py`, `test_workflow_skill_coverage_library_registry.py`, `test_prompt_eval_harness_score_library_registry.py`, `test_prompt_workflow_skill_reference_library_registry.py`, `test_prompt_eval_harness_fixture_library_registry.py`, `test_agent_prompts_eval_harness.py`, `test_context_prompts_eval_harness.py`, `test_domain_prompts_eval_harness.py`, `test_repo_prompts_eval_harness.py`, `test_validate_governance.py`, `test_validate_links.py`, `test_lint_skills.py`, `test_skill_module_contract.py` (Escalation subsection + Trigger/Methods/Output/Never; 670+ pytest anti-drift suite in CI); reciprocal thinking-lens ↔ skill checks and `00-core` + composable `Primary workflow surface(s)` ↔ `*_SKILL_LINKS` parity in `test_prompt_cross_links.py` (including strict Primary workflow surfaces parity via `test_thinking_lens_primary_surfaces_match_consumer_graph`); Human Review + Escalation route-target guards in thinking/skill contract tests; composable `Primary workflow surface(s)` / Supporting-lens preamble guards and composable `## Human Review` preamble guards (route to `reflective-risk`) via `prompt_eval_helpers.assert_human_review_preamble` in `test_*_prompts_eval_harness.py`; frozen `*_HUMAN_REVIEW_REQUIRED` / `*_HUMAN_REVIEW_EXEMPT` set parity across all prompt categories (Round 90); library-wide contract heading registry (`PROMPT_CONTRACT_HEADINGS`, Round 93); workflow skill coverage registry (`*_COVER_WORKFLOW_SKILLS`, Round 95); eval_harness score floor registry (`PROMPT_EVAL_MIN_SCORE`, Round 96); workflow skill reference registry (`assert_prompt_references_workflow_skill`, Round 97); eval_harness fixture registry (`make_category_eval_harness_fixture`, Round 98)
318
318
319
319
### Ongoing maintenance (not blockers)
320
320
@@ -384,4 +384,4 @@ Phase 1 quality-gate tooling and documentation are **complete**. Routing consist
384
384
- ✅ Benchmark fixture gate plus optional manual benchmark runs
385
385
- ✅ Research-backed design decisions
386
386
387
-
The project is positioned to grow sustainably with quality discipline built in from the start. **No open implementation blockers** remain from panel Rounds 1–97; work is recurrence-gated maintenance per playbook. The next measurable quality target is **holdout expansion before router tuning** and optional manual baseline-vs-skill benchmark runs — not shipping new core skills without promotion evidence.
387
+
The project is positioned to grow sustainably with quality discipline built in from the start. **No open implementation blockers** remain from panel Rounds 1–98; work is recurrence-gated maintenance per playbook. The next measurable quality target is **holdout expansion before router tuning** and optional manual baseline-vs-skill benchmark runs — not shipping new core skills without promotion evidence.
Copy file name to clipboardExpand all lines: reflective-prompt-library/plans/multi-agent-panel-consensus-2026-06-25.md
+52Lines changed: 52 additions & 0 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -2935,3 +2935,55 @@ User directive (repeat): review prompts, plans, skills, and Socratic/critical-th
2935
2935
2936
2936
**Resealed 2026-06-25** after **Round 97** (options HG–HK). Workflow skill references are now library-registry checked across all `00-core`–`06-repo` categories with shared `assert_prompt_references_workflow_skill` (preamble-scoped). Holdout expansion remains recurrence-gated maintenance.
**Resealed 2026-06-25** after **Round 98** (options HL–HP). Eval_harness fixtures are now library-registry checked across all `00-core`–`06-repo` categories with shared `make_category_eval_harness_fixture` and `PROMPT_LIBRARY_REPO_ROOT`. Holdout expansion remains recurrence-gated maintenance.
0 commit comments