Commit 2b3de12

Round 85: composable Primary workflow surface preamble guards
1 parent c589e50 commit 2b3de12

14 files changed

Lines changed: 111 additions & 12 deletions

README.md

Lines changed: 1 addition & 1 deletion
@@ -21,7 +21,7 @@ Full library docs: [reflective-prompt-library/README.md](reflective-prompt-libra
 ## Governance

 - **Contributing:** [CONTRIBUTING.md](CONTRIBUTING.md) — quality gates, routing maintenance (R8–R12), `make all`
-- **Panel record:** [multi-agent-panel-consensus](reflective-prompt-library/plans/multi-agent-panel-consensus-2026-06-25.md) — six-lens Socratic consensus (Rounds 1–84)
+- **Panel record:** [multi-agent-panel-consensus](reflective-prompt-library/plans/multi-agent-panel-consensus-2026-06-25.md) — six-lens Socratic consensus (Rounds 1–85)
 - **Operator playbook:** [GLOSSARY.md](reflective-prompt-library/GLOSSARY.md) — Governance Maintenance Playbook

 The repository contains:

reflective-prompt-library/GLOSSARY.md

Lines changed: 1 addition & 1 deletion
@@ -337,7 +337,7 @@ Curated top-of-cheatsheet summary of high-confusion routing traps (ROUTE-002 hol

 ## Governance Maintenance Playbook / 治理維護手冊

-Ongoing upkeep after panel close (Rounds 1–84). Not agent instructions — operator checklist.
+Ongoing upkeep after panel close (Rounds 1–85). Not agent instructions — operator checklist.

 **Operational test:** Before router tuning, add fresh ROUTE-002/003 holdout phrases; run `make all`; record decisions in `PROJECT_KNOWLEDGE.md` Decision Index when governance surface changes.

reflective-prompt-library/PROJECT_KNOWLEDGE.md

Lines changed: 1 addition & 0 deletions
@@ -72,6 +72,7 @@ deferred promotions are recurrence-gated — see [panel backlog](plans/multi-age

 ## Decision Index

+- 2026-06-25 Round 85 panel — composable prompt Primary workflow surface preamble guards (`test_*_prompts_eval_harness.py`) + Supporting-lens exemption → [record](plans/multi-agent-panel-consensus-2026-06-25.md)
 - 2026-06-25 Round 84 panel — `00-core` Primary workflow surface parity + primary-line trim → [record](plans/multi-agent-panel-consensus-2026-06-25.md)
 - 2026-06-25 Round 83 panel — composable prompt Primary workflow surface parity (`02-engineering`–`06-repo`) + supporting-lens exemption → [record](plans/multi-agent-panel-consensus-2026-06-25.md)
 > Pointers to the causal trail — plans, reflections, tests, commits. Detail is

reflective-prompt-library/README.md

Lines changed: 1 addition & 1 deletion
@@ -30,7 +30,7 @@ Pick **Strictness L1–L6** first (`skills/reflective-dispatch/SKILL.md`, [GLOSS

 ## Governance Panel Record

-Multi-agent Socratic consensus on project goals and the nine skills (Rounds 1–84, options A–FB) is recorded in [plans/multi-agent-panel-consensus-2026-06-25.md](plans/multi-agent-panel-consensus-2026-06-25.md). Run `make all` before claiming routing or governance changes are verified.
+Multi-agent Socratic consensus on project goals and the nine skills (Rounds 1–85, options A–FF) is recorded in [plans/multi-agent-panel-consensus-2026-06-25.md](plans/multi-agent-panel-consensus-2026-06-25.md). Run `make all` before claiming routing or governance changes are verified.

 ## Directory Map

reflective-prompt-library/plans/QUALITY_GATES_SUMMARY.md

Lines changed: 2 additions & 2 deletions
@@ -314,7 +314,7 @@ ROUTE-002 measures unseen phrasing separately from ROUTE-001. Round 7 (2026-06-2
 2. **ROUTE-001/002/003 in CI** — 128 + 102 + 53 paraphrases at 100% consistency (seeded fixtures); `validate_route_fixture.py` gates minimum coverage
 3. **Governance validators** — links, lint, governance metadata, PROJECT_KNOWLEDGE, benchmark fixture, skill examples
 4. **Harness policy docs** — CONTRIBUTING, AGENTS, SKILL_INSTALLATION, maintenance playbook
-5. **Doc anti-drift** — `test_routing_contract.py`, cheatsheet parity tests, `test_readme_governance.py`, `test_thinking_prompts_eval_harness.py`, `test_engineering_prompts_eval_harness.py`, `test_prompt_cross_links.py`, `test_core_prompts_eval_harness.py`, `test_agent_prompts_eval_harness.py`, `test_context_prompts_eval_harness.py`, `test_domain_prompts_eval_harness.py`, `test_repo_prompts_eval_harness.py`, `test_validate_governance.py`, `test_validate_links.py`, `test_lint_skills.py`, `test_skill_module_contract.py` (Escalation subsection + Trigger/Methods/Output/Never; 520+ pytest anti-drift suite in CI); reciprocal thinking-lens ↔ skill checks and `00-core` + composable `Primary workflow surface(s)` ↔ `*_SKILL_LINKS` parity in `test_prompt_cross_links.py` (including strict Primary workflow surfaces parity via `test_thinking_lens_primary_surfaces_match_consumer_graph`); Human Review + Escalation route-target guards in thinking/skill contract tests
+5. **Doc anti-drift** — `test_routing_contract.py`, cheatsheet parity tests, `test_readme_governance.py`, `test_thinking_prompts_eval_harness.py`, `test_engineering_prompts_eval_harness.py`, `test_prompt_cross_links.py`, `test_core_prompts_eval_harness.py`, `test_agent_prompts_eval_harness.py`, `test_context_prompts_eval_harness.py`, `test_domain_prompts_eval_harness.py`, `test_repo_prompts_eval_harness.py`, `test_validate_governance.py`, `test_validate_links.py`, `test_lint_skills.py`, `test_skill_module_contract.py` (Escalation subsection + Trigger/Methods/Output/Never; 530+ pytest anti-drift suite in CI); reciprocal thinking-lens ↔ skill checks and `00-core` + composable `Primary workflow surface(s)` ↔ `*_SKILL_LINKS` parity in `test_prompt_cross_links.py` (including strict Primary workflow surfaces parity via `test_thinking_lens_primary_surfaces_match_consumer_graph`); Human Review + Escalation route-target guards in thinking/skill contract tests; composable `Primary workflow surface(s)` / Supporting-lens preamble guards in `test_*_prompts_eval_harness.py`

 ### Ongoing maintenance (not blockers)

@@ -384,4 +384,4 @@ Phase 1 quality-gate tooling and documentation are **complete**. Routing consist
 - ✅ Benchmark fixture gate plus optional manual benchmark runs
 - ✅ Research-backed design decisions

-The project is positioned to grow sustainably with quality discipline built in from the start. **No open implementation blockers** remain from panel Rounds 1–84; work is recurrence-gated maintenance per playbook. The next measurable quality target is **holdout expansion before router tuning** and optional manual baseline-vs-skill benchmark runs — not shipping new core skills without promotion evidence.
+The project is positioned to grow sustainably with quality discipline built in from the start. **No open implementation blockers** remain from panel Rounds 1–85; work is recurrence-gated maintenance per playbook. The next measurable quality target is **holdout expansion before router tuning** and optional manual baseline-vs-skill benchmark runs — not shipping new core skills without promotion evidence.

reflective-prompt-library/plans/multi-agent-panel-consensus-2026-06-25.md

Lines changed: 40 additions & 2 deletions
@@ -2321,8 +2321,46 @@ User directive (repeat): review prompts, plans, skills, and Socratic/critical-th

 - `make all`: pytest + ROUTE-001/002/003 100%

-## Panel status (updated)
+---
+
+## Round 85 — Composable prompt Primary workflow surface preamble guards (2026-06-25)
+
+**Options FC–FF** | Six-lens panel (Opus, Codex, Gemini, Composer, Sakana, GLM)
+
+### Round 85 options
+
+| ID | Proposal | Verdict |
+| --- | --- | --- |
+| FC | `Primary workflow surface(s)` / Supporting-lens preamble guards in all composable `test_*_prompts_eval_harness.py` files | **Agree** |
+| FD | GLOSSARY playbook step 17 + governance sync | **Agree** |
+| FE | ROUTE holdout expansion | **Defer** |
+| FF | Router / tenth skill / benchmark CI | **Reject** |
+
+### Round 85 verdict table
+
+| ID | Option | Verdict | Action |
+| --- | --- | --- | --- |
+| FC | Composable preamble guards | **Agree** | mirror `test_thinking_prompts_eval_harness.py`; Supporting-lens exemption for `runtime-trust-boundary.md` |
+| FD | Playbook + docs | **Agree** | step 17; panel round 85 sync |
+| FE | Holdout expansion | **Defer** | maintenance |
+| FF | Router/tenth skill/benchmark CI | **Reject** | no change |
+
+**All roles agree.**
+
+## Implemented Changes (Round 85)

-**Resealed 2026-06-25** after **Round 84** (options EY–FB). `00-core` Primary workflow surface lines now match `CORE_SKILL_LINKS` exactly; full prompt library (`00-core`–`06-repo` + `01-thinking` graph parity) closed. Holdout expansion remains recurrence-gated maintenance.
+- `plans/tests/test_core_prompts_eval_harness.py`, `test_engineering_prompts_eval_harness.py`, `test_context_prompts_eval_harness.py`, `test_domain_prompts_eval_harness.py`, `test_repo_prompts_eval_harness.py`: Primary workflow surface preamble guard
+- `plans/tests/test_agent_prompts_eval_harness.py`: Primary vs Supporting-lens preamble guard (`runtime-trust-boundary.md` exemption)
+- `GLOSSARY.md`: playbook Rounds 1–85; step 17 for composable preamble guards
+- `QUALITY_GATES_SUMMARY.md`: preamble guard note; panel Rounds 1–85; 530+ pytest floor
+- `PROJECT_KNOWLEDGE.md`: Decision Index Round 85 entry
+- `README.md`, `reflective-prompt-library/README.md`, `test_readme_governance.py`: panel round 85 sync
+
+## Verification (Round 85)
+
+- `make all`: pytest + ROUTE-001/002/003 100%
+
+## Panel status (updated)

+**Resealed 2026-06-25** after **Round 85** (options FC–FF). Composable prompts now have eval_harness preamble guards matching thinking-lens pattern; full library parity (graph + preamble) closed. Holdout expansion remains recurrence-gated maintenance.
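Review note: the "panel round 85 sync" item touches several documents that each state the round range. The body of `test_readme_governance.py` is not part of this diff, so the following is only a sketch of how such a sync check could work, not the repository's implementation. The helper names `stated_rounds` and `out_of_sync` and the `stale.md` entry are hypothetical; the other two strings are excerpts from the hunks in this commit.

```python
import re

# Matches "Rounds 1–N" with an en dash, as written in the docs touched by this commit.
ROUND_RANGE = re.compile(r"Rounds 1\u2013(\d+)")


def stated_rounds(docs):
    """Map each document name to the set of upper bounds it states for 'Rounds 1–N'."""
    return {
        name: {int(n) for n in ROUND_RANGE.findall(text)}
        for name, text in docs.items()
    }


def out_of_sync(docs, expected):
    """Hypothetical helper: documents whose stated ranges are not exactly {expected}."""
    return sorted(
        name for name, rounds in stated_rounds(docs).items() if rounds != {expected}
    )


docs = {
    "README.md": "six-lens Socratic consensus (Rounds 1–85)",
    "GLOSSARY.md": "Ongoing upkeep after panel close (Rounds 1–85).",
    # Hypothetical stale document, for illustration only.
    "stale.md": "No open implementation blockers remain from panel Rounds 1–84",
}

assert out_of_sync(docs, expected=85) == ["stale.md"]
```

A document that states no range at all is also reported, since its set of bounds is empty rather than `{85}`.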

reflective-prompt-library/plans/tests/test_agent_prompts_eval_harness.py

Lines changed: 15 additions & 0 deletions
@@ -21,6 +21,7 @@
 )

 AGENT_PROMPTS = tuple(sorted(AGENT_DIR.glob("*.md")))
+SUPPORTING_LENS_AGENT_PROMPTS = frozenset({"runtime-trust-boundary.md"})


 @pytest.fixture(scope="module")
@@ -62,3 +63,17 @@ def test_agent_prompts_cover_agent_workflow_surfaces():
         "reflective-research",
     ):
         assert skill in text, f"04-agent should reference {skill}"
+
+def test_agent_prompts_have_workflow_surface_preamble_line():
+    """04-agent prompts use Primary workflow surface(s) or Supporting lens (trust boundary)."""
+    for prompt_path in AGENT_PROMPTS:
+        preamble = prompt_path.read_text(encoding="utf-8").split("```", 1)[0]
+        if prompt_path.name in SUPPORTING_LENS_AGENT_PROMPTS:
+            assert "Supporting lens for" in preamble, (
+                f"{prompt_path.name} Purpose should use Supporting lens for workflow skills"
+            )
+        else:
+            assert "Primary workflow surface" in preamble, (
+                f"{prompt_path.name} Purpose should list Primary workflow surface(s)"
+            )
+
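Review note: the guard in this file only inspects the text before the first code fence, and it swaps the expected marker for the exempted file. A minimal sketch of that logic on in-memory strings follows. The helper names `preamble_of` and `preamble_marker_ok` and the sample prompt bodies are hypothetical and do not appear in the commit; only the marker strings and the exemption set are taken from the diff.

```python
FENCE = "`" * 3  # three backticks, built this way to avoid a literal fence here

# Mirrors the exemption set added in test_agent_prompts_eval_harness.py.
SUPPORTING_LENS_AGENT_PROMPTS = frozenset({"runtime-trust-boundary.md"})


def preamble_of(text):
    """Return everything before the first code fence (the whole text if none)."""
    return text.split(FENCE, 1)[0]


def preamble_marker_ok(name, text):
    """Hypothetical helper: True when the preamble has the marker expected for `name`."""
    if name in SUPPORTING_LENS_AGENT_PROMPTS:
        marker = "Supporting lens for"
    else:
        marker = "Primary workflow surface"
    return marker in preamble_of(text)


# Hypothetical prompt bodies, for illustration only.
primary = "## Purpose\nPrimary workflow surfaces: reflective-research\n" + FENCE + "\nbody\n" + FENCE
buried = "## Purpose\nNo marker here\n" + FENCE + "\nPrimary workflow surface\n" + FENCE
supporting = "## Purpose\nSupporting lens for workflow skills\n" + FENCE + "\nbody\n" + FENCE

assert preamble_marker_ok("example.md", primary)
assert not preamble_marker_ok("example.md", buried)  # marker inside the fence does not count
assert preamble_marker_ok("runtime-trust-boundary.md", supporting)
assert not preamble_marker_ok("runtime-trust-boundary.md", primary)
```

Because `str.split` with `maxsplit=1` returns the whole string when the separator is absent, a prompt with no code fence is treated as all preamble.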

reflective-prompt-library/plans/tests/test_context_prompts_eval_harness.py

Lines changed: 9 additions & 0 deletions
@@ -61,3 +61,12 @@ def test_context_prompts_cover_context_workflow_surfaces():
         "reflective-research",
     ):
         assert skill in text, f"03-context should reference {skill}"
+
+def test_context_prompts_have_primary_workflow_surfaces_line():
+    """All 03-context prompts declare Primary workflow surface(s) in Purpose preambles."""
+    for prompt_path in CONTEXT_PROMPTS:
+        preamble = prompt_path.read_text(encoding="utf-8").split("```", 1)[0]
+        assert "Primary workflow surface" in preamble, (
+            f"{prompt_path.name} Purpose should list Primary workflow surface(s)"
+        )
+

reflective-prompt-library/plans/tests/test_core_prompts_eval_harness.py

Lines changed: 9 additions & 0 deletions
@@ -56,3 +56,12 @@ def test_core_prompts_cover_brief_and_dispatch():
     text = "\n".join(p.read_text(encoding="utf-8") for p in CORE_PROMPTS)
     assert "reflective-brief" in text
     assert "reflective-dispatch" in text
+
+
+def test_core_prompts_have_primary_workflow_surfaces_line():
+    """All 00-core prompts declare Primary workflow surface(s) in Purpose preambles."""
+    for prompt_path in CORE_PROMPTS:
+        preamble = prompt_path.read_text(encoding="utf-8").split("```", 1)[0]
+        assert "Primary workflow surface" in preamble, (
+            f"{prompt_path.name} Purpose should list Primary workflow surface(s)"
+        )
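Review note: each guard asserts inside a `for` loop, so a run stops at the first prompt that lacks the line. If a full list of offenders is more useful when several prompts drift at once, one option is to collect names first and assert once. This is a sketch under that assumption, not a change proposed by the commit; `prompts_missing_marker` is a hypothetical helper and the temporary directory stands in for a real prompt folder.

```python
import tempfile
from pathlib import Path

FENCE = "`" * 3  # three backticks, built this way to avoid a literal fence here


def prompts_missing_marker(prompt_dir, marker="Primary workflow surface"):
    """Hypothetical helper: names of *.md files whose pre-fence preamble lacks `marker`."""
    missing = []
    for prompt_path in sorted(prompt_dir.glob("*.md")):
        preamble = prompt_path.read_text(encoding="utf-8").split(FENCE, 1)[0]
        if marker not in preamble:
            missing.append(prompt_path.name)
    return missing


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Hypothetical prompt files, for illustration only.
    (root / "good.md").write_text(
        "Primary workflow surface: reflective-brief\n", encoding="utf-8"
    )
    (root / "bad-a.md").write_text("## Purpose\n", encoding="utf-8")
    (root / "bad-b.md").write_text(
        "## Purpose\n" + FENCE + "\nPrimary workflow surface\n", encoding="utf-8"
    )
    offenders = prompts_missing_marker(root)

# A single assertion on the collected list reports every failing file together.
assert offenders == ["bad-a.md", "bad-b.md"]
```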

reflective-prompt-library/plans/tests/test_domain_prompts_eval_harness.py

Lines changed: 9 additions & 0 deletions
@@ -68,3 +68,12 @@ def test_high_risk_prompt_has_human_review_section():
     text = (DOMAIN_DIR / "high-risk.md").read_text(encoding="utf-8")
     preamble = text.split("```", 1)[0]
     assert "## Human Review" in preamble, "high-risk.md preamble should include Human Review"
+
+def test_domain_prompts_have_primary_workflow_surfaces_line():
+    """All 05-domain prompts declare Primary workflow surface(s) in Purpose preambles."""
+    for prompt_path in DOMAIN_PROMPTS:
+        preamble = prompt_path.read_text(encoding="utf-8").split("```", 1)[0]
+        assert "Primary workflow surface" in preamble, (
+            f"{prompt_path.name} Purpose should list Primary workflow surface(s)"
+        )
+
