Commit c589e50

Round 84: 00-core Primary workflow surface parity
Extend strict Primary workflow surface(s) ↔ CORE_SKILL_LINKS parity to 00-core prompts; trim global-controller and important-task-full primary lines; add CORE_THINKING_LINKS pytest guards; sync governance (panel record, GLOSSARY step 16, QUALITY_GATES 520+ pytest floor).
1 parent c9808e8 commit c589e50

11 files changed

Lines changed: 140 additions & 13 deletions

README.md

Lines changed: 1 addition & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -21,7 +21,7 @@ Full library docs: [reflective-prompt-library/README.md](reflective-prompt-libra
2121
## Governance
2222

2323
- **Contributing:** [CONTRIBUTING.md](CONTRIBUTING.md) — quality gates, routing maintenance (R8–R12), `make all`
24-
- **Panel record:** [multi-agent-panel-consensus](reflective-prompt-library/plans/multi-agent-panel-consensus-2026-06-25.md) — six-lens Socratic consensus (Rounds 1–83)
24+
- **Panel record:** [multi-agent-panel-consensus](reflective-prompt-library/plans/multi-agent-panel-consensus-2026-06-25.md) — six-lens Socratic consensus (Rounds 1–84)
2525
- **Operator playbook:** [GLOSSARY.md](reflective-prompt-library/GLOSSARY.md) — Governance Maintenance Playbook
2626

2727
The repository contains:

reflective-prompt-library/00-core/global-controller.md

Lines changed: 2 additions & 1 deletion
@@ -4,11 +4,12 @@ Use this as the total controller prompt for ongoing conversations.
44

55
## Purpose
66

7-
Persistent controller instruction for ongoing conversations. Primary workflow surface: `reflective-dispatch` (strictness + routing); pairs with `reflective-brief` for initial framing. Links to `01-thinking/why-what-how-done.md`.
7+
Persistent controller instruction for ongoing conversations. Primary workflow surface: `reflective-dispatch` (strictness + routing). Links to `01-thinking/why-what-how-done.md`.
88

99
## Scope
1010

1111
- In scope: conversation-wide gates, task-type classification, recommended mode, validation habits.
12+
- Pair with: `reflective-brief` for initial framing before dispatch.
1213
- Out of scope: replacing skill contracts (`skills/*/SKILL.md`), autonomous runtime orchestration.
1314

1415
## Acceptance Criteria

reflective-prompt-library/00-core/important-task-full.md

Lines changed: 2 additions & 1 deletion
@@ -4,11 +4,12 @@ Use this for important tasks that need stronger reflection, critique, and planni
44

55
## Purpose
66

7-
High-rigor reflection for important decisions. Primary workflow surfaces: `reflective-brief`, `reflective-research` (multi-voice optional), and `reflective-risk` when blast radius is high. Pairs with `01-thinking/socratic-reviewer.md` and `01-thinking/critical-thinking-check.md`.
7+
High-rigor reflection for important decisions. Primary workflow surfaces: `reflective-brief`, `reflective-research` (multi-voice optional). Pairs with `01-thinking/socratic-reviewer.md` and `01-thinking/critical-thinking-check.md`.
88

99
## Scope
1010

1111
- In scope: Socratic audit, counterargument, fallacy scan, cost analysis, three-tier options, Human Review triggers.
12+
- Escalate: `reflective-risk` when blast radius is high.
1213
- Out of scope: silent execution without explicit acceptance criteria.
1314

1415
## Acceptance Criteria

reflective-prompt-library/GLOSSARY.md

Lines changed: 2 additions & 1 deletion
@@ -337,7 +337,7 @@ Curated top-of-cheatsheet summary of high-confusion routing traps (ROUTE-002 hol
337337

338338
## Governance Maintenance Playbook / 治理維護手冊
339339

340-
Ongoing upkeep after panel close (Rounds 1–83). Not agent instructions — operator checklist.
340+
Ongoing upkeep after panel close (Rounds 1–84). Not agent instructions — operator checklist.
341341

342342
**Operational test:** Before router tuning, add fresh ROUTE-002/003 holdout phrases; run `make all`; record decisions in `PROJECT_KNOWLEDGE.md` Decision Index when governance surface changes.
343343

@@ -356,3 +356,4 @@ Ongoing upkeep after panel close (Rounds 1–83). Not agent instructions — ope
356356
13. When editing workflow skill Escalation bullets, cite only frozen `reflective-*` skills; run `test_skill_module_contract.py` escalation route guard.
357357
14. When editing `01-thinking/` Purpose preambles, keep `Primary workflow surfaces` aligned exactly with `SKILL_THINKING_SOURCES` via `test_thinking_lens_primary_surfaces_match_consumer_graph`; put escalations and adjacent workflow notes in Scope or Human Review, not on the primary line.
358358
15. When editing composable prompts (`02-engineering`–`06-repo`), keep `Primary workflow surface(s)` aligned with `*_SKILL_LINKS` in `test_prompt_cross_links.py`; use Supporting lens for cross-cutting lenses like `runtime-trust-boundary.md`; put escalate/pair notes in Scope.
359+
16. When editing `00-core/` prompts, keep `Primary workflow surface(s)` aligned with `CORE_SKILL_LINKS` in `test_prompt_cross_links.py`; put pair/escalation skills in Scope or Human Review, not on the primary line.
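
Playbook step 16 can be illustrated with a minimal sketch of the parity rule. The regexes, the `primary_skills` helper, and the `EXPECTED` dict are assumptions made for illustration; they are not the repository's actual `_primary_workflow_surfaces_skills` helper, which this diff does not show. The sketch reads only the sentence that starts with `Primary workflow surface(s):` and compares the skills named there with the expected tuple.

```python
import re

# Hypothetical mirror of two CORE_SKILL_LINKS entries from this commit.
EXPECTED = {
    "global-controller.md": ("reflective-dispatch",),
    "important-task-full.md": ("reflective-brief", "reflective-research"),
}

# Assumption: the primary sentence runs from the label to the first
# sentence-ending period (a period followed by whitespace or end of text).
_PRIMARY = re.compile(r"Primary workflow surfaces?:(.*?)(?:\.\s|\.$|$)", re.S)
_SKILL = re.compile(r"`(reflective-[a-z-]+)`")


def primary_skills(purpose: str) -> tuple[str, ...]:
    """Return the sorted skills named in the primary-surface sentence only."""
    match = _PRIMARY.search(purpose)
    if match is None:
        return ()
    return tuple(sorted(set(_SKILL.findall(match.group(1)))))


old_line = (
    "Primary workflow surface: `reflective-dispatch` (strictness + routing); "
    "pairs with `reflective-brief` for initial framing. "
    "Links to `01-thinking/why-what-how-done.md`."
)
new_line = (
    "Primary workflow surface: `reflective-dispatch` (strictness + routing). "
    "Links to `01-thinking/why-what-how-done.md`."
)

# The untrimmed line lists an extra skill, so strict parity fails for it.
print(primary_skills(old_line) == EXPECTED["global-controller.md"])
print(primary_skills(new_line) == EXPECTED["global-controller.md"])
```

Under these assumptions the pre-commit `global-controller.md` line fails the comparison because `reflective-brief` sits on the primary line, while the trimmed line passes once the pairing note moves to Scope.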

reflective-prompt-library/PROJECT_KNOWLEDGE.md

Lines changed: 1 addition & 0 deletions
@@ -72,6 +72,7 @@ deferred promotions are recurrence-gated — see [panel backlog](plans/multi-age
7272

7373
## Decision Index
7474

75+
- 2026-06-25 Round 84 panel — `00-core` Primary workflow surface parity + primary-line trim → [record](plans/multi-agent-panel-consensus-2026-06-25.md)
7576
- 2026-06-25 Round 83 panel — composable prompt Primary workflow surface parity (`02-engineering`–`06-repo`) + supporting-lens exemption → [record](plans/multi-agent-panel-consensus-2026-06-25.md)
7677
> Pointers to the causal trail — plans, reflections, tests, commits. Detail is
7778
> not duplicated here; this is a map, not an archive.

reflective-prompt-library/README.md

Lines changed: 1 addition & 1 deletion
@@ -30,7 +30,7 @@ Pick **Strictness L1–L6** first (`skills/reflective-dispatch/SKILL.md`, [GLOSS
3030

3131
## Governance Panel Record
3232

33-
Multi-agent Socratic consensus on project goals and the nine skills (Rounds 1–83, options A–EU) is recorded in [plans/multi-agent-panel-consensus-2026-06-25.md](plans/multi-agent-panel-consensus-2026-06-25.md). Run `make all` before claiming routing or governance changes are verified.
33+
Multi-agent Socratic consensus on project goals and the nine skills (Rounds 1–84, options A–FB) is recorded in [plans/multi-agent-panel-consensus-2026-06-25.md](plans/multi-agent-panel-consensus-2026-06-25.md). Run `make all` before claiming routing or governance changes are verified.
3434

3535
## Directory Map
3636

reflective-prompt-library/plans/QUALITY_GATES_SUMMARY.md

Lines changed: 2 additions & 2 deletions
@@ -314,7 +314,7 @@ ROUTE-002 measures unseen phrasing separately from ROUTE-001. Round 7 (2026-06-2
314314
2. **ROUTE-001/002/003 in CI** — 128 + 102 + 53 paraphrases at 100% consistency (seeded fixtures); `validate_route_fixture.py` gates minimum coverage
315315
3. **Governance validators** — links, lint, governance metadata, PROJECT_KNOWLEDGE, benchmark fixture, skill examples
316316
4. **Harness policy docs** — CONTRIBUTING, AGENTS, SKILL_INSTALLATION, maintenance playbook
317-
5. **Doc anti-drift**: `test_routing_contract.py`, cheatsheet parity tests, `test_readme_governance.py`, `test_thinking_prompts_eval_harness.py`, `test_engineering_prompts_eval_harness.py`, `test_prompt_cross_links.py`, `test_core_prompts_eval_harness.py`, `test_agent_prompts_eval_harness.py`, `test_context_prompts_eval_harness.py`, `test_domain_prompts_eval_harness.py`, `test_repo_prompts_eval_harness.py`, `test_validate_governance.py`, `test_validate_links.py`, `test_lint_skills.py`, `test_skill_module_contract.py` (Escalation subsection + Trigger/Methods/Output/Never; 500+ pytest anti-drift suite in CI); reciprocal thinking-lens ↔ skill checks and composable `Primary workflow surface(s)` ↔ `*_SKILL_LINKS` parity in `test_prompt_cross_links.py` (including strict Primary workflow surfaces parity via `test_thinking_lens_primary_surfaces_match_consumer_graph`); Human Review + Escalation route-target guards in thinking/skill contract tests
317+
5. **Doc anti-drift**: `test_routing_contract.py`, cheatsheet parity tests, `test_readme_governance.py`, `test_thinking_prompts_eval_harness.py`, `test_engineering_prompts_eval_harness.py`, `test_prompt_cross_links.py`, `test_core_prompts_eval_harness.py`, `test_agent_prompts_eval_harness.py`, `test_context_prompts_eval_harness.py`, `test_domain_prompts_eval_harness.py`, `test_repo_prompts_eval_harness.py`, `test_validate_governance.py`, `test_validate_links.py`, `test_lint_skills.py`, `test_skill_module_contract.py` (Escalation subsection + Trigger/Methods/Output/Never; 520+ pytest anti-drift suite in CI); reciprocal thinking-lens ↔ skill checks and `00-core` + composable `Primary workflow surface(s)` ↔ `*_SKILL_LINKS` parity in `test_prompt_cross_links.py` (including strict Primary workflow surfaces parity via `test_thinking_lens_primary_surfaces_match_consumer_graph`); Human Review + Escalation route-target guards in thinking/skill contract tests
318318

319319
### Ongoing maintenance (not blockers)
320320

@@ -384,4 +384,4 @@ Phase 1 quality-gate tooling and documentation are **complete**. Routing consist
384384
- ✅ Benchmark fixture gate plus optional manual benchmark runs
385385
- ✅ Research-backed design decisions
386386

387-
The project is positioned to grow sustainably with quality discipline built in from the start. **No open implementation blockers** remain from panel Rounds 1–83; work is recurrence-gated maintenance per playbook. The next measurable quality target is **holdout expansion before router tuning** and optional manual baseline-vs-skill benchmark runs — not shipping new core skills without promotion evidence.
387+
The project is positioned to grow sustainably with quality discipline built in from the start. **No open implementation blockers** remain from panel Rounds 1–84; work is recurrence-gated maintenance per playbook. The next measurable quality target is **holdout expansion before router tuning** and optional manual baseline-vs-skill benchmark runs — not shipping new core skills without promotion evidence.

reflective-prompt-library/plans/multi-agent-panel-consensus-2026-06-25.md

Lines changed: 44 additions & 0 deletions
@@ -2282,3 +2282,47 @@ User directive (repeat): review prompts, plans, skills, and Socratic/critical-th
22822282

22832283
**Resealed 2026-06-25** after **Round 83** (options EU–EX). Composable prompts (`02-engineering`–`06-repo`) Primary workflow surface lines now match `*_SKILL_LINKS` exactly; supporting-lens pattern documented for `runtime-trust-boundary.md`. Holdout expansion remains recurrence-gated maintenance.
22842284

2285+
---
2286+
2287+
## Round 84 — Core prompt Primary workflow surface parity (2026-06-25)
2288+
2289+
**Options EY–FB** | Six-lens panel (Opus, Codex, Gemini, Composer, Sakana, GLM)
2290+
2291+
### Round 84 options
2292+
2293+
| ID | Proposal | Verdict |
2294+
| --- | --- | --- |
2295+
| EY | Strict `Primary workflow surface(s)` ↔ `CORE_SKILL_LINKS` parity for `00-core` + pytest | **Agree** |
2296+
| EZ | Trim overlisted primary skills (`global-controller`, `important-task-full`) | **Agree** |
2297+
| FA | ROUTE holdout expansion | **Defer** |
2298+
| FB | Router / tenth skill / benchmark CI | **Reject** |
2299+
2300+
### Round 84 verdict table
2301+
2302+
| ID | Option | Verdict | Action |
2303+
| --- | --- | --- | --- |
2304+
| EY | Core primary-surface parity | **Agree** | `CORE_SKILL_LINKS` + `CORE_THINKING_LINKS` + primary tests |
2305+
| EZ | Primary-line trim | **Agree** | Move brief pairing and risk escalation to Scope |
2306+
| FA | Holdout expansion | **Defer** | maintenance |
2307+
| FB | Router/tenth skill/benchmark CI | **Reject** | no change |
2308+
2309+
**All roles agree.**
2310+
2311+
## Implemented Changes (Round 84)
2312+
2313+
- `00-core/global-controller.md`, `00-core/important-task-full.md`: Primary lines trimmed; adjacent/escalation skills in Scope
2314+
- `plans/tests/test_prompt_cross_links.py`: `CORE_SKILL_LINKS`, `CORE_THINKING_LINKS`, core primary-surface parity tests
2315+
- `GLOSSARY.md`: playbook Rounds 1–84; step 16 for `00-core` primary-surface parity
2316+
- `QUALITY_GATES_SUMMARY.md`: core primary-surface parity note; panel Rounds 1–84; 520+ pytest floor
2317+
- `PROJECT_KNOWLEDGE.md`: Decision Index Round 84 entry
2318+
- `README.md`, `reflective-prompt-library/README.md`, `test_readme_governance.py`: panel round 84 sync
2319+
2320+
## Verification (Round 84)
2321+
2322+
- `make all`: pytest + ROUTE-001/002/003 100%
2323+
2324+
## Panel status (updated)
2325+
2326+
**Resealed 2026-06-25** after **Round 84** (options EY–FB). `00-core` Primary workflow surface lines now match `CORE_SKILL_LINKS` exactly; full prompt library (`00-core`–`06-repo` + `01-thinking` graph parity) closed. Holdout expansion remains recurrence-gated maintenance.
2327+
2328+

reflective-prompt-library/plans/tests/test_glossary_structure.py

Lines changed: 3 additions & 3 deletions
@@ -30,10 +30,10 @@ def test_round_boundary_terms_present(glossary_text: str):
3030
assert heading in glossary_text, f"missing glossary section: {heading}"
3131

3232

33-
def test_maintenance_playbook_references_round_83(glossary_text: str):
33+
def test_maintenance_playbook_references_round_84(glossary_text: str):
3434
playbook = glossary_text.split("## Governance Maintenance Playbook", 1)[1]
35-
assert "Rounds 1–83" in playbook or "Rounds 1-82" in playbook
36-
assert "Rounds 1–82" not in playbook and "Rounds 1-81" not in playbook
35+
assert "Rounds 1–84" in playbook or "Rounds 1-83" in playbook
36+
assert "Rounds 1–83" not in playbook and "Rounds 1-82" not in playbook
3737

3838

3939
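
Review note on the hunk above: the assertions accept the en-dash form for the current round (`Rounds 1–84`) but the ASCII-hyphen form for the previous one (`Rounds 1-83`), and the stale check is offset in the same way. That may be intentional, but it means a playbook that says `Rounds 1-83` would satisfy the first assertion. A minimal sketch of one alternative, assuming a single `LATEST_ROUND` constant (hypothetical, not in the repository), derives both spellings from one number:

```python
LATEST_ROUND = 84  # assumption: the most recent sealed panel round


def round_markers(latest: int) -> tuple[set[str], set[str]]:
    """Return (accepted, stale) 'Rounds 1..N' markers in both dash spellings."""

    def forms(n: int) -> set[str]:
        # En-dash and ASCII-hyphen variants of the same round number.
        return {f"Rounds 1–{n}", f"Rounds 1-{n}"}

    return forms(latest), forms(latest - 1)


def playbook_is_current(playbook: str, latest: int = LATEST_ROUND) -> bool:
    """True when the text cites the latest round and not the previous one."""
    accepted, stale = round_markers(latest)
    cites_latest = any(marker in playbook for marker in accepted)
    cites_stale = any(marker in playbook for marker in stale)
    return cites_latest and not cites_stale


print(playbook_is_current("Ongoing upkeep after panel close (Rounds 1–84)."))
print(playbook_is_current("Ongoing upkeep after panel close (Rounds 1-83)."))
```

With this shape, bumping the round means editing one constant rather than four string literals and a function name.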

reflective-prompt-library/plans/tests/test_prompt_cross_links.py

Lines changed: 80 additions & 1 deletion
@@ -1,11 +1,12 @@
1-
"""Anti-drift: thinking lenses, engineering/agent/context/domain/repo prompts, and workflow skills cross-link."""
1+
"""Anti-drift: thinking lenses, core/engineering/agent/context/domain/repo prompts, and workflow skills cross-link."""
22

33
import re
44
from pathlib import Path
55

66
import pytest
77

88
LIBRARY_ROOT = Path(__file__).parent.parent.parent
9+
CORE_DIR = LIBRARY_ROOT / "00-core"
910
THINKING_DIR = LIBRARY_ROOT / "01-thinking"
1011
ENGINEERING_DIR = LIBRARY_ROOT / "02-engineering"
1112
AGENT_DIR = LIBRARY_ROOT / "04-agent"
@@ -14,6 +15,42 @@
1415
REPO_DIR = LIBRARY_ROOT / "06-repo"
1516
SKILLS_DIR = LIBRARY_ROOT / "skills"
1617

18+
CORE_THINKING_LINKS: dict[str, tuple[str, ...]] = {
19+
"core-full.md": (
20+
"01-thinking/why-what-how-done.md",
21+
"01-thinking/critical-thinking-check.md",
22+
),
23+
"core-minimal.md": ("01-thinking/why-what-how-done.md",),
24+
"core-short.md": ("01-thinking/why-what-how-done.md",),
25+
"custom-instruction-en.md": (),
26+
"custom-instruction-zh.md": (),
27+
"daily-minimal.md": (
28+
"01-thinking/falsifiability.md",
29+
"01-thinking/why-what-how-done.md",
30+
),
31+
"global-controller.md": ("01-thinking/why-what-how-done.md",),
32+
"important-task-full.md": (
33+
"01-thinking/socratic-reviewer.md",
34+
"01-thinking/critical-thinking-check.md",
35+
),
36+
"master-prompt.md": (
37+
"01-thinking/socratic-reviewer.md",
38+
"01-thinking/critical-thinking-check.md",
39+
),
40+
}
41+
42+
CORE_SKILL_LINKS: dict[str, tuple[str, ...]] = {
43+
"core-full.md": ("reflective-brief", "reflective-dispatch"),
44+
"core-minimal.md": ("reflective-brief",),
45+
"core-short.md": ("reflective-brief", "reflective-dispatch"),
46+
"custom-instruction-en.md": ("reflective-brief",),
47+
"custom-instruction-zh.md": ("reflective-brief",),
48+
"daily-minimal.md": ("reflective-brief",),
49+
"global-controller.md": ("reflective-dispatch",),
50+
"important-task-full.md": ("reflective-brief", "reflective-research"),
51+
"master-prompt.md": ("reflective-brief", "reflective-dispatch"),
52+
}
53+
1754
ENGINEERING_THINKING_LINKS: dict[str, tuple[str, ...]] = {
1855
"task-start.md": (
1956
"01-thinking/why-what-how-done.md",
@@ -159,6 +196,7 @@
159196
}
160197

161198

199+
CORE_PROMPTS = tuple(sorted(CORE_DIR.glob("*.md")))
162200
THINKING_PROMPTS = tuple(sorted(THINKING_DIR.glob("*.md")))
163201
ENGINEERING_PROMPTS = tuple(sorted(ENGINEERING_DIR.glob("*.md")))
164202
AGENT_PROMPTS = tuple(sorted(AGENT_DIR.glob("*.md")))
@@ -277,6 +315,47 @@ def _prompt_sources_section(skill_path: Path) -> str:
277315
return text.split(marker, 1)[1].split("##", 1)[0]
278316

279317

318+
319+
@pytest.mark.parametrize("prompt_name,thinking_refs", CORE_THINKING_LINKS.items())
320+
def test_core_prompt_links_thinking_lens(prompt_name: str, thinking_refs: tuple[str, ...]):
321+
path = CORE_DIR / prompt_name
322+
preamble = _preamble(path)
323+
for ref in thinking_refs:
324+
assert ref in preamble, f"{prompt_name} preamble should reference {ref}"
325+
326+
327+
def test_all_core_prompts_have_thinking_cross_link():
328+
assert set(CORE_THINKING_LINKS) == {p.name for p in CORE_PROMPTS}
329+
330+
331+
def test_all_core_prompts_have_skill_link():
332+
assert set(CORE_SKILL_LINKS) == {p.name for p in CORE_PROMPTS}
333+
334+
335+
@pytest.mark.parametrize("prompt_name,skill_refs", CORE_SKILL_LINKS.items())
336+
def test_core_prompt_maps_workflow_skill(prompt_name: str, skill_refs: tuple[str, ...]):
337+
preamble = _preamble(CORE_DIR / prompt_name)
338+
for skill in skill_refs:
339+
assert skill in preamble, f"{prompt_name} preamble should reference {skill}"
340+
341+
342+
@pytest.mark.parametrize("prompt_name,skill_refs", CORE_SKILL_LINKS.items())
343+
def test_core_prompt_primary_surfaces_match_skill_links(
344+
prompt_name: str, skill_refs: tuple[str, ...]
345+
):
346+
preamble = _preamble(CORE_DIR / prompt_name)
347+
listed = _primary_workflow_surfaces_skills(preamble)
348+
assert listed == tuple(sorted(skill_refs)), (
349+
f"{prompt_name} Primary workflow surfaces {listed} != {skill_refs}"
350+
)
351+
352+
353+
def test_thinking_lens_files_exist_for_core_links():
354+
linked = {ref for refs in CORE_THINKING_LINKS.values() for ref in refs}
355+
for ref in linked:
356+
assert (LIBRARY_ROOT / ref).is_file(), f"missing thinking lens file {ref}"
357+
358+
280359
@pytest.mark.parametrize("prompt_name,thinking_refs", ENGINEERING_THINKING_LINKS.items())
281360
def test_engineering_prompt_links_thinking_lens(prompt_name: str, thinking_refs: tuple[str, ...]):
282361
path = ENGINEERING_DIR / prompt_name
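
The two set-equality tests added in this file (`test_all_core_prompts_have_thinking_cross_link` and `test_all_core_prompts_have_skill_link`) guard both directions of drift: a prompt file with no mapping entry, and a mapping entry with no file. The sketch below shows that idea in isolation. The `coverage_drift` helper, the temporary directory, and the `new-prompt.md` and `retired-prompt.md` names are hypothetical and exist only for the demonstration.

```python
import tempfile
from pathlib import Path


def coverage_drift(
    mapping: dict[str, tuple[str, ...]], prompt_dir: Path
) -> tuple[set[str], set[str]]:
    """Return (files without a mapping entry, mapping entries without a file)."""
    on_disk = {p.name for p in prompt_dir.glob("*.md")}
    mapped = set(mapping)
    return on_disk - mapped, mapped - on_disk


with tempfile.TemporaryDirectory() as tmp:
    core = Path(tmp)
    # Hypothetical directory contents: two known prompts plus one new file.
    for name in ("core-full.md", "global-controller.md", "new-prompt.md"):
        (core / name).write_text("## Purpose\n", encoding="utf-8")

    links = {
        "core-full.md": ("reflective-brief", "reflective-dispatch"),
        "global-controller.md": ("reflective-dispatch",),
        "retired-prompt.md": ("reflective-brief",),  # entry with no file
    }
    unmapped, missing = coverage_drift(links, core)

print(sorted(unmapped), sorted(missing))
```

Returning the two differences separately, rather than asserting plain set equality, would make a failure message say which side drifted; the tests in the diff use the simpler equality form.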
