Commit 0f09da3

Round 81: thinking-lens Human Review + Escalation route guards
- Add Human Review preambles to socratic-reviewer and why-what-how-done
- Guard all 01-thinking lenses with Human Review pytest
- Validate Escalation cites only CORE_SKILLS (reflective-risk terminal exempt)
- Reseal panel, governance docs, GLOSSARY steps 12-13; pytest floor 450+
1 parent a48a8ba commit 0f09da3

12 files changed

Lines changed: 114 additions & 11 deletions

README.md

Lines changed: 1 addition & 1 deletion
@@ -21,7 +21,7 @@ Full library docs: [reflective-prompt-library/README.md](reflective-prompt-libra
 ## Governance

 - **Contributing:** [CONTRIBUTING.md](CONTRIBUTING.md) — quality gates, routing maintenance (R8–R12), `make all`
-- **Panel record:** [multi-agent-panel-consensus](reflective-prompt-library/plans/multi-agent-panel-consensus-2026-06-25.md) — six-lens Socratic consensus (Rounds 1–80)
+- **Panel record:** [multi-agent-panel-consensus](reflective-prompt-library/plans/multi-agent-panel-consensus-2026-06-25.md) — six-lens Socratic consensus (Rounds 1–81)
 - **Operator playbook:** [GLOSSARY.md](reflective-prompt-library/GLOSSARY.md) — Governance Maintenance Playbook

 The repository contains:

reflective-prompt-library/01-thinking/socratic-reviewer.md

Lines changed: 4 additions & 0 deletions
@@ -22,6 +22,10 @@ Clarify the real question before choosing a direction. Primary workflow surfaces

 If the session cannot name what evidence would prove the current framing wrong, stop and return to Clarify instead of recommending action.

+## Human Review
+
+Escalate to `reflective-risk` with an explicit Human Review gate when the work implies irreversible or high-blast-radius action.
+
 ```markdown
 你是 Socratic Questioner。你的目標不是立刻給答案,而是幫我逼近真正問題。

reflective-prompt-library/01-thinking/why-what-how-done.md

Lines changed: 4 additions & 0 deletions
@@ -22,6 +22,10 @@ Gate a task through Why / What / How / Done before choosing strictness or workfl

 Done gate must name evidence that would prove the task should not proceed or should be rolled back.

+## Human Review
+
+Escalate to `reflective-risk` with an explicit Human Review gate when the work implies irreversible or high-blast-radius action.
+
 ```markdown
 請把任務通過 Why / What / How / Done 四層檢查。

reflective-prompt-library/GLOSSARY.md

Lines changed: 3 additions & 2 deletions
@@ -337,7 +337,7 @@ Curated top-of-cheatsheet summary of high-confusion routing traps (ROUTE-002 hol

 ## Governance Maintenance Playbook / 治理維護手冊

-Ongoing upkeep after panel close (Rounds 1–80). Not agent instructions — operator checklist.
+Ongoing upkeep after panel close (Rounds 1–81). Not agent instructions — operator checklist.

 **Operational test:** Before router tuning, add fresh ROUTE-002/003 holdout phrases; run `make all`; record decisions in `PROJECT_KNOWLEDGE.md` Decision Index when governance surface changes.

@@ -352,4 +352,5 @@ Ongoing upkeep after panel close (Rounds 1–80). Not agent instructions — ope
 9. When adding benchmark golden tasks, keep `test_benchmark_covers_all_nine_workflows` green and bump `MIN_TASK_COUNT` in `validate_benchmark_fixture.py` if the floor rises.
 10. When changing thinking-lens ↔ skill cross-links, update `SKILL_THINKING_SOURCES` and consumer lists in `01-thinking/` Purpose preambles; run `test_prompt_cross_links.py` (including reciprocal `THINKING_LENS_SKILL_CONSUMERS`).
 11. When changing Module Contract subsections on workflow skills, keep `Escalation:` present and run `test_skill_module_contract.py`.
-
+12. When adding or editing `01-thinking/` lenses, keep `## Human Review` in the preamble (routes to `reflective-risk`) and run `test_thinking_prompts_eval_harness.py`.
+13. When editing workflow skill Escalation bullets, cite only frozen `reflective-*` skills; run `test_skill_module_contract.py` escalation route guard.
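
Step 12 could be enforced by a check along the following lines. This is a minimal sketch under stated assumptions, not the contents of `test_thinking_prompts_eval_harness.py`: the helper name `has_human_review_preamble` is hypothetical, and treating everything before the first code fence as the preamble is an assumption drawn from how the two lens files in this commit are laid out.

```python
import re

FENCE = "`" * 3  # assumed marker where the prompt body of a lens file starts
HEADING = "## Human Review"
HEADING_RE = re.compile(r"^" + re.escape(HEADING) + r"\s*$", re.MULTILINE)


def has_human_review_preamble(lens_text: str) -> bool:
    """Return True if the preamble has a Human Review section citing reflective-risk."""
    # Assumption: the preamble is everything before the first fence.
    preamble = lens_text.split(FENCE, 1)[0]
    match = HEADING_RE.search(preamble)
    if match is None:
        return False
    # The section body runs until the next level-2 heading or the end of the preamble.
    body = re.split(r"^## ", preamble[match.end():], maxsplit=1, flags=re.MULTILINE)[0]
    return "`reflective-risk`" in body


# Hypothetical lens file used only to exercise the helper.
sample = "\n".join([
    "## Purpose",
    "",
    "Clarify the real question before choosing a direction.",
    "",
    "## Human Review",
    "",
    "Escalate to `reflective-risk` with an explicit Human Review gate.",
    "",
    FENCE + "markdown",
    "prompt body",
    FENCE,
])

print(has_human_review_preamble(sample))
```

A heading that appears only inside the fenced prompt body does not count under this sketch, which matches the playbook wording that the section belongs in the preamble.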

reflective-prompt-library/PROJECT_KNOWLEDGE.md

Lines changed: 1 addition & 0 deletions
@@ -75,6 +75,7 @@ deferred promotions are recurrence-gated — see [panel backlog](plans/multi-age
 > Pointers to the causal trail — plans, reflections, tests, commits. Detail is
 > not duplicated here; this is a map, not an archive.

+- 2026-06-25 Round 81 panel — thinking-lens Human Review preamble guards + Escalation route-target anti-drift → [record](plans/multi-agent-panel-consensus-2026-06-25.md)
 - 2026-06-25 Round 80 panel — Module Contract Escalation anti-drift + thinking-lens preamble consumer guards → [record](plans/multi-agent-panel-consensus-2026-06-25.md)
 - 2026-06-25 Round 79 panel — bidirectional thinking-lens ↔ workflow skill preamble cross-links + reciprocal pytest → [record](plans/multi-agent-panel-consensus-2026-06-25.md)
 - 2026-06-25 Round 78 panel — complete nine-skill thinking-lens cross-links + Module Contract anti-drift → [record](plans/multi-agent-panel-consensus-2026-06-25.md)

reflective-prompt-library/README.md

Lines changed: 1 addition & 1 deletion
@@ -30,7 +30,7 @@ Pick **Strictness L1–L6** first (`skills/reflective-dispatch/SKILL.md`, [GLOSS

 ## Governance Panel Record

-Multi-agent Socratic consensus on project goals and the nine skills (Rounds 1–80, options A–EL) is recorded in [plans/multi-agent-panel-consensus-2026-06-25.md](plans/multi-agent-panel-consensus-2026-06-25.md). Run `make all` before claiming routing or governance changes are verified.
+Multi-agent Socratic consensus on project goals and the nine skills (Rounds 1–81, options A–EP) is recorded in [plans/multi-agent-panel-consensus-2026-06-25.md](plans/multi-agent-panel-consensus-2026-06-25.md). Run `make all` before claiming routing or governance changes are verified.

 ## Directory Map

reflective-prompt-library/plans/QUALITY_GATES_SUMMARY.md

Lines changed: 2 additions & 2 deletions
@@ -314,7 +314,7 @@ ROUTE-002 measures unseen phrasing separately from ROUTE-001. Round 7 (2026-06-2
 2. **ROUTE-001/002/003 in CI** — 128 + 102 + 53 paraphrases at 100% consistency (seeded fixtures); `validate_route_fixture.py` gates minimum coverage
 3. **Governance validators** — links, lint, governance metadata, PROJECT_KNOWLEDGE, benchmark fixture, skill examples
 4. **Harness policy docs** — CONTRIBUTING, AGENTS, SKILL_INSTALLATION, maintenance playbook
-5. **Doc anti-drift**`test_routing_contract.py`, cheatsheet parity tests, `test_readme_governance.py`, `test_thinking_prompts_eval_harness.py`, `test_engineering_prompts_eval_harness.py`, `test_prompt_cross_links.py`, `test_core_prompts_eval_harness.py`, `test_agent_prompts_eval_harness.py`, `test_context_prompts_eval_harness.py`, `test_domain_prompts_eval_harness.py`, `test_repo_prompts_eval_harness.py`, `test_validate_governance.py`, `test_validate_links.py`, `test_lint_skills.py`, `test_skill_module_contract.py` (Escalation subsection + Trigger/Methods/Output/Never; 440+ pytest anti-drift suite in CI); reciprocal thinking-lens ↔ skill checks in `test_prompt_cross_links.py`
+5. **Doc anti-drift**`test_routing_contract.py`, cheatsheet parity tests, `test_readme_governance.py`, `test_thinking_prompts_eval_harness.py`, `test_engineering_prompts_eval_harness.py`, `test_prompt_cross_links.py`, `test_core_prompts_eval_harness.py`, `test_agent_prompts_eval_harness.py`, `test_context_prompts_eval_harness.py`, `test_domain_prompts_eval_harness.py`, `test_repo_prompts_eval_harness.py`, `test_validate_governance.py`, `test_validate_links.py`, `test_lint_skills.py`, `test_skill_module_contract.py` (Escalation subsection + Trigger/Methods/Output/Never; 450+ pytest anti-drift suite in CI); reciprocal thinking-lens ↔ skill checks in `test_prompt_cross_links.py`; Human Review + Escalation route-target guards in thinking/skill contract tests

 ### Ongoing maintenance (not blockers)

@@ -384,4 +384,4 @@ Phase 1 quality-gate tooling and documentation are **complete**. Routing consist
 - ✅ Benchmark fixture gate plus optional manual benchmark runs
 - ✅ Research-backed design decisions

-The project is positioned to grow sustainably with quality discipline built in from the start. **No open implementation blockers** remain from panel Rounds 1–80; work is recurrence-gated maintenance per playbook. The next measurable quality target is **holdout expansion before router tuning** and optional manual baseline-vs-skill benchmark runs — not shipping new core skills without promotion evidence.
+The project is positioned to grow sustainably with quality discipline built in from the start. **No open implementation blockers** remain from panel Rounds 1–81; work is recurrence-gated maintenance per playbook. The next measurable quality target is **holdout expansion before router tuning** and optional manual baseline-vs-skill benchmark runs — not shipping new core skills without promotion evidence.

reflective-prompt-library/plans/multi-agent-panel-consensus-2026-06-25.md

Lines changed: 45 additions & 0 deletions
@@ -2151,3 +2151,48 @@ User directive (repeat): review prompts, plans, skills, and Socratic/critical-th

 **Resealed 2026-06-25** after **Round 80** (options EI–EL). Module Contract Escalation anti-drift closed; thinking-lens preamble consumer guards complete. Holdout expansion remains recurrence-gated maintenance.

+---
+
+## Round 81 — Thinking-lens Human Review + Escalation route-target guards (2026-06-25)
+
+**Options EM–EP** | Six-lens panel (Opus, Codex, Gemini, Composer, Sakana, GLM)
+
+### Round 81 options
+
+| ID | Proposal | Verdict |
+| --- | --- | --- |
+| EM | `## Human Review` preamble on all `01-thinking/` lenses + pytest | **Agree** |
+| EN | Escalation route-target anti-drift (`reflective-*` cites only `CORE_SKILLS`; terminal `reflective-risk` exempt) | **Agree** |
+| EO | ROUTE holdout expansion | **Defer** |
+| EP | Router / tenth skill / benchmark CI | **Reject** |
+
+### Round 81 verdict table
+
+| ID | Option | Verdict | Action |
+| --- | --- | --- | --- |
+| EM | Human Review on thinking lenses | **Agree** | preamble + `test_thinking_prompt_has_human_review_section` |
+| EN | Escalation route targets | **Agree** | `test_core_skill_escalation_routes_to_valid_workflow_skills` |
+| EO | Holdout expansion | **Defer** | maintenance |
+| EP | Router/tenth skill/benchmark CI | **Reject** | no change |
+
+**All roles agree.**
+
+## Implemented Changes (Round 81)
+
+- `01-thinking/socratic-reviewer.md`, `why-what-how-done.md`: `## Human Review` preamble routes to `reflective-risk`
+- `plans/tests/test_thinking_prompts_eval_harness.py`: Human Review preamble guard on all five lenses
+- `plans/tests/test_skill_module_contract.py`: Escalation route-target guard; `reflective-risk` terminal-gate exemption
+- `GLOSSARY.md`: playbook Rounds 1–81; steps 12–13 for Human Review + Escalation route targets
+- `QUALITY_GATES_SUMMARY.md`: 450+ pytest floor; Human Review / Escalation route notes; panel Rounds 1–81
+- `PROJECT_KNOWLEDGE.md`: Decision Index Round 81 entry
+- `README.md`, `reflective-prompt-library/README.md`, `test_readme_governance.py`: panel round 81 sync
+
+## Verification (Round 81)
+
+- `make all`: pytest + ROUTE-001/002/003 100%
+
+## Panel status (updated)
+
+**Resealed 2026-06-25** after **Round 81** (options EM–EP). Thinking-lens Human Review preambles complete; Escalation route-target anti-drift closed. Holdout expansion remains recurrence-gated maintenance.
+
+
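
Option EN (Escalation route targets) is not shown in this diff, so the following is only a sketch of what such a guard might look like. Everything beyond the names `CORE_SKILLS`, `reflective-dispatch`, and `reflective-risk` is an assumption: `reflective-example` is a hypothetical placeholder for the remaining skills, `escalation_problems` is an illustrative helper, and the reading of "terminal exempt" (that `reflective-risk` need not cite another skill) is one plausible interpretation rather than the confirmed rule.

```python
import re

# Assumption: the real CORE_SKILLS holds nine frozen skills. Only the first two
# names appear in this commit; reflective-example is a hypothetical placeholder.
CORE_SKILLS = frozenset({"reflective-dispatch", "reflective-risk", "reflective-example"})

# Assumption: the terminal gate is exempt from having to cite another skill.
TERMINAL_SKILLS = frozenset({"reflective-risk"})

# Backtick-quoted reflective-* identifiers inside an Escalation bullet.
SKILL_REF = re.compile(r"`(reflective-[a-z0-9-]+)`")


def escalation_problems(skill_name: str, escalation_text: str) -> list[str]:
    """Return human-readable problems for one skill's Escalation text."""
    cited = SKILL_REF.findall(escalation_text)
    problems = [
        f"{skill_name}: unknown escalation target {name}"
        for name in sorted(set(cited) - CORE_SKILLS)
    ]
    if not cited and skill_name not in TERMINAL_SKILLS:
        problems.append(f"{skill_name}: Escalation cites no workflow skill")
    return problems


print(escalation_problems("reflective-dispatch", "Escalation: hand off to `reflective-unknown`."))
```

Returning a list of messages rather than a boolean keeps the eventual assertion failure readable when several skills drift at once.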

reflective-prompt-library/plans/tests/test_glossary_structure.py

Lines changed: 3 additions & 3 deletions
@@ -30,10 +30,10 @@ def test_round_boundary_terms_present(glossary_text: str):
         assert heading in glossary_text, f"missing glossary section: {heading}"


-def test_maintenance_playbook_references_round_80(glossary_text: str):
+def test_maintenance_playbook_references_round_81(glossary_text: str):
     playbook = glossary_text.split("## Governance Maintenance Playbook", 1)[1]
-    assert "Rounds 1–80" in playbook or "Rounds 1-79" in playbook
-    assert "Rounds 1–79" not in playbook and "Rounds 1-78" not in playbook
+    assert "Rounds 1–81" in playbook or "Rounds 1-80" in playbook
+    assert "Rounds 1–80" not in playbook and "Rounds 1-79" not in playbook


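
As committed, the first assertion in `test_maintenance_playbook_references_round_81` also passes when the playbook contains the hyphenated `Rounds 1-80`, and the second assertion only rejects specific stale strings. If that fallback is not intended, one possible variant normalizes the dash and compares the numeric bound. This is a sketch, not the repository's test; `ROUND_RANGE` and `references_current_round` are hypothetical names.

```python
import re

CURRENT_ROUND = 81  # mirrors the round number asserted in this commit

# Matches "Rounds 1–N" written with either an en dash or an ASCII hyphen.
ROUND_RANGE = re.compile(r"Rounds 1[–-](\d+)")


def references_current_round(playbook: str, current: int = CURRENT_ROUND) -> bool:
    """True if the playbook names the current round and no other upper bound."""
    bounds = [int(n) for n in ROUND_RANGE.findall(playbook)]
    return current in bounds and all(bound == current for bound in bounds)


print(references_current_round("Ongoing upkeep after panel close (Rounds 1–81)."))
```

With a single integer constant, the next reseal would need one edit instead of four string literals.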

reflective-prompt-library/plans/tests/test_readme_governance.py

Lines changed: 2 additions & 2 deletions
@@ -10,8 +10,8 @@
 METHODOLOGY_MAP_EN = Path(__file__).parent.parent.parent / "METHODOLOGY_MAP.md"
 SKILL_MAP = Path(__file__).parent.parent.parent / "skills" / "skill-map.md"

-CURRENT_PANEL_ROUND = "80"
-CURRENT_PANEL_OPTIONS = "A–EL"
+CURRENT_PANEL_ROUND = "81"
+CURRENT_PANEL_OPTIONS = "A–EP"


 @pytest.fixture(scope="module")
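
The tests that consume `CURRENT_PANEL_ROUND` and `CURRENT_PANEL_OPTIONS` fall outside this hunk. A rough sketch of how the two constants could be matched against the README wording changed in this commit follows; `panel_marker` and `readme_is_synced` are hypothetical helpers, and the exact phrase format is inferred from the two README lines in the diff, not from the test file.

```python
CURRENT_PANEL_ROUND = "81"
CURRENT_PANEL_OPTIONS = "A–EP"


def panel_marker(include_options: bool) -> str:
    """Build the phrase a README is expected to contain for the current round."""
    marker = f"Rounds 1–{CURRENT_PANEL_ROUND}"
    if include_options:
        # The library README also names the option range; the root README does not.
        marker += f", options {CURRENT_PANEL_OPTIONS}"
    return marker


def readme_is_synced(readme_text: str, include_options: bool) -> bool:
    """True if the README text carries the current panel marker."""
    return panel_marker(include_options) in readme_text


root_readme = "six-lens Socratic consensus (Rounds 1–81)"
library_readme = "the nine skills (Rounds 1–81, options A–EP) is recorded in"

print(readme_is_synced(root_readme, include_options=False))
print(readme_is_synced(library_readme, include_options=True))
```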
