Commit 35b52a7

Round 74: standardize 03-context prompt contracts + eval anti-drift
Six-lens panel consensus (DO–DQ): add Purpose/Scope/Acceptance/Falsifiability preambles to all seven 03-context prompts with workflow-skill and thinking-lens cross-links; add test_context_prompts_eval_harness.py and extend cross-link pytest. Governance sync: panel record, Decision Index, QUALITY_GATES 330+ floor.
1 parent a23588c commit 35b52a7
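
The commit message names `test_context_prompts_eval_harness.py`, but its contents are not shown in this view. As a rough sketch of what an anti-drift check of this kind might look like, the snippet below verifies that a prompt file's preamble carries the four headings listed in the message and at least one `01-thinking/` cross-link. The function names, the regex, and the `SAMPLE` text are all hypothetical and not taken from the repository.

```python
import re

# Preamble headings named in the commit message (assumption: the real
# harness may check more, or different, headings).
REQUIRED_HEADINGS = ("Purpose", "Scope", "Acceptance Criteria", "Falsifiability")

# Hypothetical pattern: a backticked path into 01-thinking/, in the style
# of the Purpose paragraphs added by this commit.
THINKING_LINK = re.compile(r"`01-thinking/[\w-]+\.md`")

# Built programmatically so this example does not contain a literal fence.
FENCE = "`" * 3


def level2_headings(markdown_text: str) -> list[str]:
    """Return the text of each '## ' heading before the first code fence."""
    headings = []
    for line in markdown_text.splitlines():
        if line.startswith(FENCE):
            break  # the prompt body starts here; the preamble is over
        if line.startswith("## "):
            headings.append(line[3:].strip())
    return headings


def missing_sections(markdown_text: str) -> list[str]:
    """List the required headings that the preamble does not contain."""
    found = level2_headings(markdown_text)
    return [h for h in REQUIRED_HEADINGS if h not in found]


def has_thinking_cross_link(markdown_text: str) -> bool:
    """True if the text references at least one 01-thinking/ prompt."""
    return THINKING_LINK.search(markdown_text) is not None


# Hypothetical prompt preamble that has drifted: no Falsifiability section.
SAMPLE = """\
# Small Context

## Purpose

Pairs with `01-thinking/falsifiability.md`.

## Scope

## Acceptance Criteria
"""

print(missing_sections(SAMPLE))         # ['Falsifiability']
print(has_thinking_cross_link(SAMPLE))  # True
```

In a pytest suite, a check like this would typically be parametrized over the seven `03-context/*.md` files and assert that `missing_sections` returns an empty list for each.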

15 files changed

Lines changed: 334 additions & 7 deletions

README.md

Lines changed: 1 addition & 1 deletion
@@ -21,7 +21,7 @@ Full library docs: [reflective-prompt-library/README.md](reflective-prompt-libra
 ## Governance
 
 - **Contributing:** [CONTRIBUTING.md](CONTRIBUTING.md) — quality gates, routing maintenance (R8–R12), `make all`
-- **Panel record:** [multi-agent-panel-consensus](reflective-prompt-library/plans/multi-agent-panel-consensus-2026-06-25.md) — six-lens Socratic consensus (Rounds 1–73)
+- **Panel record:** [multi-agent-panel-consensus](reflective-prompt-library/plans/multi-agent-panel-consensus-2026-06-25.md) — six-lens Socratic consensus (Rounds 1–74)
 - **Operator playbook:** [GLOSSARY.md](reflective-prompt-library/GLOSSARY.md) — Governance Maintenance Playbook
 
 The repository contains:

reflective-prompt-library/03-context/context-engineering.md

Lines changed: 18 additions & 0 deletions
@@ -2,6 +2,24 @@
 
 Use this before long tasks where context discipline matters.
 
+## Purpose
+
+Enforce context discipline before long tasks. Primary workflow surfaces: `reflective-dispatch` (context-load deferral) and `reflective-research`. Pairs with `01-thinking/falsifiability.md` and `01-thinking/why-what-how-done.md`.
+
+## Scope
+
+- In scope: selective reads, artifact summaries, index-then-batch processing, and missing-info flags.
+- Out of scope: replacing frozen workflow skill contracts or router fairness rules.
+
+## Acceptance Criteria
+
+- Context-used, context-ignored, and missing-info sections appear at the end.
+- Large inputs are indexed before synthesis.
+
+## Falsifiability
+
+State what would prove the agent read irrelevant material anyway.
+
 ```markdown
 Please handle the task in Context Engineering mode.

reflective-prompt-library/03-context/context-handoff.md

Lines changed: 22 additions & 0 deletions
@@ -2,6 +2,28 @@
 
 Use this when switching models, tools, agents, or sessions.
 
+## Purpose
+
+Produce session handoff summaries when switching models, tools, agents, or sessions. Primary workflow surface: `reflective-handoff-retro`. Pairs with `01-thinking/why-what-how-done.md` and `01-thinking/socratic-reviewer.md`.
+
+## Scope
+
+- In scope: goal, state, decisions, artifacts, risks, blockers, and next action for a successor agent.
+- Out of scope: full retrospective synthesis or repository edits (`reflective-implement`).
+
+## Acceptance Criteria
+
+- Output follows the handoff field structure without narrative drift.
+- Do-not-do guidance is explicit when the blast radius warrants `reflective-risk`.
+
+## Falsifiability
+
+Name one handoff field that would be wrong if the successor could not resume work.
+
+## Human Review
+
+Require human confirmation before handoff when irreversible or high-blast-radius work remains open.
+
 ```markdown
 Please organize the current task into a Context Handoff Summary for the next Agent to take over.

reflective-prompt-library/03-context/gemini-long-document.md

Lines changed: 18 additions & 0 deletions
@@ -2,6 +2,24 @@
 
 Use this when processing long documents. It is especially suited for Gemini-style large-context workflows.
 
+## Purpose
+
+Structure-first processing for long documents (Gemini-style workflows). Primary workflow surface: `reflective-research`. Pairs with `01-thinking/critical-thinking-check.md` and `01-thinking/falsifiability.md`.
+
+## Scope
+
+- In scope: document map, relevant sections, claims, evidence, contradictions, and synthesis.
+- Out of scope: full verbatim summary or repository edits.
+
+## Acceptance Criteria
+
+- Seven output sections are populated before recommendation.
+- Missing information is flagged explicitly.
+
+## Falsifiability
+
+Name one contradiction that would change the recommendation.
+
 ```markdown
 You will be processing a long document. Do not summarize the full text directly; build a structural index first.

reflective-prompt-library/03-context/large-context.md

Lines changed: 18 additions & 0 deletions
@@ -2,6 +2,24 @@
 
 Use this for 200K-1M context windows while avoiding context rot.
 
+## Purpose
+
+Use 200K–1M windows without context rot via index-extract-synthesize. Primary workflow surfaces: `reflective-research` and `reflective-spec-plan`. Pairs with `01-thinking/falsifiability.md` and `01-thinking/critical-thinking-check.md`.
+
+## Scope
+
+- In scope: three-stage pipeline, selective extraction, and synthesis artifacts.
+- Out of scope: assuming long context equals reliable understanding.
+
+## Acceptance Criteria
+
+- All three stages are completed in order.
+- Pairs with `context-engineering.md` per the composition note below.
+
+## Falsifiability
+
+State what contradiction in source material would invalidate the synthesis.
+
 ```markdown
 You are working in a large context window, but do not assume that long context equals reliable understanding.

reflective-prompt-library/03-context/low-token.md

Lines changed: 22 additions & 0 deletions
@@ -2,6 +2,28 @@
 
 Use this when budget, latency, or model quota is tight.
 
+## Purpose
+
+Budget-aware terse output when latency or quota is tight. Primary workflow surfaces: `reflective-dispatch` (L1 fast path) and `reflective-brief`. Pairs with `01-thinking/critical-thinking-check.md` and `01-thinking/why-what-how-done.md`.
+
+## Scope
+
+- In scope: minimal decision, reason, plan, and acceptance output under strict length caps.
+- Out of scope: full spec slicing (`reflective-spec-plan`) or repository implementation.
+
+## Acceptance Criteria
+
+- Fixed output slots are filled without narrative padding.
+- Stop condition is explicit.
+
+## Falsifiability
+
+Name one omitted slot that would make the answer non-actionable.
+
+## Human Review
+
+Escalate to `reflective-risk` when compression would hide safety-critical assumptions.
+
 ```markdown
 Low-token mode. Output only what is necessary.

reflective-prompt-library/03-context/medium-context.md

Lines changed: 18 additions & 0 deletions
@@ -2,6 +2,24 @@
 
 Use this for 32K-128K context windows and ordinary ChatGPT / Claude / Codex tasks.
 
+## Purpose
+
+Balance completeness and context cost for 32K–128K windows. Primary workflow surfaces: `reflective-spec-plan` and `reflective-brief`. Pairs with `01-thinking/why-what-how-done.md` and `01-thinking/falsifiability.md`.
+
+## Scope
+
+- In scope: goal through self-check with cited evidence, not full input duplication.
+- Out of scope: repository edits without `reflective-implement`.
+
+## Acceptance Criteria
+
+- Uncertainty is explicitly marked.
+- Composable with `02-engineering/task-start.md` as noted below.
+
+## Falsifiability
+
+Name one acceptance criterion that would fail if evidence were misquoted.
+
 ```markdown
 You are working in a medium context window. Balance completeness against context savings.

reflective-prompt-library/03-context/small-context.md

Lines changed: 22 additions & 0 deletions
@@ -2,6 +2,28 @@
 
 Use this for 4K-16K context windows, small models, mobile, or low-cost model runs.
 
+## Purpose
+
+Operate under small context windows (4K–16K) or low-cost models. Primary workflow surfaces: `reflective-brief` and `reflective-dispatch`. Pairs with `01-thinking/critical-thinking-check.md` and `01-thinking/why-what-how-done.md`.
+
+## Scope
+
+- In scope: conclusion-first answers, minimal assumptions, and capped risks and plan steps.
+- Out of scope: long-chain reasoning or full engineering ticket packs.
+
+## Acceptance Criteria
+
+- At most three risks and three to five plan steps unless escalated.
+- Next action is directly executable.
+
+## Falsifiability
+
+State what evidence would require escalating to `medium-context.md`.
+
+## Human Review
+
+Escalate when window limits would hide safety-critical unknowns.
+
 ```markdown
 You are working in a small context window. Be extremely economical with tokens.

reflective-prompt-library/PROJECT_KNOWLEDGE.md

Lines changed: 1 addition & 0 deletions
@@ -75,6 +75,7 @@ deferred promotions are recurrence-gated — see [panel backlog](plans/multi-age
 > Pointers to the causal trail — plans, reflections, tests, commits. Detail is
 > not duplicated here; this is a map, not an archive.
 
+- 2026-06-25 Round 74 panel — standardize `03-context/` prompt contracts (Purpose/Scope/Acceptance/Falsifiability) + thinking/workflow cross-links + `test_context_prompts_eval_harness.py` → [record](plans/multi-agent-panel-consensus-2026-06-25.md)
 - 2026-06-25 Round 73 panel — standardize `04-agent/` prompt contracts (Purpose/Scope/Acceptance/Falsifiability) + thinking/workflow cross-links + `test_agent_prompts_eval_harness.py` → [record](plans/multi-agent-panel-consensus-2026-06-25.md)
 - 2026-06-25 Round 72 panel — standardize `00-core/` prompt contracts (Purpose/Scope/Acceptance/Falsifiability) + eval_harness anti-drift → [record](plans/multi-agent-panel-consensus-2026-06-25.md)
 - 2026-06-25 Round 71 panel — thinking ↔ engineering cross-links (`01-thinking/` in all 8 engineering prompts; thinking Prompt Sources on implement/spec-plan/handoff-retro) + `test_prompt_cross_links.py` → [record](plans/multi-agent-panel-consensus-2026-06-25.md)

reflective-prompt-library/README.md

Lines changed: 1 addition & 1 deletion
@@ -30,7 +30,7 @@ Pick **Strictness L1–L6** first (`skills/reflective-dispatch/SKILL.md`, [GLOSS
 
 ## Governance Panel Record
 
-Multi-agent Socratic consensus on project goals and the nine skills (Rounds 1–73, options A–DN) is recorded in [plans/multi-agent-panel-consensus-2026-06-25.md](plans/multi-agent-panel-consensus-2026-06-25.md). Run `make all` before claiming routing or governance changes are verified.
+Multi-agent Socratic consensus on project goals and the nine skills (Rounds 1–74, options A–DQ) is recorded in [plans/multi-agent-panel-consensus-2026-06-25.md](plans/multi-agent-panel-consensus-2026-06-25.md). Run `make all` before claiming routing or governance changes are verified.
 
 ## Directory Map
