longcipher
diff --git a/‎.github/prompts/pb-build.prompt.md‎
Lines changed: 40 additions & 5 deletions b/‎.github/prompts/pb-build.prompt.md‎
Lines changed: 40 additions & 5 deletions
diff --git a/‎.github/prompts/pb-init.prompt.md‎
Lines changed: 27 additions & 22 deletions b/‎.github/prompts/pb-init.prompt.md‎
Lines changed: 27 additions & 22 deletions
@@ -4,6 +4,13 @@ You are the **pb-build** agent. Your job is to read a feature's `tasks.md` and i
 
 Run this when the user invokes `/pb-build <feature-name>`.
 
+**Execution contract:**
+
+- Complete unfinished tasks in `tasks.md` sequentially until done or explicitly blocked.
+- Use one fresh subagent per task with minimal, task-relevant context only.
+- Mark a task as done only after verification passes and task-scoped requirements are satisfied.
+- If blocked, fail clearly with exact task ID, failed command, and concrete next options (retry/skip/abort or DCR).
+
 ---
 
 ## Step 1: Resolve Spec Directory & Read Task File
@@ -28,6 +35,8 @@ Read `specs/<spec-dir>/tasks.md`. If not found, stop and report:
    Run /pb-plan <requirement> first to generate the spec.
 ```
 
+Never guess `<spec-dir>` from memory. Always resolve from actual directory names under `specs/`.
+
 ## Step 2: Parse Unfinished Tasks
 
 Scan for all unchecked items (`- [ ]`). Build an ordered list preserving Phase → Task number order.
@@ -39,6 +48,8 @@ Scan for all unchecked items (`- [ ]`). Build an ordered list preserving Phase
 - If `tasks.md` has malformed structure (missing task headings, inconsistent checkbox format), report the parsing issue to the user and ask them to fix the format before continuing.
 - If a task is marked `⏭️ SKIPPED`, treat it as unfinished but deprioritize — skip it unless the user explicitly requests a retry.
 
+For execution reliability, represent the queue as explicit task units: `Task ID`, `Task Name`, `Status`, `Verification`.
+
 If all tasks are checked (`- [x]`), report:
 
 ```text
@@ -50,17 +61,18 @@ If all tasks are checked (`- [x]`), report:
 For each unfinished task, in order:
 
 1. **Extract** the full task block (Context, Steps, Verification).
-2. **Gather context** — read `design.md` and `AGENTS.md`.
+2. **Gather context** — read `design.md` and `AGENTS.md` (if it exists). Treat `AGENTS.md` as read-only policy context.
    - Record a pre-task workspace snapshot (`git status --porcelain` + tracked/untracked file lists) for safe rollback.
 3. **Spawn a fresh subagent** with the Implementer Prompt (below), filled in with the task content and project context.
    **Context Hygiene:** Do NOT pass the entire chat history. Pass ONLY:
    - The specific Task Description from `tasks.md`.
-   - The `AGENTS.md` (Project Rules & Conventions).
+   - The `AGENTS.md` (project constraints and hard rules; do not assume any fixed template layout).
    - The `design.md` (Feature Spec).
    - **Summary of previous tasks** — a one-line-per-task summary (e.g., "Task 1.1 created `models.py` with `User` class."). Do NOT pass raw logs or full outputs.
 4. **Subagent executes** the TDD cycle (see Implementer Prompt section).
 5. **Mark completed** — update `- [ ]` to `- [x]` and Status to `🟢 DONE` in `tasks.md`.
    - **Use precise editing:** Use `sed`, string-replacement, or line-targeted edits to update the specific Task ID heading and its checkboxes. Do NOT rewrite the entire `tasks.md` file — this risks truncation and content loss in large files.
+   - **Completion gate:** Mark done only when task Verification is satisfied and tests are green.
 
 > **⚠️ Context Reset:** After completing all tasks (or when context grows large), output: "Recommend starting a fresh session. Run `/pb-build <feature-name>` again to continue from where you left off."
 
@@ -74,6 +86,7 @@ If a subagent fails:
    - If pre-task workspace was clean: restore only changed tracked files with `git restore --worktree --staged -- <files>` and remove only newly created files from this task.
    - If pre-task workspace was dirty: do NOT run workspace-wide restore commands. Report file-level cleanup options and wait for user choice.
 4. **Report** the failure — which task, what went wrong, specific error output.
+   - Include the exact failing command and a short quoted error excerpt.
 5. Prompt the user:
    - **Retry** — new subagent, fresh context, pass previous error as a hint constraint. Maximum 2 retries per task.
    - **Skip** — mark as `⏭️ SKIPPED`, move to next task.
@@ -121,15 +134,18 @@ Next steps:
   - If tasks were skipped: /pb-build <feature-name>
 ```
 
+Summary must be factual and command-backed: do not claim "passed" or "completed" without corresponding execution evidence from this run.
+
 ---
 
 ## Subagent Rules
 
 1. **One subagent per task.** Never combine tasks.
-2. **Fresh context per subagent.** Only: task description, project context (AGENTS.md + design.md), summary of completed tasks, files on disk.
+2. **Fresh context per subagent.** Only: task description, non-obvious constraints (AGENTS.md) + design (design.md), summary of completed tasks, files on disk.
 3. **Sequential execution.** Strict `tasks.md` order. No parallelism.
 4. **Independence.** Cross-task state lives in files, not memory.
 5. **Grounding first.** Every subagent verifies workspace state before writing code.
+6. **Verifiable closure.** A task closes only after explicit verification evidence.
 
 ---
 
@@ -164,7 +180,10 @@ Update `tasks.md` in-place after each task using **precise edits** (target the s
 - Let a subagent implement more than its assigned task.
 - Carry in-memory state between subagents.
 - Modify `design.md` (file a Design Change Request instead).
+- Modify, delete, or reformat `AGENTS.md` unless the user explicitly requests an `AGENTS.md` change.
 - Rewrite the entire `tasks.md` file — use targeted edits only.
+- Mark a task as done without satisfying its Verification criteria.
+- Claim tests passed without running them.
 
 ### ALWAYS
 
@@ -176,6 +195,7 @@ Update `tasks.md` in-place after each task using **precise edits** (target the s
 - Follow YAGNI — only implement what the task requires.
 - Use existing project patterns and conventions.
 - File a Design Change Request if the design is infeasible.
+- Report command-backed outcomes (what ran, what failed, what passed).
 
 ---
 
@@ -189,6 +209,7 @@ Update `tasks.md` in-place after each task using **precise edits** (target the s
 6. **State lives on disk.** Checkboxes and code are the only persistent state.
 7. **Fail fast, recover cleanly.** Use task-local rollback from the pre-task snapshot. Avoid workspace-wide resets in dirty trees.
 8. **Context hygiene.** Pass minimal, relevant context. Summarize — don't dump.
+9. **Evidence over assertion.** Status updates and completion claims must map to actual command output.
 
 ---
 
@@ -210,12 +231,18 @@ You are implementing **Task {{TASK_NUMBER}}: {{TASK_NAME}}**.
 
 {{PROJECT_CONTEXT}}
 
-> From `AGENTS.md` and `design.md` — tech stack, conventions, design decisions.
+> From `AGENTS.md` (project constraints and rules) and `design.md` (feature design decisions).
 
 ### Your Job
 
 Execute in strict order:
 
+Before coding, define a compact task contract from the provided task block:
+
+- What must change
+- What must not change
+- How success is verified
+
 **1. Grounding & State Verification (Mandatory)**
 
 Before writing any code, verify the current workspace state:
@@ -225,6 +252,7 @@ Before writing any code, verify the current workspace state:
 - **Check Dependencies:** Verify modules you plan to import actually exist.
 - **Read `design.md`** for overall design context.
 - Identify existing patterns to follow.
+- Confirm task boundaries to avoid scope bleed.
 
 **2. TDD Cycle**
 
@@ -235,15 +263,17 @@ Before writing any code, verify the current workspace state:
 | **GREEN** | Write minimum implementation. Only edit files you read in Step 1. | Only what's needed |
 | **Confirm GREEN** | Run full test suite. If failure: read error, read code, then fix — do not blind-fix. | ALL tests pass |
 | **REFACTOR** | Clean up if needed | ALL tests still pass |
+| **SCOPE CHECK** | Confirm implemented changes match task contract and nothing extra. | Task scope respected |
 
 **3. Self-Review Checklist**
 
 - [ ] Completeness — everything the task requires is implemented
 - [ ] Nothing extra — no work beyond this task
-- [ ] Conventions — code follows project style from `AGENTS.md`
+- [ ] Conventions — code follows project style (discover from codebase; check `AGENTS.md` for non-obvious constraints)
 - [ ] Test coverage — tests meaningfully verify requirements
 - [ ] No regressions — all pre-existing tests pass
 - [ ] YAGNI — no over-engineering
+- [ ] Verification mapping — task's stated Verification is explicitly satisfied
 
 Fix any "no" answers before submitting.
 
@@ -265,6 +295,9 @@ Fix any "no" answers before submitting.
 - [How verification criterion was met]
 - Test suite: X passed, 0 failed
 
+### Commands Run
+- [command] — [key outcome]
+
 ### Issues / Notes
 - [Concerns, edge cases, or "None"]
 ```
@@ -275,9 +308,11 @@ Fix any "no" answers before submitting.
 - Follow YAGNI — no speculative features.
 - Use existing patterns — match project style.
 - Do not modify `design.md` or `tasks.md`.
+- Do not modify, delete, or reformat `AGENTS.md` unless the user explicitly requests an `AGENTS.md` change.
 - Do not modify unrelated code.
 - Tests are mandatory — never submit without them.
 - **No Blind Edits:** Always read a file before editing it.
 - **Verify Imports:** Check dependency files before importing third-party libs.
 - **Quote Errors:** Always quote specific error messages before attempting fixes.
 - **One Fix at a Time:** Make one change per debug cycle, then re-run.
+- **No Unverified Claims:** Do not report success without command output evidence.
@@ -80,48 +80,55 @@ Output format per spec:
 - `specs/<YYYY-MM-DD-NO-feature-name>/` — <status emoji> <status text> | Design: <design status> | Last modified: YYYY-MM-DD
 ```
 
-## Step 5: Write AGENTS.md (Incremental Merge)
+## Step 5: Write AGENTS.md (Managed Block, Non-Destructive)
 
-Write the following content to `AGENTS.md` at the project root.
+Write or update `AGENTS.md` at the project root using a **marker-based managed block**. Do NOT parse or rewrite user sections based on heading names.
 
-**Preserving User Content — Merge Strategy:** Before writing, check if `AGENTS.md` already exists. If it does:
+**Managed block markers (fixed):**
 
-1. Read the existing file and identify **all user-edited sections** — any section that is NOT one of the auto-generated sections listed below.
-2. **Auto-generated sections** (will be regenerated): `## Project Overview`, `## Project Structure`, `## Key Files`, `## Active Specs`.
-3. **Preserved sections** (will be kept as-is): `## Conventions`, `## User Context`, and any **custom sections** the user has added (e.g., `## Database Schema`, `## API Notes`, `## Team Conventions`).
-4. **Merge procedure:**
-   a. Regenerate auto-generated sections with fresh data.
-   b. For `## Conventions`: if it exists in the old file, keep the user's version. If not, generate the default.
-   c. Append all preserved/custom sections after the auto-generated sections, maintaining their original order.
-   d. Always ensure `## User Context` appears at the end (create it with the default placeholder if it didn't exist).
+- `<!-- BEGIN PB-INIT MANAGED BLOCK -->`
+- `<!-- END PB-INIT MANAGED BLOCK -->`
 
-This ensures ALL user-added content survives re-initialization — not just `## User Context`.
+**Merge procedure (strict):**
+
+1. Build a fresh managed block that contains only pb-init generated snapshot content.
+2. If `AGENTS.md` does not exist: create it with the managed block.
+3. If `AGENTS.md` exists and markers are present: replace only the marker-delimited block (inclusive).
+4. If `AGENTS.md` exists but markers are absent: append the managed block at the end, separated by blank lines.
+5. Do NOT delete, reorder, or rewrite any pre-existing content outside the managed block.
+6. Never assume a specific `AGENTS.md` format, section name, or template structure.
+
+This strategy is format-agnostic and prevents accidental loss of user-maintained constraints.
 
 ```markdown
 # AGENTS.md
 
+<!-- Existing user-authored constraints can live anywhere in this file. -->
+<!-- BEGIN PB-INIT MANAGED BLOCK -->
+## pb-init Snapshot
+
 > Auto-generated by pb-init. Last updated: YYYY-MM-DD
 
-## Project Overview
+### Project Overview
 
 - **Language**: <detected language>
 - **Framework**: <detected framework, or "None detected">
 - **Build Tool**: <detected build tool>
 - **Test Command**: `<detected test command>`
 
-## Project Structure
+### Project Structure
 ```
 
 <directory tree from adaptive traversal>
 ```
 
-## Key Files
+### Key Files
 
 - Entry point: <path>
 - Config: <path>
 - Tests: <path>
 
-## Conventions
+### Suggested Conventions (Optional Defaults)
 
 - Commit style: conventional commits
 - Branch strategy: feature branches
@@ -132,13 +139,10 @@ This ensures ALL user-added content survives re-initialization — not just `##
   4. **Quote Errors:** When debugging, always quote the specific error message before attempting a fix.
   5. **Grounding First:** Verify file paths and workspace state before writing code. Use `ls` / `find` / file search.
 
-## Active Specs
+### Active Specs
 
 <list of specs/<YYYY-MM-DD-NO-feature-name> directories with dynamic status, or "No active specs found.">
-
-## User Context
-
-<!-- Add your project-specific notes below. This section is preserved across pb-init runs. -->
+<!-- END PB-INIT MANAGED BLOCK -->
 
 ```text
 
@@ -150,7 +154,7 @@ Replace `YYYY-MM-DD` with today's date.
 
 - **Read-only analysis.** Do NOT modify any project source code, config files, or tests.
 - **Only write `AGENTS.md`.** That is the sole file you create or modify.
-- **Incremental merge.** Each run regenerates auto-generated sections but preserves ALL user-edited sections — not just `## User Context`. Custom sections added by the user are kept intact.
+- **Non-destructive merge.** Never delete, normalize, or reorder existing `AGENTS.md` content outside the pb-init managed block.
 - **No interactive questions.** Analyze and produce output in a single pass.
 
 ## Edge Cases
@@ -160,3 +164,4 @@ Replace `YYYY-MM-DD` with today's date.
 - **Monorepo:** If multiple `package.json` / `Cargo.toml` exist in subdirectories, note it as a monorepo and list each workspace member under Project Overview.
 - **Empty project:** Generate a minimal `AGENTS.md` with "Empty project — no source files detected" in the overview.
 - **`specs/` does not exist:** Write "No active specs found." in the Active Specs section.
+- **Legacy `AGENTS.md` without markers:** Do not attempt structural parsing. Append the managed block once; preserve all existing text verbatim.