diff --git a/.agents/skills/plan-pr-batch/SKILL.md b/.agents/skills/plan-pr-batch/SKILL.md index 76ad6a10cd..afc0686ce1 100644 --- a/.agents/skills/plan-pr-batch/SKILL.md +++ b/.agents/skills/plan-pr-batch/SKILL.md @@ -65,6 +65,21 @@ Plan a PR batch `batches/.json` with those lane refs before dependent workers start; otherwise `agent-coord status --batch-id --json` cannot report `blocked_on` lanes. + - Apply `.agents/workflows/pr-processing.md` under **Batch QA Lane**. Declare + whether QA is required, which subset qualifies, and the planned QA owner. + When QA is required, plan the `qa` lane name, stable owner/heartbeat + expectations, and private-state representation for the launched coordinator + to create when the backend is available. If private state will be + unavailable, require the final handoff to record QA claim/heartbeat state as + `UNKNOWN` and include allowed fallback evidence (see + `.agents/workflows/pr-processing.md`, the "Allowed fallback evidence:" + label in the **Batch QA Lane** section) instead of downgrading QA to + `not required`. + Use that section's capitalization convention: uppercase `UNKNOWN` means + coordination/backend state, and lowercase `unknown` is the QA lane status. + When QA is omitted for low-risk work, record `not required` plus the + rationale. Include the final QA Evidence expectations in the Batch Plan and + generated goal prompt. - Build a File-touch map for the batch: list the paths each item changes or intends to affect, including creates, deletes, and renames. Never guess paths. @@ -157,6 +172,7 @@ Plan a PR batch - `merge_authority`: - Concurrent activity and dependency status: - Coordination hooks, including backend claim exclusions: +- Batch QA Lane decision and QA Evidence expectations: - Verification expectations: - Prompt sizing: `Goal prompt character count: N characters`; note any split fallback and keep omitted item details here, not in the goal prompt. 
@@ -175,10 +191,10 @@ Preflight first: if this session cannot run workers without blocking approval pr Repository: OWNER/REPO Batch objective: ... merge_authority: . -Scope summary: [one paragraph: compact titles, sequencing, dependencies, exclusions, and path ownership for this batch. Keep bulky evidence, long validation notes, and later-batch details outside this prompt.] +Batch QA Lane: . +Scope summary: [one paragraph: compact titles, sequencing, dependencies, exclusions, path ownership; keep bulky evidence, validation notes, and later-batch details outside.] File-touch map (one line per item; pick the applicable format): -- PR/Issue #N -> changed/affected paths, including create/delete/rename (owner: lane/name) -- PR/Issue #N -> summarized path pattern(s) plus collision-relevant exact paths/renames/deletes (owner: lane/name) +- PR/Issue #N -> exact paths or summarized patterns, including creates/deletes/renames (owner: lane/name) - PR/Issue #N -> UNKNOWN (paths not determinable from issue body/design notes; treat as serial) Batch-level reservations, not tied to a single item: - Deferred/reserved paths -> path(s) (reason: ... / later owner: lane/name) @@ -187,22 +203,17 @@ Items: - PR #N: URL Goal: one-line outcome. Worker notes: short scope, branch, or dependency note. - Done when: final state is reported using the requested `merge_authority` and the split states from pr-batch. + Done when: final state satisfies requested `merge_authority` and matches a pr-batch split state. - Issue #N: URL Goal: one-line outcome. Worker notes: short scope, branch, or dependency note. - Done when: final state is reported using the requested `merge_authority` and the split states from pr-batch, with PR/no-PR evidence or documented no-fix rationale. + Done when: final state satisfies requested `merge_authority`, with PR/no-PR evidence or no-fix rationale. Execution rules: -- Run `git fetch --prune origin main` first. 
Verify repo-local `.agents/skills/pr-batch/SKILL.md` and `.agents/workflows/pr-processing.md` exist before launching workers. If either is missing in the checkout but present on `origin/main`, update the worktree before continuing; if still missing, stop and report repo workflow state as `UNKNOWN`. -- Follow `.agents/skills/pr-batch/SKILL.md` "Goal Prompt Template"; if skill autoloading is unavailable, copy its safety, review, /simplify, CI, and readiness gates before running. -- Dispatch one subagent per independent item; group dependent items only when shared context is required. Dispatch only the current file-disjoint wave. Hold serial and `UNKNOWN` - discovery lanes until no active editor lane can collide with them. -- Workers edit only owned File-touch map paths; this map is how the batch makes - pr-batch's "disjoint write scopes" concrete, since pr-batch's own template has - no File-touch map slot. If an `UNKNOWN`, unlisted, or other-lane path is - needed, stop, report discovered paths, and wait for an updated map or explicit - coordinator confirmation before editing. +- Run `git fetch --prune origin main` first. Verify repo-local `.agents/skills/pr-batch/SKILL.md` and `.agents/workflows/pr-processing.md` before editing. If a required file is missing locally but present on `origin/main`, update that specific file before continuing; if it is still missing, report repo workflow state as `UNKNOWN`. +- Follow `.agents/skills/pr-batch/SKILL.md`; if autoloading is unavailable, copy its safety/review/simplify/CI/readiness gates. +- Dispatch one subagent per independent item, current file-disjoint wave only. Hold serial and `UNKNOWN` discovery lanes until no active editor lane can collide. +- Workers edit only owned File-touch map paths. If an `UNKNOWN`, unlisted, or other-lane path is needed, stop before editing it and report discovered paths for coordinator confirmation. - Sequenced lanes may share declared files only in the stated order. 
- Each subagent must verify current GitHub state before edits and report UNKNOWN for unverifiable facts. - For coordination, respect coordination claims and dependencies: @@ -210,6 +221,7 @@ Execution rules: `agent-coord status --repo --target --json` or `agent-coord status --batch-id --json` per `AGENTS.md`; claim, heartbeat, and stop/report UNKNOWN. +- Apply Batch QA Lane in `.agents/workflows/pr-processing.md`: declare required/not required, use private `qa` lane when available, `UNKNOWN` fallback evidence when not, and include QA Evidence in final handoff. - Use local validation, self-review, review-comment, CI, and readiness gates from the repo workflow. For PRs, merge only when `merge_authority` is `auto_merge_when_gates_pass` or a later explicit approval exists, current release mode permits it, and confidence/readiness gates pass; document confidence data in the PR description. - Final handoff must include links, tests, blockers, next action, confidence or UNKNOWN facts, `merge_authority`, and explicit final-state sections: `merged`, `ready-gates-clean`, `ready-no-merge-authority`, `waiting-on-checks-or-review`, `external-gate-failing`, `blocked-user-input`, or `no-pr-evidence`. ``` diff --git a/.agents/skills/post-merge-audit/SKILL.md b/.agents/skills/post-merge-audit/SKILL.md index 11d36c4f2d..be97d8cbfe 100644 --- a/.agents/skills/post-merge-audit/SKILL.md +++ b/.agents/skills/post-merge-audit/SKILL.md @@ -95,7 +95,15 @@ The resolver is read-only. It resolves the default release-candidate base, the h Keep PR-range inclusion separate from worked-issue coverage so no-PR, blocked, parked, and unmerged lanes are still evaluated. -Show included worked issues, included PRs, excluded near-matches, base/head SHAs, coordination status evidence, and assumptions. Ask for confirmation before deep audit unless the user explicitly asks to proceed without confirmation. 
+After the scope algorithm identifies the batch or reports an `UNKNOWN` scope, +collect any QA lane and QA Evidence block for that batch. Do not use missing QA +state to shrink the worked-issue scope; report it as a QA coverage finding or +`UNKNOWN` fact instead. +Use the capitalization convention from `.agents/workflows/pr-processing.md`: +uppercase `UNKNOWN` means coordination/backend state, and lowercase `unknown` +is the QA lane status. + +Show included worked issues, included PRs, collected QA lanes and QA Evidence blocks, excluded near-matches, base/head SHAs, coordination status evidence, and assumptions. Ask for confirmation before deep audit unless the user explicitly asks to proceed without confirmation. ## Audit Checks @@ -114,31 +122,60 @@ For each included PR: reproducible artifact or justified missing-artifact caveat, internal consistency, production-environment caveats, and refutable-conclusion handling. - Validation: compare changed areas with the validation evidence in the PR body or comments. +- QA evidence: verify required QA Evidence exists, records `Tested at` with the + PR/head SHA or audited range it applies to, is current for that head/range, covers the changed + surfaces, and does not leave release-blocking findings untriaged. If private + coordination claim/heartbeat state is `UNKNOWN`, verify the documented + fallback evidence is complete and names a concrete QA owner and branch/worktree + before treating QA coverage as satisfied. Only private claim/heartbeat + sub-values may be `UNKNOWN` in fallback mode. If fallback evidence is absent or + incomplete, classify the QA lane as `unknown` and surface it as a readiness + blocker. Verify `Release-blocking status` is derived from QA lane status: + `satisfied` -> `clear`, `blocked` -> `blocked`, `waived` -> `waived`, + `not_applicable` -> `not_applicable`, and `in_progress` / `unknown` -> `blocked`. 
- Cross-PR interactions: compare changed files, shared behavior, assumptions, and release-sensitive areas across the batch. - Decision log: inspect any `Codex Decision Log` or equivalent section and verify the decisions still hold after the merge. -For each worked issue from coordination state or advisory `codex-claim` -recovery rows, including no-PR, blocked, parked, done-unmerged, or still-open -lanes: +For each worked issue, QA lane, or advisory `codex-claim` recovery row from +coordination state, including no-PR, blocked, parked, done-unmerged, or +still-open lanes: -- Intent coverage: compare the issue intent and acceptance criteria with the PR - diff, no-PR evidence comment, branch state, or blocker note. +- Intent coverage: compare the issue or QA-lane intent with the PR diff, no-PR + evidence comment, QA evidence, branch state, or blocker note. - Final state: verify whether the issue was merged, closed, parked, blocked, - left open intentionally, or remains `UNKNOWN`. + left open intentionally, or remains `UNKNOWN`; for QA lanes, verify whether + the QA coverage status is `satisfied`, `blocked`, `waived`, healthy + `in_progress`, `not_applicable` when QA was not required, or `unknown`. - Handoff expectations: check validation evidence, decision-point count, - confidence notes, review/comment triage, and any Process Gap Disposition - fields required by `.agents/workflows/pr-processing.md`. + confidence notes, QA evidence, review/comment triage, and any Process Gap + Disposition fields required by `.agents/workflows/pr-processing.md`. - Classification: reuse the intent-achievement classes from `.agents/workflows/continuous-evaluation-loop.md` (`in_progress`, `realized`, `partial`, `missed`, `regressed`, `stalled`, or `unknown`) and - explain any `UNKNOWN` evidence needed to resolve the issue outcome. 
-- Post-merge intake: record healthy `in_progress` lanes and evidenced - `realized` outcomes in the worked-issue table as no-action items; route + explain any `UNKNOWN` evidence needed to resolve the issue outcome. For QA + lanes, use the QA-coverage result `satisfied`, `blocked`, `waived`, + `in_progress`, `not_applicable`, or `unknown`. Use `satisfied` when required + QA evidence is current, adequately scoped, and has no untriaged + release-blocking finding; `blocked` when a release-blocking QA finding still + needs a fix or waiver; `waived` when an explicit waiver exists and the auditor + verifies a maintainer comment URL, issue link, or PR body entry names the + finding, scope, and reason; `in_progress` when required QA is not complete; + `not_applicable` when QA was correctly omitted with `QA required: no` and a + documented rationale; and `unknown` when evidence is missing, stale, or + incomplete. +- Post-merge intake: record healthy `in_progress` worked-issue lanes and + evidenced `realized` worked-issue outcomes, `satisfied` or `waived` QA lanes, + and `not_applicable` QA omissions in the coverage table as no-action items + during active batch phase; + treat required QA lanes still `in_progress` during readiness/release audits as + QA coverage findings; route `stalled` lanes back to the batch coordinator as resume/reassign/drop decisions unless the user explicitly approves tracking the stalled lane as an issue; route every other non-OK worked-issue class (`partial`, `missed`, - `regressed`, or `unknown`), merged or not, into the issue plan or an explicit - coordinator action that names the missing evidence or decision. + `regressed`, or `unknown`), merged or not, and every non-OK QA coverage + outcome (`blocked`, `unknown`, or release-audit `in_progress`) into the issue + plan or an explicit coordinator action that names the missing evidence or + decision. 
## Codex And Claude Coordination @@ -185,11 +222,13 @@ lane was evaluated, even when the issue produced no merged PR: The audit should usually produce an issue plan for non-OK findings, but not create issues until approval. - **No issue**: for `OK`, duplicate findings, findings fully resolved by the - audit evidence, evidenced `realized` lanes, or healthy `in_progress` lanes; - include `realized` and `in_progress` lanes in the worked-issue coverage table - so the coordinator can see they were checked. + audit evidence, evidenced `realized` lanes, healthy `in_progress` worked-issue + lanes, evidenced `satisfied` or `waived` QA lanes, or evidenced QA omissions + marked `not_applicable`; include `realized`, worked-issue `in_progress`, + `satisfied`, `waived`, and `not_applicable` rows in the worked-issue/QA-lane + coverage table so the coordinator can see they were checked. - **Changelog only**: for missing changelog entries; prefer one bundled changelog issue or a recommendation to run `/update-changelog`, not one issue per entry. -- **One child issue**: for each independently actionable fix PR, revert consideration, maintainer question, follow-up task, or non-OK worked-issue outcome (`partial`, `missed`, `regressed`, or `unknown`) that needs follow-up. +- **One child issue**: for each independently actionable fix PR, revert consideration, maintainer question, follow-up task, non-OK worked-issue outcome (`partial`, `missed`, `regressed`, or `unknown`), or non-OK QA coverage outcome (`blocked`, `unknown`, or release-audit `in_progress`) that needs follow-up. - **Parent issue**: create one parent issue only to group two or more related _child fix_ issues from the same audit. Do **not** create a standalone audit-snapshot tracker (a `Post- audit` / `Post-rc.N catch-up audit` @@ -234,15 +273,17 @@ Only the coordinator should create issues. Independent Codex and Claude audits s Return high-risk findings first, then: 1. 
Review-gate violations, including PRs merged before requested reviews finished, before actionable review findings were triaged, or with AI review systems incorrectly counted as approval gates. -2. Missing changelog candidates, with a single recommendation to run `/update-changelog` when any are found. -3. Cross-PR interaction risks. -4. A deduped issue plan with parent/child recommendations and fingerprints. -5. A worked-issue coverage table with issue number, coordination lane/branch, - linked PR or no-PR/blocker evidence, final state, intent-achievement - classification, and `UNKNOWN` facts (see the example in - `.agents/workflows/post-merge-audit.md`). -6. A PR-by-PR table. -7. Exact commands and data sources used, including bounded `agent-coord status` +2. QA coverage findings, including missing, stale, still-`UNKNOWN` coverage/scope, + or insufficient required QA evidence. +3. Missing changelog candidates, with a single recommendation to run `/update-changelog` when any are found. +4. Cross-PR interaction risks. +5. A deduped issue plan with parent/child recommendations and fingerprints. +6. A worked-issue/QA-lane coverage table with issue number or QA lane id, + coordination lane/branch, linked PR or no-PR/blocker/QA evidence, final + state, issue intent-achievement or QA-coverage classification, and `UNKNOWN` + facts (see the example in `.agents/workflows/post-merge-audit.md`). +7. A PR-by-PR table. +8. Exact commands and data sources used, including bounded `agent-coord status` output for the named batch or the exact reason coordination state was `UNKNOWN`. 
diff --git a/.agents/skills/pr-batch/SKILL.md b/.agents/skills/pr-batch/SKILL.md index 6ad5dfdc7d..09334093cb 100644 --- a/.agents/skills/pr-batch/SKILL.md +++ b/.agents/skills/pr-batch/SKILL.md @@ -89,9 +89,19 @@ Before implementation or worker launch, produce: - likely outcome: implementation PR, combined investigation PR, no-PR evidence comment, or product-decision blocker - assigned machine or worker 6. The selected `merge_authority` value and how it affects final closeout. -7. A permission and trust preflight result. -8. A conflict check for overlapping files or dependent PRs. -9. A final `/goal` prompt when the user asked for Goal mode. +7. The Batch QA Lane decision from `.agents/workflows/pr-processing.md`: plan + the required QA lane representation for private coordination state when the + backend is available, or require the launched coordinator to record QA + claim/heartbeat state as `UNKNOWN` and use allowed fallback evidence (see the + canonical Batch QA Lane "allowed fallback evidence" definition in + `.agents/workflows/pr-processing.md`); for low-risk omitted QA, record the + final QA Evidence block with `QA required: no`, a `QA lane status` of + `not_applicable`, and the rationale. + Use the canonical capitalization convention: uppercase `UNKNOWN` means + coordination/backend state, and lowercase `unknown` is the QA lane status. +8. A permission and trust preflight result. +9. A conflict check for overlapping files or dependent PRs. +10. A final `/goal` prompt when the user asked for Goal mode. If the user is in `/plan` or asks for a plan-to-goal handoff, stop after the `/goal` prompt. Do not begin implementation from plan approval unless the user explicitly says to launch now. @@ -132,10 +142,22 @@ Targets: . Lane: . Mode: spawn worker subagents only after the target list and lane split are confirmed. merge_authority: . +Batch QA Lane: . 
Coordination: follow `.agents/workflows/pr-processing.md` under Coordination State and Worker Rules before creating worktrees or branches. Include stable agent ids, `agent-coord status` / claim outcomes, batch ids, dependency refs, and any `UNKNOWN` state in every worker lane and handoff. +When the Batch QA Lane section requires QA, declare a `qa` lane with stable +owner and claim/heartbeat expectations before launch when the private backend is +available. If private state is unavailable, record QA claim/heartbeat state as +`UNKNOWN` and use allowed fallback evidence (see the canonical Batch QA Lane +"allowed fallback evidence" definition in `.agents/workflows/pr-processing.md`) +instead of downgrading required QA to `not required`. Require the final QA +Evidence block in the handoff; if QA is not required, record the `not required` +status and rationale in that block. +Use the canonical capitalization convention from `.agents/workflows/pr-processing.md`: +uppercase `UNKNOWN` means coordination/backend state, and lowercase `unknown` +is the QA lane status. Attention contract: follow `AGENTS.md` under Maintainer Attention Contract and `.agents/workflows/pr-processing.md` under Maintainer Attention Contract. Do not escalate behavior-preserving optional nits, batch real questions into one @@ -310,11 +332,22 @@ Use the canonical Batch Handoff Format in **Immediate maintainer attention** for true blockers and questions only, and **FYI / decisions made** for decisions, validations, review state, hosted-CI requests already handled, no-PR rationales, autonomous nit outcomes, -confidence notes, decision-point counts per PR, and per-PR merge-ledger summaries. +confidence notes, decision-point counts per PR, QA Evidence blocks that include +`Tested at`, the QA required decision and rationale, QA lane status, and per-PR +merge-ledger summaries. Do not call a target `complete` while its ledger has `UNKNOWN` fields or `complete_allowed: false`. 
+Do not report a batch that requires QA as ready while required QA coverage/scope +evidence is missing, stale, scope-mismatched, `blocked`, `in_progress`, +`unknown`, or still `UNKNOWN`; the only allowed fallback is a QA lane whose +private coordination claim/heartbeat is `UNKNOWN` while documented QA evidence +is otherwise complete. Record the selected `merge_authority` value in the handoff and use the canonical split final states from `.agents/workflows/pr-processing.md`. +For every batch that requires QA under the canonical workflow's Batch QA Lane +decision, make sure the QA lane is represented in private coordination state +when available, or explicitly recorded as `UNKNOWN` with the canonical Batch QA +Lane fallback evidence, and in the final handoff. ## Coordination State diff --git a/.agents/workflows/post-merge-audit.md b/.agents/workflows/post-merge-audit.md index b235403686..ccb1d345df 100644 --- a/.agents/workflows/post-merge-audit.md +++ b/.agents/workflows/post-merge-audit.md @@ -79,6 +79,17 @@ List every issue/PR you worked on in this batch, with: - any risk you would want a maintainer to re-check after merge - anything that might interact badly with other PRs from the same batch +List any QA lane or intentionally omitted QA lane, with: +- QA lane id/owner, claim status, and last heartbeat status +- QA Evidence block URL or copied contents +- `Tested at` head(s) or audited range +- `QA required`, QA required rationale, and QA lane status / coverage result +- release-blocking status and any findings + +Use the capitalization convention from +`.agents/workflows/pr-processing.md` -> **Batch QA Lane**: uppercase `UNKNOWN` +means coordination/backend state, and lowercase `unknown` is the QA lane status. + If you do not know or cannot verify an item from GitHub/local git, say UNKNOWN rather than guessing. 
``` @@ -176,31 +187,56 @@ Treat `worked_issue_scope: not applicable`, `worked_issue_scope: UNKNOWN (...)`, and `worked_issue_scope: empty (...)` as merged-PR-range-only or advisory scope states, not verified batch subsets. -Ask me to confirm the included/excluded worked issues, advisory `codex-claim` -rows, and PR range before deep audit unless I explicitly say to proceed. When -the scope is `UNKNOWN (needs batch confirmation)`, ask me to choose the -candidate batch/run id before any confirmed worked-issue audit. - -After confirmation, audit each known worked issue or advisory `codex-claim` row -for: -- whether the implementation, no-PR comment, blocker, or parked disposition - satisfied the issue intent and acceptance criteria +After the scope algorithm identifies the batch or reports an `UNKNOWN` scope, +collect any QA lane and QA Evidence block for that batch. Do not use missing QA +state to shrink the worked-issue scope; report it as a QA coverage finding or +`UNKNOWN` fact instead. + +Ask me to confirm the included/excluded worked issues, collected QA lanes and QA +Evidence blocks, advisory `codex-claim` rows, and PR range before deep audit +unless I explicitly say to proceed. When the scope is +`UNKNOWN (needs batch confirmation)`, ask me to choose the candidate batch/run id +before any confirmed worked-issue audit. 
+ +After confirmation, audit each known worked issue, QA lane, or advisory +`codex-claim` row for: +- whether the implementation, no-PR comment, QA evidence, blocker, or parked + disposition satisfied the issue or QA-lane intent and acceptance criteria - whether the final issue state is correct: merged, closed, still open, parked, blocked, no-PR, done-unmerged, or UNKNOWN +- for QA lanes, whether the QA lane status is correct: `satisfied`, `blocked`, + `waived`, still healthy `in_progress`, `not_applicable` when QA was not + required, or `unknown` - whether review comments, handoff expectations, confidence notes, validation - evidence, decision-point count, and Process Gap Disposition fields were - handled when required + evidence, QA evidence, decision-point count, and Process Gap Disposition + fields were handled when required - classify each worked issue as `in_progress`, `realized`, `partial`, `missed`, `regressed`, `stalled`, or `unknown`, using `.agents/workflows/continuous-evaluation-loop.md` for the intent-achievement - definitions -- for healthy `in_progress` lanes and evidenced `realized` outcomes, record no - action in the worked-issue table; for `stalled` lanes, recommend resume, - reassign, or drop unless the user explicitly approves tracking the stalled - lane as an issue; for any other non-OK worked-issue class (`partial`, - `missed`, `regressed`, or `unknown`), merged or not, prepare a post-merge - audit issue-plan entry or an explicit coordinator action naming the missing - evidence or decision + definitions; classify QA lanes with the QA-coverage result `satisfied`, + `blocked`, `waived`, `in_progress`, `not_applicable`, or `unknown`. 
Use + `satisfied` when required QA evidence is current, adequately scoped, and has no + untriaged release-blocking finding; `blocked` when a release-blocking QA + finding still needs a fix or waiver; `waived` when an explicit waiver exists + and the auditor verifies a maintainer comment URL, issue link, or PR body entry + names the finding, scope, and reason; `in_progress` when required QA is not + complete; `not_applicable` when QA was correctly omitted with `QA required: no` + and a documented rationale; and `unknown` when evidence is missing, stale, or + incomplete. Verify `Release-blocking status` is derived from QA lane status: + `satisfied` -> `clear`, `blocked` -> `blocked`, `waived` -> `waived`, + `not_applicable` -> `not_applicable`, and `in_progress` / `unknown` -> `blocked`. +- for healthy `in_progress` worked-issue lanes, evidenced `realized` outcomes, + evidenced `satisfied` or `waived` QA lanes, and evidenced `not_applicable` QA + omissions, record no action in the worked-issue/QA table; treat required QA + lanes still `in_progress` during readiness/release audits as QA coverage + findings; for `stalled` lanes, recommend resume, reassign, or drop unless the + user explicitly approves tracking the stalled lane as an issue; for any other + non-OK worked-issue class (`partial`, `missed`, `regressed`, or `unknown`), + merged or not, prepare a post-merge audit issue-plan entry or an explicit + coordinator action naming the missing evidence or decision; for non-OK QA + coverage outcomes (`blocked`, `unknown`, or release-audit `in_progress`), + prepare a post-merge audit issue-plan entry or approved coordinator action + naming the missing evidence, fix, or waiver decision Also audit each included merged PR for: - risky behavior change @@ -221,8 +257,14 @@ Also audit each included merged PR for: - AI review findings that were ignored even though they identified a confirmed blocker such as a correctness regression, failing test, security issue, API contract 
break, data-loss risk, or missing required maintainer approval - requested adversarial reviews that were late, stale, missing, or left untriaged `BLOCKING`/`DISCUSS` findings - untriaged Must Fix, SHOULD-FIX, DISCUSS, Changes Requested, compatibility, security, regression, or missing-changelog review findings -- changes touching CI, packaged/commercial code, build config, code generators, performance- or - framework-sensitive paths, shared types, or release-sensitive docs (per `AGENTS.md`) +- missing, stale, insufficiently scoped, head/range-ambiguous, release-blocking, + or still-`UNKNOWN` QA coverage/scope evidence required by + `.agents/workflows/pr-processing.md`; do not treat private coordination + claim/heartbeat `UNKNOWN` as blocking when the documented fallback evidence is + complete and names a concrete QA owner and branch/worktree +- changes touching CI, packaged/commercial code, build config, code generators, + performance- or framework-sensitive paths, shared types, or release-sensitive + docs (per `AGENTS.md`) - anything that could have bad consequences after merge Classify each PR: @@ -245,17 +287,24 @@ For every non-OK finding, include a draft issue entry but do not create it: `checklist+replay`, or `park`), `Motivating miss`, `Replay evidence or park reason`, and `Non-goal` -Return high-risk findings first, then review-gate violations, missing changelog -candidates, cross-PR interaction risks, the issue plan, a worked-issue coverage -table, a PR-by-PR table, and exact commands/data sources. Include any remaining -`UNKNOWN` facts and the command or permission needed to resolve them. Do not make -code changes, comments, labels, issues, reverts, or PRs without approval. -The worked-issue coverage table must include issue number, coordination -lane/branch, linked PR or no-PR/blocker evidence, final state, -intent-achievement classification, and `UNKNOWN` facts. 
+Return high-risk findings first, then review-gate violations, QA coverage +findings, missing changelog candidates, cross-PR interaction risks, the issue +plan, a worked-issue/QA-lane coverage table, a PR-by-PR table, and exact +commands/data sources. Include any remaining `UNKNOWN` facts and the command or +permission needed to resolve them. Do not make code changes, comments, labels, +issues, reverts, or PRs without approval. +The worked-issue/QA-lane coverage table must include issue number or QA lane id, +coordination lane/branch, linked PR or no-PR/blocker/QA evidence, final state, +issue intent-achievement or QA-coverage classification, and `UNKNOWN` facts. Example worked-issue coverage table (`batch-abc` and issue numbers are placeholders; replace them with the real batch id and issues): + +`Final state` is the operational lane outcome. `Classification` is the +worked-issue intent class or QA-coverage result. +Use `qa` for a single QA lane. Use `qa:` for scoped QA sub-lanes, +matching the coordinator lane name. 
+ | Issue | Lane/branch | Evidence | Final state | Classification | UNKNOWN facts | | --- | --- | --- | --- | --- | --- | | #1234 | batch-abc:issue-1234 / codex/example | PR #2345 merged | merged | realized | none | @@ -263,6 +312,11 @@ placeholders; replace them with the real batch id and issues): | #1236 | batch-abc:issue-1236 / codex/partial-example | PR #2346 merged | merged | partial | acceptance criteria C not addressed | | #1237 | UNKNOWN (advisory) / no coord data | codex-claim comment URL (advisory) | UNKNOWN | unknown | coordination state needed to confirm | | #1238 | batch-abc:issue-1238 / codex/done-no-merge | no-PR evidence comment URL | done-unmerged | realized | none | +| qa | batch-abc:qa / codex/qa-lane | QA Evidence block URL | done | satisfied | none | +| qa | not required / no branch | handoff comment URL (inline QA Evidence block) | not_applicable | not_applicable | none | +| qa | batch-abc:qa / codex/qa-lane | QA Evidence block URL | blocked | blocked | fix or waiver needed before release | +| qa | batch-abc:qa / codex/qa-lane | maintainer waiver URL | done | waived | none | +| qa:ci | batch-abc:qa:ci / codex/qa-ci-lane | QA Evidence block URL | done | satisfied | none | ``` ## Comparison Prompt @@ -287,6 +341,7 @@ For each finding: Pay special attention to disagreements: - one agent flags risk and the other misses it +- different QA coverage findings, QA lane states, or QA Evidence freshness/scope - different worked-issue inclusion lists, including one agent having coordination data while the other records `worked_issue_scope: UNKNOWN` - when one report has verified coordination data and another has @@ -299,7 +354,8 @@ Pay special attention to disagreements: needed before any confirmed worked-issue audit can proceed; continue auditing advisory `codex-claim` rows alongside the merged PR range, keeping those rows marked `UNKNOWN` -- different intent-achievement classifications for the same worked issue +- different intent-achievement 
classifications for the same worked issue or
+  QA-coverage classifications for the same QA lane
 - different PR inclusion lists
 - different release-candidate base
 - different interpretation of validation evidence
@@ -310,13 +366,15 @@ Pay special attention to disagreements:
 Return:
 1. consensus high-risk findings
 2. reconciled review-gate violations
-3. disputed findings needing human review
-4. PRs both agents consider OK
-5. deduped issue plan
-6. reconciled worked-issue coverage table with issue number, coordination
-   lane/branch, linked PR or no-PR/blocker evidence, final state,
-   intent-achievement classification, and any unresolved `UNKNOWN` facts
-7. recommended next actions, including a coordinator resume/reassign/drop
+3. reconciled QA coverage findings
+4. disputed findings needing human review
+5. PRs both agents consider OK
+6. deduped issue plan
+7. reconciled worked-issue/QA-lane coverage table with issue number or QA lane
+   id, coordination lane/branch, linked PR or no-PR/blocker/QA evidence, final
+   state, issue intent-achievement or QA-coverage classification, and any
+   unresolved `UNKNOWN` facts
+8. recommended next actions, including a coordinator resume/reassign/drop
    decision for `stalled` lanes instead of defaulting to issue creation

 Do not create issues or PRs yet.
@@ -333,7 +391,9 @@ Rules:
 - Search existing open issues for each fingerprint and affected PR number before creating anything.
 - Do not create duplicate child issues. If an issue already exists, link it in the parent issue plan instead.
 - If there are two or more related child issues, create one parent issue first.
-- Create one child issue per independently actionable fix PR, revert consideration, maintainer question, or follow-up task.
+- Create one child issue per independently actionable fix PR, revert
+  consideration, maintainer question, follow-up task, or approved non-OK
+  worked-issue/QA coverage follow-up.
 - For release-gate audits, append the audit report to the release-gate audit
   ledger before creating approved follow-up issues; include the resulting
   ledger comment URL in every parent and child issue body.
diff --git a/.agents/workflows/pr-processing.md b/.agents/workflows/pr-processing.md
index c69daf7015..8e8b59e8b4 100644
--- a/.agents/workflows/pr-processing.md
+++ b/.agents/workflows/pr-processing.md
@@ -441,6 +441,201 @@ only a caveated no-PR `park` disposition or a product-decision blocker.
 Workers should not turn product-decision blockers into speculative PRs. They should post or draft the
 evidence-backed question and stop that target.

+### Batch QA Lane
+
+Convention: `UNKNOWN` in capitals means coordination/backend state could not be
+verified; lowercase `unknown` is the QA lane status value.
+
+Use a QA lane when a batch needs evidence beyond each individual worker's local validation before
+coordinator closeout, release-readiness, or release-promotion decisions rely on the batch. QA is a
+sibling lane to implementation and audit work: it verifies the user-visible or operator-visible result
+of the batch, while audit verifies that the QA coverage and evidence were adequate.
+
+Create an explicit QA lane for release-affecting batches, RC or final-release preparation,
+CI/tooling changes, generated-example or generator-output changes,
+developer-workflow changes (runtime tool behavior, CLI commands, build/CI
+paths, or generated developer outputs, not process-documentation or
+agent-instruction-only edits), broad runtime behavior changes, and any batch
+where the coordinator cannot tell from worker validation alone whether the
+intended surfaces were exercised. These required categories take precedence over
+low-risk exceptions.
For docs-only, no-code process, no-PR evidence, and other
+low-risk batches that are not release-affecting, developer-workflow-affecting,
+or otherwise covered by the required categories above, QA may be recorded as
+`not required` with a one-line rationale instead of spawning a separate worker.
+
+For mixed batches, apply QA to any subset that would individually qualify under the required-QA
+categories above, including release-affecting, workflow/build/tooling, generated-output,
+developer-workflow, or broad runtime changes, even when the remaining targets would be low-risk on
+their own. Record that qualifying subset in the QA Evidence `Scope checked` field and, when the
+coordination backend has a supported lane note or metadata field, in the final lane state; do not
+invent new backend schema.
+
+Coordinate QA with the same primitives as other batch lanes:
+
+- The coordinator declares the QA lane in private batch state when the backend is available, for
+  example as lane `qa` or a backend-supported synthetic target such as `:qa`. Do not add new
+  backend schema requirements in this workflow; use the current private backend README/schema for the
+  exact representation.
+  For scoped QA sub-lanes, use `qa:` in human-facing evidence and
+  the nearest supported private-backend lane representation.
+- The QA owner gets a stable agent id, branch/worktree ownership when files may be edited, and
+  `agent-coord claim` / `agent-coord heartbeat` updates at lane start, evidence refresh, blocked state,
+  resumed state, and done state. If private state is unavailable, record claim and heartbeat state as
+  `UNKNOWN` and use the public claim-comment fallback only where the dependency rules allow it. Even in
+  fallback mode, required QA needs a concrete owner and branch/worktree; only private claim/heartbeat
+  sub-values may be `UNKNOWN`.
**Allowed fallback evidence:** complete the QA Evidence fields and use
+  `UNKNOWN` only for the private claim/heartbeat sub-values that cannot be verified.
+- QA may run in parallel with audit or closeout once changed areas and candidate PRs are known, but it
+  must not push dependent changes while declared `blocked_on` refs remain unmet.
+- QA findings are triaged like other batch findings: release-blocking issues stop readiness or
+  promotion until fixed or explicitly waived, while non-blocking process improvements are bundled in
+  the handoff and become follow-up issues only when the Follow-Up Tracking Policy allows it.
+  Waivers require an explicit maintainer comment URL, issue link, or PR body entry naming the finding,
+  scope, and reason.
+
+Each final batch handoff that has a QA lane, or intentionally omits one, includes this evidence block:
+
+```markdown
+### QA Evidence
+
+- QA lane:
+- Scope checked:
+- Tested at:
+- Automated checks:
+- Manual checks:
+- Findings:
+- QA required:
+- QA required rationale:
+- QA lane status:
+- Release-blocking status:
+- Process-gap disposition: