You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
The merge-guard binding is sound and dual-independently verified (#1042 live-probe, convergent PASS). This issue is about the separate, real problem the verification surfaced in practice: the mint-side approval UX fails silently and is hard to get right. A correct, canonical approval template does exist (pact-plugin/commands/peer-review.md:437-442), but any deviation from it mints no token, gives no diagnostic, and just re-prompts — so a single merge can take several approval prompts and still not land, with nothing telling the approver why.
Scope guardrail: this is a MINT-SIDE ERGONOMICS issue. Do not weaken the set-equality binding (the #1042 fix) to resolve it. The goal is to make legitimate approval reliable and self-correcting, not to loosen what gets bound.
Evidence (live, this session)
While merging a trivial 6-line docs PR after closing #1042, the operator was prompted to approve three times before it landed:
Decision prompt (merge vs. leave) — approved.
Focused approval, command in the option label without backticks (Approve: gh pr merge <N> --squash) — silently minted nothing → NO_TOKEN.
Focused approval, command in the labelwith backticks (Approve \gh pr merge ``) — also failed in an earlier variant; the merge only succeeded once the command was correctly anchored.
Root of the fumble: the orchestrator improvised command-in-label instead of using the documented peer-review.md form (short "Yes, merge" label + command in the description). The guard's refuse message —
Merge/close/force-push/branch-delete requires user approval. Re-approve via AskUserQuestion with the literal command embedded in the selected option ...
— never mentions the canonical template, the backtick requirement, the no-extra-flags rule, or the decline-veto wordset, so it does not steer the approver back to the form that works. Even the orchestrator that had just certified this hook got it wrong twice.
Sister friction (same session, #1037 class): the is_dangerous_command self-block fired on benign read-only diagnostics three times — a security-engineer grep | … and an inline python -c that merely quotedgh pr merge, and a lead grep "gh pr merge" | head. Each was a non-destructive read mis-flagged as a compound destructive op. Same false-positive family.
Root cause — current mint contract (empirically reproduced against installed v4.4.44)
The canonical, documented form (peer-review.md:437-442) mints reliably: SINGLE-select, one question, a short affirmative label "Yes, merge", and the command in that option's description in backticks (Run `gh pr merge <N>` to merge this PR), with <N> the only number anywhere.
Deviations that silently produce NO-MINT (each reproduced on the real merge_guard_post):
Command in the LABEL instead of the description. The mint resolves the selected option by byte-equality between answers[question] and the option label; a genuine click on a long command-as-label can return an answer that doesn't byte-round-trip (truncation/normalization) → no match → no mint. (Backticks are also required for op/target extraction.) The description-based template sidesteps this entirely.
multiSelect: true → never mints (multiSelect is refused as a mint source, and the veto then scans every option).
A decline/defer word in the SELECTED affirmative option's own label or description (the merge_guard: decline-veto false-positives on command-literal tokens (--no-* flags, branch names) #1049 wordset, whole-word, case-insensitive): cancel|abort|reject|decline|deny|stop|no|nope|never|skip|don't|later|wait|hold|pause|reviewing|review|postpone|defer|pending|once|after + phrases not yet / review first / hold off / for now. (An unselected "Cancel" sibling does not veto in single-select — confirmed.) Note --no-verify as an approval self-vetoes because the substring no matches — whole-word matching is insufficient there.
Why it matters
A security control this easy to get silently wrong trains operators to fight it, work around it, or disable it — which is worse than a slightly-weaker-but-usable gate. The binding being correct doesn't help if legitimate approval is unreliable and undiagnosable.
Proposed improvements (acceptance criteria)
Self-teaching refuse message. On a NO_TOKEN/mismatch denial, the message states (or links) the canonical peer-review.md approval template: single-select, short "Yes, merge" label, command in the description in backticks, <N> the only number.
Non-silent mint failure. When an AskUserQuestion approval was issued but did not mint, emit a diagnostic naming why (command in label not description / byte-mismatch / multiSelect / decline-veto hit / multiple (op,target) / no options) instead of nothing.
One prompt per merge. Collapse decision + mint into a single approval so a normal merge is one prompt, not a re-prompt loop.
(Optional) Copy-pasteable helper. Surface the canonical approval block (or an allowlisted exact-command path) so the approver isn't reverse-engineering the parser.
Context / provenance
Surfaced during the #1042 post-install live-probe. The probe verified the binding thoroughly via the real entry points, but both verifiers disclosed the genuine user-click mint seam could only be tested with a faithful synthetic frame (the real token is approver-session-local) — so the genuine-click path was never exercised end-to-end until this session, which is where the ergonomics gap surfaced.
Summary
The merge-guard binding is sound and dual-independently verified (#1042 live-probe, convergent PASS). This issue is about the separate, real problem the verification surfaced in practice: the mint-side approval UX fails silently and is hard to get right. A correct, canonical approval template does exist (
pact-plugin/commands/peer-review.md:437-442), but any deviation from it mints no token, gives no diagnostic, and just re-prompts — so a single merge can take several approval prompts and still not land, with nothing telling the approver why.Evidence (live, this session)
While merging a trivial 6-line docs PR after closing #1042, the operator was prompted to approve three times before it landed:
Approve: gh pr merge <N> --squash) — silently minted nothing →NO_TOKEN.Approve \gh pr merge ``) — also failed in an earlier variant; the merge only succeeded once the command was correctly anchored.Root of the fumble: the orchestrator improvised command-in-label instead of using the documented
peer-review.mdform (short"Yes, merge"label + command in the description). The guard's refuse message —— never mentions the canonical template, the backtick requirement, the no-extra-flags rule, or the decline-veto wordset, so it does not steer the approver back to the form that works. Even the orchestrator that had just certified this hook got it wrong twice.
Sister friction (same session, #1037 class): the
is_dangerous_commandself-block fired on benign read-only diagnostics three times — a security-engineergrep | …and an inlinepython -cthat merely quotedgh pr merge, and a leadgrep "gh pr merge" | head. Each was a non-destructive read mis-flagged as a compound destructive op. Same false-positive family.Root cause — current mint contract (empirically reproduced against installed v4.4.44)
The canonical, documented form (
peer-review.md:437-442) mints reliably: SINGLE-select, one question, a short affirmative label"Yes, merge", and the command in that option's description in backticks (Run `gh pr merge <N>` to merge this PR), with<N>the only number anywhere.Deviations that silently produce NO-MINT (each reproduced on the real
merge_guard_post):answers[question]and the option label; a genuine click on a long command-as-label can return an answer that doesn't byte-round-trip (truncation/normalization) → no match → no mint. (Backticks are also required for op/target extraction.) The description-based template sidesteps this entirely.multiSelect: true→ never mints (multiSelect is refused as a mint source, and the veto then scans every option).cancel|abort|reject|decline|deny|stop|no|nope|never|skip|don't|later|wait|hold|pause|reviewing|review|postpone|defer|pending|once|after+ phrasesnot yet/review first/hold off/for now. (An unselected "Cancel" sibling does not veto in single-select — confirmed.) Note--no-verifyas an approval self-vetoes because the substringnomatches — whole-word matching is insufficient there.Why it matters
A security control this easy to get silently wrong trains operators to fight it, work around it, or disable it — which is worse than a slightly-weaker-but-usable gate. The binding being correct doesn't help if legitimate approval is unreliable and undiagnosable.
Proposed improvements (acceptance criteria)
NO_TOKEN/mismatch denial, the message states (or links) the canonicalpeer-review.mdapproval template: single-select, short"Yes, merge"label, command in the description in backticks,<N>the only number.AskUserQuestionapproval was issued but did not mint, emit a diagnostic naming why (command in label not description / byte-mismatch / multiSelect / decline-veto hit / multiple(op,target)/ no options) instead of nothing.noin--no-verify, branch names) appears in the approval command literal (scope to the selected option; move past substring to whole-structure matching). merge-guard: is_dangerous_command false-positive —gh issue commentbody quoting a dangerous command is rejected (substring-scan brittleness) #1037 —is_dangerous_command's substring scan blocking a benign command line that merely quotes a dangerous literal — agh issue comment/git commitbody, or agrepforgh pr merge(classify by the leading operation verb instead of substring-scanning the whole line). Both point the same way as this issue: structure over prose.Context / provenance
Surfaced during the #1042 post-install live-probe. The probe verified the binding thoroughly via the real entry points, but both verifiers disclosed the genuine user-click mint seam could only be tested with a faithful synthetic frame (the real token is approver-session-local) — so the genuine-click path was never exercised end-to-end until this session, which is where the ergonomics gap surfaced.
Related: #1042 (binding — verified sound, closed), #1037 (
is_dangerous_commandsubstring-scan false-positive), #1049 (decline-veto false-positive on command-literal tokens), #1031 / #1032 (auth-token mint/read symmetry). Canonical approval template:pact-plugin/commands/peer-review.md:437-442.