PETE - Recover from dropped pull_request events (#13667)

jonvanausdeln · web-flow · commit d3be055c4a26 · 2026-05-22T07:59:11.000-07:00
## Summary

- Listen for `synchronize` and `ready_for_review` on top of `opened` in `.github/workflows/pr-test-checker.yml` so PETE has a recovery path when GitHub drops a `pull_request: opened` delivery.
- Add an idempotency guard to `.github/actions/pr-test-checker/action.yml`: any `pull_request`-triggered run short-circuits when a PETE comment already exists on the PR. `synchronize` / `ready_for_review` therefore only do work when the original `opened` was missed -- they don't re-grade on every push.
- `/recheck-tests` (the `issue_comment` flow) sets `force-regrade=true` and bypasses the guard, so a manual re-grade is never short-circuited.
- Rename the `pull_request` job from `Grade on open` to `Grade on PR change` to reflect the broadened trigger set.

## Motivation

[#13664](#13664) (opened by @samclark2015) didn't get a PETE verdict because GitHub silently dropped the `pull_request: opened` event delivery. The `pull_request_target` channel (CLA) fired fine, but no `pull_request` workflows fired on open. PETE only listened for `opened`, so it had no recovery path; other workflows like *PR: Comment* recovered because they also listen for `synchronize`.

## Observable behavior

- New PR opened (event delivered): PETE grades on `opened`. Done.
- New PR opened (event dropped): no comment exists, so the first `synchronize` (next push) or `ready_for_review` (draft -> ready) fires PETE, which posts the verdict.
- Subsequent pushes once a verdict exists: action looks up the existing PETE comment, exits early, no LLM tokens spent.
- `/recheck-tests` comment: always re-grades regardless of any existing comment.
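
The cases under "Observable behavior" reduce to a two-input decision: whether `force-regrade` is set, and whether a bot-authored PETE comment was found. Below is a minimal sketch of that decision in plain bash. The function name `pete_decision` and its argument order are hypothetical and exist only for illustration; the real guard lives in the composite action's first step and gets the comment id from `gh api`.

```shell
#!/usr/bin/env bash
# Hypothetical helper mirroring the guard's decision; NOT part of the action.
# Args:
#   $1  force_regrade  "true" for the /recheck-tests flow, anything else otherwise
#   $2  existing_id    id of the bot-authored PETE comment, or "" if none was found
# Prints "grade" or "skip".
pete_decision() {
  local force_regrade="$1"
  local existing_id="$2"

  # /recheck-tests sets force-regrade=true, so the guard never applies.
  if [ "$force_regrade" = "true" ]; then
    echo "grade"
    return 0
  fi

  # pull_request-triggered run: skip only when a verdict is already posted.
  if [ -n "$existing_id" ]; then
    echo "skip"
  else
    echo "grade"
  fi
}

pete_decision false ""      # opened delivered, or first synchronize after a dropped opened
pete_decision false 12345   # later push once a verdict exists
pete_decision true 12345    # /recheck-tests comment
```

Run as-is, the three calls print `grade`, `skip`, and `grade`. Note that the sketch never looks at the event type: a delivered `opened` and a recovering `synchronize` take the same path, which is what makes the extra triggers safe to add.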
diff --git a/.github/actions/pr-test-checker/action.yml b/.github/actions/pr-test-checker/action.yml
@@ -10,6 +10,10 @@ inputs:
   repo-root:
     description: "Absolute path to the PR-head checkout the agent reads source from. The action itself runs from a trusted base-branch checkout; this input lets the workflow point the agent at a separate, untrusted checkout for source-reading only. Must be set for any flow that handles secrets."
     required: true
+  force-regrade:
+    description: "When 'true', always run the analyzer even if a PETE comment already exists. Set by the /recheck-tests flow so a manual re-grade is never short-circuited. Default 'false' so `pull_request` events (opened / synchronize / ready_for_review) skip when a verdict has already been posted -- this lets `synchronize` recover a missed `opened` delivery without re-grading every subsequent push."
+    required: false
+    default: "false"
   model:
     description: "Anthropic model identifier"
     required: false
@@ -21,12 +25,42 @@ inputs:
 runs:
   using: composite
   steps:
+    - name: Check for prior grade (idempotency guard)
+      # When called from a `pull_request` event, skip if a PETE comment
+      # already exists on the PR. This lets `synchronize` / `ready_for_review`
+      # act purely as recovery channels for a dropped `opened` delivery
+      # (see #13664) without re-grading on every subsequent push. The
+      # /recheck-tests flow sets force-regrade=true to bypass this.
+      id: idempotency
+      if: inputs.force-regrade != 'true'
+      shell: bash
+      env:
+        GH_TOKEN: ${{ github.token }}
+        GITHUB_REPOSITORY: ${{ github.repository }}
+        PR_NUMBER: ${{ inputs.pr-number }}
+      run: |
+        MARKER="<!-- pr-test-checker -->"
+        # Match on BOTH the marker AND the bot author -- a non-bot commenter
+        # could otherwise paste the marker to suppress future auto-grading
+        # until someone manually runs /recheck-tests.
+        EXISTING_ID=$(gh api "repos/${GITHUB_REPOSITORY}/issues/${PR_NUMBER}/comments" --paginate \
+          --jq "[.[] | select(.body | contains(\"$MARKER\")) | select(.user.login == \"github-actions[bot]\")][0].id // empty")
+        if [ -n "$EXISTING_ID" ]; then
+          echo "PETE comment already exists (id=${EXISTING_ID}); skipping. Use /recheck-tests to force a re-grade."
+          echo "skip=true" >> "$GITHUB_OUTPUT"
+        else
+          echo "No PETE comment yet; proceeding."
+          echo "skip=false" >> "$GITHUB_OUTPUT"
+        fi
+
     - name: Set up Node
+      if: steps.idempotency.outputs.skip != 'true'
       uses: actions/setup-node@v4
       with:
         node-version: "20"
 
     - name: Install action dependencies
+      if: steps.idempotency.outputs.skip != 'true'
       shell: bash
       working-directory: ${{ github.action_path }}
       run: npm ci --no-audit --no-fund
@@ -36,6 +70,7 @@ runs:
       # the musl variant over glibc on Linux runners. Installing globally and
       # passing the resolved path via pathToClaudeCodeExecutable sidesteps it.
       # See claude-agent-sdk-typescript#296.
+      if: steps.idempotency.outputs.skip != 'true'
       shell: bash
       run: |
         npm install -g @anthropic-ai/claude-code
@@ -44,6 +79,7 @@ runs:
 
     - name: Prepare working directory
       id: prep
+      if: steps.idempotency.outputs.skip != 'true'
       shell: bash
       run: |
         WORK_DIR="${RUNNER_TEMP}/pr-test-checker"
@@ -55,6 +91,7 @@ runs:
       # Runs the gather script from the action's own directory (i.e. the
       # trusted base-branch checkout), not from the PR head. PR-head code
       # must never execute with secrets in scope.
+      if: steps.idempotency.outputs.skip != 'true'
       shell: bash
       env:
         GH_TOKEN: ${{ github.token }}
@@ -72,6 +109,7 @@ runs:
       # so they are read from the trusted base-branch checkout. Only
       # REPO_ROOT (where the agent's Read/Grep/Glob tools operate) points
       # at the untrusted PR-head checkout.
+      if: steps.idempotency.outputs.skip != 'true'
       shell: bash
       working-directory: ${{ github.action_path }}
       env:
@@ -84,6 +122,7 @@ runs:
       run: node analyze.mjs
 
     - name: Upsert PR comment
+      if: steps.idempotency.outputs.skip != 'true'
       shell: bash
       env:
         GH_TOKEN: ${{ github.token }}
@@ -98,10 +137,12 @@ runs:
         fi
 
         MARKER="<!-- pr-test-checker -->"
-        # Find an existing comment with our marker. Pull only what we need from
-        # the comments listing (id + body) to keep the response small.
+        # Find an existing comment with our marker AND authored by the bot.
+        # Filtering on author prevents a spoofed marker in a human comment
+        # from being picked up here (either as a PATCH target that 403s, or
+        # by hiding the bot's own comment from the upsert lookup).
         EXISTING_ID=$(gh api "repos/${GITHUB_REPOSITORY}/issues/${PR_NUMBER}/comments" --paginate \
-          --jq "[.[] | select(.body | contains(\"$MARKER\"))][0].id // empty")
+          --jq "[.[] | select(.body | contains(\"$MARKER\")) | select(.user.login == \"github-actions[bot]\")][0].id // empty")
 
         if [ -n "$EXISTING_ID" ]; then
           echo "Updating existing comment $EXISTING_ID"
@@ -115,7 +156,7 @@ runs:
         fi
 
     - name: Upload analyzer artifacts
-      if: always()
+      if: always() && steps.idempotency.outputs.skip != 'true'
       uses: actions/upload-artifact@v4
       with:
         name: pr-test-checker-${{ inputs.pr-number }}
diff --git a/.github/workflows/pr-test-checker.yml b/.github/workflows/pr-test-checker.yml
@@ -4,6 +4,18 @@ name: "PR Test Checker"
 # Pilot scope: only runs for PR authors / commenters in the allowlist below.
 # To widen the pilot, edit both ALLOWLIST refs in this file (kept in sync).
 #
+# Trigger recovery: we listen for `opened`, `synchronize`, AND `ready_for_review`
+# on `pull_request`, not just `opened`. GitHub occasionally drops `opened`
+# deliveries (see #13664, where the `pull_request_target` CLA event fired but
+# `pull_request: opened` didn't, leaving PETE with no verdict), so the extra
+# types give us a recovery path on the next push or draft->ready transition.
+# Observable behavior: PETE grades on PR creation, and only re-grades on
+# explicit `/recheck-tests`. To avoid re-grading on every push, the action
+# short-circuits any `pull_request`-triggered run when a PETE comment already
+# exists on the PR -- `synchronize` and `ready_for_review` therefore only do
+# work in the rare case where the initial `opened` was missed. `/recheck-tests`
+# (an `issue_comment` event) sets `force-regrade=true` and bypasses the guard.
+#
 # Security model: each job checks out two trees -- the BASE branch (trusted
 # action + skill code, used to run the analyzer with secrets) and the PR HEAD
 # (untrusted source code, mounted as read-only data for the agent's
@@ -18,13 +30,15 @@ on:
   pull_request:
     types:
       - opened
+      - synchronize
+      - ready_for_review
   issue_comment:
     types:
       - created
 
 jobs:
   on-open:
-    name: Grade on open
+    name: Grade on PR change
     if: |
       github.event_name == 'pull_request' &&
       contains(fromJson('["jonvanausdeln", "sharon-wang", "melissa-barca", "samclark2015", "timtmok"]'), github.event.pull_request.user.login)
@@ -143,3 +157,4 @@ jobs:
           pr-number: ${{ github.event.issue.number }}
           anthropic-api-key: ${{ env.ANTHROPIC_KEY }}
           repo-root: ${{ github.workspace }}/pr-head
+          force-regrade: "true"