
feat(pr-approval-agent): surface PR and comment reactions to LLM reviewer #67262

Merged
webjunkie merged 6 commits into master from claude/stamphog-pr-approval-review-bzotzu
Jul 1, 2026

Conversation

@webjunkie

Contributor

Problem

The LLM reviewer in the PR approval agent currently sees review states and inline comments, but misses reaction signals (👍, 👎, 👀, etc.) that teammates and bot reviewers leave on PRs and comments. These reactions are lightweight signals that provide useful context — a 👍 suggests approval, a 👎 suggests concern, and an 👀 signals an in-flight review that should block approval.

Changes

GitHub API integration:

  • Extended the GraphQL query to fetch reactions on both the PR body and individual review comments (up to 20 per object)
  • Added _normalize_reactions() to extract user and content from GraphQL reaction nodes
  • Added _reaction_emoji() to normalize reaction content across REST ("+1", "-1") and GraphQL ("THUMBS_UP", "THUMBS_DOWN") formats to consistent emoji
  • Renamed _fetch_review_threads() to _fetch_threads_and_reactions() to reflect that it now returns both comments and PR-level reactions
  • Added pr_reactions field to PRData dataclass

LLM reviewer logic:

  • Updated the review prompt to include reactions on the PR and on individual comments, formatted as 👍 @user
  • Expanded the review guidelines to explain reaction semantics:
    • 👍/👎 are weak evidence (never approve on 👍 alone or refuse on 👎 alone)
    • 👀 (eyes) signals an in-flight review — refuse rather than approving over someone mid-review
    • For non-trivial changes, require at least one independent reviewer (agent or human) to have passed over the current head
  • Updated model from claude-sonnet-4-6 to claude-sonnet-5
  • Added reaction counts to PostHog analytics events
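For illustration, the `👍 @user` formatting could be rendered like this. It is a sketch only: `reaction_token` and `render_reactions` are hypothetical names, and the input is assumed to be a list of dicts with `user` and `content` keys where `content` is already an emoji.

```python
def reaction_token(reaction: dict[str, str]) -> str:
    """Render one reaction as 'emoji @user'."""
    return f"{reaction['content']} @{reaction['user']}"


def render_reactions(label: str, reactions: list[dict[str, str]]) -> str:
    """Render a labelled, comma-separated reaction line; empty string if none."""
    if not reactions:
        return ""
    return f"{label}: " + ", ".join(reaction_token(r) for r in reactions)


line = render_reactions(
    "Reactions on PR",
    [{"user": "alice", "content": "👍"}, {"user": "bob", "content": "👀"}],
)
print(line)
```

Returning an empty string when there are no reactions keeps the prompt free of empty "Reactions:" lines.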

Tests:

  • Added unit test for _reaction_emoji() covering REST/GraphQL normalization and unknown reaction passthrough

How did you test this code?

Added unit tests in test_github.py covering reaction emoji normalization across REST and GraphQL formats. The GraphQL query changes are exercised by the existing _fetch_threads_and_reactions() integration (formerly _fetch_review_threads()), which is called by fetch_pr() in the main flow. Existing tests for review thread fetching continue to pass with the refactored function signature.

🤖 Agent context

Autonomy: Human-driven (agent-assisted)

This change surfaces reaction signals to the LLM reviewer so it can factor in lightweight feedback from teammates and bot reviewers. The GraphQL query was extended to fetch reactions alongside review threads in a single call (no extra round trip). Reaction content is normalized across REST and GraphQL APIs to consistent emoji, and the review prompt guidelines were expanded to explain how to interpret each reaction type — particularly the 👀 signal, which should block approval until the in-flight review completes.
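A rough sketch of what the single-round-trip query might look like, held as a Python constant. The opening lines and variable names match the hunk quoted later in this conversation; the `reviewThreads` page sizes, the selected comment fields, and the `build_payload` helper are assumptions. All field names used (`reactions`, `content`, `user.login`, `reviewThreads`, `pageInfo`) exist in GitHub's GraphQL schema.

```python
from typing import Any, Optional

# Assumed shape: PR-level reactions plus per-comment reactions in one query.
THREADS_AND_REACTIONS_QUERY = """
query($owner: String!, $name: String!, $pr: Int!, $threadCursor: String) {
  repository(owner: $owner, name: $name) {
    pullRequest(number: $pr) {
      reactions(first: 20) {
        nodes { content user { login } }
      }
      reviewThreads(first: 50, after: $threadCursor) {
        pageInfo { hasNextPage endCursor }
        nodes {
          isResolved
          comments(first: 50) {
            nodes {
              body
              author { login }
              reactions(first: 20) {
                nodes { content user { login } }
              }
            }
          }
        }
      }
    }
  }
}
"""


def build_payload(owner: str, name: str, pr: int, thread_cursor: Optional[str] = None) -> dict[str, Any]:
    """Build the JSON body for a POST to the GraphQL endpoint (hypothetical helper)."""
    return {
        "query": THREADS_AND_REACTIONS_QUERY,
        "variables": {"owner": owner, "name": name, "pr": pr, "threadCursor": thread_cursor},
    }
```

Since the PR-level `reactions` selection sits outside the paginated `reviewThreads` connection, it is returned again on every thread page; re-reading that short list per page is cheap.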

https://claude.ai/code/session_01MfQHiE3VueG1cbMbKnNre9

claude added 3 commits July 1, 2026 06:42
Read PR and comment reactions (👍/👎/👀, both REST and GraphQL spellings)
and pass them to the LLM reviewer as weak context. Tighten the reviewer
prompt so that, for non-trivial changes, at least one independent
reviewer (agent or teammate) must have passed over the current head
before approval; an 👀 reaction is treated as an in-flight review and
refuses rather than approving over it. Fix the stale "approval states
are hidden" guidance — states are shown. Bump the reviewer model to
Sonnet 5.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01MfQHiE3VueG1cbMbKnNre9
The PullRequest node exposes reactions, so read PR-body reactions in the
same GraphQL round trip that already fetches review threads and their
comment reactions — dropping the dedicated REST reactions call. The REST
PR object only carries reaction counts, not reactor logins, so GraphQL is
the right consolidation point. Drops the per-reactor bot flag (GraphQL
reaction users don't expose it; the login already signals it).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01MfQHiE3VueG1cbMbKnNre9
Collapse the two inline GraphQL-reaction transforms into one
`_normalize_reactions(node)` helper (used for both PR-body and comment
reactions), dropping the page-tracking flag since re-reading the tiny
node list each page is trivial. Share the per-reaction `emoji @user`
render via a single `_reaction_token` helper across both prompt sites.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01MfQHiE3VueG1cbMbKnNre9
@webjunkie webjunkie requested a review from a team as a code owner July 1, 2026 06:59
Copilot AI review requested due to automatic review settings July 1, 2026 06:59
@assign-reviewers-posthog assign-reviewers-posthog Bot requested a review from a team July 1, 2026 06:59

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5719870a69

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread tools/pr-approval-agent/github.py Outdated

Copilot AI left a comment

Contributor

Pull request overview

This PR enhances the tools/pr-approval-agent pipeline by fetching and surfacing GitHub reactions (on the PR and inline review comments) to the LLM reviewer, so it can use lightweight teammate/bot signals as additional context during automated review.

Changes:

  • Extend the GitHub GraphQL fetch to include PR-level and inline-comment reactions, normalized to consistent emoji.
  • Include reaction annotations in the LLM review prompt and track reaction counts in PostHog analytics.
  • Add unit tests for REST/GraphQL reaction-content normalization.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

Summary per file

| File | Description |
| --- | --- |
| tools/pr-approval-agent/github.py | Adds reaction normalization helpers, extends GraphQL query, and returns PR reactions alongside review-thread comments. |
| tools/pr-approval-agent/reviewer.py | Updates model, prompt text, prompt formatting, and analytics properties to incorporate reactions. |
| tools/pr-approval-agent/review_pr.py | Adds PR reaction count to the completed-review analytics event. |
| tools/pr-approval-agent/test_github.py | Adds unit test coverage for _reaction_emoji() normalization behavior. |
| tools/pr-approval-agent/README.md | Documents that the LLM considers reactions (and how it interprets them). |

Comment thread tools/pr-approval-agent/reviewer.py
Comment thread tools/pr-approval-agent/github.py Outdated
Comment thread tools/pr-approval-agent/github.py Outdated
Comment thread tools/pr-approval-agent/README.md Outdated
@greptile-apps

greptile-apps Bot commented Jul 1, 2026

Contributor

Reviews (1): Last reviewed commit: "refactor(stamphog): dedupe reaction norm..." | Re-trigger Greptile

Comment thread tools/pr-approval-agent/github.py Outdated
Comment thread tools/pr-approval-agent/reviewer.py Outdated
@veria-ai

veria-ai Bot commented Jul 1, 2026

Contributor

PR overview

This PR adds PR-level and comment-level GitHub reaction data to the inputs provided to the PR approval agent/LLM reviewer. The touched GitHub integration code gathers reactions so review intent signals such as eyes or negative reactions can inform approval decisions.

There is one open issue remaining after one has already been addressed. The remaining risk is that reaction collection is limited before filtering for trusted users, so an author could crowd out trusted reactions and cause the reviewer to miss signals that a review is still in progress or should block approval. This is a concrete manipulation path with a limited but meaningful impact on automated approval behavior.

Open issues (1)

Fixed/addressed: 1 · PR risk: 6/10

Reactions on a public PR can come from anyone, so filter them to
recognized bot reviewers and org members and never the PR author before
surfacing them to the reviewer. Without this an external user could block
auto-approval with an 👀 or an author could fake an independent review by
👍-ing their own PR. Org membership is a best-effort, fail-closed check,
memoized per reactor.

Also fix docs/prompt wording: only top-level reviews carry a
current-head/older-commit marker, and no reactions are fetched for
top-level reviews (only the PR and review comments).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01MfQHiE3VueG1cbMbKnNre9
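The trust filter described in this commit could be sketched as follows. Everything here is illustrative: `make_trust_filter`, the `check_membership` callable, and the reaction dict shape are assumptions, not the merged code. The sketch shows the three stated properties: the author is always dropped, membership lookups are memoized per reactor, and a failed lookup counts as "not a member" (fail-closed).

```python
from typing import Callable


def make_trust_filter(
    author: str,
    reviewer_bots: set[str],
    check_membership: Callable[[str], bool],
) -> Callable[[list[dict[str, str]]], list[dict[str, str]]]:
    """Return a function that keeps only reactions from trusted, non-author users."""
    cache: dict[str, bool] = {}

    def is_member(login: str) -> bool:
        if login not in cache:
            try:
                cache[login] = bool(check_membership(login))
            except Exception:
                cache[login] = False  # fail closed: an API error never grants trust
        return cache[login]

    def keep(reactions: list[dict[str, str]]) -> list[dict[str, str]]:
        return [
            r for r in reactions
            if r["user"] != author and (r["user"] in reviewer_bots or is_member(r["user"]))
        ]

    return keep
```

Checking the author and the bot set before calling `is_member` means the membership API is only hit for logins that could not be decided locally.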

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 27299584ed

Comment on lines +177 to +178
concerns, or a 👍 on the PR or a review comment. If none has, ESCALATE and
tell the author to get a review before re-requesting.

P1: Require current-head evidence for 👍 reactions

For a PR that receives new commits after an earlier teammate/bot 👍, this treats that persistent reaction as proof an independent reviewer passed over the current head, but the fetched reaction data has no commit or timestamp context. Since the workflow reruns on synchronize, a non-trivial update can be auto-approved based on a stale reaction even though no one reviewed the new head; either don't let bare reactions satisfy the current-head independent-review requirement or fetch enough metadata to ignore reactions older than the reviewed head.
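One way to act on the second suggestion is to drop reactions created before the current head landed. The sketch below assumes each reaction dict carries a `created_at` ISO 8601 timestamp (GitHub's GraphQL `Reaction` type exposes `createdAt`) and that a timestamp for the current head is available from somewhere; `fresh_reactions` and both field names are hypothetical, and which commit timestamp best represents "head landed" is left open.

```python
from datetime import datetime


def _parse_ts(ts: str) -> datetime:
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' for UTC."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def fresh_reactions(reactions: list[dict[str, str]], head_landed_at: str) -> list[dict[str, str]]:
    """Keep only reactions created at or after the current head's timestamp."""
    cutoff = _parse_ts(head_landed_at)
    return [r for r in reactions if _parse_ts(r["created_at"]) >= cutoff]
```

A reaction left before the latest push would then no longer count as evidence about the current head.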

Comment thread tools/pr-approval-agent/github.py Outdated
@sakce sakce added the stamphog Request AI approval (no full review) label Jul 1, 2026

@stamphog stamphog Bot left a comment

Two unresolved bot/agent inline comments raise substantive concerns about the new reaction-gating security logic: (1) the GraphQL query uses reactions(first: 20) without pagination, so a trusted blocking 👀 beyond the 20th reaction would be silently ignored; (2) the 👀 "in-flight review" signal carries no commit or timestamp context, meaning a stale reaction from before new commits were pushed can indefinitely block auto-approval. Address or explicitly dismiss both before re-requesting.

@stamphog stamphog Bot removed the stamphog Request AI approval (no full review) label Jul 1, 2026
Comment thread tools/pr-approval-agent/github.py Outdated
query($owner: String!, $name: String!, $pr: Int!, $threadCursor: String) {
  repository(owner: $owner, name: $name) {
    pullRequest(number: $pr) {
      reactions(first: 20) {

Contributor

Medium: Trusted reactions can be crowded out

The query fetches only the first 20 PR reactions and only filters to trusted users afterward. A PR author can add enough untrusted reactions, directly or via other accounts, so a trusted 👀 or negative reaction is not included in pr_reactions and the agent may approve over an in-progress review; the same truncation exists for comment reactions below.

Fetch and paginate reactions until trusted reactions have been collected, or otherwise filter at the API boundary before applying a limit.
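The pagination variant of this suggestion might be sketched like this. `fetch_page` is a hypothetical callable that returns one page of reaction dicts plus the next cursor (or `None` when exhausted); `is_trusted` and the `max_pages` cap are also assumptions. Filtering happens per page, so untrusted reactions cannot push a trusted one out of view.

```python
from typing import Callable, Optional

Reaction = dict[str, str]
Page = tuple[list[Reaction], Optional[str]]


def collect_trusted_reactions(
    fetch_page: Callable[[Optional[str]], Page],
    is_trusted: Callable[[str], bool],
    max_pages: int = 10,
) -> list[Reaction]:
    """Walk reaction pages, keeping trusted reactions, until exhausted or capped."""
    trusted: list[Reaction] = []
    cursor: Optional[str] = None
    for _ in range(max_pages):
        nodes, cursor = fetch_page(cursor)
        trusted.extend(n for n in nodes if is_trusted(n["user"]))
        if cursor is None:  # no further pages
            break
    return trusted
```

The page cap bounds API usage; hitting it means some reactions went unread, which a caller may want to treat as a reason to escalate rather than approve.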

@webjunkie webjunkie Jul 1, 2026

Contributor Author

Addressed in c94d16d: reactions now fetch up to 100 nodes, and bot reactions are gated on an explicit reviewer-bot allowlist instead of any [bot] login.

@pauldambra pauldambra left a comment

Member

the bots have thoughts but this is safe to try :shipit:

@Piccirello Piccirello left a comment

Member

How do we determine which bots to trust and use as signal? For example, the inkeep bot's reactions are useless in this context.

webjunkie added 2 commits July 1, 2026 21:43
Trusting any [bot] login treats every installed GitHub App as a review
signal, but only a handful use reactions as deliberate verdicts (Codex,
Greptile, hex-security, Veria leave 👍 for "reviewed clean"). Others
react for unrelated reasons — inkeep leaves 👎 as docs feedback — which
would read as a trusted negative signal or an 👀 blocker. Gate bot
reactions on an explicit allowlist instead; org members stay trusted.

Also bump reaction fetches from 20 to 100 nodes so a trusted reaction
can't be crowded out of the first page by untrusted ones before
filtering happens client-side.
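The difference between the two bot-trust policies can be shown in a few lines. The logins below are placeholders inferred from the bot handles visible in this conversation; the exact strings the API returns (with or without a `[bot]` suffix) and the real allowlist contents are assumptions.

```python
# Hypothetical allowlist: bots that use reactions as deliberate review verdicts.
REVIEWER_BOTS = frozenset({
    "chatgpt-codex-connector[bot]",
    "greptile-apps[bot]",
    "veria-ai[bot]",
})


def trusted_by_suffix(login: str) -> bool:
    """Old policy: any GitHub App login counts as a review signal."""
    return login.endswith("[bot]")


def trusted_by_allowlist(login: str) -> bool:
    """New policy: only explicitly listed reviewer bots count."""
    return login in REVIEWER_BOTS


for login in ("greptile-apps[bot]", "inkeep[bot]"):
    print(login, trusted_by_suffix(login), trusted_by_allowlist(login))
```

Under the suffix rule a docs-feedback 👎 from an unrelated app would reach the reviewer as a trusted negative signal; under the allowlist it is dropped.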
@webjunkie webjunkie enabled auto-merge (squash) July 1, 2026 19:47

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5765fb4ceb

Comment on lines +176 to +178
over the current head: an APPROVED or COMMENTED review with no unresolved
concerns, or a 👍 on the PR or a review comment. If none has, ESCALATE and
tell the author to get a review before re-requesting.

P1: Exclude author reviews from independent-review evidence

When an org-member PR author submits a COMMENTED review on their own PR, this instruction can let that self-review satisfy the new independent-review requirement: _normalize_reviews_for_prompt includes trusted reviews without excluding pr.author, while only reactions are explicitly filtered to drop the author. That bypasses the added non-trivial-change safeguard and can allow Stamphog to approve with no separate teammate or agent review; filter author reviews or explicitly tell the reviewer not to count them.
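The first option, filtering author reviews before they reach the prompt, could be as small as the sketch below. `independent_reviews` and the review dict shape (`author`, `state`) are hypothetical; the real `_normalize_reviews_for_prompt` may be structured differently.

```python
def independent_reviews(reviews: list[dict[str, str]], pr_author: str) -> list[dict[str, str]]:
    """Drop reviews submitted by the PR author (login comparison is case-insensitive)."""
    author = pr_author.casefold()
    return [r for r in reviews if r["author"].casefold() != author]
```

Applying the same author exclusion to reviews as to reactions keeps the independent-review requirement consistent across both kinds of evidence.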

@webjunkie webjunkie merged commit b3332e0 into master Jul 1, 2026
187 checks passed
@webjunkie webjunkie deleted the claude/stamphog-pr-approval-review-bzotzu branch July 1, 2026 19:57
@deployment-status-posthog

deployment-status-posthog Bot commented Jul 1, 2026

Deploy status

| Environment | Status | Deployed At | Workflow |
| --- | --- | --- | --- |
| dev | ✅ Deployed | 2026-07-01 20:38 UTC | Run |
| prod-us | ✅ Deployed | 2026-07-01 20:49 UTC | Run |
| prod-eu | ✅ Deployed | 2026-07-01 20:52 UTC | Run |


7 participants