Skip to content

Reduce PR Review API calls with GraphQL#654

Open
glenn-jocher wants to merge 17 commits into
mainfrom
pr-review-graphql
Open

Reduce PR Review API calls with GraphQL#654
glenn-jocher wants to merge 17 commits into
mainfrom
pr-review-graphql

Conversation

@glenn-jocher
Copy link
Copy Markdown
Member

@glenn-jocher glenn-jocher commented Nov 16, 2025

🛠️ PR Summary

Made with ❤️ by Ultralytics Actions

🌟 Summary

Optimize GitHub PR workflows by batching multiple REST calls into fewer GraphQL requests for PR summaries, reviews, and issue labeling 🚀

📊 Key Changes

  • 🔁 Reuse existing pull_request payload data instead of performing an extra REST call when handling new PR events.
  • 🧵 Introduce get_pr_details_and_labels to fetch PR body and repository labels in a single GraphQL query for PR-open handling.
  • 🧹 Refactor dismiss_previous_reviews to use get_pr_reviews_and_comments (GraphQL) and batch_delete_review_comments to fetch and delete bot reviews/comments more efficiently.
  • 🏷️ Add GRAPHQL_LABEL_AND_COMMENT_ISSUE and related helpers (get_multiple_issue_node_ids, get_label_ids usage) to batch-label and comment on closing issues via GraphQL instead of multiple REST calls.
  • 📝 Enhance update_pr_description to accept a cached PR body, reducing redundant fetches while still safely handling missing descriptions.
  • 🔗 Export new GraphQL constants from actions.utils and wire them into summarize_pr.py and other call sites.

🎯 Purpose & Impact

  • ⚡ Reduces the number of GitHub REST API calls, lowering latency and the risk of hitting rate limits for busy repositories.
  • 🧠 Centralizes PR, review, comment, and label handling through GraphQL, simplifying maintenance and future extensions.
  • 🔒 Adds defense-in-depth checks on review/comment authors, improving safety when dismissing or deleting bot-generated content.
  • ✅ Keeps existing behavior (summaries, labels, issue auto-labeling) while making the implementation more efficient and scalable for Ultralytics workflows.

@UltralyticsAssistant UltralyticsAssistant added devops GitHub Devops or MLops enhancement New feature or request labels Nov 16, 2025
@UltralyticsAssistant
Copy link
Copy Markdown
Member

👋 Hello @glenn-jocher, thank you for submitting a ultralytics/actions 🚀 PR! This is an automated review helper, and an Ultralytics engineer will be with you shortly to assist.

  • Define a Purpose: Clearly explain the purpose of your fix or feature in your PR description, and link to any relevant issues. Ensure your commit messages are clear, concise, and adhere to the project's conventions.
  • Synchronize with Source: Confirm your PR is synchronized with the ultralytics/actions main branch. If it's behind, update it by clicking the 'Update branch' button or by running git pull and git merge main locally.
  • Ensure CI Checks Pass: Verify all Ultralytics Continuous Integration (CI) checks are passing. If any checks fail, please address the issues.
  • Update Documentation: Update the relevant documentation for any new or modified features.
  • Add Tests: If applicable, include or update tests to cover your changes, and confirm that all tests are passing.
  • Sign the CLA: Please ensure you have signed our Contributor License Agreement if this is your first Ultralytics PR by writing "I have read the CLA Document and I sign the CLA" in a new message.
  • Minimize Changes: Limit your changes to the minimum necessary for your bug fix or feature addition. "It is not daily increase but daily decrease, hack away the unessential. The closer to the source, the less wastage there is." — Bruce Lee

For more guidance, please refer to our Contributing Guide. Don't hesitate to leave a comment if you have any questions. Thank you for contributing to Ultralytics! 🚀

Copy link
Copy Markdown
Member

@UltralyticsAssistant UltralyticsAssistant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔍 PR Review

Made with ❤️ by Ultralytics Actions

Overall this PR is a solid refactor that reduces REST calls by leveraging GraphQL for PR details, reviews, comments, and issue labeling. The main opportunities for improvement are around robustness and clarity: avoid potential GraphQL limits in batch_delete_review_comments, consider how the 100-label GraphQL limit might affect large repositories, reduce duplicated get_username usage in the review dismissal path, tighten the GraphQL query construction for multiple issues, and clean up a minor unused loop variable in label_fixed_issues.

💬 Posted 5 inline comments

@glenn-jocher glenn-jocher linked an issue Nov 16, 2025 that may be closed by this pull request
Copy link
Copy Markdown
Member

@UltralyticsAssistant UltralyticsAssistant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔍 PR Review

Made with ❤️ by Ultralytics Actions

Overall the PR is well-aligned with the goal of reducing REST calls by introducing targeted GraphQL helpers and reusing existing payload data. The main functional concern is that label fetching at the end of first_interaction.main now runs for all event types, including PRs, which both contradicts the comment and reintroduces an extra labels REST call on PR-open events. The rest of the changes are primarily structural and performance-oriented; I’ve noted a few minor clarity and robustness improvements around duplicate username lookups, silent failures in batch deletion, and the update_pr_description signature change potentially impacting external call sites.

💬 Posted 5 inline comments

@codecov
Copy link
Copy Markdown

codecov Bot commented Nov 16, 2025

Codecov Report

❌ Patch coverage is 13.95349% with 74 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
actions/utils/github_utils.py 15.68% 43 Missing ⚠️
actions/first_interaction.py 6.66% 14 Missing ⚠️
actions/summarize_pr.py 7.69% 12 Missing ⚠️
actions/review_pr.py 0.00% 5 Missing ⚠️

📢 Thoughts on this report? Let us know!

Copy link
Copy Markdown
Member

@UltralyticsAssistant UltralyticsAssistant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔍 PR Review 3

Made with ❤️ by Ultralytics Actions

Overall the refactor to use GraphQL for PR details, reviews, and issue labeling looks solid and should reduce REST chatter while keeping behavior consistent. The main improvement to address is ensuring update_pr_description still refreshes the PR body when the cached current_body is effectively missing (empty string), so you don’t accidentally drop existing descriptions. The remaining comments are minor optimizations around duplicate lookups, batching behavior, and avoiding unnecessary label fetches.

💬 Posted 5 inline comments

Copy link
Copy Markdown
Member

@UltralyticsAssistant UltralyticsAssistant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔍 PR Review

Made with ❤️ by Ultralytics Actions

Overall the refactor to use GraphQL for PR details, reviews, and issue labeling looks solid and aligns with the goal of reducing REST calls. The main functional issue is that get_pr_reviews_and_comments reads from a comments field that doesn’t exist in the GraphQL response, so inline review comments are never deleted. Additionally, batch_delete_review_comments only handles the first 50 comments, which may leave older bot comments behind on large PRs. There’s also a minor efficiency concern around redundant get_username calls between dismiss_previous_reviews and get_pr_reviews_and_comments. Addressing these points will restore expected behavior while keeping the new performance benefits.

💬 Posted 3 inline comments

Copy link
Copy Markdown
Member

@UltralyticsAssistant UltralyticsAssistant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔍 PR Review 5

Made with ❤️ by Ultralytics Actions

Overall the refactor meaningfully reduces REST calls by introducing GraphQL helpers for PR details, reviews/comments, and issue labeling, and the call-site changes in first_interaction.py and summarize_pr.py look consistent with the new APIs. The main areas to improve are: (1) hardening update_pr_description against failed/non-JSON GET responses, (2) making batch_delete_review_comments handle more than 50 comments or at least log when truncation occurs, (3) removing the now-unused GITHUB_API_URL import from summarize_pr.py, and (4) simplifying dismiss_previous_reviews to avoid redundant username lookups and filtering logic. The new GraphQL helpers themselves are generally well-structured and parameterized safely.

💬 Posted 4 inline comments

@glenn-jocher glenn-jocher requested review from UltralyticsAssistant and removed request for UltralyticsAssistant November 16, 2025 13:07
Copy link
Copy Markdown
Member

@UltralyticsAssistant UltralyticsAssistant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔍 PR Review 6

Made with ❤️ by Ultralytics Actions

Overall the refactor to use GraphQL for PR metadata, reviews, and issue labeling looks solid and should reduce API calls. The main areas to tighten are around robustness and behavior parity with the old REST flow: ensure failures in the new GraphQL-based dismissal path are surfaced (or have a fallback), and restore tolerant handling of 404s when deleting old review comments. A bit more defensive error handling in update_pr_description and clear bounds for the batched issue ID lookup will make the new code more resilient.

💬 Posted 5 inline comments

Copy link
Copy Markdown
Member

@UltralyticsAssistant UltralyticsAssistant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔍 PR Review 6

Made with ❤️ by Ultralytics Actions

Overall, the GraphQL batching changes are well-aligned with the goal of reducing REST calls and look consistent across PR review, summary, and issue labeling flows. The main functional concern is the robustness of the GRAPHQL_PR_REVIEWS_COMMENTS query: if the author filter is not supported as written, dismissing previous reviews will quietly stop working. Additionally, the signature change to update_pr_description may break external callers, and there is a small opportunity to avoid duplicate username resolution in the review dismissal path and to centralize label-fetching logic between PR and non-PR flows.

💬 Posted 5 inline comments

Copy link
Copy Markdown
Member

@UltralyticsAssistant UltralyticsAssistant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔍 PR Review 8

Made with ❤️ by Ultralytics Actions

Overall the refactor to use GraphQL for PR details, reviews, and issue labeling looks solid and aligns with the goal of reducing API calls. The main functional concern is the update_pr_description signature change, which could break any positional third-argument callers that previously relied on max_retries. There are also a couple of minor inefficiencies where get_username() is called twice for the same operation, and some opportunities to clarify retry behavior and bounds in the new helper methods.

💬 Posted 5 inline comments


def update_pr_description(self, number: int, new_summary: str, max_retries: int = 2):
"""Updates PR description with summary, retrying if description is None."""
def update_pr_description(self, number: int, new_summary: str, current_body: str | None = None):
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 MEDIUM: update_pr_description changed signature from (number, new_summary, max_retries=2) to (number, new_summary, current_body=None) and hard-codes 3 attempts. Any existing callers that passed a positional third argument expecting max_retries will now treat that value as the PR body, which could cause incorrect description updates. Also, the maximum retries is now implicit and no longer configurable. It would be safer either to preserve the old parameter (e.g. add max_retries as a keyword-only argument) or to ensure all internal callers use keyword arguments so this change cannot break external usages.

Comment thread actions/review_pr.py
def dismiss_previous_reviews(event: Action) -> int:
"""Dismiss previous bot reviews and delete inline comments, returns count for numbering."""
if not (pr_number := event.pr.get("number")) or not (bot_username := event.get_username()):
if not (pr_number := event.pr.get("number")):
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📝 LOW: dismiss_previous_reviews now calls event.get_username() directly and then indirectly again via event.get_pr_reviews_and_comments(). This double lookup is minor but unnecessary overhead on every run, and also duplicates the None handling logic in two places. Consider passing the already-resolved bot_username into get_pr_reviews_and_comments (or having it accept an optional precomputed login) to avoid redundant API calls and keep the responsibility for username resolution in one place.

if i < max_retries:
print("No current PR description found, retrying...")
time.sleep(1)
if current_body is None:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📝 LOW: Inside update_pr_description, the retry loop always assigns current_body = self.get(url).json().get("body") or "", so by the time you reach the summary replacement/appending logic current_body is guaranteed to be a string. You can simplify and clarify the control flow by removing the for i in range(3) magic number and extracting the "fetch-with-retries" into a small helper or a clearly named constant, which makes the behavior easier to reason about and maintain.

result = self.graphql_request(GRAPHQL_PR_DETAILS, variables=variables)
return result.get("data", {}).get("repository", {})

def get_pr_reviews_and_comments(self, pr_number: int) -> tuple[list, list]:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📝 LOW: get_pr_reviews_and_comments calls self.get_username() internally, but dismiss_previous_reviews has already resolved and validated bot_username before calling this helper. This leads to redundant API calls and slightly divergent behavior between the helper and caller. Consider accepting bot_login as an optional argument and using the passed-in value when available to avoid duplicate lookups and keep the logic for handling a missing bot username centralized.

for comment_id in comment_ids:
self.delete(f"{GITHUB_API_URL}/repos/{self.repository}/pulls/comments/{comment_id}")

def get_multiple_issue_node_ids(self, issue_numbers: list[int]) -> dict[int, str]:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📝 LOW: get_multiple_issue_node_ids dynamically builds a GraphQL query with one variable per issue. This is safe given the use of variables but can grow the query size linearly with issue_numbers. Since closingIssuesReferences is already capped upstream, this is probably fine, but it may be worth documenting or enforcing a reasonable upper bound on issue_numbers here to avoid accidentally constructing very large queries if the method is reused elsewhere.

glenn-jocher and others added 2 commits November 21, 2025 11:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

devops GitHub Devops or MLops enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Excessive REST API calls in PR Review

2 participants