Skip to content

[no-ci] CI: Add pr-author-org-check.yml#1878

Open
rwgk wants to merge 13 commits intoNVIDIA:mainfrom
rwgk:pr-author-org-check
Open

[no-ci] CI: Add pr-author-org-check.yml#1878
rwgk wants to merge 13 commits intoNVIDIA:mainfrom
rwgk:pr-author-org-check

Conversation

@rwgk
Copy link
Copy Markdown
Collaborator

@rwgk rwgk commented Apr 7, 2026

Superseds (and reused work from) #1871

Closes #1139

TODO before merging:

  • expand INTERNAL_AUTHOR_ALLOWLIST
  • undo cuda_bindings/pyproject.toml # XXX DUMMY CHANGE XXX
  • change trigger from pull_request (useful only for testing this PR) to pull_request_target

Summary

This PR adds a separate workflow, pr-author-org-check.yml, for pull requests that touch restricted paths under cuda_bindings/ or cuda_python/.

The new pr-author-org-check.yml workflow does not implement strict enforcement. Instead, if a PR touches one of the restricted paths and we do not have a trusted signal for the PR author, the workflow assigns the Check-PR-author-ORG label for manual follow-up. The job summary shows the author login, the author association, the matched restricted paths, the trusted signal that was used if any, and whether the label was needed.

Trusted Signals

The current implementation treats the following as trusted signals:

  • github.event.pull_request.author_association is MEMBER
  • github.event.pull_request.author_association is OWNER
  • the PR author is listed in the checked-in INTERNAL_AUTHOR_ALLOWLIST

(At the moment, INTERNAL_AUTHOR_ALLOWLIST contains only rwgk.)

Notably, COLLABORATOR is not treated as a trusted signal .

Note: This workflow currently trusts the PR author; it does not check the full provenance of every commit on the PR branch. In practice, that means a trusted signal also carries the expectation that the PR author is careful not to bring in external-user commits touching cuda_bindings/ or cuda_python/.

Intended Workflow Beyond This PR

The intended direction is for Check-PR-author-ORG to become a merge-blocking mechanism for inconclusive cases.

The flow would then be:

  • the workflow detects that a PR touches cuda_bindings/ or cuda_python/
  • if no trusted signal is available, the workflow applies Check-PR-author-ORG
  • that label blocks merge, together with a clear message explaining that restricted-path review is required
  • the person merging the PR reviews the restricted-path changes specifically for the policy
  • if the PR is acceptable, they remove Check-PR-author-ORG manually and merge

In other words, the workflow should automatically clear the easy, common use cases, and route the remaining cases into an explicit manual review step instead of trying to over-infer author status from weak signals.

Report restricted-path author gating as its own workflow so contributors see the organization requirement as soon as a PR touches cuda_bindings/ or cuda_python/.

Made-with: Cursor
@rwgk rwgk added this to the cuda.bindings next milestone Apr 7, 2026
@rwgk rwgk self-assigned this Apr 7, 2026
@rwgk rwgk added P1 Medium priority - Should do CI/CD CI/CD infrastructure github_actions Pull requests that update GitHub Actions code labels Apr 7, 2026
@copy-pr-bot
Copy link
Copy Markdown
Contributor

copy-pr-bot bot commented Apr 7, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

rwgk added 3 commits April 7, 2026 15:27
Keep the lightweight PR check runnable on GitHub-hosted runners by avoiding the unsupported gh api --slurp --jq combination when scanning changed files.

Made-with: Cursor
Include the exact restricted-path matches in the job summary so contributors and reviewers can see what triggered the author organization check.

Made-with: Cursor
@rwgk
Copy link
Copy Markdown
Collaborator Author

rwgk commented Apr 7, 2026

I dug into whether we can determine conclusively from GitHub Actions if a PR author is a member of the NVIDIA GitHub organization.

  • github.event.pull_request.author_association is the built-in signal available directly in the PR payload. In this context:
    • MEMBER means the author is a member of the organization that owns the repository.
    • COLLABORATOR is a different bucket: someone with repository-level collaboration access, potentially an outside collaborator.
    • OWNER is the repository/organization owner bucket.
  • GitHub does expose organization-membership APIs that are more conclusive:
    • GET /orgs/NVIDIA/members/{username} returns 204 if the user is an org member and 404 if not, but only when the requester is itself an org member; otherwise it can return 302.
    • GET /orgs/NVIDIA/memberships/{username} returns membership details such as state and role, but also requires the requester to be an organization member.
  • The practical issue is credentials. The default workflow GITHUB_TOKEN is repository-scoped; it is not something I would rely on for conclusive organization-membership reads. Without introducing a separate PAT or GitHub App token with org-readable permissions, author_association is the only built-in signal readily available in Actions.
  • So the tradeoff looks like this:
    • no extra credential: use author_association, where MEMBER is the closest built-in proxy for "NVIDIA org member"
    • conclusive org-membership check: add a dedicated org-aware token and query the organization-membership API from a safe workflow context, most likely pull_request_target, without checking out or executing PR code

So COLLABORATOR is a policy decision, not a technical synonym for MEMBER. If we want to allow repo collaborators, that should be an explicit allowlist choice rather than something implied by GitHub's membership model.

rwgk added 2 commits April 7, 2026 16:40
Query NVIDIA org membership directly for restricted-path PRs so the workflow can distinguish members, non-members, and inconclusive cases, label PRs that need manual review, and show the matched files. Keep the trigger on pull_request temporarily so the new check can be exercised before switching back to pull_request_target.

Made-with: Cursor
@rwgk
Copy link
Copy Markdown
Collaborator Author

rwgk commented Apr 8, 2026

Reverting 42ba3f8 because the direct GET /orgs/NVIDIA/members/{username} experiment did not give us meaningful org-membership information with the default workflow token.

What we learned from the experiments:

  • The restricted-path detection itself works fine. The workflow can reliably detect changes under cuda_bindings/ / cuda_python/, including rename cases, and the summary output is useful.
  • The problem is specifically the org-membership query. Even for @rwgk, who is definitely in the NVIDIA GitHub org, the workflow did not get a conclusive "member" result from GET /orgs/NVIDIA/members/{username} when using the workflow GITHUB_TOKEN.
  • In other words, with the credentials we currently have in Actions, this query is not a trustworthy source of truth for "is the PR author in the NVIDIA org?".
  • I also did a systematic pass over .github/ and ci/ to see if we already had any more privileged PAT / GitHub App credentials we could reuse. We do not: everything there is using the standard github.token / secrets.GITHUB_TOKEN.
  • The label-based "inconclusive" fallback was secondary. It also ran into permission limits in the temporary pull_request test context, but the more important issue is that the membership query itself is not giving actionable answers with the default token.

Given the current goal of finding low-hanging fruit without introducing a dedicated PAT / GitHub App token, I think the direct org-membership query is not worth keeping in this PR.

If we revisit this later, the realistic options seem to be:

  • add a dedicated org-readable credential and do a true org-membership check, or
  • stay with a simpler policy/heuristic that does not depend on conclusive org-membership reads from Actions.

rwgk added 5 commits April 7, 2026 17:07
Treat repository collaborators as trusted for restricted-path changes so the workflow blocks only authors with no collaborator, member, or owner association.

Made-with: Cursor
Use OWNER and MEMBER associations plus public NVIDIA membership as true-positive signals, and add a review label instead of failing when the workflow cannot confirm the author automatically.

Made-with: Cursor
Replace the public-membership probe with a checked-in allowlist so restricted-path PRs can still get a known true positive during rollout while labeling authors that need manual review.

Made-with: Cursor
@rwgk
Copy link
Copy Markdown
Collaborator Author

rwgk commented Apr 8, 2026

We explored a few different ways to get a trustworthy automatic "yes" for PR authors who touch cuda_bindings/ / cuda_python/, and eventually settled on the current checked-in INTERNAL_AUTHOR_ALLOWLIST approach (commit 539461f) as the least bad rollout option for now.

  • github.event.pull_request.author_association was not sufficient by itself. In practice, COLLABORATOR is a repository-level relationship, not a reliable synonym for "external", and it does not cleanly tell us whether the PR author is in the NVIDIA GitHub org.
  • We also revisited the org-membership APIs (GET /orgs/{org}/members/{username} and the related membership endpoints). Those are attractive in theory, but with the credentials available in this workflow they did not give us a dependable organization-membership signal that we were comfortable enforcing on.
  • We checked whether public GitHub profile data could help, for example by looking at public organizations on https://github.com/{username} or via the public-org APIs. That only exposes public memberships, so it is not a reliable source of truth for private org membership.
  • We then tried the narrower endpoint GET /orgs/NVIDIA/public_members/{username} as a possible true-positive-only signal. Technically it behaves well (204 means yes, 404 means not a public member), but for @rwgk it was not a useful positive signal, so it would still leave many legitimate internal contributors in the "manual review needed" bucket.

Given those constraints, the current PR state is a pragmatic compromise:

  • keep MEMBER and OWNER as built-in true-positive signals
  • add a checked-in INTERNAL_AUTHOR_ALLOWLIST for known internal users during rollout (currently just rwgk)
  • if a PR touches cuda_bindings/ / cuda_python/ and we do not have a true-positive signal, assign the Check-PR-author-ORG label

So the reason we ended up here is that the other approaches were either too lossy (author_association), too dependent on private/public visibility (public_members / profile data), or not dependable enough with the credentials available in the workflow.

@rwgk
Copy link
Copy Markdown
Collaborator Author

rwgk commented Apr 8, 2026

Archiving OUTDATED PR description posted originally 2026-04-07 03:04 PM PDT:


Main open policy question

  • Is it acceptable for a COLLABORATOR to modify files under cuda_bindings/ or cuda_python/?
  • The current implementation says yes.
  • If the answer should instead be no, the workflow can be changed back to allow only MEMBER|OWNER.

Current state

  • This PR adds a separate workflow, pr-author-org-check.yml, so the restricted-path author check shows up as its own CI line item.
  • The workflow currently watches changes under cuda_bindings/ and cuda_python/.
  • It uses github.event.pull_request.author_association as the only author signal.
  • The current allowlist is:
    • COLLABORATOR
    • MEMBER
    • OWNER
  • If a PR touches one of the restricted paths and the author association is not in that allowlist, the workflow fails.
  • The job summary shows:
    • the author login
    • the author association
    • whether restricted paths were touched
    • the exact matched file(s) that triggered the check

What was tried and backed out

  • I also explored a more ambitious version that queried GET /orgs/NVIDIA/members/{username} directly.
  • In practice, with the default workflow token, that did not give trustworthy org-membership information even for known NVIDIA org members, so that approach was reverted.
  • I also checked systematically through .github/ and ci/ for an existing stronger PAT / GitHub App credential we could reuse, and did not find one.

What is already working

  • Pass when the PR does not touch cuda_bindings/ or cuda_python/.
  • Fail when a restricted path is touched by a non-allowed association.
  • Pass when a restricted path is touched by an allowed association.
  • The summary output is useful and shows the matched restricted file(s).

Notes before merge

  • The trigger is intentionally still pull_request right now so the workflow can be exercised from this PR branch. Before marking the PR ready for review / merging, change the trigger back to pull_request_target.

  • .github/workflows/pr-author-org-check.yml:76 now allows COLLABORATOR, but the failure text at .github/workflows/pr-author-org-check.yml:92 still says “Only NVIDIA organization members may modify...”. That message is no longer accurate and will misstate the actual policy to contributors and reviewers.

  • .github/workflows/pr-author-org-check.yml:39 matches files via both filename and previous_filename, but the emitted summary list only prints .filename at .github/workflows/pr-author-org-check.yml:45. A rename out of cuda_bindings/ or cuda_python/ will still trigger correctly, but the summary can show only the destination path and make the match look surprising.

rwgk added 2 commits April 8, 2026 13:14
Replace the remaining "true positive" terminology with "trusted signal" so the workflow summary and implementation language match the current manual-review design.

Made-with: Cursor
Show both the previous and current path in the PR author check summary so rename-triggered matches are understandable during manual review.

Made-with: Cursor
@rwgk rwgk marked this pull request as ready for review April 8, 2026 21:40
@rwgk rwgk requested review from leofang and rparolin April 8, 2026 21:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CI/CD CI/CD infrastructure github_actions Pull requests that update GitHub Actions code P1 Medium priority - Should do

Projects

None yet

Development

Successfully merging this pull request may close these issues.

CI: Autogenerate a PR comment for contributors making changes to the cuda_bindings/ directory

2 participants