Skip to content

Run fork PR unit tests offline from a pre-warmed Go module cache#5803

Merged
chrisst merged 6 commits into
mainfrom
fork-ci-cache-warming
Jun 24, 2026
Merged

Run fork PR unit tests offline from a pre-warmed Go module cache#5803
chrisst merged 6 commits into
mainfrom
fork-ci-cache-warming

Conversation

@chrisst

@chrisst chrisst commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

Changes

Adds a "Warm Go Cache" workflow and a fork-aware path to the tests job so that PRs from forks can pass the required build/unit-test check without JFrog credentials.

Why

Problem

Fork PRs currently fail the required tests check unconditionally: the build authenticates to the JFrog Go module proxy via GitHub OIDC, and GitHub does not issue OIDC tokens to fork PR workflow runs, so the very first setup step fails (Unable to get ACTIONS_ID_TOKEN_REQUEST_URL) before any code is compiled. This blocks all external contributions and any Databricks employee contributing from a personal fork.

Solution

This ports the cache-warming SOP

  • warm-go-cache.yml (new) — the sole writer of the go-modules-* Actions cache. Runs on the hardened runner group with JFrog OIDC, resolves modules with the exact command CI uses (jf go mod vendor), verifies the cache resolves offline, and saves it under a timestamped key. Triggers: daily cron (beats 7-day cache eviction), pushes to main touching go.mod/go.sum, and manual dispatch.
  • tests job fork routing — fork PRs skip JFrog entirely: restore the warmed cache (restore-only), resolve modules offline through Go's file:// proxy, run make test as usual. The check context stays tests. Same-repo runs are unchanged.

Maintainer flow for fork PRs that change dependencies

If a fork PR modifies go.mod/go.sum, its tests check fails on cache miss. After reviewing the dependency change, dispatch Actions → Warm Go Cache → Run workflow → pr_number = N: the warmer fetches only go.mod/go.sum from the fork (never source code; the PR number and head ref are validated before use), rebuilds the cache, and the contributor re-runs the failed check. The fork-PR auto-comment and CONTRIBUTING.md document this.

Security properties

  • No fork-controlled code ever executes in a privileged context; the warmer overlays dependency manifests only.
  • Only the warmer (main branch or maintainer-dispatched) writes the cache; PR workflows are restore-only, and workflow_dispatch inputs / PR head fields are validated against injection.
  • Fork runs hold no credentials at all; integration tests keep their existing manual trigger flow.

Tests

  • actionlint and the repo's conftest policies pass on all changed workflow/action files.
  • The offline-resolution step in the warmer self-verifies the cache: a broken cache fails the warmer, not contributor PRs.
  • Note for reviewers: the fork path only works after the warmer's first run on main — please dispatch "Warm Go Cache" once right after merge.

This pull request and its description were written by Isaac.

chrisst added 5 commits June 12, 2026 15:07
Fork PRs cannot mint OIDC tokens, so they cannot authenticate to JFrog to
resolve Go modules. This workflow downloads all modules through JFrog on a
trusted runner and saves them as an immutable, timestamped GitHub Actions
cache. It runs daily (GitHub evicts caches after 7 days without access), on
pushes to main that change go.mod/go.sum, and on manual dispatch with an
optional pr_number input that overlays only go.mod/go.sum from a fork PR --
never fork source code.

Co-authored-by: Isaac
The tests job previously failed for fork PRs at the JFrog OIDC setup step,
because GitHub never issues OIDC tokens to fork PR runs. Add an is-fork
input to the setup-go-build-environment action: fork runs skip JFrog
entirely, restore the module cache saved by the Warm Go Cache workflow
(restore-only; the warmer is the sole cache writer), and resolve modules
offline through Go's file:// proxy. Same-repo runs keep the existing JFrog
behavior unchanged, and the required check context stays 'tests'.

The coverage upload is skipped for fork PRs since they cannot access
CODECOV_TOKEN.

Co-authored-by: Isaac
Explain in the PR auto-comment and CONTRIBUTING.md that fork PR unit tests
run offline from a pre-warmed dependency cache, and that PRs changing
go.mod/go.sum need a maintainer to dispatch the Warm Go Cache workflow with
the PR number before the tests check can pass.

Co-authored-by: Isaac
Pass pr_number to the fork-overlay step through an env var instead of
interpolating it into the script body, and require it to be numeric.
Also validate the fork repo name and ref taken from the GitHub API
response, since both are attacker-controlled, and reject refs starting
with "-" so git fetch cannot parse them as options.

Co-authored-by: Isaac
@chrisst chrisst requested review from a team as code owners June 12, 2026 22:29
@chrisst chrisst requested review from hectorcast-db and removed request for a team June 12, 2026 22:29
@chrisst chrisst temporarily deployed to test-trigger-is June 12, 2026 22:29 — with GitHub Actions Inactive
@chrisst chrisst temporarily deployed to test-trigger-is June 12, 2026 22:31 — with GitHub Actions Inactive
Comment thread .github/workflows/push.yml Outdated
shell: bash
run: |
if [ "${{ github.event.pull_request.head.repo.full_name }}" != "" ] && \
[ "${{ github.event.pull_request.head.repo.full_name }}" != "${{ github.repository }}" ]; then

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, I vaguely remember that this was not "good enough" for a 100% foolproof solution. E.g., some entity may not be using a fork and still not have access to secrets/proxy.

But it should cover most of the cases.

@hectorcast-db hectorcast-db Jun 22, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remember: some bots (e.g., Dependabot ) don't use a fork, but uses a scoped token which doesn't have access to secrets.
So such PRs will not work with this approach. Maybe we can do a similar approach that we did for tests. Try to read the token: if it's not there for any reason, default to the cache.

https://github.com/databricks/terraform-provider-databricks/blob/main/.github/workflows/integration-tests.yml#L31-L51

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SG, I'll take a stab at this.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated.

- .github/workflows/warm-go-cache.yml
schedule:
- cron: "0 6 * * *" # Daily; GitHub evicts caches not accessed for 7 days
workflow_dispatch:

@hectorcast-db hectorcast-db Jun 22, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I am being over cautious, but workflow executions are public. Someone could open a benign PR, monitor the workflow execution and, when this workflow is triggered, quickly push a malicious update to the PR. Should we use a commit hash instead of PR number?

I think this may be ok. The reason is that malicious versions should not be in the proxy to begin with, so they would not be able to be downloaded into the cache even with such "switcharoo". We can use the hash if we want, but I am fine with using the PR number too.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm ok with pr for now too

Replace the fork-name check that routed unit tests to the pre-warmed Go
module cache with a capability check on ACTIONS_ID_TOKEN_REQUEST_URL. The
runner sets that variable only when id-token:write is effectively granted,
which GitHub denies to both fork and Dependabot PRs. The previous check
matched forks only, so Dependabot PRs (branched within the repo) were
routed to the JFrog proxy path and failed for lack of OIDC access.

Co-authored-by: Isaac
@chrisst chrisst temporarily deployed to test-trigger-is June 23, 2026 10:09 — with GitHub Actions Inactive
@github-actions

Copy link
Copy Markdown
Contributor

If integration tests don't run automatically, an authorized user can run them manually by following the instructions below:

Trigger:
go/deco-tests-run/terraform

Inputs:

  • PR number: 5803
  • Commit SHA: 1b79587e34f2800abf2d5827b2d6b1cf38096cb0

Checks will be approved automatically on success.

@chrisst chrisst temporarily deployed to test-trigger-is June 23, 2026 10:11 — with GitHub Actions Inactive
@chrisst chrisst merged commit 636a2f3 into main Jun 24, 2026
16 of 17 checks passed
@chrisst chrisst deleted the fork-ci-cache-warming branch June 24, 2026 08:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants