45 changes: 45 additions & 0 deletions .github/actions/setup-go-build-environment/action.yml
@@ -15,25 +15,70 @@ inputs:
  cache-dependency-path:
    description: "Path to go.sum for caching"
    required: false
  use-cache:
    description: "Whether the run lacks JFrog/OIDC access (fork or Dependabot PR) and must resolve modules from the pre-warmed cache instead"
    required: false
    default: "false"

runs:
  using: composite
  steps:
    - name: Setup JFrog CLI with OIDC
      if: inputs.use-cache != 'true'
      uses: jfrog/setup-jfrog-cli@279b1f629f43dd5bc658d8361ac4802a7ef8d2d5 # v4.9.1
      env:
        JF_URL: https://databricks.jfrog.io
      with:
        oidc-provider-name: github-actions

    - name: Setup Go
      if: inputs.use-cache != 'true'
      uses: actions/setup-go@40f1582b2485089dde7abd97c1529aa768e1baff # v5.6.0
      with:
        go-version: ${{ inputs.go-version }}
        go-version-file: ${{ inputs.go-version-file }}
        cache: ${{ inputs.go-cache }}
        cache-dependency-path: ${{ inputs.cache-dependency-path }}

    # No proxy access: setup-go without its built-in cache; the "Warm Go Cache"
    # workflow is the sole writer of the module cache restored below.
    - name: Setup Go (cached)
      if: inputs.use-cache == 'true'
      uses: actions/setup-go@40f1582b2485089dde7abd97c1529aa768e1baff # v5.6.0
      with:
        go-version: ${{ inputs.go-version }}
        go-version-file: ${{ inputs.go-version-file }}
        cache: false

    - name: Configure Go module proxy via JFrog
      if: inputs.use-cache != 'true'
      shell: bash
      run: jf goc --repo-resolve=db-golang

    # Without OIDC access a run cannot authenticate to JFrog. Restore the module
    # cache pre-warmed by the "Warm Go Cache" workflow and resolve modules from
    # it offline through Go's file:// proxy.
    - name: Compute pre-warmed cache key (cached)
      if: inputs.use-cache == 'true'
      id: modcache
      shell: bash
      run: |
        echo "prefix=go-modules-${{ runner.os }}-${{ hashFiles('go.sum') }}" >> "$GITHUB_OUTPUT"

    - name: Restore pre-warmed Go module cache (cached)
      if: inputs.use-cache == 'true'
      uses: actions/cache/restore@0057852bfaa89a56745cba8c7296529d2fc39830 # v4.3.0
      with:
        path: ~/go/pkg/mod/cache/download
        key: ${{ steps.modcache.outputs.prefix }}
        restore-keys: |
          ${{ steps.modcache.outputs.prefix }}-
          go-modules-${{ runner.os }}-

    - name: Configure Go to resolve modules from the pre-warmed cache (cached)
      if: inputs.use-cache == 'true'
      shell: bash
      run: |
        echo "GOPROXY=file://$(go env GOMODCACHE)/cache/download" >> $GITHUB_ENV
        echo "GONOSUMCHECK=*" >> $GITHUB_ENV
        echo "GONOSUMDB=*" >> $GITHUB_ENV
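
For readers unfamiliar with the `file://` proxy used above: Go's module download cache (`$GOMODCACHE/cache/download`) has the same directory layout as a GOPROXY server. Each version is stored under `<module>/@v/<version>.info`, `.mod`, and `.zip`, and every uppercase letter in the path is escaped as `!` followed by its lowercase form. The sketch below is illustrative only; the helper names `escape_module_path` and `cached_zip_path` are hypothetical and not part of this PR. It computes where a module's zip would have to exist for an offline run to resolve it.

```shell
#!/bin/sh
# Hypothetical helpers (not part of this PR) that mirror the GOPROXY
# on-disk layout used by Go's module download cache.

# Escape a module path or version the way the module cache does:
# each uppercase letter becomes "!" followed by its lowercase form.
escape_module_path() {
  printf '%s\n' "$1" | sed 's/[[:upper:]]/!&/g' | tr '[:upper:]' '[:lower:]'
}

# Print the expected location of a module zip.
#   $1 = download cache root, e.g. "$(go env GOMODCACHE)/cache/download"
#   $2 = module path
#   $3 = module version
cached_zip_path() {
  printf '%s/%s/@v/%s.zip\n' "$1" "$(escape_module_path "$2")" "$(escape_module_path "$3")"
}

# Example: where would github.com/BurntSushi/toml v1.3.2 live?
cached_zip_path "/tmp/dl" "github.com/BurntSushi/toml" "v1.3.2"
```

Testing whether that file exists for a newly added dependency is one way to see why a fork run with a changed `go.sum` might fail until the cache is re-warmed.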
14 changes: 13 additions & 1 deletion .github/workflows/external-message.yml
@@ -46,7 +46,19 @@ jobs:
        run: |
          gh pr comment ${{ github.event.pull_request.number }} --body \
          "<!-- INTEGRATION_TESTS_MANUAL -->
          If integration tests don't run automatically, an authorized user can run them manually by following the instructions below:
          ### Unit tests

          If this PR is from a fork, the \`tests\` check runs offline against a pre-warmed Go module cache because fork PRs cannot authenticate to the internal Go module proxy.

          If this PR changes \`go.mod\` or \`go.sum\`, the \`tests\` check will fail until a maintainer warms the cache for it:

          Actions -> Warm Go Cache -> Run workflow -> pr_number = ${{github.event.pull_request.number}}

          Re-run the failed check once the cache warming completes.

          ### Integration tests

          Integration tests don't run automatically for external contributors; an authorized user can run them manually by following the instructions below:

          Trigger:
          [go/deco-tests-run/terraform](https://go/deco-tests-run/terraform)
31 changes: 30 additions & 1 deletion .github/workflows/push.yml
@@ -13,22 +13,49 @@ jobs:
      labels: linux-ubuntu-latest

    permissions:
      # Required for JFrog OIDC auth on same-repo runs. GitHub never issues
      # OIDC tokens to fork PR runs, so the permission is inert there; GitHub
      # offers no way to condition job-level permissions on the event.
      id-token: write
      contents: read

    steps:
      - name: Checkout
        uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4.3.1

      # The runner injects ACTIONS_ID_TOKEN_REQUEST_URL only when id-token:write
      # is effectively granted, which GitHub denies to fork and Dependabot PRs.
      # Without it a run can't authenticate to JFrog, so it resolves modules
      # offline from the cache pre-warmed by the "Warm Go Cache" workflow.
      - name: Detect Go module proxy access
        id: proxy-check
        shell: bash
        run: |
          if [ -n "$ACTIONS_ID_TOKEN_REQUEST_URL" ]; then
            echo "use_cache=false" >> "$GITHUB_OUTPUT"
          else
            echo "use_cache=true" >> "$GITHUB_OUTPUT"
          fi

      - name: Setup Go build environment
        uses: ./.github/actions/setup-go-build-environment
        with:
          go-version-file: go.mod
          use-cache: ${{ steps.proxy-check.outputs.use_cache }}

      - name: Pull external libraries
        shell: bash
        env:
          USE_CACHE: ${{ steps.proxy-check.outputs.use_cache }}
        run: |
          jf go mod vendor
          if [ "$USE_CACHE" = "true" ]; then
            # Resolve modules offline from the pre-warmed cache. If this fails
            # after a go.mod/go.sum change, a maintainer must run the
            # "Warm Go Cache" workflow with this PR's number first.
            go mod vendor
          else
            jf go mod vendor
          fi

# Point native go commands at the local module cache instead of
# the network. go run pkg@version needs module lookups even when
@@ -48,6 +75,8 @@ jobs:
          make test

      - name: Publish test coverage
        # No CODECOV_TOKEN on fork/Dependabot runs; skip the upload.
        if: steps.proxy-check.outputs.use_cache != 'true'
        uses: codecov/codecov-action@b9fd7d16f6d7d1b5d2bec1a2887e65ceed900238 # v4.6.0
        env:
          CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}
113 changes: 113 additions & 0 deletions .github/workflows/warm-go-cache.yml
@@ -0,0 +1,113 @@
name: Warm Go Cache

# Pre-warms the Go module cache using JFrog as the module proxy.
#
# Fork PRs cannot mint OIDC tokens, so they cannot authenticate to JFrog.
# Instead, the tests workflow restores the cache saved here and resolves
# modules offline through Go's file:// proxy, which reads the same layout
# that 'go mod vendor' writes to the module cache.
#
# This workflow is the sole writer of the go-modules-* cache. PR workflows
# only ever restore it (actions/cache/restore).
#
# Fork PR with a go.mod/go.sum change:
# The PR's tests job fails because the new module is not in the cache.
# After reviewing the dependency change, run this workflow manually with
# pr_number set to the PR's number. It fetches only go.mod and go.sum from
# the fork (never source code), rebuilds the cache, and the contributor can
# then re-run the failed check.

on:
  push:
    branches: [main]
    paths:
      - go.mod
      - go.sum
      - .github/workflows/warm-go-cache.yml
  schedule:
    - cron: "0 6 * * *" # Daily; GitHub evicts caches not accessed for 7 days
  workflow_dispatch:

@hectorcast-db hectorcast-db Jun 22, 2026

Maybe I am being overcautious, but workflow executions are public. Someone could open a benign PR, monitor the workflow executions and, when this workflow is triggered, quickly push a malicious update to the PR. Should we use a commit hash instead of the PR number?

I think this may be OK. The reason is that malicious versions should not be in the proxy to begin with, so they could not be downloaded into the cache even with such a switcheroo. We can use the hash if we want, but I am fine with using the PR number too.

Contributor Author


I'm OK with using the PR number for now too.
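
If the concern raised in this thread were to be addressed later, one option would be to pin the fetch to the commit a maintainer actually reviewed. The sketch below is only an illustration under stated assumptions: `EXPECTED_SHA` stands for a hypothetical extra workflow input that does not exist in this PR, and the helper names are invented. It validates a full 40-character SHA and compares it with the PR head SHA before anything is fetched.

```shell
#!/bin/sh
# Hypothetical helpers (not part of this PR) for pinning the warm-up run
# to a reviewed commit instead of a moving branch ref.

# Succeeds only for a full, lowercase, 40-character hex commit SHA.
is_full_sha() {
  [ "${#1}" -eq 40 ] || return 1
  case "$1" in
    *[!0-9a-f]*) return 1 ;;
  esac
  return 0
}

# Succeeds only if both values are full SHAs and they are identical.
#   $1 = head SHA reported by the pulls API (".head.sha")
#   $2 = SHA the maintainer reviewed (the assumed EXPECTED_SHA input)
head_matches_expected() {
  is_full_sha "$1" && is_full_sha "$2" && [ "$1" = "$2" ]
}

# Assumed usage inside the existing fetch step:
#   head_sha=$(echo "$pr_data" | jq -er '.head.sha')
#   head_matches_expected "$head_sha" "$EXPECTED_SHA" \
#     || { echo "PR head moved since review"; exit 1; }
#   git fetch --depth=1 fork "$head_sha"
```

Fetching by SHA rather than by branch name would close the window between review and fetch, at the cost of the maintainer having to copy a hash instead of a PR number.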

    inputs:
      pr_number:
        description: "Fork PR number to fetch go.mod/go.sum from. Leave empty to warm from main."
        required: false

permissions:
  id-token: write
  contents: read
  pull-requests: read

jobs:
  warm-cache:
    runs-on:
      group: databricks-protected-runner-group
      labels: linux-ubuntu-latest

    steps:
      - name: Checkout
        uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4.3.1

      # Overlay only the dependency manifests from the fork PR - never its
      # source code - so nothing from the fork is executed in this privileged
      # workflow.
      - name: Fetch go.mod and go.sum from fork PR
        if: inputs.pr_number != ''
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          PR_NUMBER: ${{ inputs.pr_number }}
        run: |
          [[ "$PR_NUMBER" =~ ^[0-9]+$ ]] || { echo "pr_number must be a positive integer"; exit 1; }
          pr_data=$(gh api "repos/${{ github.repository }}/pulls/${PR_NUMBER}")
          fork_repo=$(echo "$pr_data" | jq -er '.head.repo.full_name | select(type == "string" and length > 0)')
          fork_ref=$(echo "$pr_data" | jq -er '.head.ref | select(type == "string" and length > 0)')
          # Both values come from the PR head and are attacker-controlled:
          # restrict the repo to owner/name characters and reject refs that
          # git itself considers invalid or that start with "-" (git fetch
          # would parse those as options even when quoted).
          [[ "$fork_repo" =~ ^[A-Za-z0-9][A-Za-z0-9_.-]*/[A-Za-z0-9_.-]+$ ]] || { echo "unexpected fork repository name: $fork_repo"; exit 1; }
          if [[ "$fork_ref" == -* ]] || ! git check-ref-format "refs/heads/$fork_ref"; then
            echo "unexpected fork ref: $fork_ref"; exit 1
          fi
          echo "Warming cache for PR #${PR_NUMBER} from ${fork_repo}@${fork_ref}"
          git remote add fork "https://github.com/${fork_repo}.git"
          git fetch --depth=1 fork "refs/heads/${fork_ref}"
          git checkout FETCH_HEAD -- go.mod go.sum

      - name: Setup Go build environment
        uses: ./.github/actions/setup-go-build-environment
        with:
          go-version-file: go.mod

          # This workflow is the sole writer of the warmed module cache;
          # don't let setup-go save its own.
          go-cache: "false"

      # Run the same command as the tests workflow's "Pull external libraries"
      # step so the module cache ends up with the exact layout fork PR runs
      # resolve from.
      - name: Resolve module dependencies via JFrog
        shell: bash
        run: jf go mod vendor

      - name: Verify offline resolution from the warmed cache
        shell: bash
        run: |
          export GOPROXY=file://$(go env GOMODCACHE)/cache/download
          export GONOSUMCHECK=*
          export GONOSUMDB=*
          go mod vendor

      - name: Generate cache key
        id: cache-key
        shell: bash
        run: |
          # GitHub caches are immutable, so a timestamp suffix makes each run
          # save a fresh entry. Restores prefix-match the latest entry via
          # restore-keys.
          echo "key=go-modules-${{ runner.os }}-${{ hashFiles('go.sum') }}-$(date -u +%Y%m%d%H%M%S)" >> "$GITHUB_OUTPUT"

      - name: Save Go module cache
        uses: actions/cache/save@0057852bfaa89a56745cba8c7296529d2fc39830 # v4.3.0
        with:
          path: ~/go/pkg/mod/cache/download
          key: ${{ steps.cache-key.outputs.key }}
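
The relationship between the timestamped save key above and the prefixes used on the restore side can be summarized with a small sketch. The helper names `make_save_key` and `has_prefix`, and the sample hash `abc123`, are hypothetical and only illustrate the naming scheme; they are not part of this PR. Each warm run saves under a unique key, while restores ask for prefixes. According to the `actions/cache` documentation, when several entries match a restore-key prefix, the most recently created one is returned.

```shell
#!/bin/sh
# Hypothetical helpers (not part of this PR) illustrating the key scheme.

# Build a save key: go-modules-<os>-<go.sum hash>-<UTC timestamp>.
make_save_key() {
  printf 'go-modules-%s-%s-%s\n' "$1" "$2" "$3"
}

# Succeeds if key $1 starts with prefix $2 (how restore-keys are matched).
has_prefix() {
  case "$1" in
    "$2"*) return 0 ;;
    *) return 1 ;;
  esac
}

saved=$(make_save_key "Linux" "abc123" "20260622060000")

# Preferred prefix: same OS and same go.sum hash.
has_prefix "$saved" "go-modules-Linux-abc123-" && echo "matches go.sum-specific prefix"
# Fallback prefix: same OS, any go.sum hash.
has_prefix "$saved" "go-modules-Linux-" && echo "matches OS-wide fallback prefix"
```

The OS-wide fallback means a fork PR whose `go.sum` differs from any warmed entry still restores the latest cache, and only the modules missing from it fail to resolve.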
6 changes: 6 additions & 0 deletions CONTRIBUTING.md
@@ -499,6 +499,12 @@ func TestAccSecretAclResource(t *testing.T) {
}
```

## Unit Testing on Forked Pull Requests

Unit tests run in CI on every PR. PRs opened from forks cannot authenticate to the internal Go module proxy, so CI resolves their Go modules offline from a dependency cache that the "Warm Go Cache" workflow pre-warms daily from the `main` branch.

If your PR changes `go.mod` or `go.sum`, the `tests` check will fail until a maintainer re-warms the cache for your PR (Actions -> Warm Go Cache -> Run workflow -> pr_number). Once the warming run completes, re-run the failed check.

## Integration Testing

Integration tests are run as part of every PR made to the Databricks Terraform provider. Tests are run against AWS, Azure, and GCP infrastructure, in workspaces and accounts, and in Unity Catalog and non-Unity Catalog environments.
1 change: 1 addition & 0 deletions NEXT_CHANGELOG.md
@@ -17,3 +17,4 @@
### Exporter

### Internal Changes
* Run unit tests offline from a pre-warmed Go module cache for PRs that cannot authenticate to the internal Go module proxy (fork and Dependabot PRs), populated by the new "Warm Go Cache" workflow.