Route integration-tests trigger through emu-access runner #5609
Merged
mihaimitrea-db merged 1 commit on Apr 21, 2026
Co-authored-by: Isaac
If integration tests don't run automatically, an authorized user can run them manually by following the instructions below: Trigger: Inputs:
Checks will be approved automatically on success.
hectorcast-db approved these changes on Apr 21, 2026
tanmay-db pushed a commit that referenced this pull request on Apr 29, 2026
## Why
Between **2026-04-17** and **2026-04-20**, the `databricks-eng` org
tightened its IP allow list. Since then, every `pull_request`-triggered
Integration Tests run on this repo has failed silently:
- The `trigger-tests` job in `.github/workflows/integration-tests.yml`
runs on `databricks-deco-testing-runner-group` (label
`ubuntu-latest-deco`).
- That job calls `actions/create-github-app-token` with
`owner: ${{ secrets.ORG_NAME }}` (= `databricks-eng`), which resolves the
installation via `/repos/databricks-eng/.../installation`.
- The deco runner pool's egress IPs are no longer on the
`databricks-eng` allow list, so that lookup returns **403**.
- The downstream `gh workflow run terraform-isolated-pr.yml -R
databricks-eng/eng-dev-ecosystem` never dispatches.
- Merges still land only because the `merge_group` `auto-approve` job
rubber-stamps the check without running tests.
The `databricks-release-runner-group-emu-access` pool's egress IPs
**are** on the `databricks-eng` allow list, so moving the cross-org
dispatch job to that pool unblocks the lookup.
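The failure chain above can be sketched as a workflow fragment. This is a minimal illustration under stated assumptions, not a copy of the repo's `integration-tests.yml`: the step names, the step `id`, the `EXAMPLE_*` secret names, the action version, the `repositories` value, and `--ref main` are all hypothetical. The `-f` input names come from the private workflow's accepted inputs (`pull_request_number`, `commit_sha`).

```yaml
# Sketch only: pre-change shape of the trigger-tests job as described above.
jobs:
  trigger-tests:
    runs-on:
      group: databricks-deco-testing-runner-group
      labels: ubuntu-latest-deco
    steps:
      - name: Generate GitHub App token        # hypothetical step name
        id: generate-token                     # hypothetical step id
        uses: actions/create-github-app-token@v1
        with:
          app-id: ${{ secrets.EXAMPLE_APP_ID }}                 # hypothetical secret name
          private-key: ${{ secrets.EXAMPLE_APP_PRIVATE_KEY }}   # hypothetical secret name
          # Resolves the app installation in databricks-eng. From the deco
          # pool's egress IPs this lookup returned 403, so the job stopped here.
          owner: ${{ secrets.ORG_NAME }}
          repositories: eng-dev-ecosystem                       # assumed value

      - name: Trigger tests                    # never reached when the token step fails
        env:
          GH_TOKEN: ${{ steps.generate-token.outputs.token }}
        run: |
          gh workflow run terraform-isolated-pr.yml \
            -R databricks-eng/eng-dev-ecosystem \
            --ref main \
            -f pull_request_number=${{ github.event.pull_request.number }} \
            -f commit_sha=${{ github.event.pull_request.head.sha }}
```

Because the dispatch step depends on the token step's output, a 403 during the installation lookup means no `workflow_dispatch` event is ever created on the private repo.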
## What changed
Minimal 2-line runner swap on the `trigger-tests` job only:
| Before | After |
|--------|-------|
| `group: databricks-deco-testing-runner-group` | `group: databricks-release-runner-group-emu-access` |
| `labels: ubuntu-latest-deco` | `labels: linux-ubuntu-latest-emu-access` |
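In workflow terms, the swap would look roughly like the fragment below. The `group` and `labels` values are the ones from the table; the surrounding `jobs:` / `runs-on:` nesting is assumed from standard GitHub Actions syntax for runner groups rather than taken from the diff itself.

```yaml
# Sketch of the two changed lines in context; everything else in the job is unchanged.
jobs:
  trigger-tests:
    runs-on:
      # Before:
      #   group: databricks-deco-testing-runner-group
      #   labels: ubuntu-latest-deco
      group: databricks-release-runner-group-emu-access
      labels: linux-ubuntu-latest-emu-access
```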
### Not changed
- `check-token` job — runs a shell script checking secret presence; no
external calls, stays on deco.
- `auto-approve` job — creates a same-org check via `context.repo`,
unaffected by the cross-org allow list, stays on deco.
- Any other workflow (`tagging.yml` uses the same app-token action but
for same-org release work; unaffected).
- Private-side `eng-dev-ecosystem` workflows — no changes required; the
private workflow already accepts `pull_request_number` + `commit_sha`.
## Why a single runner swap (vs. Go SDK's job split)
The Go SDK fix (databricks/databricks-sdk-go#1638) split `trigger-tests`
into a `create-check` job (stays on deco, creates a same-org check run)
+ a `trigger-tests` job (moves to emu-access, does the cross-org
dispatch). That was needed **only** because the Go SDK workflow creates
a `check_run` on the public repo and passes `check_run_id` into the
private workflow.
This repo's workflow does **not** create a check run — `trigger-tests`
calls `gh workflow run terraform-isolated-pr.yml` and nothing else. No
`check_run_id` is produced or passed. Same shape as Python SDK and Java
SDK, so the minimal single-runner swap is the right fix. No job
splitting, no new dependencies, no `check_run_id` plumbing.
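For contrast, the two-job shape described for the Go SDK might look something like the sketch below. This is a hypothetical illustration, not the Go SDK's actual workflow: the use of `actions/github-script`, the check name, the `EXAMPLE_CROSS_ORG_TOKEN` secret, the target repo, and the workflow file name are all assumptions, and the runner group names are assumed to match this repo's. The app-token step is omitted for brevity.

```yaml
# Hypothetical two-job split, shown only to illustrate why check_run_id forces it.
jobs:
  create-check:
    runs-on:
      group: databricks-deco-testing-runner-group   # same-org call, can stay on deco
      labels: ubuntu-latest-deco
    permissions:
      checks: write
    outputs:
      check_run_id: ${{ steps.create.outputs.check_run_id }}
    steps:
      - name: Create same-org check run
        id: create
        uses: actions/github-script@v7
        with:
          script: |
            // Create the check on the public repo and expose its id as a step output.
            const { data } = await github.rest.checks.create({
              owner: context.repo.owner,
              repo: context.repo.repo,
              name: 'Integration Tests',             // assumed check name
              head_sha: context.payload.pull_request.head.sha,
              status: 'in_progress',
            });
            core.setOutput('check_run_id', data.id);

  trigger-tests:
    needs: create-check
    runs-on:
      group: databricks-release-runner-group-emu-access   # cross-org call needs allow-listed IPs
      labels: linux-ubuntu-latest-emu-access
    steps:
      - name: Trigger tests
        env:
          GH_TOKEN: ${{ secrets.EXAMPLE_CROSS_ORG_TOKEN }}   # hypothetical; stands in for the app token
        run: |
          gh workflow run example-isolated-pr.yml \
            -R example-org/example-private-repo \
            -f check_run_id=${{ needs.create-check.outputs.check_run_id }}
```

With no `check_run_id` to hand over, this repo has nothing for a `create-check` job to produce, which is why a single `runs-on` change is sufficient here.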
## Reference PRs (same pattern, already merged)
- Python SDK: databricks/databricks-sdk-py#1396
- Java SDK: databricks/databricks-sdk-java#769
- Go SDK (different shape — split into two jobs — for contrast, not to
mirror): databricks/databricks-sdk-go#1638
## Test plan
The PR's own Integration Tests run is the test. Expected outcome:
- [ ] `trigger-tests` runs on `linux-ubuntu-latest-emu-access` and
`create-github-app-token` succeeds (no 403).
- [ ] A `terraform-isolated-pr` `workflow_dispatch` event appears on
`databricks-eng/eng-dev-ecosystem`.
- [ ] The `Integration Tests` check on this PR transitions to
`success` / `failure` based on the dispatched run.
- [ ] Existing `merge_group` `auto-approve` path still works
unchanged (not touched by this PR).
NO_CHANGELOG=true