Commit 2753ad3

[SEA-NodeJS] Pin the kernel by SHA (KERNEL_REV) + kernel-e2e CI
The SEA napi binding is built from the kernel's private Rust source, not a published/versioned artifact, and the actual `.node` binary is gitignored, so nothing in the repo records *which* kernel revision the committed `native/sea/index.d.ts` / `index.js` correspond to, and the standard e2e job never builds or exercises the binding (its SEA suite skips).

Mirror the databricks-sql-python connector's mechanism:

- `KERNEL_REV`: a single 40-char kernel commit SHA at the repo root. This is the one source of truth for the kernel version the driver is built against. Bumping it is the only way to pick up a new kernel, so a driver change and its kernel dependency always land together in one bisectable diff. Pinned to the current kernel main (b4d8822), verified to produce a binding byte-identical to the one shipped in the SEA feature PRs.
- `.github/workflows/kernel-e2e.yml`: reads `KERNEL_REV`, checks the kernel out at that SHA via a GitHub App token, builds the napi binding (`npm run build:native` against the pinned checkout, with cargo routed through the JFrog proxy), and runs the SEA e2e suite (`tests/e2e/sea/**`) against the dogfood warehouse. Gate semantics match the python workflow: synthetic-success on plain PRs, real run in the merge queue (or via the `kernel-e2e` label), change-detection to auto-pass when no SEA-relevant files moved.
- `native/sea/README.md`: documents the pin and how to match it locally.

Requires one-time repo-admin setup (GitHub App allowlist for the kernel repo, the `kernel-e2e` label, warehouse secrets in the azure-prod environment); see the workflow header.

Does not change the published-binary story (the `@databricks/sql-kernel-*` optional packages remain a separate, later step).

Co-authored-by: Isaac
Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com>
1 parent de04e19 commit 2753ad3

3 files changed

Lines changed: 381 additions & 0 deletions

.github/workflows/kernel-e2e.yml

Lines changed: 359 additions & 0 deletions
@@ -0,0 +1,359 @@
+name: Kernel E2E Tests
+
+# Runs the SEA backend e2e suite (tests/e2e/sea/**) against a real
+# Databricks warehouse with a freshly-built napi-rs kernel binding.
+#
+# The kernel is a private repo with no published binary artifact. We pin
+# a kernel SHA in the `KERNEL_REV` file at the repo root, check the kernel
+# out via a GitHub App token, and run `npm run build:native` to compile
+# the napi binding into native/sea/ in the same checkout the tests run
+# against. Bumping `KERNEL_REV` is the ONLY way to pick up a new kernel
+# version; this keeps the driver <-> kernel pair bisectable, mirroring
+# the databricks-sql-python connector's KERNEL_REV mechanism.
+#
+# Why this exists: the committed native/sea/index.d.ts + index.js are the
+# TypeScript declarations and the napi-rs platform router; the actual
+# `.node` binary is gitignored (large, per-platform) and is NOT in the
+# repo. The standard `main.yml` e2e job has no binary, so its SEA suite
+# skips (it gates on DATABRICKS_PECOTESTING_* secrets it doesn't set).
+# This workflow is what actually exercises the SEA path end-to-end against
+# a known kernel revision.
+#
+# Gate semantics mirror the python connector's kernel-e2e.yml:
+#   - Plain PR events post a synthetic-success check so the required
+#     "Kernel E2E" check doesn't block PRs that don't touch the SEA path.
+#     Real tests run in the merge queue.
+#   - `kernel-e2e` label triggers a preview run on the PR; the label is
+#     auto-removed on `synchronize` so unreviewed commits can't reuse it.
+#   - merge_group fires the real gate: runs when SEA-relevant files
+#     changed, auto-passes otherwise.
+#
+# Required external setup (one-time, by a repo admin):
+#   1. `kernel-e2e` label exists in this repo.
+#   2. `INTEGRATION_TEST_APP_ID` / `INTEGRATION_TEST_PRIVATE_KEY` secrets
+#      exist and the GitHub App's repo allowlist includes
+#      `databricks/databricks-sql-kernel`.
+#   3. `KERNEL_REV` at the repo root contains a 40-char kernel commit SHA.
+#   4. `azure-prod` environment exposes DATABRICKS_HOST /
+#      TEST_PECO_WAREHOUSE_HTTP_PATH / DATABRICKS_TOKEN.
+
+on:
+  pull_request:
+    types: [opened, synchronize, reopened, labeled]
+  merge_group:
+
+permissions:
+  contents: read
+  id-token: write
+
+concurrency:
+  group: kernel-e2e-${{ github.workflow }}-${{ github.ref }}
+  cancel-in-progress: ${{ github.event_name == 'pull_request' }}
+
+jobs:
+  # ───────────────────────────────────────────────────────────────
+  # Security: auto-remove `kernel-e2e` label on new commits so a
+  # labelled preview run can't be re-triggered with unreviewed code.
+  # ───────────────────────────────────────────────────────────────
+  strip-label:
+    if: github.event_name == 'pull_request' && github.event.action == 'synchronize'
+    runs-on:
+      group: databricks-protected-runner-group
+      labels: linux-ubuntu-latest
+    permissions:
+      pull-requests: write
+    steps:
+      - name: Remove kernel-e2e label
+        uses: actions/github-script@f28e40c7f34bde8b3046d885e986cb6290c5673b # v7.1.0
+        with:
+          github-token: ${{ github.token }}
+          script: |
+            try {
+              await github.rest.issues.removeLabel({
+                owner: context.repo.owner,
+                repo: context.repo.repo,
+                issue_number: context.payload.pull_request.number,
+                name: 'kernel-e2e',
+              });
+            } catch (error) {
+              if (error.status !== 404) throw error;
+            }
+
+  # ───────────────────────────────────────────────────────────────
+  # Synthetic success on every non-label PR event so the required
+  # "Kernel E2E" check doesn't permablock PRs that don't touch SEA
+  # code. Real run happens in the merge queue (or via explicit label).
+  # ───────────────────────────────────────────────────────────────
+  skip-kernel-e2e-pr:
+    if: github.event_name == 'pull_request' && github.event.action != 'labeled'
+    runs-on:
+      group: databricks-protected-runner-group
+      labels: linux-ubuntu-latest
+    permissions:
+      checks: write
+    steps:
+      - name: Post synthetic-success check
+        uses: actions/github-script@f28e40c7f34bde8b3046d885e986cb6290c5673b # v7.1.0
+        with:
+          github-token: ${{ github.token }}
+          script: |
+            await github.rest.checks.create({
+              owner: context.repo.owner,
+              repo: context.repo.repo,
+              name: 'Kernel E2E',
+              head_sha: context.payload.pull_request.head.sha,
+              status: 'completed',
+              conclusion: 'success',
+              completed_at: new Date().toISOString(),
+              output: {
+                title: 'Skipped on PR — runs in merge queue',
+                summary: 'Kernel E2E is skipped on PRs and runs as a required gate in the merge queue. Add the `kernel-e2e` label to preview on this PR.'
+              }
+            });
+
+  # ───────────────────────────────────────────────────────────────
+  # Detect whether SEA-relevant files changed. Used by both the
+  # labelled-PR path and the merge-queue path to decide between
+  # "really run the suite" and "auto-pass the check".
+  # ───────────────────────────────────────────────────────────────
+  detect-changes:
+    if: |
+      github.event_name == 'merge_group' ||
+      (github.event_name == 'pull_request' &&
+       github.event.action == 'labeled' &&
+       contains(github.event.pull_request.labels.*.name, 'kernel-e2e'))
+    runs-on:
+      group: databricks-protected-runner-group
+      labels: linux-ubuntu-latest
+    outputs:
+      run_tests: ${{ steps.changed.outputs.run_tests }}
+      head_sha: ${{ steps.refs.outputs.head_sha }}
+    steps:
+      - name: Resolve head SHA
+        id: refs
+        env:
+          MERGE_QUEUE_REF: ${{ github.event.merge_group.head_ref }}
+        uses: actions/github-script@f28e40c7f34bde8b3046d885e986cb6290c5673b # v7.1.0
+        with:
+          script: |
+            if (context.eventName === 'pull_request') {
+              core.setOutput('head_sha', context.payload.pull_request.head.sha);
+              return;
+            }
+            core.setOutput('head_sha', context.payload.merge_group.head_sha);
+
+      - name: Check out repo at head SHA
+        uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
+        with:
+          ref: ${{ steps.refs.outputs.head_sha }}
+          fetch-depth: 0
+
+      - name: Detect SEA-relevant changes
+        id: changed
+        env:
+          HEAD_SHA: ${{ steps.refs.outputs.head_sha }}
+          BASE_SHA: ${{ github.event_name == 'merge_group' && github.event.merge_group.base_sha || github.event.pull_request.base.sha }}
+        run: |
+          CHANGED=$(git diff --name-only "$BASE_SHA" "$HEAD_SHA")
+          echo "Changed files:"
+          echo "$CHANGED"
+          # Run when the SEA driver layer, the napi binding contract, SEA
+          # e2e tests, this workflow, the kernel revision pin, or core deps
+          # move.
+          if echo "$CHANGED" | grep -qE "^(lib/sea/|native/sea/|tests/e2e/sea/|tests/unit/sea/|\.github/workflows/kernel-e2e\.yml|KERNEL_REV|package\.json|package-lock\.json)"; then
+            echo "run_tests=true" >> "$GITHUB_OUTPUT"
+          else
+            echo "run_tests=false" >> "$GITHUB_OUTPUT"
+          fi
+
+  # ───────────────────────────────────────────────────────────────
+  # Real test job. Builds the napi binding from the pinned kernel SHA
+  # and runs the SEA e2e suite against the dogfood warehouse.
+  # ───────────────────────────────────────────────────────────────
+  run-kernel-e2e:
+    needs: detect-changes
+    if: needs.detect-changes.outputs.run_tests == 'true'
+    runs-on:
+      group: databricks-protected-runner-group
+      labels: linux-ubuntu-latest
+    environment: azure-prod
+    permissions:
+      contents: read
+      checks: write
+      id-token: write
+    env:
+      # SEA e2e tests gate on the DATABRICKS_PECOTESTING_* vars; map the
+      # warehouse secrets onto them so the suite actually runs (it skips
+      # when they are absent).
+      DATABRICKS_PECOTESTING_SERVER_HOSTNAME: ${{ secrets.DATABRICKS_HOST }}
+      DATABRICKS_PECOTESTING_HTTP_PATH: ${{ secrets.TEST_PECO_WAREHOUSE_HTTP_PATH }}
+      DATABRICKS_PECOTESTING_TOKEN_PERSONAL: ${{ secrets.DATABRICKS_TOKEN }}
+    steps:
+      - name: Check out driver
+        uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
+        with:
+          ref: ${{ needs.detect-changes.outputs.head_sha }}
+
+      - name: Read pinned kernel SHA
+        id: kernel-rev
+        run: |
+          if [[ ! -f KERNEL_REV ]]; then
+            echo "::error::KERNEL_REV file missing"
+            exit 1
+          fi
+          REV=$(tr -d '[:space:]' < KERNEL_REV)
+          if [[ ! "$REV" =~ ^[0-9a-f]{40}$ ]]; then
+            echo "::error::KERNEL_REV must be a 40-char commit SHA, got: $REV"
+            exit 1
+          fi
+          echo "rev=$REV" >> "$GITHUB_OUTPUT"
+          echo "Pinned kernel SHA: $REV"
+
+      - name: Generate GitHub App token (kernel repo read access)
+        id: app-token
+        uses: actions/create-github-app-token@f8d387b68d61c58ab83c6c016672934102569859 # v3.0.0
+        with:
+          app-id: ${{ secrets.INTEGRATION_TEST_APP_ID }}
+          private-key: ${{ secrets.INTEGRATION_TEST_PRIVATE_KEY }}
+          owner: databricks
+          repositories: databricks-sql-kernel
+
+      - name: Check out kernel at pinned SHA
+        uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
+        with:
+          repository: databricks/databricks-sql-kernel
+          ref: ${{ steps.kernel-rev.outputs.rev }}
+          token: ${{ steps.app-token.outputs.token }}
+          path: databricks-sql-kernel
+
+      - uses: actions/setup-node@49933ea5288caeca8642d1e84afbd3f7d6820020 # v4
+        with:
+          node-version: 20
+
+      - name: Set up Rust toolchain
+        uses: actions-rust-lang/setup-rust-toolchain@1780873c7b576612439a134613cc4cc74ce5538c # v1.15.2
+        with:
+          cache: false
+
+      - name: Cache cargo build artifacts (keyed on kernel SHA)
+        uses: Swatinem/rust-cache@98c8021b550208e191a6a3145459bfc9fb29c4c0 # v2.8.0
+        with:
+          workspaces: databricks-sql-kernel
+          key: kernel-${{ steps.kernel-rev.outputs.rev }}
+
+      - name: Set up JFrog (npm registry proxy)
+        uses: ./.github/actions/setup-jfrog
+
+      - name: Configure Cargo for JFrog proxy
+        shell: bash
+        # databricks-protected-runner-group blocks direct egress to
+        # index.crates.io, so cargo must route through JFrog's
+        # db-cargo-remote proxy. Reuses the JFrog token setup-jfrog
+        # exported into the environment.
+        run: |
+          set -euo pipefail
+          mkdir -p ~/.cargo
+          cat > ~/.cargo/config.toml << 'EOF'
+          [source.crates-io]
+          replace-with = "jfrog"
+          [source.jfrog]
+          registry = "sparse+https://databricks.jfrog.io/artifactory/api/cargo/db-cargo-remote/index/"
+          [registries.jfrog]
+          index = "sparse+https://databricks.jfrog.io/artifactory/api/cargo/db-cargo-remote/index/"
+          credential-provider = ["cargo:token"]
+          EOF
+          cat > ~/.cargo/credentials.toml << EOF
+          [registries.jfrog]
+          token = "Bearer ${JFROG_ACCESS_TOKEN}"
+          EOF
+          echo "CARGO_REGISTRIES_JFROG_TOKEN=Bearer ${JFROG_ACCESS_TOKEN}" >> "$GITHUB_ENV"
+
+      - name: Install driver deps
+        run: npm ci
+
+      - name: Build napi binding from pinned kernel
+        # build:native cd's into ${DATABRICKS_SQL_KERNEL_REPO}/napi, runs the
+        # napi-rs build, and copies index.* into native/sea/. Pointing it at
+        # the SHA-pinned kernel checkout is what makes the binary match
+        # KERNEL_REV exactly.
+        env:
+          DATABRICKS_SQL_KERNEL_REPO: ${{ github.workspace }}/databricks-sql-kernel
+        run: npm run build:native
+
+      - name: Smoke-check binding loads
+        run: node -e "const b=require('./native/sea'); if(typeof b.version!=='function'){throw new Error('napi binding failed to load')} console.log('kernel binding ok:', b.version())"
+
+      - name: Run SEA e2e tests
+        run: NODE_OPTIONS="--max-old-space-size=4096" npm run e2e -- 'tests/e2e/sea/**/*.test.ts'
+
+      - name: Post Kernel E2E check (success)
+        if: success()
+        uses: actions/github-script@f28e40c7f34bde8b3046d885e986cb6290c5673b # v7.1.0
+        with:
+          github-token: ${{ github.token }}
+          script: |
+            await github.rest.checks.create({
+              owner: context.repo.owner,
+              repo: context.repo.repo,
+              name: 'Kernel E2E',
+              head_sha: '${{ needs.detect-changes.outputs.head_sha }}',
+              status: 'completed',
+              conclusion: 'success',
+              completed_at: new Date().toISOString(),
+              output: {
+                title: 'Kernel E2E passed',
+                summary: 'tests/e2e/sea ran green against the pinned kernel SHA.'
+              }
+            });
+
+      - name: Post Kernel E2E check (failure)
+        if: failure()
+        uses: actions/github-script@f28e40c7f34bde8b3046d885e986cb6290c5673b # v7.1.0
+        with:
+          github-token: ${{ github.token }}
+          script: |
+            await github.rest.checks.create({
+              owner: context.repo.owner,
+              repo: context.repo.repo,
+              name: 'Kernel E2E',
+              head_sha: '${{ needs.detect-changes.outputs.head_sha }}',
+              status: 'completed',
+              conclusion: 'failure',
+              completed_at: new Date().toISOString(),
+              output: {
+                title: 'Kernel E2E failed',
+                summary: 'See workflow logs for details.'
+              }
+            });
+
+  # ───────────────────────────────────────────────────────────────
+  # Auto-pass the Kernel E2E check in the merge queue when no SEA-
+  # relevant files changed.
+  # ───────────────────────────────────────────────────────────────
+  auto-pass-merge-queue:
+    needs: detect-changes
+    if: github.event_name == 'merge_group' && needs.detect-changes.outputs.run_tests != 'true'
+    runs-on:
+      group: databricks-protected-runner-group
+      labels: linux-ubuntu-latest
+    permissions:
+      checks: write
+    steps:
+      - name: Auto-pass
+        uses: actions/github-script@f28e40c7f34bde8b3046d885e986cb6290c5673b # v7.1.0
+        with:
+          github-token: ${{ github.token }}
+          script: |
+            await github.rest.checks.create({
+              owner: context.repo.owner,
+              repo: context.repo.repo,
+              name: 'Kernel E2E',
+              head_sha: '${{ github.event.merge_group.head_sha }}',
+              status: 'completed',
+              conclusion: 'success',
+              completed_at: new Date().toISOString(),
+              output: {
+                title: 'Skipped — no SEA-relevant changes',
+                summary: 'No files under lib/sea/, native/sea/, tests/e2e/sea/, tests/unit/sea/, KERNEL_REV, package.json, or package-lock.json changed.'
+              }
+            });
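
The job-level `if:` conditions in this workflow amount to a small routing table: which job, if any, ends up deciding the "Kernel E2E" check for a given event. The bash sketch below restates that routing under the assumption that the conditions behave exactly as written in the diff. The function names `sea_relevant` and `route` are hypothetical and exist only for illustration; they are not part of the repo or the workflow.

```shell
#!/usr/bin/env bash
# Illustrative only: restates the workflow's job routing as plain bash.
# `sea_relevant` and `route` are hypothetical helper names.

# Same path filter as the "Detect SEA-relevant changes" step.
SEA_PATHS='^(lib/sea/|native/sea/|tests/e2e/sea/|tests/unit/sea/|\.github/workflows/kernel-e2e\.yml|KERNEL_REV|package\.json|package-lock\.json)'

# Prints "true" if any newline-separated path in $1 matches the filter.
sea_relevant() {
  if printf '%s\n' "$1" | grep -qE "$SEA_PATHS"; then
    echo true
  else
    echo false
  fi
}

# Prints the job that posts the "Kernel E2E" check, or "none".
#   $1 = event name, $2 = PR action ("" for merge_group),
#   $3 = "yes" if the kernel-e2e label is on the PR,
#   $4 = newline-separated changed paths.
route() {
  local event=$1 action=$2 labelled=$3 changed=$4
  if [[ $event == pull_request && $action != labeled ]]; then
    echo skip-kernel-e2e-pr            # synthetic success on plain PR events
  elif [[ $event == merge_group || ( $event == pull_request && $labelled == yes ) ]]; then
    if [[ $(sea_relevant "$changed") == true ]]; then
      echo run-kernel-e2e              # real build and test run
    elif [[ $event == merge_group ]]; then
      echo auto-pass-merge-queue       # nothing SEA-relevant moved
    else
      echo none                        # labelled PR, nothing relevant: no job posts a check
    fi
  else
    echo none                          # e.g. a label other than kernel-e2e was added
  fi
}
```

For instance, `route merge_group "" no "KERNEL_REV"` selects the real test job, while the same call with an unrelated path selects the auto-pass job.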

KERNEL_REV

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+b4d88220cdfad8dba1cfa89892269342ae26feeb
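
The format check that the workflow's "Read pinned kernel SHA" step applies to this file can also be run locally before pushing a bump. The following is a minimal sketch that wraps the same two checks (file exists, content is 40 lowercase hex characters after whitespace is stripped) in a function. The name `read_kernel_rev` is hypothetical and is not something the repo provides.

```shell
#!/usr/bin/env bash
# Illustrative only: local equivalent of the workflow's KERNEL_REV validation.
# `read_kernel_rev` is a hypothetical helper name.

# Prints the pinned SHA from the given file (default: ./KERNEL_REV),
# or returns non-zero with a message on stderr if the pin is malformed.
read_kernel_rev() {
  local file=${1:-KERNEL_REV} rev
  if [[ ! -f $file ]]; then
    echo "KERNEL_REV file missing: $file" >&2
    return 1
  fi
  # Strip all whitespace, as the workflow step does.
  rev=$(tr -d '[:space:]' < "$file")
  if [[ ! $rev =~ ^[0-9a-f]{40}$ ]]; then
    echo "KERNEL_REV must be a 40-char commit SHA, got: $rev" >&2
    return 1
  fi
  printf '%s\n' "$rev"
}
```

Abbreviated SHAs such as `b4d8822` are rejected by this check, which matches the workflow's requirement for a full 40-character commit SHA.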

native/sea/README.md

Lines changed: 21 additions & 0 deletions
@@ -53,6 +53,27 @@ directory containing `napi/`) and is required when your kernel
 checkout isn't at `../../databricks-sql-kernel` relative to the
 nodejs repo.
 
+## Kernel version pin (`KERNEL_REV`)
+
+The kernel is a private repo with no published binary artifact, and the
+napi binding is built from its Rust source rather than a versioned crate.
+To keep the driver ↔ kernel pair reproducible and bisectable, the exact
+kernel commit the binding is built against is pinned in the **`KERNEL_REV`**
+file at the repo root: a single 40-char commit SHA. This mirrors the
+`databricks-sql-python` connector's `KERNEL_REV` mechanism.
+
+The `.github/workflows/kernel-e2e.yml` CI job is the consumer: it reads
+`KERNEL_REV`, checks the kernel out at that SHA (via a GitHub App token
+with read access to `databricks/databricks-sql-kernel`), runs
+`npm run build:native` against it, and runs the SEA e2e suite
+(`tests/e2e/sea/**`) against the dogfood warehouse. **Bumping `KERNEL_REV`
+is the only way to pick up a new kernel version**, so a driver change and
+the kernel revision it depends on always land together in one reviewable
+diff.
+
+For local dev, point `DATABRICKS_SQL_KERNEL_REPO` at a kernel checkout on
+that SHA (`git -C <kernel> checkout "$(cat KERNEL_REV)"`) to match CI.
+
 ## Production load path
 
 At release time the kernel's CI publishes
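
For the local-dev instruction in the `KERNEL_REV` section of this README hunk, it can be useful to confirm that the kernel checkout really sits on the pinned commit before running `npm run build:native`. The sketch below compares the checkout's `HEAD` with the pin. It assumes `git` is installed and that the kernel directory is a normal git checkout; `check_kernel_pin` is a hypothetical helper, not a script shipped in the repo.

```shell
#!/usr/bin/env bash
# Illustrative only: verify a local kernel checkout matches KERNEL_REV.
# `check_kernel_pin` is a hypothetical helper name.

# $1 = path to the kernel checkout, $2 = pin file (default: ./KERNEL_REV).
# Prints "match" and returns 0 when HEAD equals the pin; returns 1 on a
# mismatch and 2 when either value cannot be read.
check_kernel_pin() {
  local kernel=$1 pin_file=${2:-KERNEL_REV} want have
  want=$(tr -d '[:space:]' < "$pin_file") || return 2
  have=$(git -C "$kernel" rev-parse HEAD) || return 2
  if [[ $have == "$want" ]]; then
    echo "match"
  else
    echo "mismatch: checkout is at $have, KERNEL_REV pins $want" >&2
    return 1
  fi
}

# Possible usage from the driver repo root:
#   check_kernel_pin "$DATABRICKS_SQL_KERNEL_REPO" KERNEL_REV && npm run build:native
```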
