
Commit 5295e13

feat: SPOG (Single Point of Gateway) host support (#1479)
## Summary

Adds SPOG (Single Point of Gateway) support: account-level vanity hosts (e.g. `xyz.azuredatabricks.net`) where workspaces are disambiguated by `?o=<workspace-id>` on `http_path`. Matches the contract in `databricks-sql-python` ([#767](databricks/databricks-sql-python#767)), `databricks-sql-go`, `databricks-jdbc`, and the ADBC Rust driver.

Opt-in via the dep-ceiling bumps in [#1474](#1474): activates only when `databricks-sql-connector ≥ 4.2.6` and `databricks-sdk ≥ 0.76.0` are installed (`Config.workspace_id` was introduced in SDK 0.76.0; verified end-to-end against a SPOG vanity URL). Legacy hosts and older deps are unaffected.

### Pre-flight checks

| Host type | `?o=` in `http_path` | Outcome |
|---|---|---|
| SPOG (`unified`) | present | proceed; workspace id passed to SDK |
| SPOG (`unified`) | missing | **warning** |
| non-SPOG | present | **warning** |
| non-SPOG | missing | proceed |
| host probe failed | n/a | proceed (probe is non-fatal) |

## Test plan

- [x] Unit (`hatch run unit tests/unit -q`): 1174 passed
- [x] `pre-commit run --all-files`
- [x] `dbt debug` SPOG status block verified on both
- [x] New SPOG tests added (unit + functional)
- [x] Integration test run succeeds (🆗 [link](#1479))
- [x] Integration test run succeeds at `min-deps` (🆗 [link](#1479))
- [x] New workflow to run integration tests on SPOG URLs weekly for sanity check (`spog-integration.yml`) succeeds
- [ ] Follow up docs PR
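The pre-flight table in the summary can be read as a small decision function. The sketch below is illustrative only and is not taken from the adapter: the names `workspace_id_from_http_path` and `preflight_outcome` are hypothetical, and it assumes the host probe reports the string `"unified"` for SPOG hosts and `None` when the probe fails.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def workspace_id_from_http_path(http_path: str) -> Optional[str]:
    """Return the value of the `o` query parameter, or None if it is absent or empty."""
    query = urlsplit(http_path).query
    values = parse_qs(query).get("o", [])
    return values[0] if values else None


def preflight_outcome(host_type: Optional[str], http_path: str) -> str:
    """Map (host type, presence of ?o=) to "proceed" or "warning".

    host_type is assumed to be "unified" for SPOG hosts, any other string
    for non-SPOG hosts, and None when the host probe failed.
    """
    if host_type is None:
        # The probe is non-fatal, so a failed probe never blocks the run.
        return "proceed"
    is_spog = host_type == "unified"
    has_workspace_id = workspace_id_from_http_path(http_path) is not None
    # SPOG hosts need ?o=; non-SPOG hosts should not carry it.
    return "proceed" if is_spog == has_workspace_id else "warning"
```

For example, `preflight_outcome("unified", "/sql/1.0/warehouses/abc")` yields `"warning"` because a SPOG host without `?o=` cannot be routed to a workspace. The warehouse path and host-type strings here are made-up inputs.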
1 parent 7ffddd3 commit 5295e13

29 files changed

Lines changed: 1473 additions & 61 deletions

.github/workflows/build_cluster_http_path.py

Lines changed: 16 additions & 8 deletions
@@ -1,16 +1,24 @@
 import os
 import re
 
-workspace_re = re.compile(r"^.*-(\d+)\..*$")
-hostname = os.getenv("DBT_DATABRICKS_HOST_NAME", "")
-matches = workspace_re.match(hostname)
-if matches:
-    workspace_id = matches.group(1)
-    print(workspace_id)
+spog_native = os.getenv("TEST_PECO_SPOG_NATIVE") == "1"
+spog_workspace_id = os.getenv("TEST_PECO_SPOG_WORKSPACE_ID")
+
+if spog_native:
+    if not spog_workspace_id:
+        raise RuntimeError("TEST_PECO_SPOG_NATIVE requires TEST_PECO_SPOG_WORKSPACE_ID.")
+    workspace_id = spog_workspace_id
+else:
+    workspace_re = re.compile(r"^.*-(\d+)\..*$")
+    hostname = os.getenv("DBT_DATABRICKS_HOST_NAME", "")
+    matches = workspace_re.match(hostname)
+    workspace_id = matches.group(1) if matches else ""
+
 cluster_id = os.getenv("TEST_PECO_CLUSTER_ID")
 uc_cluster_id = os.getenv("TEST_PECO_UC_CLUSTER_ID")
-http_path = f"sql/protocolv1/o/{workspace_id}/{cluster_id}"
-uc_http_path = f"sql/protocolv1/o/{workspace_id}/{uc_cluster_id}"
+suffix = f"?o={workspace_id}" if spog_native else ""
+http_path = f"sql/protocolv1/o/{workspace_id}/{cluster_id}{suffix}"
+uc_http_path = f"sql/protocolv1/o/{workspace_id}/{uc_cluster_id}{suffix}"
 
 # https://stackoverflow.com/a/72225291/5093960
 env_file = os.getenv("GITHUB_ENV", "")
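
To make the new branching easier to check outside CI, the same path-building logic can be wrapped in a pure function. This is a sketch, not part of the commit: the function name `build_cluster_http_path`, its parameters, and the host and cluster values in the usage note are assumptions chosen for illustration. The regular expression and the path format are copied from the diff above.

```python
import re
from typing import Optional

# Same pattern as in the script: captures the digits after the last "-"
# that are immediately followed by a "." in the hostname.
WORKSPACE_RE = re.compile(r"^.*-(\d+)\..*$")


def build_cluster_http_path(
    hostname: str,
    cluster_id: str,
    spog_workspace_id: Optional[str] = None,
    spog_native: bool = False,
) -> str:
    """Build a cluster http_path, appending ?o=<workspace-id> only in SPOG mode."""
    if spog_native:
        if not spog_workspace_id:
            raise ValueError("SPOG mode requires an explicit workspace id.")
        workspace_id = spog_workspace_id
    else:
        # Legacy hosts embed the workspace id in the hostname itself.
        match = WORKSPACE_RE.match(hostname)
        workspace_id = match.group(1) if match else ""
    suffix = f"?o={workspace_id}" if spog_native else ""
    return f"sql/protocolv1/o/{workspace_id}/{cluster_id}{suffix}"
```

With the made-up inputs `hostname="xyz.azuredatabricks.net"`, `cluster_id="0123-456789-abcdefgh"`, `spog_workspace_id="42"`, and `spog_native=True`, the function returns `sql/protocolv1/o/42/0123-456789-abcdefgh?o=42`. The workspace id appears both in the path and in the `?o=` suffix, as in the script.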

.github/workflows/integration-min-deps.yml

Lines changed: 0 additions & 14 deletions
@@ -290,11 +290,6 @@ jobs:
         run: |
           mkdir -p logs
           DBT_TEST_USER=notnecessaryformosttests@example.com \
-          DBT_DATABRICKS_LOCATION_ROOT="$DBT_DATABRICKS_LOCATION_ROOT" \
-          DBT_DATABRICKS_HOST_NAME="$DBT_DATABRICKS_HOST_NAME" \
-          DBT_DATABRICKS_UC_CLUSTER_HTTP_PATH="$DBT_DATABRICKS_UC_CLUSTER_HTTP_PATH" \
-          DBT_DATABRICKS_CLIENT_ID="$DBT_DATABRICKS_CLIENT_ID" \
-          DBT_DATABRICKS_CLIENT_SECRET="$DBT_DATABRICKS_CLIENT_SECRET" \
           xargs -r hatch -v run min-deps:pytest \
             --color=yes -v \
             --profile databricks_uc_cluster \
@@ -389,11 +384,6 @@ jobs:
         run: |
           mkdir -p logs
           DBT_TEST_USER=notnecessaryformosttests@example.com \
-          DBT_DATABRICKS_LOCATION_ROOT="$DBT_DATABRICKS_LOCATION_ROOT" \
-          DBT_DATABRICKS_HOST_NAME="$DBT_DATABRICKS_HOST_NAME" \
-          DBT_DATABRICKS_UC_CLUSTER_HTTP_PATH="$DBT_DATABRICKS_UC_CLUSTER_HTTP_PATH" \
-          DBT_DATABRICKS_CLIENT_ID="$DBT_DATABRICKS_CLIENT_ID" \
-          DBT_DATABRICKS_CLIENT_SECRET="$DBT_DATABRICKS_CLIENT_SECRET" \
           xargs -r hatch -v run min-deps:pytest \
             --color=yes -v \
             --profile databricks_uc_sql_endpoint \
@@ -486,11 +476,7 @@ jobs:
         run: |
           mkdir -p logs
           DBT_TEST_USER=notnecessaryformosttests@example.com \
-          DBT_DATABRICKS_LOCATION_ROOT="$DBT_DATABRICKS_LOCATION_ROOT" \
-          DBT_DATABRICKS_HOST_NAME="$DBT_DATABRICKS_HOST_NAME" \
           DBT_DATABRICKS_HTTP_PATH="$DBT_DATABRICKS_CLUSTER_HTTP_PATH" \
-          DBT_DATABRICKS_CLIENT_ID="$DBT_DATABRICKS_CLIENT_ID" \
-          DBT_DATABRICKS_CLIENT_SECRET="$DBT_DATABRICKS_CLIENT_SECRET" \
           xargs -r hatch -v run min-deps:pytest \
             --color=yes -v \
             --profile databricks_cluster \
Lines changed: 325 additions & 0 deletions
@@ -0,0 +1,325 @@
+# SPOG Integration Tests for dbt-databricks.
+#
+# Mirrors integration.yml's matrix (prepare-shards + 3 sharded functional
+# jobs) on a weekly Sunday schedule, but with the host name and http_path
+# wired to the SPOG vanity host + ?o= so every test path exercises SPOG
+# routing. PR-targeting / status-reporting logic is omitted — this is a
+# scheduled smoke, not a PR gate.
+name: SPOG Integration Tests
+on:
+  workflow_dispatch:
+  schedule:
+    - cron: "30 21 * * 0" # Weekly: Sunday 21:30 UTC (Monday 03:00 IST).
+
+permissions:
+  id-token: write
+  contents: read
+
+concurrency:
+  group: ${{ github.workflow }}-${{ github.ref }}
+  cancel-in-progress: true
+
+jobs:
+  prepare-shards:
+    runs-on:
+      group: databricks-protected-runner-group
+      labels: linux-ubuntu-latest
+    env:
+      UV_FROZEN: "1"
+    steps:
+      - name: Check out the repository
+        uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
+
+      - name: Setup JFrog PyPI Proxy
+        uses: ./.github/actions/setup-jfrog-pypi
+
+      - name: Set up Python
+        uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5
+        with:
+          python-version: "3.10"
+
+      - name: Install uv
+        uses: astral-sh/setup-uv@38f3f104447c67c051c4a08e39b64a148898af3a # v4
+        with:
+          cache-local-path: ~/.cache/uv
+
+      - name: Install Hatch
+        uses: pypa/hatch@257e27e51a6a5616ed08a39a408a21c35c9931bc # install
+
+      - name: Collect tests and assign to shards
+        run: |
+          set -euo pipefail
+          mkdir -p shard-assignments
+          declare -A NUM_SHARDS=(
+            [databricks_cluster]=2
+            [databricks_uc_cluster]=3
+            [databricks_uc_sql_endpoint]=3
+          )
+          for PROFILE in "${!NUM_SHARDS[@]}"; do
+            (
+              hatch run pytest --collect-only -q --profile "$PROFILE" tests/functional 2>&1 \
+                | grep "::" \
+                > "shard-assignments/${PROFILE}-collected.txt"
+            ) &
+          done
+          wait
+          for PROFILE in "${!NUM_SHARDS[@]}"; do
+            python3 scripts/shard_assign.py \
+              --profile "$PROFILE" \
+              --num-shards "${NUM_SHARDS[$PROFILE]}" \
+              --input "shard-assignments/${PROFILE}-collected.txt" \
+              --output-dir shard-assignments \
+              --algo lpt_historical_time \
+              --timings .github/test_timings.json
+          done
+
+      - name: Upload shard assignments
+        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4
+        with:
+          name: shard-assignments-spog
+          path: shard-assignments/
+          retention-days: 5
+
+  uc-cluster:
+    needs: prepare-shards
+    strategy:
+      fail-fast: false
+      matrix:
+        shard: [0, 1, 2]
+    runs-on:
+      group: databricks-protected-runner-group
+      labels: linux-ubuntu-latest
+    environment: azure-prod
+    env:
+      DBT_DATABRICKS_HOST_NAME: ${{ secrets.TEST_PECO_SPOG_HOST }}
+      DBT_DATABRICKS_CLIENT_ID: ${{ secrets.TEST_PECO_SP_ID }}
+      DBT_DATABRICKS_CLIENT_SECRET: ${{ secrets.TEST_PECO_SP_SECRET }}
+      DBT_DATABRICKS_UC_INITIAL_CATALOG: peco
+      DBT_DATABRICKS_LOCATION_ROOT: ${{ secrets.TEST_PECO_EXTERNAL_LOCATION }}test
+      TEST_PECO_SPOG_WORKSPACE_ID: ${{ secrets.TEST_PECO_SPOG_WORKSPACE_ID }}
+      TEST_PECO_UC_CLUSTER_ID: ${{ secrets.TEST_PECO_UC_CLUSTER_ID }}
+      TEST_PECO_SPOG_NATIVE: "1"
+      UV_FROZEN: "1"
+    steps:
+      - name: Check out repository
+        uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
+
+      - name: Setup JFrog PyPI Proxy
+        uses: ./.github/actions/setup-jfrog-pypi
+
+      - name: Set up python
+        uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5
+        with:
+          python-version: "3.10"
+
+      - name: Get http path from environment
+        run: python .github/workflows/build_cluster_http_path.py
+        shell: sh
+
+      - name: Install uv
+        uses: astral-sh/setup-uv@38f3f104447c67c051c4a08e39b64a148898af3a # v4
+        with:
+          cache-local-path: ~/.cache/uv
+
+      - name: Install Hatch
+        uses: pypa/hatch@257e27e51a6a5616ed08a39a408a21c35c9931bc # install
+
+      - name: Download shard assignments
+        uses: actions/download-artifact@65a9edc5881444af0b9093a5e628f2fe47ea3b2e # v4.1.7
+        with:
+          name: shard-assignments-spog
+          path: shard-assignments/
+
+      - name: Resolve test list for this shard
+        run: |
+          set -euo pipefail
+          SHARD_FILE="shard-assignments/databricks_uc_cluster-shard-${{ matrix.shard }}.txt"
+          if [ ! -s "$SHARD_FILE" ]; then
+            echo "::error::Shard file missing or empty: $SHARD_FILE"
+            exit 1
+          fi
+          echo "SHARD_TESTS_FILE=$SHARD_FILE" >> "$GITHUB_ENV"
+          echo "Files in shard ${{ matrix.shard }}: $(wc -l < "$SHARD_FILE")"
+
+      - name: Run UC Cluster Functional Tests (shard ${{ matrix.shard }})
+        run: |
+          mkdir -p logs
+          DBT_TEST_USER=notnecessaryformosttests@example.com \
+          xargs -r hatch -v run pytest \
+            --color=yes -v \
+            --profile databricks_uc_cluster \
+            -n 10 --dist=loadfile \
+            --reruns 2 --reruns-delay 120 \
+            --junitxml=logs/junit-uc-cluster-shard-${{ matrix.shard }}.xml \
+            < "$SHARD_TESTS_FILE"
+
+      - name: Upload UC Cluster Test Logs
+        if: always()
+        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4
+        with:
+          name: spog-uc-cluster-logs-shard-${{ matrix.shard }}
+          path: logs/
+          retention-days: 14
+
+  uc-sql-endpoint:
+    needs: prepare-shards
+    strategy:
+      fail-fast: false
+      matrix:
+        shard: [0, 1, 2]
+    runs-on:
+      group: databricks-protected-runner-group
+      labels: linux-ubuntu-latest
+    environment: azure-prod
+    env:
+      DBT_DATABRICKS_HOST_NAME: ${{ secrets.TEST_PECO_SPOG_HOST }}
+      DBT_DATABRICKS_CLIENT_ID: ${{ secrets.TEST_PECO_SP_ID }}
+      DBT_DATABRICKS_CLIENT_SECRET: ${{ secrets.TEST_PECO_SP_SECRET }}
+      DBT_DATABRICKS_HTTP_PATH: ${{ secrets.TEST_PECO_WAREHOUSE_HTTP_PATH }}?o=${{ secrets.TEST_PECO_SPOG_WORKSPACE_ID }}
+      DBT_DATABRICKS_UC_INITIAL_CATALOG: peco
+      DBT_DATABRICKS_LOCATION_ROOT: ${{ secrets.TEST_PECO_EXTERNAL_LOCATION }}test
+      TEST_PECO_SPOG_WORKSPACE_ID: ${{ secrets.TEST_PECO_SPOG_WORKSPACE_ID }}
+      TEST_PECO_UC_CLUSTER_ID: ${{ secrets.TEST_PECO_UC_CLUSTER_ID }}
+      TEST_PECO_SPOG_NATIVE: "1"
+      UV_FROZEN: "1"
+    steps:
+      - name: Check out repository
+        uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
+
+      - name: Setup JFrog PyPI Proxy
+        uses: ./.github/actions/setup-jfrog-pypi
+
+      - name: Set up python
+        uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5
+        with:
+          python-version: "3.10"
+
+      - name: Get http path from environment
+        run: python .github/workflows/build_cluster_http_path.py
+        shell: sh
+
+      - name: Install uv
+        uses: astral-sh/setup-uv@38f3f104447c67c051c4a08e39b64a148898af3a # v4
+        with:
+          cache-local-path: ~/.cache/uv
+
+      - name: Install Hatch
+        uses: pypa/hatch@257e27e51a6a5616ed08a39a408a21c35c9931bc # install
+
+      - name: Download shard assignments
+        uses: actions/download-artifact@65a9edc5881444af0b9093a5e628f2fe47ea3b2e # v4.1.7
+        with:
+          name: shard-assignments-spog
+          path: shard-assignments/
+
+      - name: Resolve test list for this shard
+        run: |
+          set -euo pipefail
+          SHARD_FILE="shard-assignments/databricks_uc_sql_endpoint-shard-${{ matrix.shard }}.txt"
+          if [ ! -s "$SHARD_FILE" ]; then
+            echo "::error::Shard file missing or empty: $SHARD_FILE"
+            exit 1
+          fi
+          echo "SHARD_TESTS_FILE=$SHARD_FILE" >> "$GITHUB_ENV"
+          echo "Files in shard ${{ matrix.shard }}: $(wc -l < "$SHARD_FILE")"
+
+      - name: Run Sql Endpoint Functional Tests (shard ${{ matrix.shard }})
+        run: |
+          mkdir -p logs
+          DBT_TEST_USER=notnecessaryformosttests@example.com \
+          xargs -r hatch -v run pytest \
+            --color=yes -v \
+            --profile databricks_uc_sql_endpoint \
+            -n 10 --dist=loadfile \
+            --reruns 2 --reruns-delay 120 \
+            --junitxml=logs/junit-uc-sql-endpoint-shard-${{ matrix.shard }}.xml \
+            < "$SHARD_TESTS_FILE"
+
+      - name: Upload SQL Endpoint Test Logs
+        if: always()
+        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4
+        with:
+          name: spog-uc-sql-endpoint-logs-shard-${{ matrix.shard }}
+          path: logs/
+          retention-days: 14
+
+  cluster:
+    needs: prepare-shards
+    strategy:
+      fail-fast: false
+      matrix:
+        shard: [0, 1]
+    runs-on:
+      group: databricks-protected-runner-group
+      labels: linux-ubuntu-latest
+    environment: azure-prod
+    env:
+      DBT_DATABRICKS_HOST_NAME: ${{ secrets.TEST_PECO_SPOG_HOST }}
+      DBT_DATABRICKS_CLIENT_ID: ${{ secrets.TEST_PECO_SP_ID }}
+      DBT_DATABRICKS_CLIENT_SECRET: ${{ secrets.TEST_PECO_SP_SECRET }}
+      DBT_DATABRICKS_LOCATION_ROOT: ${{ secrets.TEST_PECO_EXTERNAL_LOCATION }}test
+      TEST_PECO_SPOG_WORKSPACE_ID: ${{ secrets.TEST_PECO_SPOG_WORKSPACE_ID }}
+      TEST_PECO_CLUSTER_ID: ${{ secrets.TEST_PECO_CLUSTER_ID }}
+      TEST_PECO_SPOG_NATIVE: "1"
+      UV_FROZEN: "1"
+    steps:
+      - name: Check out repository
+        uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
+
+      - name: Setup JFrog PyPI Proxy
+        uses: ./.github/actions/setup-jfrog-pypi
+
+      - name: Set up python
+        uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5
+        with:
+          python-version: "3.10"
+
+      - name: Get http path from environment
+        run: python .github/workflows/build_cluster_http_path.py
+        shell: sh
+
+      - name: Install uv
+        uses: astral-sh/setup-uv@38f3f104447c67c051c4a08e39b64a148898af3a # v4
+        with:
+          cache-local-path: ~/.cache/uv
+
+      - name: Install Hatch
+        uses: pypa/hatch@257e27e51a6a5616ed08a39a408a21c35c9931bc # install
+
+      - name: Download shard assignments
+        uses: actions/download-artifact@65a9edc5881444af0b9093a5e628f2fe47ea3b2e # v4.1.7
+        with:
+          name: shard-assignments-spog
+          path: shard-assignments/
+
+      - name: Resolve test list for this shard
+        run: |
+          set -euo pipefail
+          SHARD_FILE="shard-assignments/databricks_cluster-shard-${{ matrix.shard }}.txt"
+          if [ ! -s "$SHARD_FILE" ]; then
+            echo "::error::Shard file missing or empty: $SHARD_FILE"
+            exit 1
+          fi
+          echo "SHARD_TESTS_FILE=$SHARD_FILE" >> "$GITHUB_ENV"
+          echo "Files in shard ${{ matrix.shard }}: $(wc -l < "$SHARD_FILE")"
+
+      - name: Run Cluster Functional Tests (shard ${{ matrix.shard }})
+        run: |
+          mkdir -p logs
+          DBT_TEST_USER=notnecessaryformosttests@example.com \
+          DBT_DATABRICKS_HTTP_PATH="$DBT_DATABRICKS_CLUSTER_HTTP_PATH" \
+          xargs -r hatch -v run pytest \
+            --color=yes -v \
+            --profile databricks_cluster \
+            -n 10 --dist=loadfile \
+            --reruns 2 --reruns-delay 120 \
+            --junitxml=logs/junit-cluster-shard-${{ matrix.shard }}.xml \
+            < "$SHARD_TESTS_FILE"
+
+      - name: Upload Cluster Test Logs
+        if: always()
+        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4
+        with:
+          name: spog-cluster-logs-shard-${{ matrix.shard }}
+          path: logs/
+          retention-days: 14
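
The `prepare-shards` job passes `--algo lpt_historical_time` and a timings file to `scripts/shard_assign.py`, which is not shown in this commit. Assuming the name refers to the classic longest-processing-time-first (LPT) heuristic driven by recorded durations, the idea can be sketched as follows. The function `lpt_assign`, its parameters, and the default duration for tests with no recorded timing are all hypothetical and may differ from the real script.

```python
import heapq
from typing import Dict, List


def lpt_assign(
    tests: List[str],
    timings: Dict[str, float],
    num_shards: int,
    default_seconds: float = 1.0,
) -> List[List[str]]:
    """Greedy LPT: place the longest remaining test on the least-loaded shard."""
    shards: List[List[str]] = [[] for _ in range(num_shards)]
    # Min-heap of (accumulated seconds, shard index).
    heap = [(0.0, index) for index in range(num_shards)]
    heapq.heapify(heap)
    # Sort by descending duration; ties are broken by name for determinism.
    ordered = sorted(tests, key=lambda t: (-timings.get(t, default_seconds), t))
    for test in ordered:
        load, index = heapq.heappop(heap)
        shards[index].append(test)
        heapq.heappush(heap, (load + timings.get(test, default_seconds), index))
    return shards
```

Sorting by descending duration before the greedy placement keeps the slowest tests from all landing on one shard. That matters here because each shard runs as a separate matrix job and the workflow finishes only when the slowest shard does.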
