
Commit 0c7d688

ghanse and mwojtyczka authored
Fixed AI-assisted sql_query generation and made has_valid_schema compatible with older Spark versions (#995)
## Changes
<!-- Summary of your changes that are easy to understand. Add screenshots when necessary -->

* Improved Agent guidelines
* Make `has_valid_schema` check compatible with older Spark versions (< 4)
* Fix issue with subqueries in SQL `expression` check in Serverless v5 when check name is not provided (auto-derived)
* Added guidelines on how to configure DQX with LDP/DLT (Lakeflow Declarative Pipelines) to enable incrementalization for Materialized Views (MVs)
* Improved validation of required check function arguments
* Fix CI issues

### Linked issues
<!-- DOC: Link issue with a keyword: close, closes, closed, fix, fixes, fixed, resolve, resolves, resolved. See https://docs.github.com/en/issues/tracking-your-work-with-issues/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword -->

Resolves: #1053

### Tests
<!-- How is this tested? Please see the checklist below and also describe any other relevant tests -->

- [x] manually tested
- [x] added unit tests
- [x] added integration tests
- [ ] added end-to-end tests
- [ ] added performance tests

---------

Co-authored-by: Marcin Wojtyczka <marcin.wojtyczka@databricks.com>
Co-authored-by: mwojtyczka <mwojtyczka@users.noreply.github.com>
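The change list mentions making `has_valid_schema` compatible with Spark versions below 4, but the code for that change is not visible on this page. As a rough sketch of the general technique only, assuming a version string in the usual `major.minor.patch` form, a check can branch on the major version. The helper names `spark_major_version` and `is_spark4_or_newer` are hypothetical and are not part of DQX or PySpark.

```python
def spark_major_version(version: str) -> int:
    """Return the major component of a version string such as '3.5.1'."""
    # Hypothetical helper for illustration; not a DQX or PySpark API.
    major = version.split(".", 1)[0]
    if not major.isdigit():
        raise ValueError(f"Unrecognized version string: {version!r}")
    return int(major)


def is_spark4_or_newer(version: str) -> bool:
    """True when the reported version is Spark 4 or newer."""
    return spark_major_version(version) >= 4


if __name__ == "__main__":
    # A caller would pick the newer or the older code path from this flag.
    for candidate in ("3.5.1", "4.0.0"):
        print(candidate, is_spark4_or_newer(candidate))
```

In a real session the version string would come from the runtime rather than from a literal; how DQX obtains it is not shown in this commit view.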
1 parent e48c29b commit 0c7d688

50 files changed

Lines changed: 1500 additions & 984 deletions


.github/actions/jfrog-auth/action.yml

Lines changed: 12 additions & 0 deletions
@@ -48,3 +48,15 @@ runs:
         uv auth login "${UV_INDEX_URL}" --username gha-service-account --password "${JFROG_ACCESS_TOKEN}"
         printf "%s=%s\n" 'UV_INDEX_URL' "${UV_INDEX_URL}" >> "${GITHUB_ENV}"
         printf "%s=%s\n" 'UV_FROZEN' '1' >> "${GITHUB_ENV}"
+
+    - name: Configure npm/yarn for JFrog
+      shell: bash
+      env:
+        JFROG_ACCESS_TOKEN: "${{ steps.jfrog-auth.outputs.jfrog-access-token }}"
+      run: |
+        umask 077
+        cat > ~/.npmrc << EOF
+        registry=https://databricks.jfrog.io/artifactory/api/npm/db-npm/
+        //databricks.jfrog.io/artifactory/api/npm/db-npm/:_authToken=${JFROG_ACCESS_TOKEN}
+        always-auth=true
+        EOF
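The added step writes an `.npmrc` containing an access token and sets `umask 077` first, so the new file is not readable by group or other users. A minimal Python sketch of the same idea follows, assuming a throwaway directory and a dummy token. The function names `render_npmrc` and `write_private_file` are illustrative and do not appear in the repository.

```python
import os
import stat
import tempfile

# Registry path copied from the workflow step above.
REGISTRY = "databricks.jfrog.io/artifactory/api/npm/db-npm/"


def render_npmrc(token: str) -> str:
    """Build the three .npmrc lines that the step writes via a heredoc."""
    return (
        f"registry=https://{REGISTRY}\n"
        f"//{REGISTRY}:_authToken={token}\n"
        "always-auth=true\n"
    )


def write_private_file(path: str, content: str) -> None:
    """Create the file with mode 0o600 (further restricted by the process umask)."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(content)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        target = os.path.join(workdir, ".npmrc")
        write_private_file(target, render_npmrc("dummy-token"))
        # Show the permission bits of the freshly created file.
        print(oct(stat.S_IMODE(os.stat(target).st_mode)))
```

Setting the mode at creation time avoids a window in which the token file exists with wider permissions, which is the same reason the shell step sets the umask before running `cat`.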
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+name: 'Setup Environment'
+description: 'Setup uv, authenticate with JFrog, and configure package managers (pip, uv, npm).'
+runs:
+  using: "composite"
+  steps:
+    - name: Setup uv
+      uses: astral-sh/setup-uv@5a095e7a2014a4212f075830d4f7277575a9d098 # v7.3.1
+      with:
+        version: "0.11.2"
+        checksum: "7ac2ca0449c8d68dae9b99e635cd3bc9b22a4cb1de64b7c43716398447d42981"
+
+    - name: Setup for JFrog
+      uses: ./.github/actions/jfrog-auth
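The `checksum` input is a 64-character hex string, which is consistent with a SHA-256 digest of the pinned uv release. How `setup-uv` applies it internally is not shown here. As a sketch of the underlying technique, assuming the artifact bytes are already in memory, a digest comparison could look like this. The helper names are illustrative.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the data as lowercase hex."""
    return hashlib.sha256(data).hexdigest()


def verify_checksum(data: bytes, expected_hex: str) -> bool:
    """Compare the computed digest with the expected one in constant time."""
    return hmac.compare_digest(sha256_hex(data), expected_hex.lower())


if __name__ == "__main__":
    payload = b"example artifact bytes"
    expected = sha256_hex(payload)
    print(verify_checksum(payload, expected))
    print(verify_checksum(payload + b"tampered", expected))
```

Pinning both the version and the digest means a re-published or altered artifact under the same version number would fail the comparison instead of being installed.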

.github/pull_request_template.md

Lines changed: 6 additions & 0 deletions
@@ -14,3 +14,9 @@ Resolves #..
 - [ ] added integration tests
 - [ ] added end-to-end tests
 - [ ] added performance tests
+
+### Documentation and Demos
+<!-- Any user facing changes require documentation and demos update -->
+
+- [ ] added/updated demos
+- [ ] added/updated docs

.github/workflows/acceptance.yml

Lines changed: 38 additions & 51 deletions
@@ -5,9 +5,6 @@ on:
     types: [ opened, synchronize, ready_for_review ]
   merge_group:
     types: [ checks_requested ]
-  push:
-    branches:
-      - main

 permissions:
   contents: read
@@ -17,14 +14,23 @@ concurrency:
   cancel-in-progress: false # don't cancel ongoing runs to ensure fixtures are completed and resources terminated

 jobs:
+  # Gate all downstream jobs behind a single check so PRs from forks (no access to the
+  # tool environment) and draft PRs do not trigger the expensive acceptance suite.
+  # PRs from forks are to be tested by the reviewer(s) / maintainer(s) before merging.
+  not-a-fork:
+    runs-on:
+      group: databrickslabs-protected-runner-group
+      labels: linux-ubuntu-latest
+    if: github.event_name == 'pull_request' && !github.event.pull_request.draft && !github.event.pull_request.head.repo.fork
+    steps:
+      - run: echo "Not a fork PR, proceeding"

   integration:
-    # Only run this job for PRs from branches on the main repository and not from forks.
-    # Workflows triggered by PRs from forks don't have access to the tool environment.
-    # PRs from forks to be tested by the reviewer(s) / maintainer(s) before merging.
-    if: github.event_name == 'pull_request' && !github.event.pull_request.draft && !github.event.pull_request.head.repo.fork
+    needs: not-a-fork
     environment: tool
-    runs-on: larger
+    runs-on:
+      group: larger-runners
+      labels: larger
     permissions:
       # Access to the integration testing infrastructure.
       id-token: write
@@ -36,14 +42,8 @@ jobs:
         with:
           fetch-depth: 0

-      - name: Setup uv
-        uses: astral-sh/setup-uv@5a095e7a2014a4212f075830d4f7277575a9d098 # v7.3.1
-        with:
-          version: "0.11.2"
-          checksum: "7ac2ca0449c8d68dae9b99e635cd3bc9b22a4cb1de64b7c43716398447d42981"
-
-      - name: Setup for JFrog
-        uses: ./.github/actions/jfrog-auth
+      - name: Setup environment
+        uses: ./.github/actions/setup-env

       - name: Run unit tests and generate test coverage report
         run: make test
@@ -56,6 +56,7 @@ jobs:
           [run]
           source = ../../src
           relative_files = true
+          parallel = true
           EOF

       # Run tests from `tests/integration` as defined in .codegen.json
@@ -84,12 +85,11 @@ jobs:
           use_oidc: true

   integration_serverless:
-    # Only run this job for PRs from branches on the main repository and not from forks.
-    # Workflows triggered by PRs from forks don't have access to the tool environment.
-    # PRs from forks to be tested by the reviewer(s) / maintainer(s) before merging.
-    if: github.event_name == 'pull_request' && !github.event.pull_request.draft && !github.event.pull_request.head.repo.fork
+    needs: not-a-fork
     environment: tool
-    runs-on: larger
+    runs-on:
+      group: larger-runners
+      labels: larger
     permissions:
       id-token: write
       pull-requests: write
@@ -101,14 +101,8 @@ jobs:
         with:
           fetch-depth: 0

-      - name: Setup uv
-        uses: astral-sh/setup-uv@5a095e7a2014a4212f075830d4f7277575a9d098 # v7.3.1
-        with:
-          version: "0.11.2"
-          checksum: "7ac2ca0449c8d68dae9b99e635cd3bc9b22a4cb1de64b7c43716398447d42981"
-
-      - name: Setup for JFrog
-        uses: ./.github/actions/jfrog-auth
+      - name: Setup environment
+        uses: ./.github/actions/setup-env

       # Integration tests are run from within tests/integration folder.
       # Create .coveragerc with correct relative path to source code.
@@ -118,6 +112,7 @@ jobs:
           [run]
           source = ../../src
           relative_files = true
+          parallel = true
           EOF

       - name: Run integration tests on serverless cluster
@@ -143,9 +138,11 @@ jobs:
           use_oidc: true

   e2e:
-    if: github.event_name == 'pull_request' && !github.event.pull_request.draft && !github.event.pull_request.head.repo.fork
+    needs: not-a-fork
     environment: tool
-    runs-on: larger
+    runs-on:
+      group: larger-runners
+      labels: larger
     permissions:
       id-token: write
       pull-requests: write
@@ -155,18 +152,12 @@ jobs:
         with:
           fetch-depth: 0

-      - name: Setup uv
-        uses: astral-sh/setup-uv@5a095e7a2014a4212f075830d4f7277575a9d098 # v7.3.1
-        with:
-          version: "0.11.2"
-          checksum: "7ac2ca0449c8d68dae9b99e635cd3bc9b22a4cb1de64b7c43716398447d42981"
-
-      - name: Setup for JFrog
-        uses: ./.github/actions/jfrog-auth
+      - name: Setup environment
+        uses: ./.github/actions/setup-env

       # Required for DAB (Databricks Asset Bundle) e2e tests
       - name: Install Databricks CLI
-        uses: databricks/setup-cli@acd0e77a1ed7f15f528faca1e1f7f5590bcfdff8 # v0.296.0
+        uses: databricks/setup-cli@596b0a354ba14aa59921aca1b02bd67c2b0a81a5 # v0.297.2

       - name: Run e2e tests
         uses: databrickslabs/sandbox/acceptance@3313d06ce86227537b3f37f5974f7eecb2a8e59a # acceptance/v0.4.4
@@ -181,9 +172,11 @@ jobs:
           ARM_TENANT_ID: ${{ secrets.ARM_TENANT_ID }}

   e2e_serverless:
-    if: github.event_name == 'pull_request' && !github.event.pull_request.draft && !github.event.pull_request.head.repo.fork
+    needs: not-a-fork
     environment: tool
-    runs-on: larger
+    runs-on:
+      group: larger-runners
+      labels: larger
     permissions:
       id-token: write
       pull-requests: write
@@ -195,18 +188,12 @@ jobs:
         with:
           fetch-depth: 0

-      - name: Setup uv
-        uses: astral-sh/setup-uv@5a095e7a2014a4212f075830d4f7277575a9d098 # v7.3.1
-        with:
-          version: "0.11.2"
-          checksum: "7ac2ca0449c8d68dae9b99e635cd3bc9b22a4cb1de64b7c43716398447d42981"
-
-      - name: Setup for JFrog
-        uses: ./.github/actions/jfrog-auth
+      - name: Setup environment
+        uses: ./.github/actions/setup-env

       # Required for DAB (Databricks Asset Bundle) e2e tests
       - name: Install Databricks CLI
-        uses: databricks/setup-cli@acd0e77a1ed7f15f528faca1e1f7f5590bcfdff8 # v0.296.0
+        uses: databricks/setup-cli@596b0a354ba14aa59921aca1b02bd67c2b0a81a5 # v0.297.2

       - name: Run e2e tests on serverless cluster
         uses: databrickslabs/sandbox/acceptance@3313d06ce86227537b3f37f5974f7eecb2a8e59a # acceptance/v0.4.4
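In this workflow the per-job `if:` conditions are replaced by a single `not-a-fork` job that every other job lists under `needs:`. In GitHub Actions, a job whose needed job was skipped is itself skipped unless it overrides that with its own condition, so the gate decides once for the whole suite. The condition can be modelled in Python as follows. This is an illustrative model of the expression only: the `PullRequestEvent` type and `gate_passes` function are hypothetical and not part of the workflow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PullRequestEvent:
    """The three fields of the event context that the gate expression reads."""

    event_name: str
    draft: bool
    head_repo_is_fork: bool


def gate_passes(event: PullRequestEvent) -> bool:
    """Mirror of: event_name == 'pull_request' && !draft && !head.repo.fork."""
    return (
        event.event_name == "pull_request"
        and not event.draft
        and not event.head_repo_is_fork
    )


if __name__ == "__main__":
    cases = [
        PullRequestEvent("pull_request", draft=False, head_repo_is_fork=False),
        PullRequestEvent("pull_request", draft=True, head_repo_is_fork=False),
        PullRequestEvent("pull_request", draft=False, head_repo_is_fork=True),
        PullRequestEvent("merge_group", draft=False, head_repo_is_fork=False),
    ]
    for case in cases:
        print(case, gate_passes(case))
```

Only the first case passes the gate. Under this expression a `merge_group` event does not pass either, because the event name is not `pull_request`.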

.github/workflows/anomaly.yml

Lines changed: 25 additions & 23 deletions
@@ -5,9 +5,6 @@ on:
     types: [ opened, synchronize, ready_for_review ]
   merge_group:
     types: [ checks_requested ]
-  push:
-    branches:
-      - main

 permissions:
   contents: read
@@ -17,10 +14,23 @@ concurrency:
   cancel-in-progress: false # don't cancel ongoing runs to ensure fixtures are completed and resources terminated

 jobs:
-  anomaly-tests:
+  # Gate all downstream jobs behind a single check so PRs from forks (no access to the
+  # tool environment) and draft PRs do not trigger the expensive anomaly suite.
+  # PRs from forks are to be tested by the reviewer(s) / maintainer(s) before merging.
+  not-a-fork:
+    runs-on:
+      group: databrickslabs-protected-runner-group
+      labels: linux-ubuntu-latest
     if: github.event_name == 'pull_request' && !github.event.pull_request.draft && !github.event.pull_request.head.repo.fork
+    steps:
+      - run: echo "Not a fork PR, proceeding"
+
+  anomaly-tests:
+    needs: not-a-fork
     environment: tool
-    runs-on: larger
+    runs-on:
+      group: larger-runners
+      labels: larger
     permissions:
       id-token: write
       pull-requests: write
@@ -30,14 +40,8 @@ jobs:
         with:
           fetch-depth: 0

-      - name: Setup uv
-        uses: astral-sh/setup-uv@5a095e7a2014a4212f075830d4f7277575a9d098 # v7.3.1
-        with:
-          version: "0.11.2"
-          checksum: "7ac2ca0449c8d68dae9b99e635cd3bc9b22a4cb1de64b7c43716398447d42981"
-
-      - name: Setup for JFrog
-        uses: ./.github/actions/jfrog-auth
+      - name: Setup environment
+        uses: ./.github/actions/setup-env

       # Create .coveragerc with correct relative path to source code.
       - name: Prepare code coverage configuration for anomaly tests
@@ -46,6 +50,7 @@ jobs:
           [run]
           source = ../../src
           relative_files = true
+          parallel = true
           EOF

       - name: Run anomaly integration tests
@@ -76,9 +81,11 @@ jobs:
           use_oidc: true

   anomaly-tests-serverless:
-    if: github.event_name == 'pull_request' && !github.event.pull_request.draft && !github.event.pull_request.head.repo.fork
+    needs: not-a-fork
     environment: tool
-    runs-on: larger
+    runs-on:
+      group: larger-runners
+      labels: larger
     permissions:
       id-token: write
       pull-requests: write
@@ -88,14 +95,8 @@ jobs:
         with:
           fetch-depth: 0

-      - name: Setup uv
-        uses: astral-sh/setup-uv@5a095e7a2014a4212f075830d4f7277575a9d098 # v7.3.1
-        with:
-          version: "0.11.2"
-          checksum: "7ac2ca0449c8d68dae9b99e635cd3bc9b22a4cb1de64b7c43716398447d42981"
-
-      - name: Setup for JFrog
-        uses: ./.github/actions/jfrog-auth
+      - name: Setup environment
+        uses: ./.github/actions/setup-env

       # Create .coveragerc with correct relative path to source code.
       - name: Prepare code coverage configuration for anomaly tests
@@ -104,6 +105,7 @@ jobs:
           [run]
           source = ../../src
           relative_files = true
+          parallel = true
           EOF

       - name: Run anomaly integration tests on serverless cluster
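Each generated `.coveragerc` gains `parallel = true`. In coverage.py this setting makes every process write its own uniquely suffixed data file, and the files are merged afterwards with `coverage combine`. The generated file is plain INI, so its contents can be inspected with the standard library. The sketch below only parses the same text the heredoc produces and does not run coverage itself. The `read_run_options` helper is illustrative.

```python
import configparser

# Same content the workflow writes into .coveragerc after this commit.
COVERAGERC = """\
[run]
source = ../../src
relative_files = true
parallel = true
"""


def read_run_options(text: str) -> dict:
    """Parse the [run] section and return the options as typed values."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    run = parser["run"]
    return {
        "source": run.get("source"),
        "relative_files": run.getboolean("relative_files"),
        "parallel": run.getboolean("parallel"),
    }


if __name__ == "__main__":
    print(read_run_options(COVERAGERC))
```

With several test processes or jobs writing coverage data, separate data files avoid one process overwriting another's results before they are combined.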

.github/workflows/docs-release.yml

Lines changed: 3 additions & 9 deletions
@@ -25,18 +25,12 @@ jobs:
         with:
           fetch-depth: 0

-      - name: Setup uv
-        uses: astral-sh/setup-uv@5a095e7a2014a4212f075830d4f7277575a9d098 # v7.3.1
-        with:
-          version: "0.11.2"
-          checksum: "7ac2ca0449c8d68dae9b99e635cd3bc9b22a4cb1de64b7c43716398447d42981"
-
-      - name: Setup for JFrog
-        uses: ./.github/actions/jfrog-auth
+      - name: Setup environment
+        uses: ./.github/actions/setup-env

       - uses: actions/setup-node@53b83947a5a98c8d113130e565377fae1a50d02f # v6.3.0
         with:
-          node-version: 20
+          node-version: 22
           cache: yarn
           cache-dependency-path: docs/dqx/yarn.lock # need to put the lockfile path explicitly
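Third-party actions in these workflows are referenced by a full 40-character commit SHA, with the release tag kept in a trailing comment. Local actions such as `./.github/actions/setup-env` are referenced by path. A small linter-style check for that convention might look like the sketch below. The function name and the rule that local paths count as acceptable are assumptions made for illustration, not something the repository defines.

```python
import re

# owner/repo[/path]@<40 lowercase hex characters>
SHA_PINNED = re.compile(r"^[^@\s]+@[0-9a-f]{40}$")


def is_pinned_or_local(uses: str) -> bool:
    """True for local action paths and for references pinned to a full commit SHA."""
    # Drop a trailing "# vX.Y.Z" comment before matching.
    reference = uses.split("#", 1)[0].strip()
    if reference.startswith("./"):
        return True  # local actions are versioned with the repository itself
    return SHA_PINNED.match(reference) is not None


if __name__ == "__main__":
    samples = [
        "actions/setup-node@53b83947a5a98c8d113130e565377fae1a50d02f # v6.3.0",
        "./.github/actions/setup-env",
        "actions/setup-node@v6",
    ]
    for sample in samples:
        print(sample, is_pinned_or_local(sample))
```

A tag can be moved to a different commit after the fact, while a full SHA cannot, which is why the tag is kept only as a human-readable comment.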
